In the VTEX IO ecosystem, services play a crucial role by allowing you to run .NET and Node.js code directly on VTEX servers. Services enable VTEX IO apps to export HTTP routes, GraphQL resolvers, and event handlers to the server.
Developing services
To create and export services, you need to specify the `node` or `dotnet` builders in your app's `manifest.json` file. Additionally, GraphQL services can be exported using the `graphql` builder.
For more information, check the Developing services on VTEX IO course in the VTEX Learning Center.
Configuring services
The configuration of your service depends on the selected builder (`node` or `dotnet`). You will find the `service.json` file in the `/node` or `/dotnet` folder of your app. This configuration file is responsible for defining essential parameters, including timeout, memory allocation, routes, event handlers, and replicas.
Here's an example of a `service.json` file:
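The configuration below is an illustrative sketch rather than a definitive setup: the route name, path, event name, and sender app are placeholders, and the values should be adapted to your app.

```json
{
  "memory": 256,
  "ttl": 10,
  "timeout": 10,
  "minReplicas": 2,
  "maxReplicas": 10,
  "workers": 1,
  "routes": {
    "status": {
      "path": "/_v/status/:code",
      "public": true
    }
  },
  "events": {
    "orderCreated": {
      "sender": "vtex.orders-broadcast",
      "topics": ["order-created"]
    }
  }
}
```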
Service configuration parameters
Let's delve into the parameters you can set within the `service.json` file:
Name | Type | Description |
---|---|---|
memory | number | Defines the memory size (in MB) that should be allocated to the service. Default: 128, maximum: 1024. |
minReplicas | number | Defines the minimum number of replicas available when the service is running. Minimum: 2 for installed apps and 1 for linked apps. |
maxReplicas | number | Defines the maximum number of replicas available. Minimum: 5, maximum: 60. |
timeout | number | Sets the timeout (in seconds) for aborting a connection if a request takes too long. This parameter affects only incoming requests to the app. It does not affect outgoing requests that the app makes to other clients. Default: 10, minimum: 1, maximum: 60. |
ttl | number | Defines the time-to-live (in minutes) for how long the platform will keep each instance of the service running without receiving new requests. Default: 10, minimum: 10, maximum: 60. The requested value is only honored for the most recent stable version of the app. Older versions, versions that have not been deployed, linked apps, or beta versions will have the value overridden to 10. |
workers | number | Specifies the number of workers to spawn for the service in production. Minimum: 1, maximum: 4. |
rateLimitPerReplica | object | This object contains global parameters whose values define specific throttling limits for requests and events. The values of this object are overridden by the values set for a specific route or event. |
↳perMinute | number | Defines the global maximum number of requests and event triggers allowed per minute per replica. This value is overridden by the value set for a specific route or event. |
↳concurrent | number | Defines the global maximum number of requests and event triggers a replica will handle simultaneously for that application. This value is overridden by the value set for a specific route or event. |
events | object of objects | Each object inside this parameter represents one event. Maps an event handler to an object that describes the sender or topics. This parameter must be used both by the sender and the receivers of the event. |
↳topics | array of strings | Identifiers of the event. Previously known as `keys`, which is now deprecated. |
↳sender | string | Name of the app that sends the event. |
↳settingsType | string | Possible values: `pure`, `workspace`, or `userAndWorkspace`. If `workspace` or `userAndWorkspace`, the app loads the settings of the apps it depends on. |
↳rateLimitPerReplica | object | This object contains parameters whose values define specific throttling limits for the event. The values of this object override the global rateLimitPerReplica values. |
↳↳perMinute | number | Defines the maximum amount of times this event is allowed to trigger per minute per replica. This value overrides the global rateLimitPerReplica.perMinute value. |
↳↳concurrent | number | Defines the maximum amount of times a replica will handle triggers of this event simultaneously. This value overrides the global rateLimitPerReplica.concurrent value. |
routes | object of objects | Each object inside this represents one route. Maps route handlers to objects containing information about the route. |
↳path | string | Path of the route. Check Service path patterns for details about path construction. |
↳public | boolean | Defines if the route is public or private. If `true`, the route will be available at `{account}.myvtex.com` and `{account}.vtexcommercestable.com.br`. If `false`, the route will be available at `app.io.vtex.com/{vendor}.{appName}/v{majorVersion}/{account}/{workspace}` and will require a `VtexIdclientAutCookie` for authentication. Using `appKey` and `appToken` for authentication will not work. |
↳access | string | Possible values: `public`, `authenticated`, or `authorized`. |
↳policies | array of objects | Defines which actions are allowed or denied on the route and for whom. More details in Resource-based Policies. |
↳rateLimitPerReplica | object | This object contains parameters whose values define specific throttling limits for the route. The values of this object override the global rateLimitPerReplica values. |
↳↳perMinute | number | Defines the maximum number of requests allowed per minute per replica for this route. This value overrides the global rateLimitPerReplica.perMinute value. |
↳↳concurrent | number | Defines the maximum number of requests a replica will handle simultaneously for this route. This value overrides the global rateLimitPerReplica.concurrent value. |
Note that most of these fields are optional, and default values provided by the platform are often sufficient.
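To illustrate how the override hierarchy works, the hypothetical configuration below sets global throttling limits and then overrides them for one route and one event (the route name, event name, sender, and limit values are made up for the example):

```json
{
  "rateLimitPerReplica": {
    "perMinute": 300,
    "concurrent": 10
  },
  "routes": {
    "search": {
      "path": "/_v/search",
      "public": true,
      "rateLimitPerReplica": {
        "perMinute": 60,
        "concurrent": 5
      }
    }
  },
  "events": {
    "orderCreated": {
      "sender": "vtex.orders-broadcast",
      "topics": ["order-created"],
      "rateLimitPerReplica": {
        "perMinute": 120
      }
    }
  }
}
```

Any route or event that does not declare its own rateLimitPerReplica falls back to the global values.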
Best practices
To help you configure your service, we describe below some best practices for the main parameters available. Following them should help your application achieve good performance and avoid failures.
Memory allocation
App developers have to evaluate the memory usage of the app to choose a proper value for the `memory` parameter. Consider factors such as the complexity of your application and the size of the data structures instantiated in the implementation. Adequate memory allocation helps prevent failures and performance issues.
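As an illustration (the value is arbitrary), an app that holds larger data structures in memory could raise the allocation above the 128 MB default while staying within the 1024 MB limit:

```json
{
  "memory": 512
}
```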
Timeout settings
Setting an appropriate `timeout` value prevents requests from timing out prematurely or from causing resource exhaustion. Ideally, applications should have low response times most of the time, in the range of milliseconds, and should not fail to process requests.
If an application has a low timeout setting and takes too long to respond, the platform responds to the request with a timeout, and the app's data is not delivered to the client even when the request could have been processed. On the other hand, if the timeout setting is high and the application usually responds quickly, a problem while processing a request means the platform takes too long to time out and inform the client that something went wrong.
Balancing between a low and a high timeout setting depends on how long the app is expected to take to deliver a response and how quickly the requester needs that response to perform its work.
Testing response times to decide on a proper timeout setting can be done in various ways, including, but not limited to, testing under different connection conditions (mobile vs. tethered, instabilities, etc.), sending requests from different locations, using different workloads and settings (memory, replicas, etc.), and measuring percentiles of the slowest response times.
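For example, if such measurements show that responses occasionally take a few seconds, a moderately higher timeout (an assumed value, within the 1–60 range) may be preferable to either the default or the maximum:

```json
{
  "timeout": 20
}
```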
Replica management
VTEX has an autoscaling feature for the IO infrastructure, which dynamically controls the number of instances of each app according to certain parameters. We call replicas the instances of an app running in our infrastructure. The minimum and maximum number of replicas an app can have are determined by the `minReplicas` and `maxReplicas` parameters, respectively. The current number of replicas increases and decreases automatically according to how much the app is requested on the platform, considering all the accounts where the app is installed.
The values of `minReplicas` and `maxReplicas` should be defined considering the expected demand for the app. If you have observability of the accounts where the app is installed, some metrics can help define these parameters, e.g., the number of accesses or purchases and how they fluctuate.
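For instance, an app with a steady baseline of traffic and occasional peaks (a hypothetical scenario with illustrative values) could be configured as follows:

```json
{
  "minReplicas": 2,
  "maxReplicas": 20
}
```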
Route configuration
This configuration defines the paths of the endpoints that the app will expose, along with other properties such as whether each route is public or private and its permissions through policies. For more details about path construction, read Service path patterns. For more information about endpoint permissions, read Resource-based Policies.
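As a sketch with made-up route names and paths, the configuration below declares one public route, reachable at {account}.myvtex.com, and one private route that requires VTEX ID authentication:

```json
{
  "routes": {
    "healthcheck": {
      "path": "/_v/healthcheck",
      "public": true
    },
    "adminReport": {
      "path": "/_v/admin/report",
      "public": false
    }
  }
}
```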
TTL management
Time-to-live (TTL) is the time a service instance keeps running on the platform without receiving new requests before it is shut down. By having a TTL, we can minimize the resources wasted by idle service instances.
When new requests arrive and there are no instances available to respond, the platform has to start new instances to process and respond to them, a process we call cold start. A cold start is slow, so the first request a new instance responds to takes longer than usual, while the following requests are processed as usual.
Depending on the service usage, a small TTL value can lead to cold starts happening more frequently, which causes slower response times. We recommend increasing the TTL above the default value only if you frequently notice slow response times that can be caused by cold starts.
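For example, if cold starts are suspected, the TTL could be raised to an illustrative value such as 30 minutes, within the allowed 10–60 range:

```json
{
  "ttl": 30
}
```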
Worker configuration
VTEX IO services use the Cluster module from Node.js for parallelism. A service starts with a main process and then creates child processes, or workers, to handle the load from requests. The number of workers is determined by the value of the `workers` parameter.
A higher number of workers is recommended when more processing capacity is required to handle requests, which can reduce response times. Be aware that more workers also increase the memory usage of the service, so adjust the `memory` parameter accordingly. If your service performs well with a single worker or does not take advantage of parallelism, we recommend not declaring a `workers` value or setting it to 1.
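For a service that benefits from parallel request handling, an illustrative configuration could increase the worker count and raise memory to account for the extra processes (both values are assumptions to adapt to your workload):

```json
{
  "workers": 4,
  "memory": 512
}
```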