Improving Gateway Performance
Learn to adjust tracing, instrument middleware and plugins, and more
If your supergraph gateway uses Apollo Server with the @apollo/gateway library, this article provides recommendations for improving your gateway's performance.
Consider moving to the GraphOS Router
The GraphOS Router is a high-throughput, low-latency GraphQL Federation gateway written in Rust. Its performance and consistency significantly exceed those of any Node.js-based gateway.
Migrating to GraphOS Router from @apollo/gateway
Disable or reduce ftv1 inline tracing
Inline tracing (also known as federated tracing or ftv1) provides helpful field-level latency information in GraphOS Studio. However, it comes at a cost: trace information can be expensive to calculate, and it's attached to subgraph responses, increasing payload size.
In Apollo Server 3.6 and later, you can set the fieldLevelInstrumentation option to turn off inline tracing entirely, or to limit tracing to a representative sample of all requests (a configuration sketch follows the recommendations below).
- During load testing, we recommend disabling tracing entirely.
- For regular production traffic, we recommend setting a low sample rate, such as 0.01 (one percent of requests).
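For example, here's a minimal sketch of sampling one percent of requests via the usage reporting plugin in Apollo Server 3 (the import paths and surrounding server setup may differ in your project):

```js
import { ApolloServer } from 'apollo-server';
import { ApolloServerPluginUsageReporting } from 'apollo-server-core';

const server = new ApolloServer({
  gateway, // your existing ApolloGateway instance
  plugins: [
    ApolloServerPluginUsageReporting({
      // Trace roughly one percent of requests. To disable inline
      // tracing entirely, pass a function that returns false.
      fieldLevelInstrumentation: 0.01,
    }),
  ],
});
```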
Usage Reporting plugin docs for fieldLevelInstrumentation
Change the default maxSockets setting
The maxSockets setting controls the maximum number of connections that @apollo/gateway's fetch implementation can create to each subgraph host and port. With higher values, fewer subgraph requests are queued in the gateway, but the event loop might become overloaded serializing responses from subgraphs.
In @apollo/gateway versions prior to v0.51.0, the default maxSockets setting is 15, which is too low for many systems. In later versions, the default is Infinity, which can lead to the event-loop overload scenario described above.
The following ApolloGateway constructor sets maxSockets to 50:
```js
import { ApolloGateway, RemoteGraphQLDataSource } from '@apollo/gateway';
import fetcher from 'make-fetch-happen';

const gateway = new ApolloGateway({
  buildService({ name, url }) {
    return new RemoteGraphQLDataSource({
      name,
      url,
      fetcher: fetcher.defaults({ maxSockets: 50 }),
    });
  },
});
```
Configuring the subgraph fetcher
Instrument Gateway middleware and plugins
If you've installed middleware (such as Express middleware) or Apollo Server plugins, it's worth using a code profiling tool like Clinic.js to investigate potential issues, such as synchronous code blocking the main thread or expensive plugin functions.
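As a lightweight complement to a profiler, you can watch event-loop delay directly with Node's built-in perf_hooks module. Sustained high values suggest synchronous code is blocking the main thread. A minimal sketch (the sampling resolution and logging interval here are arbitrary choices):

```js
import { monitorEventLoopDelay } from 'perf_hooks';

// Continuously sample event-loop delay into a histogram.
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

setInterval(() => {
  // Values are reported in nanoseconds; convert to milliseconds.
  const p99Ms = histogram.percentile(99) / 1e6;
  console.log(`event loop delay p99: ${p99Ms.toFixed(1)} ms`);
  histogram.reset();
}, 10_000);
```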
Use OpenTelemetry tracing
OpenTelemetry is the CNCF standard for instrumenting distributed systems. By tracing @apollo/gateway, subgraph services, data sources, and any other systems in the request path, you might discover that the root cause of a performance issue lies outside of @apollo/gateway.
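Here's a sketch of what gateway-side setup can look like with the OpenTelemetry JS SDK's 1.x API (the exporter choice is an assumption; your observability backend may require a different one):

```js
// tracing.js — load this before the gateway starts.
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
import { GraphQLInstrumentation } from '@opentelemetry/instrumentation-graphql';

// Export spans in batches to an OTLP-compatible collector.
const provider = new NodeTracerProvider();
provider.addSpanProcessor(new BatchSpanProcessor(new OTLPTraceExporter()));
provider.register();

// Auto-instrument outbound HTTP (subgraph fetches) and GraphQL execution.
registerInstrumentations({
  instrumentations: [new HttpInstrumentation(), new GraphQLInstrumentation()],
});
```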
OpenTelemetry in @apollo/gateway
Investigate subgraph latency
Try running load tests against the subgraphs directly. You can simulate requests from the Gateway by querying the _entities field:
```graphql
query MyQuery__subgraphA__0($representations: [_Any!]!) {
  _entities(representations: $representations) {
    ... on EntityA {
      fieldA
      fieldB
    }
    ... on EntityB {
      fieldC
      fieldD
    }
  }
}
```
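To drive that query from a load test, POST it with a representations list shaped like what the Gateway sends. A sketch, assuming Node 18+ (the subgraph URL, typenames, and key fields here are placeholders for your own):

```js
const ENTITIES_QUERY = `
  query ($representations: [_Any!]!) {
    _entities(representations: $representations) {
      ... on EntityA { fieldA fieldB }
    }
  }
`;

const response = await fetch('http://localhost:4001/graphql', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({
    query: ENTITIES_QUERY,
    variables: {
      // Each representation carries __typename plus the entity's @key fields.
      representations: [
        { __typename: 'EntityA', id: '1' },
        { __typename: 'EntityA', id: '2' },
      ],
    },
  }),
});
console.log(await response.json());
```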
By testing _entities queries, you might discover that you need to implement a DataLoader in your reference resolvers.
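If each representation currently triggers its own database call, a per-request DataLoader batches them into one. A minimal sketch, assuming a hypothetical findEntitiesByIds batch-lookup method on your data source:

```js
import DataLoader from 'dataloader';

// Create one loader per request so batching and caching
// don't leak across users.
function createLoaders(dataSources) {
  return {
    entityA: new DataLoader(async (ids) => {
      // Hypothetical batch lookup: one round trip for all ids.
      const rows = await dataSources.findEntitiesByIds(ids);
      const byId = new Map(rows.map((row) => [row.id, row]));
      // DataLoader requires results in the same order as the keys.
      return ids.map((id) => byId.get(id) ?? null);
    }),
  };
}

const resolvers = {
  EntityA: {
    __resolveReference(reference, context) {
      // All references in one _entities request collapse into one batch.
      return context.loaders.entityA.load(reference.id);
    },
  },
};
```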
The slower your subgraphs, the more time the Gateway spends keeping track of concurrent in-flight requests. The Gateway's maximum requests per second drops linearly as subgraph latencies rise.
Tweak CPU and memory limits
You might need to set higher CPU and memory limits on your @apollo/gateway pods. Determining the best limits for your system depends on a number of factors:
- Size of operations
- Cardinality of operations (number of unique shapes)
- Complexity of query plans
- Subgraph latency
- Use of shared caches
In general, look at increasing memory limits first: Node.js is single-threaded, so a gateway process can't take advantage of more than one CPU. That said, we recommend keeping CPU utilization below 50%; at higher utilization, event loop delays tend to increase. Monitoring both memory and CPU usage metrics can provide a clearer picture of bottlenecks than code profiling alone.
If CPU throttling becomes a problem, consider removing the limits on your gateway pods as discussed in this blog post on Kubernetes resource limits.
Check query plans
Query plans (the steps the Gateway takes to resolve an operation across multiple subgraphs) are almost always reasonable approximations of the original operation. They're also highly cacheable, so they don't incur much of a performance penalty. Still, it's worth inspecting them to make sure they're of an appropriate size and complexity; fragment and abstract type usage can cause undesirable query plans in specific cases.
```js
const gateway = new ApolloGateway({
  debug: true, // print query plans to stdout
});
```
Add more instances
@apollo/gateway is designed to scale horizontally. If you've tried the previous strategies and performance still isn't where it should be, distribute the load across more instances of the gateway.