Your API gateway sits at the edge of your application environment. It is where traffic from mobile apps, partner platforms, internal tools, and other services arrives before it reaches the backend systems that process it. For a long time, running one centrally makes sense: routing, throttling, authentication, and observability all go through one place, operated by one team, under one contract that every service team has to honor.
Then the organization grows, and the same gateway starts costing you in ways that aren't on the architecture diagram. The traffic chart climbs, and so does the license bill, because the commercial model is priced per request. The release calendar fills up, but every change touches the same configuration, so changes get queued. A misconfigured route can take down endpoints belonging to teams that have never met. The platform team that owns the gateway becomes the bottleneck no one wants to be, and the only team capable of unblocking everyone else.
At a certain scale, the gateway stops being a piece of infrastructure and starts being a coordination problem.
That is the point at which we, working with a large enterprise client whose engineering organization had grown to dozens of teams across multiple AWS accounts, decided to take it apart.
Centralization is fine while there are five services behind it. It becomes expensive, financially and organizationally, when there are dozens, owned by teams who deploy on different cadences, in different AWS accounts, with very different latency and compliance profiles.
Three pressures pushed us toward a decentralized model:
None of these are unusual problems. They tend to surface together, and they tend to surface around the same time a team starts running quarterly capacity reviews for the gateway itself. If your central gateway is showing up in too many architecture review meetings, you are probably already paying these costs without naming them.
We replaced the central gateway with a small, repeatable building block: an AWS API Gateway plus a Lambda authorizer, packaged as a deployable Terraform module. Every service team deploys an instance into their own AWS account, in front of their own services. They own the routes, the deploys, and the bill.
That description is easy to write and harder to live with. The hard part isn't deploying gateways; it's making sure consistency doesn't collapse the moment you let go of the single shared one. Three design choices did most of that work.
Routing and rate-limiting can vary per team; authentication cannot. Every gateway runs the same authorizer logic, distributed as part of the same Terraform module. If a team consumes the module, they get the org-wide auth behavior for free. They cannot accidentally diverge from it. That is the lever that lets the platform team relax control of where gateways run without losing control of how they authorize.
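As a rough illustration of what "the same authorizer logic" could look like, here is a minimal sketch of a deny-by-default decision function that returns the IAM policy document an API Gateway Lambda authorizer is expected to produce. The `AuthorizerConfig` shape, the client and scope names, and the ARN are assumptions made for this example; the real module's schema is not shown in this article, and token verification is left out entirely.

```typescript
// Hypothetical config shape; the real schema is not given in the article.
interface AuthorizerConfig {
  clientScopes: Record<string, string[]>; // clientId -> scopes granted to it
  requiredScope: Record<string, string>;  // route key -> scope needed to call it
}

type Effect = "Allow" | "Deny";

// Policy-document response shape for an API Gateway Lambda authorizer.
interface AuthorizerResponse {
  principalId: string;
  policyDocument: {
    Version: "2012-10-17";
    Statement: { Action: "execute-api:Invoke"; Effect: Effect; Resource: string }[];
  };
}

function buildPolicy(principalId: string, effect: Effect, resource: string): AuthorizerResponse {
  return {
    principalId,
    policyDocument: {
      Version: "2012-10-17",
      Statement: [{ Action: "execute-api:Invoke", Effect: effect, Resource: resource }],
    },
  };
}

// Deny by default: unknown clients and unmapped routes never receive an Allow.
function authorize(
  config: AuthorizerConfig,
  clientId: string,
  routeKey: string,
  methodArn: string,
): AuthorizerResponse {
  const needed: string | undefined = config.requiredScope[routeKey];
  const granted: string[] = config.clientScopes[clientId] ?? [];
  const allowed = needed !== undefined && granted.indexOf(needed) !== -1;
  return buildPolicy(clientId, allowed ? "Allow" : "Deny", methodArn);
}

// Example values, invented for illustration.
const demoConfig: AuthorizerConfig = {
  clientScopes: { "mobile-app": ["orders:read"] },
  requiredScope: { "GET /orders": "orders:read", "POST /orders": "orders:write" },
};
const demoArn = "arn:aws:execute-api:eu-west-1:123456789012:abc123/prod/GET/orders";
const readDecision = authorize(demoConfig, "mobile-app", "GET /orders", demoArn);
const writeDecision = authorize(demoConfig, "mobile-app", "POST /orders", demoArn);
```

Because the decision is a pure function of configuration and request, every gateway that ships the same function and receives the same configuration reaches the same answer, which is the property the module is meant to guarantee.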
Authorizer configuration is published from the central developer portal as Kafka messages. This covers which clients are valid, which scopes map to which APIs, and which keys have been rotated. Each authorizer subscribes to the topic and updates its in-memory view. A scope change made in the portal at 10:01 is in effect across every gateway within seconds, without a deploy.
The portal remains the single source of truth for what the policy is; the gateways are simply the places that enforce it. Governance doesn't have to be centralized infrastructure; it can be a centralized data flow over decentralized infrastructure. That distinction is the core of the approach.
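To make the data flow concrete, the sketch below shows one way an authorizer might fold policy events into its in-memory view. The event names and fields are hypothetical, chosen only to mirror the three kinds of change mentioned above (valid clients, scope-to-API mappings, rotated keys). The Kafka consumer wiring is omitted; `handleMessage` stands in for whatever callback receives each message body.

```typescript
// Hypothetical event contract; the real topic schema is not described here.
type PolicyEvent =
  | { type: "client-upserted"; clientId: string; scopes: string[] }
  | { type: "client-revoked"; clientId: string }
  | { type: "scope-mapped"; routeKey: string; scope: string }
  | { type: "key-rotated"; retiredKeyId: string };

// The authorizer's in-memory view of current policy.
interface PolicyView {
  clientScopes: Map<string, string[]>;
  requiredScope: Map<string, string>;
  retiredKeys: Set<string>;
}

function emptyView(): PolicyView {
  return {
    clientScopes: new Map<string, string[]>(),
    requiredScope: new Map<string, string>(),
    retiredKeys: new Set<string>(),
  };
}

// Apply a single event to the view. Unknown event types fall through untouched.
function applyEvent(view: PolicyView, ev: PolicyEvent): void {
  switch (ev.type) {
    case "client-upserted":
      view.clientScopes.set(ev.clientId, ev.scopes.slice());
      break;
    case "client-revoked":
      view.clientScopes.delete(ev.clientId);
      break;
    case "scope-mapped":
      view.requiredScope.set(ev.routeKey, ev.scope);
      break;
    case "key-rotated":
      view.retiredKeys.add(ev.retiredKeyId);
      break;
  }
}

// A topic consumer would call this with each message body (assumed to be JSON).
function handleMessage(view: PolicyView, rawJson: string): void {
  applyEvent(view, JSON.parse(rawJson) as PolicyEvent);
}

// Example: a client is granted a scope, then revoked in the portal.
const demoView = emptyView();
handleMessage(demoView, JSON.stringify({ type: "client-upserted", clientId: "partner-x", scopes: ["orders:read"] }));
handleMessage(demoView, JSON.stringify({ type: "client-revoked", clientId: "partner-x" }));
```

Nothing in this path involves a deploy: the revocation takes effect as soon as the event is consumed, which is what lets a portal change reach every gateway within seconds.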
Kafka availability is not the right thing to bet your authentication path on. The same authorizer module accepts a baked-in configuration at deploy time. This matters in three situations: cold starts, where the authorizer should not wait on a topic catch-up before serving its first request; Kafka outages, where the authorizer keeps working on its last known good configuration rather than failing closed for reasons unrelated to authentication; and edge or restricted environments where the developer-portal connection is not available.
The portal is the source of truth in steady state. The hardcoded config is the floor.
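The "floor" behavior can be sketched as a small holder that serves the baked-in configuration until a portal snapshot arrives and never discards the last snapshot it received. The class and field names are hypothetical, and treating configuration as whole snapshots is an assumption made for brevity.

```typescript
// Hypothetical snapshot shape, for illustration only.
interface PolicySnapshot {
  source: "baked-in" | "portal";
  clientScopes: Record<string, string[]>;
}

class PolicySource {
  // Configuration shipped with the deploy; always available, even at cold start.
  private readonly bakedIn: PolicySnapshot;
  // Last known good configuration received from the portal, if any.
  private live: PolicySnapshot | null = null;

  constructor(bakedIn: PolicySnapshot) {
    this.bakedIn = bakedIn;
  }

  // Called by the topic consumer whenever a newer configuration arrives.
  onPortalSnapshot(snapshot: PolicySnapshot): void {
    this.live = snapshot;
  }

  // Never blocks and never throws: an outage on the event path leaves
  // `live` untouched, so the authorizer keeps serving what it last saw.
  current(): PolicySnapshot {
    return this.live !== null ? this.live : this.bakedIn;
  }
}

// Example: cold start serves the floor, steady state serves the portal view.
const policySource = new PolicySource({ source: "baked-in", clientScopes: { "mobile-app": ["orders:read"] } });
const atColdStart = policySource.current().source;
policySource.onPortalSnapshot({ source: "portal", clientScopes: { "mobile-app": ["orders:read", "orders:write"] } });
const inSteadyState = policySource.current().source;
```

The design choice worth noting is that there is deliberately no code path that clears `live` when the event stream fails; losing the stream only means the configuration stops getting fresher.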
There are two builds of the authorizer. The Rust variant serves latency-sensitive paths, where every millisecond of authorization overhead shows up in product metrics or where cold-start time matters. The TypeScript variant is for teams whose APIs are not on the hot path and whose engineers are already fluent in Node. Both consume the same configuration and enforce the same rules. The choice is an ergonomics-versus-performance trade-off the team makes locally, not a decision the platform team has to make on their behalf.
The shift is most visible at three moments in a team's life.
Onboarding a new service. Previously, onboarding a new API meant a ticket to the platform team, a review cycle, a configuration change in the shared gateway, and a deploy coordinated against everyone else's release window. Now it is terraform apply against a module the team already understands, with auth behavior they don't have to reason about because it comes with the module. Onboarding becomes a single-team task rather than a multi-team coordination problem.
Rolling out an auth change. A scope that needs to be revoked across the estate used to be a coordinated rollout: a change in the central gateway, a window, a deploy, and a verification pass. Now it is a change in the developer portal that propagates as a Kafka event. Across dozens of gateways, the new policy is live within seconds. The platform team's job has moved up the stack. They no longer operate the gateway; they own the module, the authorizer code, and the event contract.
Reading the bill. Because each team's gateway lives in their account, their traffic is their cost. A team that ships a chatty client and triples its request volume sees the cost in their own AWS bill the same month. That signal lands with the people who can act on it, the team that owns the API, instead of being absorbed into a platform line item nobody reads. Meanwhile, the traffic-priced license on the old central gateway simply went away: that cost wasn't redistributed, it was eliminated.
Decentralization is not free. A few trade-offs are worth flagging if you are evaluating this pattern.
You take on a fleet. You no longer run one gateway; you run dozens. The module makes them homogeneous, but observability, version rollout, and end-to-end testing all need to be designed for the fleet, not for the single instance.
The Kafka topic between the developer portal and the authorizers becomes a critical dependency. Its schema, retention, and replay semantics all matter. The hardcoded fallback mitigates outages but doesn't replace the need to operate that contract carefully.
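One way to think about schema and replay semantics is a guard in front of the apply step: reject envelopes whose schema version the authorizer does not understand, and skip anything already applied so that a replay is harmless. The envelope fields below (`schemaVersion`, `sequence`, `payload`) are assumptions for this sketch, not the actual contract.

```typescript
// Hypothetical envelope; the real event contract is not described in this article.
interface Envelope {
  schemaVersion: number;
  sequence: number; // assumed to increase monotonically per topic partition
  payload: unknown;
}

const SUPPORTED_SCHEMA = 1;

type Outcome = "applied" | "duplicate" | "unsupported";

class ReplayGuard {
  private lastApplied = -1;

  // Applies the payload at most once per sequence number.
  accept(envelope: Envelope, apply: (payload: unknown) => void): Outcome {
    if (envelope.schemaVersion !== SUPPORTED_SCHEMA) {
      return "unsupported"; // surface this in metrics rather than guessing at the shape
    }
    if (envelope.sequence <= this.lastApplied) {
      return "duplicate"; // replayed or redelivered message, safe to skip
    }
    apply(envelope.payload);
    this.lastApplied = envelope.sequence;
    return "applied";
  }
}

// Example: a redelivered message and a message from a newer schema.
const replayGuard = new ReplayGuard();
const appliedPayloads: unknown[] = [];
const record = (p: unknown): void => { appliedPayloads.push(p); };
const firstOutcome = replayGuard.accept({ schemaVersion: 1, sequence: 0, payload: "a" }, record);
const replayOutcome = replayGuard.accept({ schemaVersion: 1, sequence: 0, payload: "a" }, record);
const futureOutcome = replayGuard.accept({ schemaVersion: 2, sequence: 1, payload: "b" }, record);
```

Whether an unsupported message should be skipped, as here, or should stop consumption until the authorizer is upgraded is a policy decision the contract needs to state explicitly.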
Coordination shifts, but doesn't disappear. Service teams now coordinate less with the platform team but more with each other on cross-cutting concerns like CORS policy, error formats, and observability conventions. The module is where these conventions live. Keeping it opinionated is more valuable than keeping it flexible.
None of these are reasons not to do it. They are reasons to be deliberate about how you do it.
For engineers, the day-to-day change is about ownership and friction. The gateway is no longer a piece of remote infrastructure they file tickets against; it is a module in their repo. They can read it, debug it locally, and reason about its failure modes. The authorizer is only a black box in the sense that teams don't have to think about it; they can still read and debug it when needed.
For decision-makers, the more important shift is structural. API costs become legible at the team level, letting engineering and product leadership have honest conversations about traffic, growth, and unit economics. Governance and infrastructure also stop being coupled: the developer portal can stay strict (one place, one policy) while the enforcement infrastructure can sprawl as widely as the organization needs. The platform team scales sub-linearly with the rest of engineering. They are no longer the gating function for every new service; instead, they maintain the module, the authorizer, and the event contract.
That last point is where the "scaling" in this story actually lives. Hardware and throughput scale easily. The thing that doesn't scale, in a growing engineering organization, is the human coordination overhead of any piece of infrastructure that every team has to share. Decentralizing the gateway is one of the cleaner ways to push that overhead out of the critical path.
If your central API gateway is showing up in capacity reviews, license renewals, or incident retrospectives more often than feels healthy, three moves are worth considering.
The first and most important move is decoupling policy from enforcement. Keep the developer portal as the source of truth and let enforcement points multiply; events are a better distribution mechanism for policy than tickets are. The second is to package the gateway as a module rather than a service the platform team operates: their product becomes the IaC, the authorizer, and the update path, not the running infrastructure.
Making cost a local signal is the third move. Putting gateways in team-owned accounts turns API traffic into a number the team that generates it can actually see, rather than a platform line item nobody reads.
The shape of the answer is straightforward. The hard work is in the contracts: the module's API, the authorizer's configuration schema, and the event flow between the developer portal and the fleet. Get those right, and the rest follows. Whether you are weighing a decentralization step, designing a developer portal, or rethinking how policy reaches your runtime, our DevOps and Platform Engineering and Cloud Engineering teams would be glad to compare notes.