How I stopped writing custom endpoints

Notes on Google AIP and ConnectRPC, after building on this stack for years.

May 17, 2026

The Slack message every backend engineer recognizes

Most backend engineers I know have received some version of this message:

“Hey, for the new dashboard, can you add an endpoint that returns the last 50 failed runs for account X, grouped by region, with the count of associated checks per run, sorted by severity?”

You know exactly what’s coming. A new handler. A new SQL query. A new request and response shape. A new test. A new permission check. A migration through staging. Maybe two days of work end to end.

The frontend engineer needs it by Thursday.

Multiply this across a year and a team and you end up with an API surface that looks like a junk drawer. GET /dashboard/failed-runs-by-region. GET /widgets/severity-breakdown. POST /reports/weekly-summary. Each one tightly coupled to one specific view. None of them reusable. All of them carrying a long tail of maintenance.

I spent years working in that mode. Then I stopped. The change wasn’t a framework or a clever pattern. It was a deliberate choice about API shape, plus a transport that takes that shape seriously, plus the realization that the same schema you ship to your frontend can power your CLI, your MCP server, your SDKs, and anything else you haven’t thought of yet.

The thesis

There is a stack, two pieces of design, that fundamentally changes how a backend team works.

The first piece is Google’s AIPs, a public catalog of API design proposals Google uses internally. Resource-oriented APIs. A small set of standard methods. A formal filter language. Hierarchical resource names. Together, these constraints make a system where most “new endpoint” requests collapse into “different filter on an existing list call.”

The second piece is ConnectRPC. Protobuf as the schema. One handler serves gRPC, gRPC-Web, and plain HTTP/JSON without a translation layer. The browser can call it. A Go service can call it. A Python script can call it.

The schema stops being only an interface contract. It becomes an artifact. From that artifact you generate: a typed server. A typed client in every language you target. A complete CLI. An MCP server. An OpenAPI document. Whatever consumer you need next, you write a generator once and never go back. The frontend is one of those consumers. So is curl. So is an AI agent.

The rest of this post walks through each piece in turn, with examples from the platform I’m building, anonymized just enough to make the point.

AIP, briefly, for the people who haven’t run into it

Google has been publishing API Improvement Proposals for over a decade. The catalog now runs into the hundreds, but the spine is small.

AIP-121 says your API is a graph of named resources. Nouns. A small fixed set of verbs operates on them. AIP-122 says every resource has a stable, hierarchical string identifier like projects/acme/buckets/photos that you can pass around like a primary key. AIP-131 through AIP-135 define the standard methods: Get, List, Create, Update, Delete. Each has a precise contract. AIP-160 defines a filter expression language with a formal grammar. AIP-136 covers custom methods, the deviation you take only when you must.

The list goes on. ETags, pagination, error model, long-running operations, partial responses, field masks. But those few cover most of the daily surface.

The core opinion is worth saying plainly. Verbs are scarce. Nouns are plentiful. Most products don’t have many genuinely distinct operations. They have many views over a few real things. AIP forces you to model the things first and let the views fall out of generic operations on them.

That’s the bet. Make resources the unit of design, and most “I need a new endpoint” requests stop being endpoint requests at all.

If you’ve ever used a Google Cloud API, you’ve been on the receiving end of this philosophy. The same projects.locations.instances.list pattern appears across dozens of products, with the same filter syntax, the same pagination, the same name format. There’s a reason Google can ship clients in various languages for every service. The surface is regular.

What a resource-oriented API actually looks like

Concrete example. You’re building an inventory of cloud assets. Virtual machines, storage buckets, databases, identity roles. Users need to query them.

Here’s how the asset is defined:

message Asset {
  option (google.api.resource) = {
    type: "assets.example.com/Asset"
    pattern: "assets/{asset}"
    singular: "asset"
    plural: "assets"
  };

  string name = 1 [(google.api.field_behavior) = IDENTIFIER];

  string asset_type = 2 [
    (google.api.field_behavior) = OUTPUT_ONLY,
    (io.spangenberg.field_behavior) = FILTERABLE,
    (io.spangenberg.field_behavior) = ORDERABLE
  ];

  string asset_key = 3 [
    (google.api.field_behavior) = OUTPUT_ONLY,
    (io.spangenberg.field_behavior) = FILTERABLE,
    (io.spangenberg.field_behavior) = ORDERABLE
  ];

  google.protobuf.Struct payload = 4
    [(google.api.field_behavior) = OUTPUT_ONLY];

  google.protobuf.Struct scope = 5 [
    (google.api.field_behavior) = OUTPUT_ONLY,
    (io.spangenberg.field_behavior) = FILTERABLE
  ];

  google.protobuf.Timestamp create_time = 6 [
    (google.api.field_behavior) = OUTPUT_ONLY,
    (io.spangenberg.field_behavior) = FILTERABLE,
    (io.spangenberg.field_behavior) = ORDERABLE
  ];
}

A few details matter, because every one of them does real work later.

The resource annotation declares that this thing is a noun. It has a type (assets.example.com/Asset), a pattern (assets/{asset}), and singular/plural forms. Downstream tools (client library generators, linters, IDEs) all consume this to build helpers like parseAssetName() and to validate cross-references between resources. This is not commentary. It’s structured metadata.

The name field is the resource’s permanent address. assets/i-0abcd1234 is the asset’s name forever. UIs pass it around in URLs. Other resources reference it. Caches key off it. The CLI uses it as the positional argument. There is no separate id, uri, path confusion. One field. Always called name. Always a string. Always hierarchical. This sounds boring until you’ve worked in a system where the user ID is userId in one place, user_id in another, uuid in a third, and email in a fourth.

The FILTERABLE and ORDERABLE annotations are the load-bearing part. They come from a custom field option I defined. Each one says: this field can appear in a filter expression (FILTERABLE) or an order_by clause (ORDERABLE). The backend reads these at startup, generates a whitelist, and rejects any filter or order_by referencing a field that isn’t annotated for it. The codegen reads them too, so every consumer of the proto knows what’s queryable. The schema is the API contract, including which queries are legal.

The whole extension is about thirty lines of proto. Two enum values, one repeated field option. That’s it.

syntax = "proto3";

package io.spangenberg;

import "google/protobuf/descriptor.proto";

extend google.protobuf.FieldOptions {
  // A designation of a specific field behavior (orderable, filterable, etc.)
  // in protobuf messages.
  //
  // Examples:
  //
  //   string name = 1 [(io.spangenberg.field_behavior) = ORDERABLE];
  //   State state = 1 [(io.spangenberg.field_behavior) = FILTERABLE];
  //   google.protobuf.Timestamp expire_time = 1
  //     [(io.spangenberg.field_behavior) = ORDERABLE,
  //      (io.spangenberg.field_behavior) = FILTERABLE];
  repeated io.spangenberg.FieldBehavior field_behavior = 1053 [packed = false];
}

// An indicator of the behavior of a given field (for example, that a field
// is filterable, or orderable in requests).
// This **does not** change the behavior in protocol buffers itself; it only
// denotes the behavior and may affect how API tooling handles the field.
//
// Note: This enum **may** receive new values in the future.
enum FieldBehavior {
  // Conventional default for enums. Do not use this.
  FIELD_BEHAVIOR_UNSPECIFIED = 0;

  // Denotes a field as filterable.
  FILTERABLE = 1;

  // Denotes a field as orderable.
  ORDERABLE = 2;
}

You drop that file into your IDL, annotate the fields you want queryable, and every consumer of the proto (the linter, the backend, the frontend codegen, the CLI generator) suddenly knows what’s legal. No separate config. No documentation that drifts. The schema carries the policy.
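To make the startup whitelist concrete, here is a minimal sketch of the idea, not my production code. It is in Python only to keep it short. The dictionary stands in for what a backend would read from the proto descriptors, and the names fields_with, check_filter, and check_order_by are hypothetical. The filter check is a rough regex scan, not a real AIP-160 parser.

```python
import re

# Hypothetical stand-in for what the backend would read from the proto
# descriptors at startup: field name -> behaviors declared on that field.
ASSET_FIELD_BEHAVIORS = {
    "asset_type": {"FILTERABLE", "ORDERABLE"},
    "asset_key": {"FILTERABLE", "ORDERABLE"},
    "payload": set(),
    "scope": {"FILTERABLE"},
    "create_time": {"FILTERABLE", "ORDERABLE"},
}


def fields_with(behaviors, wanted):
    """Return the names of all fields annotated with the given behavior."""
    return {name for name, tags in behaviors.items() if wanted in tags}


# A (possibly dotted) field path immediately followed by a comparison operator.
_FIELD_BEFORE_OPERATOR = re.compile(r"([A-Za-z_][\w.]*)\s*(?:<=|>=|!=|=|<|>|:)")
# Quoted string literals, so their contents are never mistaken for field names.
_QUOTED = re.compile(r"\"[^\"]*\"|'[^']*'")


def check_filter(filter_expr, filterable):
    """Raise ValueError if the filter references a field outside the whitelist."""
    without_literals = _QUOTED.sub('""', filter_expr)
    for path in _FIELD_BEFORE_OPERATOR.findall(without_literals):
        root = path.split(".", 1)[0]  # scope.region is governed by scope
        if root not in filterable:
            raise ValueError(f"field {root!r} is not filterable")


def check_order_by(order_by, orderable):
    """Raise ValueError if order_by (e.g. 'create_time desc') uses a non-orderable field."""
    for term in order_by.split(","):
        if not term.strip():
            continue
        field = term.split()[0]
        if field not in orderable:
            raise ValueError(f"field {field!r} is not orderable")


FILTERABLE = fields_with(ASSET_FIELD_BEHAVIORS, "FILTERABLE")
ORDERABLE = fields_with(ASSET_FIELD_BEHAVIORS, "ORDERABLE")
```

With this in place, a filter on scope.region passes and a filter on payload.owner is rejected, because payload carries no FILTERABLE annotation in the Asset message above.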

Now the service:

service AssetService {
  rpc GetAsset(GetAssetRequest) returns (Asset) {
    option (google.api.http) = {get: "/v1/{name=assets/*}"};
  }

  rpc ListAssets(ListAssetsRequest) returns (ListAssetsResponse) {
    option (google.api.http) = {get: "/v1/assets"};
  }

  rpc BatchGetAssets(BatchGetAssetsRequest)
      returns (BatchGetAssetsResponse) {
    option (google.api.http) = {get: "/v1/assets:batchGet"};
  }
}

message ListAssetsRequest {
  int32 page_size = 1;
  string page_token = 2;
  string filter = 3;
  string order_by = 4;
}

That’s the whole surface for assets. Get one. List many. Batch-get a known set.

Every other view a consumer wants is a ListAssets call with a different filter.

This is the part that takes a while to internalize, so I’ll state it plainly. ListAssets is not the endpoint that returns a flat list. It is the query engine.
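The page_size and page_token fields on the request are what let any consumer walk the full result set. A small sketch of that loop, with assumptions called out: list_all and fake_list_assets are hypothetical names, the fake server is an in-memory list, and its page token is a plain offset only because this is a toy. Real page tokens should be treated as opaque.

```python
from typing import Callable, Iterator


def list_all(list_page: Callable[[dict], dict], request: dict, items_key: str) -> Iterator[dict]:
    """Yield every item by following next_page_token until it comes back empty."""
    request = dict(request)  # copy so the caller's request is not mutated
    while True:
        response = list_page(request)
        yield from response.get(items_key, [])
        token = response.get("next_page_token", "")
        if not token:
            return
        request["page_token"] = token


# Hypothetical in-memory stand-in for the ListAssets RPC.
_ASSETS = [{"name": f"assets/a{i}"} for i in range(5)]


def fake_list_assets(request: dict) -> dict:
    """Serve one page; the token is a plain offset here, which a real server would hide."""
    start = int(request.get("page_token") or 0)
    size = request.get("page_size") or 2
    next_start = start + size
    token = str(next_start) if next_start < len(_ASSETS) else ""
    return {"assets": _ASSETS[start:next_start], "next_page_token": token}
```

Calling list_all(fake_list_assets, {"page_size": 2}, "assets") walks three pages and yields all five assets. Adding a filter or order_by to the request dict changes the view without changing the loop.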

The filter language is doing more work than it looks

AIP-160 defines a filter expression syntax with a formal grammar. It reads like SQL WHERE clauses crossed with a structured logging query language.

asset_type = "aws_s3_bucket"
asset_type = "aws_s3_bucket" AND scope.region = "us-east-1"
create_time > "2026-01-01T00:00:00Z" AND severity >= MEDIUM
asset_type:"aws_*" AND NOT scope.account = "sandbox"
labels.environment = "prod" AND labels.team = "platform"

You get binary comparisons. Boolean composition with AND, OR, NOT. Substring and “has” semantics via :. Dot-notation traversal into nested fields, including JSON payloads. Function calls for extension, written as call(arg, arg).

That last point matters. The grammar leaves room for extension functions. When you genuinely need something fancier (geospatial, fuzzy matching, time bucketing), you add a function rather than a new endpoint.
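To show how a filter string turns into a predicate over nested records, here is a deliberately tiny sketch. It handles only a subset I picked for illustration: equality and inequality against double-quoted strings, joined by AND, with dot traversal. It is not the full AIP-160 grammar, it would mis-split a literal containing " AND ", and compile_filter and traverse are hypothetical names.

```python
import re

# One clause of the supported subset: path = "literal" or path != "literal".
_CLAUSE = re.compile(r'^\s*([A-Za-z_][\w.]*)\s*(!=|=)\s*"([^"]*)"\s*$')


def traverse(record, path):
    """Follow a dotted path into nested dicts; return None if any step is missing."""
    current = record
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def compile_filter(expr):
    """Compile 'a = "x" AND b.c != "y"' into a predicate over dict records."""
    clauses = []
    if expr.strip():
        for raw in re.split(r"\s+AND\s+", expr.strip()):
            match = _CLAUSE.match(raw)
            if match is None:
                raise ValueError(f"unsupported clause: {raw!r}")
            clauses.append(match.groups())

    def predicate(record):
        for path, op, literal in clauses:
            is_equal = traverse(record, path) == literal
            if is_equal != (op == "="):
                return False
        return True

    return predicate
```

An empty filter matches everything, which is the behavior you want from a List call with no filter set.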

In the platform I’m building, the same List method serves the main inventory table, a search bar, a drill-down from a chart, a saved report with a five-clause filter the user composed in the UI, and a scheduled export that reuses the saved filter string verbatim. It also serves the CLI when someone runs assetctl asset list-assets --filter '...', and the MCP server when an AI agent asks “show me high-severity S3 events in eu-west-1.” Same handler. Same parser. Same authorization. Different consumer.

There is one backend handler behind all of those. The variation lives entirely in a string the consumer builds.

If you’ve ever maintained a report builder feature on top of hand-coded endpoints with hard-wired filters per report, you know exactly how much accidental complexity this eliminates.

Hierarchical names are a free routing system

AIP-122 says resource names should reflect hierarchy. If an asset has revisions, and revisions have observations, that nesting belongs in the name.

assets/i-0abcd1234
assets/i-0abcd1234/revisions/2026-05-01T12:00:00Z
assets/i-0abcd1234/revisions/2026-05-01T12:00:00Z/observations/obs_42

The methods follow naturally:

rpc ListAssetRevisions(ListAssetRevisionsRequest)
    returns (ListAssetRevisionsResponse) {
  option (google.api.http) = {get: "/v1/{parent=assets/*}/revisions"};
}

The parent field on the request is itself a resource name. You don’t pass asset_id. You pass assets/i-0abcd1234. The handler parses the name, looks up the asset, lists its children. The URL never leaks implementation detail like a numeric ID.
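Parsing a resource name against its pattern is mechanical, which is why generators can emit helpers for it. A rough sketch of what such a helper does, with parse_name and parent_name as hypothetical names rather than anything a specific library ships:

```python
def parse_name(pattern, name):
    """Match a name against a pattern like 'assets/{asset}/revisions/{revision}'.

    Returns the variables as a dict, or raises ValueError on a mismatch.
    """
    pattern_parts = pattern.split("/")
    name_parts = name.split("/")
    if len(pattern_parts) != len(name_parts):
        raise ValueError(f"{name!r} does not match {pattern!r}")
    variables = {}
    for expected, actual in zip(pattern_parts, name_parts):
        if expected.startswith("{") and expected.endswith("}"):
            if not actual:
                raise ValueError(f"empty segment in {name!r}")
            variables[expected[1:-1]] = actual
        elif expected != actual:
            raise ValueError(f"{name!r} does not match {pattern!r}")
    return variables


def parent_name(name):
    """Drop the last collection/ID pair: the parent of a revision is its asset."""
    return "/".join(name.split("/")[:-2])
```

A ListAssetRevisions handler would run the parent through something like parse_name("assets/{asset}", parent) before touching storage, so a malformed parent fails early.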

Then comes the part I think is key. Wildcards.

In the platform I’m building, events are deeply nested.

plugins/{plugin}/streams/{stream}/events/{event}

A caller doesn’t always want events scoped to one plugin and one stream. Sometimes it wants every event in the system. So the ListEvents method accepts plugins/-/streams/- as a parent. The - stands in for any ID at that position, and AIP supports this natively (AIP-159, reading across collections).

From the consumer’s perspective:

parent = "plugins/-/streams/-"            // every event
parent = "plugins/aws/streams/-"          // every AWS event
parent = "plugins/aws/streams/cloudtrail" // one stream

One handler. Three views. No “give me a different endpoint” Slack message.
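The matching rule behind those three parents fits in a few lines. This is a sketch under my own assumptions: under_parent is a hypothetical name, names alternate collection and ID segments, and only ID segments may be wildcards. A real handler would push this into the storage query instead of scanning names.

```python
def under_parent(parent, name):
    """True if the resource name sits under the parent, with '-' matching any ID."""
    parent_parts = parent.split("/")
    name_parts = name.split("/")
    if len(name_parts) <= len(parent_parts):
        return False  # a child is always deeper than its parent
    for index, (wanted, actual) in enumerate(zip(parent_parts, name_parts)):
        is_id_segment = index % 2 == 1  # collection, ID, collection, ID, ...
        if wanted == actual or (is_id_segment and wanted == "-"):
            continue
        return False
    return True


# Hypothetical event names following plugins/{plugin}/streams/{stream}/events/{event}.
EVENTS = [
    "plugins/aws/streams/cloudtrail/events/e1",
    "plugins/aws/streams/config/events/e2",
    "plugins/gcp/streams/audit/events/e3",
]
```

Against that sample, plugins/-/streams/- selects all three events, plugins/aws/streams/- selects two, and plugins/aws/streams/cloudtrail selects one.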

Combine wildcards with filters and you have an unreasonable amount of expressive power for very little code. parent = "plugins/aws/streams/-" plus filter = "payload.event_source = 's3.amazonaws.com' AND severity = HIGH" is one HTTP call. Build a dashboard that fans this out across regions in parallel, or have the CLI page through it in a script, or let an AI agent ask for it through MCP. Same call, three contexts, no extra backend work.
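The dashboard fan-out is the same call issued once per region. A minimal sketch with asyncio, where several things are assumptions for illustration: fan_out and fake_list_events are hypothetical names, the fake stands in for a generated async client, and scope.region as a filterable field on events is my guess at a plausible field, not something the event schema above defines.

```python
import asyncio

BASE_FILTER = "payload.event_source = 's3.amazonaws.com' AND severity = HIGH"


async def fan_out(list_events, parent, base_filter, regions):
    """Issue one List call per region concurrently; return responses keyed by region."""

    async def one(region):
        request = {
            "parent": parent,
            "filter": f'{base_filter} AND scope.region = "{region}"',
        }
        return region, await list_events(request)

    pairs = await asyncio.gather(*(one(region) for region in regions))
    return dict(pairs)


async def fake_list_events(request):
    """Hypothetical client stub: echoes the request so the sent filter is visible."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"events": [], "parent": request["parent"], "filter": request["filter"]}
```

The only thing that differs between the calls is the filter string. The handler on the other side is the one every other consumer already uses.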

Protobuf is the schema, and the schema is the artifact

This is the part I want to spend the most time on, because this is where the leverage compounds.

I write everything in .proto. The proto is checked in. CI generates everything else. By the time my pull request merges, I have, all in lockstep:

A Go server with typed handlers. I implement them. That’s the only code I actually write by hand.

A Go client. I use it from other services and from my tests.

A TypeScript client. The frontend team imports it.

A typed Python client. We use it in glue scripts and notebooks.

An OpenAPI document generated from the same proto. Useful for partners, useful for AI tools, useful for anyone who wants a Swagger UI.

A complete CLI. More on this in a second.

An MCP server. Same.

A linter report from the AIP linter that fails CI if my List method is missing page_size, if my filter field is the wrong type, if my resource pattern doesn’t match its declared type. The mistakes you’d otherwise catch in code review get caught in CI.

The thing to internalize: I am writing one file and getting multiple outputs. Each output is type-safe, each output is consistent with the others, and each output updates automatically when I edit the proto. The cost of supporting a new consumer (today, an AI agent via MCP) is the cost of writing one generator, paid once.

ConnectRPC is the piece that makes this practical for the web. Plain gRPC in browsers has always been painful. gRPC-Web works but requires a proxy and has rough edges around streaming and headers. ConnectRPC speaks three protocols on the same handler. Native gRPC for service-to-service. gRPC-Web for legacy clients. A clean HTTP and JSON protocol for browsers. One server. One generated client per language. Works everywhere.
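To show how plain the HTTP and JSON side is, here is a sketch that builds (but does not send) a unary Connect call with nothing but the standard library: a POST to /<fully qualified service>/<method> with a JSON body. The service name assets.v1.AssetService and the address are placeholders I made up, since the proto package isn't shown above. connect_unary_request is a hypothetical helper, not part of any Connect library.

```python
import json
import urllib.request


def connect_unary_request(base_url, service, method, message):
    """Build a Connect-protocol unary request: POST /<service>/<method>, JSON body."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{service}/{method}",
        data=json.dumps(message).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Connect-Protocol-Version": "1",
        },
        method="POST",
    )


# Placeholder address and service name; proto3 JSON uses lowerCamelCase field names.
request = connect_unary_request(
    "http://localhost:8080",
    "assets.v1.AssetService",
    "ListAssets",
    {"pageSize": 50, "filter": 'asset_type = "aws_s3_bucket"'},
)
```

Passing that object to urllib.request.urlopen would send it. In practice you use the generated client, but it is useful to know there is nothing exotic underneath.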

The CLI you didn’t have to write

This is the consumer I personally get the most mileage out of.

einride/aip-cli-go is a protoc plugin that reads any AIP-shaped proto and emits a complete CLI. Not a stub. Not a scaffold. A working CLI with subcommands per method, flags per request field, type validation, server connection, bearer-token auth, verbose mode and more.

The configuration is a few lines in buf.gen.yaml:

version: v1
plugins:
  - name: go
    out: cmd/assetctl
    opt: module=example.com/cmd/assetctl

  - name: go-aip-cli
    out: cmd/assetctl
    strategy: all
    opt:
      - module=example.com/cmd/assetctl
      - root=assetctl

You run buf generate. You run go install ./cmd/assetctl. You now have:

$ assetctl help asset

Usage:
  assetctl asset [command]

Available Commands:
  batch-get-assets batch get assets
  get-asset        get an asset
  list-assets      list assets

Global Flags:
      --address string   address to connect to
      --insecure         make insecure client connection
      --token string     bearer token used by client
  -v, --verbose          print verbose output

I did not write that. I did not write any of that. The generator read my proto, saw the Get / List / BatchGet methods, saw the request fields, and emitted the whole command tree. The day I add UpdateAsset to the service, a new update-asset subcommand shows up the next time CI runs.

The CLI honors the filter language. assetctl asset list-assets --filter 'severity = HIGH' works because the underlying RPC accepts a filter string. The CLI honors pagination. --page-size 100 --page-token <token> works because the request has those fields. It honors authorization because it just calls the server, which already knows how to authenticate.

This matters for three reasons that aren’t obvious until you’ve shipped one.

First, it’s the smallest possible reproduction for any backend bug. When a customer says “the dashboard says X but the data says Y,” you can run the CLI against production, get the raw response, and stop guessing where the bug is. The CLI is the same client the frontend uses, just without the frontend.

Second, the CLI is the easiest path to script automation. “I want to export every failed run from last month as CSV” is a one-liner. The frontend team doesn’t need to build an export button. Customers who want bulk operations get them on day one.

Third, the CLI is the thing internal teams reach for first, because it’s faster than the UI for almost everything an engineer does. Day-two operations stop being “open the dashboard, click around, copy values” and start being a shell pipeline.

I built the platform’s CLI in an afternoon. I spend roughly zero hours per month maintaining it. Every API change updates it for free.

MCP is just another consumer

The other generator that earns its keep is the MCP one.

Model Context Protocol is the spec for connecting LLMs to tools. Every tool you want to expose to an AI agent gets registered with a name, a description, and a JSON schema for arguments. The shape of an MCP tool is almost identical to the shape of an AIP RPC. A method name. A typed input. A typed output. A description.

Which means you can generate an MCP server from the same proto, and the work is mostly mechanical: walk the service descriptors, map each RPC to an MCP tool, generate JSON schemas from the proto types, route incoming tool calls to the existing gRPC handler, return the response.
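A sketch of that mapping, heavily simplified. The SERVICE dictionary is a hand-written stand-in for real proto descriptors, the Service_Method tool naming is a convention I picked for the example, and to_mcp_tools and call_tool are hypothetical names. The name, description, and inputSchema keys are the fields an MCP tool definition carries.

```python
# Hypothetical, simplified stand-in for the proto service descriptors.
SERVICE = {
    "name": "AssetService",
    "methods": [
        {"name": "GetAsset", "doc": "Get an asset.", "input": {"name": "string"}},
        {
            "name": "ListAssets",
            "doc": "List assets.",
            "input": {
                "page_size": "int32",
                "page_token": "string",
                "filter": "string",
                "order_by": "string",
            },
        },
    ],
}

# Scalar proto types mapped to JSON Schema types (only the ones used here).
_JSON_TYPES = {"string": "string", "int32": "integer", "bool": "boolean", "double": "number"}


def to_mcp_tools(service):
    """Map each RPC to a tool definition with a JSON schema for its arguments."""
    tools = []
    for method in service["methods"]:
        properties = {
            field: {"type": _JSON_TYPES[proto_type]}
            for field, proto_type in method["input"].items()
        }
        tools.append({
            "name": f'{service["name"]}_{method["name"]}',
            "description": method["doc"],
            "inputSchema": {"type": "object", "properties": properties},
        })
    return tools


def call_tool(handlers, tool_name, arguments):
    """Route an incoming tool call to the handler that already serves the RPC."""
    if tool_name not in handlers:
        raise KeyError(f"unknown tool: {tool_name}")
    return handlers[tool_name](arguments)
```

A real generator would also handle nested messages, enums, and required fields, and would fold the FILTERABLE metadata into the tool description.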

In practice, an AI agent connected to the platform I’m building can do all of this through one MCP server:

“List all S3 buckets in us-east-1 that haven’t been accessed in 90 days.”
“Show me every IAM role with admin privileges in the prod account.”
“Get the lineage of asset assets/i-0abcd1234.”
“What checks failed for resource X yesterday?”

Every one of those becomes a ListAssets or ListCallResults call with the right filter, routed through the same handler the dashboard uses. The agent doesn’t need new endpoints. It needs the same query engine the frontend already has, with the same field-level access control, the same audit trail, the same rate limits. It gets all of that because it’s using the same ListAssets handler.

This is the part of the AI-tooling story that the “let’s bolt a chat interface onto our product” approach misses. The interface is easy. The hard part is exposing the actual capabilities of the system as composable primitives that an agent can reason about. If your API is already shaped like that, plugging in MCP is mostly a code generation problem.

The frontend, briefly

I’m not a frontend engineer. I pushed my team to use TanStack Query with Connect-Query because it’s the most mechanical match to the generated TypeScript client. Query keys derived from request messages. Caches that share entries across components. Generated hooks that the type system actually understands.

What I notice from the outside, watching the frontend team work, is that they stopped pinging me. The pattern they fell into is the obvious one once the API is shaped right. Need a new view? Compose a filter. Need to combine two views? Two parallel calls, merged in component state. Need to drill in? Pass the resource name to a route, the route component calls GetAsset with that name, the cache from the list call warms the detail page.

I don’t claim authority on the frontend implementation. I claim authority on the observation that the frontend team stopped asking me for new endpoints. The thing that produced that outcome was the API shape, not the rendering library. Pick a different React data layer and the result is the same, because the schema is what’s load-bearing.

The objections

Whenever I describe this stack, the same pushback shows up. Worth addressing each.

You’re doing more round-trips than a single custom endpoint. Sometimes, yes. In practice, far fewer than you’d think. Modern HTTP/2 multiplexes the requests over one connection. The backend reads from a fast columnar store, so each call is sub-100ms. Clients parallelize. When it genuinely matters (a giant report, a server-side join), that’s exactly when AIP-136’s custom methods earn their place. You pay the cost of a special-purpose endpoint only when the cost is justified, instead of as a default.

How do consumers discover what filters are legal? Proto descriptors are introspectable at runtime, and codegen surfaces them as types. The CLI tells you with --help because the generator can see which fields are FILTERABLE. The frontend can render a filter builder UI from the same metadata. The MCP server can describe filterable fields in its tool description so the agent knows what it’s allowed to ask for. One annotation, many consumers, same answer.

What about authorization? Filters are applied after the authorization check. That ordering is consistent with AIP-211, which asks services to check permissions before validating the rest of the request. The backend computes what the caller is allowed to see, then narrows further with the filter. You don’t filter your way into someone else’s data because the visibility set is computed from identity, not from the request body.
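The ordering is easy to show in miniature. In this sketch the visibility rule is an assumption made for the example (a caller may see records in accounts it belongs to), and list_visible is a hypothetical name. The point is only that the filter runs over the already-authorized set.

```python
def list_visible(records, caller_accounts, predicate):
    """Compute the visibility set from the caller's identity, then apply the filter."""
    # Step 1: authorization. Derived from who is calling, never from the filter.
    visible = [r for r in records if r["scope"]["account"] in caller_accounts]
    # Step 2: the caller-supplied filter can only narrow what is already visible.
    return [r for r in visible if predicate(r)]


RECORDS = [
    {"name": "assets/a1", "scope": {"account": "prod"}},
    {"name": "assets/a2", "scope": {"account": "sandbox"}},
]
```

A caller who only belongs to sandbox and filters for scope.account = "prod" gets an empty list, not someone else's asset.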

This sounds like REST. Sort of. With three crucial improvements. The schema is machine-readable end to end. The filter language is standardized so you’re not inventing query parameter conventions per endpoint. The transport gives you typed clients in every language for free. It’s REST that finally finished learning what it wanted to be.

Engineers will abuse the flexibility and write bad filters. Possibly. The backend stays in control. You can refuse filters that don’t hit an index. Rate-limit expensive queries. Set a hard timeout. Require a specific filter clause for certain resources. The proto annotations are the policy: if a field isn’t FILTERABLE, you can’t filter on it. It’s a fence, not a free-for-all.

The compounding effect

The thing that took me longest to appreciate is the compound interest.

Every new resource costs the same. Proto definition. One handler implementation. Free client. Free CLI subcommand. Free MCP tool. Free OpenAPI entry. The marginal cost of the Nth view drops to roughly zero, because every consumer is composition of List, a filter, and a name reference.

You also stop hoarding endpoints. New consumers you didn’t plan for (an internal tool, a partner integration, an AI agent) get the same surface every existing consumer uses. There is no “internal API” tier and “external API” tier that drift from each other. There’s one API, with one access policy, served three different ways.

The cultural effect on the team is the part you only notice in hindsight. Frontend engineers stop asking “can you add an endpoint” and start asking “what fields are filterable.” Backend engineers stop being a bottleneck for product velocity and start spending their time on data model and performance, which is where the actually hard problems live. Product managers stop scoping features around what’s cheap to build and start scoping them around what’s valuable. The constraint shifts from API surface area to data model, which is the right constraint.

Where I’ve landed

I’ve shipped products on hand-rolled REST, on GraphQL, on plain gRPC, on JSON-RPC, and on a few things I’d rather forget. Every one of those had moments where I thought it was fine. Then I’d come back two years later and find a hundred special-case endpoints, each one a tiny ledger entry of “we couldn’t model this generically at the time.”

The AIP plus ConnectRPC stack is the first time I’ve gone two years into a codebase and not found that drift. The surface stays small. The schema stays honest. New consumers land, sometimes consumers I didn’t anticipate when I designed the resource, and they get a clean target without any backend work. That happens because the primitives are general enough that someone connects two existing dots and ships a thing.

If you’re starting a new system this year, look seriously at this combination. Read AIP-121 and AIP-132. Stand up a service with ConnectRPC. Wire in einride/aip-cli-go and watch a real CLI fall out of your proto. Then try sketching an MCP server against the same descriptors and see how little code it takes.

It’s the first stack where I can’t see a reason to go back. Not because it’s new. Protobuf is older than Kubernetes. The reason is that the constraint it imposes is the right one. Model your nouns. Keep the verbs small. Let the filter language do the rest. Then generate every consumer you can imagine.

The endpoint you don’t have to write is the best kind.

For context, the platform I've been calling "the platform I'm building" is Linro. Most cloud security tools detect misconfigurations after they reach production. Linro evaluates every change at the control plane, the moment it happens, and blocks the harmful ones before they land. Humans, CI/CD, AI agents. Same policy layer for all of them. AIP is how we keep that surface tractable. One resource at a time.

Daniel Spangenberg
