Introduction
gRPC is a high-performance, open-source RPC framework originally developed by Google. It uses HTTP/2 for transport and Protocol Buffers as the interface definition language, enabling efficient, type-safe communication between services. As of 2026, gRPC v1.78+ is the preferred protocol for microservices communication, powering systems at Google, Netflix, Square, and thousands of other organizations — with the latest stable release at v1.81 (gRPC-Go, April 2026).
This guide covers gRPC architecture, Protocol Buffers, service definitions, streaming patterns, production implementation, browser support via gRPC-Web, xDS-based proxyless service mesh, and the broader ecosystem including gRPC-gateway, Spring gRPC, and ConnectRPC. Understanding gRPC is essential for developers building modern distributed systems.
What is gRPC?
gRPC (gRPC Remote Procedure Calls) is a framework that lets client applications call methods on a remote server as if they were methods on a local object. Unlike REST APIs, gRPC provides strongly-typed contracts through .proto files and efficient binary serialization through Protocol Buffers. It has been a Cloud Native Computing Foundation (CNCF) project since 2017, when it joined at the incubating level.
Key features
HTTP/2 transport: Multiplexed connections over a single TCP socket, header compression via HPACK, and bidirectional streaming capabilities.
Protocol Buffers: Binary serialization that is often cited as 6-10x faster than JSON parsing, with payloads up to 10x smaller than equivalent JSON; actual gains depend on message shape and language runtime.
Code generation: protoc generates idiomatic client and server stubs in 15+ languages including Go, Python, Java, C++, C#, Ruby, and TypeScript.
Four RPC types: Unary (request-response), server streaming, client streaming, and bidirectional streaming.
Interceptors: Middleware for cross-cutting concerns — authentication, logging, metrics, retries, and rate limiting.
Pluggable auth: Built-in support for SSL/TLS, token-based credentials, and composite credential chains.
Use cases
- Microservices communication (the dominant use case)
- Mobile and IoT backend services
- Real-time streaming feeds (notifications, logs, metrics)
- Polyglot distributed systems
- Internal API gateways
- Service mesh data plane (proxyless xDS mode)
gRPC is not typically used for public-facing browser APIs — for that, use gRPC-Web or the gRPC-gateway (covered later in this guide).
Protocol Buffers
Protocol Buffers (proto3) are Google’s language-neutral, platform-neutral mechanism for serializing structured data. The .proto file defines both the data schema and the service contract.
Basic message
Define a Person message with scalar fields, an enum, and a nested message type:
syntax = "proto3";
message Person {
string name = 1;
int32 age = 2;
string email = 3;
enum PhoneType {
MOBILE = 0;
HOME = 1;
WORK = 2;
}
message PhoneNumber {
string number = 1;
PhoneType type = 2;
}
repeated PhoneNumber phones = 4;
}
Field types
Proto3 provides scalar types for integers, floats, booleans, and strings, plus complex types for collections and mappings:
// Scalar types
int32, int64, uint32, uint64, sint32, sint64 // Variable-length integers
fixed32, fixed64, sfixed32, sfixed64 // Fixed-size integers
float, double // Floating point
bool // Boolean
string // UTF-8 text
bytes // Raw byte sequence
// Complex types
enum Status { UNKNOWN = 0; ACTIVE = 1; }
message Address {}
// Collections and mappings
repeated string tags = 1; // Array / list
map<string, string> metadata = 2; // Dictionary
oneof payload { // Union (only one field set)
string text = 3;
bytes data = 4;
}
Use oneof for mutually exclusive fields and sint32/sint64 for fields that may contain negative numbers (more efficient encoding than int32/int64).
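To see why sint32 helps with negative values, the following is a minimal pure-Python sketch of the varint and ZigZag encodings described in the Protocol Buffers encoding documentation. The function names are illustrative and not part of any protobuf library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int32(n: int) -> bytes:
    """int32: negative values are sign-extended to 64 bits before encoding."""
    return encode_varint(n & 0xFFFFFFFFFFFFFFFF)


def encode_sint32(n: int) -> bytes:
    """sint32: ZigZag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... first."""
    return encode_varint(((n << 1) ^ (n >> 31)) & 0xFFFFFFFF)


for n in (1, -1, -300):
    print(n, "int32 bytes:", len(encode_int32(n)),
          "sint32 bytes:", len(encode_sint32(n)))
```

A negative int32 always occupies ten bytes on the wire, while the same value as sint32 needs only as many bytes as its magnitude requires.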
Field numbering and evolution
Field numbers 1-15 use 1 byte in the wire format. Reserve them for frequently occurring fields. Numbers 16-2047 use 2 bytes. Never reuse a field number after removing a field:
message User {
reserved 2, 15, 20 to 30; // Cannot reuse these numbers
reserved "deprecated_field"; // Cannot reuse this name
string id = 1;
string name = 3; // Field 2 was removed, number reserved
int64 created_at = 4;
}
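The 1-byte and 2-byte boundaries follow from how a field's tag is encoded: the field number is shifted left by three bits, combined with the wire type, and written as a varint. A small sketch, with illustrative helper names, that checks the boundaries mentioned above:

```python
def varint_size(value: int) -> int:
    """Number of bytes a non-negative integer occupies as a varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def tag_size(field_number: int, wire_type: int = 0) -> int:
    """Size in bytes of the tag that precedes a field's value."""
    return varint_size((field_number << 3) | wire_type)


for number in (1, 15, 16, 2047, 2048):
    print(f"field {number}: {tag_size(number)} byte(s)")
```

Each varint byte carries seven bits of payload and three of those are taken by the wire type, so only four bits remain for the field number in a one-byte tag.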
Defining services
A gRPC service declares RPC methods, each with one of four streaming modes:
service UserService {
// Unary RPC: single request, single response
rpc GetUser (UserRequest) returns (User);
// Server streaming: single request, stream of responses
rpc ListUsers (UserRequest) returns (stream User);
// Client streaming: stream of requests, single response
rpc CreateUsers (stream User) returns (UserResponse);
// Bidirectional streaming: stream of requests, stream of responses
rpc Chat (stream ChatMessage) returns (stream ChatMessage);
}
message UserRequest {
string user_id = 1;
}
message User {
string id = 1;
string name = 2;
string email = 3;
int64 created_at = 4;
}
message UserResponse {
bool success = 1;
string message = 2;
int32 created_count = 3;
}
message ChatMessage {
string sender_id = 1;
string content = 2;
int64 timestamp = 3;
}
Compilation
Install the Protocol Buffers compiler and generate stubs for your target language:
# macOS
brew install protobuf
# Ubuntu / Debian
sudo apt install protobuf-compiler
# Compile proto file with gRPC plugin
protoc --proto_path=src \
--proto_path=third_party \
--go_out=generated \
--go-grpc_out=generated \
src/user_service.proto
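# Python (uses the grpcio-tools package, which bundles protoc and the gRPC Python plugin)
python -m grpc_tools.protoc --proto_path=src \
--python_out=generated \
--grpc_python_out=generated \
src/user_service.proto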
# JavaScript (for gRPC-Web)
protoc --js_out=import_style=commonjs:generated \
--grpc-web_out=import_style=typescript,mode=grpcweb:generated \
src/user_service.proto
Proto3 style guidelines
- Use snake_case for field names (converted to camelCase in generated Java/TypeScript)
- Use PascalCase for message and service names
- Keep messages focused: one message type per concern
- Use reserved to prevent accidental reuse of removed field numbers
- Version your proto files with package declarations: package user.v1;
Service Types
Unary RPC
The classic request-response pattern. Use for queries, lookups, and commands where the result size is predictable.
import grpc
from concurrent import futures

import user_service_pb2
import user_service_pb2_grpc

# Server implementation
class UserServiceServicer(user_service_pb2_grpc.UserServiceServicer):
    def GetUser(self, request, context):
        # get_user_from_db is an application-specific lookup
        user = get_user_from_db(request.user_id)
        return user_service_pb2.User(
            id=user.id,
            name=user.name,
            email=user.email,
            created_at=user.created_at
        )

# Start the server
server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
user_service_pb2_grpc.add_UserServiceServicer_to_server(
    UserServiceServicer(), server
)
server.add_insecure_port('[::]:50051')
server.start()
server.wait_for_termination()

# Client (runs in a separate process)
channel = grpc.insecure_channel('localhost:50051')
stub = user_service_pb2_grpc.UserServiceStub(channel)
response = stub.GetUser(
    user_service_pb2.UserRequest(user_id='123')
)
print(response.name, response.email)
Server streaming
The server sends a sequence of responses. Use for paginated results, event feeds, or real-time notifications.
# Server
class NotificationServiceServicer(
    notification_service_pb2_grpc.NotificationServiceServicer
):
    def StreamNotifications(self, request, context):
        for notification in get_notifications_stream(request.user_id):
            yield notification_service_pb2.Notification(
                id=notification.id,
                message=notification.message,
                timestamp=notification.timestamp
            )

# Client
stub = notification_service_pb2_grpc.NotificationServiceStub(channel)
stream = stub.StreamNotifications(
    notification_service_pb2.NotificationRequest(user_id='123')
)
for notification in stream:
    print(f"Notification: {notification.message}")
Client streaming
The client sends a sequence of requests, and the server responds once. Use for batch uploads, log ingestion, or metrics submission.
import random
import time

# Server
class MetricsServiceServicer(metrics_service_pb2_grpc.MetricsServiceServicer):
    def SubmitMetrics(self, request_iterator, context):
        total_count = 0
        for metric in request_iterator:
            store_metric(metric)
            total_count += 1
        return metrics_service_pb2.MetricsResponse(
            success=True,
            processed_count=total_count
        )

# Client
def generate_metrics():
    for i in range(100):
        yield metrics_service_pb2.Metric(
            name='cpu_usage',
            value=random.random() * 100,
            timestamp=int(time.time())  # integer seconds, safe for int64 fields
        )

response = stub.SubmitMetrics(generate_metrics())
print(f"Processed: {response.processed_count}")
Bidirectional streaming
Both sides send independent streams of messages. Use for chat systems, real-time collaboration, or long-lived data exchange pipelines.
# Server
class ChatServiceServicer(chat_service_pb2_grpc.ChatServiceServicer):
    def StreamMessages(self, request_iterator, context):
        for message in request_iterator:
            response = process_and_respond(message)
            yield response

# Client
def send_messages():
    for msg in ['hello', 'how', 'are', 'you']:
        yield chat_service_pb2.ChatMessage(
            sender_id='user1',
            content=msg
        )

stream = stub.StreamMessages(send_messages())
for response in stream:
    print(f"Server: {response.content}")
Metadata and Authentication
Metadata
Metadata are key-value pairs transmitted as HTTP/2 headers. Use them for authentication tokens, request IDs, and tracing context.
# Server: reading metadata
def GetUser(self, request, context):
    metadata = dict(context.invocation_metadata())
    auth_token = metadata.get('authorization', '')
    if not validate_token(auth_token):
        context.abort(grpc.StatusCode.UNAUTHENTICATED, 'Invalid token')
    return user_service_pb2.User(...)

# Client: sending metadata
response = stub.GetUser(
    user_service_pb2.UserRequest(user_id='123'),
    metadata=[
        ('authorization', f'Bearer {access_token}'),
        ('x-request-id', request_id),
    ]
)
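The authorization value usually carries a scheme prefix such as Bearer, so the server has to separate the scheme from the token before validating it. The helper below is a hypothetical sketch of that step (extract_bearer_token is not a gRPC API); it works on the list of key-value pairs that invocation metadata provides and uses only the standard library.

```python
from typing import Iterable, Optional, Tuple


def extract_bearer_token(metadata: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the token from an 'authorization: Bearer <token>' entry, if any."""
    for key, value in metadata:
        if key.lower() != "authorization":
            continue
        scheme, _, token = value.partition(" ")
        if scheme.lower() == "bearer" and token.strip():
            return token.strip()
    return None


print(extract_bearer_token([("x-request-id", "42"),
                            ("authorization", "Bearer abc.def")]))
```

Splitting on the first space avoids the pitfall of a plain string replace, which would also accept values that merely contain the word Bearer somewhere in the token.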
Credentials
gRPC supports several credential types that compose via grpc.composite_channel_credentials:
# Token-based call credentials
access_creds = grpc.access_token_call_credentials('your-access-token')
# SSL/TLS channel credentials
with open('client.crt', 'rb') as f:
    client_cert = f.read()
with open('client.key', 'rb') as f:
    client_key = f.read()
ssl_creds = grpc.ssl_channel_credentials(
    root_certificates=None,  # None falls back to the default root certificates
    private_key=client_key,
    certificate_chain=client_cert
)
# Combine TLS channel credentials (with a client certificate) and per-call token credentials
composite_creds = grpc.composite_channel_credentials(
    ssl_creds, access_creds
)
channel = grpc.secure_channel('api.example.com:443', composite_creds)
In production, always use TLS. Never use grpc.insecure_channel() outside of local development.
Interceptors
Interceptors are gRPC’s middleware pattern. They wrap RPC invocations to add cross-cutting behavior without modifying service logic.
Server-side logging interceptor
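Log each unary method call with its duration:
import time
from concurrent import futures
import grpc

class LoggingInterceptor(grpc.ServerInterceptor):
    def intercept_service(self, continuation, handler_call_details):
        method = handler_call_details.method
        handler = continuation(handler_call_details)
        # continuation() only looks up the handler; the RPC itself runs later,
        # so the timing has to wrap the handler's behavior (unary-unary only here).
        if handler is None or not handler.unary_unary:
            return handler

        def timed(request, context):
            start = time.time()
            try:
                return handler.unary_unary(request, context)
            finally:
                print(f"[gRPC] {method} completed in {time.time() - start:.3f}s")

        return grpc.unary_unary_rpc_method_handler(
            timed,
            request_deserializer=handler.request_deserializer,
            response_serializer=handler.response_serializer,
        )

server = grpc.server(
    futures.ThreadPoolExecutor(max_workers=10),
    interceptors=[LoggingInterceptor()]
)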
Server-side auth interceptor
Reject unauthenticated requests before they reach the handler:
class AuthInterceptor(grpc.ServerInterceptor):
    def __init__(self, auth_func):
        self.auth_func = auth_func

    def intercept_service(self, continuation, handler_call_details):
        metadata = dict(handler_call_details.invocation_metadata)
        token = metadata.get('authorization', '').replace('Bearer ', '')
        if not self.auth_func(token):
            return grpc.unary_unary_rpc_method_handler(
                lambda request, context: context.abort(
                    grpc.StatusCode.UNAUTHENTICATED,
                    'Invalid or expired token'
                )
            )
        return continuation(handler_call_details)
Client-side retry interceptor
Retry with exponential backoff on transient failures:
class RetryInterceptor(grpc.UnaryUnaryClientInterceptor):
    def __init__(self, max_retries=3):
        self.max_retries = max_retries

    def intercept_unary_unary(self, continuation, client_call_details, request):
        call = None
        for attempt in range(self.max_retries):
            # continuation() returns a call object instead of raising,
            # so inspect its status code to decide whether to retry.
            call = continuation(client_call_details, request)
            if call.code() != grpc.StatusCode.UNAVAILABLE:
                return call
            if attempt < self.max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, ...
        return call  # Still UNAVAILABLE; the stub raises it to the caller
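Fixed exponential delays can make many clients retry in lockstep after an outage, so backoff is commonly combined with random jitter. The sketch below shows that idea in isolation, using only the standard library. TransientError and call_with_backoff are hypothetical names standing in for an UNAVAILABLE failure and a retry wrapper; they are not gRPC APIs.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable (UNAVAILABLE) failure."""


def call_with_backoff(fn, max_attempts=3, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying transient failures with capped, jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * rng())  # sleep a random fraction of the delay


# Demo: fail twice, then succeed; record delays instead of really sleeping.
state = {"calls": 0}
recorded = []

def flaky_call():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientError("backend unavailable")
    return "ok"

print(call_with_backoff(flaky_call, sleep=recorded.append), recorded)
```

Injecting sleep and rng keeps the helper testable without waiting on real timers.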
Error Handling
gRPC uses canonical error codes following the model of HTTP status codes but with richer semantics for distributed systems.
# Server: returning structured errors
def GetUser(self, request, context):
    user = find_user(request.user_id)
    if not user:
        context.set_code(grpc.StatusCode.NOT_FOUND)
        context.set_details(f"User {request.user_id} not found")
        return user_service_pb2.User()
    if not user.is_active:
        context.set_code(grpc.StatusCode.FAILED_PRECONDITION)
        context.set_details("Account is deactivated")
        return user_service_pb2.User()
    return user_service_pb2.User(...)

# Client: handling errors
try:
    response = stub.GetUser(user_service_pb2.UserRequest(user_id='999'))
except grpc.RpcError as e:
    if e.code() == grpc.StatusCode.NOT_FOUND:
        print("User not found, creating...")
    elif e.code() == grpc.StatusCode.UNAUTHENTICATED:
        print("Authentication required, refreshing token...")
    elif e.code() == grpc.StatusCode.DEADLINE_EXCEEDED:
        print("Request timed out, check backend latency")
    else:
        print(f"gRPC error {e.code()}: {e.details()}")
Common gRPC status codes:
| Code | gRPC Constant | When it occurs |
|---|---|---|
| 2 | UNKNOWN | Generic server error |
| 4 | DEADLINE_EXCEEDED | Client timeout |
| 5 | NOT_FOUND | Resource not found |
| 7 | PERMISSION_DENIED | Authenticated but not authorized |
| 8 | RESOURCE_EXHAUSTED | Rate limit or quota exceeded |
| 14 | UNAVAILABLE | Service down or transient failure |
| 16 | UNAUTHENTICATED | Missing or invalid credentials |
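To make the analogy with HTTP concrete, the sketch below pairs the codes from the table with the HTTP statuses listed in the comments of Google's google.rpc.Code definition. The StatusCode enum here is a local stand-in written for this example, not the grpc.StatusCode class.

```python
from enum import IntEnum


class StatusCode(IntEnum):
    """Numeric values of the status codes listed in the table above."""
    UNKNOWN = 2
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    UNAVAILABLE = 14
    UNAUTHENTICATED = 16


# Closest HTTP status for each code, as documented for google.rpc.Code.
HTTP_EQUIVALENT = {
    StatusCode.UNKNOWN: 500,
    StatusCode.DEADLINE_EXCEEDED: 504,
    StatusCode.NOT_FOUND: 404,
    StatusCode.PERMISSION_DENIED: 403,
    StatusCode.RESOURCE_EXHAUSTED: 429,
    StatusCode.UNAVAILABLE: 503,
    StatusCode.UNAUTHENTICATED: 401,
}

for code, http_status in HTTP_EQUIVALENT.items():
    print(f"{code.value:>2} {code.name:<20} ~ HTTP {http_status}")
```

The mapping is approximate in the other direction: several gRPC codes share one HTTP status, which is part of what "richer semantics" means in practice.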
Advanced Production Patterns
Server configuration with keepalive
Configure connection management to prevent resource leaks and idle connections:
server_options = [
# Keepalive pings every 30 seconds
('grpc.keepalive_time_ms', 30000),
# Wait 10 seconds for keepalive response
('grpc.keepalive_timeout_ms', 10000),
# Allow keepalive even without active RPCs
('grpc.keepalive_permit_without_calls', True),
# Reject pings more frequent than once per 30 seconds
('grpc.http2.min_ping_interval_without_data_ms', 30000),
# Close connection after 5 minutes idle
('grpc.max_connection_idle_ms', 300000),
# Recycle connections after 10 minutes
('grpc.max_connection_age_ms', 600000),
# Grace period before forced close
('grpc.max_connection_age_grace_ms', 30000),
]
server = grpc.server(
futures.ThreadPoolExecutor(max_workers=50),
options=server_options
)
# Client-side keepalive
channel_options = [
('grpc.keepalive_time_ms', 30000),
('grpc.keepalive_timeout_ms', 10000),
('grpc.keepalive_permit_without_calls', True),
]
channel = grpc.insecure_channel('server:50051', options=channel_options)
Deadline propagation
Set client deadlines and propagate remaining time to downstream calls:
# Client: set deadline
response = stub.GetUser(
    user_service_pb2.UserRequest(user_id='123'),
    timeout=5.0  # seconds
)

# Server: check remaining time before making downstream call
class UserServiceServicer(user_service_pb2_grpc.UserServiceServicer):
    def GetUser(self, request, context):
        # time_remaining() returns None when the client set no deadline
        remaining = context.time_remaining()
        if remaining is not None and remaining < 0.5:
            context.abort(
                grpc.StatusCode.DEADLINE_EXCEEDED,
                "Not enough time to process request"
            )
        # Propagate the remaining budget, minus a safety margin, downstream
        downstream_timeout = (
            None if remaining is None else max(0.1, remaining - 0.2)
        )
        downstream_response = downstream_stub.GetSettings(
            request,
            timeout=downstream_timeout
        )
        return build_response(downstream_response)
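The budget arithmetic is easier to reason about when it is pulled out of the handler. The following is a standalone sketch of that idea using only the standard library; the Deadline class and its method names are assumptions made for this example, and the 0.2 s margin and 0.1 s floor simply mirror the values used above.

```python
import time


class Deadline:
    """Tracks an absolute deadline and hands out timeouts for child calls."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + timeout_s

    def remaining(self) -> float:
        """Seconds left before the deadline, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def child_timeout(self, margin_s: float = 0.2, floor_s: float = 0.1) -> float:
        """Timeout for a downstream call: remaining time minus a safety margin."""
        return max(floor_s, self.remaining() - margin_s)


deadline = Deadline(5.0)
print(f"remaining: {deadline.remaining():.2f}s, "
      f"downstream timeout: {deadline.child_timeout():.2f}s")
```

Using a monotonic clock keeps the budget correct even if the system wall clock is adjusted while a request is in flight.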
Health checking protocol
The gRPC Health Checking Protocol lets clients and load balancers determine service readiness:
// Standard health check service
service Health {
rpc Check(HealthCheckRequest) returns (HealthCheckResponse);
rpc Watch(HealthCheckRequest) returns (stream HealthCheckResponse);
}
message HealthCheckRequest {
string service = 1;
}
message HealthCheckResponse {
enum ServingStatus {
UNKNOWN = 0;
SERVING = 1;
NOT_SERVING = 2;
SERVICE_UNKNOWN = 3; // Used only by Watch
}
ServingStatus status = 1;
}
The Python grpcio-health-checking package ships a ready-made implementation of this service:
from grpc_health.v1 import health, health_pb2, health_pb2_grpc
# Create health servicer
health_servicer = health.HealthServicer()
health_servicer.set(
'user.UserService',
health_pb2.HealthCheckResponse.SERVING
)
# Add to server
health_pb2_grpc.add_HealthServicer_to_server(health_servicer, server)
Server reflection
The Server Reflection Protocol allows clients to discover available services and methods at runtime without requiring the proto file:
from grpc_reflection.v1alpha import reflection
# Enable reflection
service_names = [
user_service_pb2.DESCRIPTOR.services_by_name['UserService'].full_name,
reflection.SERVICE_NAME,
]
reflection.enable_server_reflection(service_names, server)
Tools like grpcurl use reflection to inspect and call services:
grpcurl -plaintext localhost:50051 list
grpcurl -plaintext localhost:50051 describe user.UserService
grpcurl -plaintext -d '{"user_id": "123"}' localhost:50051 user.UserService/GetUser
Connection pooling
Reuse gRPC channels and stubs — a single channel handles thousands of concurrent RPCs through HTTP/2 multiplexing:
# DO NOT create a channel per request (expensive TLS + HTTP/2 setup)
# BAD:
for request in requests:
    channel = grpc.insecure_channel('server:50051')  # Costly
    stub = UserServiceStub(channel)
    response = stub.GetUser(request)

# GOOD: reuse channel and stub
channel = grpc.insecure_channel('server:50051')
stub = UserServiceStub(channel)
for request in requests:
    response = stub.GetUser(request)
Compression
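Enable compression for bandwidth-sensitive workloads. The Python API exposes gzip and deflate (grpc.Compression.Gzip and grpc.Compression.Deflate); support for other algorithms depends on the language implementation: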
# Enable compression on the channel
channel = grpc.insecure_channel(
'server:50051',
options=[('grpc.default_compression_algorithm', 2)] # 2 = gzip
)
# Or per-call
response = stub.GetUser(
request,
compression=grpc.Compression.Gzip
)
For high-throughput systems, benchmark with compression disabled — the CPU cost can outweigh bandwidth savings for small payloads under 1 KB.
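The small-payload effect is easy to observe with the standard library alone. The sketch below compresses two made-up JSON-like byte strings with gzip; it measures only output size, not CPU time, and real protobuf payloads will behave somewhat differently.

```python
import gzip

# Illustrative payloads: one tiny message and one large, repetitive batch.
small = b'{"user_id":"123"}'
large = b'{"name":"cpu_usage","value":42.0}' * 500

for label, payload in (("small", small), ("large", large)):
    compressed = gzip.compress(payload)
    print(f"{label}: {len(payload)} bytes -> {len(compressed)} bytes")
```

The gzip header and trailer alone add 18 bytes, so a message of a few dozen bytes grows when compressed, while the repetitive batch shrinks substantially.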
Load Balancing
Client-side load balancing
gRPC’s native load balancing uses the round_robin service config. Do not manually create channel lists:
# Correct: use service config for round-robin
channel = grpc.insecure_channel(
'localhost:50051',
options=[(
'grpc.service_config',
'{"loadBalancingConfig": [{"round_robin": {}}]}'
)]
)
For DNS-based load balancing, use the dns resolver with multiple A records. For service mesh environments, use the xds resolver (see the xDS section below).
Service discovery via DNS
gRPC natively resolves DNS names. Point your channel to a DNS name with multiple A records:
channel = grpc.insecure_channel(
'user-service.example.com:50051',
options=[(
'grpc.service_config',
'{"loadBalancingConfig": [{"round_robin": {}}]}'
)]
)
Each A record is treated as a distinct backend. gRPC creates a subchannel per address and distributes RPCs according to the load balancing policy.
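Conceptually, the round_robin policy cycles through the resolved addresses so that each backend receives an equal share of calls. The class below is a simplified model of that behavior, not gRPC's actual picker; the addresses are made up, and the real policy also skips subchannels that are not in the READY state.

```python
import itertools
from collections import Counter


class RoundRobinPicker:
    """Hands out backend addresses in a fixed rotation."""

    def __init__(self, addresses):
        if not addresses:
            raise ValueError("at least one address is required")
        self._cycle = itertools.cycle(list(addresses))

    def pick(self) -> str:
        return next(self._cycle)


# Pretend DNS returned three A records for the service name.
picker = RoundRobinPicker(["10.0.0.1:50051", "10.0.0.2:50051", "10.0.0.3:50051"])
counts = Counter(picker.pick() for _ in range(9))
print(counts)
```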
gRPC-Web: Browser Support
Browsers cannot speak native gRPC because they lack access to raw HTTP/2 frames and trailers. gRPC-Web is the official solution — a JavaScript client library and proxy spec that bridges this gap.
How it works
A proxy (typically Envoy) translates between the gRPC-Web wire format (HTTP/1.1 or HTTP/2) and standard gRPC (HTTP/2). The JavaScript client sends protobuf-encoded messages through the proxy.
Envoy proxy configuration
static_resources:
  listeners:
  - name: grpc_web_listener
    address:
      socket_address: { address: 0.0.0.0, port_value: 8080 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          codec_type: AUTO
          stat_prefix: grpc_web
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route:
                  cluster: grpc_backend
          # Recent Envoy releases expect a typed_config on every HTTP filter
          http_filters:
          - name: envoy.filters.http.grpc_web
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: grpc_backend
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}
    load_assignment:
      cluster_name: grpc_backend
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: grpc-server
                port_value: 50051
JavaScript client
import { GrpcWebClientImpl } from './generated/user_service.client';
import { UserRequest } from './generated/user_service';
const client = new GrpcWebClientImpl({
baseUrl: 'https://api.example.com',
});
const response = await client.getUser(
UserRequest.create({ userId: '123' })
);
console.log(response.name, response.email);
Limitations
- Only unary and server-streaming RPCs are supported; client streaming and bidirectional streaming are not available from browsers (the Fetch API does not support streaming request bodies)
- Requires a proxy (Envoy, gRPC-web Go proxy, or ASP.NET Core middleware)
- gRPC-Web trailers are sent in the response body rather than as HTTP/2 trailing headers
Akamai, Cloudflare, and other CDNs added gRPC-Web passthrough support in 2025-2026, making it viable for production-scale web applications.
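Sending trailers in the response body works because gRPC-Web reuses gRPC's length-prefixed framing: each frame starts with one flag byte and a 4-byte big-endian length, and the most significant bit of the flag byte marks a trailers frame. The sketch below illustrates that layout as described in the gRPC-Web protocol document. The function names and sample payload are made up for the example, and compression and base64 (grpc-web-text) handling are omitted.

```python
import struct

TRAILER_FLAG = 0x80  # most significant bit of the flag byte marks trailers


def encode_frame(payload: bytes, trailers: bool = False) -> bytes:
    """Prefix a payload with the 1-byte flag and 4-byte big-endian length."""
    flags = TRAILER_FLAG if trailers else 0x00
    return struct.pack(">BI", flags, len(payload)) + payload


def decode_frames(body: bytes):
    """Split a response body into (is_trailers, payload) tuples."""
    frames = []
    offset = 0
    while offset < len(body):
        flags, length = struct.unpack_from(">BI", body, offset)
        offset += 5
        frames.append((bool(flags & TRAILER_FLAG), body[offset:offset + length]))
        offset += length
    return frames


# One data frame followed by a trailers frame carrying the status.
body = encode_frame(b"\x0a\x03Ada") + encode_frame(b"grpc-status: 0\r\n", trailers=True)
for is_trailers, payload in decode_frames(body):
    print("trailers" if is_trailers else "message", payload)
```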
gRPC-Gateway: REST Transcoding
The gRPC-Gateway project generates a reverse-proxy server that exposes gRPC services as RESTful JSON APIs. This lets you serve both gRPC and REST from the same proto definition.
Annotate your proto
Add google.api.http annotations to your RPC methods:
import "google/api/annotations.proto";
service UserService {
rpc GetUser(UserRequest) returns (User) {
option (google.api.http) = {
get: "/v1/users/{user_id}"
};
}
rpc CreateUser(CreateUserRequest) returns (User) {
option (google.api.http) = {
post: "/v1/users"
body: "user"
};
}
rpc ListUsers(ListUsersRequest) returns (ListUsersResponse) {
option (google.api.http) = {
get: "/v1/users"
};
}
}
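At request time the gateway matches the HTTP method and path against these templates and copies path variables such as {user_id} into the request message. The sketch below imitates that matching step in plain Python to show the idea. It is a simplified model rather than gRPC-Gateway's implementation: compile_template and match_route are hypothetical helpers, and wildcards and nested field paths are not handled.

```python
import re

# (HTTP method, path template, RPC name), mirroring the annotations above.
ROUTES = [
    ("GET", "/v1/users/{user_id}", "GetUser"),
    ("POST", "/v1/users", "CreateUser"),
    ("GET", "/v1/users", "ListUsers"),
]


def compile_template(template: str) -> "re.Pattern[str]":
    """Turn '/v1/users/{user_id}' into a regex with a named group."""
    parts = re.split(r"\{(\w+)\}", template)  # literal, name, literal, ...
    pattern = ""
    for index, part in enumerate(parts):
        pattern += f"(?P<{part}>[^/]+)" if index % 2 else re.escape(part)
    return re.compile(f"^{pattern}$")


def match_route(method: str, path: str, routes=ROUTES):
    """Return (rpc_name, path_fields), or None when no route matches."""
    for route_method, template, rpc_name in routes:
        if route_method != method:
            continue
        found = compile_template(template).match(path)
        if found:
            return rpc_name, found.groupdict()
    return None


print(match_route("GET", "/v1/users/123"))
print(match_route("GET", "/v1/users"))
```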
Generate the gateway
# Install the gateway generator
go install github.com/grpc-ecosystem/grpc-gateway/v2/protoc-gen-grpc-gateway@latest
# Generate the gateway code
protoc -I . --grpc-gateway_out . \
--grpc-gateway_opt logtostderr=true \
--grpc-gateway_opt paths=source_relative \
user_service.proto
Run the gateway alongside your gRPC server
package main
import (
"context"
"net/http"
"github.com/grpc-ecosystem/grpc-gateway/v2/runtime"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials/insecure"
gw "path/to/generated/gateway"
)
func main() {
ctx := context.Background()
mux := runtime.NewServeMux()
opts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
}
err := gw.RegisterUserServiceHandlerFromEndpoint(
ctx, mux, "localhost:50051", opts,
)
if err != nil {
panic(err)
}
http.ListenAndServe(":8080", mux)
}
gRPC-Gateway also generates OpenAPI v2 specifications, which you can use to generate JavaScript, TypeScript, or Swift clients.
xDS and Proxyless Service Mesh
Since version 1.30 (2020), gRPC has supported the xDS APIs, the same discovery APIs that Envoy uses. This enables a proxyless service mesh: gRPC applications communicate directly with the control plane, eliminating the need for sidecar proxies.
How proxyless gRPC works
Instead of routing traffic through an Envoy sidecar, the gRPC library itself implements xDS clients for service discovery, load balancing, traffic routing, and health checking. The control plane (Istio, Google Traffic Director, or a custom implementation) pushes configuration directly to gRPC clients via xDS.
sequenceDiagram
participant CP as Control Plane (xDS)
participant A as Service A (gRPC Client)
participant B as Service B (gRPC Server)
participant C as Service C (gRPC Server)
A->>CP: LDS: Listener Discovery
CP-->>A: Listener config
A->>CP: RDS: Route Discovery
CP-->>A: Route config
A->>CP: CDS: Cluster Discovery
CP-->>A: Cluster config (B & C)
A->>CP: EDS: Endpoint Discovery
CP-->>A: Endpoint list (B:10.0.0.1, C:10.0.0.2)
A->>B: RPC (via xDS load balancing)
A->>C: RPC (round-robin)
Enable xDS in gRPC
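package main

import (
	"context"
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	_ "google.golang.org/grpc/xds" // Register xDS resolvers and balancers

	pb "path/to/generated/user" // placeholder path for the generated stubs
)

func main() {
	ctx := context.Background()
	// Use xds:/// scheme instead of dns:///
	conn, err := grpc.DialContext(
		ctx,
		"xds:///user-service.default.svc.cluster.local:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	stub := pb.NewUserServiceClient(conn)
	response, err := stub.GetUser(ctx, &pb.UserRequest{UserId: "123"})
	if err != nil {
		log.Fatal(err)
	}
	log.Println(response.GetName())
}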
A bootstrap file (xds.json) tells the gRPC library how to reach the control plane. The library locates it through the GRPC_XDS_BOOTSTRAP environment variable:
{
"xds_servers": [
{
"server_uri": "istiod.istio-system.svc:15010",
"channel_creds": [
{ "type": "insecure" }
],
"server_features": ["xds_v3"]
}
],
"node": {
"id": "sidecar~10.0.0.1~my-app-abc123.default~default.svc",
"cluster": "default",
"locality": { "zone": "us-central-1" }
}
}
Benefits over sidecar proxy mesh
- Lower latency — no Envoy hop between services
- Reduced resource usage — no sidecar containers consuming CPU/memory
- Simplified operations — fewer containers to manage and debug
- Native load balancing — gRPC’s weighted round-robin and pick-first policies
Current xDS support status (2026)
gRPC’s xDS v3 support covers service discovery, load balancing (round_robin, weighted round_robin, pick_first), traffic splitting, and route matching. Rate limiting via global RLQS (Rate Limit Query Service) and OpenTelemetry metrics for xDS components are being added. Istio support is available but still considered experimental for some advanced features.
For teams already using gRPC throughout their stack, proxyless service mesh reduces operational overhead while maintaining the same traffic management capabilities.
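Traffic splitting comes down to choosing a destination cluster for each RPC in proportion to configured weights. The sketch below models that selection with the standard library. The cluster names and the 90/10 split are made up for illustration, and a real xDS client takes the weights from the route configuration pushed by the control plane.

```python
import random
from collections import Counter


def pick_cluster(weights: dict, rng: random.Random) -> str:
    """Choose a cluster name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[name] for name in names], k=1)[0]


# Hypothetical canary rollout: 90% of calls to v1, 10% to v2.
split = {"user-service-v1": 90, "user-service-v2": 10}
rng = random.Random(42)  # seeded so the demo is repeatable
counts = Counter(pick_cluster(split, rng) for _ in range(10_000))
print(counts)
```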
Ecosystem
Spring gRPC
Spring gRPC (0.9.0 released July 2025, 1.0.0-RC1 released November 2025, dependent on Spring Boot 4.0) provides first-class gRPC support within the Spring Boot ecosystem. It offers starters for servers and clients, auto-configuration, and interceptor filtering. The Spring Initializr includes a gRPC option for new projects.
ConnectRPC
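ConnectRPC is a newer protocol family that offers simpler browser support with gRPC-compatible backends. A single server can speak gRPC, gRPC-Web, and its own Connect protocol, which runs over HTTP/1.1 or HTTP/2 and accepts JSON or binary Protobuf. Unlike gRPC-Web, the Connect protocol does not require a translating proxy, so browsers can call the server directly. Browser clients are still limited to unary and server-streaming calls by the same Fetch API constraints; client and bidirectional streaming are available to non-browser clients over HTTP/2. If you want proxy-free browser access alongside gRPC interoperability, ConnectRPC is a strong alternative.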
gRPC ecosystem tools
| Tool | Purpose |
|---|---|
| grpcurl | CLI for interacting with gRPC servers (uses reflection) |
| grpcui | Web-based UI for testing gRPC services |
| ghz | gRPC benchmarking and load testing |
| protoc-gen-validate | Generate validation rules from proto annotations |
| protoc-gen-doc | Generate documentation from proto files |
| buf | Modern protobuf build tool with linting and breaking change detection |
| grpc-gateway | REST JSON API from gRPC services |
| protoxform | Proto refactoring and transformation |
Best Practices
Proto design
- Use proto3 for all new services
- Keep messages focused — a message should represent a single concept
- Reserve field numbers for deleted fields to prevent reuse
- Use package declarations for versioning: package user.v1;
- Prefix enum values with the enum name to avoid collisions (PHONE_TYPE_MOBILE)
Performance
- Reuse channels and stubs — they are safe for concurrent use
- Use streaming for large data transfers (batch uploads, log streams)
- Set appropriate message size limits — gRPC defaults to 4 MB
- Enable keepalive pings to maintain HTTP/2 connections during idle periods
- Benchmark compression — for sub-1 KB messages, compression adds CPU overhead
- Use sint32/sint64 for fields that are often negative; use fixed32 for values that are often above 2^28 and fixed64 for values that are often above 2^56
- Avoid sending binary blobs larger than 85 KB in single messages; stream them in chunks instead
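Streaming a large blob usually means slicing it into fixed-size pieces and yielding one message per piece from a client-streaming or bidirectional call. A minimal sketch of the slicing step follows; the 64 KB chunk size is an assumption chosen for illustration, not a gRPC requirement.

```python
from typing import Iterator


def iter_chunks(data: bytes, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield consecutive slices of data, each at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


blob = b"x" * 150_000
sizes = [len(chunk) for chunk in iter_chunks(blob)]
print(sizes)
```

In a real client the generator would wrap each slice in a request message, much like generate_metrics does in the client-streaming example.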
Reliability
- Always set client deadlines (timeouts) — default is infinite
- Implement retries with exponential backoff for UNAVAILABLE errors
- Use health checks for load balancer endpoint management
- Enable server reflection for debugging and tooling
- Monitor gRPC metrics: request rate, latency percentiles, error codes, active streams
Security
- Always use TLS in production; never deploy insecure_channel
- Use mutual TLS (mTLS) for inter-service communication
- Implement per-method authorization via server interceptors
- Rotate TLS certificates automatically (e.g., cert-manager in Kubernetes)
- Apply rate limiting at the interceptor level
Development workflow
- Use buf for proto file management, linting, and breaking change detection
- Generate OpenAPI specs via gRPC-Gateway for frontend and documentation
- Use grpcurl with reflection for rapid testing during development
- Check in generated code or use a CI pipeline for code generation; be consistent
Conclusion
gRPC has evolved from Google’s internal RPC framework into the standard for high-performance microservices communication. Its combination of HTTP/2 efficiency, Protocol Buffers serialization, and strong typing enables fast, reliable distributed systems. In 2026, the ecosystem extends well beyond basic RPC — with gRPC-Web bridging browsers, gRPC-Gateway serving REST APIs from the same proto definitions, and xDS-based proxyless service mesh eliminating sidecar overhead.
Whether you are connecting a handful of services or orchestrating hundreds, gRPC provides the tooling, performance, and production readiness to scale.
Resources
- gRPC Official Documentation
- Protocol Buffers Documentation
- gRPC GitHub Repository
- gRPC-Web Documentation
- gRPC-Gateway
- xDS Features in gRPC
- Performance Best Practices (gRPC.io)
- Spring gRPC
- ConnectRPC
- buf: Protobuf Build Tool