NATS Core

Around 2010, Derek Collison started building NATS as the messaging backbone for Cloud Foundry, and later continued its development at Apcera, the company he founded. He had spent years working on TIBCO's messaging systems and saw a need for something simpler. Kafka was emerging as the heavyweight champion for streaming, but it required ZooKeeper, careful tuning, and significant resources. RabbitMQ needed Erlang and came with its own complexity. Collison wanted messaging that was fast, simple, and just worked.

NATS is that system. A single binary under 20MB. Memory footprint measured in tens of megabytes. Latency measured in microseconds. No external dependencies. No complex configuration. Start it, connect to it, send messages.

This essay covers Core NATS: the foundational publish-subscribe system that powers everything else. JetStream, which adds persistence, builds on top of these concepts.

Subject-Based Addressing

The fundamental concept in NATS is the subject. Unlike traditional message brokers that require you to create queues or topics before using them, NATS subjects are simply strings that act as addresses. Publishers send messages to subjects. Subscribers listen on subjects. That's it.

orders.new
orders.fulfilled
payments.processed
users.signup

Subjects use dots as delimiters to create hierarchies. This is not just convention—it enables powerful wildcard subscriptions. The hierarchy should reflect your domain model: entity, action, maybe region or type.

orders.us.east.new
orders.eu.west.fulfilled
sensors.building1.floor3.temperature

There is no need to declare subjects ahead of time. If you publish to orders.new, that subject exists. If no one is listening, the message disappears. This is by design.

Wildcards

NATS supports two wildcards for subscriptions. The asterisk * matches exactly one token. The greater-than > matches one or more tokens and must be at the end of the subject.

orders.*        matches orders.new, orders.fulfilled
                does NOT match orders.us.new (too deep)

orders.>        matches orders.new, orders.us.east.new
                matches any depth under orders.

orders.*.new    matches orders.us.new, orders.eu.new
                does NOT match orders.us.east.new

This is how you build flexible systems. A logging service subscribes to > to see everything. An analytics service subscribes to orders.> for all order events. A regional service subscribes to orders.us.> for all US order events, at any depth.
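
A minimal sketch of that layering, assuming a connected client nc; the handler names are illustrative:

# Each subscription sees a different slice of the same subject space
await nc.subscribe(">", cb=log_everything)        # logging: every message
await nc.subscribe("orders.>", cb=track_orders)   # analytics: all order events
await nc.subscribe("orders.us.>", cb=handle_us)   # regional: everything under US orders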

Pattern 1: Publish/Subscribe

The most basic pattern is publish/subscribe. A publisher sends a message to a subject. All active subscribers on that subject receive the message. This is fan-out delivery.

import asyncio
import nats

async def main():
    nc = await nats.connect("nats://localhost:4222")
    
    # Subscribe with callback
    async def handler(msg):
        subject = msg.subject
        data = msg.data.decode()
        print(f"Received on {subject}: {data}")
    
    await nc.subscribe("orders.>", cb=handler)
    
    # Publish messages
    await nc.publish("orders.new", b'{"id": 123}')
    await nc.publish("orders.fulfilled", b'{"id": 123}')
    
    await asyncio.sleep(1)
    await nc.close()

asyncio.run(main())

If three services subscribe to orders.new, all three receive every message published to that subject. This is useful for notifications, logging, real-time updates, and event broadcasting.

Pattern 2: Queue Groups

Sometimes you want load balancing instead of fan-out. Queue groups solve this. When multiple subscribers join the same queue group, only one of them receives each message.

async def worker(nc, worker_id):
    async def handler(msg):
        print(f"Worker {worker_id} processing: {msg.data.decode()}")
    
    # Join the "processors" queue group
    await nc.subscribe("tasks", queue="processors", cb=handler)

# Start three workers (inside an async function, sharing one connection)
await asyncio.gather(
    worker(nc, 1),
    worker(nc, 2),
    worker(nc, 3),
)

Now when you publish to tasks, NATS randomly selects one worker to receive each message. The load distributes roughly evenly across all workers in the group.

The beauty is that scaling is trivial. Need more capacity? Start more workers. They automatically join the queue group and start receiving messages. Need to scale down? Stop workers. NATS handles the redistribution. No configuration changes, no rebalancing procedures, no coordination.
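
A complete, runnable version of the fragment above (assuming a local nats-server on the default port): ten tasks published on one connection, spread across three workers in the same queue group.

import asyncio
import nats

async def main():
    nc = await nats.connect("nats://localhost:4222")

    def make_handler(worker_id):
        async def handler(msg):
            print(f"Worker {worker_id} got: {msg.data.decode()}")
        return handler

    # Three subscribers in the same queue group: each message goes to exactly one
    for worker_id in (1, 2, 3):
        await nc.subscribe("tasks", queue="processors", cb=make_handler(worker_id))

    for i in range(10):
        await nc.publish("tasks", f"task-{i}".encode())

    await nc.flush()
    await asyncio.sleep(0.5)   # give handlers a moment to print
    await nc.close()

asyncio.run(main())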

Pattern 3: Request/Reply

Sometimes you need a response. The request/reply pattern provides synchronous RPC semantics over asynchronous messaging.

# Service side
async def service(nc):
    async def handler(msg):
        # Process request
        result = process(msg.data)
        # Send reply to the inbox
        await nc.publish(msg.reply, result)
    
    await nc.subscribe("api.process", cb=handler)

# Client side
response = await nc.request("api.process", b"my data", timeout=1.0)
print(f"Got response: {response.data.decode()}")

When you call request(), the client library creates a unique inbox subject, subscribes to it, publishes the request with that inbox as the reply-to field, and waits for a response. The service sees the reply-to field and publishes its response there.
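
Roughly the same mechanics, spelled out by hand. This is a sketch for illustration only; in practice you just call request().

# Manually doing what request() automates
inbox = nc.new_inbox()                        # unique _INBOX.* subject
sub = await nc.subscribe(inbox, max_msgs=1)   # expect exactly one reply
await nc.publish("api.process", b"my data", reply=inbox)
response = await sub.next_msg(timeout=1.0)
print(response.data.decode())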

You can combine this with queue groups. If multiple service instances subscribe to api.process with the same queue group, requests automatically load-balance across them.
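
A sketch of that combination; the queue-group name "api" and the instance ids are illustrative:

# Several instances of the same service behind one subject (load-balanced RPC)
async def start_instance(nc, instance_id):
    async def handler(msg):
        await nc.publish(msg.reply, f"handled by instance {instance_id}".encode())

    # Same subject, same queue group: each request reaches exactly one instance
    await nc.subscribe("api.process", queue="api", cb=handler)

# Clients keep calling nc.request("api.process", ...) exactly as before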

Message Structure

A NATS message consists of four parts:

Subject:    orders.new                    (required)
Reply-To:   _INBOX.xyz123                 (optional)
Headers:    {"X-Trace-Id": "abc"}         (optional)
Payload:    {"id": 123, "item": "..."}    (byte array)

The subject is the routing address. The reply-to is an optional inbox for responses. Headers provide metadata without touching the payload—useful for tracing, compression hints, or content types. The payload is your actual data as bytes, up to 1MB by default.

Headers follow HTTP semantics. Use them for cross-cutting concerns like distributed tracing:

import time

headers = {
    "X-Trace-Id": trace_id,  # propagated from your tracing system
    "X-Request-Time": str(time.time()),
}
await nc.publish("orders.new", payload, headers=headers)

At-Most-Once Delivery

Core NATS provides at-most-once delivery. This means messages are delivered to active subscribers immediately, but if no one is listening, the message is gone. There is no persistence, no replay, no acknowledgments.

This is not a limitation—it is a design choice. At-most-once delivery is extremely fast. The server does not need to track message offsets, manage acknowledgments, or write to disk. Messages flow through at millions per second.

For many use cases, this is exactly what you want. Real-time telemetry where the next reading is more important than a missed one. Live dashboards where stale data is worse than no data. Event notifications where you care about what is happening now.

When you need persistence—when every message must be processed, when you need replay, when you need exactly-once semantics—that is what JetStream provides. But JetStream builds on Core NATS, so understanding these fundamentals is essential.

Connection Management

NATS clients maintain persistent connections to the server. The connection handles automatic reconnection, buffering during brief disconnections, and ping/pong keepalives.

# Event callbacks are registered as connect options
async def disconnected():
    print("Disconnected from NATS")

async def reconnected():
    print("Reconnected to NATS")

nc = await nats.connect(
    servers=["nats://server1:4222", "nats://server2:4222"],
    reconnect_time_wait=2,
    max_reconnect_attempts=-1,  # Infinite
    ping_interval=20,
    disconnected_cb=disconnected,
    reconnected_cb=reconnected,
)

When the connection drops, the client automatically tries to reconnect to the next server in the list. During disconnection, publishes are buffered (up to a configurable limit) and sent when reconnected. Subscriptions are automatically re-established.

For graceful shutdown, use drain(). It unsubscribes each subscription after its pending messages are processed, flushes outgoing messages, then closes the connection. Critical for rolling deployments where you want to finish processing before shutting down.
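
A minimal sketch of that shutdown path; run_service stands in for your application loop:

try:
    await run_service(nc)   # your subscriptions and business logic (illustrative)
finally:
    await nc.drain()        # drain subscriptions, flush, then close the connection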

Error Handling

The main errors you will encounter are timeouts and no-responders:

from nats.errors import TimeoutError, NoRespondersError

try:
    response = await nc.request("api.users", b"get", timeout=1.0)
except TimeoutError:
    # Service did not respond in time
    print("Request timed out")
except NoRespondersError:
    # No service subscribed to this subject
    print("No service available")

NATS also protects against slow consumers: subscribers that cannot keep up with the message flow. When a subscription's pending buffer fills up, further messages for that subscriber are dropped and the client is notified through its error callback; a connection that falls too far behind can be closed by the server entirely.
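
One way to surface these notifications with the Python client, a sketch with illustrative limits; handler is assumed to be defined elsewhere:

async def on_error(e):
    print(f"NATS error: {e}")   # slow consumer notifications arrive here, among others

nc = await nats.connect("nats://localhost:4222", error_cb=on_error)

# Per-subscription buffering limits; exceeding them drops messages and fires on_error
await nc.subscribe(
    "metrics.>",
    cb=handler,                 # your message handler
    pending_msgs_limit=10_000,
    pending_bytes_limit=32 * 1024 * 1024,
)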

Architecture Implications

Subject design is architecture design. A well-designed subject hierarchy makes wildcards useful and operations predictable. Consider these guidelines:

  • Use nouns for entities: orders, users, payments
  • Use verbs for actions: new, fulfilled, failed
  • Include identifiers where useful: orders.us.east.new
  • Keep it shallow enough for wildcards to be meaningful

Pattern selection depends on your needs. Use pub/sub when all interested parties should see every message. Use queue groups when work should be distributed. Use request/reply when you need a response.

These patterns compose. A service can subscribe to requests via a queue group (for load balancing) while also publishing events to a subject (for interested observers). The same message can trigger both work distribution and event notification.
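
A sketch of such a service; create_order stands in for your business logic and the subject names are illustrative:

async def order_service(nc):
    async def handle_create(msg):
        order_json = create_order(msg.data)          # illustrative business logic
        await nc.publish(msg.reply, order_json)      # reply to the requester
        await nc.publish("orders.new", order_json)   # broadcast to interested observers

    # Queue group "orders" load-balances requests across service instances
    await nc.subscribe("orders.create", queue="orders", cb=handle_create)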

Try It Yourself

I built an interactive demo that connects to a live NATS server. You can publish messages, create subscriptions with wildcards, see queue group load balancing in action, and experiment with request/reply: i33ym.cc/demo-nats-core

What's Next

Core NATS gives you fast, simple, fire-and-forget messaging. But many applications need more: persistence, replay, exactly-once delivery, consumer acknowledgments.

In the next session, we cover JetStream. JetStream adds a persistence layer to NATS, enabling streams that store messages, consumers that track position, and delivery guarantees beyond at-most-once. The same subjects, the same patterns, but with durability.
