100,000 TPS System Architecture
Achieving 100,000 Transactions Per Second (TPS) requires a fundamental shift away from traditional web application architectures. We cannot rely on standard CRUD patterns, SQL databases, or monolithic web servers. Instead, Orderbook.trade (https://www.orderbook.trade/) adopts a High-Frequency Trading (HFT) processing model, rewritten for the decentralized era.
The Core Philosophy: Deterministic & In-Memory
To handle 100k TPS, the critical path must be free of blocking I/O. This means no database queries, no disk writes, and no network calls during the matching process.
1. The Sequencer Pattern (Single-Threaded Model)
Contrary to popular belief, multithreading usually hurts latency in matching engines because of lock contention, context switches, and cache invalidation between cores.
The Approach: We use a single-threaded "Sequencer" to process the order stream.
Why it works: By serializing all input events (orders, cancels) into a single deterministic log, we remove the need for locks and race condition handling entirely.
Result: The core loop never stalls on a lock, so each event can be processed in nanoseconds rather than milliseconds.
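The sequencer idea can be illustrated with a minimal sketch. It is written in Python for readability, not speed, and the event tuples, the `Sequencer` class, and the `replay` helper are hypothetical names chosen for this example. Matching logic is left out; the point is only that a single ordered log fully determines the state.

```python
from dataclasses import dataclass, field


@dataclass
class Sequencer:
    """Single-threaded event loop: one log, one state, no locks."""
    log: list = field(default_factory=list)      # ordered input events
    resting: dict = field(default_factory=dict)  # order_id -> (side, price, qty)

    def submit(self, event: tuple) -> int:
        """Assign the next sequence number, record the event, then apply it."""
        seq = len(self.log)
        self.log.append(event)
        self.apply(event)
        return seq

    def apply(self, event: tuple) -> None:
        kind = event[0]
        if kind == "new":
            _, order_id, side, price, qty = event
            self.resting[order_id] = (side, price, qty)
        elif kind == "cancel":
            _, order_id = event
            self.resting.pop(order_id, None)


def replay(log: list) -> dict:
    """Rebuild the state from the log alone (assumed event format as above)."""
    fresh = Sequencer()
    for event in log:
        fresh.submit(event)
    return fresh.resting


engine = Sequencer()
engine.submit(("new", 1, "buy", 100, 5))
engine.submit(("new", 2, "sell", 101, 3))
last_seq = engine.submit(("cancel", 1))
rebuilt = replay(engine.log)
```

Because every event passes through `submit` on one thread, replaying `engine.log` into a fresh instance yields the same `resting` dictionary, which is the property that log-based recovery relies on.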
2. Pure In-Memory State
The entire state of the exchange (Orderbooks, User Balances, Risk Engine) resides in RAM.
Data Structures: We utilize Slab allocation and cache-friendly Index Maps instead of pointers to ensure high locality.
Persistence: Persistence is achieved purely by appending the input log to disk (Write-Ahead Log), not by saving the state itself. If the system crashes, we replay the event log from the last checkpoint to reconstruct the in-memory state.
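The slab-and-index layout mentioned under Data Structures can be sketched as follows. This is an illustration of the bookkeeping only: the `Slab` class and the `index` map are hypothetical, and a Python list stores object references, so it does not give the memory locality that a slab of fixed-size structs would give in a systems language.

```python
class Slab:
    """Fixed-capacity slot array with a free list; handles are integer indices."""

    def __init__(self, capacity: int):
        self.slots = [None] * capacity                 # storage allocated once, up front
        self.free = list(range(capacity - 1, -1, -1))  # stack of free slot indices

    def insert(self, value) -> int:
        if not self.free:
            raise MemoryError("slab full")
        idx = self.free.pop()
        self.slots[idx] = value
        return idx

    def remove(self, idx: int):
        value = self.slots[idx]
        if value is None:
            raise KeyError(idx)
        self.slots[idx] = None
        self.free.append(idx)  # the slot is reused by the next insert
        return value

    def get(self, idx: int):
        return self.slots[idx]


orders = Slab(capacity=4)
index = {}  # order_id -> slot index (the "index map")

index[101] = orders.insert(("buy", 100, 5))
index[102] = orders.insert(("sell", 101, 3))
orders.remove(index.pop(101))              # cancel order 101
index[103] = orders.insert(("buy", 99, 1))  # lands in the freed slot
```

Orders refer to each other and to the book by small integer handles rather than by pointer, and a cancelled order's slot is recycled without any allocation on the hot path.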
Performance Optimizations
Lock-Free Concurrency
While the matching core is single-threaded, the networking layer is highly parallel. We use Lock-Free Ring Buffers (Disruptor Pattern) to pass messages between the Network Threads and the Matching Thread.

Optimization: This ensures the matching core never waits for a socket write or a new connection handshake.
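The hand-off between a network thread and the matching thread can be sketched as a single-producer/single-consumer ring buffer. The class below is a hypothetical, single-threaded Python illustration of the index arithmetic only. A production version would keep `head` and `tail` as atomic counters with acquire/release ordering and pad them onto separate cache lines.

```python
class SpscRing:
    """Single-producer/single-consumer ring buffer with power-of-two capacity."""

    def __init__(self, capacity: int):
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self.buf = [None] * capacity
        self.mask = capacity - 1
        self.head = 0  # next sequence to read; advanced only by the consumer
        self.tail = 0  # next sequence to write; advanced only by the producer

    def try_push(self, item) -> bool:
        if self.tail - self.head == len(self.buf):
            return False  # full: the producer backs off instead of blocking
        self.buf[self.tail & self.mask] = item
        self.tail += 1    # publish only after the slot has been written
        return True

    def try_pop(self):
        if self.head == self.tail:
            return None   # empty (None is therefore not usable as a payload)
        item = self.buf[self.head & self.mask]
        self.head += 1
        return item


ring = SpscRing(2)
pushed = [ring.try_push("order-a"), ring.try_push("order-b"), ring.try_push("order-c")]
first = ring.try_pop()
retry = ring.try_push("order-c")
rest = [ring.try_pop(), ring.try_pop(), ring.try_pop()]
```

Each side writes only its own counter, which is why no lock is needed: the producer never touches `head` and the consumer never touches `tail`.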
Zero-Copy Networking
We use epoll/io_uring for non-blocking socket I/O and zero-copy deserialization formats (such as Cap'n Proto or rkyv), so message fields are read in place from the receive buffer instead of first being copied into intermediate objects.
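As a rough illustration of in-place decoding, the sketch below reads fixed-size order records straight out of a receive buffer using the standard `struct` module and a `memoryview`. The wire layout (`ORDER`) and the `iter_orders` helper are assumptions made up for this example; they are not the Cap'n Proto or rkyv formats, which expose typed views over the buffer rather than unpacking values.

```python
import struct

# Hypothetical fixed-size wire format (little-endian, no padding):
# order_id (u64), price (u64), qty (u32), side (u8)
ORDER = struct.Struct("<QQIB")


def iter_orders(buf):
    """Yield order tuples decoded in place from a receive buffer."""
    view = memoryview(buf)  # a view over the buffer, not a copy of it
    for offset in range(0, len(view) - ORDER.size + 1, ORDER.size):
        # unpack_from reads at an offset, so no per-record slice is allocated
        yield ORDER.unpack_from(view, offset)


# Stand-in for a buffer that a socket would have filled (e.g. via recv_into).
rx = bytearray(ORDER.pack(1, 100_000, 5, 0) + ORDER.pack(2, 100_500, 3, 1))
orders = list(iter_orders(rx))
```

The same buffer can be reused for the next read, so steady-state message handling does not allocate a new byte string per packet.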
From 100k TPS to 10 TPS
How do we bridge this extreme speed to a blockchain like Ethereum or Arbitrum that runs at 10-50 TPS?
The Batcher
We do not push every trade on-chain. That would be prohibitively expensive.
Compression: The execution layer processes 100,000 trades off-chain; none of them is posted individually.
Aggregation: The state differences (User A: +10 USDC, User B: -10 USDC) are aggregated over a time window (e.g., 200ms).
Proof: A ZK-Proof or Optimistic Batch is generated for just the net result.
Settlement: Only the net balance changes are written to the Smart Contract.
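The aggregation step above can be sketched as simple netting of balance changes. The fill tuple layout and the `net_batch` function are hypothetical, and amounts are integers in the asset's smallest unit to avoid floating-point rounding. Proof generation and the contract call are out of scope here.

```python
from collections import defaultdict


def net_batch(fills):
    """Collapse a window of fills into one net balance delta per (user, asset)."""
    deltas = defaultdict(int)
    for buyer, seller, base, quote, qty, price in fills:
        # The buyer receives base and pays quote; the seller does the reverse.
        deltas[(buyer, base)] += qty
        deltas[(buyer, quote)] -= qty * price
        deltas[(seller, base)] -= qty
        deltas[(seller, quote)] += qty * price
    # Entries that netted to zero need no on-chain write.
    return {key: value for key, value in deltas.items() if value != 0}


# Assumed fill format: (buyer, seller, base_asset, quote_asset, qty, price)
window = [
    ("A", "B", "ETH", "USDC", 2, 3000),
    ("B", "A", "ETH", "USDC", 2, 3000),  # reverses the first fill
    ("A", "C", "ETH", "USDC", 1, 3000),
]
settlement = net_batch(window)
```

In this window, user B traded twice but ends with no net change, so B does not appear in `settlement` at all; three fills reduce to four balance writes, and the deltas for each asset sum to zero.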
This architecture allows Orderbook.trade to offer NASDAQ-level performance, significantly outstripping traditional DeFi AMMs, while retaining the self-custody security guarantees of a DEX.