Open source · C++23 · Zero runtime overhead
DataHub
Generic async data pipeline with SQLite persistence, Boost.Asio scheduling, and lock-free message queues.
Design
The diagram below shows the full data pipeline, from raw network bytes to in-memory subscriber notifications. Every component implements the same data acceptor concept and may be omitted from the pipeline when it is not needed. For example, the order book has no DB representation: its data adapter passes data directly to the data feed, and the data sink is omitted.
Overview
DataHub is a header-only C++23 library providing a complete async data pipeline with zero runtime overhead via static polymorphism and compile-time reflection. It connects network sources (WebSocket, HTTP REST) to typed subscriber callbacks, with optional SQLite persistence at the data sink stage.
All scheduling runs on a shared Boost.Asio executor. A lock-free spsc_queue bridges the network receive thread to a dispatch strand, ensuring ordered, non-blocking delivery with no dynamic allocation in the hot path. Schema generation and JSON serialisation are handled at compile time via Glaze reflection: no registration macros, no runtime type maps.
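The hand-off pattern can be illustrated with a minimal single-producer/single-consumer ring buffer, standing in for boost::lockfree::spsc_queue. This is a sketch of the idea (fixed capacity, no allocation on push/pop), not the library's code.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Minimal SPSC ring buffer sketch: one producer thread calls push(), one
// consumer thread calls pop(). Capacity is fixed, so the hot path never
// allocates. Illustrative stand-in for boost::lockfree::spsc_queue.
template <typename T, std::size_t N>
class spsc_ring {
    std::array<T, N + 1> buf_{};
    std::atomic<std::size_t> head_{0}; // consumer position
    std::atomic<std::size_t> tail_{0}; // producer position
public:
    bool push(T v) {
        auto t = tail_.load(std::memory_order_relaxed);
        auto next = (t + 1) % (N + 1);
        if (next == head_.load(std::memory_order_acquire)) return false; // full
        buf_[t] = std::move(v);
        tail_.store(next, std::memory_order_release);
        return true;
    }
    std::optional<T> pop() {
        auto h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return std::nullopt; // empty
        T v = std::move(buf_[h]);
        head_.store((h + 1) % (N + 1), std::memory_order_release);
        return v;
    }
};
```

In the pipeline, the receive thread pushes raw JSON strings, and a drain task posted to the strand pops until the queue is empty.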
Core components
Dispatcher. Receives raw JSON strings from any transport. Pushes each message into a lock-free spsc_queue and posts a drain task onto a boost::asio::strand. The strand serialises processing, so all adapter calls are ordered and thread-safe. Multiple adapters are tried via a fold expression (A₁ || A₂ || … || Aₙ): the first adapter returning true consumes the message.
Data adapter. Deserialises a JSON string into a typed C++ struct using Glaze. Calls the downstream handler and propagates its bool result: false means "not consumed", allowing fallthrough to the next adapter in the dispatcher chain.
Data sink. Owns a shared_ptr&lt;Model&gt; (typically data_model). Calls model-&gt;accept(range) to persist incoming entities and receive back only the new or changed ones.
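The accept() contract (persist a batch, echo back only what is new or changed) can be sketched with an in-memory model. The `quote` type and map-backed store are illustrative assumptions; the real model persists via the DAO.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct quote { std::string symbol; double price; };

// Sketch of a model whose accept() stores a batch and returns only the
// entries that were new or whose value actually changed.
struct quote_model {
    std::map<std::string, double> store;

    std::vector<quote> accept(const std::vector<quote>& batch) {
        std::vector<quote> changed;
        for (const auto& q : batch) {
            auto it = store.find(q.symbol);
            if (it == store.end() || it->second != q.price) {
                store[q.symbol] = q.price; // insert or update
                changed.push_back(q);      // report only the delta
            }
        }
        return changed;
    }
};
```

Returning only the delta means downstream stages (cache, subscribers) are never woken for redundant updates.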
DAO. Performs RAII table creation on construction. Uses Glaze compile-time reflection to derive the table name, column types, and primary key: no hand-written SQL schema. accept() runs precompiled INSERT OR REPLACE statements and returns only the rows that were inserted or actually changed. Supports an optional table_suffix for per-symbol sharding. Full CRUD, count(), and drop_table() are included.
Data feed. In-memory cache giving subscribers fast access to the latest data.
Abstract notification interface. Subscribers are held as weak_ptr; expired subscribers are silently dropped on the next notification.