Open source · C++23 · Zero runtime overhead

DataHub

Generic async data pipeline with SQLite persistence, Boost.Asio scheduling, and lock-free message queues.

Design

The diagram below shows the full data pipeline, from raw network bytes to in-memory subscriber notifications. Every component implements the same data acceptor concept, so any stage that is not needed can be left out of the pipeline. For example, the order book has no database representation, so its data adapter passes data directly to the data feed and the data sink is omitted.

DataHub architecture diagram: horizontal pipeline from Network through Transport, data_dispatcher, data_adapter, data_sink, data_feed to data_subscription and Consumers, with data_model storage branch below data_feed
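To illustrate how a stage can be left out, here is a minimal sketch in which every stage exposes the same accept() member and forwards to whatever stage follows it. All type names, the std::string payload, and the bool-returning accept() signature are assumptions made for this illustration, not DataHub's actual interfaces; the shared concept is approximated by an unconstrained template parameter.

```cpp
#include <string>
#include <vector>

// Hypothetical terminal stage: remembers everything it receives.
struct feed_sketch {
    std::vector<std::string> seen;
    bool accept(const std::string& value) {
        seen.push_back(value);
        return true;
    }
};

// Hypothetical persistence stage: counts "persisted" values, then forwards.
template <typename Next>
struct sink_sketch {
    Next& next;
    int persisted = 0;
    bool accept(const std::string& value) {
        ++persisted;  // stand-in for a database write
        return next.accept(value);
    }
};

// Hypothetical adapter: "parses" the raw input and forwards the result.
// Next can be a sink or a feed, because both expose accept().
template <typename Next>
struct adapter_sketch {
    Next& next;
    bool accept(const std::string& raw) {
        return next.accept("parsed:" + raw);
    }
};
```

With these types, adapter_sketch<sink_sketch<feed_sketch>> models a persisted stream, while adapter_sketch<feed_sketch> models the order-book case where the sink is skipped.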

Overview

DataHub is a header-only C++23 library providing a complete async data pipeline with zero runtime overhead via static polymorphism and compile-time reflection. It connects network sources (WebSocket, HTTP REST) to typed subscriber callbacks, with optional SQLite persistence at the data sink stage.

All scheduling runs on a shared Boost.Asio executor. A lock-free spsc_queue bridges the network receive thread to a dispatch strand, ensuring ordered, non-blocking delivery with no dynamic allocation in the hot path. Schema generation and JSON serialisation are handled at compile time via Glaze reflection — no registration macros, no runtime type maps.
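The single-producer/single-consumer hand-off can be sketched with a fixed-size ring buffer built on std::atomic. This is a standard-library stand-in written for illustration, not the queue DataHub uses; the class name, its capacity parameter, and the push()/pop() signatures are assumptions.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>
#include <utility>

// Illustrative SPSC ring buffer. One thread may call push(), one other
// thread may call pop(). Storage is preallocated, so the queue itself
// never allocates. One slot stays empty to tell "full" from "empty",
// so at most Capacity - 1 items are held at once.
template <typename T, std::size_t Capacity>
class spsc_ring_sketch {
    static_assert(Capacity >= 2, "need at least one usable slot");

public:
    // Producer side. Returns false instead of blocking when the ring is full.
    bool push(T value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;
        }
        slots_[head] = std::move(value);
        head_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    // Consumer side. Returns std::nullopt when the ring is empty.
    std::optional<T> pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return std::nullopt;
        }
        T value = std::move(slots_[tail]);
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return value;
    }

private:
    std::array<T, Capacity> slots_{};  // T must be default-constructible
    std::atomic<std::size_t> head_{0}; // next slot to write
    std::atomic<std::size_t> tail_{0}; // next slot to read
};
```

In a pipeline like the one described above, the network receive thread would be the only caller of push(), and the drain task running on the strand would be the only caller of pop().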

Core components

data_dispatcher<Acceptor...>

Receives raw JSON strings from any transport. Pushes each message into a lock-free spsc_queue and posts a drain task onto a boost::asio::strand. The strand serialises processing: adapter calls never run concurrently and messages are handled in queue order. Multiple adapters are tried in order via a fold expression over || (a language feature available since C++17), A₁ || A₂ || … || Aₙ. Evaluation short-circuits, so the first adapter returning true consumes the message and later adapters are not called.
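The adapter fallthrough can be sketched with a fold expression over a tuple of acceptors. The class and adapter names below, the accept(std::string_view) signature, and the substring-based "parsing" are assumptions for illustration only; the queue and strand are left out so that just the dispatch logic is visible.

```cpp
#include <string_view>
#include <tuple>

// Hypothetical dispatcher core: offers a message to each acceptor in
// declaration order and stops at the first one that returns true.
template <typename... Acceptor>
class dispatcher_sketch {
public:
    explicit dispatcher_sketch(Acceptor&... acceptors)
        : acceptors_(acceptors...) {}

    // Returns true if some acceptor consumed the message.
    bool dispatch(std::string_view json) {
        return std::apply(
            [json](auto&... a) { return (a.accept(json) || ...); },
            acceptors_);
    }

private:
    std::tuple<Acceptor&...> acceptors_;  // non-owning references
};

// Two toy adapters that "recognise" a message by a substring match.
struct trade_adapter_sketch {
    int calls = 0;
    bool accept(std::string_view json) {
        ++calls;
        return json.find("\"trade\"") != std::string_view::npos;
    }
};

struct quote_adapter_sketch {
    int calls = 0;
    bool accept(std::string_view json) {
        ++calls;
        return json.find("\"quote\"") != std::string_view::npos;
    }
};
```

Because || short-circuits, a message consumed by the first adapter never reaches the second one, and a message that no adapter recognises makes dispatch() return false.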

Data Adapter

Deserialises a JSON string into a typed C++ struct using Glaze. Calls the downstream handler and propagates its bool result — false means "not consumed", allowing fallthrough to the next adapter in the dispatcher chain.

Data Sink

Owns a shared_ptr<Model> (typically data_model). Calls model->accept(range) to persist incoming entities and receive back only the new or changed ones.

Data Model

DAO with RAII table creation on construction. Uses Glaze compile-time reflection to derive the table name, column types, and primary key — no hand-written SQL schema. accept() runs precompiled INSERT OR REPLACE statements and returns only the rows that were inserted or actually changed. Supports an optional table_suffix for per-symbol sharding. Full CRUD, count(), and drop_table() included.
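The "return only new or changed rows" contract of accept() can be illustrated with an in-memory stand-in. In this sketch a std::unordered_map replaces the SQLite table, and the row type, its fields, and the class name are hypothetical; it shows the filtering behaviour only, not the SQL or the Glaze reflection.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical entity; symbol plays the role of the primary key.
struct ticker_row {
    std::string symbol;
    double last_price;
};

inline bool same_values(const ticker_row& a, const ticker_row& b) {
    return a.symbol == b.symbol && a.last_price == b.last_price;
}

// In-memory stand-in for a persistent model.
class model_sketch {
public:
    // Upserts every row and returns only those that were inserted
    // or whose stored values actually changed.
    std::vector<ticker_row> accept(const std::vector<ticker_row>& batch) {
        std::vector<ticker_row> changed;
        for (const ticker_row& row : batch) {
            auto [it, inserted] = rows_.try_emplace(row.symbol, row);
            if (inserted) {
                changed.push_back(row);   // new primary key
            } else if (!same_values(it->second, row)) {
                it->second = row;         // existing key, new values
                changed.push_back(row);
            }                             // identical row: filtered out
        }
        return changed;
    }

    std::size_t count() const { return rows_.size(); }

private:
    std::unordered_map<std::string, ticker_row> rows_;
};
```

Filtering at this stage means downstream stages only see rows that carry new information, so a repeated snapshot of unchanged data produces no output.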

Data Feed

In-memory cache that gives subscribers fast access to the current (latest) data.

Data Subscription

Abstract notification interface.

Subscribers are held as weak_ptr — expired subscribers are silently dropped on the next notification.
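A minimal sketch of weak_ptr-based subscriber handling follows. The interface name, the on_data() callback, and the list class are assumptions for illustration rather than DataHub's actual types.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical notification interface.
template <typename T>
struct subscriber_sketch {
    virtual ~subscriber_sketch() = default;
    virtual void on_data(const T& value) = 0;
};

// Holds subscribers weakly, so the list never extends their lifetime.
template <typename T>
class subscription_list_sketch {
public:
    void subscribe(const std::shared_ptr<subscriber_sketch<T>>& s) {
        subscribers_.push_back(s);
    }

    // Notifies live subscribers and erases expired ones in the same pass.
    // Note: this sketch does not support subscribing from inside on_data().
    void notify(const T& value) {
        auto it = subscribers_.begin();
        while (it != subscribers_.end()) {
            if (auto live = it->lock()) {
                live->on_data(value);
                ++it;
            } else {
                it = subscribers_.erase(it);
            }
        }
    }

    std::size_t size() const { return subscribers_.size(); }

private:
    std::vector<std::weak_ptr<subscriber_sketch<T>>> subscribers_;
};

// Toy subscriber used to observe notifications.
struct counting_subscriber_sketch : subscriber_sketch<int> {
    int calls = 0;
    int last = 0;
    void on_data(const int& value) override {
        ++calls;
        last = value;
    }
};
```

A consumer unsubscribes simply by releasing its shared_ptr; the stale entry stays in the list until the next notify() call removes it.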