feat(core): chDB connection pool, Arrow WAL zero-copy flush, and InfluxQL parity#52

Merged: austin-barrington merged 5 commits into main from feat_con_pool on Jun 24, 2026

@austin-barrington

Re-architect the write path around chDB-ready Arrow batches, enable real
same-path connection pooling for concurrent flush inserts and queries, and
close major gaps in InfluxDB v1 InfluxQL semantics (DDL, SHOW, CQ
scheduling, rollups/MVs, SELECT INTO). Bump workspace to 0.8.3.

This is a cross-cutting release: ingest, WAL, flush, chDB adapter, query
translation, cluster schema apply, and observability all move together so
the fast path is correct end-to-end rather than bolted on in one layer.

  • Add ChdbConnectionPool: N independent Connections to the same
    --path, each with its own ChdbClient mutex (libchdb process-global
    singleton per path — see chdb/insert/concurrency.rs).

  • Round-robin checkout with try_lock on busy slots; clamp pool_size
    to 1..=32 (default 4).

  • Rewire ChdbNativeAdapter, ChdbQueryAdapter, and ChdbSession to
    use the pool instead of a single shared session.

  • Fix config/docs: chdb.pool_size > 1 is now real parallelism, not a
    deprecated no-op. Recommend server.max_concurrent_queries >= pool_size.

  • Update system-architecture.md to describe same-path multi-connection
    semantics (replacing the old per-slot subdirectory model).
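
The checkout policy described above (round-robin start, `try_lock` to skip busy slots, `pool_size` clamped to 1..=32) can be sketched with the standard library alone. This is a minimal illustration, not the actual `ChdbConnectionPool`: `Conn`, `Pool`, and `checkout` are hypothetical names, and blocking on the starting slot when every slot is busy is an assumption about the fallback behaviour.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard, TryLockError};

/// Hypothetical stand-in for one connection opened on the shared path.
pub struct Conn {
    pub id: usize,
}

/// Hypothetical pool: one mutex per connection plus a round-robin cursor.
pub struct Pool {
    slots: Vec<Mutex<Conn>>,
    next: AtomicUsize,
}

impl Pool {
    /// Clamp the requested size to 1..=32, as the description states.
    pub fn new(requested: usize) -> Self {
        let size = requested.clamp(1, 32);
        let slots = (0..size).map(|id| Mutex::new(Conn { id })).collect();
        Pool { slots, next: AtomicUsize::new(0) }
    }

    pub fn size(&self) -> usize {
        self.slots.len()
    }

    /// Start at the next round-robin slot and skip busy slots via try_lock.
    /// Assumption: if every slot is busy, block on the starting slot.
    pub fn checkout(&self) -> MutexGuard<'_, Conn> {
        let n = self.slots.len();
        let start = self.next.fetch_add(1, Ordering::Relaxed) % n;
        for i in 0..n {
            match self.slots[(start + i) % n].try_lock() {
                Ok(guard) => return guard,
                // A poisoned slot is still usable in this sketch.
                Err(TryLockError::Poisoned(p)) => return p.into_inner(),
                Err(TryLockError::WouldBlock) => continue,
            }
        }
        self.slots[start].lock().unwrap_or_else(|p| p.into_inner())
    }
}
```

With this shape, at most `pool_size` callers hold a connection at once, which is why sizing `server.max_concurrent_queries` at or above `pool_size` keeps every slot usable.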

  • Build chDB-ready fact-table RecordBatches at ingest time
    (application/arrow_ingest/ for line protocol, columnar, msgpack, points).

  • Introduce PreparedWalSlot / PreparedMeasurementBatch domain types with
    post-assign ingest_seq patching (domain/prepared_wal.rs).

  • Coalesce sparse prepared batches per measurement (domain/arrow_coalesce).

  • In-memory WalArrowCache indexes unflushed prepared slots by sequence;
    bounded take_range(from, to_inclusive) avoids evicting post-snapshot
    entries (fixes steady ~50% cache-miss under continuous load).
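
The bounded `take_range(from, to_inclusive)` behaviour can be illustrated with an ordered map keyed by sequence number. This is a sketch under assumptions: `SeqCache` is a hypothetical stand-in for `WalArrowCache`, and the real cache may store different values or use a different container. The point it shows is that only the snapshot's range is removed, so slots appended after the flush snapshot stay cached.

```rust
use std::collections::BTreeMap;

/// Hypothetical cache of unflushed prepared slots keyed by WAL sequence.
pub struct SeqCache<T> {
    entries: BTreeMap<u64, T>,
}

impl<T> SeqCache<T> {
    pub fn new() -> Self {
        Self { entries: BTreeMap::new() }
    }

    pub fn insert(&mut self, seq: u64, slot: T) {
        self.entries.insert(seq, slot);
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Remove and return only entries with from <= seq <= to_inclusive.
    /// Entries above to_inclusive (written after the snapshot) are kept.
    pub fn take_range(&mut self, from: u64, to_inclusive: u64) -> Vec<(u64, T)> {
        if from > to_inclusive {
            return Vec::new(); // BTreeMap::range panics on an inverted range
        }
        let keys: Vec<u64> = self
            .entries
            .range(from..=to_inclusive)
            .map(|(k, _)| *k)
            .collect();
        keys.into_iter()
            .filter_map(|k| self.entries.remove(&k).map(|v| (k, v)))
            .collect()
    }
}
```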

  • Versioned on-disk WAL encoding via wal_ipc (HBWA magic, v1): optional
    storage.wal_format = "arrow_ipc" alongside legacy bincode.
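
A versioned header is what lets the two encodings coexist in one WAL. The sketch below assumes a layout of the 4-byte `HBWA` magic followed by a single version byte and then the Arrow IPC body; the description only names the magic and version, so the byte layout, `WalPayload`, `encode_arrow_ipc`, and `decode` are all hypothetical.

```rust
/// Magic and version named in the description; the layout is assumed.
pub const MAGIC: &[u8; 4] = b"HBWA";
pub const VERSION: u8 = 1;

#[derive(Debug, PartialEq)]
pub enum WalPayload<'a> {
    /// Body following a valid HBWA v1 header.
    ArrowIpc(&'a [u8]),
    /// Anything without the magic is treated as a legacy bincode entry.
    LegacyBincode(&'a [u8]),
}

/// Prefix an Arrow IPC body with the assumed 5-byte header.
pub fn encode_arrow_ipc(body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(MAGIC.len() + 1 + body.len());
    out.extend_from_slice(MAGIC);
    out.push(VERSION);
    out.extend_from_slice(body);
    out
}

/// Classify an on-disk entry by its header.
pub fn decode(bytes: &[u8]) -> Result<WalPayload<'_>, String> {
    if !bytes.starts_with(MAGIC) {
        return Ok(WalPayload::LegacyBincode(bytes));
    }
    match bytes.get(MAGIC.len()) {
        Some(&VERSION) => Ok(WalPayload::ArrowIpc(&bytes[MAGIC.len() + 1..])),
        Some(v) => Err(format!("unsupported WAL version {v}")),
        None => Err("truncated WAL header".to_string()),
    }
}
```

A magic-prefix check like this relies on legacy entries never starting with the same four bytes; a real format would need to guarantee that some other way.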

  • FlushService flushes prepared chunks directly through
    insert_record_batch_direct — no re-parse/re-coalesce on the hot path.

  • flush.arrow_wal_enabled (default true) gates the RAM cache; metrics
    gauge hyperbytedb_wal_arrow_cache_entries for growth/OOM watch.

  • application/wal_append.rs bundles prepared slots with legacy WalEntry
    for peer sync compatibility.

  • build_prepared_wal_slot, write_prepared_batch, schema cache refresh
    from metadata, and field-type widening reconciliation
    (ALTER TABLE ... MODIFY COLUMN when metadata union exceeds cached types).
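
Field-type widening reconciliation can be pictured as: compute the union of the cached type and the metadata type, and emit `ALTER TABLE ... MODIFY COLUMN` only when the union is wider than what is cached. The widening rules (Int widens to Float, any other mismatch widens to String), the ClickHouse type names, and the function names below are assumptions for illustration, not the project's actual rules.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldType {
    Bool,
    Int,
    Float,
    Str,
}

impl FieldType {
    /// Assumed widening rule: equal types stay, Int and Float meet at
    /// Float, every other mismatch falls back to Str.
    pub fn widen(self, other: FieldType) -> FieldType {
        use FieldType::*;
        match (self, other) {
            (a, b) if a == b => a,
            (Int, Float) | (Float, Int) => Float,
            _ => Str,
        }
    }

    /// Assumed column types on the ClickHouse side.
    pub fn ch_type(self) -> &'static str {
        match self {
            FieldType::Bool => "Bool",
            FieldType::Int => "Int64",
            FieldType::Float => "Float64",
            FieldType::Str => "String",
        }
    }
}

/// Return the DDL needed when the metadata union exceeds the cached type,
/// or None when the cached column is already wide enough.
pub fn widening_ddl(
    table: &str,
    column: &str,
    cached: FieldType,
    metadata: FieldType,
) -> Option<String> {
    let target = cached.widen(metadata);
    if target == cached {
        return None;
    }
    Some(format!(
        "ALTER TABLE `{table}` MODIFY COLUMN `{column}` {}",
        target.ch_type()
    ))
}
```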

  • Engine DDL: raw facts use ReplacingMergeTree(ingest_seq); rollup/MV
    destinations with additive partials use SummingMergeTree on sum columns.

  • Pad sparse legacy WAL batches to full ensured column sets before insert.

  • New HTTP /internal/chdb adapter hook for admin/debug paths.

  • Depend on chdb-rust feat_arrow_insert (Arrow C Data Interface insert);
    Docker builds clone that branch; root/proxy Dockerfiles stub all workspace
    crates for layer caching.

  • Split parser: dedicated lexer.rs + ddl_parser.rs for token-driven
    InfluxQL DDL/SHOW/auth (CREATE/DROP/ALTER DB/RP/user, GRANT/REVOKE,
    SHOW DATABASES/MEASUREMENTS/TAG KEYS/TAG VALUES/FIELD KEYS/SERIES/CQs/MVs).

  • Major to_clickhouse.rs expansion:

    • Raw selects always project time, ascending order.
    • GROUP BY time defaults, fill/null/with bounds, tag ordering.
    • Materialized view backfill column ordering and dest insert mapping.
    • Rollup fact views (build_coalesced_fact_view_*): sum for additive
      fields, mean → sum/count rewrite on rollup measurements.
    • CQ bounded SELECT INTO translation and time-window predicate stripping.
  • predicate_sql: shared WHERE → SQL for DELETE / DROP SERIES (local +
    replication).

  • field_type domain module; rollup combine semantics for MV/CQ fields.
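
The mean → sum/count rewrite exists because a mean of per-bucket means is wrong whenever buckets hold different numbers of points, while sums and counts combine additively. A small sketch of that combine rule, with `MeanPartial`, `combine`, and `finalize` as hypothetical names:

```rust
/// Hypothetical additive partial stored in a rollup row for a mean field.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MeanPartial {
    pub sum: f64,
    pub count: u64,
}

impl MeanPartial {
    /// Build a partial from the raw points of one bucket.
    pub fn from_points(points: &[f64]) -> Self {
        MeanPartial {
            sum: points.iter().sum(),
            count: points.len() as u64,
        }
    }

    /// Combining is plain addition, which is what a summing engine
    /// can apply to the sum and count columns at merge time.
    pub fn combine(self, other: MeanPartial) -> MeanPartial {
        MeanPartial {
            sum: self.sum + other.sum,
            count: self.count + other.count,
        }
    }

    /// The mean is only computed at read time, as sum / count.
    pub fn finalize(self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        }
    }
}
```

For example, partials built from `[1, 2, 3]` and `[10]` combine to a mean of 4.0, whereas averaging the two bucket means (2.0 and 10.0) would give 6.0.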

  • InfluxDB v1 CQ scheduling (domain/cq_schedule.rs): bucket alignment,
    RESAMPLE EVERY/FOR validation, coverage windows, boundary-aligned
    should_run, execution interval derivation.
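
Bucket alignment, coverage windows, and a boundary-aligned `should_run` can be sketched with integer arithmetic. The code below is an illustration under assumptions, not the contents of `domain/cq_schedule.rs`: timestamps and intervals are plain `i64` values in the same unit, a FOR shorter than the GROUP BY interval is silently raised to it rather than rejected, and offsets are ignored.

```rust
/// Floor a timestamp to the start of its bucket. rem_euclid keeps
/// negative timestamps aligned downward as well.
pub fn bucket_start(ts: i64, interval: i64) -> i64 {
    assert!(interval > 0, "interval must be positive");
    ts - ts.rem_euclid(interval)
}

/// Half-open window [start, end) a run at `now` should cover.
/// `end` is the last completed bucket boundary; the span is the FOR
/// duration when given, and never less than one GROUP BY interval
/// (assumed behaviour).
pub fn coverage_window(now: i64, group_by: i64, resample_for: Option<i64>) -> (i64, i64) {
    let end = bucket_start(now, group_by);
    let span = resample_for.unwrap_or(group_by).max(group_by);
    (bucket_start(end - span, group_by), end)
}

/// Run when `now` has crossed an EVERY boundary since the last run,
/// or when there has been no run yet.
pub fn should_run(now: i64, last_run: Option<i64>, every: i64) -> bool {
    match last_run {
        None => true,
        Some(prev) => bucket_start(now, every) > bucket_start(prev, every),
    }
}
```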

  • QueryService::execute_continuous_query; reconstruct CQ text for replay.

  • MaterializedViewService and ContinuousQueryService wired to new
    schedule metadata and bounded backfill paths.

  • Peer/cluster: PeerQueryService Raft mutation forwarding, leader addr
    resolution (forward node → Raft → cluster membership → metrics leader),
    MV source/dest retention policy resolution.
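
The leader address resolution order (forward node → Raft → cluster membership → metrics leader) is an ordered fallback chain. A minimal sketch, assuming each source has already been looked up into an optional address; `LeaderSources` and its field names are hypothetical:

```rust
/// Hypothetical snapshot of the candidate leader addresses, one per source.
pub struct LeaderSources {
    pub forward_node: Option<String>,
    pub raft_leader: Option<String>,
    pub membership_leader: Option<String>,
    pub metrics_leader: Option<String>,
}

/// Return the first address available, in the order the description lists.
pub fn resolve_leader_addr(s: &LeaderSources) -> Option<&str> {
    s.forward_node
        .as_deref()
        .or(s.raft_leader.as_deref())
        .or(s.membership_leader.as_deref())
        .or(s.metrics_leader.as_deref())
}
```

If the lookups themselves were expensive, the same order could be kept with `or_else` and closures so that later sources are only queried when earlier ones come back empty.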

  • schema_mutation_apply: single apply path for Raft state machine,
    /internal/replicate-mutation, and startup metadata sync (metadata +
    chDB DDL side effects).

  • RocksDB metadata adapter extended for CQ schedule fields, rollups, and
    richer measurement meta.

  • Replication apply, hinted handoff, drain, bootstrap, and Raft log/state
    machine updated for prepared WAL and schema mutations.

  • Expanded SHOW/DDL execution, SELECT INTO, retention policy normalization,
    tag key/value discovery from series tables, authorization checks.

  • CLI 0.8.3: admin/query/export/import/repl hooks for new statement types;
    e2e test coverage extended.

  • tikv-jemallocator with background purging: return transient startup
    heap (series dedup warm + WAL replay) to the OS instead of pinning RSS.

  • Default retention sweep interval 12h (was 60s).

  • Grafana dashboards refreshed (cluster, logging, machine-monitoring);
    Kind CR manifest and docker-compose aligned with new config knobs.

  • scripts/load.sh updated for pool/Arrow WAL load testing.

  • New compat suites: combination_tests (full parse→translate→execute
    interaction tests), cq_tests, prepared_wal_tests.

  • Expanded ddl_tests, query_tests, metadata_tests, http_tests.

  • Integration/raft/sync_quorum tests updated for prepared WAL and pooling.

  • Bench stubs adjusted for new ingest signatures.

BREAKING CHANGE: chDB session pooling semantics changed — pool_size now
opens multiple same-path connections (real concurrency) instead of being
ignored/warned. Tune pool_size and max_concurrent_queries together.
New config keys: storage.wal_format, flush.arrow_wal_enabled.
Default retention sweep interval is now 12h (was 60s).

feat: add a parallelized version of coalescing and WAL
chore: update docs
fix: remove tracing as it was panicking tokio main threads. Shall revisit later.
@austin-barrington austin-barrington merged commit a48ef9c into main Jun 24, 2026
4 checks passed
@austin-barrington austin-barrington deleted the feat_con_pool branch July 1, 2026 21:51