Extract diskann-storage crate from diskann-providers#873
Extract diskann-storage crate from diskann-providers#873
Conversation
Create new Tier 2 crate diskann-storage with zero diskann-* dependencies, containing: - StorageReadProvider / StorageWriteProvider traits - FileStorageProvider (filesystem-backed) - VirtualStorageProvider (in-memory/overlay VFS, feature-gated) - Path utility functions for index artifact naming - DatasetDto transfer type - Generic protobuf load/save via prost diskann-providers now depends on diskann-storage and re-exports all moved types to preserve backward compatibility for existing consumers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update diskann-disk, diskann-tools, and diskann-benchmark to import storage types (StorageReadProvider, StorageWriteProvider, FileStorageProvider, VirtualStorageProvider, DynWriteProvider, WriteProviderWrapper, path_utility functions) from diskann-storage instead of the re-exports in diskann-providers. Fix unfulfilled lint expectations in benchmark files where #[expect(clippy::disallowed_methods)] was no longer triggered because VirtualStorageProvider::new_physical now lives in diskann-storage. Run cargo fmt --all to normalize import ordering. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR extracts storage abstractions and implementations from diskann-providers into a new Tier-2 crate, diskann-storage, and migrates downstream crates to use the new crate directly while keeping diskann-providers::storage::* as a compatibility re-export.
Changes:
- Added new
diskann-storagecrate containing storage traits, filesystem and virtual storage providers, path naming helpers, dataset DTO, and protobuf load/save utilities. - Updated
diskann-providersto re-export moved storage APIs fromdiskann-storageand to re-exportDatasetDtofromdiskann-storage. - Migrated consumer crates (
diskann-tools,diskann-disk,diskann-benchmark) to import storage traits/providers and path helpers fromdiskann-storage.
Reviewed changes
Copilot reviewed 54 out of 61 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann-tools/src/utils/relative_contrast.rs | Switch storage provider imports to diskann-storage. |
| diskann-tools/src/utils/random_data_generator.rs | Test-only migration to diskann-storage virtual provider; main trait import still via providers re-export. |
| diskann-tools/src/utils/ground_truth.rs | Switch StorageReadProvider/StorageWriteProvider imports to diskann-storage. |
| diskann-tools/src/utils/generate_synthetic_labels_utils.rs | Switch DynWriteProvider + test storage providers to diskann-storage. |
| diskann-tools/src/utils/gen_associated_data_from_range.rs | Switch StorageWriteProvider + test providers to diskann-storage. |
| diskann-tools/src/utils/build_pq.rs | Move path utility + storage provider imports to diskann-storage. |
| diskann-tools/src/utils/build_disk_index.rs | Switch storage provider imports (incl. tests) to diskann-storage. |
| diskann-tools/src/bin/subsample_bin.rs | Switch FileStorageProvider/StorageWriteProvider imports to diskann-storage. |
| diskann-tools/src/bin/relative_contrast.rs | Switch FileStorageProvider import to diskann-storage. |
| diskann-tools/src/bin/random_data_generator.rs | Switch FileStorageProvider import to diskann-storage. |
| diskann-tools/src/bin/generate_pq.rs | Switch FileStorageProvider import to diskann-storage. |
| diskann-tools/src/bin/generate_minmax.rs | Switch StorageReadProvider/FileStorageProvider usage to diskann-storage. |
| diskann-tools/src/bin/gen_associated_data_from_range.rs | Switch FileStorageProvider import to diskann-storage. |
| diskann-tools/Cargo.toml | Add diskann-storage dependency (currently enabling virtual_storage in normal deps). |
| diskann-storage/src/virtual_storage_provider.rs | Add/expand VirtualStorageProvider implementation + tests. |
| diskann-storage/src/storage_provider.rs | New core storage traits + dyn adapters + wrapper + tests. |
| diskann-storage/src/proto_storage.rs | New generic prost-backed load_proto/save_proto helpers + tests. |
| diskann-storage/src/path_utility.rs | New canonical artifact path helpers + tests. |
| diskann-storage/src/lib.rs | New crate root wiring and re-exports. |
| diskann-storage/src/file_storage_provider.rs | Filesystem-backed provider documentation + expanded tests. |
| diskann-storage/src/dataset_dto.rs | New DatasetDto type + tests (moved from providers utils). |
| diskann-storage/Cargo.toml | New crate manifest (edition 2024) with virtual_storage feature. |
| diskann-providers/src/utils/utils.rs | Remove in-crate DatasetDto definition (moved). |
| diskann-providers/src/utils/mod.rs | Re-export DatasetDto from diskann-storage. |
| diskann-providers/src/storage/storage_provider.rs | Remove in-crate storage traits/dyn adapters (moved). |
| diskann-providers/src/storage/path_utility.rs | Remove in-crate path utilities (moved). |
| diskann-providers/src/storage/mod.rs | Re-export diskann-storage storage APIs for backward compatibility. |
| diskann-providers/Cargo.toml | Add dependency on diskann-storage (currently enabling virtual_storage unconditionally). |
| diskann-providers/benches/benchmarks/copy_aligned_data_bench.rs | Cleanup after storage extraction (still uses providers storage module). |
| diskann-providers/benches/benchmarks_iai/copy_aligned_data_bench_iai.rs | Cleanup after storage extraction (still uses providers storage module). |
| diskann-disk/src/utils/partition.rs | Switch storage trait imports (and tests) to diskann-storage. |
| diskann-disk/src/utils/aligned_file_reader/virtual_aligned_reader_factory.rs | Switch VirtualStorageProvider import to diskann-storage. |
| diskann-disk/src/utils/aligned_file_reader/storage_provider_aligned_file_reader.rs | Switch StorageReadProvider and test provider imports to diskann-storage. |
| diskann-disk/src/utils/aligned_file_reader/aligned_file_reader_factory.rs | Switch FileStorageProvider import (macOS/miri path) to diskann-storage. |
| diskann-disk/src/storage/quant/pq/pq_generation.rs | Switch storage traits in main + tests to diskann-storage. |
| diskann-disk/src/storage/quant/generator.rs | Switch storage traits and tests to diskann-storage. |
| diskann-disk/src/storage/disk_index_writer.rs | Switch storage traits + path utilities to diskann-storage. |
| diskann-disk/src/storage/disk_index_reader.rs | Switch StorageReadProvider + tests to diskann-storage. |
| diskann-disk/src/storage/cached_writer.rs | Switch StorageWriteProvider + tests to diskann-storage. |
| diskann-disk/src/storage/cached_reader.rs | Switch StorageReadProvider + tests to diskann-storage. |
| diskann-disk/src/search/provider/disk_vertex_provider.rs | Switch storage traits + get_disk_index_file to diskann-storage. |
| diskann-disk/src/search/provider/disk_provider.rs | Switch storage trait + path helper imports to diskann-storage. |
| diskann-disk/src/build/chunking/checkpoint/checkpoint_record_manager_with_file.rs | Switch FileStorageProvider import to diskann-storage. |
| diskann-disk/src/build/builder/quantizer.rs | Switch storage traits to diskann-storage. |
| diskann-disk/src/build/builder/inmem_builder.rs | Switch dyn write wrapper types to diskann-storage. |
| diskann-disk/src/build/builder/core.rs | Switch storage traits and path helpers to diskann-storage. |
| diskann-disk/src/build/builder/build.rs | Switch storage traits and tests to diskann-storage. |
| diskann-disk/Cargo.toml | Add diskann-storage dependency. |
| diskann-benchmark/src/utils/datafiles.rs | Import StorageReadProvider from diskann-storage (still uses providers FileStorageProvider). |
| diskann-benchmark/src/inputs/save_and_load.rs | Switch StorageReadProvider import to diskann-storage. |
| diskann-benchmark/src/backend/disk_index/build.rs | Switch storage traits to diskann-storage. |
| diskann-benchmark/src/backend/disk_index/benchmarks.rs | Switch FileStorageProvider import to diskann-storage. |
| diskann-benchmark/Cargo.toml | Add diskann-storage dependency. |
| Cargo.toml | Add diskann-storage to workspace members and workspace dependencies. |
| Cargo.lock | Record new crate and dependency graph updates. |
Comments suppressed due to low confidence (1)
diskann-storage/src/virtual_storage_provider.rs:292
- The new
trait_based_read_providertest uses&dyn StorageReadProvider<Reader = impl Read + Seek>, which is not a valid/idiomatic way to express “any reader” and may fail to compile due toimpl Traitusage inside an associated-type equality. Prefer a generic helper likefn read_via_trait<P: StorageReadProvider>(p: &P) where P::Reader: Read + Seek { ... }(or specify the concrete reader type explicitly) to keep the test compiling on stable Rust.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address PR review feedback: - diskann-providers: wire virtual_storage feature to forward to diskann-storage/virtual_storage instead of unconditionally enabling it - diskann-disk, diskann-tools: move diskann-storage virtual_storage feature to dev-dependencies only (matches prior vfs gating) - diskann-tools/random_data_generator.rs: migrate StorageWriteProvider import to diskann_storage - diskann-benchmark/datafiles.rs: migrate FileStorageProvider usage to diskann_storage Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Regarding the review comments: build_pq.rs \p_val\ (comment 1): This is a preexisting bug — our PR only changed imports in this file, not logic. Filed as a separate issue to avoid scope creep. random_data_generator.rs (comment 2): Fixed in d6128ea — split the import so \StorageWriteProvider\ comes from \diskann_storage. diskann-tools/Cargo.toml \�irtual_storage\ (comment 3): Fixed in d6128ea — moved \�irtual_storage\ feature to dev-dependencies only. diskann-providers/Cargo.toml \�irtual_storage\ (comment 4): Fixed in d6128ea — \diskann-storage\ is now without features in normal deps; \�irtual_storage\ feature forwards to \diskann-storage/virtual_storage. datafiles.rs \FileStorageProvider\ (comment 5): Fixed in d6128ea — migrated to \diskann_storage::FileStorageProvider. |
Summary
Extract a new
diskann-storageTier 2 crate fromdiskann-providersas the first phase of decomposing the providers god crate (37K lines → ~15K target).What moved to
diskann-storageStorageReadProvider/StorageWriteProviderFileStorageProviderVirtualStorageProvidervirtual_storage)DynWriteProvider/WriteProviderWrapper/WriteSeekpath_utilityDatasetDtoproto_storageKey properties
diskann-*dependencies — onlyprost+thiserror. Clean Tier 2 crate.diskann-storagecovering all moved functionality.diskann-providersre-exports all moved types; existingcrate::storage::StorageReadProviderpaths continue to work.diskann-disk,diskann-tools,diskann-benchmarkupdated to import directly fromdiskann-storage.Verification
cargo fmt --all --check✅cargo clippy --workspace --all-targets -- -D warnings✅cargo test -p diskann-storage— 38 passed, 0 failedcargo test -p diskann-providers --profile ci— 520 passed, 0 failedcargo check --workspace✅