[FEATURE] Support bit and sparsevec types for pgVector #11035

@NikAiyer

Description

Problem Statement

Currently @mastra/pg only supports vector (full precision) and halfvec (half precision) types. pgvector 0.7.0+ also supports two additional storage types:

  1. bit - Binary vectors using PostgreSQL's native bit type, useful for binary quantization: each dimension takes 1 bit instead of the 32 bits a vector element uses, which cuts storage and speeds up search
  2. sparsevec - Sparse vectors that store only non-zero elements, useful for BM25/TF-IDF representations and other sparse embeddings
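
To make the two types concrete, here is a minimal sketch of how values might be serialized before being sent to PostgreSQL. The helper names `toBitLiteral` and `toSparsevecLiteral` are hypothetical and not part of @mastra/pg, and the "positive component becomes 1" quantization rule is an assumption chosen for illustration. The sparsevec text format (`{index:value,...}/dimensions`, with 1-based indices) follows the pgvector documentation.

```typescript
// Hypothetical helpers for illustration only; not part of @mastra/pg.

/** Binary-quantize a dense embedding: positive components become 1, all others 0. */
function toBitLiteral(embedding: number[]): string {
  return embedding.map((x) => (x > 0 ? "1" : "0")).join("");
}

/**
 * Serialize a dense array into pgvector's sparsevec text format.
 * Only non-zero elements are written, and indices are 1-based.
 */
function toSparsevecLiteral(dense: number[]): string {
  const entries: string[] = [];
  dense.forEach((value, i) => {
    if (value !== 0) entries.push(`${i + 1}:${value}`);
  });
  return `{${entries.join(",")}}/${dense.length}`;
}

const bits = toBitLiteral([0.3, -0.1, 0.8]);          // "101"
const sparse = toSparsevecLiteral([1, 0, 2, 0, 3]);   // "{1:1,3:2,5:3}/5"
```

The resulting strings could then be passed as query parameters and cast with `::bit(n)` or `::sparsevec` on the SQL side.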

Proposed Solution

  1. Add bit and sparsevec to the VectorType union type
  2. Implement new distance metrics:
    • For bit: Hamming distance (<~>) and Jaccard distance (<%>)
    • For sparsevec: L2, cosine, and inner product (same as vector)
  3. Add appropriate operator classes:
    • bit_hamming_ops, bit_jaccard_ops
    • sparsevec_l2_ops, sparsevec_ip_ops, sparsevec_cosine_ops
  4. Implement dimension/element limits:
    • bit: up to 64,000 dimensions for indexes
    • sparsevec: up to 1,000 non-zero elements for indexes
  5. Handle index type restrictions:
    • IVFFlat: supports bit with Hamming distance only (bit_hamming_ops), does NOT support Jaccard or sparsevec
    • HNSW: supports bit (both Hamming and Jaccard) and sparsevec (L2, cosine, inner product)
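
The rules in the list above could be captured in a small lookup-and-validate layer. The following is only a sketch under stated assumptions: the names `NewVectorType`, `Metric`, `IndexKind`, `resolveOpClass`, and `checkIndexLimit` are hypothetical and do not reflect the existing @mastra/pg API. The operators, operator classes, and limits are the ones listed in this issue.

```typescript
// Hypothetical types and helpers; names are illustrative, not the @mastra/pg API.
type NewVectorType = "bit" | "sparsevec";
type Metric = "hamming" | "jaccard" | "l2" | "cosine" | "ip";
type IndexKind = "hnsw" | "ivfflat";

// pgvector distance operators for each metric.
const OPERATORS: Record<Metric, string> = {
  hamming: "<~>",
  jaccard: "<%>",
  l2: "<->",
  cosine: "<=>",
  ip: "<#>",
};

// Operator classes per type; a missing entry means the metric is not defined for that type.
const OP_CLASSES: Record<NewVectorType, Partial<Record<Metric, string>>> = {
  bit: { hamming: "bit_hamming_ops", jaccard: "bit_jaccard_ops" },
  sparsevec: { l2: "sparsevec_l2_ops", ip: "sparsevec_ip_ops", cosine: "sparsevec_cosine_ops" },
};

// Index limits from the proposal: dimensions for bit, non-zero elements for sparsevec.
const INDEX_LIMITS: Record<NewVectorType, number> = { bit: 64000, sparsevec: 1000 };

/** Return the operator class for a combination, or throw if it is unsupported. */
function resolveOpClass(type: NewVectorType, metric: Metric, index: IndexKind): string {
  const opClass = OP_CLASSES[type][metric];
  if (!opClass) {
    throw new Error(`${metric} distance is not defined for ${type}`);
  }
  // IVFFlat only accepts bit with Hamming distance.
  if (index === "ivfflat" && !(type === "bit" && metric === "hamming")) {
    throw new Error(`IVFFlat does not support ${type} with ${metric} distance`);
  }
  return opClass;
}

/** Throw if `size` (bit dimensions, or sparsevec non-zero count) exceeds the index limit. */
function checkIndexLimit(type: NewVectorType, size: number): void {
  if (size > INDEX_LIMITS[type]) {
    throw new Error(`${type} indexes support at most ${INDEX_LIMITS[type]} elements, got ${size}`);
  }
}
```

With something like this in place, `resolveOpClass("bit", "hamming", "ivfflat")` would yield `bit_hamming_ops` for use in a `CREATE INDEX ... USING ivfflat` statement, while `resolveOpClass("sparsevec", "cosine", "ivfflat")` would fail early instead of surfacing a database error.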

Component

RAG

Additional Context

The recent PR #11002 added halfvec support and provides a pattern to follow for this implementation.

Verification

  • I have searched the existing issues to make sure this is not a duplicate
  • I have provided sufficient context for the team to understand the request
