Embedder #2

@UB-AICLUB

Issue: Develop Embedder API with Background Task for Embedding Storage

Resources:

Description:

Develop a FastAPI application that generates text embeddings and stores them in ChromaDB using a Celery task queue. The application and Celery worker will be organized into separate folders and hosted separately using Docker.

Requirements:

  1. Folder Structure:

    • FastAPI Application Folder:
      • Contains the FastAPI app, including the root and info endpoints.
      • Responsible for accepting requests and initiating Celery tasks.
    • Celery Worker Folder:
      • Contains the Celery task definitions, including generate_and_store_embedding.
      • Handles the background processing of embedding generation and storage.
  2. API Endpoints:

    • Root Endpoint (/):
      • Accepts a JSON request body with a "text" key and a "metadata" key, e.g., {"text": "an example text", "metadata": {...}}. The metadata format is TBD.
      • Initiates a background task to generate and store the embedding in ChromaDB.
      • Returns a success message, e.g., {"message": "Task started successfully"}, upon successful initiation of the task.
    • Info Endpoint (/info):
      • Provides details about the embedding model in use.
  3. Celery Task:

    • Task Name: generate_and_store_embedding
    • Functionality:
      • Receives text and metadata.
      • Generates an embedding using a pre-configured model.
      • Stores the embedding and metadata in ChromaDB via a persistent Chroma client while in development. Once the Dockerized ChromaDB service is ready at chromadb:8000, this will be switched to a web client that connects to it.
      • Logs success or failure, with error handling and retries as necessary.
    • Task Name: retrieve
    • Functionality:
      • Receives text and filter data.
      • Returns the top K similar items.
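
The generate_and_store_embedding steps above could be sketched roughly as follows. This is only a framework-free outline of the task body, not the final Celery task: the `embed` callable, the `Collection` protocol, `max_retries`, and `base_delay` are all hypothetical names introduced for illustration. The keyword names passed to `add` follow ChromaDB's `Collection.add` (ids, embeddings, documents, metadatas), so a real collection should be able to stand in for the protocol. In the worker, Celery's own retry mechanism would likely replace the manual loop.

```python
import logging
import time
import uuid
from typing import Callable, Protocol, Sequence

logger = logging.getLogger("embedder.worker")  # hypothetical logger name


class Collection(Protocol):
    """Assumed storage shape; keyword names mirror ChromaDB's Collection.add."""

    def add(self, *, ids: list, embeddings: list, documents: list, metadatas: list) -> None: ...


def generate_and_store_embedding(
    text: str,
    metadata: dict,
    *,
    embed: Callable[[str], Sequence[float]],  # assumption: wraps the pre-configured model
    collection: Collection,
    max_retries: int = 3,
    base_delay: float = 0.5,
) -> str:
    """Embed `text`, store it with `metadata`, and return the generated record id."""
    if not text.strip():
        raise ValueError("text must be a non-empty string")
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")

    record_id = str(uuid.uuid4())
    for attempt in range(1, max_retries + 1):
        try:
            vector = list(embed(text))
            collection.add(
                ids=[record_id],
                embeddings=[vector],
                documents=[text],
                metadatas=[metadata],
            )
            logger.info("stored embedding %s on attempt %d", record_id, attempt)
            return record_id
        except Exception:
            logger.exception("attempt %d/%d failed for %s", attempt, max_retries, record_id)
            if attempt == max_retries:
                raise  # give up and surface the last error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    raise RuntimeError("unreachable")  # loop always returns or raises
```

Passing `embed` and `collection` in as arguments keeps the logic testable without a running broker or database, and the same record id is reused across attempts.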

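To make the intended behavior of the retrieve task concrete, here is a small brute-force sketch of "filter, then return the top K most similar items". It deliberately swaps in a plain in-memory cosine-similarity scan rather than a ChromaDB query, so it only illustrates the expected result shape. The record layout (`id`, `embedding`, `metadata`), the `embed` callable, the exact-match `where` filter, and the default `k` are all assumptions, since the issue leaves the filter and metadata formats open.

```python
import math
from typing import Callable, Mapping, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    text: str,
    where: Optional[Mapping[str, object]],
    *,
    embed: Callable[[str], Sequence[float]],  # assumption: same model used for storage
    records: Sequence[dict],                  # assumption: in-memory stand-in for the store
    k: int = 3,
) -> list:
    """Return up to `k` records matching `where`, most similar to `text` first."""
    where = where or {}
    query_vector = list(embed(text))

    # Keep only records whose metadata matches every key/value pair in the filter.
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in where.items())
    ]

    # Score the remaining records and sort by descending similarity.
    scored = [(cosine_similarity(query_vector, r["embedding"]), r) for r in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    return [
        {"id": r["id"], "score": score, "metadata": r["metadata"]}
        for score, r in scored[:k]
    ]
```

In the actual worker, the filtering and ranking would presumably be delegated to ChromaDB's collection query rather than done in Python, with the task only embedding the text and shaping the response.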