Add TwelveLabs video RAG template (Pegasus parser + Marengo embedder)#129
Open
mohit-twelvelabs wants to merge 1 commit into
Hi! I'm Mohit, I work at TwelveLabs (@mohit-twelvelabs).
Introduction
This adds a new, fully opt-in application template: Video RAG with TwelveLabs (templates/video_rag_twelvelabs/). It lets a Pathway pipeline do RAG over video by bringing in two TwelveLabs models:
- Pegasus (TwelveLabsVideoParser, a pw.UDF) that uploads each video as a TwelveLabs asset and turns it into a rich text description (what happens on screen, who/what appears, spoken and on-screen text, the overall topic). Pathway then indexes that text exactly like it indexes a PDF.
- Marengo (MarengoEmbedder, a BaseEmbedder subclass) used as the retriever embedder.
Both components live in a local pathway_twelvelabs package and are wired in entirely through app.yaml (mirroring the multimodal_rag and slides_ai_search templates), so models, prompts, the data source, and the LLM can all be swapped without touching Python.
Context
The existing templates handle documents (PDF/DOCX/slides) but not video. Video is hard to drop into RAG because most stacks only transcribe the audio and discard everything visual. Pegasus captures the whole video as text, and Marengo gives a shared multimodal embedding space. This extends Pathway's live-sync + in-memory-index story to a new modality with zero new infrastructure.
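To make the upload-then-analyze flow from the introduction concrete, here is a minimal sketch of the control flow only. The `VideoClient` interface, the `Asset` status values, the prompt wording, and the choice to return no chunks for a failed asset are all assumptions made for illustration; they are not the TwelveLabs SDK's API and not necessarily what `TwelveLabsVideoParser` does.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Asset:
    id: str
    status: str  # hypothetical values: "ready" or "failed"


class VideoClient(Protocol):
    """Hypothetical client interface, NOT the real TwelveLabs SDK."""

    def upload(self, video: bytes) -> Asset: ...

    def analyze(self, asset_id: str, prompt: str) -> str: ...


# Illustrative prompt, paraphrasing the description in this PR.
DEFAULT_PROMPT = (
    "Describe what happens on screen, who or what appears, "
    "spoken and on-screen text, and the overall topic."
)


def parse_video(
    client: VideoClient, video: bytes, prompt: str = DEFAULT_PROMPT
) -> list[tuple[str, dict]]:
    """Upload a video, then turn it into (text, metadata) chunks."""
    asset = client.upload(video)
    if asset.status != "ready":
        # Assumption: a failed asset yields no chunks instead of raising.
        return []
    text = client.analyze(asset.id, prompt)
    return [(text, {"asset_id": asset.id})]


@dataclass
class StubClient:
    """In-memory stand-in so the sketch runs without network or credentials."""

    status: str = "ready"
    calls: list[str] = field(default_factory=list)

    def upload(self, video: bytes) -> Asset:
        self.calls.append("upload")
        return Asset(id="asset-1", status=self.status)

    def analyze(self, asset_id: str, prompt: str) -> str:
        self.calls.append("analyze")
        return f"description of {asset_id}"


ok = StubClient()
chunks = parse_video(ok, b"fake-bytes")

failed = StubClient(status="failed")
failed_chunks = parse_video(failed, b"fake-bytes")
```

With the stub, `chunks` holds a single (text, metadata) pair and `failed_chunks` is empty, with `analyze` never called for the failed asset. The returned text is what the rest of the pipeline would index, the same way it indexes text parsed from a PDF.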
How has this been tested?
- templates/video_rag_twelvelabs/test_twelvelabs.py: 4 no-network unit tests (stubbed SDK; run without credentials) covering the embedder vector shape, the Pegasus upload-then-analyze flow, failed-asset handling, and an embedding-dimension regression test. 2 of these are dimension/default checks; a 5th test is a live smoke test that's skipped unless TWELVELABS_API_KEY is set.
- MarengoEmbedder.get_embedding_dimension() correctly reports 512 (this required overriding the base probe, which assumes a single-vector return).
- Formatting and lint checks (black, isort --profile black, flake8) pass on all new files. The new module type-checks cleanly under mypy; the template dir is added to the existing [tool.mypy] exclude list, consistent with the other RAG templates.
Types of changes
This is purely additive: a new template directory plus one row in the main README table and one entry in the mypy exclude list. No existing template, default, or behavior is changed.
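For reviewers curious about the embedding-dimension override mentioned in the testing notes, the sketch below shows one way such a mismatch can arise. It uses toy classes only: `ToyBaseEmbedder` is a stand-in, not Pathway's actual `BaseEmbedder`, and the assumption that the embedder returns a batch (a list of vectors) is a guess at the cause, not something this PR states.

```python
from __future__ import annotations


class ToyBaseEmbedder:
    """Toy stand-in for a base embedder (NOT Pathway's BaseEmbedder)."""

    def embed(self, text: str):
        raise NotImplementedError

    def get_embedding_dimension(self) -> int:
        # Probe: embed a dummy string and assume the result is ONE vector.
        return len(self.embed("."))


class ToyBatchEmbedder(ToyBaseEmbedder):
    """Hypothetical embedder that returns a batch: one vector per input."""

    DIM = 512

    def embed(self, text: str) -> list[list[float]]:
        return [[0.0] * self.DIM]


class FixedBatchEmbedder(ToyBatchEmbedder):
    def get_embedding_dimension(self) -> int:
        # Override: look inside the batch to measure the vector itself.
        return len(self.embed(".")[0])


naive = ToyBatchEmbedder().get_embedding_dimension()    # 1 (the batch size)
fixed = FixedBatchEmbedder().get_embedding_dimension()  # 512
```

Under these assumptions the inherited probe measures the batch length instead of the vector length, which is the kind of silent error a dimension regression test is meant to catch.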
Related issue(s):
Checklist:
You can grab a free API key at https://twelvelabs.io — there's a generous free tier.