Conversation
- Introduced a shared `get_openai_client` function to streamline OpenAI and Azure client initialization.
- Updated multiple modules to use the new client function, enhancing maintainability and reducing redundancy.
- Added test cases to ensure proper mocking of the OpenAI client in various analysis and processing scenarios.
- Updated documentation in README.md to include instructions for running tests within the Docker environment.
- Introduced a new `transcribe_via_litellm` function to handle audio transcription through the LiteLLM proxy.
- Updated the `run` function to conditionally use the LiteLLM proxy or the Deepgram API based on the provided options.
- Enhanced documentation for default options to include LiteLLM configuration.
- Added error handling for missing required keys for both transcription methods.
- Introduced a `get_async_openai_client` function for asynchronous OpenAI and Azure client initialization.
- Updated documentation to clarify usage of the LiteLLM proxy for both chat and embeddings.
- Enhanced error handling for missing configuration keys in the async client setup.
Pull request overview
Adds a shared OpenAI/Azure/LiteLLM client factory and migrates OpenAI-dependent links/storage to optionally route requests through a LiteLLM proxy while keeping direct OpenAI/Azure support.
Changes:
- Introduces `lib.openai_client.get_openai_client()` (and an async variant) to centralize provider/proxy selection.
- Updates multiple links and storage modules to use the shared client constructor.
- Adds LiteLLM-proxy transcription path to the Deepgram link, and updates/extends tests + docs accordingly.
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| server/lib/openai_client.py | New shared OpenAI/Azure/LiteLLM client constructors used across the codebase. |
| server/links/analyze/__init__.py | Uses shared client creation instead of direct OpenAI/Azure client instantiation. |
| server/links/analyze/tests/test_analyze.py | Updates tests to patch get_openai_client and inject clients. |
| server/links/analyze_and_label/__init__.py | Switches to shared get_openai_client. |
| server/links/analyze_and_label/tests/test_analyze_and_label.py | Updates tests to patch get_openai_client. |
| server/links/analyze_vcon/__init__.py | Switches to shared get_openai_client. |
| server/links/check_and_tag/__init__.py | Switches to shared get_openai_client. |
| server/links/detect_engagement/__init__.py | Switches to shared get_openai_client and skips cleanly when credentials are absent. |
| server/links/detect_engagement/tests/test_detect_engagement.py | Updates tests to patch get_openai_client. |
| server/links/deepgram_link/__init__.py | Adds LiteLLM proxy transcription path (OpenAI-compatible audio transcriptions). |
| server/links/openai_transcribe/__init__.py | Switches to shared get_openai_client. |
| server/storage/milvus/__init__.py | Uses shared get_openai_client for embeddings generation. |
| server/storage/milvus/test_milvus.py | Updates fixture to patch get_openai_client instead of OpenAI. |
| server/storage/chatgpt_files/__init__.py | Uses shared get_openai_client for file/vector-store operations. |
| README.md | Adds guidance for running tests in Docker and OTEL env workaround. |
| .gitignore | Ignores redis_data/ directory. |
```python
    organization=opts["organization"] if opts["organization"] else None
)

# Initialize OpenAI client (supports LiteLLM proxy via LITELLM_PROXY_URL + LITELLM_MASTER_KEY in opts/env)
```
Comment mentions LiteLLM proxy credentials coming from "opts/env", but get_openai_client currently reads from opts only (it does not consult environment variables). Update the comment to avoid implying env-based configuration unless the caller explicitly injects env values into opts.
Suggested change:
```diff
- # Initialize OpenAI client (supports LiteLLM proxy via LITELLM_PROXY_URL + LITELLM_MASTER_KEY in opts/env)
+ # Initialize OpenAI client (supports LiteLLM proxy via LITELLM_PROXY_URL + LITELLM_MASTER_KEY provided in opts)
```
```python
    api_key=options["api_key"],
)
client = get_openai_client(options)
file = client.files.create(file=open(file_name, "rb"), purpose=options["purpose"])
```
client.files.create is called with file=open(file_name, "rb") without closing the file handle. This can leak descriptors in long-running workers. Use a context manager (open the file in a with-block) and pass the handle, or explicitly close it after the upload call.
Suggested change:
```diff
- file = client.files.create(file=open(file_name, "rb"), purpose=options["purpose"])
+ with open(file_name, "rb") as upload_file:
+     file = client.files.create(file=upload_file, purpose=options["purpose"])
```
```python
litellm_url = (opts.get("LITELLM_PROXY_URL") or "").strip().rstrip("/")
litellm_key = (opts.get("LITELLM_MASTER_KEY") or "").strip()

if litellm_url and litellm_key:
    logger.info("Using LiteLLM proxy at %s", litellm_url)
    organization = opts.get("organization") or opts.get("organization_key")
    project = opts.get("project") or opts.get("project_key")
    return OpenAI(
        api_key=litellm_key,
        base_url=litellm_url,
        organization=organization if organization else None,
        project=project if project else None,
        timeout=120.0,
        max_retries=0,
    )
```
LiteLLM proxy base_url handling is inconsistent: sync client uses base_url=litellm_url (no "/v1"), but OpenAI clients typically expect the base URL to include the version prefix. This will likely send requests to the wrong paths unless callers include "/v1" themselves. Consider normalizing the proxy URL (append "/v1" exactly once) so both sync and async clients behave the same and configuration is unambiguous.
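One way to make the sync and async paths consistent is a small normalization helper shared by both constructors; the helper name `normalize_litellm_base_url` is hypothetical, a sketch of the suggestion rather than code from the PR:

```python
def normalize_litellm_base_url(raw_url: str) -> str:
    """Strip trailing slashes, then append "/v1" exactly once.

    Accepts proxy URLs configured with or without the version prefix,
    so "http://proxy:4000", "http://proxy:4000/", and
    "http://proxy:4000/v1/" all normalize to "http://proxy:4000/v1".
    """
    url = (raw_url or "").strip().rstrip("/")
    if not url:
        return url
    if not url.endswith("/v1"):
        url = url + "/v1"
    return url
```

Both `get_openai_client` and `get_async_openai_client` could then pass `base_url=normalize_litellm_base_url(litellm_url)`, making the configuration unambiguous for callers.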
```python
logger.info("Using LiteLLM proxy at %s (async)", litellm_url)
return AsyncOpenAI(
    api_key=litellm_key,
    base_url=litellm_url + "/v1",
```
get_async_openai_client unconditionally appends "/v1" to LITELLM_PROXY_URL. If the configured URL already ends with "/v1", this produces "/v1/v1" and breaks requests. Normalize the URL similarly to the sync path (strip trailing slashes, then append "/v1" only if missing).
Suggested change:
```diff
- logger.info("Using LiteLLM proxy at %s (async)", litellm_url)
- return AsyncOpenAI(
-     api_key=litellm_key,
-     base_url=litellm_url + "/v1",
+ # Normalize base URL: append "/v1" only if it's not already present
+ litellm_base_url = (
+     litellm_url if litellm_url.endswith("/v1") else litellm_url + "/v1"
+ )
+ logger.info("Using LiteLLM proxy at %s (async)", litellm_url)
+ return AsyncOpenAI(
+     api_key=litellm_key,
+     base_url=litellm_base_url,
```
```python
content = audio_response.content
if not content:
    logger.warning("Empty audio content from %s", url)
    return None
# OpenAI client expects a file-like object with optional .name
ext = os.path.splitext(url.split("?")[0])[-1] or ".mp3"
with tempfile.NamedTemporaryFile(delete=False, suffix=ext) as tmp:
    tmp.write(content)
    tmp_path = tmp.name
```
transcribe_via_litellm reads the entire audio into memory via audio_response.content before writing it to disk. For longer recordings this can cause high memory usage inside the worker container. Prefer streaming to the temp file using iter_content() (or shutil.copyfileobj from response.raw) to keep memory bounded.
Suggested change:
```diff
- content = audio_response.content
- if not content:
-     logger.warning("Empty audio content from %s", url)
-     return None
- # OpenAI client expects a file-like object with optional .name
- ext = os.path.splitext(url.split("?")[0])[-1] or ".mp3"
- with tempfile.NamedTemporaryFile(delete=False, suffix=ext) as tmp:
-     tmp.write(content)
-     tmp_path = tmp.name
+ # OpenAI client expects a file-like object with optional .name
+ ext = os.path.splitext(url.split("?")[0])[-1] or ".mp3"
+ with tempfile.NamedTemporaryFile(delete=False, suffix=ext) as tmp:
+     bytes_written = 0
+     for chunk in audio_response.iter_content(chunk_size=8192):
+         if not chunk:
+             continue
+         tmp.write(chunk)
+         bytes_written += len(chunk)
+     tmp_path = tmp.name
+ if bytes_written == 0:
+     logger.warning("Empty audio content from %s", url)
+     try:
+         os.unlink(tmp_path)
+     except OSError:
+         pass
+     return None
```
```python
# Shape expected by rest of link (confidence not provided by OpenAI-format response)
return {
    "transcript": text,
    "confidence": 1.0,
    "detected_language": "en",
}
```
transcribe_via_litellm fabricates values for "confidence" (1.0) and "detected_language" ("en"). Downstream logic uses confidence thresholds and metrics, so this can silently change behavior (e.g., always passing minimum_confidence). Consider returning None/omitting these fields and updating the caller to handle missing confidence/language (or explicitly documenting that confidence is unavailable in the LiteLLM path).
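As a sketch of the alternative this comment suggests — report no confidence and let the caller decide — the helper names below are hypothetical, and treating unknown confidence as passing is one possible policy, not the link's actual behavior:

```python
def build_litellm_transcription_result(text: str) -> dict:
    # Confidence and language are not reported by OpenAI-format
    # transcription responses, so return None instead of fabricating
    # values like 1.0 / "en".
    return {"transcript": text, "confidence": None, "detected_language": None}


def passes_minimum_confidence(result: dict, minimum_confidence: float) -> bool:
    confidence = result.get("confidence")
    if confidence is None:
        # Policy choice for the LiteLLM path: unknown confidence passes,
        # but the caller can log or flag it rather than silently assuming 1.0.
        return True
    return confidence >= minimum_confidence
```

The key point is that the absence of a confidence score becomes explicit at the call site instead of masquerading as a perfect score.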
```diff
  @patch('server.links.analyze_and_label.get_openai_client')
  @patch('server.links.analyze_and_label.get_openai_client')
  @patch('server.links.analyze_and_label.generate_analysis_with_labels')
  @patch('server.links.analyze_and_label.is_included', return_value=True)
  @patch('server.links.analyze_and_label.randomly_execute_with_sampling', return_value=True)
- def test_run_analysis_exception(mock_sampling, mock_is_included, mock_generate_analysis, mock_redis_with_vcon, sample_vcon):
+ def test_run_analysis_exception(mock_sampling, mock_is_included, mock_generate_analysis, mock_get_client, mock_redis_with_vcon, sample_vcon):
```
There are two identical @patch('server.links.analyze_and_label.get_openai_client') decorators stacked on this test. This will inject two mocks but the test function only accepts one, causing the test to error (argument mismatch) and making the patching behavior ambiguous. Remove the duplicate patch (or update the signature accordingly).
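A minimal, self-contained illustration of the correct pattern (the module-level functions here are stand-ins, not the real `analyze_and_label` code): stacked `@patch` decorators inject mocks bottom-up, and each target is patched exactly once so the mock count matches the parameter list.

```python
from unittest.mock import patch


def get_openai_client(opts=None):
    return "real client"


def generate_analysis_with_labels():
    return "real analysis"


# The decorator closest to the function supplies the first mock argument.
# Patching get_openai_client twice would inject an extra mock the
# signature does not accept, raising a TypeError at call time.
@patch(f"{__name__}.generate_analysis_with_labels")
@patch(f"{__name__}.get_openai_client")
def run_patched(mock_get_client, mock_generate):
    mock_get_client.return_value = "mock client"
    # Module-level lookup resolves to the patched attribute.
    return get_openai_client()
```

Calling `run_patched()` returns `"mock client"`, and once the patches exit, `get_openai_client()` returns `"real client"` again.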
Summary
This PR adds LiteLLM proxy support and updates all OpenAI-dependent integrations so they can optionally route requests through LiteLLM while continuing to support direct OpenAI and Azure OpenAI configurations.
The implementation is designed to be fully backward compatible. Existing deployments using OpenAI or Azure should continue to work without any configuration changes.
⸻
Changes
• Added LiteLLM proxy integration support
• Updated all relevant links to support:
• LiteLLM proxy
• Direct OpenAI
• Azure OpenAI
• Updated storage components that rely on OpenAI APIs to support the new routing mechanism
• Maintained backward compatibility with existing configurations
⸻
Affected Components
Links
• analyze
• check_and_tag
• analyze_vcon
• analyze_and_label
• detect_engagement
• openai_transcribe
• deepgram
Storage
• milvus
• chatgpt_files
⸻
Notes
• No breaking changes expected
• Configuration can now switch between LiteLLM and direct provider usage
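As an illustration of switching between the two routing modes, the sketch below mirrors the selection rule quoted in the review above (both LiteLLM settings must be present and non-empty). Only the `LITELLM_PROXY_URL` and `LITELLM_MASTER_KEY` key names come from the PR; the option dictionaries and values are placeholders.

```python
# Hypothetical link options: LiteLLM routing is chosen when both proxy
# settings are set, otherwise the direct provider path is used.
litellm_opts = {
    "LITELLM_PROXY_URL": "http://litellm:4000",
    "LITELLM_MASTER_KEY": "sk-litellm-master",
}

direct_openai_opts = {
    "OPENAI_API_KEY": "sk-your-openai-key",
}


def uses_litellm(opts: dict) -> bool:
    # Both values must be present and non-empty after stripping whitespace.
    litellm_url = (opts.get("LITELLM_PROXY_URL") or "").strip()
    litellm_key = (opts.get("LITELLM_MASTER_KEY") or "").strip()
    return bool(litellm_url and litellm_key)
```

Because the check only reads keys from `opts`, existing deployments that never set the LiteLLM keys keep their current OpenAI/Azure behavior unchanged.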