Build production-ready Text-to-SQL system with LangChain + FastAPI + React #1
Merged
- Root: .gitignore, .env.example, requirements.txt (pinned)
- model/: SQLAlchemy database factory (database.py) + star schema (schema.py)
- agent/: semantic layer, RAG build_index + retriever, HITL guard,
few-shot YAML examples, LCEL sql_chain pipeline
- api/: FastAPI app with CORS + global error handler; routes for
/api/query, /api/approve, /api/schema, /api/health
- data/: seed.py loads Olist CSVs into star schema SQLite tables
- frontend/: React 18 + Vite + TypeScript with dark terminal UI;
ChatWindow, SqlDisplay, ResultsTable, SchemaExplorer,
ApprovalModal components
- infra/: setup.sh, configure_env.sh, install_app.sh, systemd service,
nginx.conf, verify.sh, TROUBLESHOOTING.md, README_DEPLOY.md
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: nerdjerry <7092764+nerdjerry@users.noreply.github.com>
- model/schema.py: use lambda for datetime.utcnow() defaults so each row gets its own insertion timestamp instead of the class-definition time
- api/main.py: set allow_credentials=False when allow_origins=["*"] to avoid the CORS security risk of wildcard + credentials combination
- agent/sql_chain.py: cache ChatOpenAI instance at module level via _get_llm() to reuse HTTP connection pool across requests, reducing latency
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: nerdjerry <7092764+nerdjerry@users.noreply.github.com>
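The timestamp fix in model/schema.py comes down to when the default expression is evaluated. The sketch below uses only the standard library; `resolve_default` is a hypothetical stand-in for the way an ORM treats a column default (call it if it is callable, otherwise reuse the stored value), not SQLAlchemy's actual code. It also uses `datetime.now(timezone.utc)` instead of `datetime.utcnow()`, which is deprecated as of Python 3.12.

```python
import time
from datetime import datetime, timezone


def resolve_default(default):
    """Simplified stand-in for ORM default handling (an assumption):
    call the default if it is callable, otherwise return it unchanged."""
    return default() if callable(default) else default


# Evaluated once, when this line runs (analogous to class-definition time).
frozen_default = datetime.now(timezone.utc)

# Evaluated on every call (analogous to once per inserted row).
fresh_default = lambda: datetime.now(timezone.utc)

first_frozen = resolve_default(frozen_default)
first_fresh = resolve_default(fresh_default)
time.sleep(0.05)  # simulate time passing between two inserts
second_frozen = resolve_default(frozen_default)
second_fresh = resolve_default(fresh_default)

print("frozen default identical across rows:", first_frozen == second_frozen)
print("callable default advances per row:", second_fresh > first_fresh)
```

With the non-callable default both "rows" share one timestamp, while the callable produces a new value on each call, which is the behaviour the commit describes.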
…th LIMIT Agent-Logs-Url: https://github.com/nerdjerry/text-to-sql/sessions/0c3fd329-f07d-41a9-931c-a2e6d59977d1 Co-authored-by: nerdjerry <7092764+nerdjerry@users.noreply.github.com>
Agent-Logs-Url: https://github.com/nerdjerry/text-to-sql/sessions/0c3fd329-f07d-41a9-931c-a2e6d59977d1 Co-authored-by: nerdjerry <7092764+nerdjerry@users.noreply.github.com>
Copilot created this pull request from a session on behalf of techwithprateek, April 10, 2026 20:13
Pull request overview
Introduces a full, deployable Text-to-SQL application stack (data model + seeding, LangChain RAG agent, FastAPI backend, React frontend, and EC2/nginx/systemd deployment tooling).
Changes:
- Adds star-schema SQLAlchemy models plus Olist CSV seeding script.
- Implements LangChain LCEL pipeline with ChromaDB-backed schema retrieval and a HITL guard.
- Adds FastAPI API routes (query/approve/schema/health) and a React UI, plus infra scripts/configs for EC2 deployment.
Reviewed changes
Copilot reviewed 35 out of 42 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| requirements.txt | Pins backend dependencies (FastAPI, SQLAlchemy, LangChain, ChromaDB, OpenAI). |
| model/schema.py | Defines star schema tables + query_log ORM model. |
| model/database.py | Adds engine/session factory for SQLite/Postgres. |
| model/__init__.py | Package marker for model module. |
| data/seed.py | Loads Olist CSVs into the star schema (optional Kaggle download). |
| data/raw/.gitkeep | Ensures raw data directory exists in git. |
| agent/semantic_layer.py | Provides semantic schema dictionary used for prompting and indexing. |
| agent/build_index.py | Builds/persists ChromaDB embeddings for schema RAG. |
| agent/retriever.py | Retrieves relevant schema snippets from ChromaDB at query time. |
| agent/sql_chain.py | LCEL pipeline: question → SQL → HITL check → execute → log. |
| agent/hitl_guard.py | Regex-based guard to flag potentially dangerous SQL for approval. |
| agent/few_shot_examples.yaml | Few-shot Q→SQL examples injected into prompts. |
| agent/__init__.py | Package marker for agent module. |
| api/main.py | FastAPI app setup, CORS config, global exception handler, router inclusion. |
| api/routes/query.py | /api/query and /api/approve endpoints and response models. |
| api/routes/schema.py | /api/schema endpoint serving semantic schema. |
| api/routes/health.py | /api/health readiness/liveness checks (DB/Chroma/OpenAI). |
| api/routes/__init__.py | Package marker for routes module. |
| api/__init__.py | Package marker for api module. |
| frontend/package.json | Frontend dependencies/scripts for React + Vite + TS. |
| frontend/vite.config.ts | Dev server proxy configuration for /api → backend. |
| frontend/tsconfig.json | TypeScript compiler configuration. |
| frontend/index.html | HTML entrypoint including font imports. |
| frontend/src/main.tsx | React root bootstrap. |
| frontend/src/index.css | Global styling + theme variables. |
| frontend/src/api.ts | Axios client + typed API wrappers for backend endpoints. |
| frontend/src/App.tsx | Main layout wiring schema explorer, chat window, approval modal. |
| frontend/src/components/ChatWindow.tsx | Chat UI, query submission, message rendering. |
| frontend/src/components/SqlDisplay.tsx | SQL syntax highlighting + copy-to-clipboard + approval banner. |
| frontend/src/components/ResultsTable.tsx | Sortable results table rendering. |
| frontend/src/components/SchemaExplorer.tsx | Sidebar schema browser with expand/collapse and active-table highlighting. |
| frontend/src/components/ApprovalModal.tsx | “CONFIRM”-based human approval UI for flagged SQL. |
| infra/setup.sh | Provisioning script for Ubuntu (Python 3.11, Node 20, nginx). |
| infra/configure_env.sh | Writes .env with secrets/settings and locks permissions. |
| infra/install_app.sh | Installs Python deps, creates schema, seeds data, builds index + frontend. |
| infra/texttosql.service | systemd unit for running gunicorn/uvicorn workers. |
| infra/nginx.conf | nginx reverse proxy + static hosting configuration. |
| infra/verify.sh | Post-deploy verification script (systemd/nginx/health checks). |
| infra/README_DEPLOY.md | EC2 deployment walkthrough. |
| infra/TROUBLESHOOTING.md | Troubleshooting guide for common operational failures. |
| .gitignore | Ignores env/db/venv/chroma_store/dist/node_modules artifacts. |
| .env.example | Example environment variables including ALLOWED_ORIGINS. |
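
The agent/hitl_guard.py row above describes a regex-based guard that flags potentially dangerous SQL for human approval. A minimal sketch of that idea follows; the keyword list and the name `needs_approval` are assumptions for illustration, since the file's actual patterns are not shown in this summary.

```python
import re

# Hypothetical keyword list; the real guard's patterns may differ.
RISKY_SQL = re.compile(
    r"\b(DROP|DELETE|UPDATE|INSERT|ALTER|TRUNCATE|GRANT|REVOKE)\b",
    re.IGNORECASE,
)


def needs_approval(sql: str) -> bool:
    """Return True when the statement contains a write/DDL keyword
    and should be routed to a human before execution."""
    return RISKY_SQL.search(sql) is not None


print(needs_approval("DELETE FROM fact_orders"))              # flagged
print(needs_approval("SELECT updated_at FROM fact_orders"))   # not flagged
```

The word boundaries keep column names such as `updated_at` from matching, but a keyword inside a string literal would still be flagged. For a guard that only requests approval, that false positive is probably the safer failure mode.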
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…→ NullPool for file SQLite Agent-Logs-Url: https://github.com/nerdjerry/text-to-sql/sessions/39759ee2-4d70-4c95-bb8f-7b8ca7fa6378 Co-authored-by: nerdjerry <7092764+nerdjerry@users.noreply.github.com>
…sql, cache OpenAI health check Agent-Logs-Url: https://github.com/nerdjerry/text-to-sql/sessions/4860fb21-30ae-43d9-803b-b13b4dd1ec28 Co-authored-by: nerdjerry <7092764+nerdjerry@users.noreply.github.com>
…nfigure_env.sh shell escaping Agent-Logs-Url: https://github.com/nerdjerry/text-to-sql/sessions/ecc1783b-e871-4175-af27-1fa21dbdcdfd Co-authored-by: nerdjerry <7092764+nerdjerry@users.noreply.github.com>
… curl Agent-Logs-Url: https://github.com/nerdjerry/text-to-sql/sessions/b785d208-89e4-46ba-b0b2-39a94907e227 Co-authored-by: nerdjerry <7092764+nerdjerry@users.noreply.github.com>
Copilot stopped work on behalf of techwithprateek due to an error, April 11, 2026 03:04
- model/: database.py (engine factory), schema.py (star schema)
- agent/: semantic_layer.py, build_index.py, retriever.py, sql_chain.py, hitl_guard.py, few_shot_examples.yaml
- api/: main.py, routes/query.py, routes/schema.py, routes/health.py
- data/: seed.py (Olist CSV → star schema loader)
- frontend/: React 18 + Vite TypeScript app with ChatWindow, SqlDisplay, ResultsTable, SchemaExplorer, ApprovalModal
- infra/: setup.sh, configure_env.sh, install_app.sh, texttosql.service, nginx.conf, verify.sh, TROUBLESHOOTING.md, README_DEPLOY.md

- model/database.py: cache engine + sessionmaker as module-level singletons; replace StaticPool with NullPool for file-based SQLite
- agent/sql_chain.py: use `with get_session() as session:` in `_log_query`
- model/schema.py: add `unique=True` on `DimReviews.order_id` to enforce one-to-one relationship correctness
- model/schema.py: change `Float` → `Numeric(12, 2)` for `order_total_usd` and `freight_value_usd` currency columns
- agent/sql_chain.py: enforce SELECT/WITH-only allowlist in `_execute_sql`; reject multi-statement payloads
- api/routes/health.py: replace per-request `models.list()` with key-presence check + 60-second TTL cache
- infra/nginx.conf: replace unconditional `Connection "upgrade"` with a `map $http_upgrade $connection_upgrade` so keep-alive is preserved for normal HTTP requests
- infra/nginx.conf: remove `location = /api/health` exact-match block; use `map $request_uri $loggable` + `access_log ... if=$loggable` so the health endpoint inherits all proxy headers/timeouts from the `/api/` location
- infra/configure_env.sh: use `printf '%q'` to shell-escape all values written to `.env`, preventing breakage on `source` when values contain spaces, `$`, `#`, or other shell-sensitive characters
- infra/verify.sh: add `--connect-timeout 2 --max-time 5` to the EC2 metadata `curl` so the script fails fast on non-EC2 hosts or when the metadata service is blocked
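
The SELECT/WITH-only allowlist and multi-statement rejection listed above for agent/sql_chain.py can be sketched roughly as follows. The function name `is_read_only` and the exact rules are assumptions; the real `_execute_sql` check is not shown in this description.

```python
import re

# Statement must begin with SELECT or WITH (case-insensitive).
_ALLOWED_START = re.compile(r"^\s*(SELECT|WITH)\b", re.IGNORECASE)


def is_read_only(sql: str) -> bool:
    """Return True only for a single statement starting with SELECT or WITH."""
    body = sql.strip()
    # Allow one trailing semicolon; any semicolon left after that is
    # treated as a sign of a second statement and rejected.
    if body.endswith(";"):
        body = body[:-1]
    if ";" in body:
        return False
    return _ALLOWED_START.match(body) is not None


print(is_read_only("SELECT 1;"))                     # accepted
print(is_read_only("SELECT 1; DROP TABLE orders"))   # rejected: two statements
print(is_read_only("DROP TABLE orders"))             # rejected: not SELECT/WITH
```

This check is deliberately conservative: a semicolon inside a string literal is rejected too. A prefix check alone also does not catch data-modifying CTEs on PostgreSQL (`WITH ... DELETE ... RETURNING`), so a read-only database role or connection would be a stronger complement where that backend is used.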