QueryBot is a full-stack natural-language-to-query agent that supports PostgreSQL, MySQL, and MongoDB.
It lets a user ask a database question in plain English, introspects the database schema, asks an AI model to write read-only queries (SQL or MongoDB JSON), executes them, streams agent steps to the UI, retries on failures, and returns a human-readable answer with a table and optional chart.
Key feature: multiple simultaneous database connections. You can configure any number of Postgres, MySQL, and/or MongoDB databases. The UI shows a dropdown to switch between them — each question goes to the currently selected database, with prompts and tools auto-adapted per DB type.
The workspace implementation created a complete backend and frontend application from scratch.
Implemented backend:
- NestJS API application.
- Strict TypeScript configuration.
- Multi-database connection manager supporting PostgreSQL, MySQL, and MongoDB simultaneously.
- PostgreSQL connection using raw
pgwith pool management, schema introspection, and read-only SQL enforcement. - MySQL connection using
mysql2/promisewith pool management, schema introspection, and read-only SQL enforcement. - MongoDB connection using
mongodbnative driver with JSON query execution and document-sampling schema inference. - Per-connection schema caching (60-second TTL).
- SQL execution service with read-only enforcement.
- AI provider abstraction.
- Gemini provider using Gemini function calling.
- Groq provider using OpenAI-compatible tool calling.
- Self-correcting agent loop.
- REST chat endpoint with
connectionIdsupport. - Socket.IO streaming chat gateway with
connectionIdsupport. - Health, schema, and connections listing endpoints.
- Environment-based configuration (
DATABASE_URL, numberedDATABASE_URL1...DATABASE_URLN, orDATABASESJSON). - Production build configuration.
Implemented frontend:
- Next.js App Router application under
frontend/. - React chat interface.
- Schema sidebar with table search and expandable columns.
- Database selector dropdown (visible when >1 connection configured).
- Socket.IO streaming integration.
- REST fallback when the socket is unavailable.
- Live agent step display.
- SQL syntax highlighting.
- Result table rendering.
- Optional Recharts bar/line chart rendering.
- Responsive layout for desktop and mobile.
- Production build configuration.
Implemented project operations:
- Backend and frontend package manifests with exact dependency pins.
- Backend and frontend lockfiles generated by
npm install. .env.examplewith all required configuration..gitignore.- Build and typecheck scripts.
- README documentation.
Backend:
@nestjs/core11.1.26@nestjs/common11.1.26@nestjs/config4.0.4@nestjs/platform-fastify11.1.26@nestjs/websockets11.1.26@nestjs/platform-socket.io11.1.26pg8.21.0mysql23.14.0mongodb6.16.0socket.io4.8.3typescript6.0.3
AI:
- Gemini REST API function calling.
- Groq OpenAI-compatible chat completions tool calling.
@google/genai2.8.0is installed for compatibility/future SDK use.groq-sdk1.2.1andopenai6.42.0are installed for compatibility/future SDK use.- The current provider code uses
fetchdirectly so the request/response shape is explicit and easy to audit.
Frontend:
next16.2.7react19.2.7react-dom19.2.7socket.io-client4.8.3recharts3.8.1react-syntax-highlighter16.1.1lucide-react1.17.0typescript6.0.3
querybot/
.env.example
.gitignore
README.md
nest-cli.json
package.json
package-lock.json
tsconfig.json
src/
main.ts
app.module.ts
config/
configuration.ts
database/
database.module.ts
database.service.ts
schema.service.ts
connection-manager.service.ts
connections/
connection.interface.ts
postgres.connection.ts
mysql.connection.ts
mongodb.connection.ts
index.ts
agent/
agent.module.ts
agent.service.ts
prompts.ts
tools.ts
providers/
provider.types.ts
gemini.provider.ts
groq.provider.ts
chat/
chat.module.ts
chat.controller.ts
chat.gateway.ts
frontend/
package.json
package-lock.json
next.config.js
postcss.config.js
tailwind.config.js
tsconfig.json
src/
app/
layout.tsx
page.tsx
globals.css
components/
DataTable.tsx
DbSelector.tsx
MessageBubble.tsx
ResultChart.tsx
SchemaPanel.tsx
SqlBlock.tsx
ThinkingIndicator.tsx
types.ts
The application has two separately running processes.
Backend:
- Runs on port
3001by default. - Exposes REST endpoints.
- Exposes a Socket.IO namespace at
/chat. - Connects to PostgreSQL, MySQL, and/or MongoDB (any combination).
- Calls Gemini and/or Groq APIs.
Frontend:
- Runs on port
3000by default. - Loads connections list and schema from the backend.
- Connects to
http://localhost:3001/chatvia Socket.IO. - Falls back to
POST /chatif the socket is unavailable. - Renders a DB selector dropdown when multiple connections are configured.
- Renders messages, agent steps, SQL, tables, and charts.
High-level flow:
User question
-> Next.js chat UI
-> Socket.IO query event or POST /chat (with active connectionId)
-> AgentService
-> SchemaService gets cached schema for the active connection
-> Provider session starts with system prompt + user question
(prompt adapted for SQL or MongoDB based on connection type)
-> Gemini first, Groq fallback if configured
-> AI requests execute_sql or execute_query tool
-> DatabaseService delegates to the correct connection
-> AgentService sends tool result back to provider
-> Agent retries on query errors
-> AI calls answer tool
-> Backend returns final AgentResponse
-> Frontend renders answer, query, table, and chart
Centralizes runtime configuration.
Configuration keys:
PORTNODE_ENVFRONTEND_ORIGINDATABASES(JSON array for multi-DB, highest priority)DATABASE_URL1,DATABASE_URL2, ... (numbered env vars for multi-DB, easiest typed option)DATABASE_URL(single-DB backward compat, used when no numbered vars are set)DB_SCHEMAMAX_ROWSCACHE_TTLAI_PROVIDERGEMINI_API_KEYGEMINI_MODELGROQ_API_KEYGROQ_MODEL
Provider mode:
auto: use Gemini first, then Groq if Gemini fails.gemini: use Gemini only.groq: use Groq only.
Defaults:
PORT=3001NODE_ENV=developmentFRONTEND_ORIGIN=http://localhost:3000DB_SCHEMA=publicMAX_ROWS=500AI_PROVIDER=autoGEMINI_MODEL=gemini-3.1-pro-preview-customtoolsGROQ_MODEL=openai/gpt-oss-120b
Defines the abstract DbConnection base class and shared types.
abstract class DbConnection {
readonly id: string;
readonly type: DbType; // 'postgres' | 'mysql' | 'mongodb'
readonly name: string;
readonly schema: string;
readonly maxRows: number;
readonly url: string;
abstract connect(): Promise<void>;
abstract disconnect(): Promise<void>;
abstract executeQuery(query: string): Promise<QueryExecution>;
abstract introspectSchema(): Promise<TableInfo[]>;
abstract getQueryLanguage(): 'sql' | 'mongodb';
}Each database type implements this interface.
Full PostgreSQL implementation.
Responsibilities:
- Creates a
pg.Poolfrom the connection URL. - Schema introspection via
information_schema.tables,information_schema.columns,pg_class.reltuples. - Primary key detection via
table_constraints. - Foreign key detection via
constraint_column_usage. - Read-only SQL safety enforcement.
- Automatic
LIMITinjection.
Full MySQL implementation.
Responsibilities:
- Creates a
mysql2/promisepool from the connection URL. - Schema introspection via
information_schema.tables,information_schema.columns. - Primary key detection via
COLUMN_KEY = 'PRI'. - Foreign key detection via
key_column_usage.REFERENCED_TABLE_NAME. - Read-only SQL safety enforcement (MySQL-specific comments
#also scrubbed). - Automatic
LIMITinjection.
Full MongoDB implementation.
Responsibilities:
- Creates a
MongoClientfrom the connection URL. - Executes
findqueries (with filter, projection, sort) andaggregatepipelines. - Automatic
$limitinjection if not already present. - Schema inference by sampling up to 20 documents per collection.
- Type guessing from sampled values (
number,string,boolean,object).
MongoDB query JSON format:
{
"collection": "users",
"type": "find",
"filter": { "age": { "$gt": 18 } },
"projection": { "name": 1, "email": 1 },
"sort": { "createdAt": -1 },
"limit": 50
}Manages all database connections.
Responsibilities:
- Reads
DATABASESJSON → numberedDATABASE_URL1...N→ singleDATABASE_URL. - Instantiates and connects all
DbConnectioninstances on startup. - Exposes
getConnections(),getConnection(id),getDefaultConnection(). - Disconnects all on module shutdown.
Thin facade that delegates to ConnectionManagerService.
executeQuery(query, connectionId?)— delegates to the specified connection.executeQuery(query)— uses the default connection.
Per-connection schema introspection with caching.
Responsibilities:
introspectSchema(connectionId)— introspects and caches per connection.toPromptString(connectionId)— converts cached schema to compact prompt text.getAllSchemas()— returns all cached schemas keyed by connection ID.- Cache TTL: 60 seconds per connection.
Defines tools dynamically based on query language.
- SQL connections:
execute_sqlandanswer. - MongoDB connections:
execute_queryandanswer.
execute_sql input:
{
sql: string;
reasoning: string;
}execute_query input:
{
query: string; // JSON for MongoDB
reasoning: string;
}answer input:
{
summary: string;
chartType: 'bar' | 'line' | 'none';
chartXKey?: string;
chartYKey?: string;
}Builds system prompts dynamically per connection type.
- For SQL databases: standard SQL instructions, read-only enforcement, aliases, avoid
SELECT *. - For MongoDB: JSON query format instructions, collection names, field references.
- The prompt includes the DB name and query language.
Core agent loop.
Responsibilities:
- Accepts optional
connectionId. - Resolves the active connection from
ConnectionManagerService. - Builds system prompt and tool declarations per connection type.
- Validates question input.
- Runs the provider session.
- Streams each agent step through callback.
- Executes requested queries through the correct connection.
- Sends results or errors back to the provider.
- Falls back from Gemini to Groq when
AI_PROVIDER=auto. - Returns structured
AgentResponse.
Provider abstraction. Key change: createSession now takes tools as a third parameter, enabling dynamic tool sets per DB type.
REST endpoints:
GET /health
GET /schema?connectionId=...
GET /connections
POST /chat
GET /connections returns the list of configured databases:
[
{ "id": "pg1", "type": "postgres", "name": "Sales DB" },
{ "id": "m1", "type": "mysql", "name": "Users DB" }
]POST /chat accepts an optional connectionId in the body.
Socket.IO gateway for streaming. QueryPayload now includes connectionId.
Main application screen.
Responsibilities:
- Loads connections list from
GET /connections. - Loads schema for the active connection from
GET /schema?connectionId=.... - Creates Socket.IO connection to
http://localhost:3001/chat. - Passes
activeConnectionIdto socket emits and REST calls. - When the user switches databases via
DbSelector, clears messages and reloads schema. - Falls back to
POST /chatwhen socket is unavailable.
Database connection switcher.
- Visible only when >1 connection is configured.
- Shows a dropdown with connection names and types.
- Selecting a connection triggers schema reload and message clear.
Renders the database schema sidebar (unchanged from original).
Renders user and assistant messages (unchanged).
Renders live agent steps (unchanged).
Renders query text using react-syntax-highlighter (works for both SQL and JSON).
Renders query results (unchanged).
Renders charts using Recharts (unchanged).
Shared frontend TypeScript types.
Added for multi-DB:
interface DbConnectionInfo {
id: string;
type: string;
name: string;
}Priority order: DATABASES JSON > numbered DATABASE_URL1...N > single DATABASE_URL.
Option 1 — Numbered env vars (easiest for multiple databases):
# Auto-detects type from URL prefix: postgresql://, mysql://, mongodb://
DATABASE_URL1=postgresql://user:password@localhost:5432/mydb
DATABASE_URL2=mysql://user:password@localhost:3306/mydb
DATABASE_URL3=mongodb://localhost:27017/mydb
# Optional per-DB overrides (DB_NAME1, DB_SCHEMA1, MAX_ROWS1, etc.)
DB_NAME1=Sales Database
DB_SCHEMA1=public
MAX_ROWS1=1000Numbered var suffixes:
DATABASE_URL1,DATABASE_URL2, ... — connection URL (type auto-detected).DB_NAME1,DB_NAME2, ... — (optional) display name in the UI; falls back to the database name from the URL.DB_SCHEMA1,DB_SCHEMA2, ... — (optional, SQL only) schema to introspect; falls back toDB_SCHEMAor"public".MAX_ROWS1,MAX_ROWS2, ... — (optional) max rows per query; falls back toMAX_ROWSor500.
Option 2 — Single database (backward compatible):
DATABASE_URL=postgresql://user:password@localhost:5432/mydbOption 3 — JSON array (highest priority):
DATABASES=[{"id":"pg1","type":"postgres","url":"postgresql://user:pass@localhost:5432/mydb","name":"My PG","maxRows":500},{"id":"mysql1","type":"mysql","url":"mysql://user:pass@localhost:3306/mydb","name":"My MySQL","schema":"mydb","maxRows":500},{"id":"mongo1","type":"mongodb","url":"mongodb://localhost:27017/mydb","name":"My Mongo","maxRows":500}]Each entry supports:
id— unique identifier (used in API calls and UI).type—"postgres","mysql", or"mongodb".url— connection string.name— display name in the UI dropdown.schema— (optional, SQL only) schema to introspect, defaults to"public".maxRows— (optional) max rows per query, defaults to500.
Common AI and app config (all options):
AI_PROVIDER=auto # auto, gemini, or groq
GEMINI_API_KEY=
GEMINI_MODEL=gemini-3.1-pro-preview-customtools
GROQ_API_KEY=
GROQ_MODEL=openai/gpt-oss-120b
PORT=3001
NODE_ENV=development
FRONTEND_ORIGIN=http://localhost:3000
DB_SCHEMA=public # fallback for all SQL connections
MAX_ROWS=500 # fallback for all connections
CACHE_TTL=300 # response cache TTL in seconds (0 to disable)Frontend .env.local is optional:
NEXT_PUBLIC_API_URL=http://localhost:3001Install backend dependencies:
npm installInstall frontend dependencies:
cd frontend
npm installCreate backend .env:
copy .env.example .envFill in:
DATABASE_URL,DATABASE_URL1...N, orDATABASES- at least one of
GEMINI_API_KEYorGROQ_API_KEY
Run backend:
npm run start:devRun frontend:
cd frontend
npm run devOpen:
http://localhost:3000
Backend typecheck:
npm run typecheckBackend production build:
npm run buildFrontend typecheck:
cd frontend
npm run typecheckFrontend production build:
cd frontend
npm run buildVerification already performed:
- Backend
npm run typecheck: passed. - Backend
npm run build: passed. - Frontend
npm run typecheck: passed. - Frontend
npm run build: passed. - Backend production dependency audit: passed with
0 vulnerabilities. - Frontend production dependency audit: reported 2 moderate issues through Next's bundled
postcss.
Frontend audit note:
npm audit --omit=devreports a moderatepostcssadvisory via Next.- npm suggests
npm audit fix --force, but that would installnext@9.3.3, which is a major downgrade and not acceptable. - No forced audit fix was applied.
CREATE DATABASE querybot_test;Connect to querybot_test and run:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR,
email VARCHAR,
country VARCHAR,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INT REFERENCES users(id),
amount DECIMAL,
created_at TIMESTAMP DEFAULT NOW()
);
INSERT INTO users (name, email, country) VALUES
('Alice', 'alice@example.com', 'BD'),
('Bob', 'bob@example.com', 'US'),
('Charlie', 'charlie@example.com', 'IN');
INSERT INTO orders (user_id, amount) VALUES
(1, 49.99), (1, 29.99), (2, 99.99), (3, 19.99);DATABASE_URL=postgresql://user:password@localhost:5432/querybot_testCREATE DATABASE querybot_test;
USE querybot_test;
CREATE TABLE users (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100),
country VARCHAR(10),
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE orders (
id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT,
amount DECIMAL(10,2),
created_at TIMESTAMP DEFAULT NOW(),
FOREIGN KEY (user_id) REFERENCES users(id)
);
INSERT INTO users (name, email, country) VALUES
('Alice', 'alice@example.com', 'BD'),
('Bob', 'bob@example.com', 'US'),
('Charlie', 'charlie@example.com', 'IN');
INSERT INTO orders (user_id, amount) VALUES
(1, 49.99), (1, 29.99), (2, 99.99), (3, 19.99);DATABASE_URL=mysql://user:password@localhost:3306/querybot_testuse querybot_test;
db.users.insertMany([
{ name: "Alice", email: "alice@example.com", country: "BD", age: 30 },
{ name: "Bob", email: "bob@example.com", country: "US", age: 25 },
{ name: "Charlie", email: "charlie@example.com", country: "IN", age: 35 }
]);
db.orders.insertMany([
{ userId: 1, amount: 49.99, createdAt: new Date() },
{ userId: 1, amount: 29.99, createdAt: new Date() },
{ userId: 2, amount: 99.99, createdAt: new Date() },
{ userId: 3, amount: 19.99, createdAt: new Date() }
]);DATABASE_URL=mongodb://localhost:27017/querybot_testUsing numbered env vars (easiest):
DATABASE_URL1=postgresql://user:password@localhost:5432/querybot_test
DATABASE_URL2=mysql://user:password@localhost:3306/querybot_test
DATABASE_URL3=mongodb://localhost:27017/querybot_test
DB_NAME1=Postgres DB
DB_NAME2=MySQL DB
DB_NAME3=Mongo DB
DB_SCHEMA2=querybot_testOr using JSON (higher priority):
DATABASES=[{"id":"pg","type":"postgres","url":"postgresql://user:pass@localhost:5432/querybot_test","name":"Postgres DB"},{"id":"mysql","type":"mysql","url":"mysql://user:pass@localhost:3306/querybot_test","name":"MySQL DB","schema":"querybot_test"},{"id":"mongo","type":"mongodb","url":"mongodb://localhost:27017/querybot_test","name":"Mongo DB"}]Suggested test questions:
How many users are there?Show users by countryWhat is the total revenue per user?List all orders with user namesDrop the users table
Expected behavior:
- Normal analytics questions should return a query, a table, and possibly a chart.
Drop the users tableshould be blocked by the safety check.- Switching databases in the UI should reload the schema and route subsequent questions to that DB.
Implemented controls:
- Database mutations are blocked in the SQL safety layer (Postgres and MySQL).
- Only
SELECTandWITHstatements are allowed for SQL databases. - Multiple SQL statements are blocked.
- Unsafe SQL keywords are blocked after comments and strings are scrubbed.
MAX_ROWSis applied to queries without an existingLIMIT.- MongoDB queries are restricted to
findandaggregate(no insert/update/delete). - Schema table/collection names are filtered to safe identifier patterns.
- Full query results are not logged by the backend.
- Connection URLs are never returned from API endpoints.
- Production CORS uses
FRONTEND_ORIGIN.
Important remaining risk:
- This is a defensive application-level safety layer, not a substitute for database permissions.
- The database users in connection URLs should still be read-only.
- Recommended: create dedicated read-only roles with access only to intended schemas/tables/collections.
Provider modes:
AI_PROVIDER=autoUses all configured providers in this order:
Gemini -> Groq
AI_PROVIDER=geminiUses only Gemini.
AI_PROVIDER=groqUses only Groq.
Fallback rule:
- If a provider throws an API/runtime error before a complete answer is produced, QueryBot emits an error step and tries the next configured provider.
No fallback occurs for:
- A successful provider response that says the question cannot be answered.
- Queries returning zero rows.
- Safety rejection handled inside the normal agent loop.
QueryBot includes an in-memory response cache to avoid repeat AI calls for identical questions.
How it works:
- Before running any AI provider, AgentService computes a cache key from
question + connectionId + schemaFingerprint. - The schema fingerprint is a hash of table names, row counts, column names, and types — it changes automatically when the database schema changes.
- If the key exists, the cached
AgentResponseis replayed (steps are emitted through the same callback, so the UI streams the same way). - If the key does not exist, the normal provider loop runs and the successful result is stored with the configured TTL.
- Cache is scoped per-connection — switching databases produces a different key and bypasses stale entries.
Configuration:
| Env Var | Default | Description |
|---|---|---|
CACHE_TTL |
300 |
Cache TTL in seconds. Set to 0 to disable caching. |
Cache cost model:
Without caching, every duplicate question costs the full LLM round-trip (typically 2-4 calls per question, averaging 2K-5K tokens each). With caching, the second identical question returns instantly from memory. For a user who asks the same question twice (e.g., refreshed the page or re-asked), this saves 100% of the LLM cost.
Cache entries automatically expire when:
- The TTL elapses.
- The schema changes (different fingerprint → cache miss).
- The server restarts (in-memory).
For a distributed/production deployment, replace the in-memory store with Redis via the same interface.
- The backend will not start without
DATABASE_URL,DATABASE_URL1...N, orDATABASESconfigured. - Chat will not work without at least one configured AI API key.
- Full end-to-end testing was not run against real databases and live API keys during implementation.
- Gemini is implemented through REST instead of the SDK to keep message/tool history explicit.
- The app does not currently include automated unit tests.
- The app does not currently include authentication.
- The app does not currently persist chat history.
- The app assumes one active assistant response at a time in the UI.
- The frontend has a minimal markdown renderer, not a full markdown parser.
- The SQL safety checker is conservative and may block some valid advanced SQL.
- MongoDB schema inference samples only 20 documents per collection and may miss rarely-used fields.
High priority:
- Add a dedicated read-only database role and document grants.
- Add unit tests for
DatabaseServiceSQL safety. - Add mocked provider tests for the self-correction loop.
- Add e2e tests for
POST /chatand Socket.IO streaming. - Add request timeout handling around provider calls.
- Add query timeout at the database session level.
Medium priority:
- Move Gemini provider from raw REST to official SDK.
- Add configurable provider timeout and retry policy.
- Add a query history panel.
- Add CSV export for result tables.
- Add auth before deploying against private databases.
- Add per-user/session rate limiting.
Deployment priority:
- Set production
FRONTEND_ORIGIN. - Use read-only DB users.
- Store API keys and database URLs in platform secrets.
- Verify provider model availability in the target account.
- Run
npm auditregularly.
Backend deployment target:
- Railway or equivalent Node.js host.
Frontend deployment target:
- Vercel or equivalent Next.js host.
Backend required environment:
- Node.js compatible with the installed dependency set.
DATABASE_URL,DATABASE_URL1...N, orDATABASESAI_PROVIDER- at least one AI key
FRONTEND_ORIGIN
Frontend required environment:
NEXT_PUBLIC_API_URLpointing to deployed backend.
Backend:
npm install
npm run typecheck
npm run build
npm run start:devFrontend:
cd frontend
npm install
npm run typecheck
npm run build
npm run devProduction start after backend build:
npm run startProduction start after frontend build:
cd frontend
npm run start