The pgEdge RAG Server is a simple API server for performing Retrieval-Augmented Generation (RAG) of text based on content from a PostgreSQL database using pgvector.
Documentation for the RAG Server is available online at: https://docs.pgedge.com/pgedge-rag-server/
The RAG Server features:
- Multiple RAG pipelines with configurable embedding and LLM providers
- Hybrid search combining vector similarity and BM25 text matching
- Support for OpenAI, Anthropic, Voyage, and Ollama LLM providers
- Token budget management to control LLM costs
- Optional streaming responses via Server-Sent Events
- TLS/HTTPS support
To use the pgEdge RAG Server, you must:
- Build the pgedge-rag-server binary.
- Create a configuration file that specifies details used by the RAG server.
- Invoke pgedge-rag-server.
Before installing pgEdge RAG Server, you should install or obtain:
- Go 1.22 or later
- PostgreSQL 14 or later, with pgvector installed
- API keys for your chosen LLM providers
Before building the binary, clone the RAG server repository and navigate into the root of the repo:
    git clone https://github.com/pgedge/pgedge-rag-server.git
    cd pgedge-rag-server

Build the pgEdge RAG Server binary with the following command; the binary is created in the bin directory:

    make build

After installation, verify that the tool is working:

    pgedge-rag-server version

You can also access online help after building the RAG Server:

    pgedge-rag-server help

Create a configuration file that specifies server connection details and other properties (see the online documentation for complete details). The default name of the file is pgedge-rag-server.yaml; when invoked, the server searches for the configuration file in:
- /etc/pgedge/pgedge-rag-server.yaml
- the directory that contains the pgedge-rag-server binary.
You can optionally use the -config option on the command line to specify the complete path to a custom location for the configuration file.
The following sample demonstrates a minimal configuration:
    server:
      listen_address: "0.0.0.0"
      port: 8080

    pipelines:
      - name: "my-docs"
        description: "Search my documentation"
        database:
          host: "localhost"
          port: 5432
          database: "mydb"
        tables:
          - table: "documents"
            text_column: "content"
            vector_column: "embedding"
        embedding_llm:
          provider: "openai"
          model: "text-embedding-3-small"
        rag_llm:
          provider: "anthropic"
          model: "claude-sonnet-4-20250514"

After building the binary and creating a configuration file, you can invoke pgedge-rag-server. Use the command:

    ./bin/pgedge-rag-server (options)

You can include the following options when invoking the server:
| Option | Description |
|---|---|
| -config | Path to configuration file (see below) |
| -openapi | Output OpenAPI v3 specification and exit |
| -version | Show version information and exit |
| -help | Show help message and exit |
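For readers curious how single-dash options like these are typically handled in a Go program, the following is a minimal sketch using the standard `flag` package. It is an assumption-laden illustration rather than the server's source: the `options` struct and `parseOptions` function are hypothetical names, and only the four option names come from the table above.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// options is a hypothetical container for the four documented options.
type options struct {
	ConfigPath  string
	OpenAPI     bool
	ShowVersion bool
	ShowHelp    bool
}

// parseOptions parses command-line arguments (without the program name).
func parseOptions(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("pgedge-rag-server", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // the caller decides how to report errors
	fs.StringVar(&opts.ConfigPath, "config", "", "path to configuration file")
	fs.BoolVar(&opts.OpenAPI, "openapi", false, "output OpenAPI v3 specification and exit")
	fs.BoolVar(&opts.ShowVersion, "version", false, "show version information and exit")
	fs.BoolVar(&opts.ShowHelp, "help", false, "show help message and exit")
	err := fs.Parse(args)
	return opts, err
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		fmt.Println("invalid options:", err)
		return
	}
	fmt.Printf("%+v\n", opts)
}
```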
When you invoke pgedge-rag-server you can optionally include the -config option to specify the complete path to a custom location for the configuration file. If you do not specify a location on the command line, the server searches for configuration files in:
- /etc/pgedge/pgedge-rag-server.yaml
- pgedge-rag-server.yaml (in the binary's directory)
The online documentation contains detailed information about using the API, and allows you to try the API in a browser.
To List Available Pipelines

    curl http://localhost:8080/v1/pipelines

To Query a Pipeline

    curl -X POST http://localhost:8080/v1/pipelines/my-docs \
      -H "Content-Type: application/json" \
      -d '{"query": "How do I configure replication?"}'

To Query with Streaming

    curl -X POST http://localhost:8080/v1/pipelines/my-docs \
      -H "Content-Type: application/json" \
      -d '{"query": "How do I configure replication?", "stream": true}'

This project is licensed under the PostgreSQL License.