A simple system to scrape iRacing forums, store posts with embeddings, and query them using AI.
Demo:
- Semantic Search: Find relevant forum posts using sentence embeddings
- AI-Powered Q&A: Ask questions and get answers based on forum content
- Interactive CLI: Easy-to-use command-line interface
- Vector Search: Uses Milvus for fast similarity search
- Install dependencies:
    uv sync
- Set your OpenAI API key (choose one method):
  Option A: Create a .env file (recommended):
    echo "OPENAI_API_KEY=your-openai-api-key-here" > .env
  Option B: Set an environment variable:
    export OPENAI_API_KEY="your-openai-api-key-here"
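Either option makes the key available to the program at startup. As a minimal sketch of how such a lookup could work, the snippet below checks the environment first and falls back to a `.env` file. The `load_api_key` helper and its precedence order are assumptions for illustration only; they are not taken from this project's code.

```python
import os
from pathlib import Path


def load_api_key(env_path=".env", name="OPENAI_API_KEY"):
    """Return the key from the environment, falling back to a .env file.

    Hypothetical helper for illustration; not part of this project.
    """
    value = os.environ.get(name)
    if value:
        return value

    path = Path(env_path)
    if path.is_file():
        for line in path.read_text().splitlines():
            line = line.strip()
            # Skip blanks, comments, and lines that are not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, val = line.partition("=")
            if key.strip() == name:
                # Drop optional surrounding quotes.
                return val.strip().strip("\"'")
    return None
```

Returning `None` rather than raising lets the caller decide how to report a missing key.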
To scrape the forum posts and build the database, run scraper.py for both the main and the old forums. When the browser first opens, log in; the crawler starts once you are logged in.
To run the scraper:
- For main forums:
    python scraper.py
- For JForum sections:
    python scraper.py jforum
Run the interactive query system:
    python query_system.py
Available commands:
- ask <question> - Ask a question about the forum posts
- search <query> - Search for similar posts
- post <id> - Get a specific post by ID
- help - Show available commands
- quit - Exit the program
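The commands above all follow the same shape: a keyword, optionally followed by an argument. A rough sketch of how one input line could be split into those two parts is shown here; `parse_command` is a hypothetical helper, and the actual parsing in query_system.py may differ.

```python
COMMANDS = {"ask", "search", "post", "help", "quit"}


def parse_command(line):
    """Split an input line into (command, argument).

    Hypothetical helper; the real CLI may parse input differently.
    """
    command, _, argument = line.strip().partition(" ")
    command = command.lower()
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")

    argument = argument.strip()
    if command == "post":
        # The post command takes a numeric ID; int() raises ValueError otherwise.
        return command, int(argument)
    if command in {"ask", "search"} and not argument:
        raise ValueError(f"{command} needs an argument")
    return command, argument or None
```

For example, `parse_command("post 123")` yields the command name and an integer ID that can be passed on to a lookup such as `get_post_by_id`.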
from query_system import ForumQuerySystem
# Initialize the system
query_system = ForumQuerySystem(openai_api_key="your-key")
# Search for similar posts
posts = query_system.search_similar_posts("telemetry data", limit=5)
# Ask a question
answer = query_system.ask_question("How do I get started with the iRacing SDK?")
# Get a specific post
post = query_system.get_post_by_id(123)
# Don't forget to close the connection
query_system.close()
- Embeddings: Each forum post is converted to a vector embedding using the all-MiniLM-L6-v2 model
- Similarity Search: When you search or ask a question, the system finds the most similar posts using Milvus vector search with cosine similarity
- Context Building: Relevant posts are used as context for the OpenAI model
- AI Response: GPT-5 generates answers based on the forum content
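The retrieval and context-building steps above can be sketched in plain Python. This is a brute-force illustration, not the project's implementation: Milvus performs the similarity search inside the database, and the toy 3-dimensional vectors, the sample posts, and the `top_k` and `build_context` helpers are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, posts, k=5):
    """Rank posts by similarity to the query vector, best first."""
    ranked = sorted(
        posts,
        key=lambda p: cosine_similarity(query_vec, p["vector"]),
        reverse=True,
    )
    return ranked[:k]


def build_context(posts):
    """Join the retrieved posts into one text block for the model prompt."""
    return "\n\n".join(
        f"[{p['author']}, {p['date']}] {p['text']}" for p in posts
    )


# Toy vectors stand in for real sentence embeddings.
posts = [
    {"author": "alice", "date": "2024-01-02",
     "text": "Post about telemetry logging.", "vector": [1.0, 0.0, 0.0]},
    {"author": "bob", "date": "2024-02-03",
     "text": "Post about car setups.", "vector": [0.0, 1.0, 0.0]},
    {"author": "carol", "date": "2024-03-04",
     "text": "Post about telemetry tools.", "vector": [0.9, 0.1, 0.0]},
]
query = [1.0, 0.0, 0.0]

best = top_k(query, posts, k=2)
context = build_context(best)
```

The resulting `context` string would then be placed in the prompt alongside the user's question, so the model answers from the retrieved posts rather than from general knowledge.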
The system uses a Milvus vector database with a collection named 'forum_posts' containing fields like:
- id (auto-generated)
- vector (embedding)
- source
- author
- date
- text
- comment_id
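One way to picture a record in this collection is as a small data class. The field names below come from the list above; the Python types are assumptions, since the actual Milvus data types are not stated here. The dimension of 384 is the output size of all-MiniLM-L6-v2. `ForumPost` and `to_row` are hypothetical names used only for this sketch.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional

EMBEDDING_DIM = 384  # output size of all-MiniLM-L6-v2


@dataclass
class ForumPost:
    """Illustrative record shape; field types are assumed, not confirmed."""
    vector: List[float]
    source: str
    author: str
    date: str          # assumed to be stored as a string
    text: str
    comment_id: str    # assumed to be stored as a string
    id: Optional[int] = None  # auto-generated by the database on insert

    def to_row(self):
        """Return the insert payload, omitting the auto-generated id."""
        if len(self.vector) != EMBEDDING_DIM:
            raise ValueError(
                f"expected {EMBEDDING_DIM} dimensions, got {len(self.vector)}"
            )
        row = asdict(self)
        row.pop("id")
        return row
```

Leaving `id` out of the payload reflects the note that it is auto-generated; the dimension check catches embeddings produced by a different model before they reach the database.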
- scraper.py - Entry point for forum scraping (wrapper for main and JForum scrapers)
- main_forum_scraper.py - Scraper for main iRacing forums
- jforum_scraper.py - Scraper for JForum sections
- milvus.py - Milvus utility functions for setup and saving posts
- query_system.py - Main query system and CLI interface
- pyproject.toml - Project dependencies and configuration
