This project combines GraphRAG with private-domain Deep Search to build an explainable, reasoning-capable question-answering system. It also integrates multi-agent collaboration and knowledge graph enhancement into a complete RAG interaction solution.
💡 Inspired by retrieval-augmented reasoning and deep search scenarios, this project explores how RAG and agents can be integrated in future applications.
Note: DeepWiki indexed this project on April 28, which can help in understanding the overall code and core working principles. Address: https://deepwiki.com/1517005260/graph-rag-agent
graph-rag-agent/
├── agent/ # 🤖 Agent module - core interaction layer
│ ├── base.py # Agent base class
│ ├── graph_agent.py # Graph-based Agent
│ ├── hybrid_agent.py # Hybrid search Agent
│ ├── naive_rag_agent.py # Simple vector retrieval Agent
│ ├── deep_research_agent.py # Deep research Agent
│ ├── fusion_agent.py # Multi-agent collaboration Agent
│ └── agent_coordinator.py # Multi-agent coordinator
├── assets/ # 🖼️ Static assets
│ ├── deepsearch.svg # RAG evolution diagram
│ └── start.md # Quick start document
├── build/ # 🏗️ Knowledge graph construction module
│ ├── main.py # Build entry
│ ├── build_graph.py # Basic graph construction
│ ├── build_index_and_community.py # Index and community construction
│ ├── build_chunk_index.py # Text chunk index construction
│ ├── incremental/ # Incremental update submodule
│ └── incremental_update.py # Incremental update management
├── CacheManage/ # 📦 Cache management module
│ ├── manager.py # Unified cache manager
│ ├── backends/ # Storage backends
│ ├── models/ # Data models
│ └── strategies/ # Cache key generation strategies
├── community/ # 🔍 Community detection and summarization module
│ ├── detector/ # Community detection algorithms
│ └── summary/ # Community summary generation
├── config/ # ⚙️ Configuration module
│ ├── neo4jdb.py # Database connection management
│ ├── prompt.py # Prompt templates
│ └── settings.py # Global settings
├── evaluator/ # 📊 Evaluation system
│ ├── core/ # Evaluation core components
│ ├── metrics/ # Evaluation metrics implementation
│ └── test/ # Evaluation test scripts
├── frontend/ # 🖥️ Frontend interface
│ ├── app.py # Application entry
│ ├── components/ # UI components
│ └── utils/ # Frontend utilities
├── graph/ # 📈 Graph construction module
│ ├── core/ # Core components
│ ├── extraction/ # Entity-relation extraction
│ ├── indexing/ # Index management
│ └── processing/ # Entity processing
├── model/ # 🧩 Model management
│ └── get_models.py # Model initialization
├── processor/ # 📄 Document processor
│ ├── document_processor.py # Document processing core
│ ├── file_reader.py # Multi-format file reading
│ └── text_chunker.py # Text chunking
├── search/ # 🔎 Search module
│ ├── local_search.py # Local search
│ ├── global_search.py # Global search
│ └── tool/ # Search toolset
│     ├── naive_search_tool.py # Simple search
│     ├── deep_research_tool.py # Deep research tool
│     └── reasoning/ # Reasoning components
├── server/ # 🖧 Backend service
│ ├── main.py # FastAPI application entry
│ ├── models/ # Data models
│ ├── routers/ # API routes
│ └── services/ # Business logic
└── test/ # 🧪 Test module
    ├── search_with_stream.py # Streaming output test
    └── search_without_stream.py # Standard output test
Each module also has its own README describing its functionality.
- The growing reasoning ability of large models: Where are RAG and Agent heading?
- Enterprise-level knowledge graph interactive QA system solution
- Jean - Building a graph with domestic LLM + LangChain + Neo4j
- GraphRAG vs DeepSearch? The answer from the GraphRAG proposer
- Reproducing GraphRAG from scratch: Complete implementation of GraphRAG's core functions, representing knowledge as a graph structure
- Innovative integration of DeepSearch and GraphRAG: While existing DeepSearch frameworks are mainly based on vector databases, this project innovatively combines them with knowledge graphs
- Multi-agent collaborative architecture: Implements collaboration among different types of agents to enhance the ability to handle complex problems
- Comprehensive evaluation system: Provides 20+ evaluation metrics to comprehensively measure system performance
- Incremental update mechanism: Supports dynamic incremental construction and intelligent deduplication of the knowledge graph
- Visualization of reasoning process: Displays the AI's reasoning trajectory to improve explainability and transparency
Please refer to the Quick Start document (assets/start.md).
- Multi-format document processing: Supports TXT, PDF, MD, DOCX, DOC, CSV, JSON, YAML/YML, and other formats
- LLM-driven entity-relation extraction: Uses large language models to identify entities and relationships from text
- Incremental update mechanism: Supports dynamic updates on existing graphs and intelligently handles conflicts
- Community detection and summarization: Automatically identifies knowledge communities and generates summaries, supporting Leiden and SLLPA algorithms
- Consistency validation: Built-in graph consistency check and repair mechanism
- Multi-level retrieval strategies: Supports local search, global search, hybrid search, and other modes
- Graph-enhanced context: Uses graph structure to enrich retrieval content and provide more comprehensive knowledge background
- Chain of Exploration: Implements multi-step exploration on the knowledge graph
- Community-aware retrieval: Optimizes search results based on knowledge community structure
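To make the Chain of Exploration idea concrete, here is a minimal sketch of multi-step expansion over a graph. It is not the project's implementation: the function name `chain_of_exploration`, the adjacency-dict representation, and the `max_steps` parameter are assumptions for illustration, and the toy edges are made up (only the entity names are borrowed from the sample test output in this README).

```python
from collections import deque


def chain_of_exploration(graph, seeds, max_steps=2):
    """Breadth-first expansion from seed entities.

    Returns a dict mapping each reached entity to the step at which it
    was first discovered (seeds are step 0).
    """
    visited = {seed: 0 for seed in seeds}
    queue = deque(visited)
    while queue:
        node = queue.popleft()
        step = visited[node]
        if step >= max_steps:
            continue  # do not expand beyond the step budget
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited[neighbor] = step + 1
                queue.append(neighbor)
    return visited


# Hypothetical toy graph; a real system would query Neo4j instead.
toy_graph = {
    "National Scholarship": ["Scholarship selection principles", "Selection committee"],
    "School discipline regulations": ["Student academic records"],
    "Selection committee": ["Student academic records"],
}

path = chain_of_exploration(
    toy_graph,
    ["National Scholarship", "School discipline regulations"],
    max_steps=1,
)
for entity, step in path.items():
    print(f"Step {step}: {entity}")
```

A breadth-first walk keeps the step number of each entity, which is what makes the exploration path reportable afterwards.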
- Multi-step thinking-search-reasoning: Supports decomposition and deep exploration of complex problems
- Evidence chain tracking: Records the evidence source of each reasoning step to improve explainability
- Visualization of reasoning process: Real-time display of AI's reasoning trajectory
- Multi-path parallel search: Executes multiple search strategies simultaneously, making comprehensive use of different knowledge sources
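Multi-path parallel search can be sketched with a thread pool that runs several strategies on the same query and collects the results by name. This is an illustrative sketch only: `parallel_search` and the two stand-in strategy functions are hypothetical and do not reflect the project's actual search classes.

```python
from concurrent.futures import ThreadPoolExecutor


def local_search(query):
    # Stand-in for an entity-level search (hypothetical).
    return [f"local hit for {query!r}"]


def global_search(query):
    # Stand-in for a community-level search (hypothetical).
    return [f"global hit for {query!r}"]


def parallel_search(query, strategies):
    """Run every strategy on the same query concurrently.

    `strategies` maps a strategy name to a callable taking the query.
    Returns a dict mapping each name to that strategy's result.
    """
    if not strategies:
        return {}
    with ThreadPoolExecutor(max_workers=len(strategies)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in strategies.items()}
        return {name: future.result() for name, future in futures.items()}


results = parallel_search(
    "academic scholarship",
    {"local": local_search, "global": global_search},
)
for name, hits in results.items():
    print(name, hits)
```

Threads suit this case because each strategy is expected to spend most of its time waiting on a database or an LLM call rather than on CPU work.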
- NaiveRagAgent: Basic vector retrieval agent, suitable for simple questions
- GraphAgent: Graph-based agent, supports relational reasoning
- HybridAgent: Agent combining multiple retrieval methods
- DeepResearchAgent: Deep research agent, supports multi-step reasoning for complex problems
- FusionGraphRAGAgent: Fusion agent, combines the advantages of multiple strategies
- Multi-dimensional evaluation: Includes answer quality, retrieval performance, graph evaluation, and deep research evaluation
- Performance monitoring: Tracks API call duration to optimize system performance
- User feedback mechanism: Collects user feedback on answers to continuously improve the system
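As a rough illustration of call-duration tracking, the sketch below times a block of code with a context manager. The name `track_duration` and the `metrics` dict are assumptions; the printed line only mimics the "Performance metric" format that appears in the sample test output in this README.

```python
import time
from contextlib import contextmanager


@contextmanager
def track_duration(name, metrics):
    """Record how long the wrapped block takes, in seconds, under metrics[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        metrics[name] = time.perf_counter() - start
        print(f"Performance metric - {name}: {metrics[name]:.4f}s")


metrics = {}
with track_duration("fast_cache_check", metrics):
    time.sleep(0.01)  # stand-in for the real call being timed
```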
- Streaming response: Supports real-time streaming display of AI-generated content
- Interactive knowledge graph: Provides a Neo4j-style graph interaction interface
- Debug mode: Developers can view execution traces and search processes
- RESTful API: Complete backend API design, supports extensible development
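The streaming test reports total time, first block delay, number of data blocks, and total characters. A minimal sketch of how such figures could be gathered from any iterable of text chunks follows; `consume_stream` and its `max_wait` parameter are hypothetical names, not the test script's actual code.

```python
import time


def consume_stream(chunks, max_wait=300.0):
    """Drain an iterable of text chunks and collect simple streaming statistics."""
    start = time.perf_counter()
    first_block_delay = None
    parts = []
    for chunk in chunks:
        now = time.perf_counter()
        if first_block_delay is None:
            first_block_delay = now - start  # time until the first chunk arrived
        parts.append(chunk)
        if now - start > max_wait:
            break  # stop receiving once the wait budget is exceeded
    text = "".join(parts)
    return {
        "text": text,
        "total_time": time.perf_counter() - start,
        "first_block_delay": first_block_delay,
        "blocks": len(parts),
        "characters": len(text),
    }


stats = consume_stream(iter(["Retrieval plan ", "formulated"]))
print(stats["blocks"], stats["characters"])
```

In this sketch the timeout is only checked when a chunk arrives, so a slow producer can run past `max_wait` before the loop exits.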
cd test/
python search_with_stream.py
# This example tests the output of FusionGraphRAGAgent. You can test other agents by uncommenting them in the test script.
Test started: 2025-04-05 21:55:04
===== Starting streaming Agent test =====
Enhanced deep research tool loaded
===== Test query: What are the application requirements for excellent students? =====
[Test] FusionGraphRAGAgent - Streaming - Query: 'What are the application requirements for excellent students?'
Start receiving streaming output (max wait 300 seconds)...
Performance metric - fast_cache_check: 1.0043s
DEBUG - LLM keyword result: {
"low_level": ["student", "institutions"],
"high_level": ["comparison", "criteria", "educat...
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.570 seconds.
Prefix dict has been built successfully.
Query graph construction completed, containing 5 entities and 0 relationships, time: 0.00s
DEBUG - LLM keyword result: {
"low_level": ["Institution A", "excellent student"],
"high_level": ["criteria", "define", ...
[Dual-path search] LLM evaluation: Both results are valuable, merging results
DEBUG - LLM keyword result: {
"low_level": ["Institution B"],
"high_level": ["criteria", "excellent student", "definitio...
[Dual-path search] LLM evaluation: Precise query results are more specific and valuable
DEBUG - LLM keyword result: {
"low_level": ["student", "institutions"],
"high_level": ["comparison", "criteria", "excell...
[Validation] Answer passed keyword relevance check
DEBUG - LLM keyword result: {
"low_level": ["student admission", "top universities"],
"high_level": ["criteria", "excell...
Query graph construction completed, containing 5 entities and 0 relationships, time: 0.00s
DEBUG - LLM keyword result: {
"low_level": ["excellent student", "top universities"],
"high_level": ["academic qualifica...
[Dual-path search] LLM evaluation: Precise query results are more specific and valuable
DEBUG - LLM keyword result: {
"low_level": ["student admission", "top universities"],
"high_level": ["extracurricular ac...
[Dual-path search] LLM evaluation: Precise query results are more specific and valuable
DEBUG - LLM keyword result: {
"low_level": ["student", "universities"],
"high_level": ["admission criteria", "excellence...
[Validation] Answer does not contain any high-level keywords: ['admission criteria', 'excellence', 'higher education']
Reached max wait time 300 seconds, ending reception early
[Complete] Streaming query finished
- Total time: 414.15s
- First block delay: 1.00s
- Number of data blocks: 14
- Total characters: 766
Result:
**Analyzing the question and formulating a retrieval plan**...
**Retrieval plan formulated**
- Complexity assessment: 0.60
- Global view required: Yes
- Relationship path tracking required: No
- Time-related content: No
- Knowledge domains involved: Education, Admission Policies, Student Assessment
**Executing task 1/5**: exploration - Comparison of excellent student criteria across different institutions
**Starting deep exploration**...
✓ Deep exploration completed
**Executing task 2/5**: local_search - Specific academic achievements or qualifications required for recognition as an excellent student
✓ Local search completed
**Executing task 3/5**: global_search - General application criteria for excellent students in various educational institutions
✓ Global search completed
**Executing task 4/5**: local_search - Policies governing how excellent students are defined and assessed
✓ Local search completed
**Executing task 5/5**: exploration - Detailed criteria for excellent student admission in top universities
**Starting deep exploration**...
✓ Deep exploration completed
===== Test query: How much is the academic scholarship? =====
[Test] FusionGraphRAGAgent - Streaming - Query: 'How much is the academic scholarship?'
Start receiving streaming output (max wait 300 seconds)...
Performance metric - fast_cache_check: 0.9272s
DEBUG - LLM keyword result: {
"low_level": ["institutions", "scholarship offerings"],
"high_level": ["education", "schol...
Query graph construction completed, containing 5 entities and 0 relationships, time: 0.00s
DEBUG - LLM keyword result: {
"low_level": ["institutions", "scholarships"],
"high_level": ["notable", "offer", "educati...
[Dual-path search] LLM evaluation: Query results with knowledge base names are more specific and valuable
DEBUG - LLM keyword result: {
"low_level": ["institutions", "scholarships"],
"high_level": ["types"]
}
[Dual-path search] LLM evaluation: Query results with knowledge base names are more specific and valuable
DEBUG - LLM keyword result: {
"low_level": ["institutions", "scholarship offerings"],
"high_level": ["education", "finan...
[Validation] Answer passed keyword relevance check
[Complete] Streaming query finished
- Total time: 226.51s
- First block delay: 0.93s
- Number of data blocks: 18
- Total characters: 1230
Result:
**Analyzing the question and formulating a retrieval plan**...
**Retrieval plan formulated**
- Complexity assessment: 0.50
- Global view required: Yes
- Relationship path tracking required: No
- Time-related content: No
- Knowledge domains involved: Education, Finance, Scholarship Programs
**Executing task 1/6**: exploration - Explore different institutions and their scholarship offerings
**Starting deep exploration**...
✓ Deep exploration completed
**Executing task 2/6**: local_search - Average amount of funds awarded by academic scholarships
✓ Local search completed
**Executing task 3/6**: global_search - Statistics on academic scholarship funding trends
✓ Global search completed
**Executing task 4/6**: global_search - Overview of academic scholarships
✓ Global search completed
**Executing task 5/6**: global_search - Types and amounts of financial aid available for students
✓ Global search completed
**Executing task 6/6**: local_search - Financial aid offices or resources for further information
✓ Local search completed
**Integrating all retrieval results to generate the final answer**...
**Integrating all retrieval results to generate the final answer**...
According to the retrieved results, at East China University of Science and Technology, the amount of academic scholarships is divided into different levels, and the amount and proportion for each level are as follows:
1. **Special Scholarship**: 5000 RMB/person/year, 2% of recipients.
2. **First-class Scholarship**: 3000 RMB/person/year, 3% of recipients.
3. **Second-class Scholarship**: 2000 RMB/person/year, 10% of recipients.
4. **Third-class Scholarship**: 1000 RMB/person/year, 25% of recipients.
Based on these different levels and proportions, the weighted average scholarship amount per winning student is about 640 RMB.
These scholarships are awarded based on students' comprehensive grades and moral scores. The school uses this method to encourage and support outstanding students. For each applicant, the school has strict selection criteria and procedures to ensure that scholarships are awarded to eligible students.
In addition to academic scholarships, East China University of Science and Technology also provides other types of financial aid, such as national student loans and inspirational scholarships, to help financially disadvantaged students complete their studies.
If you have other questions about scholarship types or application procedures, please refer to the official guidelines of the relevant school department or further consult the school's student financial aid management center.
===== Test query: What are the standards for the College English Test? =====
[Test] FusionGraphRAGAgent - Streaming - Query: 'What are the standards for the College English Test?'
Start receiving streaming output (max wait 300 seconds)...
Keyword extraction failed: Expecting value: line 1 column 1 (char 0)
Performance metric - fast_cache_check: 1.0581s
DEBUG - LLM keyword result: {
"low_level": ["various countries", "university", "English test"],
"high_level": ["standards", "comparison"]
}
Query graph construction completed, containing 5 entities and 0 relationships, time: 0.00s
DEBUG - LLM keyword result: {
"low_level": ["university", "English test", "country"],
"high_level": ["education standards", "comparative analysis", "language proficiency assessment"]
}
[Dual-path search] LLM evaluation: Precise query results are more specific and valuable
DEBUG - LLM keyword result: {
"low_level": ["university", "English test", "various countries"],
"high_level": ["difficulty", "form", "standards"]
}
[Dual-path search] LLM evaluation: Query results with knowledge base names are more specific and valuable
DEBUG - LLM keyword result: {
"low_level": [],
"high_level": ["various countries", "university", "English test", "standards", "comparison"]
}
[Validation] Answer passed keyword relevance check
[Complete] Streaming query finished
- Total time: 224.52s
- First block delay: 1.06s
- Number of data blocks: 16
- Total characters: 1126
Result:
**Analyzing the question and formulating a retrieval plan**...
**Retrieval plan formulated**
- Complexity assessment: 0.70
- Global view required: Yes
- Relationship path tracking required: No
- Time-related content: Yes
- Knowledge domains involved: Education system, language learning, exam assessment
**Executing task 1/5**: global_search - The impact of College English Test on students
✓ Global search completed
**Executing task 2/5**: local_search - Changes in College English Test standards over time
✓ Local search completed
**Executing task 3/5**: exploration - Comparison of College English Test standards in different countries
**Starting deep exploration**...
✓ Deep exploration completed
**Executing task 4/5**: global_search - General overview of College English Test
✓ Global search completed
**Executing task 5/5**: local_search - Detailed introduction to College English Test standards
✓ Local search completed
**Integrating all retrieval results to generate the final answer**...
**Integrating all retrieval results to generate the final answer**...
### College English Test Standards
According to local and global search results, we can understand several main aspects of the College English Test standards, including test purpose, exemption conditions, course setup, and assessment methods.
#### 1. Background and Goals
The goal of college English courses is to cultivate students' English application ability, academic or professional English communication ability, and cross-cultural communication ability to meet the learning needs of students at different levels and improve teaching quality. [Result 1]
#### 2. Test Exemption Standards
College English exemption conditions are related to national unified test scores:
- **National College English Test Band 4**: A score of 480 or above can exempt the "College English" degree exam. [Result 2]
- **National College English Test Band 6**: A score of 425 or above can also exempt the "College English" degree exam. [Result 2]
#### 3. Course Setup and Assessment
College English courses consist of three stages, and all students participating in the courses must pass the final assessment of each stage and earn credits. [Result 1, Result 2]
#### 4. Degree Exam Arrangement
The degree exam is usually arranged in the last academic year before graduation. Students who fail can take a make-up exam. [Result 2]
#### 5. International Standards Comparison
Different countries use different English tests. For example, the UK usually uses IELTS, and the US uses TOEFL. Although the forms of the tests differ, they all include listening, speaking, reading, and writing. The purpose is to ensure that students can study smoothly in an English environment. [Exploration Result 1]
### Summary
The core standard of the College English Test is to assess students' English proficiency to adapt to academic or professional environments. There are differences in standards across countries, reflected in test methods and scoring. The system uses grading and make-up exams to ensure students' English ability.
===== Test query: Student Xiao Ming skipped 30 class hours, secretly kept a hair dryer, and beat up a classmate. Can he still be selected for the National Scholarship? =====
[Test] FusionGraphRAGAgent - Streaming - Query: 'Student Xiao Ming skipped 30 class hours, secretly kept a hair dryer, and beat up a classmate. Can he still be selected for the National Scholarship?'
Start receiving streaming output (max wait 300 seconds)...
Performance metric - fast_cache_check: 1.1123s
Received 20 blocks, total 824 characters, time: 101.35s
DEBUG - LLM keyword result: {
"low_level": ["truancy", "possession of prohibited items", "violent behavior"],
"high_level": ["comprehensive handling", "campus discipline management", "behavioral norms"]
}
Query graph construction completed, containing 5 entities and 0 relationships, time: 0.00s
DEBUG - LLM keyword result: {
"low_level": ["truancy", "school"],
"high_level": ["disciplinary measures", "behavior"]
}
[Dual-path search] LLM evaluation: Precise query results are more specific and valuable
DEBUG - LLM keyword result: {
"low_level": ["possession of prohibited items", "school"],
"high_level": ["disciplinary measures", "behavior", "rules and regulations"]
}
[Dual-path search] LLM evaluation: Precise query results are more specific and valuable
DEBUG - LLM keyword result:
{
"low_level": [],
"high_level": ["truancy", "possession of prohibited items", "violent behavior", "comprehensive handling", "school"]
}
[Validation] Answer passed keyword relevance check
[Complete] Streaming query finished
- Total time: 291.64s
- First block delay: 1.11s
- Number of data blocks: 32
- Total characters: 2011
Result:
**Analyzing the question and formulating a retrieval plan**...
**Retrieval plan formulated**
- Complexity assessment: 0.80
- Global view required: Yes
- Relationship path tracking required: Yes
- Time-related content: No
- Knowledge domains involved: Education policy, scholarship selection criteria, student discipline regulations
The core of this question is the selection criteria for the National Scholarship and whether Xiao Ming's behavior meets these criteria. To answer this question, we need to understand the following information:
1. **National Scholarship selection criteria and standards**: Generally, the selection criteria for the National Scholarship include academic performance, moral character, social performance, and comprehensive quality. The specific standards vary by country and institution, but usually include requirements for conduct and discipline.
2. **Impact of Xiao Ming's behavior**: Truancy, possession of prohibited items, and violence are all negative behaviors in school discipline and moral evaluation, which will affect Xiao Ming's eligibility. However, the specific impact and whether it absolutely disqualifies him depends on the school's regulations and circumstances.
...
**Executing task 1/7**: local_search - School's truancy disciplinary provisions
✓ Local search completed
**Executing task 2/7**: local_search - School's disciplinary provisions for possession of dangerous items
✓ Local search completed
**Executing task 3/7**: local_search - School's disciplinary provisions for teachers and students
✓ Local search completed
**Executing task 4/7**: local_search - School's disciplinary provisions for student violence
✓ Local search completed
**Executing task 5/7**: global_search - National Scholarship selection criteria
✓ Global search completed
**Executing task 6/7**: chain_exploration - Specific requirements for disciplinary violations in selection criteria
**Starting Chain of Exploration**...
- Extracting entities from existing results: National Scholarship, school discipline regulations
- Exploration path:
• Step 0: National Scholarship
• Step 0: School discipline regulations
• Step 1: Scholarship selection principles
• Step 1: Selection committee
• Step 1: Student academic records
- Found 21 related items
✓ Chain of Exploration completed
**Executing task 7/7**: exploration - Comprehensive handling of truancy, possession of prohibited items, and violent behavior by the school
**Starting deep exploration**...
✓ Deep exploration completed
**Final thinking based on all search results**...
Based on the provided validation results summary, we can see that none of the hypotheses are supported, rejected, or marked as uncertain. In this case, we lack specific data to support or oppose any hypothesis. My updated thinking process is:
1. **Lack of data support:** In the current validation results, no hypothesis is clearly supported, rejected, or marked as uncertain, which means we lack sufficient data or evidence for further analysis and conclusions.
2. **Continue data collection:** Since there are no definitive results, it is recommended to strengthen data collection and analysis to provide stronger evidence for each hypothesis. This includes designing new experiments, finding additional data sources, increasing sample size, etc.
...
**Integrating all retrieval results to generate the final answer**...
**Integrating all retrieval results to generate the final answer**...
According to the search results, whether Xiao Ming can be selected for the National Scholarship depends on the impact of his behavior on the selection criteria. The analysis is as follows:
### 1. National Scholarship Selection Criteria
The National Scholarship selection standards emphasize not only excellent academic performance (e.g., top 10%) but also require applicants to have high moral character and outstanding performance, such as participating in social services or abiding by the constitution and laws. These requirements show that moral character and discipline are important considerations.
### 2. Behavior Analysis and Impact
#### 1. **Truancy**
According to school disciplinary regulations, truancy is a serious violation. Absence from class without leave for more than a certain number of hours can result in warnings, demerits, or more severe penalties. Therefore, truancy for 30 class hours may have a significant negative impact on scholarship selection.
#### 2. **Possession of dangerous items**
Schools have clear disciplinary measures for possession of dangerous items, such as unauthorized use of electrical appliances, which may result in warnings or demerits. This behavior not only affects personal safety but is also considered a violation of school rules.
#### 3. **Violent behavior**
Violent behavior such as beating classmates is usually dealt with seriously and may result in demerits or probation. In scholarship selection, recorded violent behavior will significantly affect the evaluation of a student's moral character.
### 3. Comprehensive Analysis and Conclusion
Since the National Scholarship selection requires applicants to have no serious disciplinary violations, Xiao Ming's multiple violations, including truancy, possession of dangerous items, and violent behavior, will seriously affect his moral character evaluation. Therefore, according to the usual selection standards, Xiao Ming does not meet the "no disciplinary record" condition, which will directly disqualify him from the National Scholarship.
Based on the above analysis, Xiao Ming cannot be selected for the National Scholarship due to multiple disciplinary violations. It is recommended that Xiao Ming reflect on the consequences of his actions and comply with school regulations in the future to improve his performance.
### Information Sources
- School academic management regulations and disciplinary provisions.
- National Scholarship selection standards, including moral character assessment.
===== Test Summary =====
Tests passed: 4/4
Average total time: 289.21s
Average first block delay: 1.03s
Average number of data blocks: 20.0
Test completed: 2025-04-05 22:14:29
As the output shows, because of embedding similarity, the LLM may sometimes conflate "excellent student" (an honorary title) with "National Scholarship" (a title is not a scholarship). This issue needs to be addressed by fine-tuning the embedding in the future.
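The averages in the test summary can be reproduced from the four per-query figures in the log above. The sketch below uses `Decimal` to avoid floating-point rounding surprises and assumes half-up rounding, which may differ from how the test script formats its numbers.

```python
from decimal import Decimal, ROUND_HALF_UP

# (total time in s, first block delay in s, data blocks), copied from the log.
runs = [
    ("414.15", "1.00", 14),
    ("226.51", "0.93", 18),
    ("224.52", "1.06", 16),
    ("291.64", "1.11", 32),
]


def average(values, places):
    """Exact decimal mean of the values, rounded half-up to the given places."""
    total = sum(Decimal(str(value)) for value in values)
    return (total / len(values)).quantize(Decimal(places), rounding=ROUND_HALF_UP)


avg_total = average([run[0] for run in runs], "0.01")
avg_delay = average([run[1] for run in runs], "0.01")
avg_blocks = average([run[2] for run in runs], "0.1")

print(f"Average total time: {avg_total}s")          # 289.21s
print(f"Average first block delay: {avg_delay}s")   # 1.03s
print(f"Average number of data blocks: {avg_blocks}")  # 20.0
```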
- Automated Data Acquisition:
  - Add scheduled crawler functionality to replace the current manual document update method
  - Implement resource auto-discovery and incremental crawling
- Graph Construction Optimization:
  - Use GRPO to train small models to support graph extraction
  - Reduce the cost and latency of DeepResearch for graph extraction/Chain of Exploration
- Domain-specific Embeddings:
  - Solve the problem of distinguishing semantically similar but conceptually different terms
  - Optimize embedding distinction for terms like "excellent student" vs "National Scholarship", "manslaughter" vs "intentional homicide", etc.
- Agent Performance Optimization:
  - Improve Agent framework response speed
  - Optimize multi-agent collaboration mechanism
- Project Engineering Optimization:
  - Optimize project structure, as the current structure is too redundant and scattered
  - Cache optimization, as the current cache only hits exactly the same queries
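One possible direction for the cache item, shown purely as a sketch and not as the planned design: normalize the query before hashing so that trivially different spellings of the same question share a cache key. The function `normalized_cache_key` is hypothetical and is not part of the CacheManage module.

```python
import hashlib
import re
import unicodedata


def normalized_cache_key(query):
    """Build a cache key that ignores case, punctuation, and extra whitespace."""
    text = unicodedata.normalize("NFKC", query).casefold()
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with spaces
    text = " ".join(text.split())         # collapse runs of whitespace
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


key_a = normalized_cache_key("How much is the academic scholarship?")
key_b = normalized_cache_key("  how much is the Academic Scholarship ")
print(key_a == key_b)  # True
```

This only catches surface-level variation. Paraphrased questions would still miss and would need a semantic (embedding-based) lookup instead.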
- GraphRAG – Microsoft's open-source knowledge graph enhanced RAG framework
- llm-graph-builder – Neo4j official LLM graph building tool
- LightRAG – Lightweight knowledge-enhanced generation solution
- deep-searcher – Zilliz team's open-source private semantic search framework
- ragflow – Enterprise-level RAG system