generated from embabel/java-agent-template
Open
Description
Summary
Add a built-in ingestion pipeline that runs on startup, configurable per-user profiles, and resilient per-document error handling with structured failure reporting.
Changes
Built-in ingestion runner
- `IngestionRunner` implements `ApplicationRunner` and performs RAG content ingestion on startup when `guide.reload-content-on-startup=true`
- `IngestionResult` record captures loaded/failed URLs, directories, and individual documents with elapsed time
- `IngestionFailure` record pairs each failure with a human-readable reason (extracted from the exception message)
- On completion, a structured INGESTION COMPLETE banner is printed to stdout showing what loaded, what failed and why, RAG store stats, and the MCP endpoint
- `DataManagerController.loadReferences()` now returns the structured `IngestionResult`
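As a rough illustration of the result and failure records described above, here is a minimal sketch in plain Java. The field names, the `fromException` helper, and the banner layout are assumptions made for this example; the PR names the records but does not show their definitions. The sketch also tracks URLs only, whereas the real `IngestionResult` is described as covering directories and individual documents too.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical shape: pairs a failed source with a readable reason.
record IngestionFailure(String source, String reason) {
    // Uses the exception message as the reason, falling back to the
    // exception class name when no message is available.
    static IngestionFailure fromException(String source, Exception e) {
        String message = e.getMessage();
        String reason = (message == null || message.isBlank())
                ? e.getClass().getSimpleName()
                : message;
        return new IngestionFailure(source, reason);
    }
}

// Hypothetical shape: simplified to URLs only for this sketch.
record IngestionResult(List<String> loadedUrls,
                       List<IngestionFailure> failedUrls,
                       Duration elapsed) {

    boolean hasFailures() {
        return !failedUrls.isEmpty();
    }

    // Builds a plain-text summary; the layout here is illustrative only.
    String banner() {
        StringBuilder sb = new StringBuilder("=== INGESTION COMPLETE ===\n");
        sb.append("Loaded URLs: ").append(loadedUrls.size()).append('\n');
        sb.append("Failed URLs: ").append(failedUrls.size()).append('\n');
        for (IngestionFailure f : failedUrls) {
            sb.append("  - ").append(f.source())
              .append(": ").append(f.reason()).append('\n');
        }
        sb.append("Elapsed: ").append(elapsed.toMillis()).append(" ms\n");
        return sb.toString();
    }
}

class IngestionBannerDemo {
    public static void main(String[] args) {
        IngestionResult result = new IngestionResult(
                List.of("https://example.com/docs"),
                List.of(IngestionFailure.fromException(
                        "https://example.com/slow",
                        new RuntimeException("read timed out"))),
                Duration.ofMillis(1500));
        System.out.print(result.banner());
    }
}
```

Keeping the reason as a plain string on the failure record means the banner and the controller response can both render it without needing access to the original exception.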
Use existing rag-core storage interfaces
- `DataManager` now depends on `ChunkingContentElementRepository` (from `embabel-agent-rag-core`) instead of a custom `RagStore` wrapper, so any backend implementing that library interface (e.g. `DrivineStore` for Neo4j) can be plugged in without changes
- Uses `ContentElementRepositoryInfo` (from `embabel-agent-rag-core`) for store metrics instead of a custom `RagStats` record
- Removed `RagStore`, `DrivineRagStoreAdapter`, and `RagStats`; these duplicated abstractions already provided by the rag and rag-neo modules
- `DataManager.ingestPage()` now calls `ContentRefreshPolicy.ingestUriIfNeeded()` directly rather than going through a custom interface method
Resilient error handling
- URL ingestion: each URL is independently try-caught so one timeout or parse failure doesn't block the rest
- Directory ingestion: each document within a directory is independently try-caught so one bad file doesn't skip remaining documents in that directory
- All failures are collected with source identity and reason, then displayed in the summary banner
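The per-item isolation described above can be sketched as a loop that wraps each source in its own try/catch and records the outcome. This is only an illustration of the pattern: `CollectedFailure`, `BatchOutcome`, `SourceIngester`, and `ResilientIngestion` are hypothetical names invented for this example, and the actual `DataManager` code is not shown in the PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the PR's IngestionFailure record.
record CollectedFailure(String source, String reason) {}

// Hypothetical holder for what succeeded and what failed in one batch.
record BatchOutcome(List<String> loaded, List<CollectedFailure> failed) {}

// Stand-in for whatever fetches and stores a single URL or document.
@FunctionalInterface
interface SourceIngester {
    void ingest(String source) throws Exception;
}

final class ResilientIngestion {
    private ResilientIngestion() {}

    // Ingests every source independently: an exception from one source is
    // recorded as a failure and the loop continues with the next source.
    static BatchOutcome ingestAll(List<String> sources, SourceIngester ingester) {
        List<String> loaded = new ArrayList<>();
        List<CollectedFailure> failed = new ArrayList<>();
        for (String source : sources) {
            try {
                ingester.ingest(source);
                loaded.add(source);
            } catch (Exception e) {
                String reason = e.getMessage() != null
                        ? e.getMessage()
                        : e.getClass().getSimpleName();
                failed.add(new CollectedFailure(source, reason));
            }
        }
        return new BatchOutcome(List.copyOf(loaded), List.copyOf(failed));
    }
}
```

The same loop shape would apply at both levels mentioned above, once over URLs and once over the documents inside a directory, so a bad file never short-circuits its siblings.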
Configurable user profiles
- `GUIDE_PROFILE` environment variable (default: `user`) controls which `application-{profile}.yml` is loaded
- `user-config/` directory holds personal profile overrides (gitignored); `application-user.yml.example` provided as a template
- `application-user.yml` checked into `src/main/resources/` as a sensible default
- `.env.example` documents required environment variables including `GUIDE_PROFILE`
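The naming rule above (environment variable, default value, resulting file name) can be illustrated with a small helper. This is a sketch only: `ProfileResolver` is a hypothetical class, and the real project presumably lets Spring Boot resolve the profile through its own configuration rather than hand-written code like this.

```java
import java.util.Map;

// Hypothetical helper illustrating how GUIDE_PROFILE maps to a config file name.
final class ProfileResolver {
    static final String DEFAULT_PROFILE = "user";

    private ProfileResolver() {}

    // Returns the profile from the given environment, or the default
    // when GUIDE_PROFILE is unset or blank.
    static String resolveProfile(Map<String, String> env) {
        String value = env.get("GUIDE_PROFILE");
        return (value == null || value.isBlank()) ? DEFAULT_PROFILE : value.trim();
    }

    // Builds the application-{profile}.yml name for the resolved profile.
    static String configFileName(Map<String, String> env) {
        return "application-" + resolveProfile(env) + ".yml";
    }
}
```

Calling `ProfileResolver.configFileName(System.getenv())` would apply the rule to the real process environment; passing a plain map keeps the logic easy to test.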
Helper script: scripts/fresh-ingest.sh
- Convenience script to wipe Neo4j RAG data and re-ingest from scratch
- Starts Neo4j via Docker Compose, clears all `ContentElement` nodes, then launches the app with `reload-content-on-startup=true`
- Automatically kills any existing process on the app port before starting
- Passes `--spring.config.additional-location=file:./user-config/` so Spring Boot picks up personal profile files
Documentation
- `scripts/README.md`: usage instructions for the helper scripts
- `scripts/INGESTION-TESTING.md`: step-by-step testing guide for the ingestion pipeline
Tests
- Unit tests for `IngestionFailure`, `IngestionResult`, `IngestionRunner`, and `DataManagerController`
- Fixed flaky JWT token comparison in `HubApiControllerTest`