fix(bible): 修复角色关系指数增长导致数据库爆炸的问题#160
Conversation
bible_service.add_character() 原先每次调用都会全量加载 Bible、删除所有 子表数据后重新写入,多次调用会导致关系数据按 2^n 倍增。实测 bible_character_relationships 表膨胀至 406 万行(1.4GB),其中每条 关系被复制 131,072 次。 修复方案: - 新增 SqliteBibleRepository.add_character_incremental() 增量写入方法 - BibleService.add_character() 改为增量 INSERT,不再 DELETE + 全量重写 - 更新单元测试适配新逻辑 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
📝 WalkthroughWalkthroughBibleService.add_character now performs incremental character persistence. It validates for duplicate characters, uses a new SqliteBibleRepository.add_character_incremental method when available, falls back to the previous full-save behavior, and reloads the Bible state before returning the updated DTO. ChangesIncremental Character Persistence
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
infrastructure/persistence/database/sqlite_bible_repository.py (1)
30-35:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRoot cause remains:
_clear_childrennever clearsbible_character_relationships, so everysave()call still accumulates orphaned relationship rows.This PR correctly fixes
add_characterby bypassingsave(). However,_clear_childrenstill omits thebible_character_relationshipstable. Every other service method (add_world_setting,add_location,add_timeline_note,add_style_note,update_bible) still callssave(), which calls_clear_children, which deletesbible_charactersrows but leaves their relationship rows behind as orphans — then new relationship rows are inserted with fresh UUIDs. Over repeated calls the relationships table will continue to grow unboundedly for those paths.The minimal complementary fix:
🛡️ Proposed fix for `_clear_children`
def _clear_children(self, conn, novel_id: str) -> None: + conn.execute( + """ + DELETE FROM bible_character_relationships + WHERE character_id IN ( + SELECT id FROM bible_characters WHERE novel_id = ? + ) + """, + (novel_id,), + ) conn.execute("DELETE FROM bible_style_notes WHERE novel_id = ?", (novel_id,)) conn.execute("DELETE FROM bible_timeline_notes WHERE novel_id = ?", (novel_id,)) conn.execute("DELETE FROM bible_locations WHERE novel_id = ?", (novel_id,)) conn.execute("DELETE FROM bible_world_settings WHERE novel_id = ?", (novel_id,)) conn.execute("DELETE FROM bible_characters WHERE novel_id = ?", (novel_id,))The subquery must run before
bible_charactersis deleted, otherwise it returns no rows and the relationships are left behind.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@infrastructure/persistence/database/sqlite_bible_repository.py` around lines 30 - 35, _clear_children currently deletes bible_characters before removing related rows, leaving orphaned rows in bible_character_relationships; update the _clear_children(conn, novel_id) logic to first delete from bible_character_relationships using a subquery that selects character ids for the given novel_id (e.g. DELETE FROM bible_character_relationships WHERE character_id IN (SELECT id FROM bible_characters WHERE novel_id = ?)), then proceed to delete bible_characters and the other child tables (bible_style_notes, bible_timeline_notes, bible_locations, bible_world_settings) to ensure relationships are removed before their parent character rows.application/world/services/bible_service.py (1)
123-148:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftTOCTOU: duplicate-character guard is not atomic with the database write.
Two concurrent requests for the same
character_idcan both pass theget_charactercheck on line 128 (both see the character absent) and then both calladd_character_incremental. The second call'sINSERT OR REPLACEsilently replaces the character row while both calls append relationship rows, re-introducing duplication.The most robust fix is a DB-level unique constraint on
(novel_id, id)inbible_characters(it likely already is the PK) and catchingIntegrityErrorinadd_character_incrementaland re-raising it asInvalidOperationError, making the DB the authoritative guard rather than the in-memory check.# in add_character_incremental — after the try: import sqlite3 try: conn.execute("INSERT INTO bible_characters ...") # use INSERT, not INSERT OR REPLACE ... conn.commit() except sqlite3.IntegrityError: conn.rollback() raise InvalidOperationError(f"Character '{character_id}' already exists") except Exception: conn.rollback() raise🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@application/world/services/bible_service.py` around lines 123 - 148, The current TOCTOU check in bible_service (bible.get_character(CharacterId(...))) is not atomic with the subsequent repo.add_character_incremental call; update the repository method add_character_incremental (not the service) to make the DB authoritative by using a plain INSERT (not INSERT OR REPLACE) into bible_characters, ensure the (novel_id, id) uniqueness is enforced at the DB level, and catch the DB IntegrityError (e.g., sqlite3.IntegrityError) inside add_character_incremental to rollback and re-raise as InvalidOperationError (preserving the message like "Character '<id>' already exists"); you can keep the in-memory get_character check for early feedback but the real duplicate protection must be implemented and translated to InvalidOperationError inside add_character_incremental.
🧹 Nitpick comments (2)
application/world/services/bible_service.py (1)
133-148: ⚡ Quick win
hasattrduck-typing leaks infrastructure concerns into the application layer.Using
hasattr(repo, "add_character_incremental")mirrors the same pattern already present forupdate_character_anchors, but it means the application service silently degrades to the buggysave()fallback path for any repository that hasn't implemented the method — with no warning or error. TheBibleRepositoryabstract base should declareadd_character_incremental(raisingNotImplementedErrorby default) so the fallback cannot happen silently.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@application/world/services/bible_service.py` around lines 133 - 148, The service currently duck-types with hasattr(repo, "add_character_incremental") which allows silent fallback to the buggy save() path; instead ensure the abstract repository interface defines add_character_incremental (e.g., add an abstract method add_character_incremental in BibleRepository that raises NotImplementedError) and remove the hasattr check in bible_service.py so the service calls repo.add_character_incremental(novel_id, character_id, name, description, rel_dicts) directly (or explicitly raise a clear error if the repository does not implement the method), keeping the rel conversion logic around the same (the rel_dicts building and use of update_character_anchors unchanged).tests/unit/application/services/test_bible_service.py (1)
41-64: ⚡ Quick winNew
InvalidOperationErrorduplicate-character guard has no test coverage.Lines 128–131 in
bible_service.pynow raiseInvalidOperationErrorwhen the character already exists, but there is no test exercising that branch. The analogoustest_add_character_bible_not_foundtest exists; the duplicate case should be added symmetrically.💚 Suggested test
def test_add_character_already_exists(self, service, mock_repository): """测试重复添加同一角色应抛出 InvalidOperationError""" from domain.shared.exceptions import InvalidOperationError existing_char = Character( id=CharacterId("char-1"), name="主角", description="主角描述" ) bible = Bible(id="bible-1", novel_id=NovelId("novel-1")) bible.add_character(existing_char) mock_repository.get_by_novel_id.return_value = bible with pytest.raises(InvalidOperationError): service.add_character( novel_id="novel-1", character_id="char-1", name="主角", description="主角描述", ) mock_repository.add_character_incremental.assert_not_called()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/application/services/test_bible_service.py` around lines 41 - 64, Add a unit test that covers the new duplicate-character guard in bible_service.py: create a test named test_add_character_already_exists that sets up a Bible containing a Character (use CharacterId/Character and bible.add_character), mock mock_repository.get_by_novel_id to return that Bible, then call service.add_character with the same character_id and assert it raises domain.shared.exceptions.InvalidOperationError and that mock_repository.add_character_incremental.assert_not_called(); this mirrors test_add_character_bible_not_found and ensures the InvalidOperationError branch (the check in bible_service.py that raises when the character already exists) is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@application/world/services/bible_service.py`:
- Around line 160-162: The code calls BibleDTO.from_domain(bible) without
checking that bible (from
self.bible_repository.get_by_novel_id(NovelId(novel_id))) may be None; add a
null-check after the get_by_novel_id call and handle the missing Bible
explicitly — e.g., if bible is None raise a clear NotFound/DomainError (or
return an appropriate error/None response) instead of calling
BibleDTO.from_domain, updating the method containing add_character_incremental
(or the function where this reload occurs) to reference get_by_novel_id and
BibleDTO.from_domain so the None case is handled.
In `@infrastructure/persistence/database/sqlite_bible_repository.py`:
- Around line 345-357: add_character_incremental is not idempotent because
existing rows in bible_character_relationships are not removed before
re-inserting relationships; add a targeted DELETE to remove all relationship
rows for the given character_id before the loop that INSERTs new rows. Locate
the function add_character_incremental and, just before the for i, rel in
enumerate(relationships) loop, execute a parameterized DELETE like "DELETE FROM
bible_character_relationships WHERE character_id = ?" using the same
conn/transaction so old relationships are purged atomically prior to inserting
the new ones. Ensure you reference character_id when deleting and keep the
subsequent INSERT logic unchanged.
---
Outside diff comments:
In `@application/world/services/bible_service.py`:
- Around line 123-148: The current TOCTOU check in bible_service
(bible.get_character(CharacterId(...))) is not atomic with the subsequent
repo.add_character_incremental call; update the repository method
add_character_incremental (not the service) to make the DB authoritative by
using a plain INSERT (not INSERT OR REPLACE) into bible_characters, ensure the
(novel_id, id) uniqueness is enforced at the DB level, and catch the DB
IntegrityError (e.g., sqlite3.IntegrityError) inside add_character_incremental
to rollback and re-raise as InvalidOperationError (preserving the message like
"Character '<id>' already exists"); you can keep the in-memory get_character
check for early feedback but the real duplicate protection must be implemented
and translated to InvalidOperationError inside add_character_incremental.
In `@infrastructure/persistence/database/sqlite_bible_repository.py`:
- Around line 30-35: _clear_children currently deletes bible_characters before
removing related rows, leaving orphaned rows in bible_character_relationships;
update the _clear_children(conn, novel_id) logic to first delete from
bible_character_relationships using a subquery that selects character ids for
the given novel_id (e.g. DELETE FROM bible_character_relationships WHERE
character_id IN (SELECT id FROM bible_characters WHERE novel_id = ?)), then
proceed to delete bible_characters and the other child tables
(bible_style_notes, bible_timeline_notes, bible_locations, bible_world_settings)
to ensure relationships are removed before their parent character rows.
---
Nitpick comments:
In `@application/world/services/bible_service.py`:
- Around line 133-148: The service currently duck-types with hasattr(repo,
"add_character_incremental") which allows silent fallback to the buggy save()
path; instead ensure the abstract repository interface defines
add_character_incremental (e.g., add an abstract method
add_character_incremental in BibleRepository that raises NotImplementedError)
and remove the hasattr check in bible_service.py so the service calls
repo.add_character_incremental(novel_id, character_id, name, description,
rel_dicts) directly (or explicitly raise a clear error if the repository does
not implement the method), keeping the rel conversion logic around the same (the
rel_dicts building and use of update_character_anchors unchanged).
In `@tests/unit/application/services/test_bible_service.py`:
- Around line 41-64: Add a unit test that covers the new duplicate-character
guard in bible_service.py: create a test named test_add_character_already_exists
that sets up a Bible containing a Character (use CharacterId/Character and
bible.add_character), mock mock_repository.get_by_novel_id to return that Bible,
then call service.add_character with the same character_id and assert it raises
domain.shared.exceptions.InvalidOperationError and that
mock_repository.add_character_incremental.assert_not_called(); this mirrors
test_add_character_bible_not_found and ensures the InvalidOperationError branch
(the check in bible_service.py that raises when the character already exists) is
exercised.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: d852aaf1-c659-41f2-8b6a-1e9153edf56a
📒 Files selected for processing (3)
application/world/services/bible_service.pyinfrastructure/persistence/database/sqlite_bible_repository.pytests/unit/application/services/test_bible_service.py
| # 重新加载最新状态返回 | ||
| bible = self.bible_repository.get_by_novel_id(NovelId(novel_id)) | ||
| return BibleDTO.from_domain(bible) |
There was a problem hiding this comment.
Potential NoneType crash: BibleDTO.from_domain(bible) is called without a null-check after the reload.
get_by_novel_id returns Optional[Bible]. If the Bible is concurrently deleted between add_character_incremental and the reload on line 161, bible is None and BibleDTO.from_domain(None) will raise an AttributeError.
🐛 Proposed fix
- # 重新加载最新状态返回
- bible = self.bible_repository.get_by_novel_id(NovelId(novel_id))
- return BibleDTO.from_domain(bible)
+ # 重新加载最新状态返回
+ bible = self.bible_repository.get_by_novel_id(NovelId(novel_id))
+ if bible is None:
+ raise EntityNotFoundError("Bible", f"for novel {novel_id}")
+ return BibleDTO.from_domain(bible)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@application/world/services/bible_service.py` around lines 160 - 162, The code
calls BibleDTO.from_domain(bible) without checking that bible (from
self.bible_repository.get_by_novel_id(NovelId(novel_id))) may be None; add a
null-check after the get_by_novel_id call and handle the missing Bible
explicitly — e.g., if bible is None raise a clear NotFound/DomainError (or
return an appropriate error/None response) instead of calling
BibleDTO.from_domain, updating the method containing add_character_incremental
(or the function where this reload occurs) to reference get_by_novel_id and
BibleDTO.from_domain so the None case is handled.
| for i, rel in enumerate(relationships): | ||
| rid = f"{character_id}-rel-{i}-{uuid.uuid4().hex[:6]}" | ||
| target_name = rel.get("target", "") or "" | ||
| relation = rel.get("relation", "") or "" | ||
| desc = rel.get("description", "") or "" | ||
| conn.execute( | ||
| """ | ||
| INSERT INTO bible_character_relationships | ||
| (id, character_id, target_name, relation, description) | ||
| VALUES (?, ?, ?, ?, ?) | ||
| """, | ||
| (rid, character_id, target_name, relation, desc), | ||
| ) |
There was a problem hiding this comment.
add_character_incremental is not idempotent — stale relationships are not purged before re-insertion.
INSERT OR REPLACE on bible_characters (line 334) performs a DELETE+INSERT internally. Because SQLite foreign-key enforcement is off by default and the schema most likely has no explicit ON DELETE CASCADE, the existing relationship rows for character_id in bible_character_relationships are not removed when the character row is replaced. The subsequent INSERT loop then appends new rows on top of the survivors, re-introducing a smaller version of the original duplication bug.
The fix is a targeted DELETE at the top of the relationship loop:
🛡️ Proposed fix
+ conn.execute(
+ "DELETE FROM bible_character_relationships WHERE character_id = ?",
+ (character_id,),
+ )
for i, rel in enumerate(relationships):
rid = f"{character_id}-rel-{i}-{uuid.uuid4().hex[:6]}"This makes the method self-consistent regardless of whether the service-level duplicate guard fires first (e.g., TOCTOU, direct repo invocation).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@infrastructure/persistence/database/sqlite_bible_repository.py` around lines
345 - 357, add_character_incremental is not idempotent because existing rows in
bible_character_relationships are not removed before re-inserting relationships;
add a targeted DELETE to remove all relationship rows for the given character_id
before the loop that INSERTs new rows. Locate the function
add_character_incremental and, just before the for i, rel in
enumerate(relationships) loop, execute a parameterized DELETE like "DELETE FROM
bible_character_relationships WHERE character_id = ?" using the same
conn/transaction so old relationships are purged atomically prior to inserting
the new ones. Ensure you reference character_id when deleting and keep the
subsequent INSERT logic unchanged.
修复 bible_character_relationships 表因 add_character() 全量 DELETE + INSERT 模式导致关系数据指数增长(2^n)的 bug。实测关系表膨胀至 406 万行(1.4GB),每条关系被复制
131,072 次。改为增量 INSERT,不再触碰已有数据。
架构影响
测试
后端单测
pytest tests/unit/application/services/test_bible_service.py tests/unit/domain/bible/test_bible.py tests/unit/domain/bible/test_character.py -q
结果:36 passed
风险说明
Summary by CodeRabbit
Release Notes
New Features
Improvements