Skip to content

fix(bible): 修复角色关系指数增长导致数据库爆炸的问题#160

Open
ws1065 wants to merge 1 commit intoshenminglinyi:masterfrom
ws1065:feat/fix-bible-relationship-explosion
Open

fix(bible): 修复角色关系指数增长导致数据库爆炸的问题#160
ws1065 wants to merge 1 commit intoshenminglinyi:masterfrom
ws1065:feat/fix-bible-relationship-explosion

Conversation

@ws1065
Copy link
Copy Markdown

@ws1065 ws1065 commented May 8, 2026

修复 bible_character_relationships 表因 add_character() 全量 DELETE + INSERT 模式导致关系数据指数增长(2^n)的 bug。实测关系表膨胀至 406 万行(1.4GB),每条关系被复制
131,072 次。改为增量 INSERT,不再触碰已有数据。


架构影响

  • 涉及层级:application / infrastructure
  • 是否新增数据库表/字段:否
  • 是否修改现有 API 契约:否

测试

后端单测

pytest tests/unit/application/services/test_bible_service.py tests/unit/domain/bible/test_bible.py tests/unit/domain/bible/test_character.py -q

结果:36 passed

  • 新增/修改的逻辑有对应单测
  • 本地后端启动正常(python -m uvicorn ...)
  • 本地前端启动正常(npm run dev)

风险说明

  • 潜在风险:add_world_setting、add_location 等方法仍使用全量 save 模式,后续建议一并改为增量写入
  • 回滚方式:git revert 本提交即可;数据库备份位于 data/aitext.db.bak

Summary by CodeRabbit

Release Notes

  • New Features

    • Characters can now be added incrementally without affecting existing characters or relationships.
    • Added validation to prevent duplicate character creation.
  • Improvements

    • Optimized character persistence for better performance.

bible_service.add_character() 原先每次调用都会全量加载 Bible、删除所有
子表数据后重新写入,多次调用会导致关系数据按 2^n 倍增。实测
bible_character_relationships 表膨胀至 406 万行(1.4GB),其中每条
关系被复制 131,072 次。

修复方案:
- 新增 SqliteBibleRepository.add_character_incremental() 增量写入方法
- BibleService.add_character() 改为增量 INSERT,不再 DELETE + 全量重写
- 更新单元测试适配新逻辑

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ws1065 ws1065 requested a review from shenminglinyi as a code owner May 8, 2026 03:29
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 8, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

BibleService.add_character now performs incremental character persistence. It validates for duplicate characters, uses a new SqliteBibleRepository.add_character_incremental method when available, falls back to the previous full-save behavior, and reloads the Bible state before returning the updated DTO.

Changes

Incremental Character Persistence

Layer / File(s) Summary
Repository Incremental Persistence
infrastructure/persistence/database/sqlite_bible_repository.py
SqliteBibleRepository gains add_character_incremental method that upserts a character row and inserts relationship rows for a given novel without deleting existing characters. Includes transaction management and UUID-based relationship ID generation.
Service Layer Integration
application/world/services/bible_service.py
BibleService.add_character imports InvalidOperationError, validates for duplicate characters, attempts incremental persistence via the new repository method with relationship normalization, falls back to previous full-save behavior, and reloads Bible state before returning DTO.
Tests & Validation
tests/unit/application/services/test_bible_service.py
test_add_character mocks updated with sequential Bible returns via side_effect to simulate incremental persistence and assert add_character_incremental is called once instead of save.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A character hops into the Bible's tale,
No longer must the whole world fail!
Incremental paths let each one stay,
While newcomers find their rightful way.
Persistence preserved, relationships dance—
The repository takes its chance! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: fixing exponential growth of character relationships in the Bible by switching from full delete+insert to incremental insert operations.
Description check ✅ Passed The description covers all required template sections with sufficient detail: change type (fix), explanation of the problem and solution, architectural impact, test results, and risk assessment.
Docstring Coverage ✅ Passed Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
infrastructure/persistence/database/sqlite_bible_repository.py (1)

30-35: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Root cause remains: _clear_children never clears bible_character_relationships, so every save() call still accumulates orphaned relationship rows.

This PR correctly fixes add_character by bypassing save(). However, _clear_children still omits the bible_character_relationships table. Every other service method (add_world_setting, add_location, add_timeline_note, add_style_note, update_bible) still calls save(), which calls _clear_children, which deletes bible_characters rows but leaves their relationship rows behind as orphans — then new relationship rows are inserted with fresh UUIDs. Over repeated calls the relationships table will continue to grow unboundedly for those paths.

The minimal complementary fix:

🛡️ Proposed fix for `_clear_children`
 def _clear_children(self, conn, novel_id: str) -> None:
+    conn.execute(
+        """
+        DELETE FROM bible_character_relationships
+        WHERE character_id IN (
+            SELECT id FROM bible_characters WHERE novel_id = ?
+        )
+        """,
+        (novel_id,),
+    )
     conn.execute("DELETE FROM bible_style_notes WHERE novel_id = ?", (novel_id,))
     conn.execute("DELETE FROM bible_timeline_notes WHERE novel_id = ?", (novel_id,))
     conn.execute("DELETE FROM bible_locations WHERE novel_id = ?", (novel_id,))
     conn.execute("DELETE FROM bible_world_settings WHERE novel_id = ?", (novel_id,))
     conn.execute("DELETE FROM bible_characters WHERE novel_id = ?", (novel_id,))

The subquery must run before bible_characters is deleted, otherwise it returns no rows and the relationships are left behind.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@infrastructure/persistence/database/sqlite_bible_repository.py` around lines
30 - 35, _clear_children currently deletes bible_characters before removing
related rows, leaving orphaned rows in bible_character_relationships; update the
_clear_children(conn, novel_id) logic to first delete from
bible_character_relationships using a subquery that selects character ids for
the given novel_id (e.g. DELETE FROM bible_character_relationships WHERE
character_id IN (SELECT id FROM bible_characters WHERE novel_id = ?)), then
proceed to delete bible_characters and the other child tables
(bible_style_notes, bible_timeline_notes, bible_locations, bible_world_settings)
to ensure relationships are removed before their parent character rows.
application/world/services/bible_service.py (1)

123-148: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

TOCTOU: duplicate-character guard is not atomic with the database write.

Two concurrent requests for the same character_id can both pass the get_character check on line 128 (both see the character absent) and then both call add_character_incremental. The second call's INSERT OR REPLACE silently replaces the character row while both calls append relationship rows, re-introducing duplication.

The most robust fix is a DB-level unique constraint on (novel_id, id) in bible_characters (it likely already is the PK) and catching IntegrityError in add_character_incremental and re-raising it as InvalidOperationError, making the DB the authoritative guard rather than the in-memory check.

# in add_character_incremental — after the try:
import sqlite3
try:
    conn.execute("INSERT INTO bible_characters ...")  # use INSERT, not INSERT OR REPLACE
    ...
    conn.commit()
except sqlite3.IntegrityError:
    conn.rollback()
    raise InvalidOperationError(f"Character '{character_id}' already exists")
except Exception:
    conn.rollback()
    raise
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@application/world/services/bible_service.py` around lines 123 - 148, The
current TOCTOU check in bible_service (bible.get_character(CharacterId(...))) is
not atomic with the subsequent repo.add_character_incremental call; update the
repository method add_character_incremental (not the service) to make the DB
authoritative by using a plain INSERT (not INSERT OR REPLACE) into
bible_characters, ensure the (novel_id, id) uniqueness is enforced at the DB
level, and catch the DB IntegrityError (e.g., sqlite3.IntegrityError) inside
add_character_incremental to rollback and re-raise as InvalidOperationError
(preserving the message like "Character '<id>' already exists"); you can keep
the in-memory get_character check for early feedback but the real duplicate
protection must be implemented and translated to InvalidOperationError inside
add_character_incremental.
🧹 Nitpick comments (2)
application/world/services/bible_service.py (1)

133-148: ⚡ Quick win

hasattr duck-typing leaks infrastructure concerns into the application layer.

Using hasattr(repo, "add_character_incremental") mirrors the same pattern already present for update_character_anchors, but it means the application service silently degrades to the buggy save() fallback path for any repository that hasn't implemented the method — with no warning or error. The BibleRepository abstract base should declare add_character_incremental (raising NotImplementedError by default) so the fallback cannot happen silently.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@application/world/services/bible_service.py` around lines 133 - 148, The
service currently duck-types with hasattr(repo, "add_character_incremental")
which allows silent fallback to the buggy save() path; instead ensure the
abstract repository interface defines add_character_incremental (e.g., add an
abstract method add_character_incremental in BibleRepository that raises
NotImplementedError) and remove the hasattr check in bible_service.py so the
service calls repo.add_character_incremental(novel_id, character_id, name,
description, rel_dicts) directly (or explicitly raise a clear error if the
repository does not implement the method), keeping the rel conversion logic
around the same (the rel_dicts building and use of update_character_anchors
unchanged).
tests/unit/application/services/test_bible_service.py (1)

41-64: ⚡ Quick win

New InvalidOperationError duplicate-character guard has no test coverage.

Lines 128–131 in bible_service.py now raise InvalidOperationError when the character already exists, but there is no test exercising that branch. The analogous test_add_character_bible_not_found test exists; the duplicate case should be added symmetrically.

💚 Suggested test
def test_add_character_already_exists(self, service, mock_repository):
    """测试重复添加同一角色应抛出 InvalidOperationError"""
    from domain.shared.exceptions import InvalidOperationError

    existing_char = Character(
        id=CharacterId("char-1"), name="主角", description="主角描述"
    )
    bible = Bible(id="bible-1", novel_id=NovelId("novel-1"))
    bible.add_character(existing_char)
    mock_repository.get_by_novel_id.return_value = bible

    with pytest.raises(InvalidOperationError):
        service.add_character(
            novel_id="novel-1",
            character_id="char-1",
            name="主角",
            description="主角描述",
        )
    mock_repository.add_character_incremental.assert_not_called()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/application/services/test_bible_service.py` around lines 41 - 64,
Add a unit test that covers the new duplicate-character guard in
bible_service.py: create a test named test_add_character_already_exists that
sets up a Bible containing a Character (use CharacterId/Character and
bible.add_character), mock mock_repository.get_by_novel_id to return that Bible,
then call service.add_character with the same character_id and assert it raises
domain.shared.exceptions.InvalidOperationError and that
mock_repository.add_character_incremental.assert_not_called(); this mirrors
test_add_character_bible_not_found and ensures the InvalidOperationError branch
(the check in bible_service.py that raises when the character already exists) is
exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@application/world/services/bible_service.py`:
- Around line 160-162: The code calls BibleDTO.from_domain(bible) without
checking that bible (from
self.bible_repository.get_by_novel_id(NovelId(novel_id))) may be None; add a
null-check after the get_by_novel_id call and handle the missing Bible
explicitly — e.g., if bible is None raise a clear NotFound/DomainError (or
return an appropriate error/None response) instead of calling
BibleDTO.from_domain, updating the method containing add_character_incremental
(or the function where this reload occurs) to reference get_by_novel_id and
BibleDTO.from_domain so the None case is handled.

In `@infrastructure/persistence/database/sqlite_bible_repository.py`:
- Around line 345-357: add_character_incremental is not idempotent because
existing rows in bible_character_relationships are not removed before
re-inserting relationships; add a targeted DELETE to remove all relationship
rows for the given character_id before the loop that INSERTs new rows. Locate
the function add_character_incremental and, just before the for i, rel in
enumerate(relationships) loop, execute a parameterized DELETE like "DELETE FROM
bible_character_relationships WHERE character_id = ?" using the same
conn/transaction so old relationships are purged atomically prior to inserting
the new ones. Ensure you reference character_id when deleting and keep the
subsequent INSERT logic unchanged.

---

Outside diff comments:
In `@application/world/services/bible_service.py`:
- Around line 123-148: The current TOCTOU check in bible_service
(bible.get_character(CharacterId(...))) is not atomic with the subsequent
repo.add_character_incremental call; update the repository method
add_character_incremental (not the service) to make the DB authoritative by
using a plain INSERT (not INSERT OR REPLACE) into bible_characters, ensure the
(novel_id, id) uniqueness is enforced at the DB level, and catch the DB
IntegrityError (e.g., sqlite3.IntegrityError) inside add_character_incremental
to rollback and re-raise as InvalidOperationError (preserving the message like
"Character '<id>' already exists"); you can keep the in-memory get_character
check for early feedback but the real duplicate protection must be implemented
and translated to InvalidOperationError inside add_character_incremental.

In `@infrastructure/persistence/database/sqlite_bible_repository.py`:
- Around line 30-35: _clear_children currently deletes bible_characters before
removing related rows, leaving orphaned rows in bible_character_relationships;
update the _clear_children(conn, novel_id) logic to first delete from
bible_character_relationships using a subquery that selects character ids for
the given novel_id (e.g. DELETE FROM bible_character_relationships WHERE
character_id IN (SELECT id FROM bible_characters WHERE novel_id = ?)), then
proceed to delete bible_characters and the other child tables
(bible_style_notes, bible_timeline_notes, bible_locations, bible_world_settings)
to ensure relationships are removed before their parent character rows.

---

Nitpick comments:
In `@application/world/services/bible_service.py`:
- Around line 133-148: The service currently duck-types with hasattr(repo,
"add_character_incremental") which allows silent fallback to the buggy save()
path; instead ensure the abstract repository interface defines
add_character_incremental (e.g., add an abstract method
add_character_incremental in BibleRepository that raises NotImplementedError)
and remove the hasattr check in bible_service.py so the service calls
repo.add_character_incremental(novel_id, character_id, name, description,
rel_dicts) directly (or explicitly raise a clear error if the repository does
not implement the method), keeping the rel conversion logic around the same (the
rel_dicts building and use of update_character_anchors unchanged).

In `@tests/unit/application/services/test_bible_service.py`:
- Around line 41-64: Add a unit test that covers the new duplicate-character
guard in bible_service.py: create a test named test_add_character_already_exists
that sets up a Bible containing a Character (use CharacterId/Character and
bible.add_character), mock mock_repository.get_by_novel_id to return that Bible,
then call service.add_character with the same character_id and assert it raises
domain.shared.exceptions.InvalidOperationError and that
mock_repository.add_character_incremental.assert_not_called(); this mirrors
test_add_character_bible_not_found and ensures the InvalidOperationError branch
(the check in bible_service.py that raises when the character already exists) is
exercised.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d852aaf1-c659-41f2-8b6a-1e9153edf56a

📥 Commits

Reviewing files that changed from the base of the PR and between 60e6446 and a1591aa.

📒 Files selected for processing (3)
  • application/world/services/bible_service.py
  • infrastructure/persistence/database/sqlite_bible_repository.py
  • tests/unit/application/services/test_bible_service.py

Comment on lines +160 to 162
# 重新加载最新状态返回
bible = self.bible_repository.get_by_novel_id(NovelId(novel_id))
return BibleDTO.from_domain(bible)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Potential NoneType crash: BibleDTO.from_domain(bible) is called without a null-check after the reload.

get_by_novel_id returns Optional[Bible]. If the Bible is concurrently deleted between add_character_incremental and the reload on line 161, bible is None and BibleDTO.from_domain(None) will raise an AttributeError.

🐛 Proposed fix
-        # 重新加载最新状态返回
-        bible = self.bible_repository.get_by_novel_id(NovelId(novel_id))
-        return BibleDTO.from_domain(bible)
+        # 重新加载最新状态返回
+        bible = self.bible_repository.get_by_novel_id(NovelId(novel_id))
+        if bible is None:
+            raise EntityNotFoundError("Bible", f"for novel {novel_id}")
+        return BibleDTO.from_domain(bible)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@application/world/services/bible_service.py` around lines 160 - 162, The code
calls BibleDTO.from_domain(bible) without checking that bible (from
self.bible_repository.get_by_novel_id(NovelId(novel_id))) may be None; add a
null-check after the get_by_novel_id call and handle the missing Bible
explicitly — e.g., if bible is None raise a clear NotFound/DomainError (or
return an appropriate error/None response) instead of calling
BibleDTO.from_domain, updating the method containing add_character_incremental
(or the function where this reload occurs) to reference get_by_novel_id and
BibleDTO.from_domain so the None case is handled.

Comment on lines +345 to +357
for i, rel in enumerate(relationships):
rid = f"{character_id}-rel-{i}-{uuid.uuid4().hex[:6]}"
target_name = rel.get("target", "") or ""
relation = rel.get("relation", "") or ""
desc = rel.get("description", "") or ""
conn.execute(
"""
INSERT INTO bible_character_relationships
(id, character_id, target_name, relation, description)
VALUES (?, ?, ?, ?, ?)
""",
(rid, character_id, target_name, relation, desc),
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

add_character_incremental is not idempotent — stale relationships are not purged before re-insertion.

INSERT OR REPLACE on bible_characters (line 334) performs a DELETE+INSERT internally. Because SQLite foreign-key enforcement is off by default and the schema most likely has no explicit ON DELETE CASCADE, the existing relationship rows for character_id in bible_character_relationships are not removed when the character row is replaced. The subsequent INSERT loop then appends new rows on top of the survivors, re-introducing a smaller version of the original duplication bug.

The fix is a targeted DELETE at the top of the relationship loop:

🛡️ Proposed fix
+            conn.execute(
+                "DELETE FROM bible_character_relationships WHERE character_id = ?",
+                (character_id,),
+            )
             for i, rel in enumerate(relationships):
                 rid = f"{character_id}-rel-{i}-{uuid.uuid4().hex[:6]}"

This makes the method self-consistent regardless of whether the service-level duplicate guard fires first (e.g., TOCTOU, direct repo invocation).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@infrastructure/persistence/database/sqlite_bible_repository.py` around lines
345 - 357, add_character_incremental is not idempotent because existing rows in
bible_character_relationships are not removed before re-inserting relationships;
add a targeted DELETE to remove all relationship rows for the given character_id
before the loop that INSERTs new rows. Locate the function
add_character_incremental and, just before the for i, rel in
enumerate(relationships) loop, execute a parameterized DELETE like "DELETE FROM
bible_character_relationships WHERE character_id = ?" using the same
conn/transaction so old relationships are purged atomically prior to inserting
the new ones. Ensure you reference character_id when deleting and keep the
subsequent INSERT logic unchanged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant