Conversation
| - `func Execute() error` returns `rootCmd.Execute()` — errors propagate to `main()` rather than calling `os.Exit` directly, ensuring deferred cleanup runs. | ||
| - In `PersistentPreRunE`: if `MSGVAULT_HOME` is set and `--home` was not explicitly provided, print a warning to stderr about the env var being set. | ||
|
|
||
| **Dependencies:** |
There was a problem hiding this comment.
The PersistentPreRunE warning about MSGVAULT_HOME is insufficient. If MSGVAULT_HOME points to a sensitive directory and the user runs devdata commands, symlink operations on ~/.msgvault could create confusion or accidental data modification. The tool should error (not just warn) if MSGVAULT_HOME is set unless --home is explicitly provided, forcing the user to acknowledge the override.
Automated security review by Claude 4.5 Sonnet - Human review still required
|
A session using a data excerpt for development: |
| absHome, err := filepath.Abs(home) | ||
| if err != nil { | ||
| return fmt.Errorf("resolve home directory: %w", err) | ||
| } |
There was a problem hiding this comment.
The path traversal check strings.HasPrefix(srcDir, absHome+string(filepath.Separator)) will fail for the home directory itself (when srcDir == absHome). Line 72 handles this with || srcDir != absHome, but the logic is inverted: it should be && srcDir != absHome. This could allow paths equal to home directory to bypass validation. The check should be: !strings.HasPrefix(srcDir, absHome+string(filepath.Separator)) && srcDir != absHome.
Automated security review by Claude 4.5 Sonnet - Human review still required
| } | ||
|
|
||
| // Remove symlink | ||
| if err := os.Remove(path); err != nil { |
There was a problem hiding this comment.
🚨 Symlink removal without re-verification before os.Remove (high severity)
The code re-verifies the symlink status at line 46 but then proceeds to os.Remove at line 56 without a second verification immediately before removal. A race condition between the check and removal could allow deletion of a real directory. The PRD explicitly requires re-verification immediately before os.Remove (see SECURITY.md section 'Symlink removal safety'). Move the Lstat check to line 55, immediately before os.Remove(path).
Automated security review by Claude 4.5 Sonnet - Human review still required
tools/devdata/dataset/copy.go
Outdated
|
|
||
| // Sanitize source path for ATTACH: reject null bytes, escape single quotes | ||
| if strings.ContainsRune(srcDBPath, 0) { | ||
| os.RemoveAll(dstDir) |
There was a problem hiding this comment.
The code checks for null bytes at line 64 and escapes single quotes at line 66, but does not validate that srcDBPath is an absolute canonical path within the expected directory. An attacker-controlled path could contain '..' sequences or symlinks that resolve outside the home directory, bypassing path traversal protection. The PRD requires canonicalization with filepath.Clean() and filepath.Abs() before this point, but newdata.go performs this. Add a defense-in-depth check here to verify srcDBPath is absolute and within expected bounds.
Automated security review by Claude 4.5 Sonnet - Human review still required
| absHome, err := filepath.Abs(home) | ||
| if err != nil { | ||
| return fmt.Errorf("resolve home directory: %w", err) | ||
| } |
There was a problem hiding this comment.
The path traversal check on lines 80-85 uses strings.HasPrefix with separator appending, but this can be bypassed with symlinks or redundant separators (e.g., ///home vs /home). Use filepath.Rel(absHome, srcDir) and check that the result does not start with ".." to properly validate paths are within bounds. Apply the same fix to dstDir validation on line 83.
Automated security review by Claude 4.5 Sonnet - Human review still required
6b3281f to
955bc00
Compare
| // Resolve destination path | ||
| dstDir := datasetPath(newDataDstFlag) | ||
|
|
||
| // Canonicalize and validate paths |
There was a problem hiding this comment.
🚨 Path traversal via filepath.Abs race condition (high severity)
Lines 66-73 call filepath.Abs on user-controlled --src/--dst flags, but these can still resolve symlinks outside the home directory. An attacker creating ~/.msgvault-evil as a symlink to /etc would pass validation but srcDir would resolve to /etc. Use filepath.EvalSymlinks explicitly and verify the final resolved path prefix-matches the canonical home directory before any filesystem operations.
Automated security review by Claude 4.5 Sonnet - Human review still required
| defer db.Close() | ||
|
|
||
| // Canonicalize source path for ATTACH (defense in depth — caller should | ||
| // also validate, but CopySubset is public and must not trust its inputs). |
There was a problem hiding this comment.
Lines 63-75 use filepath.Abs and filepath.Clean to canonicalize srcDBPath, but do not resolve symlinks. An attacker could place ~/.msgvault-evil/msgvault.db as a symlink to a victim's database outside the home directory. Use filepath.EvalSymlinks to resolve the final target and verify it is within the expected home directory before ATTACH.
Automated security review by Claude 4.5 Sonnet - Human review still required
764979c to
8cd7b38
Compare
|
Claude on why the test is failing on Windows: Based on my analysis of the code, TestReplaceSymlink is failing on Windows due to symlink permission requirements. Here's why: The ProblemOn Windows, creating symbolic links requires either:
GitHub Actions' windows-latest runners do not run with elevated privileges by default, so the test fails at line 158 when trying to create the first symlink: The SolutionsYou have several options: Option 1: Skip the test on Windows (Recommended)Add a build constraint or skip condition: Option 2: Make symlink tests Windows-specific with proper error handlingAccept that symlinks may not be available and handle the permission error gracefully: Option 3: Request Developer Mode in CI (Not reliable)You could try enabling Developer Mode in the GitHub Actions Windows runner, but this is fragile and not guaranteed to work across all Windows CI environments. Recommended ActionSkip the symlink tests on Windows since:
I don't like turning off tests for a platform, but it seems the best choice. The alternative seems to be to fight Github Actions configuration. |
871fb50 to
6e7d05e
Compare
Security Review: 3 High/Medium Issues FoundClaude's automated security review identified potential security concerns. Please review each finding below. Note: 1 low severity issue(s) were omitted to reduce noise.
|
roborev: Combined ReviewSummary verdict: 3 low-severity findings; no critical/high/medium issues. Low
Synthesized from 4 reviews (agents: codex, gemini | types: security, review) |
PRD changes: - Acknowledge tools/ as new convention vs scripts/ for shell tools - Document exit-dev-data restores gold regardless of current mount - Fix FTS5 rebuild description to match actual backfillFTSBatch query - Note attachment storage_path file-not-found behavior in dev datasets - Note analytics/ Parquet cache auto-build on TUI launch - Add MSGVAULT_HOME env var interaction section with warning behavior - Add schema version compatibility section acknowledging version coupling Task list changes: - Replace os.Exit(1) with error-returning Execute() pattern - Add MSGVAULT_HOME warning in PersistentPreRunE - Clarify two-phase open sequence for schema init + FK-off bulk copy - Add SELECT * fragility acknowledgment with rationale - Fix FTS5 populate query to match store.backfillFTSBatch exactly - Note conversation participant_count reflects subset, not original - Add IsDefault field to DatasetInfo for non-symlink directory listing - Handle real ~/.msgvault directory in list command output - Add automated test tasks for dataset package (Stage 1) and copy logic (Stage 2) using t.TempDir() and in-memory SQLite Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. Add dataset name allowlist to prevent SQL injection in ATTACH
Restrict dataset names to [a-zA-Z0-9_-] via ValidateDatasetName(),
called at the CLI boundary for --dataset, --src, and --dst flags.
This eliminates SQL injection vectors in the ATTACH DATABASE statement
while keeping the existing quote-escaping as defense in depth.
2. Move Lstat guard immediately before os.Remove in exit-dev-data
Collapse the Lstat re-verification and os.Remove into a single tight
block with no intervening code, minimizing the race window between
symlink verification and removal.
3. Canonicalize srcDBPath in CopySubset before ATTACH
Add filepath.Abs + filepath.Clean as defense in depth inside
CopySubset, since it is a public function that must not trust
its inputs. The caller already canonicalizes, but this ensures
the path used in the ATTACH DATABASE statement is always an
absolute canonical path regardless of call site.
4. Eliminate TOCTOU race in ReplaceSymlink with atomic rename
Replace the Lstat-then-Remove-then-Symlink pattern with a
temp-symlink + os.Rename pattern. On POSIX, rename atomically
replaces an existing symlink and fails with an error if the
target has become a real directory, eliminating the race window
entirely.
5. Reject control characters in source path before ATTACH
Replace the null-byte-only check with a full control character
scan (0x00-0x1F and 0x7F DEL) on the canonicalized source path.
This hardens the ATTACH DATABASE statement against characters
that could interfere with SQL parsing beyond just null bytes.
6. Use filepath.Rel for path traversal check instead of string prefix
Replace strings.HasPrefix-based path containment check with
filepath.Rel, which correctly handles symlinks and redundant
separators. A relative path starting with ".." means the target
escapes the home directory.
7. Harden devdata tool: error handling, path safety, race prevention
- Return errors from homeDir()/msgvaultPath()/datasetPath() instead of
calling os.Exit(1), so Cobra error handling and deferred cleanup run
- Add symlink-resolution + containment check in CopyFileIfExists to
prevent symlink escape from dataset directories
- Use random suffix for temp symlink in ReplaceSymlink to avoid
collisions between concurrent calls
- Fix misleading comment about PRAGMA foreign_keys (FK check is
standalone, not dependent on the pragma)
- Clarify temp table cleanup comment re: rollback behavior
- Use realistic string format for source_message_id in test data
- Replace hardcoded /tmp/ paths in tests with t.TempDir()
- Strengthen TestCopySubset_DestinationExists with actual assertions
8. Check all error return values in copy.go
- Check st.Close() on success path (can indicate unflushed writes)
- Check rows.Close() and rows.Err() after FK check iteration
- Check rows.Scan() errors (report as violations instead of silently skipping)
- Check res.RowsAffected() for conversations, participants, labels
- Check tx.Exec for temp table DROP
- Check db.Exec("DETACH DATABASE src") on success path
- Mark all best-effort cleanup calls with _ = (os.RemoveAll,
tx.Rollback, db.Exec on error paths) to document intentional ignore
9. Harden devdata: file sync, safe cleanup, input validation, test errors
- CopyFileIfExists: explicitly Sync and Close destination file instead of
relying on deferred Close, ensuring write errors are not silently lost
- CopySubset: track whether destination directory was newly created and
only RemoveAll on error if we created it, preventing deletion of
pre-existing directories passed by other callers
- new-data command: validate --rows is positive (negative values would
bypass SQLite LIMIT, zero would create empty databases)
- Tests: check all QueryRow().Scan() errors in TestCopySubset_Basic
10. Address security review findings for devdata tool
- Widen CopyFileIfExists containment check to verify against dataset root
directory (containDir param) instead of just the immediate parent, preventing
symlink escapes to sibling directories within the dataset
- Reorder rows.Err() before rows.Close() for idiomatic Go error checking
- Add _ = to intentionally-ignored os.Remove in ReplaceSymlink for consistency
- Document deferred db.Close() vs cleanupDir() ordering interaction
- Remove resolved SECURITY-REVIEW-SKIP comments
- Add TestCopyFileIfExists_SymlinkEscape test for symlink containment
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
6e7d05e to
526e757
Compare
roborev: Combined ReviewVerdict: 5 issues found (2 Medium, 3 Low); no Critical/High. Critical
High
Medium
Low
Synthesized from 4 reviews (agents: codex, gemini | types: security, review) |
Windows doesn't allow os.Rename to overwrite existing files, which broke dataset switching and token file updates. Fixed by: 1. ReplaceSymlink: On Windows, remove old symlink before creating new one (POSIX continues to use atomic temp+rename) 2. saveToken: On Windows, remove existing token file before rename (POSIX continues to use atomic temp+rename) Both fixes maintain safety checks (verify symlink before removal) and preserve atomicity guarantees where possible. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Addresses code review findings for Windows file system compatibility: 1. **Symlink replacement race condition fix** (dataset.go:92-116) - Changed Windows symlink replacement to try temp+rename first for atomicity - Falls back to remove+create only if rename fails - Eliminates race window where another process could interfere between remove and create - Maintains safety check that verifies path is a symlink before any modifications 2. **Token file removal error handling** (oauth.go:324-332) - Changed Windows file removal to only ignore os.ErrNotExist errors - Now properly reports permission errors, I/O errors, and directory conflicts - Prevents confusing failures on subsequent rename operation - Cleans up temp file on error 3. **Windows-specific test coverage** (new files) - Added dataset_windows_test.go with build tag for Windows-only tests - Added oauth_windows_test.go with build tag for Windows-only tests - Tests verify correct behavior for file overwrites, error cases, and initial saves - Ensures Windows-specific code paths are validated on Windows platform All tests pass. Changes maintain backward compatibility with POSIX systems while improving robustness and error reporting on Windows. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Replace custom contains()/findSubstring() helpers with strings.Contains() in both Windows test files - Remove unused encoding/json and path/filepath imports from oauth test - Make temp symlink cleanup in ReplaceSymlink report real errors instead of silently ignoring all failures Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add destination path containment check to CopyFileIfExists (defense in depth) - Reject CopySubset when destination database already exists (prevent corruption) - Document CopySubset's caller responsibility for path-traversal validation - Move formatSize to root.go to eliminate hidden cross-file dependency - Save original symlink target in exit-dev-data recovery path - Warn on unexpected FTS population errors instead of silently swallowing - Validate dataset name inside datasetPath to prevent future callers from skipping Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Clean up partially-created DB file when directory pre-existed (copy.go) - Report symlink recovery success/failure in exit-dev-data error messages - Remove misleading PRAGMA foreign_keys = ON from FK check test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
roborev: Combined ReviewSummary verdict: 3 findings (2 Medium, 1 Low); otherwise clean. Medium
Low
Synthesized from 4 reviews (agents: codex, gemini | types: security, review) |
|
I have squelched the incompetent Sonnet review bot so it will just be the roborev reviews going forward. I will work on this PR today! |
|
I haven't forgotten about this! I've been in triage mode and with a lot of meetings and other work obligations this week. I will work on this soon |
This is to address issue #100.