Bug: Cost estimate significantly underestimates actual spend — root cause identified in cache.js #47

@xxshubhamxx

Environment

  • OS: macOS (Apple Silicon, darwin/arm64)
  • Node.js: v22.22.0
  • Agentlytics version: 0.2.12

Observed Discrepancy

For the same period (April 2026, Asia/Kolkata timezone):

Tool          Estimated Cost
Agentlytics   $962
ccusage       $2,236.69

The Agentlytics estimate is roughly 2.3× lower than the ccusage estimate ($2,236.69 / $962 ≈ 2.3). Both tools read the same underlying Claude Code session files.


Root Cause: getCachedChats() uses a single top_model for cost — not per-message model data

In cache.js, the getCachedChats() function estimates cost per session like this:

// cache.js — getCachedChats()
r.cost = r.top_model
  ? (calculateCost(r.top_model, inTok, outTok, r._cacheR || 0, r._cacheW || 0) || 0)
  : 0;

top_model is the most frequent model in the session. For mixed-model sessions (e.g., claude-opus-4 + claude-sonnet-4-5), every token is priced at top_model's rate, so the cost is significantly underestimated whenever a cheaper model is the most frequent but a more expensive model handled the bulk of the tokens.


Root Cause: estimateCosts() is the correct path but is only used in the Cost tab

The more accurate estimateCosts() function in cache.js does per-model token attribution from the messages table:

// cache.js — estimateCosts() — CORRECT approach
const modelTokens = db.prepare(`
  SELECT m.model,
         SUM(m.input_tokens) as input, SUM(m.output_tokens) as output,
         SUM(m.cache_read) as cacheRead, SUM(m.cache_write) as cacheWrite
  FROM messages m JOIN chats c ON m.chat_id = c.id
  WHERE m.model IS NOT NULL AND (...)
  GROUP BY m.model
`).all(...);

This per-message model attribution correctly handles mixed-model sessions. But the dashboard overview stat card and the session list use getCachedChats() with the single top_model shortcut, so the total cost displayed on the main dashboard is calculated differently (and less accurately) than the total on the dedicated Cost tab.


Root Cause: char → token fallback inflates input and deflates cost

Both getCachedChats() and getCachedDashboardStats() have this fallback:

// When no token data, estimate from character counts
if (inTok === 0 && outTok === 0 && ((r._uChars || 0) > 0 || (r._aChars || 0) > 0)) {
  inTok = Math.round((r._uChars || 0) / 4);
  outTok = Math.round((r._aChars || 0) / 4);
}

For editors that do expose token data (Claude Code, Copilot), this fallback should never fire. But if _inTok and _outTok are 0 due to a parsing issue, it kicks in and estimates tokens from character counts alone. cacheRead and cacheWrite tokens have no character-count equivalent, so all cache token costs are silently dropped for any session where the fallback triggers.
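One way to make this failure visible is to have the fallback report that it fired. The following is only a sketch under assumptions: resolveTokens() and the estimated flag are hypothetical names, not existing Agentlytics code, and the row fields (_inTok, _outTok, _uChars, _aChars, _cacheR, _cacheW) are assumed to match the snippets in this issue.

```javascript
// Hypothetical helper: same char -> token fallback as above, but it also
// returns an `estimated` flag so callers can tell real counts from guesses.
function resolveTokens(r) {
  let inTok = r._inTok || 0;
  let outTok = r._outTok || 0;
  let estimated = false;

  const hasChars = (r._uChars || 0) > 0 || (r._aChars || 0) > 0;
  if (inTok === 0 && outTok === 0 && hasChars) {
    // Roughly 4 characters per token, as in the existing fallback.
    inTok = Math.round((r._uChars || 0) / 4);
    outTok = Math.round((r._aChars || 0) / 4);
    estimated = true;
  }

  // Cache counts cannot be derived from characters; they stay at whatever
  // the row carried (0 if parsing failed).
  return { inTok, outTok, cacheR: r._cacheR || 0, cacheW: r._cacheW || 0, estimated };
}

// A row where token parsing failed but character counts survived.
const broken = resolveTokens({ _uChars: 400, _aChars: 800 });
console.log(broken.estimated, broken.cacheR, broken.cacheW);
```

With a flag like this, sessions from editors that should expose token data could be logged or surfaced when estimated is true, instead of being priced silently without cache tokens.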


Root Cause: Session-level cost doesn't weight by actual token distribution per model

In getCachedChats(), when inTok and outTok are available, they are attributed entirely to top_model:

r.cost = r.top_model
  ? (calculateCost(r.top_model, inTok, outTok, r._cacheR || 0, r._cacheW || 0) || 0)
  : 0;

If a session used claude-opus-4 (input: $15/M) for 80% of its tokens but claude-haiku-4-5 (input: $1/M) was the most frequent model (e.g. used for many small tool responses), then top_model = claude-haiku-4-5 and the entire session's token cost is priced at $1/M input instead of the correct blended rate.
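The size of the gap can be checked with a small standalone calculation. This is an illustrative sketch, not Agentlytics code: the message rows are made up, only input tokens are priced, the two prices are the ones quoted above, and the most frequent model is assumed to be picked by message count.

```javascript
// Input prices in USD per million tokens, as quoted in the example above.
const INPUT_PRICE_PER_M = {
  'claude-opus-4': 15,
  'claude-haiku-4-5': 1,
};

// Made-up session: 8 small haiku messages (20% of tokens),
// 2 large opus messages (80% of tokens).
const messages = [
  ...Array.from({ length: 8 }, () => ({ model: 'claude-haiku-4-5', inputTokens: 25_000 })),
  { model: 'claude-opus-4', inputTokens: 400_000 },
  { model: 'claude-opus-4', inputTokens: 400_000 },
];

function inputCost(model, tokens) {
  return (tokens / 1_000_000) * INPUT_PRICE_PER_M[model];
}

// Assumption: top_model is the model with the most messages.
function mostFrequentModel(msgs) {
  const counts = new Map();
  for (const m of msgs) counts.set(m.model, (counts.get(m.model) || 0) + 1);
  return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
}

// Prices every token at the most frequent model's rate (the shortcut).
function topModelCost(msgs) {
  const total = msgs.reduce((sum, m) => sum + m.inputTokens, 0);
  return inputCost(mostFrequentModel(msgs), total);
}

// Prices each message at its own model's rate.
function perModelCost(msgs) {
  return msgs.reduce((sum, m) => sum + inputCost(m.model, m.inputTokens), 0);
}

console.log(topModelCost(messages).toFixed(2)); // 1.00
console.log(perModelCost(messages).toFixed(2)); // 12.20
```

In this made-up session the shortcut reports $1.00 where per-model pricing gives $12.20, so a handful of such sessions is enough to pull the dashboard total well below the real spend.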


Suggested Fix

The estimateCosts() path already solves this correctly. The fix is to use getCostBreakdown() for the dashboard total instead of summing getCachedChats().cost:

// In server.js or wherever the dashboard total is computed:
// BEFORE (inaccurate):
const totalCost = chats.reduce((sum, c) => sum + c.cost, 0);

// AFTER (accurate — uses per-message model attribution):
const { totalCost } = getCostBreakdown(opts);

For the session list, the per-session cost can be improved by splitting tokens proportionally across models found in that session's messages:

// Instead of top_model shortcut, sum per-model costs from messages:
const msgRows = db.prepare(
  `SELECT model, SUM(input_tokens) as i, SUM(output_tokens) as o,
   SUM(cache_read) as cr, SUM(cache_write) as cw
   FROM messages WHERE chat_id = ? AND model IS NOT NULL
   GROUP BY model`
).all(chatId);
const cost = msgRows.reduce((sum, r) =>
  sum + (calculateCost(r.model, r.i, r.o, r.cr, r.cw) || 0), 0
);

This is more expensive per query but dramatically more accurate, and can be cached per chat_id since sessions are immutable after close.
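The per-chat caching could look roughly like the sketch below. The names costByChatId, getSessionCost, and the isClosed argument are hypothetical; how Agentlytics knows that a session is closed is not covered here and is simply passed in as a boolean.

```javascript
// Hypothetical in-memory memo: chatId -> cost.
const costByChatId = new Map();

// computeCost is whatever runs the per-model query for one chat,
// e.g. the msgRows/reduce snippet above wrapped in a function.
function getSessionCost(chatId, isClosed, computeCost) {
  if (costByChatId.has(chatId)) return costByChatId.get(chatId);
  const cost = computeCost(chatId);
  // Only closed sessions are stored; an open one can still gain messages.
  if (isClosed) costByChatId.set(chatId, cost);
  return cost;
}

// Usage with a stand-in cost function that counts how often it runs.
let queries = 0;
const fakeComputeCost = (chatId) => { queries += 1; return 1.25; };

getSessionCost('chat-a', true, fakeComputeCost);
getSessionCost('chat-a', true, fakeComputeCost);   // served from the memo
getSessionCost('chat-b', false, fakeComputeCost);
getSessionCost('chat-b', false, fakeComputeCost);  // recomputed: still open
console.log(queries); // 3
```

Skipping the memo for open sessions keeps the list accurate while a session is still growing, at the price of recomputing those few rows on each load.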


Happy to submit a PR for this if helpful. The estimateCosts() logic is already correct — it just needs to be the single source of truth for all cost displays.
