Refine timestamps in spans and recording alignment by toubatbrian · Pull Request #982 · livekit/agents-js

toubatbrian · 2026-01-16T21:36:23Z

Summary

This PR ports the Python PR #4131 (AGT-2316) to TypeScript, refining timestamp accuracy for telemetry spans and improving recording alignment.

Changes

Telemetry Timestamp Accuracy

User speech timing: Calculate accurate speech start time by subtracting speechDuration from detection time, rather than recording when VAD triggered
Agent speech timing: Track when audio playback actually starts (first frame captured) instead of when generation begins
Span start times: Added startTime parameter support to tracer.startSpan() to allow backdating spans
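
A minimal sketch of the backdating idea, not the PR's actual code. It assumes `speechDuration` is reported in seconds and the detection time is epoch milliseconds (the CodeRabbit comment on `agent_activity.ts` later in this thread makes the same assumption). `RecordingTracer` and `speechStartTimeMs` are hypothetical stand-ins so the block runs without dependencies. The OTel API's `startSpan` options do accept a `startTime` field.

```typescript
// Hypothetical stand-in for the span options; only startTime is modelled here.
interface StartSpanOptions {
  startTime?: number; // epoch milliseconds
}

interface RecordedSpan {
  name: string;
  startTime: number;
}

// Hypothetical in-memory tracer used only to show where startTime ends up.
class RecordingTracer {
  readonly spans: RecordedSpan[] = [];

  startSpan(name: string, options: StartSpanOptions = {}): RecordedSpan {
    // Fall back to "now" when no explicit start time is supplied.
    const span: RecordedSpan = { name, startTime: options.startTime ?? Date.now() };
    this.spans.push(span);
    return span;
  }
}

// Assumption: speechDurationInS is in seconds, detectedAtMs is epoch milliseconds.
function speechStartTimeMs(detectedAtMs: number, speechDurationInS: number): number {
  return detectedAtMs - speechDurationInS * 1000;
}

const tracer = new RecordingTracer();
const detectedAtMs = 1700000000000;

// The user started speaking 0.5 s before VAD fired, so the span is backdated by 500 ms.
const userTurn = tracer.startSpan('user_turn', {
  startTime: speechStartTimeMs(detectedAtMs, 0.5),
});
```

Keeping the unit conversion in one named helper makes the seconds-versus-milliseconds boundary explicit at the call site.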

Recording Alignment

recorder_io.ts: Added _lastSpeechEndTime and _lastSpeechStartTime tracking for proper audio alignment
Silence padding: takeBuf() now supports padSince parameter to prepend silence frames when needed
Recording start time: Now returns the minimum of input/output start times for accurate alignment
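
The two alignment rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `recorder_io.ts`: the helper names `earliestStart` and `padWithSilence` are hypothetical, audio is modelled as mono 16-bit samples, and the gap to pad is given in seconds.

```typescript
// Hypothetical helper: earliest defined wall-clock start time (epoch ms), if any.
function earliestStart(inputStartMs?: number, outputStartMs?: number): number | undefined {
  if (inputStartMs === undefined) return outputStartMs;
  if (outputStartMs === undefined) return inputStartMs;
  return Math.min(inputStartMs, outputStartMs);
}

// Hypothetical helper: prepend gapInS seconds of silence to a mono PCM buffer.
function padWithSilence(samples: Int16Array, gapInS: number, sampleRate: number): Int16Array {
  // A non-positive gap means there is nothing to pad.
  const padSamples = gapInS > 0 ? Math.round(gapInS * sampleRate) : 0;
  // Typed arrays are zero-initialised, and zero samples are silence.
  const out = new Int16Array(padSamples + samples.length);
  out.set(samples, padSamples);
  return out;
}

// Example: 0.5 s of silence at a (deliberately tiny) 8 Hz sample rate is 4 samples.
const padded = padWithSilence(new Int16Array([1, 2]), 0.5, 8);
const startedAt = earliestStart(3000, 2500);
```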

Event Propagation

Added PlaybackStartedEvent interface and EVENT_PLAYBACK_STARTED constant to io.ts
ParticipantAudioOutput now emits playbackStarted event when first audio frame is captured
generation.ts listens for playback events to resolve firstFrameFut with accurate timestamp
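
As a rough illustration of the emit-once behaviour, the sketch below fires a playback-started callback on the first captured frame and re-arms after a flush. `SketchAudioOutput` and its method names are hypothetical, and the `createdAt` field is an assumption borrowed from the sequence diagram in the CodeRabbit walkthrough. The real `ParticipantAudioOutput` is not reproduced here.

```typescript
// Assumed event shape; createdAt is epoch milliseconds of the first captured frame.
interface PlaybackStartedEvent {
  createdAt: number;
}

type PlaybackStartedListener = (ev: PlaybackStartedEvent) => void;

// Hypothetical output that emits "playback started" once per playback cycle.
class SketchAudioOutput {
  private firstFrameEmitted = false;
  private readonly listeners: PlaybackStartedListener[] = [];

  addPlaybackStartedListener(listener: PlaybackStartedListener): void {
    this.listeners.push(listener);
  }

  captureFrame(_frame: Int16Array, nowMs: number = Date.now()): void {
    if (!this.firstFrameEmitted) {
      this.firstFrameEmitted = true;
      this.listeners.forEach((listener) => listener({ createdAt: nowMs }));
    }
    // A real output would forward the frame to its audio sink here.
  }

  flush(): void {
    // Re-arm so the next playback cycle emits again.
    this.firstFrameEmitted = false;
  }
}

const output = new SketchAudioOutput();
const seen: number[] = [];
output.addPlaybackStartedListener((ev) => seen.push(ev.createdAt));

output.captureFrame(new Int16Array(4), 100); // emits
output.captureFrame(new Int16Array(4), 120); // same cycle, no second event
output.flush();
output.captureFrame(new Int16Array(4), 200); // new cycle, emits again
```

A listener that resolves a future with `ev.createdAt` then gets the time playback actually began rather than the time generation started.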

OTel Context Propagation

Added _agentTurnContext to SpeechHandle to maintain proper span hierarchy
Agent state updates now pass OTel context for correct parent-child relationships

Bug Fix: Duplicate Tool Calls

Fixed duplicate FunctionCall entries in session history by filtering toolsMessages to only add FunctionCallOutput items (since FunctionCall items are already added by onToolExecutionStarted)
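
The filtering described above can be sketched like this. The item shapes are simplified, hypothetical stand-ins (the real `FunctionCall` and `FunctionCallOutput` classes carry more fields), and `outputsOnly` is an illustrative name, not a function from the PR.

```typescript
// Hypothetical, simplified item shapes discriminated by a `type` tag.
type FunctionCall = { type: 'function_call'; callId: string; name: string };
type FunctionCallOutput = { type: 'function_call_output'; callId: string; output: string };
type ToolItem = FunctionCall | FunctionCallOutput;

// Keep only the outputs; the calls are assumed to be in the history already.
function outputsOnly(toolsMessages: ToolItem[]): FunctionCallOutput[] {
  return toolsMessages.filter(
    (item): item is FunctionCallOutput => item.type === 'function_call_output',
  );
}

// The call was appended earlier, when tool execution started.
const history: ToolItem[] = [{ type: 'function_call', callId: 'c1', name: 'lookup' }];

// After execution, the tool messages contain the call again plus its output.
const toolsMessages: ToolItem[] = [
  { type: 'function_call', callId: 'c1', name: 'lookup' },
  { type: 'function_call_output', callId: 'c1', output: 'ok' },
];

history.push(...outputsOnly(toolsMessages));
```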

Utilities

Added rejected property to Future class to check if a future was rejected
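
For readers unfamiliar with the pattern, here is a minimal sketch of a future that exposes a `rejected` getter. It is an assumption-laden illustration rather than the `Future` class in `utils.ts`: only the `await` property name and the `rejected` getter come from this thread, and the no-op `catch` in the constructor is a choice made for this sketch.

```typescript
// Minimal, hypothetical future with settled-state tracking.
class Future<T> {
  private _resolve!: (value: T) => void;
  private _reject!: (error: Error) => void;
  private _done = false;
  private _rejected = false;
  readonly await: Promise<T>;

  constructor() {
    this.await = new Promise<T>((resolve, reject) => {
      this._resolve = resolve;
      this._reject = reject;
    });
    // Sketch choice: swallow unobserved rejections so they do not surface as unhandled.
    this.await.catch(() => undefined);
  }

  get done(): boolean {
    return this._done;
  }

  get rejected(): boolean {
    return this._rejected;
  }

  resolve(value: T): void {
    if (this._done) return; // settle at most once
    this._done = true;
    this._resolve(value);
  }

  reject(error: Error): void {
    if (this._done) return; // settle at most once
    this._done = true;
    this._rejected = true;
    this._reject(error);
  }
}

const cancelled = new Future<number>();
cancelled.reject(new Error('cancelled before first frame'));

const played = new Future<number>();
played.resolve(1700000000000);
played.reject(new Error('ignored, already settled'));
```

Checking `rejected` synchronously lets a caller distinguish "settled with a timestamp" from "settled by cancellation" without awaiting the promise.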

Files Changed

File	Changes
`telemetry/traces.ts`	Added `startTime` to `StartSpanOptions`, pass directly to OTel SDK
`voice/io.ts`	Added `PlaybackStartedEvent`, `EVENT_PLAYBACK_STARTED`, `onPlaybackStarted()`
`voice/room_io/_output.ts`	Emit `playbackStarted` on first frame capture
`voice/generation.ts`	Listen for `playbackStarted`, resolve `firstFrameFut` with timestamp
`voice/audio_recognition.ts`	Calculate accurate speech start time with `speechDuration`
`voice/agent_session.ts`	Pass `startTime` and `otelContext` to state update methods
`voice/agent_activity.ts`	Propagate timestamps, set `_agentTurnContext`, fix duplicate tool calls
`voice/speech_handle.ts`	Added `_agentTurnContext` property
`voice/recorder_io/recorder_io.ts`	Added speech timing tracking, silence padding, aligned recording start
`utils.ts`	Added `rejected` getter to `Future` class

Testing

Verified telemetry spans now have accurate start times
Confirmed no duplicate function calls in Agent Insights transcript
All existing tests pass

Summary by CodeRabbit

New Features
- Added explicit start timestamp support for tracing spans to improve observability and timing precision of voice interactions.
- Introduced playback start event signals for enhanced audio playback monitoring.
- Improved audio recording and playback synchronization through refined timing and boundary alignment.
Chores
- Updated test environment configuration for example applications.

changeset-bot · 2026-01-16T21:36:27Z

🦋 Changeset detected

Latest commit: 2fe2557

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 18 packages

Name	Type
@livekit/agents	Patch
@livekit/agents-plugin-anam	Patch
@livekit/agents-plugin-baseten	Patch
@livekit/agents-plugin-bey	Patch
@livekit/agents-plugin-cartesia	Patch
@livekit/agents-plugin-deepgram	Patch
@livekit/agents-plugin-elevenlabs	Patch
@livekit/agents-plugin-google	Patch
@livekit/agents-plugin-hedra	Patch
@livekit/agents-plugin-inworld	Patch
@livekit/agents-plugin-livekit	Patch
@livekit/agents-plugin-neuphonic	Patch
@livekit/agents-plugin-openai	Patch
@livekit/agents-plugin-resemble	Patch
@livekit/agents-plugin-rime	Patch
@livekit/agents-plugin-silero	Patch
@livekit/agents-plugins-test	Patch
@livekit/agents-plugin-xai	Patch

coderabbitai · 2026-01-16T21:36:35Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

The changes introduce precise timestamp tracking and OpenTelemetry context propagation throughout the voice agent system. They add support for explicit span start times, rejection tracking in futures, and implement event-driven timing for audio playback and speech events to improve span accuracy and recording alignment.

Changes

Cohort / File(s)	Summary
Telemetry and Timing Infrastructure: `agents/src/telemetry/traces.ts`, `agents/src/utils.ts`	Added optional `startTime` field to `StartSpanOptions` to support explicit span initialization timestamps. Added rejection state tracking to `Future<T>` with public `rejected` getter.
Voice Agent State Management: `agents/src/voice/agent_activity.ts`, `agents/src/voice/agent_session.ts`, `agents/src/voice/speech_handle.ts`	Refactored `onStartOfSpeech` signature to compute `speechStartTime` from VAD event duration. Added OpenTelemetry context capture at multiple entry points. Updated `_updateAgentState` and `_updateUserState` to accept optional timing and context options. Added internal `_agentTurnContext` field to `SpeechHandle`.
Speech Recognition Timing: `agents/src/voice/audio_recognition.ts`	Compute explicit `startTime` for `user_turn` spans based on detected speech duration in VAD `START_OF_SPEECH` events.
Audio Generation and Playback Events: `agents/src/voice/generation.ts`, `agents/src/voice/io.ts`	Changed `firstFrameFut` type from `Future` to `Future<number>` to capture numeric timestamps. Added `PlaybackStartedEvent` interface and `onPlaybackStarted()` method to `AudioOutput` with static event identifier. Wired event forwarding through audio chain.
Recorder Audio I/O Alignment: `agents/src/voice/recorder_io/recorder_io.ts`	Extensive timing updates: added `padSince` parameter to `takeBuf()` for silence padding at speech boundaries; updated `recordingStartedAt` to return minimum of wall times; added trailing silence duration calculations; updated `createSilenceFrame()` signature to accept duration in seconds. Introduced internal timing state tracking (`_padded`, `_lastSpeechEndTime`, `_lastSpeechStartTime`).
Audio Output Playback Events: `agents/src/voice/room_io/_output.ts`, `agents/src/voice/avatar/datastream_io.ts`	Added `firstFrameEmitted` flag to emit `onPlaybackStarted()` exactly once per playback cycle, resetting on playout completion or flush.
Configuration and Examples: `.changeset/lazy-spies-worry.md`, `examples/src/drive-thru/drivethru_agent.ts`, `examples/src/frontdesk/frontdesk_agent.ts`	Added changeset documenting patch release for timestamp refinement. Added ESLint-disable comments and conditional checks to skip CLI startup during Vitest test execution.

Sequence Diagram

sequenceDiagram
    participant SpeechDet as Speech Detection
    participant AgentAct as Agent Activity
    participant Tracer as OTEL Tracer
    participant AudioOut as Audio Output
    participant Playback as Playback System

    SpeechDet->>SpeechDet: Detect speech start<br/>(VAD event)
    SpeechDet->>SpeechDet: Compute startTime =<br/>now - speechDuration
    SpeechDet->>Tracer: startSpan("user_turn",<br/>{startTime})
    
    AgentAct->>AgentAct: Capture OTEL context
    AgentAct->>AgentAct: Update agent state<br/>with context & timing
    AgentAct->>Tracer: startSpan("agent_speaking",<br/>{startTime, context})
    
    AudioOut->>AudioOut: Begin playback
    AudioOut->>Playback: Register onPlaybackStarted<br/>listener
    
    Playback->>Playback: Start audio output
    Playback->>AudioOut: Emit PLAYBACK_STARTED<br/>event (createdAt)
    
    AudioOut->>AudioOut: onPlaybackStarted(createdAt)
    AudioOut->>AudioOut: Emit PlaybackStartedEvent<br/>with timestamp
    AudioOut->>Tracer: Spans reference<br/>precise timestamps

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~35 minutes

Suggested reviewers

davidzhao
theomonnom

Poem

🐰✨ Timestamps and context flow,
Through spans they swiftly go,
Playback events now chime with glee,
Recording times align precisely! 🎙️

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Refine timestamps in spans and recording alignment' accurately summarizes the main changes: improving telemetry timestamp precision and aligning audio recording timing.
Description check	✅ Passed	The PR description is comprehensive, well-structured, and covers all major changes with clear sections for telemetry accuracy, recording alignment, event propagation, OTel context, bug fixes, and affected files.

✨ Finishing touches

📝 Generate docstrings

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fef7fd0 and 2fe2557.

📒 Files selected for processing (1)

agents/src/voice/io.ts

🚧 Files skipped from review as they are similar to previous changes (1)

agents/src/voice/io.ts

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2eb8d02b56

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review"

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

agents/src/voice/generation.ts

toubatbrian · 2026-01-16T21:44:34Z

@codex

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8f38e2c44b

agents/src/telemetry/traces.ts

agents/src/voice/recorder_io/recorder_io.ts

agents/src/voice/agent_activity.ts

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@agents/src/voice/agent_activity.ts`:
- Around line 640-646: onStartOfSpeech computes speechStartTime by subtracting
VADEvent.speechDuration from Date.now() but speechDuration is in seconds while
Date.now() is milliseconds; update the subtraction in onStartOfSpeech to convert
ev.speechDuration to milliseconds (multiply by 1000) before subtracting, so the
timestamp passed to this.agentSession._updateUserState('speaking', ...) is
correct.

In `@agents/src/voice/recorder_io/recorder_io.ts`:
- Around line 693-711: captureFrame sets _startedWallTime and
_lastSpeechStartTime unconditionally while only pushing frames into accFrames
when this.recorderIO.recording is true; move the initialization of
_startedWallTime and _lastSpeechStartTime so they only occur when recording is
active (i.e., inside the same this.recorderIO.recording branch that pushes into
accFrames) to ensure timestamps align with when frames are actually recorded,
leaving the await this.nextInChain.captureFrame and await super.captureFrame
calls unchanged.
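
The second suggestion can be pictured with a small sketch in which timestamps are initialised only inside the recording branch. `SketchRecorderOutput` and its fields are hypothetical simplifications of the names mentioned in the comment, and chain forwarding is reduced to a counter.

```typescript
// Hypothetical, simplified recorder output; not the class in recorder_io.ts.
class SketchRecorderOutput {
  recording = false;
  startedWallTime?: number;
  lastSpeechStartTime?: number;
  forwardedFrames = 0;
  readonly accFrames: Int16Array[] = [];

  captureFrame(frame: Int16Array, nowMs: number): void {
    if (this.recording) {
      // Timestamps are set only when the frame is actually recorded.
      if (this.startedWallTime === undefined) this.startedWallTime = nowMs;
      if (this.lastSpeechStartTime === undefined) this.lastSpeechStartTime = nowMs;
      this.accFrames.push(frame);
    }
    // Forwarding to the next output in the chain happens regardless of recording state.
    this.forwardedFrames += 1;
  }
}

const recorder = new SketchRecorderOutput();
recorder.captureFrame(new Int16Array(2), 1000); // not recording: no timestamps yet
recorder.recording = true;
recorder.captureFrame(new Int16Array(2), 1500); // first recorded frame sets the timestamps
recorder.captureFrame(new Int16Array(2), 1520);
```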

🧹 Nitpick comments (2)

agents/src/voice/agent_activity.ts (2)
1229-1231: Consider logging the actual error for debugging purposes.

The catch handler assumes the rejection is always due to cancellation, but other errors might occur. Logging the error would help with debugging unexpected failures.
♻️ Suggested improvement
       textOut.firstTextFut.await
         .then(() => onFirstFrame())
-        .catch(() => this.logger.debug('firstTextFut cancelled before first frame'));
+        .catch((e) => this.logger.debug({ error: e }, 'firstTextFut rejected before first frame'));
1686-1697: Consider extracting the duplicate filtering logic.

This filtering logic is duplicated at lines 1486-1493. While acceptable, extracting to a helper function would reduce duplication.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f38e2c and 6a77734.

📒 Files selected for processing (2)

agents/src/voice/agent_activity.ts
agents/src/voice/recorder_io/recorder_io.ts

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

agents/src/voice/recorder_io/recorder_io.ts
agents/src/voice/agent_activity.ts

**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

agents/src/voice/recorder_io/recorder_io.ts
agents/src/voice/agent_activity.ts

**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

agents/src/voice/recorder_io/recorder_io.ts
agents/src/voice/agent_activity.ts

🧬 Code graph analysis (1)

agents/src/voice/agent_activity.ts (2)

agents/src/vad.ts (1)

VADEvent (24-56)

agents/src/llm/chat_context.ts (1)

FunctionCallOutput (284-350)

🔇 Additional comments (12)

agents/src/voice/agent_activity.ts (6)

7-7: LGTM!

The import alias otelContext for context is clear and helps distinguish OpenTelemetry context from other context references in the codebase.

1174-1175: LGTM!

Good pattern for capturing the OTel context at task entry and propagating it through onFirstFrame to _updateAgentState. This ensures accurate span parent-child relationships across async boundaries.

Also applies to: 1220-1225

1486-1493: LGTM!

Good fix to prevent duplicate FunctionCall entries in session history. The filtering ensures only FunctionCallOutput items are added here since FunctionCall items were already added by onToolExecutionStarted.

1517-1520: LGTM!

Good naming improvement using the InS suffix to explicitly indicate the unit is seconds, addressing previous feedback about unit clarity.

1318-1319: LGTM!

Consistent application of the OTel context capture and first-frame callback patterns in _pipelineReplyTaskImpl.

Also applies to: 1419-1424, 1436-1438, 1443-1445

1765-1766: LGTM!

Consistent implementation of OTel context capture and first-frame handling in _realtimeGenerationTaskImpl.

Also applies to: 1804-1808, 1896-1903

agents/src/voice/recorder_io/recorder_io.ts (6)

125-129: LGTM!

Passing the last speech end time to takeBuf enables proper alignment between input and output recordings.

139-152: LGTM!

Correct logic for returning the minimum of input/output start times, with proper handling of undefined cases.

562-600: LGTM!

Good improvements to playback finish handling:

Properly handles pause state when calculating finish time

Clamps playback position to actual speech window

Tracks last speech timing for future padding decisions

Logs warning when speech start time is missing

603-621: LGTM!

Good adoption of the InS suffix convention for variables representing seconds. This makes the code much easier to reason about and addresses previous feedback about unit clarity.

731-735: LGTM!

Updated createSilenceFrame to use durationInS parameter name, consistent with the seconds-based naming convention used throughout the file.

680-685: LGTM!

Properly appends trailing silence to the buffer when needed, with correct ms-to-seconds conversion.

agents/src/voice/agent_activity.ts

agents/src/voice/recorder_io/recorder_io.ts

toubatbrian · 2026-01-19T22:43:09Z

agents/src/voice/io.ts


 export interface PlaybackFinishedEvent {
-  // How much of the audio was played back
+  /** How much of the audio was played back, in seconds */


@lukasIO I'm going to keep the name playbackPosition for this PR. Otherwise, it will trigger a lot of renamings to playbackPositionInS, which I will do in a different PR.

makes sense, the comment is already helpful, thank you!

save

2eb8d02

toubatbrian changed the title ~~Refine timestamps in spans and recording alignment~~ [AGT-2450] Refine timestamps in spans and recording alignment Jan 16, 2026

toubatbrian changed the title ~~[AGT-2450] Refine timestamps in spans and recording alignment~~ https://linear.app/livekit/issue/AGT-2450/refine-timestamps-in-spans-and-recording-alignment Jan 16, 2026

toubatbrian changed the title ~~https://linear.app/livekit/issue/AGT-2450/refine-timestamps-in-spans-and-recording-alignment~~ Refine timestamps in spans and recording alignment Jan 16, 2026

chatgpt-codex-connector bot reviewed Jan 16, 2026

agents/src/voice/generation.ts (resolved)

Create lazy-spies-worry.md

324d4dc

toubatbrian requested a review from lukasIO January 16, 2026 21:41

Update datastream_io.ts

8f38e2c

chatgpt-codex-connector bot reviewed Jan 16, 2026

agents/src/telemetry/traces.ts (resolved)

lukasIO reviewed Jan 19, 2026

toubatbrian added 2 commits January 19, 2026 14:09

fix review comments

c8c7ae5

Update recorder_io.ts

6a77734

coderabbitai bot reviewed Jan 19, 2026

agents/src/voice/agent_activity.ts (resolved)

agents/src/voice/recorder_io/recorder_io.ts (resolved)

Update recorder_io.ts

fc79680

toubatbrian requested a review from lukasIO January 19, 2026 22:17

toubatbrian added 5 commits January 19, 2026 14:22

fix lint

bd20934

Merge branch 'main' into brian/refine-ts-recording

6293994

Update traces.ts

c77fbae

fix lint

fef7fd0

Update io.ts

8b6aaed

toubatbrian commented Jan 19, 2026

toubatbrian added 2 commits January 20, 2026 12:35

Merge branch 'main' into brian/refine-ts-recording

1927f0b

Update inworld_tts.ts

2fe2557

lukasIO approved these changes Jan 21, 2026

toubatbrian merged commit 25df43a into main Jan 21, 2026
8 checks passed

toubatbrian deleted the brian/refine-ts-recording branch January 21, 2026 19:04

github-actions bot mentioned this pull request Jan 21, 2026

Version Packages #987

Merged

github-actions bot mentioned this pull request Jan 21, 2026

Version Packages tillkolter/livekit-agents-js#1

Open

This was referenced Jan 22, 2026

feat: implement TTS aligned transcripts #990

Merged

Add agent activity interruption detector integration #991

Merged

This was referenced Jan 30, 2026

Feat/barge in #1002

Open

Migrate user span #1027

Merged
