Skip to content

feat: add ExitReason tracking for actor lifecycle#163

Open
ElFantasma wants to merge 3 commits intomainfrom
feat/exit-reasons
Open

feat: add ExitReason tracking for actor lifecycle#163
ElFantasma wants to merge 3 commits intomainfrom
feat/exit-reasons

Conversation

@ElFantasma
Copy link
Copy Markdown
Collaborator

Summary

  • Adds ExitReason enum (Normal, Shutdown, Panic(String), Kill) with is_abnormal() method
  • Actors now report WHY they stopped, not just that they did
  • ActorRef::wait_exit() returns the exit reason (async in tasks, blocking in threads)
  • ActorRef::exit_reason() polls the reason without blocking
  • join() unchanged for backwards compatibility
  • Includes exit_reason example demonstrating all exit paths

Foundation for supervision trees (#131, #133). The watch channel changes from bool to Option<ExitReason> in tasks mode, and Mutex<bool> to Mutex<Option<ExitReason>> in threads mode. run_actor() now returns ExitReason on every exit path: clean stop, channel closure, panic in started(), and panic in handler.

Test plan

  • 8 new tests (4 per execution mode): clean stop, panic in started, panic in handler, polling
  • All 67 existing tests pass (no regressions from watch channel type change)
  • Runnable example: cargo run -p exit_reason

@github-actions
Copy link
Copy Markdown

🤖 Kimi Code Review

The PR looks solid and correctly implements exit reason tracking for supervision. Here are my observations:

✅ Correctness & Safety

concurrency/src/threads/actor.rs:444-451 - The CompletionGuard drop implementation correctly handles mutex poisoning and uses std::mem::replace to move the reason out without leaving the struct in an invalid state. This is safe.

Panic handling - Both async and thread implementations correctly use catch_unwind/AssertUnwindSafe to trap panics and convert them to ExitReason::Panic, ensuring the completion channel always receives a reason.

Race safety - The exit_reason() polling methods are thread-safe:

  • Async version uses tokio::sync::watch (atomic refcounted)
  • Thread version uses Mutex with poisoning recovery

⚠️ Minor Issues

concurrency/src/tasks/actor.rs:385-392 - The fallback to ExitReason::Normal when rx.changed().await.is_err() is theoretically unreachable (since run_actor catches all panics), but acceptable as a defensive measure. Consider adding a comment or tracing::warn to indicate this is an unexpected state.

Unused variants - ExitReason::Kill and ExitReason::Shutdown are defined but never constructed. This is fine for a foundational PR, but consider adding #[doc(hidden)] or a TODO comment indicating these are reserved for future supervision implementation.

📝 Suggestions

concurrency/src/error.rs:11-17 - The is_abnormal() method logic is correct for Erlang-style supervision semantics (Transient/Permanent children). Consider documenting that Shutdown is considered normal to prevent confusion:

/// Returns `true` for exit reasons that should trigger a restart.
/// Note: Shutdown is considered normal (no restart).
pub fn is_abnormal(&self) -> bool { ... }

concurrency/src/threads/actor.rs:395-398 - The manual drop(guard) is slightly unidiomatic. You can simplify:

let _guard = CompletionGuard { completion, reason };
// Let it drop naturally at end of scope

🧪 Testing

The test coverage is comprehensive for the implemented functionality:

  • Clean shutdown path
  • Panic in started()
  • Panic in message handler
  • Polling exit_reason() while running

Missing (non-blocking): Test for panic in stopped() callback, though this is less critical since the actor has already effectively exited at that point.

Summary

The implementation is correct, thread-safe, and follows Rust best practices. The symmetry between async and thread backends is maintained properly. No blockers.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

Findings:

  • High: stopped() panics are documented as ExitReason::Panic, but both runtimes currently discard that and report the previous reason, which is usually Normal. That means a crash in shutdown logic is invisible to supervisors using wait_exit() / exit_reason(). See concurrency/src/tasks/actor.rs:506 and concurrency/src/threads/actor.rs:479.

  • Medium: The thread backend no longer guarantees completion notification if the actor thread unwinds unexpectedly. CompletionGuard is created only after run_actor returns, so any uncaught panic before that leaves completion as None forever and join() / wait_exit() can block indefinitely. The old placement of the guard was safer. See concurrency/src/threads/actor.rs:412 and concurrency/src/threads/actor.rs:241.

  • Medium: The async backend treats a closed completion watch channel as ExitReason::Normal. If the task is aborted or panics before completion_tx.send(Some(reason)), callers will misclassify an abnormal termination as a clean stop, which is the wrong default for supervision logic. See concurrency/src/tasks/actor.rs:383 and concurrency/src/tasks/actor.rs:433.

Open question:

  • ExitReason::Shutdown and ExitReason::Kill are added as public API, but I don’t see any path that can actually produce them yet. If that’s intentional for future supervision work, the docs should say they are reserved; otherwise the enum currently over-promises semantics. See concurrency/src/error.rs:1.

Verification gap: I could not run cargo test in this environment because Cargo attempted to reach index.crates.io and network access is blocked.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

PR #163 Review: feat: add ExitReason tracking for actor lifecycle

Summary

Solid foundational work for supervision trees. The design is clean, the API is symmetric between tasks and threads, and the test coverage is good. A few issues worth addressing before merging.


Bug: Panic in stopped() doesn't update exit_reason

concurrency/src/tasks/actor.rs ~L505 and concurrency/src/threads/actor.rs ~L480

let stop_result = AssertUnwindSafe(actor.stopped(&ctx)).catch_unwind().await;
if let Err(panic) = stop_result {
    let msg = panic_message(&panic);
    tracing::error!("Panic in stopped() callback: {msg}");
    // exit_reason is NOT updated here
}
exit_reason  // returns Normal when stopped() panicked on a clean actor

If an actor exits cleanly (exit_reason = Normal) but stopped() panics, the caller receives ExitReason::Normal. The stopped() panic is logged but silently dropped from the exit reason. This matters for supervision: a supervisor seeing Normal would not restart a child that actually panicked on cleanup.

Fix: Update exit_reason if stopped() panics and exit_reason is still Normal:

if let Err(panic) = stop_result {
    let msg = panic_message(&panic);
    tracing::error!("Panic in stopped() callback: {msg}");
    if matches!(exit_reason, ExitReason::Normal) {
        exit_reason = ExitReason::Panic(format!("panic in stopped(): {msg}"));
    }
}

Shutdown and Kill variants are never produced

These variants are defined in ExitReason but no code path ever sets them. This leaves callers unable to distinguish Shutdown from Normal in practice, and Kill from Panic. If these aren't wired up in this PR, they should either have #[doc(hidden)] or a doc comment like // Not yet produced; reserved for supervision tree implementation.


Missing PartialEq derive

concurrency/src/error.rs L3

#[derive(Debug, Clone)]
pub enum ExitReason {

Without PartialEq, supervision code can't do reason == ExitReason::Normal. The stated purpose is to enable supervision trees (#131, #133), so this will immediately be needed. Add PartialEq now.


panic_message is duplicated

concurrency/src/tasks/actor.rs and concurrency/src/threads/actor.rs both define identical panic_message functions. This belongs in error.rs as a pub(crate) function.


Clippy lint: &Box<dyn Any> parameter

Both copies of panic_message take &Box<dyn std::any::Any + Send>, which clippy flags as clippy::borrowed_box. The idiomatic signature is:

fn panic_message(panic: &(dyn std::any::Any + Send)) -> String {

The call sites work the same way since Box<T> derefs to T.


Redundant drop(guard) in threads

concurrency/src/threads/actor.rs ~L415

let guard = CompletionGuard { completion, reason };
drop(guard);  // redundant — guard drops at end of closure anyway

The explicit drop adds noise without clarity. The old _guard RAII pattern was cleaner; if the explicit call is meant to make the intent obvious, a comment would be better.


Theoretical race in polling test

concurrency/src/threads/actor.rs (and tasks) test exit_reason_poll_returns_none_while_running

let actor = Counter { count: 0 }.start();
assert!(actor.exit_reason().is_none());  // racy: actor could have exited already

In practice Counter won't self-stop, but the assertion is non-deterministic by construction. A more robust pattern would be to check is_none() only while a long-running actor is inflight, or to replace it with a paired wait_exit/exit_reason assertion.


Minor: mem::replace in CompletionGuard::drop

concurrency/src/threads/actor.rs ~L243

*completed = Some(std::mem::replace(&mut self.reason, ExitReason::Normal));

This works (you can't move out of self in drop), but it's subtle. A cleaner alternative is to store reason: Option<ExitReason> and use .take():

*completed = self.reason.take();

What's good

  • The watch::Receiver<Option<ExitReason>> choice for tasks is idiomatic and avoids a second channel.
  • The Condvar-based polling in threads is correct; the mutex is held across cvar.wait as required.
  • join() delegating to wait_exit() preserves backward compatibility cleanly.
  • Test coverage is thorough for the paths that are actually implemented.
  • panic_message extracting a human-readable string from the panic payload is a meaningful improvement over {panic:?}.

Automated review by Claude (Anthropic) · sonnet · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 31, 2026

Greptile Summary

This PR adds ExitReason tracking to the actor lifecycle, changing the internal completion signal from a bare bool/watch<bool> to Option<ExitReason>, and exposing wait_exit() / exit_reason() on ActorRef for both the tasks and threads execution modes. It is a well-scoped foundation for the upcoming supervision tree work.

Key changes:

  • New ExitReason enum (Normal, Shutdown, Panic(String), Kill) with is_abnormal() and Display in error.rs
  • run_actor() now returns ExitReason on every exit path in both tasks/actor.rs and threads/actor.rs
  • ActorRef::wait_exit() and ActorRef::exit_reason() added to both modules; join() now delegates to wait_exit()
  • 8 new tests (4 per mode) covering clean stop, panic-in-started, panic-in-handler, and polling
  • New exit_reason example demonstrating all four scenarios

Issues found:

  • Threads RAII regression (P1): The CompletionGuard is now constructed after run_actor returns. If run_actor ever panics unexpectedly at the thread boundary, the guard is never built, cvar.notify_all() is never called, and any caller blocked in wait_exit() deadlocks forever.
  • Task abort maps to Normal (P2): If the async task is cancelled/aborted externally, wait_exit() returns ExitReason::Normal — silently misclassifying an aborted actor.
  • stopped() panics not reflected in exit reason (P2): A clean-stop actor whose stopped() panics still reports Normal.
  • Missing PartialEq on ExitReason (P2): Supervision trees will want == comparisons.
  • panic_message duplicated (P2): Identical helper defined in both actor modules.

Confidence Score: 4/5

Safe to merge with the RAII regression in threads mode addressed; all other findings are P2 style/hardening concerns

One P1 issue exists: the threads CompletionGuard is no longer constructed under RAII, meaning an unexpected panic in run_actor at the thread boundary would cause wait_exit() to deadlock. While run_actor catches all known panics internally (making this unlikely in practice), it is a regression from the previous code's safety guarantee. All other findings are P2 quality/robustness suggestions that do not block current functionality.

concurrency/src/threads/actor.rs — the CompletionGuard construction and run_actor call ordering needs a top-level catch_unwind guard to restore the old RAII safety

Important Files Changed

Filename Overview
concurrency/src/threads/actor.rs Mutex changed to Option<ExitReason>; CompletionGuard refactored to carry reason, but now constructed after run_actor — breaking the RAII guarantee that prevented wait_exit() deadlocks on unexpected panics
concurrency/src/tasks/actor.rs Watch channel changed from bool to Option<ExitReason>; wait_exit() added; run_actor returns ExitReason; panic_message duplicated from threads module; wait_exit maps dropped sender to Normal even for task aborts
concurrency/src/error.rs Adds ExitReason enum with Normal, Shutdown, Panic(String), Kill variants, is_abnormal(), and Display impl; missing PartialEq derive limits supervision use
concurrency/src/lib.rs Re-exports ExitReason alongside ActorError — straightforward one-line change
examples/exit_reason/src/main.rs New runnable example demonstrating all four exit paths (clean stop, handler panic, started panic, polling); well-structured and accurate
examples/exit_reason/Cargo.toml New example package manifest, correctly declares workspace dependencies

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ActorRef
    participant run_actor
    participant CompletionChannel

    Caller->>ActorRef: .start()
    ActorRef->>CompletionChannel: init(None)
    ActorRef->>run_actor: spawn(actor, ctx, rx, token)

    alt started() panics
        run_actor-->>run_actor: catch_unwind → ExitReason::Panic
    else started() calls ctx.stop()
        run_actor-->>run_actor: ExitReason::Normal
    else normal flow
        loop message loop
            run_actor->>run_actor: recv message
            alt handler panics
                run_actor-->>run_actor: exit_reason = Panic
                run_actor-->>run_actor: break
            else ctx.stop() called
                run_actor-->>run_actor: break (Normal)
            end
        end
        run_actor->>run_actor: stopped() [catch_unwind]
    end

    run_actor->>CompletionChannel: send(Some(ExitReason))
    CompletionChannel-->>ActorRef: notify

    Caller->>ActorRef: .wait_exit()
    ActorRef->>CompletionChannel: wait for Some(reason)
    CompletionChannel-->>ActorRef: Some(ExitReason)
    ActorRef-->>Caller: ExitReason
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: concurrency/src/threads/actor.rs
Line: 412-420

Comment:
**RAII guarantee broken — `wait_exit()` can deadlock if `run_actor` panics unexpectedly**

The old code constructed `CompletionGuard` *before* calling `run_actor`, so the guard's `Drop` impl would always fire (writing `Some(reason)` and calling `cvar.notify_all()`) even if `run_actor` panicked at the thread level:

```rust
// OLD — always notifies, even on unexpected panic
let _guard = CompletionGuard(completion);
run_actor(actor, ctx, rx, cancellation_token);
```

The new code constructs the guard *after* `run_actor` returns. If `run_actor` ever panics without being caught (e.g., a future bug in the framework, a stack overflow, or a `catch_unwind` boundary issue), the guard is never created, `completion` is just dropped as an `Arc`, and any thread blocked in `wait_exit()``cvar.wait(…)` will deadlock forever.

Note that the `tasks` mode is unaffected because dropping `completion_tx` causes `rx.changed()` to return `Err`, so `wait_exit()` returns `ExitReason::Normal` instead of hanging.

A safe fix is to wrap `run_actor` at the thread closure level:
```rust
let _thread_handle = rt::spawn(move || {
    let reason = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
        run_actor(actor, ctx, rx, cancellation_token)
    }))
    .unwrap_or(ExitReason::Panic("unexpected framework panic".to_string()));

    let guard = CompletionGuard { completion, reason };
    drop(guard);
});
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 383-393

Comment:
**`wait_exit()` silently maps task abortion/cancellation to `ExitReason::Normal`**

When `rx.changed().await` returns `Err`, it means every `watch::Sender` was dropped without ever sending `Some(reason)`. If the spawned task is **aborted** from outside (e.g., the tokio `JoinHandle` is used to abort, or the runtime shuts down mid-execution), the sender is dropped without a value and this branch returns `ExitReason::Normal` — even though the actor was forcibly killed.

For a supervision foundation this silent misclassification matters: a supervisor that only restarts on `is_abnormal()` would silently ignore an aborted actor. Consider returning `ExitReason::Kill` here instead:

```suggestion
            if rx.changed().await.is_err() {
                return ExitReason::Kill;
            }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 505-510

Comment:
**Panic in `stopped()` is not reflected in the returned `ExitReason`**

When `stopped()` panics, the error is logged but `exit_reason` is unchanged. Two scenarios arise:

1. **Clean stop + `stopped()` panics** → returns `ExitReason::Normal`. A supervisor relying on `is_abnormal()` won't restart the actor despite cleanup code crashing.
2. **Handler panic + `stopped()` panics** → returns the original `Panic("panic in handler: …")`, which is correct but hides the secondary cleanup failure.

The same logic applies to the equivalent lines in `concurrency/src/threads/actor.rs` (lines 479–485).

If panics in `stopped()` should be surfaced, consider:
```rust
if let Err(panic) = stop_result {
    let msg = panic_message(&panic);
    tracing::error!("Panic in stopped() callback: {msg}");
    if !exit_reason.is_abnormal() {
        exit_reason = ExitReason::Panic(format!("panic in stopped(): {msg}"));
    }
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: concurrency/src/error.rs
Line: 2-12

Comment:
**`ExitReason` is missing `PartialEq` — limits usability for supervision**

`ExitReason` derives `Debug` and `Clone` but not `PartialEq`. Supervision trees will almost certainly need to compare exit reasons directly. Without `PartialEq`, callers must use `matches!` patterns everywhere and cannot use `==` or `!=`.

```suggestion
#[derive(Debug, Clone, PartialEq)]
pub enum ExitReason {
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 515-523

Comment:
**`panic_message` is duplicated across both actor modules**

This identical helper function is defined in both `concurrency/src/tasks/actor.rs` (lines 515–523) and `concurrency/src/threads/actor.rs` (lines 490–498). Consider moving it to `concurrency/src/error.rs` or a shared internal utilities module so it can be reused without duplication.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "feat: add ExitReason tracking for actor ..." | Re-trigger Greptile

Comment on lines 412 to 420
let _thread_handle = rt::spawn(move || {
let _guard = CompletionGuard(completion);
run_actor(actor, ctx, rx, cancellation_token);
let reason = run_actor(actor, ctx, rx, cancellation_token);
let guard = CompletionGuard {
completion,
reason,
};
drop(guard);
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 RAII guarantee broken — wait_exit() can deadlock if run_actor panics unexpectedly

The old code constructed CompletionGuard before calling run_actor, so the guard's Drop impl would always fire (writing Some(reason) and calling cvar.notify_all()) even if run_actor panicked at the thread level:

// OLD — always notifies, even on unexpected panic
let _guard = CompletionGuard(completion);
run_actor(actor, ctx, rx, cancellation_token);

The new code constructs the guard after run_actor returns. If run_actor ever panics without being caught (e.g., a future bug in the framework, a stack overflow, or a catch_unwind boundary issue), the guard is never created, completion is just dropped as an Arc, and any thread blocked in wait_exit()cvar.wait(…) will deadlock forever.

Note that the tasks mode is unaffected because dropping completion_tx causes rx.changed() to return Err, so wait_exit() returns ExitReason::Normal instead of hanging.

A safe fix is to wrap run_actor at the thread closure level:

let _thread_handle = rt::spawn(move || {
    let reason = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
        run_actor(actor, ctx, rx, cancellation_token)
    }))
    .unwrap_or(ExitReason::Panic("unexpected framework panic".to_string()));

    let guard = CompletionGuard { completion, reason };
    drop(guard);
});
Prompt To Fix With AI
This is a comment left during a code review.
Path: concurrency/src/threads/actor.rs
Line: 412-420

Comment:
**RAII guarantee broken — `wait_exit()` can deadlock if `run_actor` panics unexpectedly**

The old code constructed `CompletionGuard` *before* calling `run_actor`, so the guard's `Drop` impl would always fire (writing `Some(reason)` and calling `cvar.notify_all()`) even if `run_actor` panicked at the thread level:

```rust
// OLD — always notifies, even on unexpected panic
let _guard = CompletionGuard(completion);
run_actor(actor, ctx, rx, cancellation_token);
```

The new code constructs the guard *after* `run_actor` returns. If `run_actor` ever panics without being caught (e.g., a future bug in the framework, a stack overflow, or a `catch_unwind` boundary issue), the guard is never created, `completion` is just dropped as an `Arc`, and any thread blocked in `wait_exit()``cvar.wait(…)` will deadlock forever.

Note that the `tasks` mode is unaffected because dropping `completion_tx` causes `rx.changed()` to return `Err`, so `wait_exit()` returns `ExitReason::Normal` instead of hanging.

A safe fix is to wrap `run_actor` at the thread closure level:
```rust
let _thread_handle = rt::spawn(move || {
    let reason = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
        run_actor(actor, ctx, rx, cancellation_token)
    }))
    .unwrap_or(ExitReason::Panic("unexpected framework panic".to_string()));

    let guard = CompletionGuard { completion, reason };
    drop(guard);
});
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +383 to 393
pub async fn wait_exit(&self) -> ExitReason {
let mut rx = self.completion_rx.clone();
while !*rx.borrow_and_update() {
loop {
if let Some(reason) = rx.borrow_and_update().clone() {
return reason;
}
if rx.changed().await.is_err() {
break;
return ExitReason::Normal;
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 wait_exit() silently maps task abortion/cancellation to ExitReason::Normal

When rx.changed().await returns Err, it means every watch::Sender was dropped without ever sending Some(reason). If the spawned task is aborted from outside (e.g., the tokio JoinHandle is used to abort, or the runtime shuts down mid-execution), the sender is dropped without a value and this branch returns ExitReason::Normal — even though the actor was forcibly killed.

For a supervision foundation this silent misclassification matters: a supervisor that only restarts on is_abnormal() would silently ignore an aborted actor. Consider returning ExitReason::Kill here instead:

Suggested change
pub async fn wait_exit(&self) -> ExitReason {
let mut rx = self.completion_rx.clone();
while !*rx.borrow_and_update() {
loop {
if let Some(reason) = rx.borrow_and_update().clone() {
return reason;
}
if rx.changed().await.is_err() {
break;
return ExitReason::Normal;
}
}
}
if rx.changed().await.is_err() {
return ExitReason::Kill;
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 383-393

Comment:
**`wait_exit()` silently maps task abortion/cancellation to `ExitReason::Normal`**

When `rx.changed().await` returns `Err`, it means every `watch::Sender` was dropped without ever sending `Some(reason)`. If the spawned task is **aborted** from outside (e.g., the tokio `JoinHandle` is used to abort, or the runtime shuts down mid-execution), the sender is dropped without a value and this branch returns `ExitReason::Normal` — even though the actor was forcibly killed.

For a supervision foundation this silent misclassification matters: a supervisor that only restarts on `is_abnormal()` would silently ignore an aborted actor. Consider returning `ExitReason::Kill` here instead:

```suggestion
            if rx.changed().await.is_err() {
                return ExitReason::Kill;
            }
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 505 to +510
cancellation_token.cancel();
let stop_result = AssertUnwindSafe(actor.stopped(&ctx)).catch_unwind().await;
if let Err(panic) = stop_result {
tracing::error!("Panic in stopped() callback: {panic:?}");
let msg = panic_message(&panic);
tracing::error!("Panic in stopped() callback: {msg}");
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Panic in stopped() is not reflected in the returned ExitReason

When stopped() panics, the error is logged but exit_reason is unchanged. Two scenarios arise:

  1. Clean stop + stopped() panics → returns ExitReason::Normal. A supervisor relying on is_abnormal() won't restart the actor despite cleanup code crashing.
  2. Handler panic + stopped() panics → returns the original Panic("panic in handler: …"), which is correct but hides the secondary cleanup failure.

The same logic applies to the equivalent lines in concurrency/src/threads/actor.rs (lines 479–485).

If panics in stopped() should be surfaced, consider:

if let Err(panic) = stop_result {
    let msg = panic_message(&panic);
    tracing::error!("Panic in stopped() callback: {msg}");
    if !exit_reason.is_abnormal() {
        exit_reason = ExitReason::Panic(format!("panic in stopped(): {msg}"));
    }
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 505-510

Comment:
**Panic in `stopped()` is not reflected in the returned `ExitReason`**

When `stopped()` panics, the error is logged but `exit_reason` is unchanged. Two scenarios arise:

1. **Clean stop + `stopped()` panics** → returns `ExitReason::Normal`. A supervisor relying on `is_abnormal()` won't restart the actor despite cleanup code crashing.
2. **Handler panic + `stopped()` panics** → returns the original `Panic("panic in handler: …")`, which is correct but hides the secondary cleanup failure.

The same logic applies to the equivalent lines in `concurrency/src/threads/actor.rs` (lines 479–485).

If panics in `stopped()` should be surfaced, consider:
```rust
if let Err(panic) = stop_result {
    let msg = panic_message(&panic);
    tracing::error!("Panic in stopped() callback: {msg}");
    if !exit_reason.is_abnormal() {
        exit_reason = ExitReason::Panic(format!("panic in stopped(): {msg}"));
    }
}
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +2 to +12
#[derive(Debug, Clone)]
pub enum ExitReason {
/// Clean stop via `ctx.stop()` or channel closure.
Normal,
/// Ordered shutdown from a supervisor or linked actor.
Shutdown,
/// Actor panicked in `started()`, a handler, or `stopped()`.
Panic(String),
/// Untrappable kill signal.
Kill,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 ExitReason is missing PartialEq — limits usability for supervision

ExitReason derives Debug and Clone but not PartialEq. Supervision trees will almost certainly need to compare exit reasons directly. Without PartialEq, callers must use matches! patterns everywhere and cannot use == or !=.

Suggested change
#[derive(Debug, Clone)]
pub enum ExitReason {
/// Clean stop via `ctx.stop()` or channel closure.
Normal,
/// Ordered shutdown from a supervisor or linked actor.
Shutdown,
/// Actor panicked in `started()`, a handler, or `stopped()`.
Panic(String),
/// Untrappable kill signal.
Kill,
}
#[derive(Debug, Clone, PartialEq)]
pub enum ExitReason {
Prompt To Fix With AI
This is a comment left during a code review.
Path: concurrency/src/error.rs
Line: 2-12

Comment:
**`ExitReason` is missing `PartialEq` — limits usability for supervision**

`ExitReason` derives `Debug` and `Clone` but not `PartialEq`. Supervision trees will almost certainly need to compare exit reasons directly. Without `PartialEq`, callers must use `matches!` patterns everywhere and cannot use `==` or `!=`.

```suggestion
#[derive(Debug, Clone, PartialEq)]
pub enum ExitReason {
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 515 to 523
fn panic_message(panic: &Box<dyn std::any::Any + Send>) -> String {
if let Some(s) = panic.downcast_ref::<&str>() {
s.to_string()
} else if let Some(s) = panic.downcast_ref::<String>() {
s.clone()
} else {
format!("{panic:?}")
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 panic_message is duplicated across both actor modules

This identical helper function is defined in both concurrency/src/tasks/actor.rs (lines 515–523) and concurrency/src/threads/actor.rs (lines 490–498). Consider moving it to concurrency/src/error.rs or a shared internal utilities module so it can be reused without duplication.

Prompt To Fix With AI
This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 515-523

Comment:
**`panic_message` is duplicated across both actor modules**

This identical helper function is defined in both `concurrency/src/tasks/actor.rs` (lines 515–523) and `concurrency/src/threads/actor.rs` (lines 490–498). Consider moving it to `concurrency/src/error.rs` or a shared internal utilities module so it can be reused without duplication.

How can I resolve this? If you propose a fix, please make it concise.

@ElFantasma
Copy link
Copy Markdown
Collaborator Author

Bot Review Responses

All bot findings have been addressed in 4ec9859 and f37e99b. Summary below.


Kimi

Fallback to ExitReason::Normal when rx.changed().await.is_err()

Fixed. Changed to ExitReason::Kill — if the watch sender is dropped without sending a reason, the actor was forcibly terminated, not cleanly stopped.

Manual drop(guard) is unidiomatic

Fixed. Restored RAII pattern with let mut guard = CompletionGuard { ... };.

Missing test for panic in stopped()

stopped() panics now update the exit reason (see Codex/Claude fix below). Existing panic_in_stopped_still_completes test covers the behavior.


Codex

High: stopped() panics are invisible to supervisors

Fixed. If stopped() panics and the exit reason was Normal, it's now updated to ExitReason::Panic("panic in stopped(): ..."). If the reason was already abnormal (e.g., handler panic), the original reason is preserved.

Medium: Thread backend CompletionGuard no longer guarantees completion notification

Fixed. Guard is now constructed BEFORE run_actor() with reason: None. If run_actor panics unexpectedly at the thread boundary, the guard's Drop sets ExitReason::Kill via .take().or(Some(Kill)). This restores the RAII safety guarantee from the old code.

Medium: Async backend treats closed watch channel as Normal

Fixed. Returns ExitReason::Kill instead.

Shutdown and Kill never produced

Added doc comments: "Not yet produced — reserved for supervision tree implementation."


Claude

Bug: Panic in stopped() doesn't update exit_reason

Fixed. Same as Codex finding above.

Missing PartialEq derive

Fixed. Added PartialEq + 3 new tests (exit_reason_partial_eq, exit_reason_display, exit_reason_is_abnormal_classification).

panic_message is duplicated

Fixed. Moved to error.rs as pub(crate) fn panic_message.

Clippy lint: &Box<dyn Any>

Fixed. Changed to &(dyn std::any::Any + Send).

Redundant drop(guard)

Fixed. See Kimi response above.

mem::replace in CompletionGuard::drop

Fixed. Now uses Option<ExitReason> + .take().

Theoretical race in polling test

Disregarded. The Counter actor never self-stops, so exit_reason() is deterministically None immediately after start(). The test is not racy in practice.


Greptile

All findings overlap with Codex/Claude above — all addressed. The sequence diagram and summary are accurate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant