Fix #630: Prevent duplicate HTTP task execution via status update and lock renewal #681

Deepak1101100 · 2025-12-06T09:36:20Z

Fixes #630

What
Prevents duplicate execution of long-running HTTP tasks (and other async system tasks) by:
Persisting task status as IN_PROGRESS before blocking execution begins.
Renewing the workflow lock during long-running system task execution.

Why
Duplicate HTTP task execution was caused by two core issues:
Task status remained SCHEDULED in the database during execution. Because the status was only persisted after the blocking HTTP call completed, WorkflowRepairService incorrectly detected the task as stuck and re-queued it.

Workflow lock expired during long-running execution. The default 60-second lease expired while HTTP tasks ran for several minutes, allowing other workers to acquire the lock on the same workflow and execute the task again.
This resulted in:
Duplicate outbound HTTP requests
Concurrent workflow decisions
Inconsistent workflow state

Fixes

Persist IN_PROGRESS before blocking call
For async system tasks that actually block, the task is now marked IN_PROGRESS and persisted before invoking systemTask.start():

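// Persist IN_PROGRESS first so the task is no longer SCHEDULED in the database
// while the blocking call below is running.
if (systemTask.isAsync() && systemTask.isAsyncComplete(task)) {
    task.setStatus(TaskModel.Status.IN_PROGRESS);
    task.setWorkerId(Utils.getServerId());
    executionDAOFacade.updateTask(task);
}
systemTask.start(workflow, task, workflowExecutor);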

This prevents premature re-queueing by WorkflowRepairService.
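To illustrate why the ordering matters, here is a minimal, self-contained sketch. It does not use Conductor's classes: `InMemoryTaskStore`, `looksStuck`, and `execute` are hypothetical stand-ins, and the rule "a task still persisted as SCHEDULED looks stuck" is an assumption made for illustration, not the actual WorkflowRepairService logic.

```java
import java.util.ArrayList;
import java.util.List;

public class PersistBeforeBlockSketch {
    enum Status { SCHEDULED, IN_PROGRESS, COMPLETED }

    /** Hypothetical stand-in for the task store; records every persisted status. */
    static final class InMemoryTaskStore {
        final List<Status> writes = new ArrayList<>();
        Status persisted = Status.SCHEDULED;

        void update(Status status) {
            persisted = status;
            writes.add(status);
        }
    }

    /** Assumed repair rule: a task still persisted as SCHEDULED looks stuck. */
    static boolean looksStuck(InMemoryTaskStore store) {
        return store.persisted == Status.SCHEDULED;
    }

    /**
     * Optionally persists IN_PROGRESS, then simulates the blocking call.
     * Returns what a repair sweep running during the call would conclude.
     */
    static boolean execute(InMemoryTaskStore store, boolean persistEarly) {
        if (persistEarly) {
            store.update(Status.IN_PROGRESS); // written before the blocking call
        }
        boolean requeuedDuringCall = looksStuck(store); // sweep while "blocked"
        store.update(Status.COMPLETED); // written after the call returns
        return requeuedDuringCall;
    }

    public static void main(String[] args) {
        System.out.println("without early persist, re-queued: "
                + execute(new InMemoryTaskStore(), false));
        System.out.println("with early persist, re-queued: "
                + execute(new InMemoryTaskStore(), true));
    }
}
```

With early persistence the sketch performs two writes (IN_PROGRESS, then COMPLETED), which corresponds to the additional updateTask() call mentioned under Unit Tests.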

Workflow lock renewal for long-running tasks
A periodic lock renewal mechanism is added using ScheduledExecutorService, following the same watchdog pattern used by Redisson distributed locks. This implements the long-term solution explicitly suggested by maintainers in PR #633 ("Fix duplicate HTTP task execution via WorkflowRepairService race condition"):

“Longer term – we should extend the lease while working on the long running system tasks.”

Locks are renewed at a fixed interval (half of the lease time) while the task is executing, and released in a finally block to prevent leaks.
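The renewal pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `LeaseLock`, `runWithRenewal`, and the lock id are hypothetical names, and the sketch assumes that cleanup means both cancelling the renewal job and releasing the lock.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LockRenewalSketch {
    /** Hypothetical lock abstraction; not Conductor's actual lock interface. */
    interface LeaseLock {
        void renew(String lockId, long leaseMillis);
        void release(String lockId);
    }

    /** Runs the work while renewing the lease every leaseMillis / 2. */
    static void runWithRenewal(ScheduledExecutorService scheduler, LeaseLock lock,
                               String lockId, long leaseMillis, Runnable work) {
        long interval = Math.max(1, leaseMillis / 2);
        ScheduledFuture<?> renewal = scheduler.scheduleAtFixedRate(
                () -> lock.renew(lockId, leaseMillis),
                interval, interval, TimeUnit.MILLISECONDS);
        try {
            work.run(); // the long-running, blocking task
        } finally {
            renewal.cancel(false); // stop renewing even if the work throws
            lock.release(lockId);  // assumption: the caller owns the lock
        }
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AtomicInteger renewals = new AtomicInteger();
        AtomicInteger releases = new AtomicInteger();
        LeaseLock lock = new LeaseLock() {
            @Override public void renew(String id, long lease) { renewals.incrementAndGet(); }
            @Override public void release(String id) { releases.incrementAndGet(); }
        };
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            // 100 ms lease, 350 ms of work: the lease must be renewed to survive.
            runWithRenewal(scheduler, lock, "workflow-1", 100, () -> sleepQuietly(350));
        } finally {
            scheduler.shutdownNow();
        }
        System.out.println("renewed at least once: " + (renewals.get() >= 1));
        System.out.println("releases: " + releases.get());
    }
}
```

Renewing at half the lease time leaves one full interval of slack if a single renewal is delayed. Note that scheduleAtFixedRate suppresses later runs once the scheduled task throws, so a real renewal job would likely need to catch and log failures inside the task.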

Testing
Manual
Long-running HTTP task (120s+ delay)
Verified:
Single execution (previously executed up to 4x)
No re-queue during execution
No workflow lock expiration

Unit Tests
Updated existing Spock test:
AsyncSystemTaskExecutorTest.groovy

Adjusted expectations for:
Early IN_PROGRESS persistence
Additional updateTask() call

All core tests pass:
./gradlew :conductor-core:test

Notes
This is my first contribution to Conductor. The lock renewal implementation directly follows standard distributed lock watchdog patterns and maintainer guidance from prior reviews. Happy to refine based on feedback.

PR Checklist

[✔] Bug reproduced locally
[✔] Root cause identified
[✔] Fix implemented
[✔] Existing unit tests updated
[✔] All core tests passing (./gradlew :conductor-core:test)

Fix conductor-oss#630: persist IN_PROGRESS before execution and add l…

8bd97e8

…ock renewal for async system tasks
