@atsareg atsareg commented Jan 19, 2026

Jobs that fail during initialization (e.g. while downloading the input sandbox) are rescheduled by the JobWrapper, which returns a non-zero status. JobAgent checks the job status after a delay, which can be a few minutes in the case of a Pool/Singularity inner CE. By that time the job may already have been rescheduled, matched and started running at another site. JobAgent sets the job status to Failed whenever it finds the job in Running status, even if the job is running elsewhere after a successful rescheduling. In that case the Failed status is wrong and should not be set.
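
To make the intended behaviour concrete, here is a minimal sketch of the decision described above. It is not the actual JobAgent code: the function name `status_to_set`, the status strings and the numeric value of `RESCHEDULED` are assumptions chosen for illustration only.

```python
from typing import Optional

# Hypothetical stand-ins: the real JobAgent/JobWrapper constants may differ.
RESCHEDULED = 1001  # placeholder value, assumed for this sketch only
RUNNING = "Running"
FAILED = "Failed"


def status_to_set(wrapper_exit_code: int, current_status: str) -> Optional[str]:
    """Return the status the agent should set, or None to leave the job as it is."""
    if wrapper_exit_code == 0:
        # The wrapper reported success: nothing to correct.
        return None
    if wrapper_exit_code == RESCHEDULED:
        # The job was rescheduled and may already be Running at another site,
        # so a Running status seen now must not be overwritten.
        return None
    if current_status == RUNNING:
        # Non-zero exit, not rescheduled, still marked Running: a real failure.
        return FAILED
    # Any other status is treated as valid and preserved.
    return None
```

With these assumptions, `status_to_set(RESCHEDULED, "Running")` leaves the job untouched, while `status_to_set(2, "Running")` yields `"Failed"`.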

BEGINRELEASENOTES

*WorkloadManagement
FIX: JobAgent - do not fail already rescheduled job

ENDRELEASENOTES

@atsareg atsareg added the alsoTargeting:integration label (Cherry pick this PR to integration after merge) Jan 19, 2026
@atsareg atsareg changed the title from "[8.0] JobAgent - do not reschedule already rescheduled job" to "[8.0] JobAgent - do not fail already rescheduled job" Jan 19, 2026
job.sendJobAccounting(status=JobStatus.FAILED, minorStatus=JobMinorStatus.UPLOADING_JOB_OUTPUTS)

-return 2
+return JW.FINALIZATION_FAILED

Return value 2 is now "PAYLOAD_FAILED" but here you are returning 3 for "FINALIZATION_FAILED". I do not know if this is on purpose.
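
The concern can be illustrated with a small sketch of named exit codes. Only the values 2 and 3 come from this comment; the enum name `WrapperExit`, the helper function and the `SUCCESS = 0` member are assumptions made for the example and do not reflect the actual definitions in the JobWrapper module.

```python
from enum import IntEnum


class WrapperExit(IntEnum):
    """Hypothetical exit codes; only 2 and 3 are taken from the review comment."""

    SUCCESS = 0  # assumed for illustration
    PAYLOAD_FAILED = 2
    FINALIZATION_FAILED = 3


def exit_code_for_finalization_failure() -> int:
    """Return the numeric code a finalization failure would produce."""
    return int(WrapperExit.FINALIZATION_FAILED)


# The old literal `2` would now be read back as a payload failure,
# so replacing it with the named constant changes the returned number.
old_literal_meaning = WrapperExit(2)
new_code = exit_code_for_finalization_failure()
```

Under these assumptions the old literal maps to `PAYLOAD_FAILED`, whereas the new return value is 3, so any caller comparing against 2 would behave differently after the change.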

job.sendFailoverRequest()
job.sendJobAccounting(status=JobStatus.FAILED, minorStatus=JobMinorStatus.UPLOADING_JOB_OUTPUTS)
-return 2
+return JW.FINALIZATION_FAILED

Same here.

# In order to avoid overriding perfectly valid states, the status is updated iff the job was running
# The payload failed (if result["Value"] is not 0 and the job was not rescheduled)
elif result["Value"] and result["Value"] != RESCHEDULED:
# In order to avoid overriding perfectly valid states, the status is updated if the job was running

I think the original "iff" was there for "if and only if".
