Cancel Package Deployment Jobs?

Started by Alwin, May 20, 2026, 03:00:14 PM

Alwin

The new overview for package deployment jobs in version 6.x and later is generally great.
However, we haven't yet found a way to stop jobs that are in the "Failed" status. The system keeps trying to run them over and over again. How can we delete these jobs? Right-clicking only allows us to hide all jobs.
Does anyone have any ideas?

Alex Kirhenshtein

The retrying is by design, and so is the fact that "Hide" is the only menu option on a Failed row. Here's what's actually happening:

When a package deployment attempt fails with a transient error (e.g. "Unable to connect to agent"), the server marks the current job as Failed and creates a new job in Scheduled state, set to run 10 minutes later. So you're not looking at one stuck job; you're looking at a chain of jobs, with a new one added roughly every 10 minutes.

Consequences:

  • You can't cancel a Failed job. The cancel action is only valid against jobs in Scheduled state. That's why "Hide" is the only thing the right-click gives you on a failed row — it's a view filter, not a state change.
  • To break the loop, cancel the next Scheduled job in the chain, not the failed ones. Look for the row whose status is Scheduled (timestamped ~10 min after the most recent failure) and right-click → Cancel. After that, no new job will be created and the chain stops.
  • Failed rows are cleaned up automatically by the housekeeper based on the server config PackageDeployment.JobRetentionTime (in days, default 7). Drop it temporarily (e.g. to 1) if you want them gone faster, then restore.
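
To make the chain behaviour concrete, here is a minimal Python sketch of the state logic described above. It is a toy model, not the server's actual code: the names `Job`, `Status`, `run_job` and `cancel` are made up for illustration, and only the 10-minute delay and the "cancel works only on Scheduled" rule come from the explanation above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum

RETRY_DELAY = timedelta(minutes=10)  # retry interval described above


class Status(Enum):
    SCHEDULED = "Scheduled"
    COMPLETED = "Completed"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


@dataclass
class Job:  # hypothetical model of one row in the job list
    node: str
    run_at: datetime
    status: Status = Status.SCHEDULED


def run_job(job: Job, chain: list, agent_reachable: bool) -> None:
    """Run a job; on a transient failure, append a retry to the chain."""
    if job.status is not Status.SCHEDULED:
        return  # a cancelled job never runs, so it never creates a retry
    if agent_reachable:
        job.status = Status.COMPLETED
        return
    job.status = Status.FAILED
    chain.append(Job(job.node, job.run_at + RETRY_DELAY))


def cancel(job: Job) -> None:
    """Cancel a job; only valid while the job is still Scheduled."""
    if job.status is not Status.SCHEDULED:
        raise ValueError("only Scheduled jobs can be cancelled")
    job.status = Status.CANCELLED


start = datetime(2026, 5, 20, 15, 0)
chain = [Job("icmp-only-node", start)]
run_job(chain[0], chain, agent_reachable=False)  # fails, retry is queued
run_job(chain[1], chain, agent_reachable=False)  # fails again, next retry
cancel(chain[-1])                                # cancel the Scheduled tail
run_job(chain[-1], chain, agent_reachable=False)  # no effect, chain stops
```

In this model, calling `cancel` on either of the two Failed jobs raises an error, while cancelling the Scheduled tail leaves the chain at three rows with nothing left to run.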

Worth checking the errorMessage on the failed rows too — if it's connectivity-related, cancelling alone is just a workaround; fixing agent reachability will stop new failures from being created.

Alwin

Hi Alex,

Thank you for your detailed explanation.

In our case, we cannot see the scheduled jobs and therefore cannot delete them either, as they do not appear in the list (Client 6.1.1).

As you suspected, the error is "Unable to connect to agent." This situation arose because a package was deployed to a node that is monitored only via ICMP (Agent Communication Option - "Use ICMP ping on primary IP address to determine node status"). How can we prevent this? We regularly deploy packages to a cluster that consists of a mix of agent-monitored and ICMP-only nodes.

It would be very helpful if we could disable the automatic retry via a central configuration option.

Best regards,
Alwin

Alwin

Update:
We can see the active jobs for a short time, about 10 minutes after they start, but they run so fast that there is no way to cancel them before they disappear from the list again.

Alwin

Hi Alex,

I have pinned "Package Deployment Jobs" to the pinboard. In this view, the status of the scheduled jobs is not visible. However, if I access "Package Deployment Jobs" directly through the configuration, the data is refreshed and suddenly all jobs are visible.

If we now try to cancel the jobs, we get the error "Request is out of state".

eugene1

I have the same problem (6.1.2).
Stopping a job doesn't work.

I followed the instructions: I stopped the last job in the chain, and its status changed to canceled. After 10 minutes, everything restarts.
An example is in the picture.

How can I force a job to stop in an emergency? Maybe there is a key in the settings that prevents a job from restarting after it fails?

So I think the best way is to add a custom parameter to the server configuration, LimitFailCounterPackageDeploy, with a non-zero default value, 100 for example.
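
Roughly what I have in mind, as a Python sketch. Nothing here exists in the server today: `should_retry` and `simulate_unreachable_node` are made-up names, and `limit` stands in for the proposed LimitFailCounterPackageDeploy value. I am assuming the limit counts failed attempts in total, including the first one.

```python
def should_retry(failed_attempts: int, limit: int = 100) -> bool:
    """Return True while the number of failures is still below the limit.

    `limit` stands in for the proposed LimitFailCounterPackageDeploy
    parameter (hypothetical, not an existing server setting).
    """
    return failed_attempts < limit


def simulate_unreachable_node(limit: int) -> int:
    """Count deployment attempts against a node that never answers."""
    attempts = 1  # the initial deployment attempt, which fails
    while should_retry(attempts, limit):
        attempts += 1  # every retry fails as well
    return attempts
```

With this rule the chain ends on its own once the limit is reached, and a limit of 0 or 1 means the first failure is final, which would also cover the request above for a central option to disable automatic retry.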