BibSched: tasks not halting the queue on failure
Reported by: jlavik | Owned by: Alessio Deiana <alessio.deiana@…>
We all know that BibTasks sometimes fail, be it a dbdump failing or an oaiharvest timing out, causing the queue to exit automatic mode until human operators "arrive on the scene". This can often happen in the middle of the night or at other times when human operators are far away from "the scene". For some production systems, having the BibSched queue halt for several hours can seriously harm the flow of execution and the service, even in the middle of the night, when nightly tasks such as harvesting are scheduled to run. Some of these failures are more harmless than others, but no matter the cause, the queue stops.
Now, there are two ways of attacking this problem, besides having human operators more readily available. One way could be to add a configurable option to scheduled BibTasks so that they do not stop the queue on failure. For example, a failing dbdump task can be a serious matter in itself, but it does not harm the running service per se. Of course, operators should still be made aware of the issue via the normal channels, but the queue should move on as usual.
A second, possibly complementary, option would be to go through the different BibTasks, define which errors are more significant than others, and have the less significant errors fail "silently", without stopping the queue.
Whatever the option, it should be easy to configure per instance which tasks can and cannot cause the BibSched queue to halt.
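To make the proposal more concrete, here is a minimal sketch of how the two options could combine into one per-instance decision. It is only an illustration: `NON_BLOCKING_TASKS`, `MINOR_ERRORS`, `TaskFailure`, `should_halt_queue` and `handle_failure` are hypothetical names invented for this example, not existing BibSched code or configuration variables, and the error kinds are placeholders.

```python
from dataclasses import dataclass

# Hypothetical per-instance settings (illustrative names only, not real
# Invenio configuration variables).
NON_BLOCKING_TASKS = frozenset({"dbdump", "oaiharvest"})  # option 1: per task
MINOR_ERRORS = frozenset({"timeout"})                     # option 2: per error


@dataclass
class TaskFailure:
    task_name: str   # e.g. "dbdump"
    error_kind: str  # placeholder classification, e.g. "timeout"


def should_halt_queue(failure,
                      non_blocking_tasks=NON_BLOCKING_TASKS,
                      minor_errors=MINOR_ERRORS):
    """Return True if this failure should take the queue out of automatic mode."""
    if failure.task_name in non_blocking_tasks:
        return False  # the task is configured as not critical for the service
    if failure.error_kind in minor_errors:
        return False  # the error is classified as less significant
    return True


def handle_failure(failure, notify):
    """Always alert operators, then report whether the queue should halt."""
    notify("Task %s failed (%s)" % (failure.task_name, failure.error_kind))
    return should_halt_queue(failure)
```

With these example settings, a failing dbdump would still notify operators but leave the queue running, while any task not listed, failing with an error not classified as minor, would halt the queue as it does today.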
Change History (20)
Changed 2 years ago by adeiana
comment:15 Changed 2 years ago by Alessio Deiana <alessio.deiana@…>
- Owner set to Alessio Deiana <alessio.deiana@…>
- Resolution set to fixed
- Status changed from in_merge to closed