TL;DR
In this post, I will walk through how to use timeouts and retries with Celery.
The Code
Refer to this gist for the full code.
Code Walkthrough
This section covers the functions and their results.
Func: timeout_test
This simple function shows how a timeout exception can be caught. Note that the timeout has been defined in the configuration:
"task_soft_time_limit": "30",
The results look something like this –
[2018-09-04 10:28:24,360: WARNING/ForkPoolWorker-8] Bello Merld at 2018-09-04T10:28:24.360234
[2018-09-04 10:28:24,360: WARNING/ForkPoolWorker-8]
[2018-09-04 10:28:24,917: WARNING/ForkPoolWorker-5] Gonna start a long task!
[2018-09-04 10:28:54,917: WARNING/MainProcess] Soft time limit (30.0s) exceeded for tasks.timeout_test[b6d6a7fc-9110-4847-8f19-9c57908983e3]
[2018-09-04 10:28:55,755: WARNING/ForkPoolWorker-5] Clean up now!
Func: retry_timeout_test
This second function shows how a timeout exception can be combined with retries.
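A hedged reconstruction of what retry_timeout_test might look like, reusing the app object from the sketch above (the random failure condition and the subprocess command are my assumptions, chosen to match the log output):

```python
import random
import subprocess

from celery.exceptions import SoftTimeLimitExceeded

@app.task(bind=True, default_retry_delay=30)
def retry_timeout_test(self):
    try:
        print("Gonna start a long task!")
        n = random.randint(1, 10)
        print("n is %s" % n)
        if n > 5:
            print("Raising an Error")
            raise ValueError("simulated transient failure")
        print("Running subprocess")
        subprocess.check_call(["sleep", "60"])  # long enough to hit the soft limit
    except SoftTimeLimitExceeded:
        print("Clean up now!")
    except ValueError as ex:
        print("Retrying now")
        # re-queues the task; it runs again after default_retry_delay seconds
        raise self.retry(exc=ex)
```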
The results look something like this –
[2018-09-04 11:06:38,872: WARNING/ForkPoolWorker-1] Bello Merld at 2018-09-04T11:06:38.871975
[2018-09-04 11:06:39,868: WARNING/ForkPoolWorker-6] Gonna start a long task!
[2018-09-04 11:06:39,869: WARNING/ForkPoolWorker-6] n is 7
[2018-09-04 11:06:39,869: WARNING/ForkPoolWorker-6] Raising an Error
[2018-09-04 11:06:39,870: WARNING/ForkPoolWorker-6] Retrying now
[2018-09-04 11:07:09,915: WARNING/ForkPoolWorker-5] Gonna start a long task!
[2018-09-04 11:07:09,916: WARNING/ForkPoolWorker-5] n is 2
[2018-09-04 11:07:09,916: WARNING/ForkPoolWorker-5] Running subprocess
[2018-09-04 11:07:40,122: WARNING/MainProcess] Soft time limit (30.0s) exceeded for tasks.retry_timeout_test[ba42b67d-6d6b-454d-9095-e4aa153c08ee]
[2018-09-04 11:07:40,356: WARNING/ForkPoolWorker-5] Clean up now!
This is extremely helpful when network calls fail intermittently and you want to make sure the work eventually goes through anyway. We are using a fixed retry interval here, but you could use an exponential backoff too, as sketched below.
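For example, a backoff variant could compute the countdown from the retry count (call_flaky_service here is a hypothetical network call, not part of the gist):

```python
@app.task(bind=True, max_retries=5)
def backoff_test(self):
    try:
        call_flaky_service()  # hypothetical network call that may fail
    except ConnectionError as ex:
        # waits 1s, 2s, 4s, 8s, 16s between successive attempts
        raise self.retry(exc=ex, countdown=2 ** self.request.retries)
```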
Func: max_retries_test
This next function demonstrates how max_retries works in Celery.
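A sketch of how max_retries_test is likely structured: it always fails, and it passes the original exception to retry(), so that exception is what surfaces once the retries run out.

```python
@app.task(bind=True, max_retries=3, default_retry_delay=30)
def max_retries_test(self):
    try:
        print("Gonna start a long task!")
        print("Raising an Error")
        raise KeyError("Testing Key Errors")
    except KeyError as ex:
        print("Retrying now")
        # once max_retries is exceeded, retry() re-raises the KeyError
        # we passed in via exc
        raise self.retry(exc=ex)
```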
The results look something like this –
[2018-09-04 11:11:32,380: WARNING/ForkPoolWorker-4] Bello Merld at 2018-09-04T11:11:32.379416
[2018-09-04 11:11:32,844: WARNING/ForkPoolWorker-7] Gonna start a long task!
[2018-09-04 11:11:32,845: WARNING/ForkPoolWorker-7] Raising an Error
[2018-09-04 11:11:32,846: WARNING/ForkPoolWorker-7] Retrying now
[2018-09-04 11:12:03,455: WARNING/ForkPoolWorker-5] Gonna start a long task!
[2018-09-04 11:12:03,457: WARNING/ForkPoolWorker-5] Raising an Error
[2018-09-04 11:12:03,457: WARNING/ForkPoolWorker-5] Retrying now
[2018-09-04 11:12:35,001: WARNING/ForkPoolWorker-3] Gonna start a long task!
[2018-09-04 11:12:35,002: WARNING/ForkPoolWorker-3] Raising an Error
[2018-09-04 11:12:35,003: WARNING/ForkPoolWorker-3] Retrying now
[2018-09-04 11:13:05,131: WARNING/ForkPoolWorker-4] Gonna start a long task!
[2018-09-04 11:13:05,132: WARNING/ForkPoolWorker-4] Raising an Error
[2018-09-04 11:13:05,132: WARNING/ForkPoolWorker-4] Retrying now
[2018-09-04 11:13:05,454: ERROR/ForkPoolWorker-4] Task tasks.max_retries_test[0079bc82-1b98-4bd9-862f-19afa14c0056] raised unexpected: KeyError('Testing Key Errors',)
Traceback (most recent call last):
  File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/trace.py", line 374, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/trace.py", line 629, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/user/timeouts/tasks.py", line 73, in max_retries_test
    raise self.retry(exc=ex)
  File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/task.py", line 669, in retry
    raise_with_context(exc)
  File "/home/user/timeouts/tasks.py", line 70, in max_retries_test
    raise KeyError("Testing Key Errors")
KeyError: 'Testing Key Errors'
Func: max_retries_test_2
This last function is very similar to the third one, with one small change: during a retry, we do not pass the original exception object to retry().
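Sketched, the only difference from the previous task is the bare retry() call:

```python
@app.task(bind=True, max_retries=3, default_retry_delay=30)
def max_retries_test_2(self):
    try:
        print("Gonna start a long task!")
        print("Raising an Error")
        raise KeyError("Testing Key Errors")
    except KeyError:
        print("Retrying now")
        # no exc passed, so exhausting the retries raises
        # MaxRetriesExceededError instead of the original KeyError
        raise self.retry()
```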
The results look something like this –
[2018-09-04 11:16:31,803: WARNING/ForkPoolWorker-1] Bello Merld at 2018-09-04T11:16:31.803085
[2018-09-04 11:16:32,270: WARNING/ForkPoolWorker-2] Gonna start a long task!
[2018-09-04 11:16:32,271: WARNING/ForkPoolWorker-2] Raising an Error
[2018-09-04 11:16:32,271: WARNING/ForkPoolWorker-2] Retrying now
[2018-09-04 11:17:02,931: WARNING/ForkPoolWorker-5] Gonna start a long task!
[2018-09-04 11:17:02,932: WARNING/ForkPoolWorker-5] Raising an Error
[2018-09-04 11:17:02,932: WARNING/ForkPoolWorker-5] Retrying now
[2018-09-04 11:17:33,454: WARNING/ForkPoolWorker-6] Gonna start a long task!
[2018-09-04 11:17:33,455: WARNING/ForkPoolWorker-6] Raising an Error
[2018-09-04 11:17:33,455: WARNING/ForkPoolWorker-6] Retrying now
[2018-09-04 11:18:03,464: WARNING/ForkPoolWorker-1] Gonna start a long task!
[2018-09-04 11:18:03,464: WARNING/ForkPoolWorker-1] Raising an Error
[2018-09-04 11:18:03,465: WARNING/ForkPoolWorker-1] Retrying now
[2018-09-04 11:18:03,756: ERROR/ForkPoolWorker-1] Task tasks.max_retries_test_2[31c7b71a-3075-4eca-bb06-fbba818eb8c3] raised unexpected: MaxRetriesExceededError("Can't retry tasks.max_retries_test_2[31c7b71a-3075-4eca-bb06-fbba818eb8c3] args:() kwargs:{}",)
Traceback (most recent call last):
  File "/home/user/timeouts/tasks.py", line 110, in max_retries_test_2
    raise KeyError("Testing Key Errors")
KeyError: 'Testing Key Errors'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/trace.py", line 374, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/trace.py", line 629, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/user/timeouts/tasks.py", line 113, in max_retries_test_2
    raise self.retry()
  File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/task.py", line 672, in retry
    self.name, request.id, S.args, S.kwargs))
celery.exceptions.MaxRetriesExceededError: Can't retry tasks.max_retries_test_2[31c7b71a-3075-4eca-bb06-fbba818eb8c3] args:() kwargs:{}
Which error you end up with is a question of which one you need: pass exc to surface your original exception, or omit it to surface MaxRetriesExceededError.
Interestingly, the MaxRetriesExceededError didn't get caught in either case. I will share an update once I find out whether there is a better way to handle that.
Use Cases
One way to combine the above concepts would be to run them with Chords. This blog post will walk you through how chords work in Celery. The code for this section is in this gist.
Celery tracks the number of retries so far in the field "request.retries", which is available on the "task" object (so, self.request.retries in a bound task). For more information on the available fields, you could check this out.
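A hedged reconstruction of timeout_and_chords (the real code is in the gist, and the exact ordering of its failures differs slightly; the failure condition and helper tasks here are my assumptions, but a chord summing add(i, i) for i in range(10) is where the 90 in the output would come from):

```python
import random
import time

from celery import chord
from celery.exceptions import SoftTimeLimitExceeded

@app.task
def add(x, y):
    return x + y

@app.task
def print_sum(numbers):
    print(sum(numbers))  # 2 * (0 + 1 + ... + 9) = 90

@app.task(bind=True, default_retry_delay=30)
def timeout_and_chords(self):
    try:
        print("Gonna start a long task!")
        if random.randint(1, 10) > 5:
            print("Raising an Error")
            raise ValueError("simulated transient failure")
        print("Running chords")
        # fire-and-forget: the add() tasks run on other workers, and
        # print_sum() fires once they have all finished
        chord(add.s(i, i) for i in range(10))(print_sum.s())
        time.sleep(60)  # keep running until the soft time limit fires
    except SoftTimeLimitExceeded:
        print("Clean up now!")
    except ValueError as ex:
        print("Retrying now")
        raise self.retry(exc=ex)
```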
The result for the above code looks something like this –
[2018-09-04 20:21:51,706: WARNING/ForkPoolWorker-4] Gonna start a long task!
[2018-09-04 20:21:51,707: WARNING/ForkPoolWorker-4] Raising an Error
[2018-09-04 20:21:51,707: WARNING/ForkPoolWorker-4] Retrying now
[2018-09-04 20:22:21,962: WARNING/ForkPoolWorker-8] Gonna start a long task!
[2018-09-04 20:22:21,964: WARNING/ForkPoolWorker-8] Running chords
[2018-09-04 20:22:26,673: WARNING/ForkPoolWorker-8] Retrying now
[2018-09-04 20:22:32,617: WARNING/ForkPoolWorker-4] 90
[2018-09-04 20:22:57,109: WARNING/ForkPoolWorker-8] Gonna start a long task!
[2018-09-04 20:22:57,110: WARNING/ForkPoolWorker-8] Running chords
[2018-09-04 20:23:27,155: WARNING/MainProcess] Soft time limit (30.0s) exceeded for tasks.timeout_and_chords[64d65fdb-2bcb-4604-86fd-044ca48ba4eb]
[2018-09-04 20:23:29,538: WARNING/ForkPoolWorker-8] Clean up now!
I'd like to call out one interesting fact about the result. You might expect the chord sum 90 to be printed before "Retrying now", but you see otherwise here. This happens because chords are asynchronous: Celery moves on after handing the chord execution off to a different worker process.
Conclusion
Timeouts and retries work beautifully in Celery and are easy to manage as long as you know what you are doing. Depending on your use case, you can combine a number of features (like the combination with chords we saw above).
Examples:
- Timeouts with Infinite Retries and Exponential Backoffs
- Timeouts with Max Retries and Exponential Backoffs
- Scheduled Tasks (using Celery Beat) with Timeouts
- Retries for pre-defined exceptions (covered in the Celery docs; see the sketch after this list)
The list goes on.
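As one last hedged sketch, the pre-defined-exceptions case can be expressed declaratively with autoretry_for, and retry_backoff (available in Celery 4.2+) adds exponential backoff between attempts; the requests call here is an illustrative assumption:

```python
import requests  # illustrative HTTP dependency

@app.task(autoretry_for=(requests.RequestException,),
          retry_kwargs={"max_retries": 5},
          retry_backoff=True)  # exponential backoff between attempts
def fetch(url):
    return requests.get(url, timeout=10).text
```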
These are core task features of Celery, and the possible combinations are vast. Visit this link for information about retries and this link for information about timeouts.