Timeouts and Retries

TL;DR

In this post, I will walk through how to use timeouts and retries with Celery.

The Code

Refer this gist

Code Walkthrough

This part will cover some of the functions and their results

Func: timeout_test

This simple function shows how a timeout exception can be caught. Note that the timeout has been defined in the conf

"task_soft_time_limit": "30",

The results look something like this –

[2018-09-04 10:28:24,360: WARNING/ForkPoolWorker-8] Bello Merld at 2018-09-04T10:28:24.360234[2018-09-04 10:28:24,360: WARNING/ForkPoolWorker-8]
[2018-09-04 10:28:24,917: WARNING/ForkPoolWorker-5] Gonna start a long task!
[2018-09-04 10:28:54,917: WARNING/MainProcess] Soft time limit (30.0s) exceeded for tasks.timeout_test[b6d6a7fc-9110-4847-8f19-9c57908983e3]
[2018-09-04 10:28:55,755: WARNING/ForkPoolWorker-5] Clean up now!

Func: retry_timeout_test

This second function shows how a timeout exception can be combined with retries.

The results look something like this –

[2018-09-04 11:06:38,872: WARNING/ForkPoolWorker-1] Bello Merld at 2018-09-04T11:06:38.871975
[2018-09-04 11:06:39,868: WARNING/ForkPoolWorker-6] Gonna start a long task!
[2018-09-04 11:06:39,869: WARNING/ForkPoolWorker-6] n is 7
[2018-09-04 11:06:39,869: WARNING/ForkPoolWorker-6] Raising an Error
[2018-09-04 11:06:39,870: WARNING/ForkPoolWorker-6] Retrying now
[2018-09-04 11:07:09,915: WARNING/ForkPoolWorker-5] Gonna start a long task!
[2018-09-04 11:07:09,916: WARNING/ForkPoolWorker-5] n is 2
[2018-09-04 11:07:09,916: WARNING/ForkPoolWorker-5] Running subprocess
[2018-09-04 11:07:40,122: WARNING/MainProcess] Soft time limit (30.0s) exceeded for tasks.retry_timeout_test[ba42b67d-6d6b-454d-9095-e4aa153c08ee]
[2018-09-04 11:07:40,356: WARNING/ForkPoolWorker-5] Clean up now!

This is extremely helpful in cases where there are network calls failing and you want to ensure that it ends up posting anyway. We are using a fixed timeout interval but you could use an exponential backoff too.

Func: max_retries_test

This next function tests how max retries work with Celery.

The results look something like this –

[2018-09-04 11:11:32,380: WARNING/ForkPoolWorker-4] Bello Merld at 2018-09-04T11:11:32.379416
[2018-09-04 11:11:32,844: WARNING/ForkPoolWorker-7] Gonna start a long task!
[2018-09-04 11:11:32,845: WARNING/ForkPoolWorker-7] Raising an Error
[2018-09-04 11:11:32,846: WARNING/ForkPoolWorker-7] Retrying now
[2018-09-04 11:12:03,455: WARNING/ForkPoolWorker-5] Gonna start a long task!
[2018-09-04 11:12:03,457: WARNING/ForkPoolWorker-5] Raising an Error
[2018-09-04 11:12:03,457: WARNING/ForkPoolWorker-5] Retrying now
[2018-09-04 11:12:35,001: WARNING/ForkPoolWorker-3] Gonna start a long task!
[2018-09-04 11:12:35,002: WARNING/ForkPoolWorker-3] Raising an Error
[2018-09-04 11:12:35,003: WARNING/ForkPoolWorker-3] Retrying now
[2018-09-04 11:13:05,131: WARNING/ForkPoolWorker-4] Gonna start a long task!
[2018-09-04 11:13:05,132: WARNING/ForkPoolWorker-4] Raising an Error
[2018-09-04 11:13:05,132: WARNING/ForkPoolWorker-4] Retrying now
[2018-09-04 11:13:05,454: ERROR/ForkPoolWorker-4] Task tasks.max_retries_test[0079bc82-1b98-4bd9-862f-19afa14c0056] raised unexpected: KeyError('Testing Key Errors',)
Traceback (most recent call last):
File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/trace.py", line 374, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/trace.py", line 629, in __protected_call__
return self.run(*args, **kwargs)
File "/home/user/timeouts/tasks.py", line 73, in max_retries_test
raise self.retry(exc=ex)
File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/task.py", line 669, in retry
raise_with_context(exc)
File "/home/user/timeouts/tasks.py", line 70, in max_retries_test
raise KeyError("Testing Key Errors")
KeyError: 'Testing Key Errors'

Func: max_retries_test_2

This last function looks very similar to the third function with one small change. During a retry, we will not pass the original exception object.

The results look something like this –

[2018-09-04 11:16:31,803: WARNING/ForkPoolWorker-1] Bello Merld at 2018-09-04T11:16:31.803085
[2018-09-04 11:16:32,270: WARNING/ForkPoolWorker-2] Gonna start a long task!
[2018-09-04 11:16:32,271: WARNING/ForkPoolWorker-2] Raising an Error
[2018-09-04 11:16:32,271: WARNING/ForkPoolWorker-2] Retrying now
[2018-09-04 11:17:02,931: WARNING/ForkPoolWorker-5] Gonna start a long task!
[2018-09-04 11:17:02,932: WARNING/ForkPoolWorker-5] Raising an Error
[2018-09-04 11:17:02,932: WARNING/ForkPoolWorker-5] Retrying now
[2018-09-04 11:17:33,454: WARNING/ForkPoolWorker-6] Gonna start a long task!
[2018-09-04 11:17:33,455: WARNING/ForkPoolWorker-6] Raising an Error
[2018-09-04 11:17:33,455: WARNING/ForkPoolWorker-6] Retrying now
[2018-09-04 11:18:03,464: WARNING/ForkPoolWorker-1] Gonna start a long task!
[2018-09-04 11:18:03,464: WARNING/ForkPoolWorker-1] Raising an Error
[2018-09-04 11:18:03,465: WARNING/ForkPoolWorker-1] Retrying now
[2018-09-04 11:18:03,756: ERROR/ForkPoolWorker-1] Task tasks.max_retries_test_2[31c7b71a-3075-4eca-bb06-fbba818eb8c3] raised unexpected: MaxRetriesExceededError("Can't retry tasks.max_retries_test_2[31c7b71a-3075-4eca-bb06-fbba818eb8c3] args:() kwargs:{}",)
Traceback (most recent call last):
File "/home/user/timeouts/tasks.py", line 110, in max_retries_test_2
raise KeyError("Testing Key Errors")
KeyError: 'Testing Key Errors'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/trace.py", line 374, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/trace.py", line 629, in __protected_call__
return self.run(*args, **kwargs)
File "/home/user/timeouts/tasks.py", line 113, in max_retries_test_2
raise self.retry()
File "/home/user/env3.4/lib/python3.4/site-packages/celery/app/task.py", line 672, in retry
self.name, request.id, S.args, S.kwargs))
celery.exceptions.MaxRetriesExceededError: Can't retry tasks.max_retries_test_2[31c7b71a-3075-4eca-bb06-fbba818eb8c3] args:() kwargs:{

It is a question of choice on which error you need.

Interestingly, the MaxRetriesExceededError didn’t get caught in both cases. I will share once I find out if there is a better way to handle that.

Use Cases

One way to combine the above concepts would be to run them with Chords. This blog post will walk you through how chords work in Celery. The code for this section is in this gist.

Celery tracks the number of retries thus far using the field “request.retries” which is available to the “task” object. For more information on the available fields you could check this out.

The result for the above code looks something like this –

[2018-09-04 20:21:51,706: WARNING/ForkPoolWorker-4] Gonna start a long task!
[2018-09-04 20:21:51,707: WARNING/ForkPoolWorker-4] Raising an Error
[2018-09-04 20:21:51,707: WARNING/ForkPoolWorker-4] Retrying now
[2018-09-04 20:22:21,962: WARNING/ForkPoolWorker-8] Gonna start a long task!
[2018-09-04 20:22:21,964: WARNING/ForkPoolWorker-8] Running chords
[2018-09-04 20:22:26,673: WARNING/ForkPoolWorker-8] Retrying now
[2018-09-04 20:22:32,617: WARNING/ForkPoolWorker-4] 90
[2018-09-04 20:22:57,109: WARNING/ForkPoolWorker-8] Gonna start a long task!
[2018-09-04 20:22:57,110: WARNING/ForkPoolWorker-8] Running chords
[2018-09-04 20:23:27,155: WARNING/MainProcess] Soft time limit (30.0s) exceeded for tasks.timeout_and_chords[64d65fdb-2bcb-4604-86fd-044ca48ba4eb]
[2018-09-04 20:23:29,538: WARNING/ForkPoolWorker-8] Clean up now!

I’d like to call out one interesting fact on the result. Ideally, you would expect the chord sum 90 to be printed before “Retrying now” but you see otherwise here. This happens because chords are asynchronous and Celery has moved on after handing off the chord execution to a different worker thread.

Conclusion

Timeouts and Retries work beautifully in Celery and can get very easy to manage as long as you know what you are doing. Depending on your use case, you could combine a bunch of features (like the one we saw with Chords).

Examples:

  1. Timeouts with Infinite Retries and Exponential Backoffs
  2. Timeouts with Max Retries and Exponential Backoffs
  3. Scheduled Tasks (using Celery Beat) with Timeouts
  4. Retries for pre-defined exceptions (covered in Celery Docs)

The list goes on.

This is a core task feature of Celery and the possible combinations are vast. Visit this link for information about retries and this link for information about timeouts.

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Powered by WordPress.com.

Up ↑