Custom Celery Tasks: adding the enqueuing request_id to a task_id in Django

Debugging background tasks can be challenging, given they’re not necessarily executed in the order they’re enqueued (assuming you have multiple workers processing them), may be retried if they fail the first time through, and have no direct link to the web request that enqueued them.

Including the id of the web request that enqueued them in the id of the task itself can help alleviate a lot of these debugging issues, and deobfuscate some of the “what happened, and why?” questions that may arise.

I’m using the celery library with Django — this implementation is specific to that combination.

The primary implementation outlined below was first written up in this Rover blog — I found this article super helpful, but still had a couple hurdles to figure out, so thought I’d go into more detail here! Specifically, things that weren’t covered in that blog post that I’ll cover here include:

  • What if you’re enqueuing your tasks via .delay instead of .apply_async?
  • Options for the id of background tasks that are themselves enqueued by another background task.
  • How to use the custom task class you’ve defined

High Level Strategy

The apply_async function of a celery Task takes a keyword argument called task_id, which it then passes on to the send_task method. If a task_id is not provided, send_task falls back to generating a uuid to use as the id.
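As a rough illustration of that fallback, here is a stand-alone sketch that uses only the standard library. It is not celery's actual source, and the function name resolve_task_id is made up for this example:

```python
import uuid


def resolve_task_id(task_id=None):
    """Keep a caller-supplied id, otherwise generate a fresh one."""
    # Hypothetical stand-in for what happens when no task_id is passed along.
    return task_id or str(uuid.uuid4())


print(resolve_task_id("my-custom-id"))  # the supplied id is kept as-is
print(resolve_task_id())                # a freshly generated uuid string
```

The takeaway is that anything we pass in as the task_id wins, which is the hook the rest of this post relies on.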

In order to generalize our tasks to ensure that they always define their own task_id that also includes the id of the request that enqueued them, we can define a custom task class that handles this for us. Let’s look at the parts and prerequisites for this.

Generating the custom task_id

Of course, this strategy assumes you already have a request_id attribute available to you on all of your request objects. I covered this in some detail in a previous blog post; assuming you’ve done something similar, you can create a helper function that reads the request_id back off the local thread.
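A minimal sketch of what such a helper could look like, assuming the id is kept in thread-local storage. The names _local, set_request_id and get_request_id are hypothetical, and the middleware that would call set_request_id is not shown:

```python
import threading

# Hypothetical module-level storage. A piece of middleware is assumed to call
# set_request_id() at the start of every web request.
_local = threading.local()


def set_request_id(request_id):
    """Remember the id of the request being handled on this thread."""
    _local.request_id = request_id


def get_request_id():
    """Return this thread's request_id, or None if none was set."""
    return getattr(_local, "request_id", None)
```

Returning None instead of raising matters here: code running outside of a web request (a background task, for example) will not have anything set on its thread.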

Given this, within our custom task class, we can have a helper function that builds the task_id in a handful of steps.
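Here is a sketch of that logic as a plain function, so it can run without celery installed. The function name, its parameters and the underscore separator are all assumptions made for illustration. In a real task class, request_id would come from the thread-local helper, and parent_task_id from something like celery's current_task.request.id:

```python
import uuid

SEPARATOR = "_"  # assumption: any character that never appears in a uuid works


def build_task_id(request_id=None, parent_task_id=None):
    """Return '<request_id><SEPARATOR><new uuid>' for a task about to be enqueued.

    request_id: id of the current web request, if we are inside one.
    parent_task_id: id of the task doing the enqueuing, if we are inside one.
    """
    task_id = str(uuid.uuid4())  # the same kind of id celery would generate

    if not request_id and parent_task_id and SEPARATOR in parent_task_id:
        # Enqueued from another task: its id already starts with the id of the
        # original web request, so keep only that leading part.
        request_id = parent_task_id.split(SEPARATOR, 1)[0]

    if not request_id:
        # Nothing to trace back to (enqueued from a shell, a cron job, ...).
        return task_id

    return SEPARATOR.join([request_id, task_id])
```

This sketch assumes the request_id itself never contains the separator; if yours might, pick a different separator.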

Let’s talk through each of those pieces, one by one:

  • Getting the request_id: we talked about this above. This retrieves the request_id off your local thread, assuming it was set there by a piece of middleware.
  • Generating a uuid: this just generates our unique task_id in the same way the celery library would if we didn’t do it ourselves.
  • Falling back to the current task: this bit handles the scenario where a background task was enqueued by another background task. In this case, the request won’t have gone through our middleware to set the request_id on the local thread, but because all tasks carry the id of the request that enqueued them in their own task_id, we can get it from the id of the task that is doing the enqueuing.
  • Keeping only the request_id portion of that task’s id: this is necessary for the same reason the step above may be necessary, namely that tasks can be enqueued from other background tasks. If you want to be able to trace that entire flow back to the beginning, you may consider skipping this step. Doing this means that each task_id will only be prepended by the id of the actual web request that led to its enqueuing, but will not include the id of the task that directly enqueued it, if there was one. We chose this approach because, especially once retries are taken into account, there is potential for the task_id to get really really long if you skip this. That makes it less useful, and depending on how often you do this, or how deep the enqueuing of tasks from other tasks goes in your application, you could overrun the max length of 255 for a task_id!
  • Joining the two ids: this is the final step of prepending our uniquely generated task_id with the request_id! Because we’re joining them with a separator character that will not be found in either of the ids themselves, we know we will always be able to separate one id from the other.

.delay versus .apply_async

Now that we have our custom task_id… let’s use it! We’ll rely on the fact that celery’s apply_async takes in a task_id as a keyword argument. However, this can present a problem if you’re enqueuing your tasks using .delay instead. You could of course change all of your usages of .delay to be .apply_async, but that strategy would require a larger code change than necessary, and it also means that any future calls to .delay added by engineers who aren’t aware of the custom task implementation will lose the custom functionality. Better if we could handle both. Good news: we can!

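In celery’s base Task class, the docstring on the delay method describes it as a star-argument version of apply_async that does not support the extra options apply_async enables.

And the method itself is a handy one-liner that simply calls apply_async with the positional and keyword arguments it was given.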
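To make the consequence of that concrete, here is a tiny stand-in for the base class. It is a sketch, not celery's code, and its apply_async just echoes its inputs instead of publishing a message. Note where a task_id handed to delay ends up:

```python
class Task:
    """Stand-in for celery's base Task, only to show how delay forwards."""

    def apply_async(self, args=None, kwargs=None, task_id=None, **options):
        # The real method publishes a message; here we just echo the inputs.
        return {"args": args, "kwargs": kwargs, "task_id": task_id}

    def delay(self, *args, **kwargs):
        # Star-argument convenience wrapper: everything is forwarded as
        # arguments for the task itself, so options like task_id have no slot.
        return self.apply_async(args, kwargs)


result = Task().delay(1, 2, task_id="abc")
print(result)
# {'args': (1, 2), 'kwargs': {'task_id': 'abc'}, 'task_id': None}
```

The task_id is treated as an argument for the task function rather than as the id of the task, which is why simply passing it to delay is not an option.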

So we can apply a similar strategy, but move it up the stack a bit. The apply_async method on our custom task class will call our helper method to generate a custom task_id, and pass that into a call to apply_async on super(), which will take us into celery’s implementation, as we need.

And then, our custom task class’s implementation of delay will… do the same thing! Instead of calling delay on super(), we’ll call straight into celery’s apply_async from our delay function, passing along the generated task_id.
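Put together, the two overrides might look roughly like the sketch below. To keep it runnable without a broker, Task here is a local stand-in for celery.Task, the class name RequestIdTask is hypothetical, and _gen_task_id is a placeholder that hard-codes a request id instead of looking one up:

```python
import uuid


class Task:
    """Stand-in for celery.Task so the sketch runs without a broker."""

    def apply_async(self, args=None, kwargs=None, task_id=None, **options):
        return {"args": args, "kwargs": kwargs, "task_id": task_id}

    def delay(self, *args, **kwargs):
        return self.apply_async(args, kwargs)


class RequestIdTask(Task):  # hypothetical name
    def _gen_task_id(self):
        # Placeholder: the real helper would look up the request_id.
        return "req-123_" + str(uuid.uuid4())

    def apply_async(self, *args, **kwargs):
        # Always supply our own task_id before handing off to the base class.
        kwargs["task_id"] = self._gen_task_id()
        return super().apply_async(*args, **kwargs)

    def delay(self, *args, **kwargs):
        # Skip super().delay(): it has no way to carry a task_id.
        return super().apply_async(args, kwargs, task_id=self._gen_task_id())


print(RequestIdTask().delay(1, x=2)["task_id"])
```

With the real library installed, the stand-in class would be dropped in favor of importing Task from celery. One thing to be aware of in this sketch: apply_async overwrites any task_id the caller passed in explicitly.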

Using the custom task class

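Once your custom task is defined, you can use it by adding it as the base to any of your already defined tasks. So a task that used to be declared with a bare task decorator, such as @shared_task, now passes your custom class to that same decorator through its base keyword argument.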

And we’re off!

Senior Software Engineer | www.adriennedomingus.com
