Why, When, and How to use Celery with Python

“Celery is an asynchronous task queue based on distributed message passing.”

Let’s break that down:

asynchronous: events happen outside of the main program flow.
task queue based: queues up the tasks to be performed. (The tasks themselves are held in a message broker such as Redis or RabbitMQ.)
distributed message passing: multiple processes can read and write to the message (task) queue without being connected to one another.
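To make those three terms concrete, here is a minimal sketch that uses only the standard library. It is not Celery: `queue.Queue` stands in for the broker (Redis or RabbitMQ), a thread stands in for a worker process, and the `TASKS` registry and task names are made up for illustration.

```python
import queue
import threading

task_queue = queue.Queue()  # stand-in for the broker (Redis/RabbitMQ)
results = []                # filled in by the worker, for demonstration only

# Hypothetical registry mapping task names to the functions that run them.
TASKS = {
    "add": lambda a, b: a + b,
    "shout": lambda text: text.upper(),
}

def worker():
    """Pull task messages off the queue until a None sentinel arrives."""
    while True:
        message = task_queue.get()
        if message is None:
            break
        name, args = message
        results.append(TASKS[name](*args))

worker_thread = threading.Thread(target=worker)
worker_thread.start()

# The producer only writes messages; it never calls the worker directly.
task_queue.put(("add", (2, 3)))
task_queue.put(("shout", ("done",)))
task_queue.put(None)  # tell the worker to stop
worker_thread.join()
```

The producer and the worker share nothing except the queue, which is the property that lets Celery run workers on other machines.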

Why?

Say you want to connect to a third-party service: Slack, email, external photo storage, etc. These APIs take time to respond, as any web/HTTP request does, and your Python web server gets stuck waiting for the response. In the meantime, the user only sees a slow website, which is a very bad user experience.

Using Celery, you can call the third-party API in a separate task that a worker process runs, so the web server is free to return a response and accept other incoming HTTP requests.
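The effect can be sketched without Celery, assuming a thread pool as a stand-in for the worker. The names `notify_slack` and `handle_request` are hypothetical, and the `Event` simulates the time the third-party API takes to answer.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

api_responded = threading.Event()  # simulates the slow third-party API

def notify_slack(message):
    """Hypothetical slow call: blocks until the 'API' responds."""
    api_responded.wait(timeout=5)
    return f"sent: {message}"

executor = ThreadPoolExecutor(max_workers=1)  # stand-in for a Celery worker

def handle_request(message):
    """Hand the slow call off and return to the user immediately."""
    future = executor.submit(notify_slack, message)
    return "202 Accepted", future

status, future = handle_request("hello")
still_pending = not future.done()  # the handler returned before the API did

api_responded.set()                # the API finally answers
result = future.result(timeout=5)
executor.shutdown()
```

The request handler returns while the slow call is still in flight. With Celery, the hand-off goes through the message queue to a separate process instead of a thread.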

When?

  1. Whenever we make a call that depends on external factors, for instance calling a third-party API.
  2. If a task takes too much time. For example, a user wants to download a few MB of data that is stored on their account on your web app:
     a) Delegate the task to Celery.
     b) Zip and store the data in a publicly accessible folder, say on AWS S3.
     c) Send the user an email that their download is ready.
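The download example could look roughly like the function below, which is the body you would hand to Celery as a task. This is a sketch under assumptions: `export_user_data`, `fake_store`, and `fake_send_email` are hypothetical names, and the storage and email steps are passed in as plain callables rather than real S3 or mail clients.

```python
import io
import zipfile

def build_archive(files):
    """Zip a mapping of filename -> bytes into an in-memory archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()

def export_user_data(user_id, email, files, store, send_email):
    """Zip the data, store it, then tell the user where to find it."""
    archive_bytes = build_archive(files)
    url = store(f"exports/{user_id}.zip", archive_bytes)
    send_email(email, f"Your download is ready: {url}")
    return url

# Fakes so the sketch runs on its own; swap in real storage and email clients.
stored = {}
outbox = []

def fake_store(key, data):
    stored[key] = data
    return f"https://example.com/{key}"

def fake_send_email(to, body):
    outbox.append((to, body))

download_url = export_user_data(
    42,
    "user@example.com",
    {"notes.txt": b"hello", "photo.raw": b"\x00\x01"},
    store=fake_store,
    send_email=fake_send_email,
)
```

Keeping the storage and email steps as parameters makes the slow part easy to test before it is wired into a worker.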

How?

Since the whole idea is that the task runs outside the web server's process, you need to:

  1. Run a separate “worker” process that executes the functions you have marked as Celery tasks.
  2. Run a message queue (like Redis or RabbitMQ) where the tasks are queued.
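As a rough sketch, those two steps come down to a few settings and one command. The file name `celeryconfig.py`, the module name `tasks`, and the local Redis URLs are assumptions for illustration; the setting names are Celery's lowercase configuration keys.

```python
# celeryconfig.py (hypothetical file name)

# Step 2: where tasks are queued. Assumes Redis on its default local port.
broker_url = "redis://localhost:6379/0"

# Optional: where task return values are kept, if you need to read them back.
result_backend = "redis://localhost:6379/1"

# Exchange task messages as JSON.
task_serializer = "json"
accept_content = ["json"]

# Step 1: start the worker in its own process, for example from a shell:
#   celery -A tasks worker --loglevel=info
# where `tasks` is the (assumed) module that defines the Celery app.
```

A Celery app can load a module like this with `app.config_from_object("celeryconfig")`.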

The code itself is rather simple:

Here’s a regular function call:

def some_function():
  print("Regular function call")

# Call function from elsewhere
some_function()

vs. the same function call with Celery:

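from celery import Celery

# Assumes a Redis broker running locally; swap in your own broker URL.
app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def some_function_called_asynchronously():
  print("Function decorated with Celery's app.task")

# Queue the task for a worker to run, using .delay()
some_function_called_asynchronously.delay()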

The function can be called without Celery too: simply call some_function_called_asynchronously() from anywhere, without the .delay(), and it runs synchronously in the current process like any regular function.

P.S.: Celery vs Python’s async

There’s another idea around asynchronous handling that is important: Python’s async (asynchronous processing) helps a single server instance handle multiple HTTP connections concurrently. That’s a topic of its own that I’ll cover some other time.
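For contrast with Celery, here is a minimal sketch of that kind of concurrency using `asyncio` from the standard library. The `handle_connection` coroutine and its delays are made up; `asyncio.sleep` stands in for waiting on network I/O.

```python
import asyncio

async def handle_connection(client_id, delay):
    """Pretend to serve one client; the sleep stands in for network I/O."""
    await asyncio.sleep(delay)
    return f"client {client_id} served"

async def main():
    # All three connections are in progress at once on a single thread.
    return await asyncio.gather(
        handle_connection(1, 0.02),
        handle_connection(2, 0.01),
        handle_connection(3, 0.03),
    )

responses = asyncio.run(main())
```

Everything here stays inside one process. Celery, by comparison, moves the work to a separate worker process through the message queue.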

Python has supported the async/await syntax since version 3.5. Django is on its way to adding async support by Dec 2019.

