Let’s break that down:
- Asynchronous: events happen outside of the main program flow.
- Task queue based: the tasks to be performed are queued up. (The queue itself lives in a tool like Redis or RabbitMQ.)
- Distributed message passing: multiple processes can read from and write to the message (task) queue without being connected to one another.
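To make those three ideas concrete, here is a minimal sketch that uses only Python's standard library. It is not Celery: `queue.Queue` stands in for a broker like Redis or RabbitMQ, a thread stands in for a worker process, and `send_notification` is a made-up job for illustration.

```python
import queue
import threading

# In-process stand-in for a broker such as Redis or RabbitMQ.
task_queue = queue.Queue()
results = []

def worker():
    """Pull tasks off the queue and run them until a None sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is None:
            break
        func, args = task
        results.append(func(*args))

def send_notification(user):
    # Hypothetical slow job, e.g. a call to a third-party API.
    return f"notified {user}"

worker_thread = threading.Thread(target=worker)
worker_thread.start()

# The producer only enqueues and moves on; it never runs the job itself.
task_queue.put((send_notification, ("alice",)))
task_queue.put((send_notification, ("bob",)))

task_queue.put(None)  # tell the worker to stop
worker_thread.join()
print(results)
```

The producer and the worker never call each other directly; they only share the queue. Celery applies the same pattern across separate processes and machines.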
Why?
Say you want to connect to a third-party service such as Slack, email, or external photo storage. These APIs take time to respond, as any web/HTTP request does, and your Python web server would be stuck waiting for the response. In the meantime, the user only sees a slow website, which is a very bad user experience.
Using Celery, you can call the third-party API in a separate task that a worker picks up, so the web process is free to return a response and accept other incoming HTTP requests.
When?
- Whenever we make a call that depends on external factors — for instance calling a third-party API.
- If a task takes too much time. For example, a user wants to download a few MB of data that is stored in their account on your web app:
  a) Delegate the task to Celery.
  b) Zip the data and store it in a publicly accessible folder, say on AWS S3.
  c) Send the user an email that their download is ready.
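The download example could be sketched roughly like this. It uses only the standard library so it runs anywhere; `upload_to_storage`, `send_email`, and the `example.com` URL are hypothetical placeholders, and in a real app `prepare_download` is the function you would turn into a Celery task.

```python
import tempfile
import zipfile
from pathlib import Path

def upload_to_storage(archive_path):
    # Hypothetical placeholder: a real app would upload to S3 or similar here.
    return f"https://example.com/downloads/{archive_path.name}"

def send_email(address, body):
    # Hypothetical placeholder for a real email call.
    return f"to={address}: {body}"

def prepare_download(user_email, source_dir, out_dir):
    """Zip the user's files, 'upload' the archive, and 'email' the link."""
    archive_path = Path(out_dir) / "export.zip"
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(Path(source_dir).rglob("*")):
            if path.is_file():
                archive.write(path, arcname=str(path.relative_to(source_dir)))
    url = upload_to_storage(archive_path)
    return send_email(user_email, f"Your download is ready: {url}")

# Small demo with throwaway directories.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as out:
    Path(src, "notes.txt").write_text("hello")
    message = prepare_download("user@example.com", src, out)
    with zipfile.ZipFile(Path(out) / "export.zip") as archive:
        archived_names = archive.namelist()

print(message)
```

None of these steps needs the user to be waiting on an open HTTP request, which is what makes the job a good candidate for a background task.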
How?
Since the whole idea is that the task runs outside your web process, you need to:
- Run a separate “worker” process that handles the functions that use Celery.
- Use the message queue (like Redis or RabbitMQ) where the tasks are queued.
The code itself is rather simple:
Here’s a regular function call:
def some_function():
    print("Regular function call")

# Call function from elsewhere
some_function()
vs. the same function call with Celery:
# Older Celery releases exposed this decorator as celery.decorators.task
from celery import shared_task

@shared_task
def some_function_called_asynchronously():
    print("Function decorated with Celery's shared_task")

# Call function using .delay()
some_function_called_asynchronously.delay()
The function can be called without Celery too: simply call some_function_called_asynchronously() from anywhere, without the .delay(), and it runs synchronously in the current process.
P.S.: Celery vs Python’s async
There’s another idea around asynchronous handling that is important: async (asynchronous processing) helps in handling multiple HTTP connections on a single server instance. That’s a topic on its own that I’ll cover some other time.
Python has supported async since version 3.5. Django is on its way to adding async support by Dec 2019.
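For a taste of the difference, here is a tiny sketch using the standard library's `asyncio`. There is no queue and no worker: a single thread interleaves several waits. `handle_request` is a made-up coroutine, and `asyncio.sleep` stands in for a slow network call.

```python
import asyncio
import time

async def handle_request(request_id):
    # asyncio.sleep stands in for waiting on a slow network call.
    await asyncio.sleep(0.1)
    return f"response {request_id}"

async def main():
    # All three "requests" wait concurrently on one thread.
    return await asyncio.gather(*(handle_request(i) for i in range(3)))

start = time.perf_counter()
responses = asyncio.run(main())
elapsed = time.perf_counter() - start

# The total time should be close to one wait, not three in a row.
print(responses, f"{elapsed:.2f}s")
```

Celery moves work out of the web process entirely, while async lets one process juggle many waiting connections; the two solve different problems and can be used together.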