Getting Started With Async Features in Python
realpython.com/python-async-features/
About the author: Doug Farrell is a Python developer with more than 25 years of experience. He writes about Python on his personal website and works as a Senior Web Engineer with Shutterfly.
Each tutorial at Real Python is created by a team of developers so that it meets our high quality standards. The team members who worked on this tutorial are Aldren, Brad Solomon, Geir Arne Hjelle, Jaya Zhané, and Joanna Jablonski.
Have you heard of asynchronous programming in Python? Are you curious to know
more about Python async features and how you can use them in your work? Perhaps
you’ve even tried to write threaded programs and run into some issues. If you’re
looking to understand how to use Python async features, then you’ve come to the
right place.
All of the example code in this article has been tested with Python 3.7.2. A
downloadable copy of the code accompanies the tutorial so you can follow along.
In an asynchronous program, the program moves on to future execution steps even
though a previous step hasn’t yet finished and is still running elsewhere. The program
also knows what to do when a previous step does finish running.
Why would you want to write a program in this manner? The rest of this article will
help you answer that question and give you the tools you need to elegantly solve
interesting asynchronous problems.
For a web server, one unit of work (input, process, output) is not the only purpose.
The real purpose is to handle hundreds or even thousands of units of work as quickly
as possible. This can happen over long periods of time, and several work units may
even arrive all at once.
Can a synchronous web server be made better? Sure, you could optimize the
execution steps so that all the work coming in is handled as quickly as possible.
Unfortunately, there are limitations to this approach. The result could be a web server
that doesn’t respond fast enough, can’t handle enough work, or even one that times
out when work gets stacked up.
Note: There are other limitations you might see if you tried to optimize the
above approach. These include network speed, file IO speed, database query
speed, and the speed of other connected services, to name a few. What these
all have in common is that they are all IO functions. All of these items are
orders of magnitude slower than the CPU’s processing speed.
What is non-blocking code? What’s blocking code, for that matter? Would the
answers to these questions help you write a better web server? If so, how could you
do it? Let’s find out!
Imagine this: you’re a parent trying to do several things at once. You have to balance
the checkbook, do the laundry, and keep an eye on the kids. Somehow, you’re able
to do all of these things at the same time without even thinking about it! Let’s break it
down:
• Working with the washer and dryer is a synchronous task, but the bulk of the
work happens after the washer and dryer are started. Once you’ve got them
going, you can walk away and get back to the checkbook task. At this point, the
washer and dryer tasks have become asynchronous. The washer and dryer
will run independently until the buzzer goes off (notifying you that the task
needs attention).
• Watching your kids is another asynchronous task. Once they are set up and
playing, they can do so independently for the most part. This changes when
someone needs attention, like when someone gets hungry or hurt. When one
of your kids yells in alarm, you react. The kids are a long-running task with high
priority. Watching them supersedes any other tasks you might be doing, like the
checkbook or laundry.
These examples can help to illustrate the concepts of blocking and non-blocking
code. Let’s think about this in programming terms. In this example, you’re like the
CPU. While you’re moving the laundry around, you (the CPU) are busy and blocked
from doing other work, like balancing the checkbook. But that’s okay because the
task is relatively quick.
On the other hand, starting the washer and dryer does not block you from performing
other tasks. It’s an asynchronous function because you don’t have to wait for it to
finish. Once it’s started, you can go back to something else. This is called a context
switch: the context of what you’re doing has changed, and the machine’s buzzer will
notify you sometime in the future when the laundry task is complete.
As a human, this is how you work all the time. You naturally juggle multiple things at
once, often without thinking about it. As a developer, the trick is how to translate this
kind of behavior into code that does the same kind of thing.
Now, you can re-prioritize the tasks any way you want, but only one of them would
happen at any given time. This is the result of a synchronous, step-by-step approach.
Like the synchronous web server described above, this would work, but it might not
be the best way to live. The parent wouldn’t be able to complete any other tasks until
the kids fell asleep. All other tasks would happen afterward, well into the night. (A
couple of weeks of this and many real parents might jump out the window!)
Let’s make the polling interval something like fifteen minutes. Now, every fifteen
minutes the parent checks to see if the washer, dryer, or kids need any attention. If
not, then the parent can go back to work on the checkbook. However, if any of those
tasks do need attention, then the parent will take care of it before going back to the
checkbook. This cycle continues until the next timeout of the polling loop.
This approach works as well since multiple tasks are getting attention. However,
there are a couple of problems:
1. The parent may spend a lot of time checking on things that don’t need
attention: The washer and dryer haven’t yet finished, and the kids don’t need
any attention unless something unexpected happens.
2. The parent may miss completed tasks that do need attention: For instance,
if the washer finished its cycle at the beginning of the polling interval, then it
wouldn’t get any attention for up to fifteen minutes! What’s more, watching the
kids is supposedly the highest priority task. They couldn’t tolerate fifteen
minutes with no attention when something might be going drastically wrong.
You could address these issues by shortening the polling interval, but now your
parent (the CPU) would be spending more time context switching between tasks.
This is when you start to hit a point of diminishing returns. (Once again, a couple of
weeks living like this and, well… See the previous comment about windows and
jumping.)
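The polling trade-off can be sketched in a few lines of Python. This is a minimal simulation rather than code from the tutorial's repository: the names poll_chores, finish_times, and interval are hypothetical, and time is counted in simulated minutes instead of real ones.

```python
def poll_chores(finish_times, interval):
    """Simulate a parent who checks every chore once per polling interval.

    finish_times maps a chore name to the simulated minute at which it
    starts needing attention. Returns how long each chore waited.
    """
    waits = {}
    pending = dict(finish_times)
    now = 0
    while pending:
        now += interval  # Work on the checkbook until the next poll
        for chore, finished_at in list(pending.items()):
            if finished_at <= now:
                waits[chore] = now - finished_at
                del pending[chore]
    return waits


if __name__ == "__main__":
    chores = {"washer": 1, "dryer": 29, "kids": 16}
    print(poll_chores(chores, interval=15))
    # → {'washer': 14, 'dryer': 1, 'kids': 14}
```

With a fifteen-minute interval, the washer and the kids each wait fourteen simulated minutes. Shortening the interval reduces the wait but increases the number of polls, which is the diminishing-returns problem described above.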
If you think of each task as a part of one program, then you can separate them and
run them as threads. In other words, you can “clone” the parent, creating one
instance for each task: watching the kids, monitoring the washer, monitoring the
dryer, and balancing the checkbook. All of these “clones” are running independently.
This sounds like a pretty nice solution, but there are some issues here as well. One
is that you’ll have to explicitly tell each parent instance what to do in your program.
This can lead to some problems since all instances share everything in the program
space.
For example, say that Parent A is monitoring the dryer. Parent A sees that the
clothes are dry, so they take control of the dryer and begin unloading the clothes. At
the same time, Parent B sees that the washer is done, so they take control of the
washer and begin removing clothes. However, Parent B also needs to take control of
the dryer so they can put the wet clothes inside. This can’t happen, because Parent
A currently has control of the dryer.
After a short while, Parent A has finished unloading clothes. Now they want to take
control of the washer and start moving clothes into the empty dryer. This can’t
happen, either, because Parent B currently has control of the washer!
These two parents are now deadlocked. Both have control of their own resource and
want control of the other resource. They’ll wait forever for the other parent instance to
release control. As the programmer, you’d have to write code to work this situation
out.
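The washer and dryer standoff can be reproduced with two threads and two locks. The sketch below is an illustration under assumptions, not part of the tutorial's example code: run_deadlock_demo and its internals are hypothetical names, two barriers force both parents to hold one machine before reaching for the other, and a timeout stands in for what would otherwise be an endless wait.

```python
import threading


def run_deadlock_demo(timeout=0.1):
    """Return, per parent, whether the second machine could be acquired."""
    washer = threading.Lock()
    dryer = threading.Lock()
    both_holding = threading.Barrier(2)  # Each parent holds one machine
    both_tried = threading.Barrier(2)    # Nobody lets go until both have tried
    got_second = {}

    def parent(name, first, second):
        with first:
            both_holding.wait()
            # A real deadlock would block here forever; the timeout
            # lets this demonstration finish.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            both_tried.wait()
            if acquired:
                second.release()

    threads = [
        threading.Thread(target=parent, args=("Parent A", dryer, washer)),
        threading.Thread(target=parent, args=("Parent B", washer, dryer)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return got_second


if __name__ == "__main__":
    print(run_deadlock_demo())
```

Both parents report False for the second machine. One common way out is to have every thread acquire the locks in the same order, so no thread can hold one lock while waiting on a thread that holds the other.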
Remember, these two parent instances are working within the same program. The
family checking account is a shared resource, so you’d have to work out a way for
the child-watching parent to inform the checkbook-balancing parent. Otherwise,
you’d need to provide some kind of locking mechanism so that the checkbook
resource can only be used by one parent at a time, with updates.
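A locking mechanism of the kind described here might look like the following sketch. The Checkbook class, spend(), and run_checkbook_demo() are hypothetical names invented for illustration, and the amounts are arbitrary.

```python
import threading


class Checkbook:
    """Hypothetical shared resource guarded by a lock."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        with self._lock:  # Only one parent may update the balance at a time
            self.balance -= amount


def spend(checkbook, amount, times):
    for _ in range(times):
        checkbook.withdraw(amount)


def run_checkbook_demo():
    checkbook = Checkbook(5000)
    parents = [
        threading.Thread(target=spend, args=(checkbook, 1, 1000)),
        threading.Thread(target=spend, args=(checkbook, 2, 1000)),
    ]
    for parent in parents:
        parent.start()
    for parent in parents:
        parent.join()
    return checkbook.balance


if __name__ == "__main__":
    print(run_checkbook_demo())  # → 2000
```

The lock makes each read-modify-write of the balance a single unit, so the two parent threads can't interleave in the middle of an update.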
All of the examples in this article have been tested with Python 3.7.2. The
requirements.txt file in the example code repository indicates which modules you’ll
need to install to run all the examples.
You also might want to set up a Python virtual environment to run the code so you
don’t interfere with your system Python.
Synchronous Programming
This first example shows a somewhat contrived way of having a task retrieve work
from a queue and process that work. A queue in Python is a nice FIFO (first in first
out) data structure. It provides methods to put things in a queue and take them out
again in the order they were inserted.
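As a quick illustration of that FIFO behavior, here is a minimal sketch using the standard library's queue.Queue. The item names are made up for the example.

```python
import queue

work_queue = queue.Queue()

# Items go in one at a time...
for item in ["first", "second", "third"]:
    work_queue.put(item)

# ...and come back out in the same order they were inserted
retrieved = []
while not work_queue.empty():
    retrieved.append(work_queue.get())

print(retrieved)  # → ['first', 'second', 'third']
```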
In this case, the work is to get a number from the queue and have a loop count up to
that number. It prints to the console when the loop begins, and again to output the
total. This program demonstrates one way for multiple synchronous tasks to process
the work in a queue.
The program named example_1.py in the repository is listed in full below:
Python
 1 import queue
 2
 3 def task(name, work_queue):
 4     if work_queue.empty():
 5         print(f"Task {name} nothing to do")
 6     else:
 7         while not work_queue.empty():
 8             count = work_queue.get()
 9             total = 0
10             print(f"Task {name} running")
11             for x in range(count):
12                 total += 1
13             print(f"Task {name} total: {total}")
14
15 def main():
16     """
17     This is the main entry point for the program
18     """
19     # Create the queue of work
20     work_queue = queue.Queue()
21
22     # Put some work in the queue
23     for work in [15, 10, 5, 2]:
24         work_queue.put(work)
25
26     # Create some synchronous tasks
27     tasks = [(task, "One", work_queue), (task, "Two", work_queue)]
28
29     # Run the tasks
30     for t, n, q in tasks:
31         t(n, q)
32
33 if __name__ == "__main__":
34     main()
• Line 1 imports the queue module. This is where the program stores work to be
done by the tasks.
• Lines 3 to 13 define task(). This function pulls work out of work_queue and
processes the work until there isn’t any more to do.
• Line 15 defines main() to run the program tasks.
• Line 20 creates the work_queue. All tasks use this shared resource to retrieve
work.
• Lines 23 to 24 put work in work_queue. In this case, it’s just an arbitrary list of
counts for the tasks to process.
• Line 27 creates a list of task tuples, with the parameter values those tasks will
be passed.
• Lines 30 to 31 iterate over the list of task tuples, calling each one and passing
the previously defined parameter values.
• Line 34 calls main() to run the program.
The task in this program is just a function accepting a string and a queue as
parameters. When executed, it looks for anything in the queue to process. If there is
work to do, then it pulls values off the queue, starts a for loop to count up to that
value, and outputs the total at the end. It continues getting work off the queue until
there is nothing left and it exits.
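When this program is run, it produces the output you see below:
Shell
Task One running
Task One total: 15
Task One running
Task One total: 10
Task One running
Task One total: 5
Task One running
Task One total: 2
Task Two nothing to do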
This shows that Task One does all the work. The while loop that Task One hits within
task() consumes all the work on the queue and processes it. When that loop exits,
Task Two gets a chance to run. However, it finds that the queue is empty, so Task Two
prints a statement that says it has nothing to do and then exits. There’s nothing in the
code to allow both Task One and Task Two to switch contexts and work together.
The yield statement turns task() into a generator. A generator function is called just
like any other function in Python, but when the yield statement is executed, control is
returned to the caller of the function. This is essentially a context switch, as control
moves from the generator function to the caller.
The interesting part is that control can be given back to the generator function by
calling next() on the generator. This is a context switch back to the generator
function, which picks up execution with all function variables that were defined before
the yield still intact.
The while loop in main() takes advantage of this when it calls next(t). This
statement restarts the task at the point where it previously yielded. All of this means
that you’re in control when the context switch happens: when the yield statement is
executed in task().
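The hand-off between a generator and its caller can be seen in isolation with a small sketch. The running_total() generator and demo() helper below are hypothetical names used only for illustration. They show that the local variable total survives each pause at yield.

```python
def running_total(values):
    """Yield the running sum after each value is added."""
    total = 0
    for value in values:
        total += value
        yield total  # Pause here; total is still intact on the next next()


def demo():
    gen = running_total([15, 10, 5])
    seen = [next(gen), next(gen), next(gen)]  # Each next() resumes the generator
    try:
        next(gen)
    except StopIteration:
        # Raised once the generator function runs off its end
        seen.append("finished")
    return seen


if __name__ == "__main__":
    print(demo())  # → [15, 25, 30, 'finished']
```

The StopIteration exception is how the caller learns that the generator has no more work, which is the signal the example program's main() uses to retire a task.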
Python
import queue

def task(name, queue):
    while not queue.empty():
        count = queue.get()
        total = 0
        print(f"Task {name} running")
        for x in range(count):
            total += 1
            # Hand control back to the loop in main()
            yield
        print(f"Task {name} total: {total}")

def main():
    """
    This is the main entry point for the program
    """
    # Create the queue of work
    work_queue = queue.Queue()

    # Put some work in the queue
    for work in [15, 10, 5, 2]:
        work_queue.put(work)

    # Create some tasks
    tasks = [task("One", work_queue), task("Two", work_queue)]

    # Run the tasks
    done = False
    while not done:
        for t in tasks:
            try:
                next(t)
            except StopIteration:
                tasks.remove(t)
            if len(tasks) == 0:
                done = True

if __name__ == "__main__":
    main()
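When this program is run, it produces the following output:
Shell
Task One running
Task Two running
Task Two total: 10
Task Two running
Task One total: 15
Task One running
Task Two total: 5
Task One total: 2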
You can see that both Task One and Task Two are running and consuming work from
the queue. This is what’s intended, as both tasks are processing work, and each is
responsible for two items in the queue. This is interesting, but again, it takes quite a
bit of work to achieve these results.
The trick here is using the yield statement, which turns task() into a generator and
performs a context switch. The program uses this context switch to give control to the
while loop in main(), allowing two instances of a task to run cooperatively.
Notice how Task Two outputs its total first. This might lead you to think that the tasks
are running asynchronously. However, this is still a synchronous program. It’s
structured so the two tasks can trade contexts back and forth. The reason why Task
Two outputs its total first is that it’s only counting to 10, while Task One is counting to
15. Task Two simply arrives at its total first, so it gets to print its output to the console
before Task One.
Note: All of the example code that follows from this point uses a module called
codetiming to time and output how long sections of code took to execute. There
is a great article on Real Python that goes into depth about the codetiming
module and how to use it.
This module is available on the Python Package Index and is built by Geir Arne
Hjelle, who is part of the Real Python team. Geir Arne has been a great help to
me reviewing and suggesting things for this article. If you are writing code that
needs to include timing functionality, Geir Arne’s codetiming module is well
worth looking at.
To make the codetiming module available for the examples that follow, you’ll
need to install it. You can do this with pip, using either pip install codetiming
or pip install -r requirements.txt. The requirements.txt file is part of the
example code repository.
A blocking call is code that stops the CPU from doing anything else for some period
of time. In the thought experiments above, if a parent wasn’t able to break away from
balancing the checkbook until it was complete, that would be a blocking call.
time.sleep(delay) does the same thing in this example, because the CPU can’t do
anything else but wait for the delay to expire.
Python
 1 import time
 2 import queue
 3 from codetiming import Timer
 4
 5 def task(name, queue):
 6     timer = Timer(text=f"Task {name} elapsed time: {{:.1f}}")
 7     while not queue.empty():
 8         delay = queue.get()
 9         print(f"Task {name} running")
10         timer.start()
11         time.sleep(delay)
12         timer.stop()
13         yield
14
15 def main():
16     """
17     This is the main entry point for the program
18     """
19     # Create the queue of work
20     work_queue = queue.Queue()
21
22     # Put some work in the queue
23     for work in [15, 10, 5, 2]:
24         work_queue.put(work)
25
26     tasks = [task("One", work_queue), task("Two", work_queue)]
27
28     # Run the tasks
29     done = False
30     with Timer(text="\nTotal elapsed time: {:.1f}"):
31         while not done:
32             for t in tasks:
33                 try:
34                     next(t)
35                 except StopIteration:
36                     tasks.remove(t)
37                 if len(tasks) == 0:
38                     done = True
39
40 if __name__ == "__main__":
41     main()
• Line 1 imports the time module to give the program access to time.sleep().
• Line 3 imports the Timer code from the codetiming module.
• Line 6 creates the Timer instance used to measure the time taken for each
iteration of the task loop.
• Line 10 starts the timer instance.
• Line 11 changes task() to include a time.sleep(delay) to mimic an IO delay.
This replaces the for loop that did the counting in example_1.py.
• Line 12 stops the timer instance and outputs the elapsed time since
timer.start() was called.
• Line 30 creates a Timer context manager that will output the elapsed time the
entire while loop took to execute.
When you run this program, each task prints a line when it starts running and another
with its elapsed time, followed by the total elapsed time for the whole run.
As before, both Task One and Task Two are running, consuming work from the queue
and processing it. However, even with the addition of the delay, you can see that
cooperative concurrency hasn’t gotten you anything. The delay stops the processing
of the entire program, and the CPU just waits for the IO delay to be over.
This is exactly what’s meant by blocking code in the Python async documentation. You’ll
notice that the time it takes to run the entire program is just the cumulative time of all
the delays. Running tasks this way is not a win.
The time and queue modules have been replaced with the asyncio package. This
gives your program access to asynchronous-friendly (non-blocking) sleep and queue
functionality. The change to task() defines it as asynchronous with the addition of the
async prefix on line 4. This indicates to Python that the function will be asynchronous.
The other big change is removing the time.sleep(delay) and yield statements, and
replacing them with await asyncio.sleep(delay). This creates a non-blocking delay
that will perform a context switch back to the event loop.
The while loop inside main() no longer exists. Instead of looping over the tasks list,
there’s a call to await asyncio.gather(...). This tells asyncio two things: to create
two tasks based on task() and start running them, and to wait for both of them to
complete before moving forward.
The last line of the program, asyncio.run(main()), runs main(). This creates what’s
known as an event loop. It’s this loop that will run main(), which in turn will run the
two instances of task().
The event loop is at the heart of the Python async system. It runs all the code,
including main(). When task code is executing, the CPU is busy doing work. When
the await keyword is reached, a context switch occurs, and control passes back to
the event loop. The event loop looks at all the tasks waiting for an event (in this case,
an asyncio.sleep(delay) timeout) and passes control to a task with an event that’s
ready.
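The order in which the event loop resumes tasks can be observed with a small sketch. The worker() and interleave_demo() coroutines and the delay values below are hypothetical, chosen only so that the second task's timeout expires before the first one's.

```python
import asyncio


async def worker(name, delay, log):
    log.append(f"{name} start")
    await asyncio.sleep(delay)  # Context switch: control returns to the event loop
    log.append(f"{name} done")


async def interleave_demo():
    log = []
    await asyncio.gather(
        worker("slow", 0.2, log),
        worker("fast", 0.1, log),
    )
    return log


if __name__ == "__main__":
    print(asyncio.run(interleave_demo()))
    # → ['slow start', 'fast start', 'fast done', 'slow done']
```

Both workers start before either finishes, and the event loop resumes whichever task's sleep expires first, regardless of the order in which they were started.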
Python
 1 import asyncio
 2 from codetiming import Timer
 3
 4 async def task(name, work_queue):
 5     timer = Timer(text=f"Task {name} elapsed time: {{:.1f}}")
 6     while not work_queue.empty():
 7         delay = await work_queue.get()
 8         print(f"Task {name} running")
 9         timer.start()
10         await asyncio.sleep(delay)
11         timer.stop()
12
13 async def main():
14     """
15     This is the main entry point for the program
16     """
17     # Create the queue of work
18     work_queue = asyncio.Queue()
19
20     # Put some work in the queue
21     for work in [15, 10, 5, 2]:
22         await work_queue.put(work)
23
24     # Run the tasks
25     with Timer(text="\nTotal elapsed time: {:.1f}"):
26         await asyncio.gather(
27             asyncio.create_task(task("One", work_queue)),
28             asyncio.create_task(task("Two", work_queue)),
29         )
30
31 if __name__ == "__main__":
32     asyncio.run(main())
When you run this program, notice how both Task One and Task Two start at the same
time, then wait at the mock IO call.
This indicates that await asyncio.sleep(delay) is non-blocking, and that other work
is being done.
At the end of the program, you’ll notice the total elapsed time is essentially half the
time it took for example_3.py to run. That’s the advantage of a program that uses
Python async features! Each task was able to run await asyncio.sleep(delay) at the
same time. The total execution time of the program is now less than the sum of its
parts. You’ve broken away from the synchronous model!
The program has been modified to import the wonderful requests module to make
the actual HTTP requests. Also, the queue now contains a list of URLs, rather than
numbers. In addition, task() no longer increments a counter. Instead, requests gets
the contents of a URL retrieved from the queue, and prints how long it took to do so.
Python
 1 import queue
 2 import requests
 3 from codetiming import Timer
 4
 5 def task(name, work_queue):
 6     timer = Timer(text=f"Task {name} elapsed time: {{:.1f}}")
 7     with requests.Session() as session:
 8         while not work_queue.empty():
 9             url = work_queue.get()
10             print(f"Task {name} getting URL: {url}")
11             timer.start()
12             session.get(url)
13             timer.stop()
14             yield
15
16 def main():
17     """
18     This is the main entry point for the program
19     """
20     # Create the queue of work
21     work_queue = queue.Queue()
22
23     # Put some work in the queue
24     for url in [
25         "http://google.com",
26         "http://yahoo.com",
27         "http://linkedin.com",
28         "http://apple.com",
29         "http://microsoft.com",
30         "http://facebook.com",
31         "http://twitter.com",
32     ]:
33         work_queue.put(url)
34
35     tasks = [task("One", work_queue), task("Two", work_queue)]
36
37     # Run the tasks
38     done = False
39     with Timer(text="\nTotal elapsed time: {:.1f}"):
40         while not done:
41             for t in tasks:
42                 try:
43                     next(t)
44                 except StopIteration:
45                     tasks.remove(t)
46                 if len(tasks) == 0:
47                     done = True
48
49 if __name__ == "__main__":
50     main()
• Line 2 imports requests, which provides a convenient way to make HTTP calls.
• Line 3 imports the Timer code from the codetiming module.
• Line 6 creates the Timer instance used to measure the time taken for each
iteration of the task loop.
• Line 11 starts the timer instance.
• Line 12 introduces a delay, similar to example_3.py. However, this time it calls
session.get(url), which returns the contents of the URL retrieved from
work_queue.
• Line 13 stops the timer instance and outputs the elapsed time since
timer.start() was called.
• Lines 24 to 33 put the list of URLs into work_queue.
• Line 39 creates a Timer context manager that will output the elapsed time the
entire while loop took to execute.
When you run this program, each task prints the URL it’s getting and how long the
request took, followed by the total elapsed time.
Just like in earlier versions of the program, yield turns task() into a generator. It also
performs a context switch that lets the other task instance run.
Each task gets a URL from the work queue, retrieves the contents of the page, and
reports how long it took to get that content.
As before, yield allows both your tasks to run cooperatively. However, since this
program is running synchronously, each session.get() call blocks the CPU until the
page is retrieved. Note the total time it took to run the entire program at the end.
This will be meaningful for the next example.
The tasks here have been modified to remove the yield call since the code to make
the HTTP GET call is no longer blocking. Awaiting that call also performs a context
switch back to the event loop.
Python
import asyncio
import aiohttp
from codetiming import Timer

async def task(name, work_queue):
    timer = Timer(text=f"Task {name} elapsed time: {{:.1f}}")
    async with aiohttp.ClientSession() as session:
        while not work_queue.empty():
            url = await work_queue.get()
            print(f"Task {name} getting URL: {url}")
            timer.start()
            # Non-blocking GET: awaiting lets the other task run
            async with session.get(url) as response:
                await response.text()
            timer.stop()

async def main():
    """
    This is the main entry point for the program
    """
    # Create the queue of work
    work_queue = asyncio.Queue()

    # Put some work in the queue
    for url in [
        "http://google.com",
        "http://yahoo.com",
        "http://linkedin.com",
        "http://apple.com",
        "http://microsoft.com",
        "http://facebook.com",
        "http://twitter.com",
    ]:
        await work_queue.put(url)

    # Run the tasks
    with Timer(text="\nTotal elapsed time: {:.1f}"):
        await asyncio.gather(
            asyncio.create_task(task("One", work_queue)),
            asyncio.create_task(task("Two", work_queue)),
        )

if __name__ == "__main__":
    asyncio.run(main())
When you run this program, each task again prints the URL it’s getting and the elapsed
time for that request, followed by the total elapsed time.
Take a look at the total elapsed time, as well as the individual times to get the
contents of each URL. You’ll see that the duration is about half the cumulative time of
all the HTTP GET calls. This is because the HTTP GET calls are running
asynchronously. In other words, you’re effectively taking better advantage of the CPU
by allowing it to make multiple requests at once.
Because the CPU is so fast, this example could likely create as many tasks as there
are URLs. In this case, the program’s run time would be that of the single slowest
URL retrieval.
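The one-task-per-URL idea can be sketched without touching the network. In the example below, fake_fetch() is a hypothetical stand-in for an HTTP GET, and the per-URL delays are invented numbers rather than measured response times.

```python
import asyncio
import time


async def fake_fetch(url, delay):
    """Stand-in for an HTTP GET; delay is a made-up response time."""
    await asyncio.sleep(delay)
    return url


async def fetch_all(delays):
    # One task per URL, so every simulated request waits at the same time
    tasks = [
        asyncio.create_task(fake_fetch(url, delay))
        for url, delay in delays.items()
    ]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    delays = {
        "http://google.com": 0.3,
        "http://yahoo.com": 0.1,
        "http://apple.com": 0.2,
    }
    start = time.perf_counter()
    pages = asyncio.run(fetch_all(delays))
    elapsed = time.perf_counter() - start
    print(pages)
    print(f"Elapsed: {elapsed:.1f} seconds")
```

The elapsed time should land close to the slowest simulated request (0.3 seconds) rather than the 0.6-second sum of all three. A real program would swap fake_fetch() for an aiohttp request like the one in the example above.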
Conclusion
This article has given you the tools you need to start making asynchronous
programming techniques a part of your repertoire. Using Python async features gives
you programmatic control of when context switches take place. This means that
many of the tougher issues you might see in threaded programming are easier to
deal with.
Asynchronous programming is a powerful tool, but it isn’t useful for every kind of
program. If you’re writing a program that calculates pi to the millionth decimal place,
for instance, then asynchronous code won’t help you. That kind of program is CPU
bound, without much IO. However, if you’re trying to implement a server or a program
that performs IO (like file or network access), then using Python async features could
make a huge difference.
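To see why CPU-bound work gains nothing, consider this minimal sketch. The count_up() and cpu_bound_demo() coroutines are hypothetical names. Because the counting loop never reaches an await, the event loop has no opportunity to switch tasks.

```python
import asyncio


async def count_up(name, limit, log):
    log.append(f"{name} start")
    total = 0
    for _ in range(limit):  # Pure computation: no await, so no context switch
        total += 1
    log.append(f"{name} total: {total}")


async def cpu_bound_demo():
    log = []
    await asyncio.gather(
        count_up("One", 100_000, log),
        count_up("Two", 100_000, log),
    )
    return log


if __name__ == "__main__":
    print(asyncio.run(cpu_bound_demo()))
    # → ['One start', 'One total: 100000', 'Two start', 'Two total: 100000']
```

Task One runs to completion before Task Two even starts, so the program behaves just like the synchronous version.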
Now that you’re equipped with these powerful skills, you can take your programs to
the next level!