https://medium.com/@marcnealer/a-beginners-guide-to-async-programming-in-python-0cb265dcfe4a

# A Beginner's Guide to Async Programming in Python

[Marc Nealer](https://medium.com/@marcnealer)

There is a lot of hype around async programming and how it makes all of our code run faster, but what exactly is it? In this article, I'm going to explain, in simpler terms than most documentation, what async programming is and how it works in Python.

Quick overview statement
------------------------

To get us started, I'm going to give a quick overview of what async coding is. When we code, we often perform IO actions: writing to files and databases, reading back, fetching web pages, asking for a response from another application. With all of these, we issue a command and then wait until we get a response. The thread our program is running in is basically doing nothing while it waits. These are known as blocking actions.

If we move to multithreading, the system takes over and moves control between your different programs based on its own metrics. This switching is not linked to the IO actions, so our thread still ends up being blocked. With async programming, we can tell the system when we are waiting for an IO response, and processing can be passed to another routine while we wait. So in basic terms, async programming is where we run multiple routines in the same thread, but we control when execution moves from one to another. This makes the whole thing much more efficient.

Web Server Example
------------------

Since FastAPI is the current shooting star, let's look at how async coding changes the way it works compared to standard Django views. These are my number one and two frameworks. Before you say anything, I know that Django does support async views now, but the two are good packages for showing the differences.

When an HTTP request is made, it triggers a routine to start. This routine reads the request, sends a response, and then finishes.
The server can then go on to the next request. With multiple workers started, this means an HTTP server with Django behind it can process a lot of requests. The problem here is that the processing of each request has to be short and sweet. As Django coders, you will be familiar with making sure your views process fast and with the need to shift longer-running work to a background task manager. This works great and does a fab job in most cases, but then we come to websockets.

Think of a websocket as an HTTP request that doesn't stop when a response is given, but stays in place waiting for more data to be sent or received. Now we have a problem, as a websocket is going to block a thread so other requests can't run. So with Django, we need to install django-channels and have websockets go through a different web server. In simpler terms, Django on its own cannot support websockets.

If we bring in async processing with a framework like FastAPI, things are a little different. The request for the websocket comes in, and the connection is made. The async routine then basically waits for IO, and while it waits, control is passed to other requests. It doesn't block the thread. This also means we don't have to worry so much about how fast the view processes, beyond how long the user is waiting for a response. From this, you can see that FastAPI can process a LOT more requests than Django.

How does Asyncio work
---------------------

In async coding we call functions coroutines, and they cannot be called directly by your code; they must be run inside an async event loop. In simpler terms, we create a special environment where our async routines can be run and monitored. The controlling Python routine finishes when the coroutines are done and the loop closes.
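The websocket scenario can be simulated with nothing but the standard library. In this sketch (the names `websocket_like` and `short_request` are made up for illustration, with `asyncio.sleep` standing in for real network waits), a long-lived coroutine that mostly waits shares one thread with short request handlers, and the short ones still finish promptly:

```python
import asyncio

async def websocket_like(events):
    # stands in for a long-lived websocket handler: mostly waiting for data
    for _ in range(3):
        await asyncio.sleep(0.05)  # "waiting on the socket"
        events.append("ws tick")

async def short_request(events, n):
    # a normal, quick HTTP-style request handler
    await asyncio.sleep(0.01)
    events.append(f"request {n} done")

async def main():
    events = []
    # one thread, one loop: the short requests complete while the
    # long-lived connection is still sitting in its first wait
    await asyncio.gather(
        websocket_like(events),
        *(short_request(events, n) for n in range(3)),
    )
    return events

events = asyncio.run(main())
```

All three short requests appear in `events` before the first "ws tick", even though the websocket-like coroutine started first: its waits hand control back to the loop.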
Here is a simple example:

```
import asyncio

async def my_coroutine():
    for x in range(10):
        print("running")
        await asyncio.sleep(4)

loop = asyncio.get_event_loop()
loop.run_until_complete(my_coroutine())
```

`asyncio.get_event_loop()` creates the loop and `.run_until_complete()` starts the coroutine inside it. The `await` statement is how you say that it's OK to pass control to another coroutine.

A sort of multi-threading example
---------------------------------

Here is an example that you might be more familiar with. Let's say we have a long list of web pages that we need to fetch: a typical web scraping scenario.

```
import aiohttp
import asyncio

urllist = [...]  # assume this is a list of web page URLs

async def fetch_page(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()

async def main():
    tasks = []
    for item in urllist:
        tasks.append(asyncio.create_task(fetch_page(item)))
    for item in tasks:
        await item

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
```

Assume urllist is a list of web page URLs. main() is started inside the loop and, in turn, creates an asyncio task for each page to fetch. A point to note is that main() does not wait for the tasks on its own; it would finish, and so end the whole routine, before all the tasks had run. That is why we await each task at the end. Also note that there is an async context manager in use; you need this when using libraries like aiohttp.

aiohttp is used instead of Requests because Requests is not async and will block the thread while it waits for a response, negating the async coding. There are a lot of async alternative libraries out now, such as aiofiles, aio-pika and others. Motor, for example, is the async library for communicating with MongoDB.

Futures
-------

Futures are something that gets talked about a lot with async, but in truth they are not used much.
They are objects that can be passed to a coroutine and used to monitor when the coroutine has completed. In nearly all cases, futures are created in the background and you don't have to worry about them, so basically don't. Know they are there and move on.

Using Queues
------------

Data can be passed to and from async tasks using queues. Here is the web scraper example again, but this time the URLs are placed in a queue and only four tasks pull pages.

```
import asyncio
import aiohttp

urllist = [...]  # assume this is a list of web page URLs

async def get_webpage(session, url):
    async with session.get(url) as resp:
        return await resp.text()

async def worker(name, queue):
    async with aiohttp.ClientSession() as session:
        while True:
            url = await queue.get()
            if url is None:  # None is the signal to stop
                queue.task_done()
                return
            print(f'Worker {name} started getting {url}')
            content = await get_webpage(session, url)
            print(f'Worker {name} finished getting {url}')
            queue.task_done()

async def main():
    queue = asyncio.Queue()
    workers = []
    for webpage in urllist:
        queue.put_nowait(webpage)
    for i in range(4):
        worker_coro = worker(f'Worker{i + 1}', queue)
        workers.append(asyncio.create_task(worker_coro))
    await queue.join()  # Block until all tasks are done
    for _ in workers:
        queue.put_nowait(None)  # Signal all workers to exit
    await asyncio.gather(*workers)  # Wait for all workers to exit

asyncio.run(main())
```

So main() is run inside a loop; it creates a queue, loads the queue with URLs and then starts four tasks to process them all. This is a better way to do web scraping, as grabbing all the pages at the same time will kinda result in your IP being blocked, as well as being VERY rude!! `asyncio.run()` is the same as `get_event_loop()` and `run_until_complete()` together.

Conclusion
----------

The language wrapped around async programming makes it sound very complex, but in truth it's not. It's really just a better way of multi-processing, especially when you have tasks that will spend long periods doing nothing.
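As an aside, the queue-and-workers pattern shown earlier caps concurrency at four. Another common way to get the same cap is `asyncio.Semaphore`; this is a minimal sketch, with `asyncio.sleep` standing in for the real page fetch and the `stats` bookkeeping added only to show the cap holds:

```python
import asyncio

async def fetch_page(url):
    # asyncio.sleep stands in for a real aiohttp request
    await asyncio.sleep(0.01)
    return url

async def limited_fetch(sem, url, stats):
    async with sem:  # at most 4 coroutines get past this point at once
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        result = await fetch_page(url)
        stats["running"] -= 1
        return result

async def main():
    sem = asyncio.Semaphore(4)
    stats = {"running": 0, "peak": 0}
    urls = [f"url{i}" for i in range(10)]  # stand-in for urllist
    results = await asyncio.gather(
        *(limited_fetch(sem, u, stats) for u in urls)
    )
    return results, stats["peak"]

results, peak = asyncio.run(main())
```

Unlike the queue version, `gather` also hands back the results in the original order, so there is no extra plumbing to collect them.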
With that said, you do need to be much more aware of when your tasks are performing IO. Tortoise ORM, as an example, can be hard to get used to, as you need to be VERY aware of each and every statement that will result in a call to the DB, whereas with Django you don't.
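When you are stuck with a blocking library inside async code, the standard library offers an escape hatch. This sketch uses `time.sleep` to stand in for a synchronous call (Requests, a sync DB driver, and so on) and `asyncio.to_thread()` (available since Python 3.9) to keep the event loop free:

```python
import asyncio
import time

def blocking_call():
    # stands in for a synchronous library call; time.sleep blocks
    # the whole thread, unlike asyncio.sleep
    time.sleep(0.1)
    return "data"

async def main():
    start = time.monotonic()
    # asyncio.to_thread pushes each blocking call onto a worker thread,
    # so the event loop stays free and the two calls overlap
    results = await asyncio.gather(
        asyncio.to_thread(blocking_call),
        asyncio.to_thread(blocking_call),
    )
    return results, time.monotonic() - start

results, elapsed = asyncio.run(main())
```

Run in sequence, the two calls would take about 0.2 seconds; moved onto threads they overlap and the total stays close to 0.1. It is a workaround rather than a fix, though: a true async library is still the better option.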