Async / Concurrency; Messaging and Job Queues (RabbitMQ/Redis/...); Multi-Processing; Multi-Threading; Python
I'm going to talk about the four main levels of parallelism in modern computing:
- multiple (virtual) machines
- multiple processes
- multiple threads
- multiple green threads, aka asyncio
I'll cover why you might use each of them, how to go about doing so with Python, and some of the pitfalls you might fall into along the way.
To do so, I'll give short examples in code of achieving each level:
- leveraging multiple hosts using RQ, and also the possibility of RPC with HTTP
- multiprocessing and threading using their respective modules from the Python standard library
- asyncio demonstrated with AIOHTTP
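As a taste of the kind of short example I have in mind, here's a minimal sketch (standard library only, names are my own) of running the same function via threads and via processes:

```python
import threading
from multiprocessing import Pool


def square(n):
    return n * n


def run_threaded(numbers):
    # Threads share memory, so they can append to a common list directly
    results = []
    lock = threading.Lock()

    def work(n):
        with lock:
            results.append(square(n))

    threads = [threading.Thread(target=work, args=(i,)) for i in numbers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


def run_multiprocess(numbers):
    # Processes don't share memory; Pool.map pickles arguments and results
    with Pool(2) as pool:
        return pool.map(square, numbers)


if __name__ == "__main__":
    print(run_threaded(range(5)))      # [0, 1, 4, 9, 16]
    print(run_multiprocess(range(5)))  # [0, 1, 4, 9, 16]
```

The two calls produce the same answer, which is exactly the point: the interesting differences (shared memory vs. pickling, the GIL) only show up once the work gets heavier.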
That sounds great, but there are "gotchas" you should know about before you get started, for example:
- multiple machines can actually be multiple virtual machines on the same host
- communicating effectively between processes is hard; how can we make it easier?
- the limitations of threading and the GIL
- run_in_executor: do we ever really need to use multiprocessing or threading directly again?
- using asyncio for both networking between hosts and communication between processes means you end up using two different kinds of concurrency at the same time. That can be confusing, but also awesome.
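To illustrate the run_in_executor point above, here's a minimal sketch (standard library only): the event loop hands blocking work off to a pool behind the scenes, so you rarely need to touch threading yourself.

```python
import asyncio
import hashlib


def expensive_hash(data: bytes) -> str:
    # Blocking, CPU-ish work that would stall the event loop if run inline
    return hashlib.sha256(data * 100_000).hexdigest()


async def main() -> str:
    loop = asyncio.get_running_loop()
    # None means "use the loop's default ThreadPoolExecutor"; pass a
    # concurrent.futures.ProcessPoolExecutor instead for truly CPU-bound
    # work, to sidestep the GIL
    return await loop.run_in_executor(None, expensive_hash, b"payload")


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Swapping in a ProcessPoolExecutor is a one-line change, which is why reaching for the multiprocessing or threading modules directly is increasingly rare in asyncio code.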
I'll finish off by showcasing a library I built, arq, which is a job queueing and RPC library for Python that uses asyncio and Redis.
Type: Talk (45 mins); Python level: Beginner; Domain level: Beginner
CTO of a small and boring software company, TutorCruncher.
Big fan of open source, I'm the main maintainer of multiple popular packages including pydantic, arq and aiohttp-devtools. I'm also a core contributor to AIOHTTP and RQ.