israeliqueue — IsraeliQueue implementation in C¶
Source code: __init__.pyx
A Cython implementation of a queue system where each item is associated with a “group.” The queue processes items in groups, ensuring that items belonging to the same group are dequeued together.
The package contains two classes: IsraeliQueue and
AsyncIsraeliQueue.
IsraeliQueue is a synchronous thread-safe queue. It allows you to
wait on the queue and block the current thread until a task is available.
AsyncIsraeliQueue is an asynchronous queue that works with Python’s
asyncio framework. It provides non-blocking, asynchronous methods for queue
operations and is suitable for applications requiring high concurrency and
asynchronous task management.
Both classes have an interface similar to Python’s built-in queue.Queue and asyncio.Queue classes.
What is an Israeli Queue?¶
An Israeli Queue is a type of priority queue where tasks are grouped together in the same priority. Adding new tasks to the queue will cause them to skip the line and group together. The tasks will then be taken out in-order, group by group.
Why is this useful?¶
IsraeliQueues enjoy many benefits from processing grouped tasks in batches. For example, imagine a bot or an API that requires logging in to a remote repository in order to bring files:
def login(repo_name: str):
return Session("repo_name") # Expensive operation
def download_file(session: Session, filename: str):
return Session.download(filename)
def logout(session: Session):
session.logout
Now, we have a thread or an asyncio task that adds files to download to the queue:
from israeliqueue import IsraeliQueue
queue = IsraeliQueue()
queue.put("cpython", "build.bat")
queue.put("black", "pyproject.toml")
queue.put("black", "bar") # Same repo as the second item
queue.put("cpython", "index.html") # Same repository as the first item
An ordinary queue will cause our bot to login and logout four times, processing each item individually. The IsraeliQueue groups the repositories together, saving setup costs and allowing to download them all in the same request:
while True:
group, items = queue.get_group()
session = login(group)
for item in items:
download_file(session, item)
logout(session)
If the downloading process accepts multiple files at once, it’s even more efficient:
session.download_files(*items)
Other uses may include batching together AWS queries, batching numpy calculations, and plenty more!
Quickstart¶
Installation¶
To install the package, simply pip install cisraeliqueue.
You can use the classes in your project as follows:
from israeliqueue import IsraeliQueue, AsyncIsraeliQueue
Synchronous Example¶
from israeliqueue import IsraeliQueue
# Initialize the queue
queue = IsraeliQueue(maxsize=10)
# Add items to the queue
queue.put('group1', 'task1')
queue.put('group1', 'task2')
queue.put('group2', 'task3')
# Get items from the queue
group, task = queue.get()
print(f"Processing {task} from {group}")
# Get all items from the same group
group, tasks = queue.get_group()
print(f"Processing all tasks from {group}: {tasks}")
# Mark the task as done
queue.task_done()
# Wait for all tasks to complete
queue.join()
Asynchronous Example¶
import asyncio
from israeliqueue import AsyncIsraeliQueue
async def main():
# Initialize the queue
queue = AsyncIsraeliQueue(maxsize=10)
# Add items to the queue
await queue.put('group1', 'task1')
await queue.put('group1', 'task2')
await queue.put('group2', 'task3')
# Get items from the queue
group, task = await queue.get()
print(f"Processing {task} from {group}")
# Get all items from the same group
group, tasks = await queue.get_group()
print(f"Processing all tasks from {group}: {tasks}")
# Mark the task as done
queue.task_done()
# Wait for all tasks to complete
await queue.join()
# Run the async example
asyncio.run(main())
Classes¶
- class israeliqueue.IsraeliQueue[GT, VT](maxsize=None)¶
This is the synchronous implementation of the Israeli Queue. It provides group-based task processing and supports both blocking and non-blocking queue operations. The class is thread-safe and can be used in multithreaded environments.
GT is the type of the group key. It must be a
Hashableobject. VT is the type of the value stored in the queue.An
IsraeliQueuehas the following attributes:- maxsize¶
The maximum number of items that can be placed in the queue. If the queue is full, any further attempts to add items will block until space becomes available. By default, the queue has no size limit (
maxsize=None).
An
IsraeliQueueinstance has the following methods:- put(group, value, /, *[, timeout])¶
Put a value into the queue. If the queue is full, this method will block until space becomes available.
group is any
Hashableobject that represents the group to which the value belongs.value is the task to be added to the queue.
timeout is an optional parameter that specifies the maximum time in seconds to wait for space to become available.
If the queue is full and the timeout is reached, a
Fullexception is raised.
- put_nowait(group, value, /)¶
Put a value into the queue without blocking. If the queue is full, a
Fullexception is raised.Equivalent to
put(group, value, timeout=0).
- get(*[, timeout])¶
Remove and return a
(group, value)tuple from the queue. If the queue is empty, this method will block until a value becomes available.Consecutive calls to
get()will return as many items as possible from the same group, even if they were added after the last item of the group was taken out. That enables the following constructs:shared_resource = None while True: group, task = queue.get() if group != shared_resource: sleep(5) # Simulate a long operation shared_resource = group # Switch to a different resource # We can use queue.put(group, <different_task>) safetly within # task.run(). The tasks will still prioritize the same group. task.run()
timeout is an optional parameter that specifies the maximum time in seconds to wait for a value to become available.
If the queue is empty and the timeout is reached, an
Emptyexception is raised.
- get_nowait()¶
Remove and return an value from the queue without blocking. If the queue is empty, an
Emptyexception is raised.Equivalent to
get(timeout=0).
- get_group(*[, timeout])¶
Remove and return a
(group, (values, ...))tuple from the queue. If the queue is empty, this method will block until a value becomes available.The next group taken out will be different if possible, allowing to effectively batch operations:
shared_resource = None while True: group, tasks = await queue.get_group() shared_resource = group # Switch to a different resource for task in tasks: # We can use queue.put(group, <different_task>) safetly # within task.run(). The next get_group() will bring a # different group with a potentially higher item count. task.run()
timeout is an optional parameter that specifies the maximum time in seconds to wait for a value to become available.
If the queue is empty and the timeout is reached, an
Emptyexception is raised.
- get_group_nowait()¶
Remove and return a
(group, (values, ...))tuple from the queue without blocking. If the queue is empty, anEmptyexception is raised.Equivalent to
get_group(timeout=0).
- task_done()¶
Indicate that a formerly enqueued task is complete. Used by queue consumers. For each
put()used to enqueue a task, a subsequent call totask_done()tells the queue that the processing on the task is complete.
- join(*[, timeout])¶
Blocks until all tasks in the queue are done. If timeout is specified, the method will block for at most timeout seconds.
- qsize()¶
Return the number of items in the queue.
- empty()¶
Return
Trueif the queue is empty,Falseotherwise.
- full()¶
Return
Trueif the queue is full,Falseotherwise.
- class israeliqueue.AsyncIsraeliQueue[GT, VT](maxsize=None)¶
This is the asynchronous implementation of the Israeli Queue. It provides group-based task processing and supports non-blocking, asynchronous queue operations. The class is designed to work with Python’s
asyncioframework.GT is the type of the group key. It must be a
Hashableobject. VT is the type of the value stored in the queue.An
AsyncIsraeliQueuehas the following attributes:- maxsize¶
The maximum number of items that can be placed in the queue. If the queue is full, any further attempts to add items will block until space becomes available. By default, the queue has no size limit (
maxsize=None).
An
AsyncIsraeliQueueinstance has the following methods:- async put(group, value, /)¶
Put a value into the queue. If the queue is full, this method will block until space becomes available.
group is any
Hashableobject that represents the group to which the value belongs.value is the task to be added to the queue.
If you wish to specify a timeout, use the
wait_for()function.
- put_nowait(group, value, /)¶
Put a value into the queue without blocking. If the queue is full, a
Fullexception is raised.
- async get()¶
Remove and return a
(group, value)tuple from the queue. If the queue is empty, this method will block until a value becomes available.Consecutive calls to
get()will return as many items as possible from the same group, even if they were added after the last item of the group was taken out. That enables the following constructs:shared_resource = None while True: group, task = await queue.get() if group != shared_resource: sleep(5) # Simulate a long operation shared_resource = group # Switch to a different resource # We can use queue.put(group, <different_task>) safetly within # task.run(). The tasks will still prioritize the same group. task.run()
If you wish to specify a timeout, use the
wait_for()function.
- get_nowait()¶
Remove and return an value from the queue without blocking. If the queue is empty, an
Emptyexception is raised.
- async get_group()¶
Remove and return a
(group, (values, ...))tuple from the queue. If the queue is empty, this method will block until a value becomes available.The next group taken out will be different if possible, allowing to effectively batch operations:
shared_resource = None while True: group, tasks = await queue.get_group() shared_resource = group # Switch to a different resource for task in tasks: # We can use queue.put(group, <different_task>) safetly # within task.run(). The next get_group() will bring a # different group with a potentially higher item count. task.run()
If you wish to specify a timeout, use the
wait_for()function.
- get_group_nowait()¶
Remove and return a
(group, (values, ...))tuple from the queue without blocking. If the queue is empty, anEmptyexception is raised.
- task_done()¶
Indicate that a formerly enqueued task is complete. Used by queue consumers. For each
put()used to enqueue a task, a subsequent call totask_done()tells
- async join()¶
Blocks until all tasks in the queue are done.
If you wish to specify a timeout, use the
wait_for()function.
- qsize()¶
Return the number of items in the queue.
- empty()¶
Return
Trueif the queue is empty,Falseotherwise.
- full()¶
Return
Trueif the queue is full,Falseotherwise.
Exceptions¶
- exception israeliqueue.Full¶
Raised when the queue is full and a new item cannot be added.
- exception israeliqueue.Empty¶
Raised when the queue is empty and an item cannot be retrieved.
Complexity¶
put / put_nowait: O(1) - Insertion at the end of the queue.
get / get_nowait: O(1) - Dequeueing from the front of the queue.
get_group / get_group_nowait: O(group) - Dequeueing all items from the same group.
task_done: O(1) - Simple bookkeeping to track completed tasks.
join: O(1) - Blocks until all tasks are done.
Benchmarks¶
The following benchmarks were run on a 2020 MacBook Pro. Each consist of 4000 tasks, with 26 unique groups.
python -m timeit -s "from string import ascii_lowercase; from queue import Queue; q = Queue();" "for i in range(4000): q.put((ascii_lowercase[i%26], 'b'))" "for i in range(4000): q.get()"
50 loops, best of 5: 7.13 msec per loop
python -m timeit -s "from string import ascii_lowercase; from queue import PriorityQueue; pq = PriorityQueue();" "for i in range(4000): pq.put((ascii_lowercase[i%26], 'b'))" "for i in range(4000): pq.get()"
50 loops, best of 5: 9.52 msec per loop
python -m timeit -s "from string import ascii_lowercase; from queue import SimpleQueue; sq = SimpleQueue();" "for i in range(4000): sq.put((ascii_lowercase[i%26], 'b'))" "for i in range(4000): sq.get()"
500 loops, best of 5: 654 usec per loop
python -m timeit -s "from string import ascii_lowercase; from israeliqueue import IsraeliQueue; iq = IsraeliQueue();" "for i in range(4000): iq.put(ascii_lowercase[i%26], 'b')" "for i in range(4000): iq.get()"
50 loops, best of 5: 6.61 msec per loop
As we can see, the IsraeliQueue is faster than the built-in PriorityQueue and Queue classes.
That makes sense as the majority of the class is writen in C.
Unlike PriorityQueue the algorithm is O(1) for every operation and so should be fast regardless of the queue size.
The builtin SimpleQueue is the fastest as it is also implemented in C, but it does not support priority queues or any smart handling of queue size and data.
License¶
This project is licensed under the MIT License.