How to build a brokerless task queue in Python with FastWorker
A complete walkthrough of building a distributed Python task queue without Redis, RabbitMQ, or Kafka — using FastWorker, NNG messaging, and 2–3 Python processes.
If you’ve shipped any Python web application, you’ve probably hit the background-job wall. Some work takes too long to run in a request handler — sending email, resizing images, generating reports, calling a third-party API — and you need it to happen somewhere else. The standard playbook is Celery + Redis, which works, but buys you a lot of infrastructure along the way.
This guide walks through building the same thing without any of that infrastructure. By the end you’ll have a distributed task queue running on 2–3 Python processes, with a real web dashboard and native FastAPI integration.
Why brokerless?
A task queue traditionally looks like this:
- Your app produces a task.
- A broker (Redis, RabbitMQ, Kafka, SQS) stores it durably.
- A worker pool polls the broker and consumes tasks.
- A result backend (another Redis, a database) stores outputs.
- A monitoring tool (Flower, Grafana) watches everything.
That’s four or five moving pieces to deploy, monitor, secure, back up, and keep in sync with your Python runtime. For moderate-scale Python services — which is most services — that’s massive overkill.
FastWorker removes the broker and result backend by putting coordination inside a Python process called the control plane. Your app talks to the control plane over NNG (a lightweight nanomsg successor), and the control plane distributes tasks to optional subworkers. No Redis. No broker. No separate result store.
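The request/reply round trip at the heart of this design can be sketched with plain sockets (real FastWorker speaks NNG, and the tiny JSON wire format here is invented purely for illustration):

```python
# Conceptual stand-in for the client -> control-plane request/reply hop.
# FastWorker uses NNG, not raw sockets; the message shape is made up.
import json
import socket
import threading

def control_plane(server_sock: socket.socket) -> None:
    conn, _ = server_sock.accept()
    with conn:
        req = json.loads(conn.recv(4096))
        # "Accept" the task and reply immediately with a task id.
        conn.sendall(json.dumps(
            {"task_id": "t-1", "accepted": req["task"]}).encode())

srv = socket.socket()
srv.bind(("127.0.0.1", 0))   # ephemeral port; FastWorker defaults to 5555
srv.listen(1)
t = threading.Thread(target=control_plane, args=(srv,))
t.start()

cli = socket.socket()
cli.connect(("127.0.0.1", srv.getsockname()[1]))
cli.sendall(json.dumps({"task": "resize_image", "args": ["cat.jpg", 800]}).encode())
reply = json.loads(cli.recv(4096))
print(reply["task_id"])  # t-1

cli.close()
t.join()
srv.close()
```

The point to notice: the producer gets a task id back right away, while execution happens elsewhere. No third service sits between the two processes.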
Prerequisites
- Python 3.12 or later
- A terminal and 10 minutes
```shell
pip install fastworker
```
That’s the entire installation.
Step 1 — Define a task module
Create mytasks.py with any functions you want to run asynchronously. Decorate them with @task:
```python
# mytasks.py
from fastworker import task


@task
def resize_image(path: str, width: int) -> str:
    from PIL import Image

    img = Image.open(path)
    img.thumbnail((width, width * 3))
    out = path.replace(".jpg", f"_{width}.jpg")
    img.save(out)
    return out


@task
def send_welcome_email(user_id: int, email: str) -> bool:
    # your email-sending code
    return True
```
There’s no configuration, no broker URL, no queue name. A task is just a decorated function.
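As a rough mental model (not FastWorker's actual internals), a task decorator can be as small as a name-to-function registry that workers later use to look up what to run:

```python
# Illustrative sketch of what a @task decorator can do: record the function
# under its name so a worker can dispatch by string. Names are hypothetical.
TASK_REGISTRY: dict = {}

def task(fn):
    TASK_REGISTRY[fn.__name__] = fn
    return fn  # unchanged: the function stays directly callable

@task
def add(a: int, b: int) -> int:
    return a + b

print(TASK_REGISTRY["add"](2, 3))  # 5, dispatched by name
print(add(2, 3))                   # 5, still a plain function
```

This is why no queue names or broker URLs appear in the task module: the decorator only needs the function itself.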
Step 2 — Start the control plane
In a terminal, launch the control plane and point it at your task module:
```shell
fastworker control-plane --task-modules mytasks
```
Two things happen:
- The control plane starts listening for client connections on TCP port 5555.
- A built-in web dashboard starts at http://127.0.0.1:8080. Open it and you’ll see live worker status, queue depth by priority, task history, and cache stats. There’s no Flower to bolt on; monitoring ships in the box.
The control plane can also process tasks itself, so this one process is a complete deployment.
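In miniature, a single process that both accepts and executes tasks might look like the sketch below (illustrative only; FastWorker's control plane is far more involved):

```python
# Sketch: one asyncio process that queues work and executes it locally,
# the way a control-plane-only deployment can. Not FastWorker internals.
import asyncio

async def control_plane(inbox: asyncio.Queue, results: dict) -> None:
    while True:
        task_id, fn, args = await inbox.get()
        results[task_id] = fn(*args)   # execute in-process, no subworker
        inbox.task_done()

async def main() -> dict:
    inbox, results = asyncio.Queue(), {}
    worker = asyncio.create_task(control_plane(inbox, results))
    await inbox.put(("t-1", lambda a, b: a + b, (2, 3)))
    await inbox.join()                 # wait until the queue drains
    worker.cancel()
    return results

print(asyncio.run(main()))  # {'t-1': 5}
```

Subworkers then become a throughput knob rather than a deployment requirement.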
Step 3 — (Optional) Add subworkers
If you want more throughput, spin up subworkers in additional terminals. They auto-register with the control plane:
```shell
fastworker subworker \
  --worker-id sw1 \
  --control-plane-address tcp://127.0.0.1:5555 \
  --base-address tcp://127.0.0.1:5561 \
  --task-modules mytasks
```
No reconfiguration. Add another one by bumping the worker id and port; the control plane picks it up automatically.
Step 4 — Submit tasks from Python
The async Client is the main interface. It’s designed to plug directly into FastAPI request handlers:
```python
# app.py
from fastapi import FastAPI
from fastworker import Client

app = FastAPI()
client = Client()


@app.on_event("startup")
async def _startup():
    await client.start()


@app.on_event("shutdown")
async def _shutdown():
    client.stop()


@app.post("/images/{path}")
async def resize(path: str):
    task_id = await client.delay("resize_image", path, 800)
    return {"task_id": task_id}


@app.get("/images/result/{task_id}")
async def result(task_id: str):
    r = await client.get_task_result(task_id)
    if r is None:
        return {"status": "pending"}
    return {"status": r.status, "result": r.result}
```
client.delay is non-blocking: it returns immediately with a task id. Your request handler never waits for the work to finish. That’s exactly what you want for production APIs.
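If a caller does need to wait, a small polling helper keeps handler code clean. This sketch is deliberately generic: fetch stands in for a result lookup like client.get_task_result, and every name in it is illustrative rather than FastWorker API:

```python
# Generic poll-until-done helper. `fetch` is any async callable that
# returns None while pending; names here are hypothetical, not FastWorker API.
import asyncio

async def wait_for_result(fetch, task_id, interval=0.05, timeout=5.0):
    deadline = asyncio.get_running_loop().time() + timeout
    while asyncio.get_running_loop().time() < deadline:
        r = await fetch(task_id)
        if r is not None:
            return r
        await asyncio.sleep(interval)   # back off between polls
    raise TimeoutError(f"task {task_id} still pending after {timeout}s")

async def main():
    # Stub backend: reports "pending" twice, then completes.
    calls = {"n": 0}
    async def fetch(task_id):
        calls["n"] += 1
        return {"status": "success", "result": 42} if calls["n"] >= 3 else None
    return await wait_for_result(fetch, "t-1")

print(asyncio.run(main()))  # {'status': 'success', 'result': 42}
```

For user-facing endpoints, returning the task id and letting the browser poll (as in the /images/result route above) is usually the better pattern.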
Step 5 — Prioritize what matters
FastWorker has four built-in priority levels: critical, high, normal, low. Tasks at higher priorities drain first.
```python
from fastworker.tasks.models import TaskPriority

await client.delay("charge_card", order_id, priority=TaskPriority.CRITICAL)
await client.delay("send_welcome_email", user_id, email,
                   priority=TaskPriority.LOW)
```
No separate queues, no routing rules, no broker config. Just pick a priority and submit.
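The drain-highest-first behavior maps naturally onto a heap. This standalone sketch (not FastWorker internals) shows critical work jumping the line while same-priority tasks stay FIFO:

```python
# Four-level priority draining with a binary heap: lower number drains
# first, mirroring critical > high > normal > low.
import heapq
from itertools import count

CRITICAL, HIGH, NORMAL, LOW = range(4)
seq = count()   # monotonic tie-breaker: FIFO within a priority level
queue = []

def submit(priority: int, name: str) -> None:
    heapq.heappush(queue, (priority, next(seq), name))

submit(LOW, "send_welcome_email")
submit(CRITICAL, "charge_card")
submit(NORMAL, "resize_image")

drained = [heapq.heappop(queue)[2] for _ in range(len(queue))]
print(drained)  # ['charge_card', 'resize_image', 'send_welcome_email']
```

Submission order does not matter; the charge runs first even though the email was queued before it.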
What you get
With those five steps, you have:
- A distributed task queue running on 2–3 Python processes
- A real-time web dashboard
- Priority scheduling
- Automatic worker discovery and load balancing
- Native async FastAPI integration
- A result cache you can poll
- Zero broker infrastructure
What you give up
FastWorker is intentionally scoped. If you need durable persistence (tasks survive a control-plane crash), exactly-once delivery, complex DAG workflows, or multi-language workers, you should reach for Celery, Temporal, or a proper broker. The Limitations document lays it out in detail, and the FastWorker vs Celery comparison digs into the tradeoffs.
For the other 90% of Python services — moderate volume, Python-only, with a team that wants its on-call rotation to stay small — brokerless is the right answer. Try FastWorker for an afternoon and see for yourself.
Frequently asked questions
Do I really need zero external services?
Yes. FastWorker uses NNG (nanomsg-next-generation) over TCP for direct Python-to-Python messaging. A minimal deployment is one control plane process — no Redis, no RabbitMQ, no database.
Can I scale beyond a single machine?
Yes. Subworkers connect to the control plane over TCP and can run on any reachable host. The control plane auto-discovers them and load-balances tasks to the least-loaded node.
How does it compare to Celery?
FastWorker trades features (chains, DAGs, durable persistence) for operational simplicity. It’s best for Python services handling 1K–10K tasks/min that want to avoid running a broker.