Priority queues and load balancing in FastWorker
How FastWorker's four priority levels work, how the control plane dispatches tasks to subworkers, and when to reach for each priority in a FastAPI app.
One reason FastWorker can skip the broker entirely is that it owns task dispatch itself. The control plane knows how many subworkers are connected, how loaded each one is, and the priority of every queued task. That makes priority queuing and load balancing cheap to implement and easy to use.
The four priority levels
FastWorker has exactly four levels, and they’re an enum:
from fastworker.tasks.models import TaskPriority
TaskPriority.CRITICAL # drain first
TaskPriority.HIGH
TaskPriority.NORMAL # default
TaskPriority.LOW # drain last
The control plane drains higher priorities before lower ones. That’s strict priority ordering — a LOW task will wait forever if higher priorities keep arriving. If you want fairness across priorities, don’t use priorities; run multiple control planes per workload instead.
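Strict priority ordering with FIFO behavior within a level can be sketched with Python's heapq. This is an illustration, not FastWorker's source; the TaskPriority stand-in and its numeric values are assumptions made so that lower values drain first.

```python
import heapq
from enum import IntEnum
from itertools import count

# Hypothetical stand-in for fastworker's TaskPriority; lower value drains first.
class TaskPriority(IntEnum):
    CRITICAL = 0
    HIGH = 1
    NORMAL = 2
    LOW = 3

class StrictPriorityQueue:
    """Strict priority ordering: higher priorities always drain first,
    FIFO within a level. LOW starves if higher levels keep arriving."""
    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker preserves FIFO within a level

    def push(self, task, priority=TaskPriority.NORMAL):
        heapq.heappush(self._heap, (priority, next(self._seq), task))

    def pop(self):
        return heapq.heappop(self._heap)[2]

q = StrictPriorityQueue()
q.push("archive_old_logs", TaskPriority.LOW)
q.push("charge_card", TaskPriority.CRITICAL)
q.push("send_receipt_email", TaskPriority.HIGH)
q.push("record_purchase_metric")  # NORMAL by default

drained = [q.pop() for _ in range(4)]
# charge_card drains first, archive_old_logs last
```

Note the starvation property is inherent to the heap: as long as a higher-priority entry exists, a LOW entry can never reach the front.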
Submitting with priority
You pass the priority as a keyword argument when you submit:
from fastworker import Client
from fastworker.tasks.models import TaskPriority
client = Client()
await client.start()
# Hot path — user is waiting
await client.delay("charge_card", order_id,
                   priority=TaskPriority.CRITICAL)

# Transactional but not user-blocking
await client.delay("send_receipt_email", order_id,
                   priority=TaskPriority.HIGH)

# Default — analytics rollup
await client.delay("record_purchase_metric", order_id)

# Background cleanup
await client.delay("archive_old_logs",
                   priority=TaskPriority.LOW)
That’s the entire interface. No queue names, no routing keys, no broker config.
How dispatch actually works
Here’s what happens when you submit a task:
- Client → control plane. Your client.delay(...) call goes to the control plane over an NNG REQ/REP socket. The control plane assigns a task id, stores the task in an in-memory queue, and returns the id immediately.
- Priority sort. The queue is priority-sorted. New CRITICAL tasks jump ahead of pending NORMAL ones.
- Dispatch loop. A dispatcher coroutine picks the next task at the highest available priority and picks a subworker to run it.
- Least-loaded selection. The dispatcher maintains an in-memory load count per subworker (tasks currently assigned). It picks the subworker with the lowest count. Ties break round-robin.
- Worker execution. The task is sent over DEALER/ROUTER to that subworker. If no subworker is available, the control plane executes the task itself.
- Result caching. When the task finishes, its return value goes into the LRU result cache and (if requested) a completion callback fires.
There’s no broker polling, no fair-queueing algorithm, no separate priority queue per level — it’s a single in-memory priority queue with a dispatcher loop.
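The least-loaded step with round-robin tie-breaking can be sketched in a few lines. This is an assumption-laden illustration of the strategy described above, not FastWorker's dispatcher; the class and worker ids are invented for the example.

```python
from itertools import count

class LeastLoadedDispatcher:
    """Sketch: pick the subworker with the fewest assigned tasks,
    rotating the starting point so ties break round-robin."""
    def __init__(self, worker_ids):
        self.load = {w: 0 for w in worker_ids}  # tasks currently assigned
        self._rr = count()  # rotates the scan origin on each pick

    def pick(self):
        workers = list(self.load)
        start = next(self._rr) % len(workers)
        rotated = workers[start:] + workers[:start]
        chosen = min(rotated, key=lambda w: self.load[w])  # least-loaded wins
        self.load[chosen] += 1
        return chosen

    def done(self, worker):
        self.load[worker] -= 1  # called when the subworker reports completion

d = LeastLoadedDispatcher(["w1", "w2"])
first = d.pick()   # both idle: round-robin start picks w1
second = d.pick()  # w1 now busy: least-loaded is w2
d.done(first)      # w1 finishes its task
third = d.pick()   # w1 is idle again and wins on load
```

The rotation matters only when counts are tied; otherwise the load comparison alone decides.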
When to use each priority
Use priorities for latency tiering, not fairness. A good rule of thumb:
- CRITICAL — A human is waiting on this in the current session. Payments, live notifications, session-critical calls. These should be rare and fast.
- HIGH — The user will notice if it’s delayed by minutes. Transactional emails, password resets, real-time cache invalidations.
- NORMAL — The default. Most background work. Indexing, rendering, webhook delivery.
- LOW — Best-effort. Cleanup, analytics rollups, prefetching, compaction.
If more than ~20% of your tasks are CRITICAL, your priorities aren’t priorities — they’re the default. Dial it back.
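The ~20% rule of thumb is a simple ratio you can check against your own submission logs. The sample counts below are made up for illustration.

```python
from collections import Counter

# Hypothetical one-day sample of submitted priorities
submitted = ["CRITICAL"] * 3 + ["HIGH"] * 5 + ["NORMAL"] * 40 + ["LOW"] * 12
counts = Counter(submitted)

critical_share = counts["CRITICAL"] / len(submitted)
# 3 of 60 tasks, i.e. 5% -- comfortably under the ~20% rule of thumb
```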
Load balancing in practice
The dispatcher’s least-loaded strategy works best when your subworkers are homogeneous. If one machine is much faster than another, you can help it along by running more subworker processes on the faster box:
# Fast box
fastworker subworker --worker-id fast-1 --base-address tcp://0.0.0.0:5561 ...
fastworker subworker --worker-id fast-2 --base-address tcp://0.0.0.0:5565 ...
fastworker subworker --worker-id fast-3 --base-address tcp://0.0.0.0:5569 ...
# Slow box
fastworker subworker --worker-id slow-1 --base-address tcp://0.0.0.0:5561 ...
The load count is per subworker process, not per host, so three processes on the fast box take three times as much work as one on the slow box. No configuration knob required.
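The proportional effect falls out of the per-process load counts with no extra logic, which a tiny simulation makes concrete. The worker ids mirror the hypothetical CLI example above; the dispatch loop is a simplification that ignores task completion times.

```python
# Least-loaded counts are tracked per subworker process, so a host running
# three processes absorbs three times the work of a host running one.
load = {"fast-1": 0, "fast-2": 0, "fast-3": 0, "slow-1": 0}

for _ in range(40):                     # dispatch 40 tasks
    target = min(load, key=load.get)    # least-loaded process wins
    load[target] += 1

fast_box = load["fast-1"] + load["fast-2"] + load["fast-3"]
slow_box = load["slow-1"]
# fast box ends up with 30 tasks, slow box with 10: a 3:1 split, no knobs
```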
Watching it happen
Open the built-in dashboard at http://127.0.0.1:8080. You’ll see:
- Queue depth by priority — four bars, one per level
- Subworker status — which workers are connected and their current load
- Task history — recent task ids, status, and timing
You can literally watch a flood of LOW tasks pile up while HIGH ones drain ahead of them.
What priorities are not
- Not deadlines. FastWorker doesn’t promise a CRITICAL task starts within N milliseconds. It just promises no lower-priority task jumps ahead.
- Not SLA classes. If you need hard SLAs, run a separate control plane per class and size each to meet its load.
- Not routing. There’s no @task(queue="emails") equivalent. If you need specialized worker pools, run a separate control plane with its own task module.
Next steps
- Architecture — how the control plane and NNG work
- FastAPI background tasks at scale
- Observability with OpenTelemetry
- FastWorker vs Celery
Frequently asked questions
How many priority levels does FastWorker have?
Four: critical, high, normal (default), and low. They drain in order — a normal-priority task never blocks a critical one.
What does 'load balancing' mean here?
The control plane tracks how many tasks each subworker is currently processing and routes new work to the least-loaded one. There is no sticky assignment; every task is placed based on current load.
Do I need to configure anything for priorities to work?
No. Priorities are enum values passed at submission time. There's no broker config, no routing key, no queue declaration.