
PATTERN · Cited by 1 source

Bulk job enqueue

When enqueuing many jobs of the same class at once, issue a single bulk enqueue call with the full ID list rather than one enqueue call per job. This amortises the per-request cost of the job-store round-trip across N jobs.

The pattern

# naive: one Redis round-trip per job
BackupPolicy.needs_to_run.find_each do |policy|
  BackupJob.perform_async(policy.id)        # 10,000 round-trips
end

# bulk: one Redis round-trip per 1,000 jobs
BackupPolicy.needs_to_run.in_batches do |backup_policies|
  # perform_bulk takes an array of argument arrays, one per job
  BackupJob.perform_bulk(backup_policies.pluck(:id).map { |id| [id] })  # 10 round-trips
end

Sidekiq 6.3+ exposes perform_bulk, which takes an array of argument arrays (one inner array per job, e.g. [[1], [2], ...]) and pushes the serialised job payloads to Redis in large slices (1,000 per slice by default). The amortisation is on the network + Redis command-processing axis, not on the worker execution axis: jobs still execute individually.
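The payload shaping is easy to get wrong: each element of the outer array is the full argument list for one job, so scalar IDs must each be wrapped. A minimal pure-Ruby sketch of the transformation:

```ruby
# Shape a flat ID list into the array-of-argument-arrays form
# that Sidekiq's perform_bulk expects (one inner array per job).
ids = [101, 102, 103]

# Each job takes a single argument, so wrap each ID in its own array.
args = ids.map { |id| [id] }
# => [[101], [102], [103]]

# Equivalent shortcut: zip with no arguments wraps each element.
args_via_zip = ids.zip
# => [[101], [102], [103]]
```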

Why it matters

The common case for bulk enqueue is a paired scheduler that wakes up, finds N records needing work, and must enqueue N jobs. One Redis round-trip per job means N round-trips issued serially: for N = 10,000 at 1 ms per round-trip, that's 10 s of scheduler-tick time spent just talking to Redis.

With perform_bulk and batches of 1,000:

  • 10 Redis round-trips instead of 10,000.
  • ~10 ms of Redis-talk instead of 10 s.
  • Scheduler tick stays within its cadence budget.
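The arithmetic behind those bullets as a back-of-envelope sketch (the 1 ms round-trip figure is the illustrative number from above, not a benchmark):

```ruby
n          = 10_000   # jobs to enqueue
rtt_ms     = 1        # assumed cost of one Redis round-trip, in ms
batch_size = 1_000

naive_ms = n * rtt_ms                          # one round-trip per job
bulk_ms  = (n.to_f / batch_size).ceil * rtt_ms # one round-trip per batch

naive_ms  # => 10000 (10 s of scheduler time)
bulk_ms   # => 10    (10 ms)
```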

Batch size trade-off

The trade-off: larger batches amortise better but also:

  • Increase per-request memory on the Redis client and server (a batch of 100,000 is a single large command).
  • Increase per-request latency (the one request takes longer).
  • Fail-together (a bad payload anywhere in the batch may fail the whole batch, depending on store semantics).
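Client-side, the slicing is just each_slice over the ID list. A minimal sketch, with a hypothetical push_batch standing in for the single round-trip to the job store:

```ruby
# Minimal sketch of client-side batching. `push_batch` is a
# hypothetical stand-in for one round-trip to the job store.
def enqueue_in_batches(ids, batch_size: 1_000)
  trips = 0
  ids.each_slice(batch_size) do |slice|
    # One round-trip carries the whole slice; if the store rejects
    # any payload in it, the entire slice fails together.
    push_batch(slice)
    trips += 1
  end
  trips
end
```

With batch_size 1,000, a 10,000-ID list costs 10 trips; raising batch_size lowers the trip count but grows each request (and its blast radius on failure).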

PlanetScale's canonical choice is 1,000 per batch, iterated via ActiveRecord's in_batches cursor. 1,000 is a common sweet spot for job-store and DB cursor work; it is also the default slice size Sidekiq's perform_bulk uses internally.

Canonical implementation (PlanetScale, 2022)

From sources/2026-04-21-planetscale-how-we-made-planetscales-background-jobs-self-healing:

# Job is set to run every 5 minutes
class ScheduleBackupJobs < BaseJob
  sidekiq_options queue: "background"

  def perform
    # Schedule backup jobs in batches of 1,000
    BackupPolicy.needs_to_run.in_batches do |backup_policies|
      BackupJob.perform_bulk(backup_policies.pluck(:id))
    end
  end
end

Motivation named in the post: "Scheduling jobs one-by-one works well when you only have a few thousand. Once our app grew, we noticed we were spending a lot of time sending individual Redis requests. Each request is very fast, but requests add up when run thousands of times."

When to use it

  • Scheduler jobs enqueuing their findings. The canonical case; a scheduler with many pending rows to dispatch.
  • Fan-out from a list of IDs. Any code path that iterates IDs and calls perform_async in a tight loop should consider perform_bulk.
  • Migration / backfill enqueuing. Re-enqueuing jobs for every row in a table during a backfill.

When not to use it

  • Low volume. Below roughly N = 50, the overhead of the batch machinery outweighs the benefit. Keep it simple; use perform_async one-by-one.
  • Heterogeneous job classes or args. Bulk enqueue is homogeneous-per-class (same worker class, different args). Cross-class batches require a different API or a fall-back to per-call enqueueing.
  • Strict per-job ordering with backpressure. Bulk enqueue writes all jobs to the store at once; it doesn't respect backpressure from the consumers. If you need to slow down based on worker throughput, enqueue one at a time with a rate-limit.
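For that backpressure case, the one-at-a-time alternative can be as simple as a sleep-based rate cap. A sketch, with a hypothetical enqueue_one standing in for a single perform_async call:

```ruby
# Per-job enqueue with a crude rate cap, for when downstream
# backpressure matters more than enqueue speed. `enqueue_one` is
# a hypothetical stand-in for a single perform_async call.
def enqueue_throttled(ids, per_second:)
  interval = 1.0 / per_second
  ids.each do |id|
    enqueue_one(id)
    sleep(interval)   # cap the enqueue rate so workers never see a burst
  end
end
```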

Composition with jitter

Bulk enqueue + jitter is the common pair. Bulk-enqueuing 10,000 jobs with perform_bulk makes them all available to workers immediately. Without jitter, the workers drain the queue as fast as they can, and whichever external API or downstream system the jobs call sees a 10,000-concurrent-request burst. Jitter adds a random delay (e.g. rand(0..max) seconds) per job to spread them out.
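One way to combine the two is to compute a jittered run-at timestamp per job. The timestamp arithmetic below is plain Ruby; the commented-out push shows how it might be handed to Sidekiq's lower-level Sidekiq::Client.push_bulk, which (in recent versions, to my reading of the docs) accepts "at" as an array of per-job timestamps; verify against your Sidekiq version.

```ruby
# Spread 10,000 jobs over a 5-minute window by assigning each a
# jittered run-at timestamp.
now        = Time.now.to_f
max_jitter = 300                     # seconds of spread
ids        = (1..10_000).to_a
run_ats    = ids.map { now + rand(0..max_jitter) }

# Hypothetical hand-off to Sidekiq's low-level bulk client:
# Sidekiq::Client.push_bulk(
#   "class" => BackupJob,
#   "args"  => ids.map { |id| [id] },
#   "at"    => run_ats
# )
```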
