URL: https://deepwiki.com/hypervel/bus/7-batch-processing


Batch Processing

This document introduces the batch processing system in hypervel/bus, which enables grouping multiple jobs together for coordinated execution, progress tracking, and lifecycle management. It covers the core concepts, architectural components, and the overall batch lifecycle.

For detailed information about creating and configuring batches, see Creating and Managing Batches. For callback mechanisms and state transitions, see Batch Callbacks and Lifecycle. For information about nesting batches within job chains, see Nested Batches and Chaining.


What are Batches

A batch is a collection of jobs that are dispatched together and tracked as a single unit. Unlike individual job dispatch (see Dispatching Jobs) or job chains (see Job Chaining), batches allow you to:

  • Track collective progress: Monitor how many jobs in the group have completed
  • React to group completion: Execute callbacks when all jobs finish successfully
  • Handle group failures: Execute callbacks when any job in the batch fails
  • Continue despite failures: Optionally allow the batch to continue even if individual jobs fail
  • Manage state persistently: Store batch metadata in a repository for distributed tracking across workers

Batches are particularly useful for operations that involve processing large datasets where you want to split the work across multiple jobs but need visibility into the overall completion status.

Sources: src/PendingBatch.php:1-341, src/Batch.php:1-363


Core Components

The batch processing system consists of three primary components that work together to manage batch execution:

Component Architecture


Sources: src/PendingBatch.php:24-341, src/Batch.php:23-363


PendingBatch: Configuration Object

The PendingBatch class (src/PendingBatch.php:24) is a fluent configuration object that collects all batch settings before dispatch. It holds:

| Property | Type | Description |
|---|---|---|
| $name | string | Human-readable batch name |
| $jobs | Collection | Jobs to be batched |
| $options | array | Configuration array including callbacks, queue settings, and flags |

The key configuration methods are listed under Configuration Options.

Sources: src/PendingBatch.php:24-341


Batch: Execution Object

The Batch class (src/Batch.php:23) represents an active batch during execution. It manages runtime state and job coordination:

| Property | Type | Description |
|---|---|---|
| $id | string | Unique batch identifier |
| $totalJobs | int | Total number of jobs in the batch |
| $pendingJobs | int | Number of jobs not yet completed |
| $failedJobs | int | Number of failed jobs |
| $failedJobIds | array | IDs of jobs that failed |
| $options | array | Serialized configuration from PendingBatch |
| $createdAt | CarbonInterface | Batch creation timestamp |
| $cancelledAt | ?CarbonInterface | Cancellation timestamp (if cancelled) |
| $finishedAt | ?CarbonInterface | Completion timestamp (if finished) |
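
To make the relationship between the counters concrete, here is a minimal sketch of how progress could be derived from them. BatchSnapshot and its methods are hypothetical names used only for illustration and are not the Batch class in src/Batch.php. The sketch assumes that $pendingJobs counts every job that has not yet completed.

```php
<?php

declare(strict_types=1);

// Hypothetical read-only snapshot of three counters from the table above.
// Illustration only: this is not the Batch class from src/Batch.php.
final class BatchSnapshot
{
    public function __construct(
        public readonly int $totalJobs,
        public readonly int $pendingJobs,
        public readonly int $failedJobs,
    ) {
    }

    // Jobs that are no longer pending (assumption: pending = not yet completed).
    public function processedJobs(): int
    {
        return $this->totalJobs - $this->pendingJobs;
    }

    // Whole-number percentage of processed jobs; 0 for an empty batch.
    public function progress(): int
    {
        return $this->totalJobs > 0
            ? (int) round($this->processedJobs() / $this->totalJobs * 100)
            : 0;
    }

    public function hasFailures(): bool
    {
        return $this->failedJobs > 0;
    }
}

$snapshot = new BatchSnapshot(totalJobs: 1000, pendingJobs: 500, failedJobs: 0);

// Prints: Processing 500/1000 records (50%)
printf(
    "Processing %d/%d records (%d%%)\n",
    $snapshot->processedJobs(),
    $snapshot->totalJobs,
    $snapshot->progress()
);
```

A value object like this is enough to drive a dashboard or progress bar, since it needs only the persisted counters and no access to the jobs themselves.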

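The Batch object is responsible for updating these counters as jobs complete or fail and for triggering the registered callbacks (see Batch Lifecycle).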

Sources: src/Batch.php:23-363


BatchRepository: Persistence Interface

The BatchRepository interface (see Batch Repository Interface) defines the contract for storing and retrieving batch state. This abstraction allows batch state to be persisted across distributed workers and application restarts.

Key operations include:

  • store(PendingBatch) - Persist a new batch
  • find(string $id) - Retrieve a batch by ID
  • incrementTotalJobs(), decrementPendingJobs(), incrementFailedJobs() - Atomic count updates
  • markAsFinished(), cancel(), delete() - State transitions

The default implementation is DatabaseBatchRepository (see Database Implementation).
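
The operations above can be pictured with a small in-memory stand-in. This is a hedged sketch, not the BatchRepository interface itself: the class name InMemoryBatchStore, the parameter lists, and the return types are assumptions made for illustration (for example, store() here takes an ID and a job count rather than a PendingBatch).

```php
<?php

declare(strict_types=1);

// Hypothetical, simplified stand-in for the repository contract described above.
// Method names follow the list; signatures and return types are assumptions.
final class InMemoryBatchStore
{
    /** @var array<string, array<string, mixed>> Batch rows keyed by batch ID. */
    private array $rows = [];

    public function store(string $id, int $totalJobs): void
    {
        $this->rows[$id] = [
            'total' => $totalJobs,
            'pending' => $totalJobs,
            'failed' => 0,
            'failedIds' => [],
            'finished' => false,
            'cancelled' => false,
        ];
    }

    public function find(string $id): ?array
    {
        return $this->rows[$id] ?? null;
    }

    // Adding jobs to a running batch raises both the total and the pending count.
    public function incrementTotalJobs(string $id, int $amount): void
    {
        $this->rows[$id]['total'] += $amount;
        $this->rows[$id]['pending'] += $amount;
    }

    // Returns the pending count after the decrement.
    public function decrementPendingJobs(string $id): int
    {
        return --$this->rows[$id]['pending'];
    }

    // Records the failed job ID and returns the failure count after the increment.
    public function incrementFailedJobs(string $id, string $jobId): int
    {
        $this->rows[$id]['failedIds'][] = $jobId;

        return ++$this->rows[$id]['failed'];
    }

    public function markAsFinished(string $id): void
    {
        $this->rows[$id]['finished'] = true;
    }

    public function cancel(string $id): void
    {
        $this->rows[$id]['cancelled'] = true;
    }

    public function delete(string $id): void
    {
        unset($this->rows[$id]);
    }
}

$store = new InMemoryBatchStore();
$store->store('batch-1', 3);
$store->decrementPendingJobs('batch-1');
$store->incrementFailedJobs('batch-1', 'job-2');
```

An array held in one process needs no locking. A repository shared by several workers has to make the count updates atomic, for example with database transactions, which is why the contract exposes them as dedicated operations instead of a generic save.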

Sources: src/Batch.php:42-43


Batch Lifecycle

The following diagram illustrates the complete lifecycle of a batch from creation through completion:

Batch State Flow


Sources: src/PendingBatch.php:245-340, src/Batch.php:139-169, src/Batch.php:223-253


Key Lifecycle Phases

1. Configuration Phase

The configuration phase occurs when a PendingBatch is created and configured before dispatch. During this phase, the batch is not yet persisted or executing.


Sources: src/PendingBatch.php:44-48, src/PendingBatch.php:186-238


2. Dispatch Phase

When dispatch() or dispatchAfterResponse() is called, the batch transitions from configuration to active execution.


Sources: src/PendingBatch.php:245-266, src/PendingBatch.php:323-340


3. Execution Phase

During execution, individual jobs complete and update batch state atomically. Each job completion or failure triggers repository updates and, depending on the resulting counts, callback execution.
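
The following is a simplified model of how job outcomes could drive batch state and callbacks. It is a sketch under stated assumptions, not the code in src/Batch.php: BatchTracker is a hypothetical class, callbacks are represented by log entries, a failed job is assumed to stay in the pending count, and the firing rules (then only when nothing is pending, catch on the first failure, finally once every job has either succeeded or failed) are assumptions.

```php
<?php

declare(strict_types=1);

// Simplified, hypothetical model of the execution phase.
// The rules encoded here are assumptions, not the library's implementation.
final class BatchTracker
{
    public int $failedJobs = 0;

    public bool $cancelled = false;

    /** @var list<string> Names of the callbacks that would have been invoked. */
    public array $log = [];

    public function __construct(
        public int $pendingJobs,
        private readonly bool $allowFailures = false,
    ) {
    }

    public function recordSuccess(): void
    {
        $this->pendingJobs--;
        $this->log[] = 'progress';

        // Assumption: "then" runs only when every job has succeeded.
        if ($this->pendingJobs === 0) {
            $this->log[] = 'then';
        }

        $this->finishIfDone();
    }

    public function recordFailure(): void
    {
        $this->failedJobs++;

        if ($this->failedJobs === 1) {
            // Assumption: the first failure cancels the batch unless failures are allowed.
            if (! $this->allowFailures) {
                $this->cancelled = true;
            }
            $this->log[] = 'catch';
        }

        $this->finishIfDone();
    }

    private function finishIfDone(): void
    {
        // Failed jobs stay pending, so the batch is done when the two counts match.
        if ($this->pendingJobs - $this->failedJobs === 0) {
            $this->log[] = 'finally';
        }
    }
}

// Three jobs with failures allowed: success, failure, success.
$mixed = new BatchTracker(pendingJobs: 3, allowFailures: true);
$mixed->recordSuccess();
$mixed->recordFailure();
$mixed->recordSuccess();

// Prints: progress, catch, progress, finally
echo implode(', ', $mixed->log), "\n";
```

In a distributed setup the decrement and the "is the batch done" check have to happen as one atomic repository operation. Otherwise two workers finishing at the same moment could both, or neither, observe the final count.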


Sources: src/Batch.php:139-169


Configuration Options

The PendingBatch class provides a fluent interface for configuring batch behavior. Options are stored in the $options array and serialized with the batch.

Key Configuration Methods

| Method | Purpose | Stored In |
|---|---|---|
| name(string) | Set batch name | $name property |
| onQueue(string) | Set queue for all jobs | $options['queue'] |
| onConnection(string) | Set connection for all jobs | $options['connection'] |
| allowFailures(bool) | Allow batch to continue on job failure | $options['allowFailures'] |
| before(callable) | Register pre-execution callback | $options['before'][] |
| progress(callable) | Register per-job callback | $options['progress'][] |
| then(callable) | Register success callback | $options['then'][] |
| catch(callable) | Register failure callback | $options['catch'][] |
| finally(callable) | Register completion callback | $options['finally'][] |

Detailed configuration examples and usage are covered in Creating and Managing Batches.
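
As a rough illustration of how these methods map onto the $name property and the $options array, here is a miniature fluent builder. MiniPendingBatch is a hypothetical class written for this page. It borrows the method names and option keys from the table but is not the real PendingBatch, and it stores plain closures without serializing them.

```php
<?php

declare(strict_types=1);

// Hypothetical miniature of a fluent batch builder.
// Mirrors the method names and option keys in the table; not the real PendingBatch.
final class MiniPendingBatch
{
    public string $name = '';

    /** @var array<string, mixed> */
    public array $options = [];

    public function name(string $name): static
    {
        $this->name = $name;

        return $this;
    }

    public function onQueue(string $queue): static
    {
        $this->options['queue'] = $queue;

        return $this;
    }

    public function allowFailures(bool $allow): static
    {
        $this->options['allowFailures'] = $allow;

        return $this;
    }

    // Callbacks are appended, so each hook can be registered more than once.
    public function then(callable $callback): static
    {
        $this->options['then'][] = $callback;

        return $this;
    }

    public function catch(callable $callback): static
    {
        $this->options['catch'][] = $callback;

        return $this;
    }

    public function finally(callable $callback): static
    {
        $this->options['finally'][] = $callback;

        return $this;
    }
}

$pending = (new MiniPendingBatch())
    ->name('Import CSV')
    ->onQueue('imports')
    ->allowFailures(true)
    ->then(fn (): string => 'all jobs succeeded')
    ->finally(fn (): string => 'batch finished');
```

Returning the builder from every method is what allows the chained style. Nothing is persisted or dispatched at this point, which matches the configuration phase described earlier.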

Sources: src/PendingBatch.php:31-36, src/PendingBatch.php:69-238


Callback System

Batches support five lifecycle callbacks (before, progress, then, catch, and finally) that execute at different points during batch execution:

Callback Execution Order


Callbacks are serialized using SerializableClosure (src/PendingBatch.php:72-73, src/PendingBatch.php:92-93) to allow them to be stored with the batch and executed in worker processes.

Detailed callback behavior and usage patterns are covered in Batch Callbacks and Lifecycle.

Sources: src/PendingBatch.php:69-163, src/Batch.php:139-169, src/Batch.php:223-253


Use Cases

Batches are most appropriate for the following scenarios:

1. Bulk Data Processing

When processing large datasets that need to be split across multiple jobs:

  • Importing thousands of records from a CSV file
  • Processing image uploads for a photo gallery
  • Sending bulk email campaigns
  • Generating reports from large datasets

2. Progress Tracking

When you need to show users the completion status of a multi-step operation:

  • Dashboard showing "Processing 500/1000 records"
  • Progress bars for long-running operations
  • Notification when all jobs complete

3. Coordinated Failure Handling

When related jobs should be cancelled together if one fails:

  • Database migrations split across multiple jobs
  • Multi-step data transformations
  • Transactional operations across multiple systems

4. Deferred Aggregation

When you need to perform an action only after all related jobs complete:

  • Generating a summary report after processing all items
  • Sending a completion notification
  • Cleaning up temporary resources
  • Triggering a downstream workflow

Sources: src/Batch.php:139-169, src/Batch.php:223-253


Batch vs Other Dispatch Patterns

| Feature | Single Job | Job Chain | Batch |
|---|---|---|---|
| Multiple jobs | No | Yes (sequential) | Yes (parallel) |
| Progress tracking | No | No | Yes |
| Collective callbacks | No | Per-job only | Yes (group-level) |
| Failure isolation | N/A | Stops chain | Configurable |
| Dynamic job addition | No | Via chaining | Yes |
| State persistence | No | Minimal | Full state tracking |
| Use case | Independent task | Sequential workflow | Parallel processing |

For single job dispatch, see Dispatching Jobs. For job chains, see Job Chaining.

Sources: src/PendingBatch.php:1-341, src/Batch.php:1-363


Data Flow Architecture

The following diagram shows how data flows through the batch system from creation to completion:


Sources: src/PendingBatch.php:44-48, src/PendingBatch.php:245-266, src/Batch.php:70-106, src/Batch.php:139-169