Mastering Laravel job batching Pt. 3: Build our custom batch system

Introduction

This article is the third in a four-part series: Mastering Laravel batches

In this article, we'll see how to build a system that turns any of our jobs into a batchable one with as little code as possible, without compromising batch efficiency and while making the most of it.

Defining our use case

Before writing any code, we have to define our use case.
The goal is to describe exactly how our system should behave, in normal conditions but also in not-so-normal ones, because bugs and failures are part of the life of an application.
Let's write our checklist!

- [ ] Batch should be able to handle up to 500 jobs
- [ ] Multiple batches should be able to run at the same time, without waiting for the previous one to finish
- [ ] Progression should be multiples of 10 from 0 to 100 (0, 10, 20, etc.)
- [ ] Progression should reflect the number of jobs that have run (succeeded or failed)
- [ ] Application should have a way to track progression without asking for it
- [ ] Application should have a way to know a batch has started without asking for it
- [ ] Application should have a way to know a batch has ended without asking for it
- [ ] Application should have a clear description of any failed job (class name + arguments + failed job id at minimum)
- [ ] Failed jobs could be retried
- [ ] Job failures should not cause a batch to be cancelled
- [ ] Batch should be cancellable
- [ ] Pending jobs of a cancelled batch should not run
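
To make the two progression items concrete, here is a minimal sketch of how a raw job count could be mapped to the 0, 10, 20 ... 100 scale. The function name `progression_step` is hypothetical (it is not part of Laravel) and only illustrates the rounding rule:

```php
<?php

// Hypothetical helper: maps the number of ran jobs to a multiple of 10.
function progression_step(int $ran_jobs, int $total_jobs): int
{
    if ($total_jobs <= 0) {
        return 0; // nothing to run yet
    }

    // Integer percentage, rounded down to the previous multiple of 10
    $percent = intdiv($ran_jobs * 100, $total_jobs);

    return min(100, intdiv($percent, 10) * 10);
}
```

With 500 jobs, the value only changes every 50 jobs, whether those jobs succeeded or failed.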

We'll go through these requirements in 3 steps:

  • define and develop the command required to run all the features

  • develop a naive but working implementation

  • refine the implementation (in the next article)

Starting with working code before refining it avoids early over-engineering and shows what really needs to be abstracted, and in which way. Additionally, it's safe. Typically, you'll develop unit/feature tests during the first phase. Just do a git commit and you can then break everything you want: you're covered by the tests. Here though, we won't write any tests, to keep things simple.

Now, where to start?
Let's not code yet and stop on the first 2 points.

Define the general behavior

Our first 2 needs are:

  • [ ] Batch should be able to handle up to 500 jobs

  • [ ] Multiple batches should be able to run at the same time, without waiting for the previous one to finish

From what we saw in the previous articles, if we want to run multiple batches at the same time, we should opt for progressive ones. They come with the following drawbacks:

  • progression needs to be implemented in our system

  • we need a protection against infinite loops

  • we may need an external queue

The first point is covered by many bullet points in our feature list. The second point will be covered when implementing the call to the next job.
That leaves us with the third point: do we need an external queue? For the current article, we'll process a list of 30-character strings, for example:

[
    "8B7mGHlQkoOcegL4w3NzwRpAmVSF2O",
    "w643DaGWrBpxo0MyyP9vtkcyjlI5cK",
    "EyCAkRRb38YhSPOwko4mWIzrRFe4um",
    "dd6J9BUG5s0Tw8l18T0qQeBwrtiBYX",
    "qFAAbRQAzdk5wVhxLzmZH9nkgcUkvH",
]

The simplest way to manage it would be to have the job accept a list as an argument, take one item from it and add a job to the batch with the remaining items. Pseudocode could be something like this:

public function handle(array $items_to_treat):void {
    $item = array_shift($items_to_treat);

    // Perform action on item

    // Add next job to batch if any items left
    if($this->batch() && count($items_to_treat) > 0) {
        $this->batch()->add(new static($items_to_treat));
    }
}

This would work fine but think about failed jobs and retrying jobs.
For failed jobs, this solution means that the complete arguments will be stored in the failed_jobs table. Here, we talk about 500 * 30 chars at maximum, it's not that much but if we have a lot of batches running and a lot of failing jobs, querying that table will quickly become a problem.
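
To get a feel for the difference, here is a rough sketch that measures the serialized size of the argument in both cases. It only looks at the argument itself, not at the full job payload Laravel stores, and it uses hex strings as a stand-in for our random strings:

```php
<?php

// Build 500 strings of 30 characters (hex is enough for a size estimate)
$items = [];
for ($i = 0; $i < 500; $i++) {
    $items[] = bin2hex(random_bytes(15)); // 15 bytes -> 30 hex characters
}

// Size of the argument when the first job carries the whole list
$serialized_size = strlen(serialize($items));

// Size of the argument when each job only carries a single item
$single_item_size = strlen(serialize($items[0]));

printf("list argument: %d bytes, single item: %d bytes\n", $serialized_size, $single_item_size);
```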
Let's find a better solution. The other options rely on a queue. With a queue, there are 2 ways:

  • job has a single item as argument and next job is added to the batch with the next item retrieved from the queue

  • job has no arguments and retrieves its item from the queue, a new job is added to the batch if the queue is not empty

Pseudocode for each solution would be:

  • Solution 1
public function handle(string $item):void {
    // Perform action on item


    if($this->batch()) {
        $new_item = get_next_item_from_queue();

        if($new_item){
            $this->batch()->add(new static($new_item));
        }
    }
}
  • Solution 2
public function handle():void {
    $item = get_next_item_from_queue();

    // Perform action on item

    if($this->batch() && queue_is_not_empty()) {
        $this->batch()->add(new static());
    }
}

Both will have a small footprint in the failed_jobs table, but which one to choose?

A clue can be found in our feature list:

  • [ ] Application should have a clear description of any failed job (class name + arguments + failed job id at minimum)

Out of the box and without any additional code, we cannot meet this requirement with solution 2. Because the job's constructor has no arguments, none will be stored in failed_jobs, and because the queue will be empty at the end of the batch, we cannot rely on it either. Nothing's left in the queue, nothing in the failed job.

Whereas with solution 1, everything that's required can be found in the failed_jobs table without any additional code.
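
The difference can be simulated without Laravel. In this minimal sketch, everything is a hypothetical in-memory stand-in: the `$queue` array plays the external queue and the returned array plays the failed_jobs table.

```php
<?php

// Solution 1: each job receives its item as a constructor argument.
function run_solution_1(array $queue, callable $process): array
{
    $failed = [];
    $item = array_shift($queue); // item of the first job

    while ($item !== null) {
        try {
            $process($item);
        } catch (Throwable $e) {
            // The item is part of the job arguments, so it is recorded
            $failed[] = ['arguments' => [$item], 'error' => $e->getMessage()];
        }
        $item = array_shift($queue); // next job is created with the next item
    }

    return $failed;
}

// Solution 2: jobs have no arguments and pop their item themselves.
function run_solution_2(array $queue, callable $process): array
{
    $failed = [];

    while (count($queue) > 0) {
        $item = array_shift($queue);
        try {
            $process($item);
        } catch (Throwable $e) {
            // No constructor arguments: the failed item is not recorded
            $failed[] = ['arguments' => [], 'error' => $e->getMessage()];
        }
    }

    return $failed;
}

$process = function (string $item): void {
    if (strlen($item) !== 30) {
        throw new Exception('Random string must be 30 characters long');
    }
};

$items = ['8B7mGHlQkoOcegL4w3NzwRpAmVSF2O', 'too_short', 'w643DaGWrBpxo0MyyP9vtkcyjlI5cK'];

$failed_1 = run_solution_1($items, $process);
$failed_2 = run_solution_2($items, $process);
```

Both runs report one failure, but only the first one still knows which item caused it.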

This is a perfect example of why having a clear and detailed goal is important.

To sum up we now know that:

  • we'll use a progressive batch

  • we need an external queue

  • our job will process one item at a time

Let's begin with the queue.

The queue

The first option that comes to mind when talking about queues and working with Laravel is Redis. We'll use its list feature, which is ideal for our needs, as the documentation says:

Redis lists are linked lists of string values. Redis lists are frequently used to:

  • Implement stacks and queues.

  • Build queue management for background worker systems.

Amongst all the available commands, we'll use the following:

  • RPUSH: Appends one or more elements to a list. Creates the key if it doesn't exist

  • LPOP: Returns the first element of a list after removing it. Deletes the list if the last element was popped

  • LLEN: Returns the length of a list

  • LRANGE: Returns a range of elements from a list
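
If you have never used Redis lists, the four commands behave much like the following operations on a plain PHP array. This is only an analogy to fix ideas, no Redis is involved here:

```php
<?php

$list = [];

// RPUSH key a b c: append to the tail, returns the new length
$length_after_push = array_push($list, 'a', 'b', 'c');

// LPOP key: remove and return the head
$first = array_shift($list);

// LLEN key: number of remaining elements
$length = count($list);

// LRANGE key 0 1: Redis takes an inclusive stop index, array_slice a length
$range = array_slice($list, 0, 2);
```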

The data we store in Redis has 2 parts:

  • a key, to identify the list

  • a list of items, which is what needs to be treated

And we have to be able to:

  • create the queue

  • get the first item of the queue

  • get the queue length (to determine if it's empty or not)

  • delete the queue (when batch is cancelled)

This translates easily into a contract:

# app/Contracts/BatchQueueRepository.php

interface BatchQueueRepository
{
    public function create(string $key, array $data): int;
    public function delete(string $key): bool;
    public function getFirstItem(string $key): mixed;
    public function length(string $key): int;
}

Notice that getFirstItem returns a mixed value, which allows storing any kind of data in the queue.
Now for the concrete class:

# app/Repositories/RedisBatchQueueRepository.php

class RedisBatchQueueRepository implements BatchQueueRepository
{
}

One thing to know about Redis lists is that they can only store strings, therefore everything we push to the queue has to be converted to a string. The safest and least error-prone way to do it is to serialize our data. We could use JSON encoding, but the benefit of serialization is that it's easier to work with objects if we need to.
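
Here is a small sketch of that difference. The `QueueItem` class is hypothetical and only exists for the comparison:

```php
<?php

// Hypothetical value object, only used for the comparison
final class QueueItem
{
    public function __construct(public string $code, public int $priority)
    {
    }
}

$item = new QueueItem('8B7mGHlQkoOcegL4w3NzwRpAmVSF2O', 1);

// serialize() keeps the class: we get a QueueItem back
$from_serialize = unserialize(serialize($item));

// JSON only keeps the data: we get a generic stdClass back
$from_json = json_decode(json_encode($item));

var_dump($from_serialize instanceof QueueItem); // bool(true)
var_dump($from_json instanceof QueueItem);      // bool(false)
```

Keep in mind that unserialize() should only be fed data you trust; its `allowed_classes` option can restrict which classes may be instantiated.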
Let's implement our create and getFirstItem methods:

# app/Repositories/RedisBatchQueueRepository.php

class RedisBatchQueueRepository implements BatchQueueRepository
{
    public function create(string $key, array $data): int
    {
        $serialized_data = array_map(fn ($value) => serialize($value), $data);
        return Redis::rpush($key, ...$serialized_data);
    }

    public function getFirstItem(string $key): mixed
    {
        $serialized_item =  Redis::lpop($key);
        return $serialized_item ? unserialize($serialized_item) : null;
    }
}

The next 2 methods, delete and length, are straightforward:

# app/Repositories/RedisBatchQueueRepository.php

class RedisBatchQueueRepository implements BatchQueueRepository
{
    // ...
    public function delete(string $key): bool
    {
        return Redis::del($key);
    }

    public function length(string $key): int
    {
        return Redis::llen($key);
    }
}
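
A side benefit of coding against the contract: the Redis class can be swapped for another storage. As an illustration, here is a minimal sketch of an in-memory implementation that could be handy in tests. The `InMemoryBatchQueueRepository` name is hypothetical, and the interface is repeated (without namespace) only to keep the sketch self-contained:

```php
<?php

// Same contract as above, repeated to keep the sketch self-contained
interface BatchQueueRepository
{
    public function create(string $key, array $data): int;
    public function delete(string $key): bool;
    public function getFirstItem(string $key): mixed;
    public function length(string $key): int;
}

// Hypothetical in-memory implementation, e.g. for tests
class InMemoryBatchQueueRepository implements BatchQueueRepository
{
    /** @var array<string, list<string>> */
    private array $lists = [];

    public function create(string $key, array $data): int
    {
        foreach ($data as $value) {
            $this->lists[$key][] = serialize($value);
        }

        return count($this->lists[$key] ?? []);
    }

    public function delete(string $key): bool
    {
        $existed = isset($this->lists[$key]);
        unset($this->lists[$key]);

        return $existed;
    }

    public function getFirstItem(string $key): mixed
    {
        if (empty($this->lists[$key])) {
            return null;
        }

        $serialized_item = array_shift($this->lists[$key]);
        if ($this->lists[$key] === []) {
            unset($this->lists[$key]); // mimic Redis, which drops empty lists
        }

        return unserialize($serialized_item);
    }

    public function length(string $key): int
    {
        return count($this->lists[$key] ?? []);
    }
}
```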

Now that we have the storage part of the queue implemented, we need a way to use it, but first we need to answer a question:

What should we use as the queue key? The best is to use something based on the batch id, so that we have a nice relation between these two entities. One could argue that when we're not in the batch, we don't know its id. First, with our choice regarding the job arguments, we won't need the queue outside the batch anyway. Additionally, if we refer again to our feature list, we can see:

  • [ ] Progression should be multiples of 10 from 0 until 100 (0, 10, 20, etc.)

  • [ ] Application should have a way to track progression without asking for it

  • [ ] Application should have a way to know a batch has started without asking for it

  • [ ] Application should have a way to know a batch has ended without asking for it

To identify which batch this data relates to, the batch id will be included in it, so we'll have access to it when needed. The application will be responsible for storing it for further processing (job retry, batch cancel, etc.)

Then our queue key will be formatted as: batch-queue-[batch_id].

With the queue key settled, we can implement the class allowing us to control the queue:

# app/Models/BatchQueue.php
class BatchQueue
{
    protected BatchQueueRepository $repository;
    public string $key;
    public array $data;

    public function __construct(Batch $batch, BatchQueueRepository $batch_queue_repository)
    {
        $this->key = 'batch-queue-' . $batch->id;
        $this->repository = $batch_queue_repository;
    }

    public function create(): bool
    {
        return $this->repository->create($this->key, $this->data);
    }

    public function delete(): bool
    {
        return $this->repository->delete($this->key);
    }

    public function pop(): mixed
    {
        return $this->repository->getFirstItem($this->key);
    }

    public function count(): int
    {
        return $this->repository->length($this->key);
    }

    public function isEmpty(): bool
    {
        return $this->count() === 0;
    }
}

You may have spotted it, but there is an issue with the BatchQueue::pop() and RedisBatchQueueRepository::getFirstItem() methods.
We will tackle it in the next article; for now we head to a working implementation.

Bonus points if you want to go a little bit further:

  • a ttl on redis list to ensure that no queue stays forever

  • add a validator as parameter to the BatchQueue to validate the data stored in Redis

A job to batch

The queue is ready, now we need the job that will be pushed to queue.
We'll keep it simple, we'll just log the random string received as an argument and add the next job to the batch.
Let's start with the job creation:

sail artisan make:job SimpleJob

First, make the job Batchable and define the arguments:

# app/Jobs/SimpleJob.php

use Illuminate\Bus\Batchable;
// ...
class SimpleJob implements ShouldQueue
{
    use Batchable, Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(public string $random_string)
    {
    }
}

Now, when run, the job should add the next job to the batch. For this, we need to get the next item to treat from the queue and create a job for it.
Note that this should happen only if the job runs in a batch. Checking this allows the job to run without a batch, which will be required for job retry.

# app/Jobs/SimpleJob.php

class SimpleJob implements ShouldQueue
{
    // ...
    public function handle(): void
    {
        Log::info(sprintf('%s [%s] RAN', class_basename($this), $this->random_string));

        // Run this part only if in a batch
        if ($this->batch()) {
            // Instantiate a queue object
            $batch_queue = new BatchQueue($this->batch(), new RedisBatchQueueRepository());

            // Get the next item to treat from the queue
            $next_item  = $batch_queue->pop();

            // If there is an item, create a new job and add it to the batch
            if ($next_item) {
                $this->batch()->add(new static($next_item));
            }
        }
    }
}

The last bit of code is where the magic happens: once the job has finished its process, it checks the queue in order to add a new job to the batch. And this repeats until the queue is empty. All that is required now is a batch containing the first job with its item, and we'll have a progressive batch consuming a queue until the end.
This implementation has a small memory footprint and because we'll use progressive batches, multiple ones can run at the same time without blocking the queue. So we have 2 bullet points done:

  • [x] Batch should be able to handle up to 500 jobs

  • [x] Multiple batches should be able to run at the same time, without waiting for the previous one to finish
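
To visualize why the footprint stays small, here is a minimal in-memory simulation of the mechanism. It is only a sketch with hypothetical names: `$pending` plays the pending jobs of the batch, `$queue` plays the Redis list, and the list is assumed to contain at least one item.

```php
<?php

function run_progressive_batch(array $items): array
{
    $queue = $items;
    $pending = [array_shift($queue)]; // the batch starts with a single job
    $processed = [];
    $max_pending = 0;

    while ($pending !== []) {
        $max_pending = max($max_pending, count($pending));

        // A worker picks the job and processes its item
        $item = array_shift($pending);
        $processed[] = $item;

        // Before finishing, the job adds its successor to the batch
        $next_item = array_shift($queue);
        if ($next_item !== null) {
            $pending[] = $next_item;
        }
    }

    return ['processed' => $processed, 'max_pending' => $max_pending];
}

$result = run_progressive_batch(['a', 'b', 'c', 'd', 'e']);
```

Every item gets processed, yet the batch never holds more than one pending job at a time, which leaves room on the workers for the jobs of other batches.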

Now that we have the job, we need the batch and something to run it.

A command to test the features

We won't write unit tests as we should do in a normal development process.
Instead we'll write a command that allows us to test all our desired features.
It will be more interactive than unit tests, and we still won't see a web page, which is great. Yes, I'm a backend developer!

A command able to run multiple operations: the skeleton

Let's create the command:

sail artisan make:command BatchManager --command=batch:manage

What do we want to be able to do?

  • dispatch a batch

  • cancel a batch

  • retry all failed jobs of a batch

  • retry a job within a batch

  • inspect a batch

Let's make a skeleton for these features inside a nice interactive command.
First, we'll start by adding an empty method for each wanted operation:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
    }

    protected function inspectBatch(): void
    {
    }

    protected function cancelBatch(): void
    {
    }

    protected function retryBatch(): void
    {
    }

    protected function retryJob(): void
    {
    }
}

Now we need a way to choose which operation to run, and ideally to run as many operations without leaving the command.
Let's code the operation manager and plug it into the handle() method:

# app/Console/Commands/BatchManager.php

// Function imports belong at the top of the file, outside the class
use function Laravel\Prompts\select;

class BatchManager extends Command
{

    public function handle()
    {
        $this->chooseOperation();
    }

    private function chooseOperation(): void
    {
        $operations = [
            'dispatchBatch' => 'Dispatch a new batch',
            'cancelBatch' => 'Cancel a batch',
            'retryBatch' => 'Retry all failed jobs of a batch',
            'retryJob' => 'Retry a job',
            'inspectBatch' => 'Inspect a batch',
            'exit' => 'Exit',
        ];
        $operation = select('What do you want to do now?', options: $operations, default: 'exit', scroll: 10);

        // Run operation or exit command
        if ($operation == 'exit') {
            $this->question('Bye');
        } else {
            $this->$operation();
            $this->chooseOperation();
        }
    }
}

Try to run it:

sail artisan batch:manage

You'll see a menu to choose the operation you want, and when the operation ends, the menu comes back so that you can choose another operation to run. Choosing Exit exits the command.
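
Note that chooseOperation() calls itself after each operation, so every operation adds a level to the call stack. This is harmless for a handful of operations in a test command. If it ever bothers you, a loop does the same job. Here is a minimal standalone sketch of the idea, where `run_menu` is a hypothetical function and `$choices` stands in for the answers given to the select() prompt:

```php
<?php

function run_menu(array $choices, array $operations): array
{
    $ran = [];

    foreach ($choices as $choice) {
        if ($choice === 'exit') {
            break; // leave the menu
        }
        if (!isset($operations[$choice])) {
            continue; // unknown operation: ask again
        }

        $operations[$choice]();
        $ran[] = $choice;
    }

    return $ran;
}

$operations = [
    'dispatchBatch' => fn () => print("dispatching\n"),
    'inspectBatch' => fn () => print("inspecting\n"),
];

// Everything after 'exit' is ignored
$ran = run_menu(['dispatchBatch', 'inspectBatch', 'exit', 'dispatchBatch'], $operations);
```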

Inspecting a batch

In order to know what happens and what the state of a batch we ran is, we'll add some kind of batch inspector. It has 2 parts:

  • record the ids of the batches we run

  • display data about the desired batch

Most of the first part will be taken care of in the next section, but we can already add the properties required to store the batch ids:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected array $batch_ids = [];
    protected string $current_batch_id;
    // ...
}

And for the second part, we need a way to choose the batch to operate on:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
   // ...
   protected function getBatchId(): string
   {
       return count($this->batch_ids) > 0 ? select('Select batch for operation', $this->batch_ids) : $this->current_batch_id;
   }
   // ...
}

And a way to inspect a batch, which is straightforward:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function inspectBatch(): void
    {
        $batch_id = $this->getBatchId();
        $batch = Bus::findBatch($batch_id);
        dump($batch->toArray());
    }
    // ...
}

We can inspect the batch, now let's add a way to dispatch some!

Dispatch a simple batch

The first step is to dispatch a batch. We'll start with a minimal one and build it up stone by stone, following our requirements.
As we still want a fancy command, let's start by having a dynamic number of jobs:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        $this->info('Dispatch batch');

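        // text() returns a string: cast it before comparing
        $nb_jobs = (int) text('How many jobs should the batch contain?', default: '5');

        if ($nb_jobs < 1) {
            $this->error('Cannot dispatch a batch with less than 1 job.');
            $this->dispatchBatch();

            return; // do not carry on with the invalid value
        }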
    }
    // ...
}
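
If you want a stricter check on the input, and to enforce the 500 jobs limit of our feature list at the same time, here is one possible sketch based on filter_var(). The `parse_job_count` helper is hypothetical and not used in the rest of the article:

```php
<?php

// Hypothetical helper: returns the job count, or null when the input is invalid.
function parse_job_count(string $input, int $max = 500): ?int
{
    $count = filter_var(trim($input), FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => $max],
    ]);

    return $count === false ? null : $count;
}
```

A null result means the question should be asked again.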

Add a minimal batch:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        // ... 
        $batch = Bus::batch([])
            ->dispatch();
    }
    // ...
}

And last but not least, feed our properties for further operations:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        // ... 
        $this->current_batch_id = $batch->id;
        $this->batch_ids[] = $batch->id;
    }
    // ...
}

Now, you should be able to use the command:

sail artisan batch:manage

Choose Dispatch a new batch and then Inspect a batch to get an output like the following:

array:11 [
  "id" => "9c57a570-ecb4-41ff-bd4c-c135d953b008"
  "name" => ""
  "totalJobs" => 0
  "pendingJobs" => 0
  "processedJobs" => 0
  "progress" => 0
  "failedJobs" => 0
  "options" => []
  "createdAt" => Carbon\CarbonImmutable @1719002517^ {#695
    #endOfTime: false
    #startOfTime: false
    #constructedObjectId: "00000000000002b70000000000000000"
    -clock: null
    #localMonthsOverflow: null
    #localYearsOverflow: null
    #localStrictModeEnabled: null
    #localHumanDiffOptions: null
    #localToStringFormat: null
    #localSerializer: null
    #localMacros: null
    #localGenericMacros: null
    #localFormatFunction: null
    #localTranslator: null
    #dumpProperties: array:3 [
      0 => "date"
      1 => "timezone_type"
      2 => "timezone"
    ]
    #dumpLocale: null
    #dumpDateProperties: null
    date: 2024-06-21 20:41:57.0 UTC (+00:00)
  }
  "cancelledAt" => null
  "finishedAt" => null
]

Implement features

Run batch with the data

For now, we only run an empty batch, it's time to add real data.
We're going to process a list of 30-character random strings.
Let's create that:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        // ...
        $data = array_map(fn ($_) => Str::random(30), range(1, $nb_jobs));
    }
    // ...
}

Now, consider the way our progressive batch and our queue work:

  • the batch should be initiated with a job and its arguments

  • the queue key contains the batch id

This means that we have to build the first job to feed the batch with, which is simple.
But what about creating the queue, which will be used by the jobs, with the batch id? How can we create the queue after the batch was launched?
We're going to use the before() callback for this purpose. It runs once the batch is created, so we have its id, but before any job runs, so if we create the queue there, it will be available for the first job. We'll end up with something like this:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        // ...
        $data = array_map(fn ($_) => Str::random(30), range(1, $nb_jobs));

        // Add first job to batch job list
        $first_data = array_shift($data);
        $batch = Bus::batch([new SimpleJob($first_data)])
            // Create the queue before any job runs
            ->before(function (Batch $batch) use ($data) {
                Log::info(sprintf('Batch [%s] created.', $batch->id));

                $batch_queue = new BatchQueue($batch, new RedisBatchQueueRepository());
                $batch_queue->data = $data;
                $batch_queue->create();
            })
            ->dispatch();
    }
    // ...
}

Let's try it!
Follow your logs in a terminal:

tail -f storage/logs/laravel.log

In another terminal, restart and monitor the queue if not done already:

sail artisan queue:restart && sail artisan queue:work

And in another terminal, run a batch:

sail artisan batch:manage

Choose 'Dispatch a new batch' and you should see logs like this:

[...] local.INFO: Batch [9c5c560b-b692-4c42-b7f7-49dc9f311922] created.  
[...] local.INFO: SimpleJob [VdKHNFgRqUiuXn6B7wXekhWd1CATL0] RAN  
[...] local.INFO: SimpleJob [CZ67UGaZZEkUCYv5guJgiJvzrDDuSl] RAN  
[...] local.INFO: SimpleJob [DxfoCXdJzhRju2NuaPNXwQmN6vyrXM] RAN  
[...] local.INFO: SimpleJob [Vg3PTcThDCx1IFeiO1iczkDmrtaGWp] RAN  
[...] local.INFO: SimpleJob [xTsCUX2qOHYbBkwIcR804wikhmEVEb] RAN

Job failures should not cause a batch to be cancelled

This one is a usual suspect and the answer is easy: just use allowFailures().
In this kind of case, it is very tempting to implement the solution and test afterwards that everything's fine. Here, we'll go the other way: we're going to make our code fail before applying the fix. Why so? First, doing it this way ensures that the code runs fine because of the fix we apply, not because of any other code or, worse, because the code supposed to trigger the failure is wrong. Second, well, I find it very satisfying to fix code I've broken myself.
So, let's break the code. To ease this process, we'll make the first job fail by passing a string that is too long:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        // ...
        // Add first job to batch job list
        $first_data = array_shift($data);
        $first_data .= '_fails';
        //...
    }
    // ...
}

We also add a log when the batch ends:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        // ...
        $batch = Bus::batch([new SimpleJob($first_data)])
            // ...
            ->finally(function (Batch $batch) {
                Log::info(sprintf('Batch [%s] finally ended.', $batch->id));
            })
            ->dispatch();
    }
    // ...
}

And throw an error in the job if the string is not 30 characters long:

# app/Jobs/SimpleJob.php

class SimpleJob implements ShouldQueue
{
    public function handle(): void
    {
        if (strlen($this->random_string) != 30) {
            Log::info(sprintf('%s [%s] FAILS', class_basename($this), $this->random_string));
            throw new \Exception('Random string must be 30 characters long');
        }

        //...
    }
}

We could use a Validator but we use direct check to reduce boilerplate.
It's time to test the failure:

Restart the queue as we updated the job:

sail artisan queue:restart && sail artisan queue:work

Run a batch:

sail artisan batch:manage

And check the logs to see what happened:

[...] local.INFO: Batch [9c5c64ee-a3fc-4cf9-9a08-cea939fe9ece] created.  
[...] local.INFO: SimpleJob [jeWzCsdQOk4brTbPXVLRbQRYXI1kSy_fails] FAILS  
[...] local.INFO: Batch [9c5c64ee-a3fc-4cf9-9a08-cea939fe9ece] finally ended.  
[...] local.ERROR: Random string must be 30 characters long 
...

As expected, we see that the first job failed and the batch ended immediately.
Add the allowFailures():

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        // ...
        $batch = Bus::batch([new SimpleJob($first_data)])
            // ...
            ->allowFailures()
            ->dispatch();
    }
    // ...
}

And run a new batch:

sail artisan batch:manage

And inspect the logs:

[...] local.INFO: Batch [9c5c65d2-ed08-4780-98e8-649963543d85] created.  
[...] local.INFO: SimpleJob [GRFSxz2WOQtsRYKcKqJafRh0d0r2X1_fails] FAILS  
[...] local.INFO: Batch [9c5c65d2-ed08-4780-98e8-649963543d85] finally ended.  
[...] local.ERROR: Random string must be 30 characters long
...

No differences. What went wrong?
This is another drawback of progressive batches. Because jobs are added to the batch one by one, a job that fails before adding its successor is the last job of the batch. Thus the batch stops and there is nothing left to process. We can alleviate this by managing errors in the job itself:

# app/Jobs/SimpleJob.php

class SimpleJob implements ShouldQueue
{
    public function handle(): void
    {
        // Normal process
        try {
            if (strlen($this->random_string) != 30) {
                throw new \Exception('Random string must be 30 characters long');
            }

            Log::info(sprintf('%s [%s] RAN', class_basename($this), $this->random_string));
        } 
        // In case of error, log and rethrow the error 
        catch (\Throwable $e) {
            Log::info(sprintf('%s [%s] FAILS', class_basename($this), $this->random_string));
            throw $e;
        } 
        // If running in a batch, add next job to it
        finally {
            if ($this->batch()) {
                $batch_queue = new BatchQueue($this->batch(), new RedisBatchQueueRepository());
                $next_item  = $batch_queue->pop();
                if ($next_item) {
                    $this->batch()->add(new static($next_item));
                }
            }
        }
    }
}
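
If the order of execution of try/catch/finally is not crystal clear, this standalone sketch records it. It has nothing to do with Laravel; the `run_job` function and the event labels are only there for the illustration:

```php
<?php

// Records the order in which the blocks run
function run_job(bool $should_fail, array &$events): void
{
    try {
        if ($should_fail) {
            throw new Exception('Random string must be 30 characters long');
        }
        $events[] = 'ran';
    } catch (Throwable $e) {
        $events[] = 'fails';
        throw $e; // the job is still reported as failed
    } finally {
        $events[] = 'next job added'; // runs in both cases
    }
}

$events = [];
try {
    run_job(true, $events);
} catch (Exception $e) {
    $events[] = 'failure reported';
}

print_r($events); // fails, next job added, failure reported
```

The finally block runs before the rethrown exception reaches the caller: the next job is added to the batch before the queue worker marks the current one as failed.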

Test this implementation by restarting the queue:

sail artisan queue:restart && sail artisan queue:work

Run a batch:

sail artisan batch:manage

And check the logs:

[...] local.INFO: Batch [9c5c6aca-5674-41e9-bcef-9d8a64a3897d] created.  
[...] local.INFO: SimpleJob [D6ht6uziadFoZcbBxq0vz0NdnMsY38_fails] FAILS  
[...] local.ERROR: Random string must be 30 characters long  
...
[...] local.INFO: SimpleJob [jtjto8RnmUK45CyEZbsxEO0rs3N6zz] RAN  
[...] local.INFO: SimpleJob [KqmxPlUuyYYqkqVbgSCUccnzJctlYB] RAN  
[...] local.INFO: SimpleJob [Xnt8j1RH1HYpsNzD7ILrluk22sEs4G] RAN  
[...] local.INFO: SimpleJob [XJ25OEPgfAVAmFgwnDln7HPVmvgdNW] RAN  
[...] local.INFO: Batch [9c5c6aca-5674-41e9-bcef-9d8a64a3897d] finally ended.

Now everything's fine.

But do we still need the allowFailures()?
Good question, if we remove it, we can see that everything still runs fine as expected. For our current use case, it has no impact but it can have one if we decide to treat more than one job at a time. For now, we'll keep the allowFailures() for safety.

Batch cancelling

Now, let's tackle these two points:

  • [ ] Batch should be cancellable

  • [ ] Pending jobs of a cancelled batch should not run

First, add a way to cancel a batch to our command. We already have the cancelBatch() method, let's fill it in. It is easy as we just need to call cancel() on the batch:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function cancelBatch(): void
    {
        $batch = Bus::findBatch($this->current_batch_id);
        $batch->cancel();
    }
    // ...
}

Easy right... But aren't we missing something? What about the queue created for the batch? We need to delete it too:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function cancelBatch(): void
    {
        $batch = Bus::findBatch($this->current_batch_id);
        $batch->cancel();

        // Clear the queue
        $batch_queue = new BatchQueue($batch, new RedisBatchQueueRepository());
        $batch_queue->delete();
    }
    // ...
}

Let's slow our job down so that we have time to cancel the batch:

# app/Jobs/SimpleJob.php

class SimpleJob implements ShouldQueue
{
    public function handle(): void
    {
        sleep(1);
        //...
    }
}

Restart and monitor the queue:

sail artisan queue:restart  && sail artisan queue:work

And try to cancel the batch:

sail artisan batch:manage

Choose Dispatch a new batch and then Cancel a batch.

Watching the log gives something like this result:

[...] local.INFO: Batch [9c5f8551-5a8d-45e0-9dfc-363df89a45e8] created.  
[...] local.INFO: SimpleJob [6UfseY87vde7tI3Bq9raVVrc97klEK_fails] FAILS  
[...] local.ERROR: Random string must be 30 characters long ...
[...] local.INFO: SimpleJob [k3RIBLilfuq0mY3QKIYS7m7XB58iFy] RAN  
[...] local.INFO: Batch [9c5f8551-5a8d-45e0-9dfc-363df89a45e8] finally ended.

OK, it seems that cancelling is enough to stop the process. Nice, but why so? We didn't bypass the job execution in the code.
There are two things here:

  • our progressive batch only adds one job at a time, so when the batch is cancelled, there is no job left to process

  • when cancelling, we also remove the queue, therefore no new job can be added to the batch as there is no data left to process

Retrying a batch

This one is also a quick win, and with what we learned in the previous article, no explanation is needed:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function retryBatch(): void
    {
        $batch_id = $this->getBatchId();
        if (confirm(sprintf('Are you sure to retry all failed jobs of batch [%s]?', $batch_id), false)) {
            // Retry the batch that was selected, not necessarily the last dispatched one
            $this->call('queue:retry-batch', ['id' => $batch_id]);
        }
    }
    // ...
}

Retrying a specific job

This one is trickier than the previous one. We will again rely on an artisan command: queue:retry, which requires a failed job id to run. But how do we get this identifier? Let's do it bit by bit.
First, we need to retrieve the failed jobs of a batch. We could just query the job_batches table and get the failed_job_ids list, then query the failed_jobs table with those ids to get details about the failed jobs. This would work, but let's do it in a safer way and use Laravel for this purpose... and build a nice interface. Let's start by retrieving all the failed jobs of a batch:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function retryJob(): void
    {
        $batch_id = $this->getBatchId();
        $batch = Bus::findBatch($batch_id);

        $queue_failer = app('queue.failer');

        $failed_jobs = collect($batch->failedJobIds)
            ->mapWithKeys(function ($failed_job_id) use ($queue_failer) {
                // Do things
            });
    }
    // ...
}

The app('queue.failer') will allow us to access failed job details. If you haven't looked into the failed_jobs table yet to see what a record looks like, let's check it:

sail artisan tinker
DB::table('failed_jobs')->orderByDesc('failed_at')->first()
= {#5203
    +"id": 89,
    +"uuid": "aa54385a-dc9c-426c-8115-11ac03691135",
    +"connection": "redis",
    +"queue": "default",
    +"payload": "{"uuid":"aa54385a-dc9c-426c-8115-11ac03691135","displayName":"App\\Jobs\\SimpleJob","job":"Illuminate\\Queue\\CallQueuedHandler@call","maxTries":null,"maxExceptions":null,"failOnTimeout":false,"backoff":null,"timeout":null,"retryUntil":null,"data":{"commandName":"App\\Jobs\\SimpleJob","command":"O:18:\"App\\Jobs\\SimpleJob\":2:{s:13:\"random_string\";s:36:\"6UfseY87vde7tI3Bq9raVVrc97klEK_fails\";s:7:\"batchId\";s:36:\"9c5f8551-5a8d-45e0-9dfc-363df89a45e8\";}"},"telescope_uuid":"9c5f8551-7946-490e-8f19-104f4e10f325","id":"XdHblzbX8bdLQZS3YXKrTH49j7qONmpF","attempts":0}",
    +"exception": """
      Exception: Random string must be 30 characters long in /var/www/html/app/Jobs/SimpleJob.php:38\n
      ...
      """,
    +"failed_at": "2024-06-25 18:38:48",
  }

We can see that a failed job has an id (which is not the one to use), a uuid used to retry it, info about the connection and queue where it ran, the exception responsible for the failure and the failure date. But what interests us for now is the payload:

{
    "uuid": "aa54385a-dc9c-426c-8115-11ac03691135",
    "displayName": "App\\Jobs\\SimpleJob",
    "job": "Illuminate\\Queue\\CallQueuedHandler@call",
    "maxTries": null,
    "maxExceptions": null,
    "failOnTimeout": false,
    "backoff": null,
    "timeout": null,
    "retryUntil": null,
    "data": {
        "commandName": "App\\Jobs\\SimpleJob",
        "command": "O:18:\"App\\Jobs\\SimpleJob\":2:{s:13:\"random_string\";s:36:\"6UfseY87vde7tI3Bq9raVVrc97klEK_fails\";s:7:\"batchId\";s:36:\"9c5f8551-5a8d-45e0-9dfc-363df89a45e8\";}"
    },
    "telescope_uuid": "9c5f8551-7946-490e-8f19-104f4e10f325",
    "id": "XdHblzbX8bdLQZS3YXKrTH49j7qONmpF",
    "attempts": 0
}

This payload is what is pushed to the queue in order to be processed, and what's nice here is that we have all the arguments used to run our job. You can see that the batchId is part of these arguments: it is used by the job via the ->batch() method to operate on the batch. Let's extract the data and build our interface to choose which job to retry:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function retryJob(): void
    {
        $batch_id = $this->getBatchId();
        $batch = Bus::findBatch($batch_id);
        $queue_failer = app('queue.failer');
        $failed_jobs = collect($batch->failedJobIds)
            ->mapWithKeys(function ($failed_job_id) use ($queue_failer) {
                // Get job data
                $job_data = $queue_failer->find($failed_job_id);
                $payload = json_decode($job_data->payload, true);
                $job_class = unserialize($payload['data']['command']);

                // Extract arg names from class constructor signature and their data from unserialized class
                $args = [];
                $job_reflection = new ReflectionClass($job_class);
                $constructor = $job_reflection->getConstructor();
                // getConstructor() returns null when the job class has no constructor
                foreach ($constructor?->getParameters() ?? [] as $param) {
                    $param_name = $param->getName();
                    $args[$param_name] = $job_class->$param_name;
                }

                $job = [
                    'uuid' => $payload['uuid'],
                    'class' => $payload['data']['commandName'],
                    'args' => json_encode($args, JSON_PRETTY_PRINT),
                    'failed_at' => $job_data->failed_at,
                ];

                return [$payload['uuid'] => $job];
            });


        table(['uuid', 'class', 'args', 'failed_at'], $failed_jobs);

        $job_uuid_to_retry = select('Which job do you want to retry', array_column($failed_jobs->toArray(), 'uuid'));
        $this->call('queue:retry', ['id' => $job_uuid_to_retry]);

        $this->info('Job pushed to queue for retry');
    }
    // ...
}
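
If you want to try the extraction step on its own, outside Laravel, here is a minimal sketch. It assumes the job stores each constructor argument in a public property with the same name, which is what the serialized SimpleJob above suggests. The SimpleJob class and the extractJobArgs() helper below are stand-ins written for this illustration, not the code of the series:

```php
<?php

// Hypothetical stand-in for the article's job: public properties named after constructor parameters.
class SimpleJob
{
    public ?string $batchId = null;

    public function __construct(public string $random_string)
    {
    }
}

/**
 * Map each constructor parameter name to the value stored on the unserialized job.
 */
function extractJobArgs(object $job): array
{
    $constructor = (new ReflectionClass($job))->getConstructor();
    if ($constructor === null) {
        return []; // No constructor, so no arguments to report
    }

    $args = [];
    foreach ($constructor->getParameters() as $param) {
        $name = $param->getName();
        $args[$name] = $job->$name; // Assumes a public property with the parameter's name
    }

    return $args;
}

// Build a payload shaped like the one stored in failed_jobs (only the keys used here).
$job = new SimpleJob('6UfseY87vde7tI3Bq9raVVrc97klEK_fails');
$job->batchId = '9c5f8551-5a8d-45e0-9dfc-363df89a45e8';

$raw_payload = json_encode([
    'uuid' => 'aa54385a-dc9c-426c-8115-11ac03691135',
    'data' => [
        'commandName' => SimpleJob::class,
        'command' => serialize($job),
    ],
]);

// Same steps as in retryJob(): decode the JSON, unserialize the command, read the arguments.
$payload = json_decode($raw_payload, true);
$restored = unserialize($payload['data']['command'], ['allowed_classes' => [SimpleJob::class]]);
$args = extractJobArgs($restored);

echo json_encode($args, JSON_PRETTY_PRINT), PHP_EOL;
```

Note that batchId is not a constructor parameter, so it is left out of the result. A job keeping its arguments in private or protected properties would need another way to read them.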

Let's test it:

sail artisan batch:manage

Choose Dispatch a new batch and then Retry a job. You should see a table listing the failed jobs, followed by a prompt asking which one to retry.

Choose a job and it will be retried. Obviously, if the job still throws an exception, it will fail again!

Progression

Our implementation has to address this:

  • [ ] Progression should be multiples of 10 from 0 until 100 (0, 10, 20, etc.)

  • [ ] Progression should reflect the number of the ran jobs (succeeded or failed)

We know we can't rely on the Batch::progress() method because it uses the batch's total number of jobs and, because our progressive batch adds jobs one by one, this figure is wrong. But we are managing the batch's data ourselves, so getting the real total number of jobs is easy:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        $data = array_map(fn ($_) => Str::random(30), range(1, $nb_jobs));
        $total_jobs = count($data);
        // ...
    }

    // ...
}

For the count of processed jobs, $batch->processedJobs() looks relevant, as does $batch->failedJobs for the number of failed ones. The rest is just a little bit of maths:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        $batch = Bus::batch([new SimpleJob($first_data)])
            // ...
            ->progress(function (Batch $batch) use ($total_jobs) {
                $previous_progress = $total_jobs > 0 ? round((($batch->processedJobs() - 1) / $total_jobs) * 100) : 0;
                $progress = $total_jobs > 0 ? round(($batch->processedJobs() / $total_jobs) * 100) : 0;

                if (floor($previous_progress / 10) != floor($progress / 10)) {
                    Log::info(sprintf(
                        'Batch [%s] progress : %d/%d [%d%%]',
                        $batch->id,
                        $batch->processedJobs(),
                        $total_jobs,
                        (int)(floor($progress / 10) * 10),
                    ));
                }
            })
            //...
            ->dispatch();
    }

    // ...
}
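
The threshold test used in the callback can be checked in isolation. The sketch below copies the same formula into two plain functions, progressStep() and reportedSteps(), which are names made up for this illustration. It only looks at the maths and says nothing about which job count we should feed into it:

```php
<?php

/**
 * Return the multiple of 10 to report when the latest job crossed a 10% boundary, or null otherwise.
 */
function progressStep(int $nb_processed_jobs, int $total_jobs): ?int
{
    if ($total_jobs <= 0) {
        return null;
    }

    $previous_progress = round((($nb_processed_jobs - 1) / $total_jobs) * 100);
    $progress = round(($nb_processed_jobs / $total_jobs) * 100);

    if (floor($previous_progress / 10) == floor($progress / 10)) {
        return null; // Still in the same 10% bucket, nothing to report
    }

    return (int) (floor($progress / 10) * 10);
}

/**
 * List every value that would be reported while the jobs are processed one by one.
 */
function reportedSteps(int $total_jobs): array
{
    $steps = [];
    for ($n = 1; $n <= $total_jobs; $n++) {
        $step = progressStep($n, $total_jobs);
        if ($step !== null) {
            $steps[] = $step;
        }
    }

    return $steps;
}

echo implode(', ', reportedSteps(5)), PHP_EOL;
echo implode(', ', reportedSteps(20)), PHP_EOL;
```

With 5 jobs, every job crosses a boundary and the reported values are 20, 40, 60, 80 and 100. With 20 jobs, only every second job does, which gives 10, 20, and so on up to 100.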

Let's run a batch to see the code in action:

sail artisan batch:manage

and choose Dispatch a new batch.
Logs will look like this:

[...] local.INFO: Batch [9c6195a9-64f4-4843-aca9-c4abe54625d6] created.  
[...] local.INFO: SimpleJob [Twv2kbWENspwEiH3kV1IzW4hO6lrEg_fails] FAILS  
[...] local.INFO: PJ: 0 -> 0  
[...] local.INFO: Batch [9c6195a9-64f4-4843-aca9-c4abe54625d6] progress : 0/5 [0%]  
[...] local.ERROR: Random string must be 30 characters long {"exception":"[object] 
[...] local.INFO: SimpleJob [0NAOUVK2YKQKATDOk4C6ga7Ox8X55K] RAN  
[...] local.INFO: Batch [9c6195a9-64f4-4843-aca9-c4abe54625d6] progress : 1/5 [20%]  
[...] local.INFO: SimpleJob [gFg5XZO8yfR5CpCzMJzzvuB7OaB8Gx] RAN  
[...] local.INFO: Batch [9c6195a9-64f4-4843-aca9-c4abe54625d6] progress : 2/5 [40%]  
[...] local.INFO: SimpleJob [rJphHxtxvGhOzxIlut3VJtveux73Ls] RAN  
[...] local.INFO: Batch [9c6195a9-64f4-4843-aca9-c4abe54625d6] progress : 3/5 [60%]  
[...] local.INFO: SimpleJob [bQiDkksyLQeNZhp82LNssHI3xzFyR5] RAN  
[...] local.INFO: Batch [9c6195a9-64f4-4843-aca9-c4abe54625d6] progress : 4/5 [80%]  
[...] local.INFO: Batch [9c6195a9-64f4-4843-aca9-c4abe54625d6] finally ended.

This isn't what we expected: starting at 0% and ending at 80% is wrong.
I have to admit that I tested this code without any job failing, saw the desired 100% and was fine with the code.
Again, testing the unhappy path is key. So, what's wrong? The problem is the use of $batch->processedJobs(). If we inspect the method, we see this:

/**
 * Get the total number of jobs that have been processed by the batch thus far.
 *
 * @return int
 */
public function processedJobs()
{
    return $this->totalJobs - $this->pendingJobs;
}

Here, $this->pendingJobs is the number of jobs that still need to be processed. This includes failed jobs! So, because the number of jobs in a progressive batch is the number of jobs already run, even if they have failed, we should have used $batch->totalJobs instead of $batch->processedJobs(). A new problem arises, as it leads to code like this:

$progress =  $total_jobs > 0 ? round(($batch->totalJobs / $total_jobs) * 100) : 0;

Having total_jobs that many times on the same line is hard to read, so we'll use a new variable and a comment to explain our intent:

// The total number of jobs of a progressive batch is the number of jobs already processed 
$nb_processed_jobs = $batch->totalJobs;

And update the code:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        $batch = Bus::batch([new SimpleJob($first_data)])
            // ...
            ->progress(function (Batch $batch) use ($total_jobs) {
                // The total number of jobs of a progressive batch is the number of jobs already processed
                $nb_processed_jobs = $batch->totalJobs;
                $previous_progress = $total_jobs > 0 ? round((($nb_processed_jobs - 1) / $total_jobs) * 100) : 0;
                $progress = $total_jobs > 0 ? round(($nb_processed_jobs / $total_jobs) * 100) : 0;

                if (floor($previous_progress / 10) != floor($progress / 10)) {
                    Log::info(sprintf(
                        'Batch [%s] progress : %d/%d [%d%%]',
                        $batch->id,
                        $nb_processed_jobs,
                        $total_jobs,
                        (int)(floor($progress / 10) * 10),
                    ));
                }
            })
            //...
            ->dispatch();
    }

    // ...
}

Now, re-run the command:

sail artisan batch:manage

Choose Dispatch a new batch and inspect the logs:

[...] local.INFO: Batch [9c61a0ef-9f70-4825-ae7d-cf1f009d0d8b] created.  
[...] local.INFO: SimpleJob [Mgk5lk3PdLRvvUt9Ytvqe9eG3NgtL5_fails] FAILS  
[...] local.INFO: Batch [9c61a0ef-9f70-4825-ae7d-cf1f009d0d8b] progress : 2/5 [40%]  
[...] local.ERROR: Random string must be 30 characters long {"exception":"[object] 
...
[...] local.INFO: SimpleJob [Mapetbse0RIgdBqW80HC1ozTB7Xxvx] RAN  
[...] local.INFO: Batch [9c61a0ef-9f70-4825-ae7d-cf1f009d0d8b] progress : 3/5 [60%]  
[...] local.INFO: SimpleJob [7dHApXlL5Q4VHaRJBXEgXN2tUbFEWS] RAN  
[...] local.INFO: Batch [9c61a0ef-9f70-4825-ae7d-cf1f009d0d8b] progress : 4/5 [80%]  
[...] local.INFO: SimpleJob [QxJChi5ggrv4zPoBcZiECDWLqUzZqj] RAN  
[...] local.INFO: Batch [9c61a0ef-9f70-4825-ae7d-cf1f009d0d8b] progress : 5/5 [100%]  
[...] local.INFO: SimpleJob [HXq5RlGQ6FBV8G3O0b9hu0PBA2AUdS] RAN  
[...] local.INFO: Batch [9c61a0ef-9f70-4825-ae7d-cf1f009d0d8b] progress : 5/5 [100%]  
[...] local.INFO: Batch [9c61a0ef-9f70-4825-ae7d-cf1f009d0d8b] finally ended.

It's worse: now we start at 40% and we have 100% twice... What happened?
We assumed that $batch->totalJobs is the batch's number of jobs, and this is true. But we didn't take into account when the progress() callback is run: it runs after a new job has been added to the batch, so $batch->totalJobs already includes that new job. To get the correct figure, we need the number of jobs that have completed successfully plus the number of failed jobs, leaving out the jobs that still have to run. We'll have to move from this:

// The total number of jobs of a progressive batch is the number of jobs already processed 
$nb_processed_jobs = $batch->totalJobs;

to this:

$nb_processed_jobs = $batch->processedJobs() + $batch->failedJobs;

Now if we test, logs are fine:

[...] local.INFO: Batch [9c61a651-0050-41ae-a129-ebd515f3af25] created.  
[...] local.INFO: SimpleJob [86wD8mpPBPvMI3JMcPTjWMoY30yfiF_fails] FAILS  
[...] local.INFO: Batch [9c61a651-0050-41ae-a129-ebd515f3af25] progress : 1/5 [20%]  
[...] local.ERROR: Random string must be 30 characters long 
...
[...] local.INFO: SimpleJob [7SsZRZx4uf7WmVQbQawhtQL0QWkWRX] RAN  
[...] local.INFO: Batch [9c61a651-0050-41ae-a129-ebd515f3af25] progress : 2/5 [40%]  
[...] local.INFO: SimpleJob [K1BOReEl5QudUnwFE1UotYKUOsRBm2] RAN  
[...] local.INFO: Batch [9c61a651-0050-41ae-a129-ebd515f3af25] progress : 3/5 [60%]  
[...] local.INFO: SimpleJob [nONm3tRUCgJPulN3k1RmFsxlPCPaRr] RAN  
[...] local.INFO: Batch [9c61a651-0050-41ae-a129-ebd515f3af25] progress : 4/5 [80%]  
[...] local.INFO: SimpleJob [zrshmNG8q8M61n207BBVxusRZw5bCz] RAN  
[...] local.INFO: Batch [9c61a651-0050-41ae-a129-ebd515f3af25] progress : 5/5 [100%]  
[...] local.INFO: Batch [9c61a651-0050-41ae-a129-ebd515f3af25] finally ended.
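
To see why the three attempts give three different series of figures, we can replay the counters by hand. The sketch below is a simplified model and not Laravel's Batch class: BatchCounters is a made-up name, and it assumes that adding a job increments both totalJobs and pendingJobs, that a success decrements pendingJobs, that a failure only increments failedJobs, and that each job of our progressive batch adds its successor before it finishes:

```php
<?php

// Simplified, hypothetical model of the counters a batch keeps; not Laravel's Batch class.
final class BatchCounters
{
    public int $totalJobs = 0;
    public int $pendingJobs = 0;
    public int $failedJobs = 0;

    public function add(): void
    {
        $this->totalJobs++;
        $this->pendingJobs++;
    }

    public function recordSuccess(): void
    {
        $this->pendingJobs--;
    }

    public function recordFailure(): void
    {
        $this->failedJobs++; // A failed job stays pending
    }

    public function processedJobs(): int
    {
        return $this->totalJobs - $this->pendingJobs;
    }
}

$total_jobs = 5;
$batch = new BatchCounters();
$batch->add(); // The batch is dispatched with its first job

$rows = [];
for ($job = 1; $job <= $total_jobs; $job++) {
    if ($job < $total_jobs) {
        $batch->add(); // Assumption: each job queues its successor before finishing
    }

    // As in the logs above, the first job fails and the others succeed
    if ($job === 1) {
        $batch->recordFailure();
    } else {
        $batch->recordSuccess();
    }

    // Values the progress() callback would read at this point
    $rows[] = [
        'processedJobs' => $batch->processedJobs(),
        'totalJobs' => $batch->totalJobs,
        'ran' => $batch->processedJobs() + $batch->failedJobs,
    ];
}

foreach ($rows as $i => $row) {
    printf(
        "job %d: processedJobs=%d totalJobs=%d ran=%d\n",
        $i + 1,
        $row['processedJobs'],
        $row['totalJobs'],
        $row['ran']
    );
}
```

Under these assumptions, processedJobs goes 0, 1, 2, 3, 4 (our first attempt), totalJobs goes 2, 3, 4, 5, 5 (the second one), and only the sum of processed and failed jobs goes 1, 2, 3, 4, 5.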

Allow application to get data about batch state

Now, let's tackle the last bullet points:

  • [ ] Application should have a way to track progression without asking for it

  • [ ] Application should have a way to know a batch has started without asking for it

  • [ ] Application should have a way to know a batch has ended without asking for it

  • [ ] Application should have a clear description of any failed job (class name + arguments + failed job id at minimum)

How to inform the application that a batch started, progressed, etc.?
This seems to be a good use case for events. When a batch reaches a certain state, we broadcast an event and any application listening to the channel we broadcast to will be informed. First, we need to list the desired events and the required data for each:

  • batch.started:

      {
          "batch_id": (string) The batch id
      }
    
  • batch.progressed:

      {
          "batch_id": (string) The batch id,
          "data": {
              "progress": (int) Progression percentage,
              "nb_jobs_processed": (int) The number of processed jobs (successful or failed),
              "nb_jobs_failed": (int) The number of failed jobs,
              "nb_jobs_total": (int) The total number of jobs,
          }
      }
    
  • batch.ended:

      {
          "batch_id": (string) The batch id
      }
    
  • batch.job_failed:

      {
          "batch_id": (string) The batch id,
          "data": {
              "uuid": (string) The failed job uuid,
              "error": (string) The error which caused the failure,
              "class": (string) The failed job class,
              "args": (object) The arguments used to run the job that failed,
          }
      }
    
  • batch.cancelled:

      {
          "batch_id": (string) The batch id
      }
    

All events will be broadcast to a channel called batch.

Note: We'll use Laravel events for this task, but keep in mind that an event can be a plain old PHP object (POPO). Such a simple object won't be broadcast to a channel, sent by SMS or anything else. What's interesting with this approach is that the event is kind of optional and can be used by other developers the way they want: they just have to listen to the event and use it for whatever they need. This is mostly how Telescope works.

Back to our events. By writing down names and data for each, we can see that they share more or less the same structure:

  • name is batch.[state_or_result]

  • data always contains the batch id and an additional array of data

We can then use an abstract class to implement the general behaviour and extend it for each event. We could have used a non-abstract parent class because there won't be any abstract methods, but making it abstract means it can't be instantiated. It's a clear message to other developers that the class should not be used directly, without having to add a comment somewhere.

Let's create our abstract class via the following command:

sail artisan make:event Batch/BatchEvent

The code is pretty straightforward:

# app/Events/Batch/BatchEvent.php

//...
abstract class BatchEvent implements ShouldBroadcast
{
    use Dispatchable, InteractsWithSockets, SerializesModels;

    public string $batch_id;
    public array $data;

    /**
     * Create a new event instance.
     */
    public function __construct(Batch $batch, array $data = [])
    {
        $this->batch_id = $batch->id;
        $this->data = $data;
    }

    /**
     * Format the event name from the class name
     *
     * @return string
     */
    public function broadcastAs(): string
    {
        return 'batch.' . Str::of(class_basename($this))
            ->remove('Batch')
            ->snake()
            ->toString();
    }

    /**
     * Get the channels the event should broadcast on.
     *
     * @return array<int, \Illuminate\Broadcasting\Channel>
     */
    public function broadcastOn(): array
    {
        return [
            new PrivateChannel('batch'),
        ];
    }
}

This allows our event classes to be very minimal. We don't even need to use the command to generate boilerplate. Let's review their code:

BatchStarted

# app/Events/Batch/BatchStarted.php

class BatchStarted extends BatchEvent
{
}

BatchProgressed

# app/Events/Batch/BatchProgressed.php

class BatchProgressed extends BatchEvent
{
}

BatchEnded

# app/Events/Batch/BatchEnded.php

class BatchEnded extends BatchEvent
{
}

BatchJobFailed

# app/Events/Batch/BatchJobFailed.php

class BatchJobFailed extends BatchEvent
{
}

BatchCancelled

# app/Events/Batch/BatchCancelled.php

class BatchCancelled extends BatchEvent
{
}
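
To double-check the name each subclass will broadcast under, here is a framework-free approximation of broadcastAs(). The function broadcastNameFor() is a name made up for this sketch, and the regular expression is only a rough equivalent of Laravel's snake-casing that is good enough for simple CamelCase class names like ours:

```php
<?php

/**
 * Approximate broadcastAs(): strip the namespace, remove "Batch", snake-case the rest.
 */
function broadcastNameFor(string $class): string
{
    // Equivalent of class_basename(): keep what follows the last backslash
    $pos = strrpos($class, '\\');
    $basename = $pos === false ? $class : substr($class, $pos + 1);

    $stripped = str_replace('Batch', '', $basename);

    // Insert an underscore before every capital letter except the first one, then lowercase
    $snake = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $stripped));

    return 'batch.' . $snake;
}

$classes = [
    'App\\Events\\Batch\\BatchStarted',
    'App\\Events\\Batch\\BatchProgressed',
    'App\\Events\\Batch\\BatchEnded',
    'App\\Events\\Batch\\BatchJobFailed',
    'App\\Events\\Batch\\BatchCancelled',
];

foreach ($classes as $class) {
    echo $class, ' => ', broadcastNameFor($class), PHP_EOL;
}
```

The five names match the ones we listed earlier, including batch.job_failed for BatchJobFailed, so adding a new event is only a matter of choosing the right class name.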

Now, let's use them! Starting with the simplest ones: batch.started and batch.ended. Their places are in the before() callback and in the finally() callback respectively. At the same time, we'll get rid of the logs:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        $batch = Bus::batch([new SimpleJob($first_data)])
            ->before(function (Batch $batch) use ($data) {
                // ...
                event(new BatchStarted($batch));
                // ...
            })
            // ...
            ->finally(function (Batch $batch) {
                event(new BatchEnded($batch));
            })
            ->allowFailures()
            ->dispatch();
    }
    // ...
}

Now the batch.cancelled event should be broadcast when the batch is cancelled:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function cancelBatch(): void
    {
        // ...

        event(new BatchCancelled($batch));
    }
    // ...
}

Now, for batch.progressed event, we need to build the data array but it's quite easy:

# app/Console/Commands/BatchManager.php

class BatchManager extends Command
{
    // ...
    protected function dispatchBatch(): void
    {
        // ...
        $batch = Bus::batch([new SimpleJob($first_data)])
            // ...
            ->progress(function (Batch $batch) use ($total_jobs) {
                $nb_processed_jobs = $batch->processedJobs() + $batch->failedJobs;
                $previous_progress = $total_jobs > 0 ? round((($nb_processed_jobs - 1) / $total_jobs) * 100) : 0;
                $progress = $total_jobs > 0 ? round(($nb_processed_jobs / $total_jobs) * 100) : 0;

                if (floor($previous_progress / 10) != floor($progress / 10)) {
                    event(new BatchProgressed($batch, [
                        'progress' => (int)(floor($progress / 10) * 10),
                        'nb_jobs_processed' => $nb_processed_jobs,
                        'nb_jobs_failed' => $batch->failedJobs,
                        'nb_jobs_total' =>  $total_jobs,
                    ]));
                }
            })
            // ...
            ->dispatch();
    }
    // ...
}

And last but not least, batch.job_failed will be broadcast by the job when it fails. We can't use the catch() callback because it is called only on the first failure:

# app/Jobs/SimpleJob.php

class SimpleJob implements ShouldQueue
{
    public function handle(): void
    {
        try {
            // ...
        } catch (Throwable $e) {
            if ($this->batch()) {
                // Same extraction as in the BatchManager command: constructor arg names and their values
                $args = [];
                $job_reflection = new ReflectionClass($this);
                $constructor = $job_reflection->getConstructor();
                foreach ($constructor?->getParameters() ?? [] as $param) {
                    $param_name = $param->getName();
                    $args[$param_name] = $this->$param_name;
                }

                event(new BatchJobFailed($this->batch(), [
                    'uuid' => $this->job->uuid(),
                    'error' => $e->getMessage(),
                    'class' =>  get_class($this),
                    'args' => $args,
                ]));
            }

            throw $e;
        } finally {
            // ...
        }
    }
}

Now let's test!
Restart the queue:

sail artisan queue:restart && sail artisan queue:work

And run the command:

sail artisan batch:manage

If you inspect the logs, you should see something like this:

[...] local.INFO: Broadcasting [batch.started] on channels [private-batch] with payload:
{
    "batch_id": "9c638a95-495f-4b52-b972-401666d2dbe3",
    "data": [],
    "socket": null
}  
[...] local.ERROR: Random string must be 30 characters long 
...
[...] local.INFO: Broadcasting [batch.job_failed] on channels [private-batch] with payload:
{
    "batch_id": "9c638a95-495f-4b52-b972-401666d2dbe3",
    "data": {
        "uuid": "890027e6-f7ef-4da4-9c47-772f322a9e34",
        "error": "Random string must be 30 characters long",
        "class": "App\\Jobs\\SimpleJob",
        "args": {
            "random_string": "raRxknTCanbtawxzIcO2uxGrLrBmoE_fails"
        }
    },
    "socket": null
}  
[...] local.INFO: SimpleJob [jLG5mMBbK494gf0ut2toX43MGGLwC8] RAN  
[...] local.INFO: Broadcasting [batch.progressed] on channels [private-batch] with payload:
{
    "batch_id": "9c638a95-495f-4b52-b972-401666d2dbe3",
    "data": {
        "progress": 20,
        "nb_jobs_processed": 1,
        "nb_jobs_failed": 1,
        "nb_jobs_total": 5
    },
    "socket": null
}  
[...] local.INFO: SimpleJob [ptYhtjDwMfN1SMZicGxEWibXNnV7PO] RAN  
[...] local.INFO: Broadcasting [batch.progressed] on channels [private-batch] with payload:
{
    "batch_id": "9c638a95-495f-4b52-b972-401666d2dbe3",
    "data": {
        "progress": 40,
        "nb_jobs_processed": 2,
        "nb_jobs_failed": 1,
        "nb_jobs_total": 5
    },
    "socket": null
}  
[...] local.INFO: SimpleJob [uj0mek4oTjAZK0BWCh4K9IVDZxX1Wb] RAN  
[...] local.INFO: Broadcasting [batch.progressed] on channels [private-batch] with payload:
{
    "batch_id": "9c638a95-495f-4b52-b972-401666d2dbe3",
    "data": {
        "progress": 60,
        "nb_jobs_processed": 3,
        "nb_jobs_failed": 1,
        "nb_jobs_total": 5
    },
    "socket": null
}  
[...] local.INFO: SimpleJob [yei2VrBCxWApNGuaSfW7sJO2OeR790] RAN  
[...] local.INFO: Broadcasting [batch.progressed] on channels [private-batch] with payload:
{
    "batch_id": "9c638a95-495f-4b52-b972-401666d2dbe3",
    "data": {
        "progress": 80,
        "nb_jobs_processed": 4,
        "nb_jobs_failed": 1,
        "nb_jobs_total": 5
    },
    "socket": null
}  
[...] local.INFO: Broadcasting [batch.progressed] on channels [private-batch] with payload:
{
    "batch_id": "9c638a95-495f-4b52-b972-401666d2dbe3",
    "data": {
        "progress": 100,
        "nb_jobs_processed": 5,
        "nb_jobs_failed": 1,
        "nb_jobs_total": 5
    },
    "socket": null
}  
[...] local.INFO: Broadcasting [batch.ended] on channels [private-batch] with payload:
{
    "batch_id": "9c638a95-495f-4b52-b972-401666d2dbe3",
    "data": [],
    "socket": null
}

And... that's it!

Conclusion

Let's review our feature list:

  • [x] Batch should be able to handle up to 500 jobs

  • [x] Multiple batches should be able to run at the same time, without waiting for the previous one to finish

  • [x] Progression should be multiples of 10 from 0 until 100 (0, 10, 20, etc.)

  • [x] Progression should reflect the number of the ran jobs (succeeded or failed)

  • [x] Application should have a way to track progression without asking for it

  • [x] Application should have a way to know a batch has started without asking for it

  • [x] Application should have a way to know a batch has ended without asking for it

  • [x] Application should have a clear description of any failed job (class name + arguments + failed job id at minimum)

  • [x] Failed jobs could be retried

  • [x] Job failures should not cause a batch to be cancelled

  • [x] Batch should be cancellable

  • [x] Pending jobs of a cancelled batch should not run

What did we learn?
It is essential to have a feature list before writing code: some choices may close opportunities we want to keep, and some difficulties may be avoided because we don't need a specific feature. Laravel batches are flexible enough to let us build a custom system on top of them, and the choices the framework made can be overridden without dirty hacks or direct database updates, which leaves the custom code ready to face future Laravel updates.

Great! We designed our custom system and it works. But the abstraction is still lacking, so we need to refine our code. This is what we will do in the last article of this series.