Changelog for Oban Pro v1.4

🚅 Streamlined Workflows

Workflows now use automatic scheduling to run workflow jobs in order, without any polling or snoozing. Jobs with upstream dependencies are automatically marked as on_hold and scheduled far into the future. After the upstream dependency executes they're made available to run, with consideration for retries, cancellations, and discards.

  • An optimized, debounced query system uses up to 10x fewer queries to execute slower workflows.

  • Queue concurrency limits don't impact workflow execution, and even complex workflows can quickly run in parallel with global_limit: 1 and zero snoozing.

  • Cancelled, deleted, and discarded dependencies are still handled according to their respective ignore_* policies.

All of the previous workflow "tuning" options like waiting_limit and waiting_snooze are gone, as they're not needed to optimize workflow execution. Finally, older "in flight" workflows will still run with the legacy polling mechanism to ensure backwards compatibility.

⏲️ Job Execution Deadlines

Jobs that shouldn't run after some period of time can be marked with a deadline. After the deadline has passed the job will be pre-emptively cancelled on its next run, or optionally during its next run if desired.

defmodule DeadlinedWorker do
  use Oban.Pro.Worker, deadline: {1, :hour}

  @impl true
  def process(%Job{args: args}) do
    # If this doesn't run within an hour, it's cancelled
  end
end

Deadlines may be set at runtime as well:

DeadlinedWorker.new(args, deadline: {30, :minutes})

In either case, the deadline is always relative and computed at runtime. That also allows the deadline to consider scheduling—a job scheduled to run 1 hour from now with a 1 hour deadline will expire 2 hours in the future.

🧙 Automatic Crontab Syncing

Synchronizing persisted entries manually required two deploys: one to flag it with deleted: true and another to clean up the entry entirely. That extra step isn't ideal for applications that don't insert or delete jobs at runtime.

To delete entries that are no longer listed in the crontab automatically set the sync_mode option to :automatic:

[
  sync_mode: :automatic,
  crontab: [
    {"0 * * * *", MyApp.BasicJob},
    {"0 0 * * *", MyApp.OtherJob}
  ]
]

To remove unwanted entries, simply delete them from the crontab:

 crontab: [
   {"0 * * * *", MyApp.BasicJob},
-  {"0 0 * * *", MyApp.OtherJob}
 ]

With :automatic sync, the entry for MyApp.OtherJob will be deleted on the next deployment.

❤️‍🩹🌟 Upgrading to v1.4

Changes to DynamicCron require a migration to add the new insertions column. You must re-run the Oban.Pro.Migrations.DynamicCron migration when upgrading.

defmodule MyApp.Repo.Migrations.UpdateObanCron do
  use Ecto.Migration

  defdelegate change, to: Oban.Pro.Migrations.DynamicCron
end

v1.4.0 — 2024-03-21

Enhancements

  • [DynamicCron] Add :sync_mode for automatic entry management.

    Most users expect that when they delete an entry from the crontab it won't keep running after the next deploy. A new sync_mode option allows selecting between automatic and manual entry management.

    In addition, this moves cron entry management into the evaluate handler. Inserting and deleting at runtime can't leverage leadership, because in rolling deployments the new nodes are never the leader.

  • [DynamicCron] Use recorded job insertion timestamps for guaranteed cron.

    A new field on the oban_crons table records each time a cron job is inserted up to a configurable limit. That field is then used for guaranteed cron jobs, and optionally for historic inspection beyond a job's retention period.

  • [DynamicCron] Stop adding a unique option when inserting jobs, regardless of the guaranteed option.

    There's no need for uniqueness checks now that insertions are tracked for each entry. Inserting without uniqueness is significantly faster.

  • [DynamicCron] Inject cron information into scheduled job meta.

    This change mirrors the addition of cron: true and cron_expr: expression added to Oban's Cron in order to make cron jobs easier to identify and report on through tools like Sentry.

  • [Worker] Add :deadline option for auto cancelling jobs

    Jobs that shouldn't run after some period of time can be marked with a deadline. After the deadline has passed the job will be pre-emptively cancelled on its next run, or optionally during its next run if desired.

  • [Workflow] Invert workflow execution to avoid bottlenecks caused by polling and snoozing.

    Workflow jobs no longer poll or wait for upstream dependencies while executing. Instead, jobs with dependencies are "held" until they're made available by a facilitator function. This inverted flow makes fewer queries, doesn't clog queues with jobs that aren't ready, avoids snoozing, and is generally more efficient.

  • [Workflow] Expose functions and direct callback docs to function docs.

    Most workflow functions aren't true callbacks, and shouldn't be overwritten. Now all callbacks point to the public function they wrap. Exposing workflow functions makes it easier to find and link to documentation.

Bug Fixes

  • [DynamicCron] Don't consider the node rebooted until it is leader

    With rolling deploys it is frequent that a node isn't the leader the first time it evaluates. However, the :rebooted flag was set to true on the first run, which prevented reboots from being inserted when the node ultimately acquired leadership.

  • [DynamicQueues] Accept streamline partition syntax for global_limit and rate_limit options.

    DynamicQueues didn't normalize the newer partition syntax before validation. This was an oversight, and a sign that validation had drifted between the Producer and Queue schemas. Now schemas use the same changesets to ensure compatibility.

  • [Smart] Handle partitioning by :worker and :args regardless of order.

    The docs implied partitioning by worker and args was possible, but there wasn't a clause that handled any order correctly.

  • [Smart] Explicitly cast transactional advisory lock prefix to integer.

    Postgres 16.1 may throw an error that it's unable to determine the argument type while taking bulk unique locks.

  • [Smart] Preserve recorded values between retries when sync acking failed.

    Acking a recorded value for a batch or workflow is synchronous, and a crash or timeout failure would lose the recorded value on subsequent attempts. Now the value is persisted between retries to ensure values are always recorded.

  • [Smart] Revert "Force materialized CTE in smart fetch query".

    Forcing a materialized CTE in the fetch query was added for reliability, but it can cause performance regressions under heavy workloads.

  • [Testing] Use configured queues when ensuring all started.

    Starting a supervised Oban in manual mode with tests specified would fail because in :manual testing mode the queues option is overridden to be empty.