
Databricks All-Purpose vs Jobs Compute: When to Use Which

By LakeSentry Team · Reviewed by Kristina Kazmina, Senior Software Engineer, LakeSentry

All-purpose compute costs roughly 2–3x more per DBU than jobs compute for the same workload. Here's when each fits, and how to see which one your jobs are running on.

All-purpose compute in Databricks is built for interactive, multi-user work — notebooks, ad-hoc analysis, shared exploration. Jobs compute is built for scheduled, automated runs of known code. All-purpose is priced at roughly 2–3x the DBU rate of jobs compute (Databricks pricing), so when scheduled work ends up on all-purpose by accident, the bill grows quietly — though, as we’ll get to, all-purpose for jobs is the right call in some cases.

What all-purpose and jobs compute are

The two compute types look almost identical in the Databricks UI, which is part of why they get mixed up. They’re not interchangeable.

All-purpose compute is a long-lived cluster you start manually (or via auto-start), attach a notebook to, and share with other users. It supports concurrent users, gives them access to cluster-installed packages, and stays running until you stop it or auto-termination kicks in. It’s the default when you create a notebook and click “Run.” It can also be used for scheduled workloads.

Jobs compute is a cluster Databricks creates for a specific job run and shuts down once the run completes. The cluster is dedicated to that one job and stops billing the moment the job ends — it isn’t shared with other users, and its state doesn’t persist between runs.

The Spark engine and runtime version are the same as on an all-purpose cluster, and the code that runs is identical. What changes is the cluster’s lifecycle and the per-DBU rate Databricks charges.

Why the cost differs

Databricks bills by DBU consumption × per-DBU rate. The rate is set by SKU, and the SKU is determined by the workload type.

All-purpose has the highest per-DBU rate because it’s optimized for interactivity — concurrent users, notebook attach/detach, long-running sessions, hot data caches. Jobs compute is the cheapest classic SKU because it does one thing and exits.

The exact ratio depends on plan tier and cloud provider, but the 2–3x figure holds across most setups. On top of the DBU charge, the cloud provider bills the underlying VMs separately — that part is roughly the same for both compute types.
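As a rough illustration of the billing formula, here is a minimal Python sketch. The two rates are hypothetical placeholders chosen only so that all-purpose sits at 2.5x jobs compute; they are not Databricks list prices. The function name `dbu_charge` is ours, and the sketch assumes the same workload consumes the same number of DBUs on either compute type. Cloud VM charges are left out because they are billed separately.

```python
def dbu_charge(dbus_consumed: float, rate_per_dbu: float) -> float:
    """Databricks-side charge: DBUs consumed multiplied by the SKU's per-DBU rate."""
    return dbus_consumed * rate_per_dbu


# Hypothetical per-DBU rates in dollars, NOT real list prices.
# They are picked only so that all-purpose is 2.5x jobs compute.
JOBS_RATE = 0.20
ALL_PURPOSE_RATE = 0.50

# Assumption: the workload consumes the same DBUs on either compute type.
workload_dbus = 100.0

jobs_charge = dbu_charge(workload_dbus, JOBS_RATE)
all_purpose_charge = dbu_charge(workload_dbus, ALL_PURPOSE_RATE)
rate_ratio = all_purpose_charge / jobs_charge

print(f"jobs compute: ${jobs_charge:.2f}")
print(f"all-purpose:  ${all_purpose_charge:.2f}")
print(f"ratio:        {rate_ratio:.1f}x")
```

Because the DBU count is held fixed, the ratio of the two charges is simply the ratio of the two rates. Swap in the rates from your own plan tier and cloud to get real figures.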

[Figure: Same Spark code, different SKU. Relative DBU rate on classic compute, All-Purpose vs. Jobs Compute.]

When each compute type is the right fit

The decision is usually about who is running the work, what level of workload isolation it needs, and whether the same code will run again on a schedule or trigger.

All-purpose: interactive and exploratory

Use all-purpose when:

  • A human is at the keyboard, attaching a notebook, iterating on queries
  • Multiple users need to share the same environment — library state, environment variables, services started via init script
  • BI tools refresh on demand and need a warm cluster

The premium DBU rate buys interactivity. If nobody is iterating, you’re paying for capability you don’t use.

Jobs compute: scheduled and repeatable

Use jobs compute when:

  • The work runs on a schedule, a trigger, or a workflow step
  • Each run is independent — no shared state between runs
  • The cluster doesn’t need to outlive the job
  • Cluster state should stay workload-specific (e.g. library state isolation)

The savings come from two places: the lower per-DBU rate, and the fact that a job cluster can’t sit idle between runs, because it exists for one run and then terminates. The exception is some streaming workloads, which include idle gaps by design.
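The two sources of savings can be put side by side in a small Python sketch. Every number here is an assumption made up for illustration: a daily one-hour job, one idle hour before the all-purpose cluster auto-terminates, a cluster that burns 10 DBUs per hour, and placeholder rates in a 2.5x ratio. None of them are Databricks prices or measurements.

```python
def monthly_dbu_charge(run_hours: float, idle_hours: float,
                       dbus_per_hour: float, rate_per_dbu: float) -> float:
    """DBU charge for a month: the cluster bills for every hour it is up,
    whether it is running the job or sitting idle."""
    billed_hours = run_hours + idle_hours
    return billed_hours * dbus_per_hour * rate_per_dbu


# All values below are hypothetical, for illustration only.
RUNS_PER_MONTH = 30
RUN_HOURS = 1.0             # each run takes one hour
IDLE_HOURS_AFTER_RUN = 1.0  # all-purpose cluster lingers until auto-termination
DBUS_PER_HOUR = 10.0
JOBS_RATE = 0.20            # placeholder, not a list price
ALL_PURPOSE_RATE = 0.50     # placeholder, 2.5x the jobs rate

# Scheduled job pointed at an all-purpose cluster: pays for run time plus idle time.
on_all_purpose = monthly_dbu_charge(
    run_hours=RUNS_PER_MONTH * RUN_HOURS,
    idle_hours=RUNS_PER_MONTH * IDLE_HOURS_AFTER_RUN,
    dbus_per_hour=DBUS_PER_HOUR,
    rate_per_dbu=ALL_PURPOSE_RATE,
)

# Same job on jobs compute: the cluster terminates with the run, so no idle hours.
on_jobs_compute = monthly_dbu_charge(
    run_hours=RUNS_PER_MONTH * RUN_HOURS,
    idle_hours=0.0,
    dbus_per_hour=DBUS_PER_HOUR,
    rate_per_dbu=JOBS_RATE,
)

print(f"all-purpose:  ${on_all_purpose:.2f}")
print(f"jobs compute: ${on_jobs_compute:.2f}")
print(f"multiple:     {on_all_purpose / on_jobs_compute:.1f}x")
```

Under these assumed numbers the rate difference and the idle hour compound, so the gap is wider than the rate ratio alone. With aggressive auto-termination the idle term shrinks and the gap moves back toward the rate ratio.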

Side-by-side at a glance

| | All-Purpose Compute | Jobs Compute |
|---|---|---|
| Built for | Interactive notebooks, ad-hoc code execution, BI refresh | Scheduled jobs, workflows, automated pipelines |
| Lifecycle | Long-lived; manual start, manual termination or auto-terminate after idle | Created per job run, destroyed on completion |
| Concurrency | Multiple users and workloads share one cluster | Single job, isolated |
| Per-DBU rate | Highest classic SKU | Lowest classic SKU (~2–3x cheaper) |
| Idle cost | Real: pays while no one is using it | None: cluster terminates with the job |
| Right call when | A human is iterating, or a large group of workloads runs at the same time and doesn’t require isolation | A schedule or trigger is running known code and workload isolation is needed |

How wrong-fit workloads show up

Most environments don’t have a single “we use all-purpose for everything” problem. They have a long tail of automated work that ended up on all-purpose because that’s where it was tested.

A few of the usual suspects:

  • Automated jobs pointed at an existing all-purpose cluster. Databricks lets you point a scheduled, API-triggered, or workflow-driven job at an existing all-purpose cluster. Convenient during development, expensive in production — the cluster keeps running between runs unless you’ve configured aggressive auto-termination.
  • Shared “team clusters” hosting both interactive work and scheduled refreshes. A long-running all-purpose cluster set up for the data team’s notebooks ends up running scheduled jobs too — it’s already up, and shared-cluster permissions are simpler than per-job access management.
  • Long auto-termination windows. All-purpose clusters with autotermination_minutes set to 120, 240, or “never” are usually a relic of someone debugging once and not changing it back. The idle hours add up.
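The first and third patterns can be checked mechanically against exported job and cluster definitions. The sketch below is a minimal Python example over plain dictionaries. It assumes the field names used by the Databricks Jobs and Clusters APIs as we understand them (`tasks`, `task_key`, `existing_cluster_id`, `job_cluster_key`, `new_cluster`, `autotermination_minutes`), and it treats a missing or zero `autotermination_minutes` as "never". The helper names, the 60-minute threshold, and the sample data are hypothetical.

```python
from typing import Any


def tasks_on_existing_clusters(job_settings: dict[str, Any]) -> list[str]:
    """Return the task keys of a job that point at an existing cluster
    instead of a per-run job cluster."""
    return [
        task.get("task_key", "<unnamed>")
        for task in job_settings.get("tasks", [])
        if "existing_cluster_id" in task
    ]


def has_long_autotermination(cluster_spec: dict[str, Any],
                             max_minutes: int = 60) -> bool:
    """Flag a cluster whose idle window exceeds max_minutes.
    Assumption: a missing or zero value means auto-termination is off."""
    minutes = cluster_spec.get("autotermination_minutes", 0)
    return minutes == 0 or minutes > max_minutes


# Hypothetical job definition with one task of each kind.
sample_job = {
    "name": "nightly_refresh",
    "tasks": [
        {"task_key": "ingest", "existing_cluster_id": "hypothetical-cluster-id"},
        {"task_key": "transform", "job_cluster_key": "etl_cluster"},
        {"task_key": "publish", "new_cluster": {"num_workers": 2}},
    ],
}

flagged_tasks = tasks_on_existing_clusters(sample_job)
print(flagged_tasks)
print(has_long_autotermination({"autotermination_minutes": 240}))
print(has_long_autotermination({"autotermination_minutes": 30}))
```

A flagged task is a prompt to look closer, not a verdict: as noted elsewhere in this article, running jobs on all-purpose compute is sometimes a deliberate choice.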

Across the Databricks environments we review at LakeSentry, the typical production pattern is most scheduled jobs running on jobs compute. We still see teams using all-purpose compute in jobs, though, usually because permissions were easier on a shared cluster or serverless wasn’t an option when the workload was first set up. Databricks’ own guidance is that all-purpose isn’t recommended for production jobs, so setups chosen for either reason are worth revisiting.

The reason these are hard to catch is that none of them look wrong in the top-level UI. The cost lens that shows compute type next to actual usage pattern is the one most native tools don’t surface — see Native Databricks cost tools vs. the cross-workspace view for the longer take on that gap.

SQL warehouses, serverless, and other compute

Two clarifications, because these don’t fit the all-purpose / jobs binary.

SQL warehouses aren’t clusters. They’re a separate compute type with their own SKU, purpose-built for SQL analytics through BI tools and the SQL editor. Don’t compare them directly to all-purpose; ask instead whether you would otherwise run this BI query on an interactive cluster. If yes, the warehouse usually wins on both cost and performance.

Serverless compute is a third path Databricks now offers — it removes cluster management entirely but uses a different DBU rate and cost model, so it isn’t a direct comparison with classic compute. Rule of thumb: classic jobs compute is still cheapest at high, steady utilization; serverless wins for bursty workloads or when not managing clusters is the bigger value.

FAQ

Can I run a notebook on a job cluster?

Yes — that’s what a scheduled notebook job does when you point it at a new job cluster instead of an existing all-purpose one. The notebook runs end-to-end on the job cluster, which terminates when the run completes.

What about SQL warehouses — do they replace either of these?

For BI and SQL analytics workloads, yes — SQL warehouses are the right compute, not all-purpose. They’re not a substitute for jobs compute running pipelines or for notebooks running Python.

Is serverless cheaper than classic compute?

Not by default. Serverless compute uses a different DBU rate and cost model than classic compute. For steady, high-utilization scheduled work, classic jobs compute is usually cheaper. Serverless wins when workloads are bursty (you pay per workload, not per cluster hour) or when avoiding cluster management itself has value.

How do I know if my team is using the wrong compute type?

The signal is per-cluster utilization: an all-purpose cluster that runs scheduled work and has long idle gaps, or a cluster with auto-termination set to multiple hours. The native console shows current state but not the pattern over time. Cross-workspace cost attribution that tags each workload with its compute type and usage pattern is what makes wrong-fit jobs visible.

In LakeSentry, this lives in Cost Explorer’s Compute Types view (DBU spend split by Jobs / All-Purpose / SQL across all connected workspaces), with long auto-termination windows and idle clusters surfaced automatically in Insights & Actions.
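For a rough do-it-yourself version of the utilization signal, the sketch below flags all-purpose clusters that run scheduled work but spend most of their billed time idle. The record shape, the field names, the 50% threshold, and the sample values are all hypothetical. How you obtain billed and busy hours per cluster depends on your own telemetry and is not shown here.

```python
from dataclasses import dataclass


@dataclass
class ClusterUsage:
    """One cluster's usage over a review period (hypothetical record shape)."""
    cluster_id: str
    compute_type: str          # "all_purpose" or "jobs" (labels are ours)
    billed_hours: float        # hours the cluster was up and billing
    busy_hours: float          # hours in which a command or job run was executing
    runs_scheduled_work: bool  # True if any scheduled job targeted this cluster


def utilization(usage: ClusterUsage) -> float:
    """Share of billed time spent doing work; 0.0 if the cluster never billed."""
    if usage.billed_hours <= 0:
        return 0.0
    return usage.busy_hours / usage.billed_hours


def wrong_fit_candidates(usages: list[ClusterUsage],
                         max_utilization: float = 0.5) -> list[str]:
    """All-purpose clusters that run scheduled work yet sit mostly idle."""
    return [
        u.cluster_id
        for u in usages
        if u.compute_type == "all_purpose"
        and u.runs_scheduled_work
        and utilization(u) < max_utilization
    ]


# Hypothetical sample data.
sample_usage = [
    ClusterUsage("team-shared", "all_purpose", 200.0, 40.0, True),   # 20% busy
    ClusterUsage("analyst-adhoc", "all_purpose", 50.0, 10.0, False), # interactive only
    ClusterUsage("etl-nightly", "jobs", 30.0, 30.0, True),           # per-run cluster
]

candidates = wrong_fit_candidates(sample_usage)
print(candidates)
```

The threshold is a judgment call. A lightly used interactive cluster is not flagged here because the check only targets clusters carrying scheduled work.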
