How it works

AI powers your workflows; Keld optimizes them.

Enterprises have work to run in their apps and agents. AI Model Providers have spare capacity to run it. Keld is the marketplace in the middle: send a job with a model or use case, a max price and a max time, and Keld finds the cheapest provider and runs it within your bounds.

Enterprises send a Deadline job

The unit of work isn't a prompt — it's a job. Tell Keld the model or use case, your ceiling price and how long you can wait. That's it.
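
As a rough illustration, a Deadline job can be pictured as a small record holding those three things. This is a minimal sketch, not Keld's actual schema: the `DeadlineJob` class and its field names are hypothetical, and the example values are borrowed from the SDK snippet further down this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeadlineJob:
    """Hypothetical shape of a Deadline job; field names are illustrative only."""
    ceiling_usd: float               # the most you will pay for this job
    deadline_ms: int                 # the longest you can wait for the result
    model: Optional[str] = None      # a specific model, or
    use_case: Optional[str] = None   # a use case to be matched to a model

    def __post_init__(self) -> None:
        # A job needs a model or a use case, plus positive bounds.
        if self.model is None and self.use_case is None:
            raise ValueError("specify a model or a use case")
        if self.ceiling_usd <= 0:
            raise ValueError("ceiling_usd must be positive")
        if self.deadline_ms <= 0:
            raise ValueError("deadline_ms must be positive")


job = DeadlineJob(ceiling_usd=0.012, deadline_ms=8000, use_case="summarise")
```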

Keld picks the best-fit model at the best price

The marketplace finds the cheapest provider of the model you need and runs the job within your time and price bounds, batching where it helps — all transparently, neutral to both sides.
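
The selection rule described here can be sketched as "filter by the bounds, then take the lowest price". The code below is an assumption-laden toy, not Keld's matching engine: the `Offer` type, the `cheapest_within_bounds` function, the provider names and all prices are made up for illustration, and batching is left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Offer:
    """Hypothetical provider offer for one model."""
    provider: str
    model: str
    price_usd: float      # price the provider asks for the job
    est_latency_ms: int   # provider's estimated completion time


def cheapest_within_bounds(
    offers: Sequence[Offer], model: str, ceiling_usd: float, deadline_ms: int
) -> Optional[Offer]:
    """Return the lowest-priced offer that meets both bounds, or None."""
    eligible = [
        o for o in offers
        if o.model == model
        and o.price_usd <= ceiling_usd
        and o.est_latency_ms <= deadline_ms
    ]
    # Ties on price go to the faster offer.
    return min(eligible, key=lambda o: (o.price_usd, o.est_latency_ms), default=None)


offers = [
    Offer("provider-a", "gpt-4o", 0.010, 9000),  # cheap but misses the deadline
    Offer("provider-b", "gpt-4o", 0.011, 6000),  # fits both bounds
    Offer("provider-c", "gpt-4o", 0.015, 2000),  # fast but over the ceiling
]
match = cheapest_within_bounds(offers, "gpt-4o", ceiling_usd=0.012, deadline_ms=8000)
```

When no offer fits both bounds the function returns `None`; in this sketch that simply means the job is not matched.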

AI Model Providers serve it from spare capacity

Providers offer the capacity they have to spare and get matched to live demand — turning idle compute into revenue at a price they set.

Get started

Two ways to route a workflow to Keld

However you build apps and agents, you keep your stack. Map and route from the Atlas console with no code, or initiate a Deadline job from the SDK you already use.

Without code · Atlas console

Map, then route — point and click

  1. Map your workflows. Atlas scans your AI usage and shows what you spend, by team, model, project and use case.
  2. Select a workflow to route. Pick the workload you want to send to the Keld marketplace.
  3. Decide how much, and set the bounds. Choose the share to divert to Keld and set your price ceiling, deadline and quality floor.

Example: Customer support workflow · Divert to Keld: 65% · Quality floor: 99.5% · Projected saving: $1,640/mo at current blend

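
To get a feel for how the diverted share drives the saving, here is a back-of-the-envelope sketch. It assumes a simple linear model (spend on the diverted share times a fractional price reduction); that formula, the function name `projected_monthly_saving` and every input value are hypothetical, and this is not how Atlas computes its projection.

```python
def projected_monthly_saving(
    monthly_spend_usd: float, divert_share: float, keld_discount: float
) -> float:
    """Saving = spend on the diverted share times the fractional price reduction."""
    if not 0.0 <= divert_share <= 1.0:
        raise ValueError("divert_share must be between 0 and 1")
    if not 0.0 <= keld_discount <= 1.0:
        raise ValueError("keld_discount must be between 0 and 1")
    return monthly_spend_usd * divert_share * keld_discount


# Hypothetical inputs: a $10,000/mo workflow, 65% diverted,
# and diverted traffic costing 30% less than it does today.
saving = projected_monthly_saving(10_000, 0.65, 0.30)
print(f"${saving:,.0f}/mo")
```

Under this model, moving the share up or down scales the saving in proportion, which is the intuition the slider conveys.
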
With code · SDK & plugins

Initiate a Deadline job from code you know

  1. Import the Keld SDK or plugin for your platform — OpenAI SDK, LangChain, LiteLLM and more.
  2. Add a few parameters to the OpenAI-style call you already write, and it runs as a Deadline job on Keld.
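
# Swap the import and add the Keld parameters; the rest of the call is unchanged
from keld.openai import openai

client = openai.OpenAI()  # reads KELD_API_KEY

prompt = "Summarise this support ticket in two sentences."  # example input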

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": prompt}
    ],
    extra_body={
        "keld_ceiling_usd": 0.012,
        "keld_deadline_ms": 8000,
        "keld_use_case":    "summarise",
    }
)
# Atlas maps the call. It runs as a Deadline job if the price fits your ceiling.

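
If several call sites share the same bounds, the three Keld parameters can be built in one place. The helper below is a convenience sketch: `keld_extra_body` is a hypothetical function, not part of the Keld SDK, and only the three key names are taken from the snippet above.

```python
from typing import Any, Dict


def keld_extra_body(ceiling_usd: float, deadline_ms: int, use_case: str) -> Dict[str, Any]:
    """Build the extra_body dict with the parameter names used in the snippet above."""
    if ceiling_usd <= 0:
        raise ValueError("ceiling_usd must be positive")
    if deadline_ms <= 0:
        raise ValueError("deadline_ms must be positive")
    if not use_case:
        raise ValueError("use_case must be a non-empty string")
    return {
        "keld_ceiling_usd": ceiling_usd,
        "keld_deadline_ms": deadline_ms,
        "keld_use_case": use_case,
    }


summarise_bounds = keld_extra_body(0.012, 8000, "summarise")
# Pass it to the call shown above as extra_body=summarise_bounds.
```
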
The simple idea

Pay for the right inference at the right time

Most AI work doesn't need a premium model running in real time. Tell Keld what an answer is worth to you and when you need it by, and stop overpaying for speed and brand you don't need. Keld runs every job within those bounds, batching where it helps to drive the price down.

The bigger idea

Atlas maps every workflow, then optimizes it

Atlas scans and maps every AI workflow in your stack — so you see spend by team, model, project and use case — then optimizes the workloads that don't need a premium model or real-time SLA, running each one against the best model at the best price within your deadline and ceiling.

For Enterprises — the marketplace

You never touch any market mechanics. You send a Deadline job and Keld does the rest: it finds the cheapest provider of the model you need, runs it within your bounds, and settles at or below your ceiling. Atlas maps and optimizes your workflows; Integrations drop the same power into your existing stack.

For Enterprises →

For AI Model Providers — the trading platform

Providers see the same mechanism as a trading platform: place, manage and cancel orders to serve live demand from spare capacity, with micro-batching in front of their fleet. The hub favours neither side: it matches on price, deadline and performance.
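
The place, manage and cancel cycle can be pictured with a toy order book. This is a sketch under stated assumptions, not the Keld Trade API: the `Order` and `OrderBook` names, their methods and the sample prices are all hypothetical, and it ranks on price only, leaving out deadline, performance and micro-batching.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Order:
    """Hypothetical sell order: spare capacity offered at the provider's own price."""
    order_id: int
    model: str
    price_usd: float   # asking price per job, set by the provider
    capacity: int      # number of jobs the provider can still take


class OrderBook:
    def __init__(self) -> None:
        self._orders: Dict[int, Order] = {}
        self._ids = itertools.count(1)

    def place(self, model: str, price_usd: float, capacity: int) -> int:
        """Add a new order and return its id."""
        order_id = next(self._ids)
        self._orders[order_id] = Order(order_id, model, price_usd, capacity)
        return order_id

    def reprice(self, order_id: int, price_usd: float) -> None:
        """Manage a live order by changing its asking price."""
        self._orders[order_id].price_usd = price_usd

    def cancel(self, order_id: int) -> bool:
        """Remove an order; returns False if it was not live."""
        return self._orders.pop(order_id, None) is not None

    def best(self, model: str) -> Optional[Order]:
        """Lowest-priced live order for a model, or None."""
        live = [o for o in self._orders.values() if o.model == model and o.capacity > 0]
        return min(live, key=lambda o: o.price_usd, default=None)


book = OrderBook()
first = book.place("gpt-4o", 0.011, capacity=100)
second = book.place("gpt-4o", 0.009, capacity=50)
book.cancel(second)  # the provider withdraws the cheaper order
```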

For AI Model Providers →   Keld Trade →

Explore

Start exploring the Keld ecosystem of solutions.

See it on your own spend first

Start free by mapping your workflows with Atlas, then send Deadline jobs for the workloads that can wait.