Serverless GPU inference for ML models

Pay-per-millisecond API to run ML in production.

import requests

response = requests.post(
    url="https://api.pipeline.ai/v2/runs",
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},
    json={
        "pipeline_id": "pipeline_67d9d8ec36d54c148c70df1f404b0369",
        "data": [
            ["Mountain winds, and babbling springs, and moonlight seas"],
            {
                "seed": 1,
                "num_inference_steps": 50,
                "guidance_scale": 7.5,
                "width": 512,
                "height": 512,
                "eta": 0.0,
                "num_samples": 3,
            },
        ],
    },
)
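If you call the API with varying prompts and settings, the request body above can be assembled with a small helper. This is an illustrative sketch only: `build_run_payload` and its defaults are assumptions for this example, not part of any official Pipeline SDK.

```python
def build_run_payload(pipeline_id, prompt, **params):
    """Assemble the JSON body for a POST to /v2/runs.

    Defaults mirror the example request above; override any of
    them via keyword arguments.
    """
    defaults = {
        "seed": 1,
        "num_inference_steps": 50,
        "guidance_scale": 7.5,
        "width": 512,
        "height": 512,
        "eta": 0.0,
        "num_samples": 3,
    }
    defaults.update(params)
    return {"pipeline_id": pipeline_id, "data": [[prompt], defaults]}

# Build the same payload as the example, but with a single sample.
payload = build_run_payload(
    "pipeline_67d9d8ec36d54c148c70df1f404b0369",
    "Mountain winds, and babbling springs, and moonlight seas",
    num_samples=1,
)
```

The payload can then be passed directly as the `json=` argument to `requests.post`.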
Cheaper than AWS or GCP: reduced GPU usage with serverless.

Up-to-date enterprise hardware: NVIDIA Ampere and Volta GPUs.

Save engineering time: we handle the cloud infrastructure as you scale.

Unlimited requests: no changes required as your product grows.

Reduced cold start: low latency and reliable response times.

Rapid support: personal specialist help.
Custom models

Deploy your own ML models on Pipeline

Upload your model and instantly get an inference API endpoint.

Access pre-trained models

State-of-the-art AI models, one API call away.

Explore our list of pre-trained AI models available as an API.

Select one of our top models above or click below to view them all in our dashboard.

From our blog

Read more about what we are building and how people are using Pipeline.

Start your AI journey today

Join over 2,500 customers already using Pipeline.