Serverless GPU inference for ML models

Pay-per-millisecond API to run ML in production.

Example inference request

DALL·E Mega

import requests

response = requests.post(
    url="https://api.pipeline.ai/v2/runs",
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},
    json={
        "pipeline_id": "pipeline_17ac3021b7674b10a6fbe3cb980ff57d",
        "data": [
            ["Mountain winds, and babbling springs, and moonlight seas"],
            {"num_images": 9},
        ],
    },
)
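
If you send many requests, it may help to wrap the call above in a small helper. The following is a minimal sketch, not an official client: the function names `build_run_payload`, `build_run_request`, and `run_pipeline` are hypothetical, the payload shape is simply copied from the example request, and the response format is not described on this page, so the decoded JSON is returned untouched. It uses Python's standard-library `urllib` rather than `requests` so that it runs without extra dependencies.

```python
import json
import urllib.request

# Endpoint taken from the example request above.
API_URL = "https://api.pipeline.ai/v2/runs"


def build_run_payload(pipeline_id, prompt, **params):
    """Build a JSON body in the same shape as the example request.

    Assumption: "data" is a two-item list holding a list with the prompt
    and a dict of model parameters, as in the example.
    """
    return {"pipeline_id": pipeline_id, "data": [[prompt], params]}


def build_run_request(api_token, pipeline_id, prompt, **params):
    """Prepare the POST request without sending it."""
    body = json.dumps(build_run_payload(pipeline_id, prompt, **params))
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_pipeline(api_token, pipeline_id, prompt, timeout=60.0, **params):
    """Send the request and return the decoded JSON response.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    The response schema is not documented here, so it is returned as-is.
    """
    request = build_run_request(api_token, pipeline_id, prompt, **params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `run_pipeline("YOUR_API_TOKEN", "pipeline_17ac3021b7674b10a6fbe3cb980ff57d", "Mountain winds, and babbling springs, and moonlight seas", num_images=9)` would send the same body as the example request. Keeping the payload builder separate from the network call makes the request shape easy to check without contacting the API.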

Cheaper than AWS or GCP

Reduced GPU usage with serverless.

Up-to-date enterprise hardware

NVIDIA Ampere and Volta GPUs.

Save engineering time

We handle the cloud infrastructure as you scale.

Unlimited requests

No changes required as your product grows.

Reduced cold start

Low latency and reliable response times.

Rapid support

Personal specialist help.

Have your own ML models?

Get a serverless inference endpoint for free.

State-of-the-art AI models, one API call away.

Explore our list of AI models available as an API.

DALL·E Mini

Generate images from a text prompt.

DALL·E Mega

Generate images from a text prompt.

GPT-J

High-quality text generations on par with GPT-3.

GPT-2 Large

High-quality text generations.

GPT-2 Medium

Medium-quality text generations.

GPT-Neo 2.7B

High-quality text generations.

GPT-Neo 1.3B

Medium-quality text generations.

GPT-Neo 125M

Fast text generations.

What will you build next?

Some examples of what you can do with Pipeline.

Text generation

If you are looking for an AI model to generate text, our NLP models are the most suitable.

GPT-J

GPT-J 16-bit

GPT-2

GPT-2 L

GPT-2 M

GPT-Neo

GPT-Neo 2.7B

GPT-Neo 1.3B

GPT-Neo 125M

Image generation

If you are looking for an AI model to generate images, our DALL·E models are the most suitable.

DALL·E

DALL·E Mega

DALL·E Mini

Start your AI journey today

Join more than 2,500 customers already using Pipeline.