pricing
pay for what you use
start free with gpu access. scale with serverless, dedicated, or both.
free
$0 forever
experiment with models and ship side projects.
- 50 gpu-hours / month
- 500K serverless invocations
- 5 GB storage
- community support
- 3 services max
- shared inference endpoints
pro
$79/mo
for teams shipping ai products to production.
- 500 gpu-hours / month
- unlimited invocations
- 100 GB storage
- priority support
- unlimited services
- dedicated or serverless compute
- custom domains
- team seats included
faq
frequently asked
what is a gpu-hour?
one hour of compute on a single gpu. if you run inference on an h100 for 30 minutes, that is 0.5 gpu-hours.
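as a rough sketch of the arithmetic, the snippet below converts runtime into gpu-hours and subtracts it from the free plan's 50 gpu-hour monthly allowance. the function name `gpu_hours` is made up for illustration, and multiplying by the number of gpus is an assumption drawn from the "single gpu" wording, not a documented billing rule.

```python
def gpu_hours(minutes: float, gpus: int = 1) -> float:
    """Return GPU-hours used: wall-clock hours times the number of GPUs.

    Assumption: usage scales linearly with GPU count. The page only
    defines the single-GPU case.
    """
    return gpus * minutes / 60


# Monthly allowance listed for the free plan.
FREE_TIER_GPU_HOURS = 50

# The example from the answer above: 30 minutes on one h100.
used = gpu_hours(minutes=30)
remaining = FREE_TIER_GPU_HOURS - used

print(f"used: {used} gpu-hours, remaining: {remaining} gpu-hours")
```

under the same assumption, 90 minutes on two gpus would count as 3 gpu-hours.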
can i mix serverless and dedicated?
yes. you can set the compute mode per service. for example, run your api serverless and your model inference on dedicated gpus in the same project.
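one way to picture a mixed project is as a list of services, each carrying its own compute mode. the sketch below is purely illustrative: `Service`, `ComputeMode`, and the service names are hypothetical and do not reflect the platform's actual configuration format or api.

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical type: the two compute modes named on this page.
ComputeMode = Literal["serverless", "dedicated"]


@dataclass
class Service:
    """Illustrative model of one service in a project (not a real API)."""
    name: str
    compute: ComputeMode


# One project mixing both modes, as described in the answer above.
project = [
    Service(name="api", compute="serverless"),
    Service(name="model-inference", compute="dedicated"),
]

# Map each service to the mode it runs in.
modes = {service.name: service.compute for service in project}
print(modes)
```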
do you offer startup credits?
qualifying startups can receive up to $100K in credits. contact us for details.
what happens if i exceed my free tier?
we will notify you before any overage. you can upgrade to pro or set hard spending limits. we never charge without your consent.