agent.ts
import { agent, gpu } from "@zectre/ai"

const app = agent({
  model: "llama-3.3-70b",
  compute: gpu("h100"),
  scaling: "auto" // serverless or dedicated
})

await app.deploy() // live in 4.1s

01

ai inference

deploy models to gpu-backed endpoints. auto-scaling from zero to thousands of concurrent requests. built-in support for vllm, tgi, and custom runtimes.
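as a sketch only, an endpoint like the one above might be described with a config shape like this — plain typescript, field names illustrative rather than the actual sdk:

```typescript
// illustrative endpoint config: runtime choice plus scale-from-zero bounds
type Runtime = "vllm" | "tgi" | "custom";

interface EndpointConfig {
  model: string;
  runtime: Runtime;
  minReplicas: number; // 0 enables scaling down to zero when idle
  maxReplicas: number; // upper bound for auto-scaling under load
}

const endpoint: EndpointConfig = {
  model: "llama-3.3-70b",
  runtime: "vllm",
  minReplicas: 0,
  maxReplicas: 1000,
};
```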

02

adaptive compute

run serverless for bursty workloads or reserve dedicated instances for sustained throughput. switch modes per-service, not per-account.
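one way to picture the per-service choice — a local typescript sketch, not the real api; the 4x burst threshold is an invented heuristic:

```typescript
type ComputeMode = "serverless" | "dedicated";

// bursty traffic (peak far above average) favors serverless;
// sustained throughput favors a dedicated instance
function chooseMode(avgRps: number, peakRps: number): ComputeMode {
  return peakRps > avgRps * 4 ? "serverless" : "dedicated";
}

chooseMode(5, 200);  // bursty chat traffic  → "serverless"
chooseMode(80, 120); // steady batch load    → "dedicated"
```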

03

microservices

deploy independent services with isolated runtimes. automatic discovery, load balancing, and zero-downtime rollouts baked in.
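the load-balancing half of that card reduces to a rotation over discovered instances; a self-contained round-robin sketch, no sdk involved:

```typescript
// rotate requests across the instances a discovery service returned
class RoundRobin {
  private i = 0;
  constructor(private readonly instances: string[]) {}

  next(): string {
    const target = this.instances[this.i % this.instances.length];
    this.i += 1;
    return target;
  }
}

const lb = new RoundRobin(["10.0.0.1", "10.0.0.2"]);
lb.next(); // "10.0.0.1"
lb.next(); // "10.0.0.2"
lb.next(); // "10.0.0.1"
```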

04

data layer

managed postgres, redis, and object storage. serverless or provisioned. branching for dev, replicas for prod.
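branching can be pictured as lineage between named database branches — a local model for illustration, not the managed service's api:

```typescript
interface Branch {
  name: string;
  parent: string | null;
}

// create a dev branch off an existing branch, recording its parent
function branchFrom(parent: Branch, name: string): Branch {
  return { name, parent: parent.name };
}

const main: Branch = { name: "main", parent: null };
const dev = branchFrom(main, "dev/new-schema"); // { name: "dev/new-schema", parent: "main" }
```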

05

agent orchestration

chain models, tools, and data sources into autonomous agents. built-in memory, tool calling, and human-in-the-loop controls.
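the chaining idea boils down to a loop over tools with shared memory; a self-contained sketch with toy tools invented for illustration:

```typescript
type Tool = (input: string) => string;

// two toy tools standing in for models and data sources
const tools: Record<string, Tool> = {
  add: (s) => String(s.split("+").map(Number).reduce((a, b) => a + b, 0)),
  upper: (s) => s.toUpperCase(),
};

interface Step { tool: string; input: string }

// run each step in order, keeping results as the agent's memory;
// a human-in-the-loop hook could inspect memory between steps
function runAgent(steps: Step[]): string[] {
  const memory: string[] = [];
  for (const { tool, input } of steps) {
    memory.push(tools[tool](input));
  }
  return memory;
}

runAgent([
  { tool: "add", input: "2+3" },
  { tool: "upper", input: "ok" },
]); // ["5", "OK"]
```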

06

observability

traces, metrics, and logs across every service and model call. cost attribution per request. no third-party sdk required.
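per-request attribution, sketched locally: roll trace spans up by service, then a price table would turn token totals into cost. all names here are illustrative:

```typescript
interface Span { service: string; tokens: number }

// roll model-call spans up into tokens consumed per service;
// multiplying by a price table would give cost per request
function tokensByService(spans: Span[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const s of spans) {
    totals[s.service] = (totals[s.service] ?? 0) + s.tokens;
  }
  return totals;
}

tokensByService([
  { service: "chat", tokens: 1500 },
  { service: "chat", tokens: 500 },
  { service: "embed", tokens: 1000 },
]); // { chat: 2000, embed: 1000 }
```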

[stats: uptime sla · model cold start · gpu regions · inferences/day]

$ npx create-zectre-app