# Fabric

Serverless compute for low-latency AI and robotics.
Fabric is the compute layer of CodecFlow. It pools machines from cloud providers and DePIN networks into one programmable compute mesh — accessible through a simple SDK, billed per second, and tuned for latency-sensitive AI work.
Think of it like Modal Labs, built for agents and real-time inference at the edge.
```mermaid
flowchart TD
SDK["Developer / Agent\ncalls Fabric SDK"]
SDK -->|"specify: GPU tier,\nlatency target, budget"| COORD["Fabric Orchestrator"]
COORD --> SCORE{"Score available nodes"}
SCORE -->|"best match"| SELECT["Select provider node"]
SELECT --> CLOUD["Cloud GPU"]
SELECT --> DEPIN["DePIN nodes"]
SELECT --> ONPREM["On-prem cluster"]
CLOUD -->|"QUIC/Iroh tunnel"| EXEC["Execute workload\non selected node"]
DEPIN -->|"QUIC/Iroh tunnel"| EXEC
ONPREM -->|"QUIC/Iroh tunnel"| EXEC
EXEC -->|"stream results"| RESULT["Return result\nto caller"]
EXEC -->|"per-second metering"| BILL["Facilitator\n(Fiat + Stables)"]
RESULT --> SDK
BILL --> SETTLE["On-chain settlement"]
style COORD fill:#d97706,color:#fff
style EXEC fill:#059669,color:#fff
style SETTLE fill:#7c3aed,color:#fff
```
## What It Does

- Orchestrates compute across cloud providers (AWS, GCP, etc.) and DePIN networks, so users get the best available machine without managing providers themselves.
- Schedules jobs across the connected network based on availability, latency, and cost.
- Connects machines with QUIC and Iroh for low-latency peer-to-peer communication, enabling distributed jobs that would be impractical over plain HTTP.
- Connects on-prem and local hardware too. Teams can plug their own machines into Fabric.
- Agent-friendly. An AI agent can ask for compute, get a GPU, and pay with x402 — no human needed.
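The scheduling step above can be sketched as a weighted score over candidate nodes. This is an illustrative model only — the `Node` fields, weights, and `score` function are assumptions, not Fabric's actual algorithm:

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    available: bool
    latency_ms: float     # measured round-trip time to the caller
    price_per_sec: float  # metered price in USD

def score(node: Node, latency_target_ms: float, budget_per_sec: float) -> float:
    """Lower is better; nodes that miss hard constraints are excluded."""
    if not node.available:
        return float("inf")
    if node.latency_ms > latency_target_ms or node.price_per_sec > budget_per_sec:
        return float("inf")
    # Blend normalized latency and cost; a real scheduler would also weigh
    # load, data locality, and provider reliability.
    return 0.7 * (node.latency_ms / latency_target_ms) \
         + 0.3 * (node.price_per_sec / budget_per_sec)

nodes = [
    Node("cloud-a100", True, 40.0, 0.002),
    Node("depin-4090", True, 12.0, 0.0008),
    Node("onprem-h100", False, 5.0, 0.0),   # offline: excluded
]
best = min(nodes, key=lambda n: score(n, latency_target_ms=50.0, budget_per_sec=0.005))
print(best.name)  # depin-4090 — lowest combined latency/cost score
```

The caller never sees this ranking; it only supplies the GPU tier, latency target, and budget shown in the diagram, and receives the selected node.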
## Why QUIC/Iroh Matters

Because Fabric machines communicate over QUIC with Iroh, inter-node latency is low enough to split jobs in ways that weren't practical before.
For example: instead of bundling a GPU model inside the same container as its worker, the model can live as its own service and be called remotely — with delay low enough that it feels local. This means GPU resources can be shared across jobs dynamically instead of being locked to one deployment.
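The shape of that decoupling can be shown with an in-process stand-in — one shared model "service" answering calls from several workers, in place of each worker bundling its own copy. The class and method names are illustrative; a real deployment would make `predict` a remote call over the QUIC/Iroh link:

```python
class ModelService:
    """One model instance serving many workers, instead of one copy per container."""
    def __init__(self):
        self.calls = 0

    def predict(self, frames):
        self.calls += 1
        return [f * 2 for f in frames]  # stand-in for real GPU inference

model = ModelService()

def worker(job_frames):
    # Each worker calls the shared model rather than loading its own copy;
    # over Fabric this call would cross the low-latency mesh instead.
    return model.predict(job_frames)

results = [worker([1, 2]), worker([3, 4])]
print(model.calls)  # 2 — one shared model instance served both jobs
```

The benefit is the same one the paragraph describes: the GPU behind `ModelService` is a pooled resource that any job can borrow, not capacity locked to a single deployment.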
Python and TypeScript SDKs let developers write apps the same way they would with Modal Labs — decorate a function, set the compute requirements, deploy.
```python
from fabric import service, GPU

@service(gpu=GPU.A100)
def run_inference(frames):
    return model.predict(frames)
```

The same code runs locally for development and on Fabric's distributed network for production.
## Payments

You pay for what you use — nothing more. Billed per second, no idle charges, no provisioning. Request a machine, run your workload, done.
Agents can request and pay for machines on their own in the same way. The Facilitator handles escrow for long-running sessions — funds are held upfront, drawn down as compute runs, and the remainder is returned when the session ends.
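That escrow lifecycle — hold upfront, draw down per second, refund the remainder — can be sketched as follows. The `Escrow` class and its fields are illustrative, not the Facilitator's real API:

```python
class Escrow:
    """Toy model of a per-second metered escrow session."""
    def __init__(self, deposit: float, rate_per_sec: float):
        self.held = deposit        # funds held upfront
        self.rate = rate_per_sec   # metered price
        self.spent = 0.0

    def meter(self, seconds: float):
        """Draw down the escrow as compute runs; never exceed the deposit."""
        charge = min(self.rate * seconds, self.held - self.spent)
        self.spent += charge

    def settle(self) -> float:
        """Close the session and return the unspent remainder to the payer."""
        return self.held - self.spent

session = Escrow(deposit=10.0, rate_per_sec=0.002)
session.meter(1800)  # 30 minutes of compute
refund = session.settle()
print(round(session.spent, 2), round(refund, 2))  # 3.6 6.4
```

In production the draw-down would follow the per-second metering stream from the executing node, and settlement would land on-chain as shown in the diagram above.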
## Who Uses It

- SimArena — sends heavy cloud simulation jobs (Isaac Sim, Genesis, Newton) to Fabric when browser physics isn't enough
- optr — deploys graph nodes to Fabric for production distributed jobs (`graph.deploy()`)
- Autonomous agents — request and pay for compute on demand with x402
## Comparable Products

Modal Labs — but Fabric adds agent-native x402 payments, DePIN provider integration, and the QUIC/Iroh mesh that powers optr's streaming runtime.
## Status

Alpha released. Public release in Q2.