New v0.12.0 — deployment proof packets, RTSM backend fit, runtime safety, rollout evidence

Prove a robot policy is safe to deploy before it touches production hardware.

Tether (formerly Reflex) is the deployment-proof CLI for vision-language-action models. Export with parity, serve under a latency budget, replay real traces, enforce ActionGuard safety, and produce the evidence packet a robotics team needs before putting a policy on hardware.

$ curl -fsSL https://fastcrest.com/install | sh && tether chat

Don't trust pipes-to-shell? View the installer source first: it is a 177-line Bash bootstrap, and the installer itself does not phone home. Tether is BSL 1.1 with documented, configurable runtime telemetry.

v0.12.0 · Apache 2.0 in 2030 · Python ≥ 3.10

Deployment Proof

A pass/hold packet for robot releases.

Run Tether against the model, target hardware, replay session, and safety config you actually plan to ship. The output is a concrete deployability record, not a demo screenshot.

Proof packet

Artifact hashes, PyTorch parity, p50/p95/p99 latency, deadline misses, stale-action windows, and target hardware details in one signed receipt.
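
As a rough illustration of the numbers such a packet summarizes, the sketch below computes nearest-rank latency percentiles, counts deadline misses, and hashes an artifact. It is not Tether's implementation: the function names, the field names, and the sample latencies are all hypothetical, and the real packet format may differ.

```python
import hashlib
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of data at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_summary(latencies_ms, deadline_ms):
    """Summarize per-request latencies against a control deadline (hypothetical fields)."""
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
        "deadline_misses": sum(1 for t in latencies_ms if t > deadline_ms),
        "samples": len(latencies_ms),
    }


def artifact_sha256(data: bytes) -> str:
    """Hex SHA-256 digest that pins the exact artifact bytes that were measured."""
    return hashlib.sha256(data).hexdigest()


# Made-up latencies for illustration only.
latencies = [18.0, 19.5, 21.0, 22.0, 22.5, 23.0, 24.0, 26.0, 31.0, 47.0]
summary = latency_summary(latencies, deadline_ms=30.0)
print(summary)
print(artifact_sha256(b"example artifact bytes"))
```

Reporting p99 and the miss count alongside p50 matters for control loops, because a policy with a good median can still stall a robot on its slowest requests.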

Rollout confidence

Record-replay traces, policy-version evidence, shadow/canary gates, warm-swap checks, and rollback signals before promotion.

Audit evidence

ActionGuard summaries, safety violations, SBOM, vulnerability handling, and compliance gaps packaged for technical review.

Book a deployment proof sprint → Run the CLI locally

Optimize → Prove → Release → Comply

Four surfaces, one deployability system.

The source-available Tether engine creates local evidence. FastCrest Cloud runs proof jobs on rented GPUs. Fleet Release moves approved artifacts through robot cohorts. FastCrest Comply turns the evidence into a shared-auth conformity workspace.

Engine · source-available

The CLI you just installed. Optimize and serve pi0 / pi0.5 / SmolVLA / GR00T policies on edge GPUs, with ActionGuard safety and a tamper-evident audit log. Free, BSL 1.1.
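
One common way to make a log tamper-evident is a hash chain, where each entry commits to the hash of the one before it. The sketch below shows that general technique only. How Tether's audit log is actually built is not described here, and the entry layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(event, prev_hash):
    """Hash the event together with the previous entry's hash."""
    body = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, chaining it to the last entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(event, prev_hash)})


def verify(log):
    """Return True only if every entry still matches the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"type": "action_clamped", "joint": 3})
append_entry(audit_log, {"type": "policy_swapped", "version": "v2"})
print(verify(audit_log))
```

Editing or deleting any earlier entry changes its hash, which breaks the link stored in every later entry, so the tampering shows up on the next verification pass.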

Cloud · hosted

Upload a model, target chip, replay session, and latency budget. Cloud GPUs return a verified edge artifact, signed receipt, and deployment proof packet.

Fleet Release · rollout

Move a verified artifact through staging, canary, and production robot cohorts with health gates, rollback, and release receipts.
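
The staged promotion described above can be pictured as a loop that stops at the first failed health gate. This is a minimal sketch under stated assumptions, not Fleet Release's API: the stage names come from the text, while `roll_out`, the receipt fields, and the health-check callback are hypothetical.

```python
STAGES = ["staging", "canary", "production"]


def roll_out(artifact_id, health_check):
    """Promote an artifact stage by stage; stop and record a rollback on the first failure.

    health_check is a caller-supplied function taking a stage name and
    returning True if the cohort is healthy after the artifact is applied.
    """
    receipts = []
    for stage in STAGES:
        healthy = health_check(stage)
        receipts.append({
            "artifact": artifact_id,
            "stage": stage,
            "status": "promoted" if healthy else "rolled_back",
        })
        if not healthy:
            break  # never continue past a failed gate
    return receipts


# Example: the canary cohort fails its gate, so production is never touched.
receipts = roll_out("artifact-123", lambda stage: stage != "canary")
for receipt in receipts:
    print(receipt)
```

Each stage emits a receipt whether it passes or fails, so the record shows where promotion stopped as well as what was promoted.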

Comply · EU evidence

Turn the audit log, signed cert, SBOM, and ActionGuard summary into a live compliance workspace for AI Act, CRA, and Machinery Regulation review.

How it works

From a HuggingFace model to a robot, in four steps.

tether go --model <hf_id> runs all four. Each step writes a verifiable artifact and refuses to ship if its check fails — bad exports never reach a robot.

What it looks like

Talk to your robot fleet in plain English.

tether chat wraps the entire CLI surface in a natural-language agent. 100 calls/day free, no signup, no API key.

romir@thor — tether chat

$ tether chat
connected: chat.fastcrest.com (model=gpt-5-mini)

you › what's the smallest VLA model on Jetson Orin Nano?

  → list_models({})
  → model_info({"model": "smolvla-base"})

smolvla-base — 900 MB, 450M params. p50 22 ms / p99 45 ms on A10G.

Real tether chat session: the agent calls list_models, reads the registry, picks the model that fits your hardware, and explains why.

Composable wedges

Every flag is opt-in. Compose only what you need.

14 runtime wedges layer on top of tether serve — safety, observability, optimization, transport. Enable only the checks your deployment needs, then export the evidence they produce.

Multi-embodiment

pi0, pi0.5, SmolVLA, GR00T — all four major open VLA families. ONNX export verified at cos = +1.000000 against PyTorch.

Edge-first

Jetson Orin Nano (8 GB) → AGX Orin → Thor → desktop NVIDIA. Hardware probe picks the right variant; export targets the right precision.

SnapFlow distillation

First open-source SnapFlow reproduction. Distill any pi0 / pi0.5 to a 1-step student that beats its 10-step teacher (64% vs 56% on libero_object).

Production runtime

CUDA graphs, cost-weighted batching, A2C2 correction, record-replay traces, real-time chunking — composable wedges on a single FastAPI server.

Numbers

Verifiable claims, not vibes.

All three reproducible with one command on Modal. See the parity ledger and changelog for full provenance.

9×

faster than monolithic ONNX (decomposed pi0.5 on Jetson AGX Orin)

5.55×

TensorRT FP16 vs ORT-CUDA on A10G (SmolVLA monolithic)

cos=+1.000000

numerical parity to PyTorch on the four verified major open VLA families
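
For readers unfamiliar with the metric, the sketch below shows how a cosine-similarity parity check between a reference output and an exported model's output could be computed. It is illustrative only: the vectors are made-up values, the function names are hypothetical, and Tether's actual parity harness is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


# Made-up action vectors: a reference output and a slightly perturbed export.
reference = [0.12, -0.40, 0.33, 0.05]
exported = [x + 1e-7 for x in reference]

cos = cosine_similarity(reference, exported)
print(f"cos={cos:+.6f}", f"max_abs_diff={max_abs_diff(reference, exported):.1e}")
```

Cosine similarity ignores uniform scaling, so a check like this is usually paired with an absolute-difference bound such as `max_abs_diff` to catch outputs that point the same way but differ in magnitude.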

vs other tools

Where Tether fits.

Tether is deliberately narrow. Here's the honest read.

Capability	Tether	Triton	HF Endpoints	Raw ONNX
Edge GPU deployment	design center	cloud-first	cloud-only	DIY
VLA-specific export (pi0 / pi0.5 / SmolVLA / GR00T; registry paths for OpenVLA / DreamZero)	built-in, verified core	no	no	manual, error-prone
Verified machine-precision parity	automatic	DIY	DIY	DIY
Decomposed pi0.5 (9× speedup)	one flag	no	no	~weeks of work
Setup time	30 seconds	days	minutes	1–3 weeks
Multi-tenant cloud serving at scale	not the design	battle-tested	managed	DIY

Honest details: vs other tools →

Common questions

FAQ

Why was Reflex renamed to Tether?

One brand, one name. The source-available deploy CLI is now Tether; the hosted optimize-and-verify SaaS is FastCrest Cloud; the regulated-AI evidence bundle is FastCrest Comply. The package on PyPI is fastcrest-tether (the bare name tether is reserved on PyPI), the GitHub repo is github.com/FastCrest/tether, and every CLI command is still tether .... Old reflex URLs 301-redirect for the foreseeable future. The legacy reflex-vla package on PyPI stays pinned at 0.11.x; new releases ship as fastcrest-tether.

Why not just use Triton?

Triton is excellent for multi-tenant cloud inference at scale — many models, many services, datacenter GPUs, an ML platform team. Tether is for the opposite: one model, one robot, one process, edge GPU, one developer. Tether ships VLA-specific features Triton doesn't (decomposed pi0.5, A2C2, ActionGuard with URDF, episode-aware policy routing). The two compose — you can tether export and drop the result into Triton if you want both. Full comparison →

Does it work without a GPU?

Yes. Install with pip install 'fastcrest-tether[serve,onnx]'. The CPU path runs the four verified major open VLA families at machine-precision parity to PyTorch; SmolVLA is the only one fast enough for real-time control on CPU. Once the package is installed, tether chat works with no signup and no API key.

What does BSL 1.1 mean for me?

Tether (formerly Reflex) is source-available under BSL 1.1, the license MariaDB created and HashiCorp later adopted. Free for personal, academic, and commercial use, including embedding in your own product. The only restriction is offering Tether itself as a competing hosted service. Each release auto-converts to Apache 2.0 four years after it ships. License details →

Does it work on RTX 5090 / Blackwell?

ONNX Runtime 1.25.1+ includes Blackwell kernels, but RTX 50-series and B200/GB200 deployments still need a smoke pass before production use. Run tether doctor and the local serve smoke on the target GPU before promoting; use non-Blackwell cloud GPUs for fallback proof jobs when needed.

Can I use this in a commercial product?

Yes — BSL 1.1 explicitly permits commercial use, including embedding in proprietary products you ship to your own customers. The only restricted case is offering Tether itself as a competing hosted service. Most legitimate use cases (deploying to your own robots, your own labs, your own customers) are clearly in the free bucket.

How does this compare to NVIDIA's GR00T runtime?

As far as we know, Tether is the only source-available one-command deploy path for GR00T. NVIDIA's runtime is closed-source and locked to their hardware. Tether supports GR00T alongside pi0 / pi0.5 / SmolVLA and works on Jetson Orin (not just Thor).

What's the Pro tier?

The source-available CLI stays free for local export, serve, replay, and evidence generation. Paid offerings are for repeated team proof runs, hosted GPU verification, compliance workspaces, and continuous self-distillation. Pricing details →

Get in touch

Deploying a VLA? Send the model family, target hardware, replay session, and the proof you need. For quick questions, the Discord is usually faster.
