RoboMove
Community-driven robot training through social interaction.
Interactive robot playground built on the CodecFlow stack. Livestreams robots in simulation environments, letting Twitter users control them via tweets.
Built using Fabric and optr.
How It Works
- Tweet a command tagging @RoboMove
- The sim interprets the instruction and the robot attempts it
- A short video reply shows the attempt
- Each run updates the policy for future use
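The first step of that loop is turning free-form tweet text into a structured command. A minimal sketch, assuming a hypothetical "@RoboMove <verb> [args...]" grammar (the project's actual command format is not documented here):

```python
import re

def parse_tweet(text: str):
    """Extract a robot command from a tweet mentioning @RoboMove.

    The grammar here is hypothetical: '@RoboMove <verb> [args...]'.
    Returns None when the tweet contains no command.
    """
    match = re.search(r"@RoboMove\s+(\w+)(.*)", text, re.IGNORECASE)
    if not match:
        return None
    verb, rest = match.group(1).lower(), match.group(2).split()
    return {"action": verb, "args": rest}

# A tweet becomes a structured command for the sim's action queue.
cmd = parse_tweet("hey @RoboMove walk forward 3")
print(cmd)  # {'action': 'walk', 'args': ['forward', '3']}
```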
Technical Architecture
MuJoCo physics runs in a dedicated subprocess, isolated from the streaming and HTTP layers. The main loop steps physics → renders a frame → pushes to the GStreamer pipeline → records. Rendering and streaming share a tick, while physics is decoupled from rendering through a frame queue, so the sim never blocks on the streamer.
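The frame-queue decoupling can be sketched as follows; the step ratio, queue depth, and function names are assumptions, and the real loop would call MuJoCo and the renderer where the stubs are:

```python
import queue

RENDER_EVERY = 8       # assumed ratio: one rendered frame per 8 physics steps
frame_queue: "queue.Queue" = queue.Queue(maxsize=4)  # decouples sim from render

def physics_tick(step_fn, state, step_count):
    """One iteration of the sim loop: step physics, then enqueue state
    for the render/stream tick when a frame is due. Dropping frames on
    a full queue keeps physics from ever blocking on the streamer."""
    state = step_fn(state)
    if step_count % RENDER_EVERY == 0:
        try:
            frame_queue.put_nowait(state)
        except queue.Full:
            pass  # drop the frame rather than stall physics
    return state

# Stand-in for mujoco.mj_step; the real loop steps MuJoCo here.
state = 0
for step in range(1, 25):
    state = physics_tick(lambda s: s + 1, state, step)

print(frame_queue.qsize())  # 3 frames queued for 24 physics steps
```

The render/stream side simply pops from `frame_queue` on its own tick, which is what lets physics run faster than the frame rate.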
Streaming: GStreamer pipeline handles video rendering and output (RTMP for broadcast, shared memory for low-latency local consumption). Cloudflare WebRTC SFU streams real-time simulation state data — joint positions, velocities, sensor readings — to connected clients with sub-100ms latency.
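A sketch of what the two GStreamer output paths could look like as pipeline strings; the element names (appsrc, x264enc, flvmux, rtmpsink, shmsink) are standard GStreamer, but the caps, encoder settings, and paths are illustrative assumptions:

```python
# Hypothetical pipeline descriptions; resolution, framerate, and
# locations are placeholders, not the project's actual settings.
WIDTH, HEIGHT, FPS = 1280, 720, 30

# Broadcast path: encode and mux for an RTMP endpoint.
rtmp_pipeline = (
    f"appsrc name=src format=time "
    f"caps=video/x-raw,format=RGB,width={WIDTH},height={HEIGHT},framerate={FPS}/1 "
    "! videoconvert ! x264enc tune=zerolatency speed-preset=ultrafast "
    "! flvmux streamable=true ! rtmpsink location=rtmp://example/live"
)

# Local low-latency path: raw frames over shared memory.
shm_pipeline = (
    f"appsrc name=src "
    f"caps=video/x-raw,format=RGB,width={WIDTH},height={HEIGHT},framerate={FPS}/1 "
    "! shmsink socket-path=/tmp/robomove.sock wait-for-connection=false"
)

print(rtmp_pipeline.split("!")[-1].strip())
```

The split matches the text: encoded RTMP for broadcast, unencoded shared memory for consumers on the same host that cannot afford encode/decode latency.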
HTTP API: Actions arrive as HTTP requests, enter the callback manager queue with a duration and priority, and execute inside the sim loop on the next available tick.
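The callback-manager queue described above can be sketched as a priority heap; the class and field names are hypothetical, and only the "duration + priority, popped once per tick" behavior comes from the text:

```python
import heapq
import itertools
from dataclasses import dataclass, field

_counter = itertools.count()  # tie-breaker keeps FIFO order within a priority

@dataclass(order=True)
class Action:
    priority: int                                  # lower number = runs sooner
    seq: int
    name: str = field(compare=False, default="")
    duration: float = field(compare=False, default=1.0)  # sim-seconds to run

class CallbackManager:
    """Hypothetical sketch: HTTP handlers push actions, the sim loop
    pops one per available tick and runs it for `duration`."""
    def __init__(self):
        self._heap = []

    def submit(self, name, duration, priority):
        heapq.heappush(self._heap, Action(priority, next(_counter), name, duration))

    def next_action(self):
        return heapq.heappop(self._heap) if self._heap else None

mgr = CallbackManager()
mgr.submit("wave", duration=2.0, priority=5)
mgr.submit("kick", duration=1.0, priority=1)
print(mgr.next_action().name)  # kick
```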
Robot model: Unitree G1 humanoid, 23 DOF — 6 left leg, 6 right leg, 3 torso, 4 left arm, 4 right arm. Joint stiffness, damping, default positions, and torque limits defined per-joint in the scene XML.
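In MJCF, that per-joint tuning lives on the joint and actuator elements. A hypothetical fragment in that shape (names follow Unitree G1 conventions; every value is illustrative, not the project's tuning):

```xml
<!-- Illustrative MJCF fragment: per-joint stiffness, damping,
     default position, and torque limit as described above. -->
<default>
  <joint damping="2" stiffness="0"/>
</default>
<worldbody>
  <body name="left_shank">
    <joint name="left_knee_joint" axis="0 1 0" range="-0.1 2.6"/>
  </body>
</worldbody>
<actuator>
  <position name="left_knee" joint="left_knee_joint"
            kp="150" forcerange="-139 139"/>
</actuator>
```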
Policies: Two control modes:
- Balance (ONNX) — observation vector includes IMU (linear velocity, gyro, gravity vector), velocity commands, joint angles, joint velocities, last action, and phase signal. ONNX model runs inference every 10 sim steps. Output is added to default joint positions via PD control.
- GMT / motion tracking (PyTorch) — loads reference motion trajectories (walk, dance, kick, squat, etc.) as .pkl files. A PyTorch policy maps target joint positions from the trajectory to motor torques at each timestep.
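The balance mode's "action added to default joint positions via PD control" step can be written out directly. The decimation (every 10 sim steps) and the observation terms come from the text; the gains, action scale, and observation ordering are assumptions:

```python
import numpy as np

NUM_JOINTS = 23
DECIMATION = 10                    # policy inference every 10 sim steps
KP = np.full(NUM_JOINTS, 100.0)    # per-joint PD gains: assumed values
KD = np.full(NUM_JOINTS, 2.0)
DEFAULT_QPOS = np.zeros(NUM_JOINTS)
ACTION_SCALE = 0.25                # assumed scaling of the policy output

def build_observation(imu, commands, q, qvel, last_action, phase):
    """Concatenate the observation terms listed above (ordering assumed)."""
    return np.concatenate([imu, commands, q, qvel, last_action, [phase]])

def pd_torques(action, q, qvel):
    """Policy output offsets the default pose; a PD loop tracks the target."""
    target = DEFAULT_QPOS + ACTION_SCALE * action
    return KP * (target - q) - KD * qvel

# Zero action with the robot at rest in the default pose -> zero torque.
tau = pd_torques(np.zeros(NUM_JOINTS), DEFAULT_QPOS, np.zeros(NUM_JOINTS))
print(float(np.abs(tau).max()))  # 0.0
```

Between inference calls, the PD loop keeps running against the last target every sim step, which is what makes the 10-step decimation stable.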
optr graph: Social listener → command parser → sim worker → frame renderer → stream pusher → reply poster. Each node runs as an optr operator; the graph dispatches to Fabric GPU workers for inference-heavy steps.
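Conceptually, that graph is a linear dataflow. A toy sketch only: optr's real operator API is not shown here, so each node is modeled as a plain function and the graph as an ordered stage list (the sim and inference stages would be the ones dispatched to Fabric GPU workers):

```python
# Each function stands in for one optr operator; payloads are simplified.
def social_listener(feed):  return [t for t in feed if "@RoboMove" in t]
def command_parser(x):      return [t.replace("@RoboMove", "").strip() for t in x]
def sim_worker(x):          return [("clip", c) for c in x]  # physics + policy
def frame_renderer(x):      return x                         # frames -> video
def stream_pusher(x):       return x                         # RTMP / WebRTC out
def reply_poster(x):        return [f"replied with video for '{c}'" for _, c in x]

GRAPH = [social_listener, command_parser, sim_worker,
         frame_renderer, stream_pusher, reply_poster]

out = ["hello world", "@RoboMove do a backflip"]
for stage in GRAPH:
    out = stage(out)
print(out)  # ["replied with video for 'do a backflip'"]
```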
What It Proves
- Fabric at work: GPU compute for physics-heavy sim steps and policy inference, provisioned on demand per session, billed per second.
- optr runtime: A multi-node, real-time graph running continuously — social input to physical output — across the full See→Think→Act loop.
- MuJoCo on Fabric: Full-fidelity humanoid physics (23 DOF, contact-rich locomotion) running cloud-side, streamed back with GStreamer + WebRTC.
- Community-as-sensor: Twitter users become the training data source. Each run logs to a dataset; the policy improves over time from crowd interaction.
Stack
- MuJoCo — physics engine (XML scene, per-joint tuning)
- GStreamer — video rendering + RTMP/SHM output
- Cloudflare WebRTC SFU — real-time sim state streaming
- ONNX Runtime / PyTorch — balance and motion-tracking policies
- Fabric — GPU workers for sim + inference
- optr — orchestration runtime
Status: First run with Unitree G1 humanoid. Open experiment — community-driven.