Pinned
-
multi-agent-coding-system (Public)
Reached #13 on Stanford's Terminal Bench leaderboard. Orchestrator, explorer & coder agents working together with intelligent context sharing.
-
terminal-bench-rl (Public)
GRPO training code that scales to 32x H100s for long-horizon terminal/coding tasks. The base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.
-
Orca-Agent-RL (Public)
Scaling Coding-Agent RL to 32x H100s. **Achieving 160% improvement** on Stanford's TerminalBench.
-
calculator_agent_rl (Public)
Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.
-
tbench-agentic-data-pipeline (Public)
Multi-agent synthetic data generation pipeline that generates and validates long-horizon terminal/coding tasks for RL training.
-
auto_agent_optimiser (Public)
An AI agent that automatically optimises other AI agents.