Danau5tin

Pinned

  1. multi-agent-coding-system (Public)

    Reached #13 on Stanford's Terminal Bench leaderboard. Orchestrator, explorer & coder agents working together with intelligent context sharing.

    Python · 1.3k stars · 170 forks

  2. terminal-bench-rl (Public)

    GRPO training code that scales to 32x H100s for long-horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.

    Python · 323 stars · 21 forks

  3. Orca-Agent-RL (Public)

    Scaling Coding-Agent RL to 32x H100s. **Achieving 160% improvement** on Stanford's TerminalBench

    Python · 89 stars · 12 forks

  4. calculator_agent_rl (Public)

    Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.

    Python · 65 stars · 6 forks

  5. tbench-agentic-data-pipeline (Public)

    Multi-agent synthetic data generation pipeline capable of generating and validating long-horizon terminal/coding tasks for RL training

    Python · 46 stars · 8 forks

  6. auto_agent_optimiser (Public)

    An AI Agent that automatically optimises other AI agents.

    Python · 4 stars · 2 forks