Skip to content

dev : Implement bubblewrap sandbox backend with availability detection and tests#6709

Open
LIghtJUNction wants to merge 7 commits intoAstrBotDevs:devfrom
yizeyi18:sandbox
Open

dev : Implement bubblewrap sandbox backend with availability detection and tests#6709
LIghtJUNction wants to merge 7 commits intoAstrBotDevs:devfrom
yizeyi18:sandbox

Conversation

@LIghtJUNction
Copy link
Member

@LIghtJUNction LIghtJUNction commented Mar 20, 2026

This pull request introduces a new Bubblewrap-based sandbox environment ("bwrap") to the Astrbot framework, providing a secure and configurable isolation layer for executing code. It adds the necessary backend implementation, configuration options, and localization support for the new booter. The changes also include updates to the configuration schema and user interface metadata to expose Bubblewrap options to users.

The most important changes are:

Bubblewrap sandbox implementation:

  • Added bwrap.py, implementing BwrapBooter, BwrapConfig, and related components for secure, isolated execution of shell and Python commands using Bubblewrap. This includes support for custom read-write and read-only bind mounts, file operations, and runtime checks for Bubblewrap availability.

Configuration and schema updates:

  • Extended the default configuration (default.py) to add bwrap_rw_binds and bwrap_ro_binds options, and updated the schema to allow selecting "bwrap" as a sandbox driver with corresponding bind mount settings. [1] [2]

Integration into computer client:

  • Modified computer_client.py to support selecting and instantiating the Bubblewrap booter, passing in the appropriate configuration for read-write and read-only binds.
  • Minor fix in parse_description to use escaped newlines for frontmatter parsing.

Localization and UI support:

  • Added English, Chinese, and Russian translations and hints for the new Bubblewrap configuration options in config-metadata.json, ensuring these settings are available and understandable in the dashboard UI. [1] [2] [3] [4] [5] [6]

Testing support:

  • Updated test_computer.py to import and make available the new Bubblewrap booter and its components for unit testing. [1] [2]

Modifications / 改动点

  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果


Checklist / 检查清单

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    / 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    / 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    / 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到 requirements.txtpyproject.toml 文件相应位置。

  • 😮 My changes do not introduce malicious code.
    / 我的更改没有引入恶意代码。

Summary by Sourcery

Add a Bubblewrap-based sandbox backend and wire it into the computer client and configuration system for running code in an isolated environment.

New Features:

  • Introduce a Bubblewrap booter implementation providing sandboxed shell, Python, and filesystem capabilities.
  • Expose Bubblewrap as a selectable sandbox driver with configurable read-write and read-only bind mounts in the provider settings UI.

Enhancements:

  • Update computer client booter selection logic to construct a Bubblewrap booter using configured bind mount options.
  • Adjust description parsing to preserve literal newlines in frontmatter handling.

Tests:

  • Add unit tests covering Bubblewrap configuration defaults, command construction, booter lifecycle, shell and Python execution, file upload/download, and host-backed filesystem operations.

yizeyi18 and others added 7 commits March 18, 2026 23:02
 - Based on Linux Namespace for resource isolation
 - Runs on local computer, with no privilege required
 - Only supports Linux as namespace & bubblewrap are not present on other platforms.

TODO:
 - Fix dashboard presentation. Why change on src does not affect what is really displayed?
 - Strenghthen backend availability detection. One known issue is, on some platforms like Ubuntu 24.04, bubblewrap is banned by system guards, even when it's shipped by package manager. A complete detector may contain :
   1. run the command with cmdline used by the booter. Return True if succsed.
   2. If false, do bottom-up reason detection. Namespace not compiled to kernel? Specific kernel parameters not set? Banned by safety guard? The availability detector should give the user a clear information on why this sandbox backend fails.

These work may require helps from frontend developers. It is tested to be usable on my computer, with non-persistent environment(forget on every command) and persistent file storage.
TODO add:
 - add plugin utility to change ro and rw bind in cmdline
 - make bind dirs dict instead of list to manually map mount point
 in older commits, ro_bind = ['/'] makes skill sync crash. This commit fixes it and adds detection.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@auto-assign auto-assign bot requested review from Soulter and anka-afk March 20, 2026 15:35
@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. area:core The bug / feature is about astrbot's core, backend labels Mar 20, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a Bubblewrap sandbox environment to Astrbot, enhancing security and configurability for code execution. It includes the implementation of the sandbox, updates to the configuration schema, integration into the computer client, localization support, and testing enhancements, providing a comprehensive solution for isolated code execution.

Highlights

  • Bubblewrap Sandbox Implementation: Introduced a new Bubblewrap-based sandbox environment ('bwrap') to the Astrbot framework, offering a secure and configurable isolation layer for code execution.
  • Configuration and Schema Updates: Extended the default configuration to include 'bwrap_rw_binds' and 'bwrap_ro_binds' options, allowing users to specify read-write and read-only bind mounts for the Bubblewrap sandbox.
  • Integration into Computer Client: Modified the computer client to support selecting and instantiating the Bubblewrap booter, passing in the appropriate configuration for bind mounts.
  • Localization and UI Support: Added English, Chinese, and Russian translations and hints for the new Bubblewrap configuration options in the configuration metadata, ensuring these settings are accessible and understandable in the dashboard UI.
  • Testing Support: Updated the test suite to include the new Bubblewrap booter and its components for unit testing.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@dosubot
Copy link

dosubot bot commented Mar 20, 2026

Related Documentation

1 document(s) may need updating based on files changed in this PR:

AstrBotTeam's Space

pr4697的改动
View Suggested Changes
@@ -117,9 +117,40 @@
 
 - 如果 `computer_use_runtime` 设置为 `"local"`,立即返回本地 booter,无需检查 sandbox booter 配置
 - 如果 `computer_use_runtime` 设置为 `"none"`,抛出 `RuntimeError` 提示沙箱运行时已禁用
-- 只有在以上条件均不满足时,才继续检查 sandbox booter 配置(如 `shipyard_neo` 等)
+- 只有在以上条件均不满足时,才继续检查 sandbox booter 配置(如 `shipyard_neo`、`shipyard`、`bwrap` 等)
 
 该优化避免了在运行时类型已明确时进行不必要的沙箱配置检查,提高了边缘场景的性能。
+
+**Bubblewrap 沙箱支持(PR #6709)**
+
+[PR #6709](https://github.com/AstrBotDevs/AstrBot/pull/6709) 新增了基于 Bubblewrap 的沙箱环境(`bwrap`),提供安全且可配置的代码执行隔离层,作为 Docker 沙箱的替代方案。
+
+**核心特性:**
+- **轻量级隔离**:使用 Bubblewrap 提供命名空间级别的沙箱隔离,无需 Docker 容器
+- **自定义挂载**:支持配置额外的读写和只读挂载目录,灵活控制沙箱可访问的文件系统资源
+- **自动检测**:系统在运行时自动检测 Bubblewrap 可用性,若未安装则提供明确的错误提示
+- **完整工具支持**:提供 Python、Shell 和文件系统操作能力,与其他沙箱后端保持一致
+
+**配置项(provider_settings.sandbox):**
+- `booter: "bwrap"`:选择 Bubblewrap 作为沙箱驱动器
+- `bwrap_rw_binds`:读写挂载目录列表,用于指定需要在沙箱内以读写权限挂载的目录或文件(默认 `/tmp` 和 `/var/tmp` 已自动配置为读写)
+- `bwrap_ro_binds`:只读挂载目录列表,用于指定需要在沙箱内以只读权限挂载的目录或文件(默认系统目录 `/usr`、`/etc`、`/opt` 已自动配置为只读)
+
+**技术实现:**
+- `BwrapBooter` 类:实现 Bubblewrap 沙箱后端,管理沙箱生命周期和资源清理
+- `BwrapConfig` 数据类:封装 Bubblewrap 配置,包括工作区目录、挂载设置和网络隔离选项
+- `build_bwrap_cmd()` 函数:根据配置构建 Bubblewrap 命令行,确保正确的隔离和挂载
+- `HostBackedFileSystemComponent`:文件操作直接在宿主机工作区执行,提供极快的 I/O 性能
+- 自动可用性检测:系统在启动时检查 `bwrap` 可执行文件是否存在(Windows 系统不支持)
+
+**使用限制:**
+- 仅支持 Linux 和 macOS 系统(通过检测 `bwrap` 命令可用性)
+- Windows 系统下 `available()` 方法返回 `false`
+- 需要系统中已安装 Bubblewrap(在大多数 Linux 发行版中可通过包管理器安装)
+
+**测试覆盖:**
+- `tests/unit/test_computer.py` 包含 Bubblewrap 实现的完整单元测试
+- 测试场景包括配置构建、沙箱生命周期、Shell/Python 执行、文件系统操作和上传/下载功能
 
 **运行时工具列表:**
 

[Accept] [Decline]

Note: You must be authenticated to accept/decline updates.

How did I do? Any feedback?  Join Discord

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 6 issues, and left some high level feedback:

  • The change to parse_description now joins frontmatter lines with a literal "\n" instead of real newlines ("\n""\\n"), which is likely to break YAML parsing of the frontmatter; consider keeping real newline characters or explicitly explaining why YAML should receive escaped newlines.
  • In HostBackedFileSystemComponent._safe_path, absolute paths are passed through without any confinement, which allows file operations outside the sandbox workspace; if stronger isolation is desired, consider normalizing and enforcing that all paths resolve under workspace_dir (e.g., via os.path.realpath checks) and rejecting anything outside.
  • For BwrapShellComponent.exec with background=True, the created Popen process is never tracked or reaped; if long-running background commands are expected, consider adding some form of lifecycle management (e.g., tracking PIDs or using subprocess.DEVNULL plus periodic reaping) to avoid lingering processes.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The change to `parse_description` now joins frontmatter lines with a literal `"\n"` instead of real newlines (`"\n"``"\\n"`), which is likely to break YAML parsing of the frontmatter; consider keeping real newline characters or explicitly explaining why YAML should receive escaped newlines.
- In `HostBackedFileSystemComponent._safe_path`, absolute paths are passed through without any confinement, which allows file operations outside the sandbox workspace; if stronger isolation is desired, consider normalizing and enforcing that all paths resolve under `workspace_dir` (e.g., via `os.path.realpath` checks) and rejecting anything outside.
- For `BwrapShellComponent.exec` with `background=True`, the created `Popen` process is never tracked or reaped; if long-running background commands are expected, consider adding some form of lifecycle management (e.g., tracking PIDs or using `subprocess.DEVNULL` plus periodic reaping) to avoid lingering processes.

## Individual Comments

### Comment 1
<location path="astrbot/core/computer/computer_client.py" line_range="217" />
<code_context>
         return ""

-    frontmatter = "\n".join(lines[1:end_idx])
+    frontmatter = "\\n".join(lines[1:end_idx])
     try:
         import yaml
</code_context>
<issue_to_address>
**issue (bug_risk):** Joining with a literal "\n" string likely breaks YAML frontmatter parsing.

Using "\\n" will insert the literal backslash-n characters into the frontmatter instead of real line breaks, which is likely to produce invalid or misparsed YAML for multi-line values. Unless there’s a strong reason to change the format, this should continue to join with real newlines ("\n").
</issue_to_address>

### Comment 2
<location path="astrbot/core/computer/booters/bwrap.py" line_range="186-189" />
<code_context>
+    """File operations happen safely on host mapping to workspace, making I/O extremely fast."""
+    workspace_dir: str
+    
+    def _safe_path(self, path: str) -> str:
+        # Simply maps it. In a stricter implementation, we could verify it's inside workspace_dir.
+        # But for this implementation, we trust the agent or restrict to workspace_dir.
+        if not path.startswith("/"):
+            path = os.path.join(self.workspace_dir, path)
+        return path
</code_context>
<issue_to_address>
**🚨 issue (security):** Absolute paths bypass workspace restriction, allowing arbitrary host filesystem access.

Returning absolute paths unchanged lets callers use FileSystemComponent to access arbitrary host paths (e.g. `/etc/passwd`, `/home/...`), breaking the intended sandbox isolation. You should ensure all paths resolve under `workspace_dir` (e.g., normalize with `os.path.realpath` and check it’s under the workspace root), or very clearly document and strictly limit which absolute paths are allowed.
</issue_to_address>

### Comment 3
<location path="astrbot/core/computer/booters/bwrap.py" line_range="27-36" />
<code_context>
+    ) -> dict[str, Any]:
+        def _run() -> dict[str, Any]:
+            bwrap_cmd = build_bwrap_cmd(self.config, [os.environ.get("PYTHON", "python3"), "-c", code])
+            try:
+                result = subprocess.run(
+                    bwrap_cmd,
+                    timeout=timeout,
+                    capture_output=True,
+                    text=True,
+                )
+                stdout = "" if silent else result.stdout
+                return {
+                    "stdout": stdout,
+                    "stderr": result.stderr,
+                    "exit_code": result.returncode,
+                }
+            except subprocess.TimeoutExpired as e:
+                return {
+                    "stdout": e.stdout.decode() if isinstance(e.stdout, bytes) else str(e.stdout or ""),
</code_context>
<issue_to_address>
**suggestion:** Timeout handler assumes `stdout` may be bytes while `text=True` guarantees str.

With `text=True`, `e.stdout` is always a `str` (or `None`), so the `isinstance(e.stdout, bytes)` check and `.decode()` branch will never run. This dead code is misleading. You can simplify and keep behavior aligned with the non-timeout path, e.g.:

```python
stdout = e.stdout or ""
return {
    "stdout": "" if silent else stdout,
    "stderr": f"Execution timed out after {timeout} seconds.",
    "exit_code": 1,
}
```
</issue_to_address>

### Comment 4
<location path="astrbot/core/computer/booters/bwrap.py" line_range="278-279" />
<code_context>
+        self._fs = HostBackedFileSystemComponent(self.config.workspace_dir)
+        self._python = BwrapPythonComponent(self.config)
+        self._shell = BwrapShellComponent(self.config)
+        if not await self.available(): 
+            raise RuntimeError(
+            "BubbleWrap sandbox unavailable on current machine for no bwrap executable.")
+        test_shl = await self._shell.exec(command = "ls > /dev/null")
</code_context>
<issue_to_address>
**suggestion:** Check availability before constructing shell/python components and using bwrap.

You’re doing all the workspace and component setup before confirming `bwrap` exists, then immediately raising if it doesn’t. Please move the `await self.available()` check (and `RuntimeError`) ahead of creating `HostBackedFileSystemComponent`, `BwrapPythonComponent`, and `BwrapShellComponent` so we avoid unnecessary setup and keep the failure path clearer and cheaper.
</issue_to_address>

### Comment 5
<location path="tests/unit/test_computer.py" line_range="964-930" />
<code_context>
+        await booter.shutdown()
+
+@pytest.mark.skipif(shutil.which("bwrap") is None, reason="bwrap is not installed")
+class TestBwrapShellComponent:
+    @pytest.mark.asyncio
+    async def test_bwrap_shell_exec(self):
+        booter = BwrapBooter()
+        await booter.boot("test_shell")
+        res = await booter.shell.exec("echo 'hello bwrap'")
+        assert res["exit_code"] == 0
+        assert "hello bwrap" in res["stdout"]
+        await booter.shutdown()
+
+    @pytest.mark.asyncio
</code_context>
<issue_to_address>
**suggestion (testing):** Add coverage for BwrapShellComponent background execution and environment handling

The current tests cover only basic execution and a single bind case. Please add tests that exercise more of `BwrapShellComponent.exec`’s behavior, including:

- `background=True` (returns quickly, gives a PID, `exit_code` is `None`).
- Merging of a custom `env` dict into the process environment (e.g., assert a variable appears in `stdout`).
- `cwd` overrides when running in a subdirectory.
- Respect for the `timeout` argument on a long-running command.

Even a couple of focused tests here would substantially reduce the risk of regressions in these paths.

Suggested implementation:

```python
    @pytest.mark.asyncio
    async def test_bwrap_shell_exec_background(self):
        booter = BwrapBooter()
        await booter.boot("test_shell_background")

        # background=True should return quickly with a PID and no exit code yet
        res = await booter.shell.exec("sleep 5", background=True)
        assert res.get("exit_code") is None
        assert isinstance(res.get("pid"), int)

        await booter.shutdown()

    @pytest.mark.asyncio
    async def test_bwrap_shell_exec_env_merge(self):
        booter = BwrapBooter()
        await booter.boot("test_shell_env")

        # Custom env vars should be visible inside the sandboxed process
        res = await booter.shell.exec(
            "sh -lc 'echo \"$MY_TEST_VAR\"'",
            env={"MY_TEST_VAR": "bwrap_env_value"},
        )
        assert res["exit_code"] == 0
        assert "bwrap_env_value" in res["stdout"]

        await booter.shutdown()

    @pytest.mark.asyncio
    async def test_bwrap_shell_exec_cwd(self, tmp_path):
        booter = BwrapBooter()
        await booter.boot("test_shell_cwd")

        subdir = tmp_path / "subdir"
        subdir.mkdir()
        file_in_subdir = subdir / "here.txt"
        file_in_subdir.write_text("from_subdir")

        # Running in the subdirectory via cwd should see the file there
        res = await booter.shell.exec("ls", cwd=str(subdir))
        assert res["exit_code"] == 0
        assert "here.txt" in res["stdout"]

        await booter.shutdown()

    @pytest.mark.asyncio
    async def test_bwrap_shell_exec_timeout(self):
        booter = BwrapBooter()
        await booter.boot("test_shell_timeout")

        # A long-running command should respect the timeout argument
        with pytest.raises(asyncio.TimeoutError):
            await booter.shell.exec("sleep 60", timeout=0.1)

        await booter.shutdown()

    @pytest.mark.asyncio
    async def test_bwrap_available(self):

```

1. Ensure `asyncio` is imported at the top of `tests/unit/test_computer.py`:
   - `import asyncio`
2. If `BwrapShellComponent.exec` signals timeouts differently (e.g., returns a result dict with a timeout flag or specific exit code instead of raising `asyncio.TimeoutError`), adjust `test_bwrap_shell_exec_timeout` to assert on that behavior instead of using `pytest.raises(asyncio.TimeoutError)`.
3. If the sandbox does not have `/bin/sh` or the `sleep`/`ls` utilities available, replace the shell commands with equivalents that are known to be present in your bwrap environment.
</issue_to_address>

### Comment 6
<location path="astrbot/core/computer/booters/bwrap.py" line_range="1" />
<code_context>
+from __future__ import annotations
+
+import asyncio
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring shared process execution, config initialization, filesystem operations, and validation helpers to centralize logic and remove duplicated code paths in the new bwrap booter implementation.

You can simplify and de-duplicate a few areas without changing behavior.

---

### 1. Centralize bwrap execution for Shell & Python

`BwrapShellComponent.exec` and `BwrapPythonComponent.exec` duplicate a lot of process logic. A small shared helper keeps behavior identical but reduces branching and makes it easier to change timeout/decoding semantics later.

```python
def _run_bwrap(
    config: BwrapConfig,
    script_cmd: list[str],
    *,
    cwd: str | None = None,
    env: dict[str, str] | None = None,
    timeout: int | None = 30,
    background: bool = False,
    text: bool = False,
    silent: bool = False,
) -> dict[str, Any]:
    run_env = os.environ.copy()
    if env:
        run_env.update({str(k): str(v) for k, v in env.items()})

    working_dir = cwd or config.workspace_dir
    bwrap_cmd = build_bwrap_cmd(config, script_cmd)

    if background:
        proc = subprocess.Popen(
            bwrap_cmd,
            cwd=working_dir,
            env=run_env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return {"pid": proc.pid, "stdout": "", "stderr": "", "exit_code": None}

    try:
        result = subprocess.run(
            bwrap_cmd,
            cwd=working_dir,
            env=run_env,
            timeout=timeout,
            capture_output=True,
            text=text,
        )
        if text:
            stdout = "" if silent else (result.stdout or "")
            stderr = result.stderr or ""
        else:
            stdout = "" if silent else _decode_shell_output(result.stdout)
            stderr = _decode_shell_output(result.stderr)
        return {
            "stdout": stdout,
            "stderr": stderr,
            "exit_code": result.returncode,
        }
    except subprocess.TimeoutExpired as e:
        # Preserve existing behavior: best-effort decode
        raw_stdout = e.stdout
        if text:
            decoded = raw_stdout or ""
        else:
            decoded = _decode_shell_output(raw_stdout) if raw_stdout else ""
        return {
            "stdout": decoded,
            "stderr": f"Execution timed out after {timeout} seconds.",
            "exit_code": 1,
        }
```

Then both components become thin wrappers:

```python
@dataclass
class BwrapShellComponent(ShellComponent):
    config: BwrapConfig

    async def exec(...):
        def _run() -> dict[str, Any]:
            script_cmd = ["/bin/sh", "-c", command] if shell else shlex.split(command)
            return _run_bwrap(
                self.config,
                script_cmd,
                cwd=cwd,
                env=env,
                timeout=timeout,
                background=background,
                text=False,
            )
        return await asyncio.to_thread(_run)


@dataclass
class BwrapPythonComponent(PythonComponent):
    config: BwrapConfig

    async def exec(...):
        def _run() -> dict[str, Any]:
            script_cmd = [os.environ.get("PYTHON", "python3"), "-c", code]
            return _run_bwrap(
                self.config,
                script_cmd,
                timeout=timeout,
                text=True,
                silent=silent,
            )
        return await asyncio.to_thread(_run)
```

This also gives you a single place to tweak env handling, timeouts, etc.

---

### 2. Make decoding strategy consistent

Right now `BwrapShellComponent` uses `_decode_shell_output` while `BwrapPythonComponent` relies on `text=True` and manual decode on timeout. With `_run_bwrap` above, you can standardize:

- Use `text=True` for all cases where you want strings.
- Use `_decode_shell_output` only when you deliberately want to work with bytes.

For example, for shell you can also switch to `text=True` and drop `_decode_shell_output` entirely if you don’t need the multi-encoding fallback:

```python
# If you're comfortable with locale-based decoding:
return _run_bwrap(
    self.config,
    script_cmd,
    cwd=cwd,
    env=env,
    timeout=timeout,
    background=background,
    text=True,
)
```

Or keep `_decode_shell_output` only inside `_run_bwrap` for the non-text path so there’s exactly one decoding path.

---

### 3. Avoid in-place mutation in `BwrapConfig.__post_init__`

`__post_init__` mutates `self.ro_binds` that comes from `BwrapBooter`, making it less obvious what’s user data vs defaults. You can preserve behavior but overwrite with a new list:

```python
@dataclass
class BwrapConfig:
    workspace_dir: str
    ro_binds: list[str] = field(default_factory=list)
    rw_binds: list[str] = field(default_factory=list)
    share_net: bool = True

    def __post_init__(self):
        default_ro = ["/usr", "/lib", "/lib64", "/bin", "/etc", "/opt"]
        # preserve order, remove duplicates
        combined = [*self.ro_binds, *default_ro]
        seen = set()
        self.ro_binds = [p for p in combined if not (p in seen or seen.add(p))]
```

This keeps the effective bind set identical while eliminating hidden mutation of the list passed into the constructor.

Alternatively, you can push that logic into `BwrapBooter.boot` before constructing `BwrapConfig`, but the above is minimal and localized.

---

### 4. Consolidate file operations with `HostBackedFileSystemComponent`

`BwrapBooter.upload_file`/`download_file` partially duplicate filesystem logic already encapsulated in `HostBackedFileSystemComponent`. You can route these operations through `_fs` to avoid two code paths:

```python
class HostBackedFileSystemComponent(FileSystemComponent):
    ...
    async def copy_to_workspace(self, src: str, dest_name: str) -> dict[str, Any]:
        target = os.path.join(self.workspace_dir, dest_name)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        try:
            shutil.copy2(src, target)
            return {"success": True, "file_path": target}
        except Exception as e:
            return {"success": False, "error": str(e)}

    async def copy_from_workspace(self, src: str, dest: str) -> dict[str, Any]:
        if not src.startswith("/"):
            src = os.path.join(self.workspace_dir, src)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy2(src, dest)
        return {"success": True, "file_path": dest}
```

Then `BwrapBooter` becomes thinner:

```python
async def upload_file(self, path: str, file_name: str) -> dict:
    if not self._fs:
        return {"success": False, "error": "Not booted"}
    return await self._fs.copy_to_workspace(path, file_name)

async def download_file(self, remote_path: str, local_path: str) -> dict:
    if not self._fs:
        return {"success": False, "error": "Not booted"}
    return await self._fs.copy_from_workspace(remote_path, local_path)
```

This keeps all file movement in one abstraction and reduces duplication.

---

### 5. Factor `boot` into smaller helpers

`boot` currently mixes wiring, availability, and validation. You can improve readability by extracting validation without changing logic:

```python
class BwrapBooter(ComputerBooter):
    ...

    async def _validate_sandbox(self) -> None:
        test_shl = await self._shell.exec(command="ls > /dev/null")
        if test_shl["exit_code"] != 0:
            raise RuntimeError(
                'BubbleWrap sandbox fails to exec test shell command "ls > /dev/null" '
                f"with stderr:\n{test_shl['stderr']}"
            )
        test_py = await self._python.exec(code="print('Yes')")
        if test_py["exit_code"] != 0:
            raise RuntimeError(
                'BubbleWrap sandbox fails to exec test python code "print(\'Yes\')" '
                f"with stderr:\n{test_py['stderr']}"
            )

    async def boot(self, session_id: str) -> None:
        workspace_dir = os.path.join(
            get_astrbot_temp_path(), f"sandbox_workspace_{session_id}"
        )
        os.makedirs(workspace_dir, exist_ok=True)

        self.config = BwrapConfig(
            workspace_dir=os.path.abspath(workspace_dir),
            rw_binds=self._rw_binds,
            ro_binds=self._ro_binds,
        )

        if not await self.available():
            raise RuntimeError(
                "BubbleWrap sandbox unavailable on current machine for no bwrap executable."
            )

        self._fs = HostBackedFileSystemComponent(self.config.workspace_dir)
        self._python = BwrapPythonComponent(self.config)
        self._shell = BwrapShellComponent(self.config)

        await self._validate_sandbox()
```

This keeps behavior identical while making `boot` easier to scan and maintain.

---

### 6. Reduce repetition in `build_bwrap_cmd`

The ro/rw bind loops repeat the same checks. A tiny helper clarifies the logic:

```python
def _append_bind(cmd: list[str], path: str, flag: str) -> None:
    if flag == "--bind":
        if path == "/" or path.startswith("/root"):
            return
    if os.path.exists(path):
        cmd.extend([flag, path, path])

def build_bwrap_cmd(config: BwrapConfig, script_cmd: list[str]) -> list[str]:
    cmd = ["bwrap"]
    if not config.share_net:
        cmd.append("--unshare-net")

    for path in config.ro_binds:
        _append_bind(cmd, path, "--ro-bind")

    for path in config.rw_binds:
        _append_bind(cmd, path, "--bind")

    cmd.extend([
        "--unshare-pid",
        "--unshare-ipc",
        "--unshare-uts",
        "--die-with-parent",
        "--dir", "/tmp",
        "--dir", "/var/tmp",
        "--proc", "/proc",
        "--dev", "/dev",
        "--bind", config.workspace_dir, config.workspace_dir,
    ])
    cmd.extend(["--", *script_cmd])
    return cmd
```

This keeps all existing guard conditions but cuts down on duplicated branches.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

return ""

frontmatter = "\n".join(lines[1:end_idx])
frontmatter = "\\n".join(lines[1:end_idx])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Joining with a literal "\n" string likely breaks YAML frontmatter parsing.

Using "\n" will insert the literal backslash-n characters into the frontmatter instead of real line breaks, which is likely to produce invalid or misparsed YAML for multi-line values. Unless there’s a strong reason to change the format, this should continue to join with real newlines ("\n").

Comment on lines +186 to +189
def _safe_path(self, path: str) -> str:
# Simply maps it. In a stricter implementation, we could verify it's inside workspace_dir.
# But for this implementation, we trust the agent or restrict to workspace_dir.
if not path.startswith("/"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 issue (security): Absolute paths bypass workspace restriction, allowing arbitrary host filesystem access.

Returning absolute paths unchanged lets callers use FileSystemComponent to access arbitrary host paths (e.g. /etc/passwd, /home/...), breaking the intended sandbox isolation. You should ensure all paths resolve under workspace_dir (e.g., normalize with os.path.realpath and check it’s under the workspace root), or very clearly document and strictly limit which absolute paths are allowed.

Comment on lines +27 to +36
try:
return output.decode("utf-8")
except (LookupError, UnicodeDecodeError):
pass

try:
return output.decode(preferred)
except (LookupError, UnicodeDecodeError):
pass

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Timeout handler assumes stdout may be bytes while text=True guarantees str.

With text=True, e.stdout is always a str (or None), so the isinstance(e.stdout, bytes) check and .decode() branch will never run. This dead code is misleading. You can simplify and keep behavior aligned with the non-timeout path, e.g.:

stdout = e.stdout or ""
return {
    "stdout": "" if silent else stdout,
    "stderr": f"Execution timed out after {timeout} seconds.",
    "exit_code": 1,
}

Comment on lines +278 to +279
if not await self.available():
raise RuntimeError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Check availability before constructing shell/python components and using bwrap.

You’re doing all the workspace and component setup before confirming bwrap exists, then immediately raising if it doesn’t. Please move the await self.available() check (and RuntimeError) ahead of creating HostBackedFileSystemComponent, BwrapPythonComponent, and BwrapShellComponent so we avoid unnecessary setup and keep the failure path clearer and cheaper.

await booter.boot("test_session_123")
assert booter.config is not None
assert os.path.exists(booter.config.workspace_dir)
await booter.shutdown()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add coverage for BwrapShellComponent background execution and environment handling

The current tests cover only basic execution and a single bind case. Please add tests that exercise more of BwrapShellComponent.exec’s behavior, including:

  • background=True (returns quickly, gives a PID, exit_code is None).
  • Merging of a custom env dict into the process environment (e.g., assert a variable appears in stdout).
  • cwd overrides when running in a subdirectory.
  • Respect for the timeout argument on a long-running command.

Even a couple of focused tests here would substantially reduce the risk of regressions in these paths.

Suggested implementation:

    @pytest.mark.asyncio
    async def test_bwrap_shell_exec_background(self):
        booter = BwrapBooter()
        await booter.boot("test_shell_background")

        # background=True should return quickly with a PID and no exit code yet
        res = await booter.shell.exec("sleep 5", background=True)
        assert res.get("exit_code") is None
        assert isinstance(res.get("pid"), int)

        await booter.shutdown()

    @pytest.mark.asyncio
    async def test_bwrap_shell_exec_env_merge(self):
        booter = BwrapBooter()
        await booter.boot("test_shell_env")

        # Custom env vars should be visible inside the sandboxed process
        res = await booter.shell.exec(
            "sh -lc 'echo \"$MY_TEST_VAR\"'",
            env={"MY_TEST_VAR": "bwrap_env_value"},
        )
        assert res["exit_code"] == 0
        assert "bwrap_env_value" in res["stdout"]

        await booter.shutdown()

    @pytest.mark.asyncio
    async def test_bwrap_shell_exec_cwd(self, tmp_path):
        booter = BwrapBooter()
        await booter.boot("test_shell_cwd")

        subdir = tmp_path / "subdir"
        subdir.mkdir()
        file_in_subdir = subdir / "here.txt"
        file_in_subdir.write_text("from_subdir")

        # Running in the subdirectory via cwd should see the file there
        res = await booter.shell.exec("ls", cwd=str(subdir))
        assert res["exit_code"] == 0
        assert "here.txt" in res["stdout"]

        await booter.shutdown()

    @pytest.mark.asyncio
    async def test_bwrap_shell_exec_timeout(self):
        booter = BwrapBooter()
        await booter.boot("test_shell_timeout")

        # A long-running command should respect the timeout argument
        with pytest.raises(asyncio.TimeoutError):
            await booter.shell.exec("sleep 60", timeout=0.1)

        await booter.shutdown()

    @pytest.mark.asyncio
    async def test_bwrap_available(self):
  1. Ensure asyncio is imported at the top of tests/unit/test_computer.py:
    • import asyncio
  2. If BwrapShellComponent.exec signals timeouts differently (e.g., returns a result dict with a timeout flag or specific exit code instead of raising asyncio.TimeoutError), adjust test_bwrap_shell_exec_timeout to assert on that behavior instead of using pytest.raises(asyncio.TimeoutError).
  3. If the sandbox does not have /bin/sh or the sleep/ls utilities available, replace the shell commands with equivalents that are known to be present in your bwrap environment.

@@ -0,0 +1,318 @@
from __future__ import annotations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider refactoring shared process execution, config initialization, filesystem operations, and validation helpers to centralize logic and remove duplicated code paths in the new bwrap booter implementation.

You can simplify and de-duplicate a few areas without changing behavior.


1. Centralize bwrap execution for Shell & Python

BwrapShellComponent.exec and BwrapPythonComponent.exec duplicate a lot of process logic. A small shared helper keeps behavior identical but reduces branching and makes it easier to change timeout/decoding semantics later.

def _run_bwrap(
    config: BwrapConfig,
    script_cmd: list[str],
    *,
    cwd: str | None = None,
    env: dict[str, str] | None = None,
    timeout: int | None = 30,
    background: bool = False,
    text: bool = False,
    silent: bool = False,
) -> dict[str, Any]:
    run_env = os.environ.copy()
    if env:
        run_env.update({str(k): str(v) for k, v in env.items()})

    working_dir = cwd or config.workspace_dir
    bwrap_cmd = build_bwrap_cmd(config, script_cmd)

    if background:
        proc = subprocess.Popen(
            bwrap_cmd,
            cwd=working_dir,
            env=run_env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return {"pid": proc.pid, "stdout": "", "stderr": "", "exit_code": None}

    try:
        result = subprocess.run(
            bwrap_cmd,
            cwd=working_dir,
            env=run_env,
            timeout=timeout,
            capture_output=True,
            text=text,
        )
        if text:
            stdout = "" if silent else (result.stdout or "")
            stderr = result.stderr or ""
        else:
            stdout = "" if silent else _decode_shell_output(result.stdout)
            stderr = _decode_shell_output(result.stderr)
        return {
            "stdout": stdout,
            "stderr": stderr,
            "exit_code": result.returncode,
        }
    except subprocess.TimeoutExpired as e:
        # Preserve existing behavior: best-effort decode
        raw_stdout = e.stdout
        if text:
            decoded = raw_stdout or ""
        else:
            decoded = _decode_shell_output(raw_stdout) if raw_stdout else ""
        return {
            "stdout": decoded,
            "stderr": f"Execution timed out after {timeout} seconds.",
            "exit_code": 1,
        }

Then both components become thin wrappers:

@dataclass
class BwrapShellComponent(ShellComponent):
    config: BwrapConfig

    async def exec(...):
        def _run() -> dict[str, Any]:
            script_cmd = ["/bin/sh", "-c", command] if shell else shlex.split(command)
            return _run_bwrap(
                self.config,
                script_cmd,
                cwd=cwd,
                env=env,
                timeout=timeout,
                background=background,
                text=False,
            )
        return await asyncio.to_thread(_run)


@dataclass
class BwrapPythonComponent(PythonComponent):
    config: BwrapConfig

    async def exec(...):
        def _run() -> dict[str, Any]:
            script_cmd = [os.environ.get("PYTHON", "python3"), "-c", code]
            return _run_bwrap(
                self.config,
                script_cmd,
                timeout=timeout,
                text=True,
                silent=silent,
            )
        return await asyncio.to_thread(_run)

This also gives you a single place to tweak env handling, timeouts, etc.


2. Make decoding strategy consistent

Right now BwrapShellComponent uses _decode_shell_output while BwrapPythonComponent relies on text=True and manual decode on timeout. With _run_bwrap above, you can standardize:

  • Use text=True for all cases where you want strings.
  • Use _decode_shell_output only when you deliberately want to work with bytes.

For example, for shell you can also switch to text=True and drop _decode_shell_output entirely if you don’t need the multi-encoding fallback:

# If you're comfortable with locale-based decoding:
return _run_bwrap(
    self.config,
    script_cmd,
    cwd=cwd,
    env=env,
    timeout=timeout,
    background=background,
    text=True,
)

Or keep _decode_shell_output only inside _run_bwrap for the non-text path so there’s exactly one decoding path.


3. Avoid in-place mutation in BwrapConfig.__post_init__

__post_init__ mutates self.ro_binds that comes from BwrapBooter, making it less obvious what’s user data vs defaults. You can preserve behavior but overwrite with a new list:

@dataclass
class BwrapConfig:
    workspace_dir: str
    ro_binds: list[str] = field(default_factory=list)
    rw_binds: list[str] = field(default_factory=list)
    share_net: bool = True

    def __post_init__(self):
        default_ro = ["/usr", "/lib", "/lib64", "/bin", "/etc", "/opt"]
        # preserve order, remove duplicates
        combined = [*self.ro_binds, *default_ro]
        seen = set()
        self.ro_binds = [p for p in combined if not (p in seen or seen.add(p))]

This keeps the effective bind set identical while eliminating hidden mutation of the list passed into the constructor.

Alternatively, you can push that logic into BwrapBooter.boot before constructing BwrapConfig, but the above is minimal and localized.


4. Consolidate file operations with HostBackedFileSystemComponent

BwrapBooter.upload_file/download_file partially duplicate filesystem logic already encapsulated in HostBackedFileSystemComponent. You can route these operations through _fs to avoid two code paths:

class HostBackedFileSystemComponent(FileSystemComponent):
    ...
    async def copy_to_workspace(self, src: str, dest_name: str) -> dict[str, Any]:
        target = os.path.join(self.workspace_dir, dest_name)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        try:
            shutil.copy2(src, target)
            return {"success": True, "file_path": target}
        except Exception as e:
            return {"success": False, "error": str(e)}

    async def copy_from_workspace(self, src: str, dest: str) -> dict[str, Any]:
        if not src.startswith("/"):
            src = os.path.join(self.workspace_dir, src)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy2(src, dest)
        return {"success": True, "file_path": dest}

Then BwrapBooter becomes thinner:

async def upload_file(self, path: str, file_name: str) -> dict:
    if not self._fs:
        return {"success": False, "error": "Not booted"}
    return await self._fs.copy_to_workspace(path, file_name)

async def download_file(self, remote_path: str, local_path: str) -> dict:
    if not self._fs:
        return {"success": False, "error": "Not booted"}
    return await self._fs.copy_from_workspace(remote_path, local_path)

This keeps all file movement in one abstraction and reduces duplication.


5. Factor boot into smaller helpers

boot currently mixes wiring, availability, and validation. You can improve readability by extracting validation without changing logic:

class BwrapBooter(ComputerBooter):
    ...

    async def _validate_sandbox(self) -> None:
        test_shl = await self._shell.exec(command="ls > /dev/null")
        if test_shl["exit_code"] != 0:
            raise RuntimeError(
                'BubbleWrap sandbox fails to exec test shell command "ls > /dev/null" '
                f"with stderr:\n{test_shl['stderr']}"
            )
        test_py = await self._python.exec(code="print('Yes')")
        if test_py["exit_code"] != 0:
            raise RuntimeError(
                'BubbleWrap sandbox fails to exec test python code "print(\'Yes\')" '
                f"with stderr:\n{test_py['stderr']}"
            )

    async def boot(self, session_id: str) -> None:
        workspace_dir = os.path.join(
            get_astrbot_temp_path(), f"sandbox_workspace_{session_id}"
        )
        os.makedirs(workspace_dir, exist_ok=True)

        self.config = BwrapConfig(
            workspace_dir=os.path.abspath(workspace_dir),
            rw_binds=self._rw_binds,
            ro_binds=self._ro_binds,
        )

        if not await self.available():
            raise RuntimeError(
                "BubbleWrap sandbox unavailable on current machine for no bwrap executable."
            )

        self._fs = HostBackedFileSystemComponent(self.config.workspace_dir)
        self._python = BwrapPythonComponent(self.config)
        self._shell = BwrapShellComponent(self.config)

        await self._validate_sandbox()

This keeps behavior identical while making boot easier to scan and maintain.


6. Reduce repetition in build_bwrap_cmd

The ro/rw bind loops repeat the same checks. A tiny helper clarifies the logic:

def _append_bind(cmd: list[str], path: str, flag: str) -> None:
    if flag == "--bind":
        if path == "/" or path.startswith("/root"):
            return
    if os.path.exists(path):
        cmd.extend([flag, path, path])

def build_bwrap_cmd(config: BwrapConfig, script_cmd: list[str]) -> list[str]:
    cmd = ["bwrap"]
    if not config.share_net:
        cmd.append("--unshare-net")

    for path in config.ro_binds:
        _append_bind(cmd, path, "--ro-bind")

    for path in config.rw_binds:
        _append_bind(cmd, path, "--bind")

    cmd.extend([
        "--unshare-pid",
        "--unshare-ipc",
        "--unshare-uts",
        "--die-with-parent",
        "--dir", "/tmp",
        "--dir", "/var/tmp",
        "--proc", "/proc",
        "--dev", "/dev",
        "--bind", config.workspace_dir, config.workspace_dir,
    ])
    cmd.extend(["--", *script_cmd])
    return cmd

This keeps all existing guard conditions but cuts down on duplicated branches.

@gemini-code-assist
Copy link
Contributor

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core The bug / feature is about astrbot's core, backend size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants