dev : Implement bubblewrap sandbox backend with availability detection and tests#6709
dev : Implement bubblewrap sandbox backend with availability detection and tests#6709LIghtJUNction wants to merge 7 commits intoAstrBotDevs:devfrom
Conversation
- Based on Linux Namespace for resource isolation - Runs on local computer, with no privilege required - Only supports Linux as namespace & bubblewrap are not present on other platforms. TODO: - Fix dashboard presentation. Why change on src does not affect what is really displayed? - Strenghthen backend availability detection. One known issue is, on some platforms like Ubuntu 24.04, bubblewrap is banned by system guards, even when it's shipped by package manager. A complete detector may contain : 1. run the command with cmdline used by the booter. Return True if succsed. 2. If false, do bottom-up reason detection. Namespace not compiled to kernel? Specific kernel parameters not set? Banned by safety guard? The availability detector should give the user a clear information on why this sandbox backend fails. These work may require helps from frontend developers. It is tested to be usable on my computer, with non-persistent environment(forget on every command) and persistent file storage.
TODO add: - add plugin utility to change ro and rw bind in cmdline - make bind dirs dict instead of list to manually map mount point
in older commits, ro_bind = ['/'] makes skill sync crash. This commit fixes it and adds detection.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a Bubblewrap sandbox environment to Astrbot, enhancing security and configurability for code execution. It includes the implementation of the sandbox, updates to the configuration schema, integration into the computer client, localization support, and testing enhancements, providing a comprehensive solution for isolated code execution. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
|
Related Documentation 1 document(s) may need updating based on files changed in this PR: AstrBotTeam's Space pr4697的改动View Suggested Changes@@ -117,9 +117,40 @@
- 如果 `computer_use_runtime` 设置为 `"local"`,立即返回本地 booter,无需检查 sandbox booter 配置
- 如果 `computer_use_runtime` 设置为 `"none"`,抛出 `RuntimeError` 提示沙箱运行时已禁用
-- 只有在以上条件均不满足时,才继续检查 sandbox booter 配置(如 `shipyard_neo` 等)
+- 只有在以上条件均不满足时,才继续检查 sandbox booter 配置(如 `shipyard_neo`、`shipyard`、`bwrap` 等)
该优化避免了在运行时类型已明确时进行不必要的沙箱配置检查,提高了边缘场景的性能。
+
+**Bubblewrap 沙箱支持(PR #6709)**
+
+[PR #6709](https://github.com/AstrBotDevs/AstrBot/pull/6709) 新增了基于 Bubblewrap 的沙箱环境(`bwrap`),提供安全且可配置的代码执行隔离层,作为 Docker 沙箱的替代方案。
+
+**核心特性:**
+- **轻量级隔离**:使用 Bubblewrap 提供命名空间级别的沙箱隔离,无需 Docker 容器
+- **自定义挂载**:支持配置额外的读写和只读挂载目录,灵活控制沙箱可访问的文件系统资源
+- **自动检测**:系统在运行时自动检测 Bubblewrap 可用性,若未安装则提供明确的错误提示
+- **完整工具支持**:提供 Python、Shell 和文件系统操作能力,与其他沙箱后端保持一致
+
+**配置项(provider_settings.sandbox):**
+- `booter: "bwrap"`:选择 Bubblewrap 作为沙箱驱动器
+- `bwrap_rw_binds`:读写挂载目录列表,用于指定需要在沙箱内以读写权限挂载的目录或文件(默认 `/tmp` 和 `/var/tmp` 已自动配置为读写)
+- `bwrap_ro_binds`:只读挂载目录列表,用于指定需要在沙箱内以只读权限挂载的目录或文件(默认系统目录 `/usr`、`/etc`、`/opt` 已自动配置为只读)
+
+**技术实现:**
+- `BwrapBooter` 类:实现 Bubblewrap 沙箱后端,管理沙箱生命周期和资源清理
+- `BwrapConfig` 数据类:封装 Bubblewrap 配置,包括工作区目录、挂载设置和网络隔离选项
+- `build_bwrap_cmd()` 函数:根据配置构建 Bubblewrap 命令行,确保正确的隔离和挂载
+- `HostBackedFileSystemComponent`:文件操作直接在宿主机工作区执行,提供极快的 I/O 性能
+- 自动可用性检测:系统在启动时检查 `bwrap` 可执行文件是否存在(Windows 系统不支持)
+
+**使用限制:**
+- 仅支持 Linux 和 macOS 系统(通过检测 `bwrap` 命令可用性)
+- Windows 系统下 `available()` 方法返回 `false`
+- 需要系统中已安装 Bubblewrap(在大多数 Linux 发行版中可通过包管理器安装)
+
+**测试覆盖:**
+- `tests/unit/test_computer.py` 包含 Bubblewrap 实现的完整单元测试
+- 测试场景包括配置构建、沙箱生命周期、Shell/Python 执行、文件系统操作和上传/下载功能
**运行时工具列表:**
Note: You must be authenticated to accept/decline updates. |
There was a problem hiding this comment.
Hey - I've found 6 issues, and left some high level feedback:
- The change to
parse_descriptionnow joins frontmatter lines with a literal"\n"instead of real newlines ("\n"→"\\n"), which is likely to break YAML parsing of the frontmatter; consider keeping real newline characters or explicitly explaining why YAML should receive escaped newlines. - In
HostBackedFileSystemComponent._safe_path, absolute paths are passed through without any confinement, which allows file operations outside the sandbox workspace; if stronger isolation is desired, consider normalizing and enforcing that all paths resolve underworkspace_dir(e.g., viaos.path.realpathchecks) and rejecting anything outside. - For
BwrapShellComponent.execwithbackground=True, the createdPopenprocess is never tracked or reaped; if long-running background commands are expected, consider adding some form of lifecycle management (e.g., tracking PIDs or usingsubprocess.DEVNULLplus periodic reaping) to avoid lingering processes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The change to `parse_description` now joins frontmatter lines with a literal `"\n"` instead of real newlines (`"\n"` → `"\\n"`), which is likely to break YAML parsing of the frontmatter; consider keeping real newline characters or explicitly explaining why YAML should receive escaped newlines.
- In `HostBackedFileSystemComponent._safe_path`, absolute paths are passed through without any confinement, which allows file operations outside the sandbox workspace; if stronger isolation is desired, consider normalizing and enforcing that all paths resolve under `workspace_dir` (e.g., via `os.path.realpath` checks) and rejecting anything outside.
- For `BwrapShellComponent.exec` with `background=True`, the created `Popen` process is never tracked or reaped; if long-running background commands are expected, consider adding some form of lifecycle management (e.g., tracking PIDs or using `subprocess.DEVNULL` plus periodic reaping) to avoid lingering processes.
## Individual Comments
### Comment 1
<location path="astrbot/core/computer/computer_client.py" line_range="217" />
<code_context>
return ""
- frontmatter = "\n".join(lines[1:end_idx])
+ frontmatter = "\\n".join(lines[1:end_idx])
try:
import yaml
</code_context>
<issue_to_address>
**issue (bug_risk):** Joining with a literal "\n" string likely breaks YAML frontmatter parsing.
Using "\\n" will insert the literal backslash-n characters into the frontmatter instead of real line breaks, which is likely to produce invalid or misparsed YAML for multi-line values. Unless there’s a strong reason to change the format, this should continue to join with real newlines ("\n").
</issue_to_address>
### Comment 2
<location path="astrbot/core/computer/booters/bwrap.py" line_range="186-189" />
<code_context>
+ """File operations happen safely on host mapping to workspace, making I/O extremely fast."""
+ workspace_dir: str
+
+ def _safe_path(self, path: str) -> str:
+ # Simply maps it. In a stricter implementation, we could verify it's inside workspace_dir.
+ # But for this implementation, we trust the agent or restrict to workspace_dir.
+ if not path.startswith("/"):
+ path = os.path.join(self.workspace_dir, path)
+ return path
</code_context>
<issue_to_address>
**🚨 issue (security):** Absolute paths bypass workspace restriction, allowing arbitrary host filesystem access.
Returning absolute paths unchanged lets callers use FileSystemComponent to access arbitrary host paths (e.g. `/etc/passwd`, `/home/...`), breaking the intended sandbox isolation. You should ensure all paths resolve under `workspace_dir` (e.g., normalize with `os.path.realpath` and check it’s under the workspace root), or very clearly document and strictly limit which absolute paths are allowed.
</issue_to_address>
### Comment 3
<location path="astrbot/core/computer/booters/bwrap.py" line_range="27-36" />
<code_context>
+ ) -> dict[str, Any]:
+ def _run() -> dict[str, Any]:
+ bwrap_cmd = build_bwrap_cmd(self.config, [os.environ.get("PYTHON", "python3"), "-c", code])
+ try:
+ result = subprocess.run(
+ bwrap_cmd,
+ timeout=timeout,
+ capture_output=True,
+ text=True,
+ )
+ stdout = "" if silent else result.stdout
+ return {
+ "stdout": stdout,
+ "stderr": result.stderr,
+ "exit_code": result.returncode,
+ }
+ except subprocess.TimeoutExpired as e:
+ return {
+ "stdout": e.stdout.decode() if isinstance(e.stdout, bytes) else str(e.stdout or ""),
</code_context>
<issue_to_address>
**suggestion:** Timeout handler assumes `stdout` may be bytes while `text=True` guarantees str.
With `text=True`, `e.stdout` is always a `str` (or `None`), so the `isinstance(e.stdout, bytes)` check and `.decode()` branch will never run. This dead code is misleading. You can simplify and keep behavior aligned with the non-timeout path, e.g.:
```python
stdout = e.stdout or ""
return {
"stdout": "" if silent else stdout,
"stderr": f"Execution timed out after {timeout} seconds.",
"exit_code": 1,
}
```
</issue_to_address>
### Comment 4
<location path="astrbot/core/computer/booters/bwrap.py" line_range="278-279" />
<code_context>
+ self._fs = HostBackedFileSystemComponent(self.config.workspace_dir)
+ self._python = BwrapPythonComponent(self.config)
+ self._shell = BwrapShellComponent(self.config)
+ if not await self.available():
+ raise RuntimeError(
+ "BubbleWrap sandbox unavailable on current machine for no bwrap executable.")
+ test_shl = await self._shell.exec(command = "ls > /dev/null")
</code_context>
<issue_to_address>
**suggestion:** Check availability before constructing shell/python components and using bwrap.
You’re doing all the workspace and component setup before confirming `bwrap` exists, then immediately raising if it doesn’t. Please move the `await self.available()` check (and `RuntimeError`) ahead of creating `HostBackedFileSystemComponent`, `BwrapPythonComponent`, and `BwrapShellComponent` so we avoid unnecessary setup and keep the failure path clearer and cheaper.
</issue_to_address>
### Comment 5
<location path="tests/unit/test_computer.py" line_range="964-930" />
<code_context>
+ await booter.shutdown()
+
+@pytest.mark.skipif(shutil.which("bwrap") is None, reason="bwrap is not installed")
+class TestBwrapShellComponent:
+ @pytest.mark.asyncio
+ async def test_bwrap_shell_exec(self):
+ booter = BwrapBooter()
+ await booter.boot("test_shell")
+ res = await booter.shell.exec("echo 'hello bwrap'")
+ assert res["exit_code"] == 0
+ assert "hello bwrap" in res["stdout"]
+ await booter.shutdown()
+
+ @pytest.mark.asyncio
</code_context>
<issue_to_address>
**suggestion (testing):** Add coverage for BwrapShellComponent background execution and environment handling
The current tests cover only basic execution and a single bind case. Please add tests that exercise more of `BwrapShellComponent.exec`’s behavior, including:
- `background=True` (returns quickly, gives a PID, `exit_code` is `None`).
- Merging of a custom `env` dict into the process environment (e.g., assert a variable appears in `stdout`).
- `cwd` overrides when running in a subdirectory.
- Respect for the `timeout` argument on a long-running command.
Even a couple of focused tests here would substantially reduce the risk of regressions in these paths.
Suggested implementation:
```python
@pytest.mark.asyncio
async def test_bwrap_shell_exec_background(self):
booter = BwrapBooter()
await booter.boot("test_shell_background")
# background=True should return quickly with a PID and no exit code yet
res = await booter.shell.exec("sleep 5", background=True)
assert res.get("exit_code") is None
assert isinstance(res.get("pid"), int)
await booter.shutdown()
@pytest.mark.asyncio
async def test_bwrap_shell_exec_env_merge(self):
booter = BwrapBooter()
await booter.boot("test_shell_env")
# Custom env vars should be visible inside the sandboxed process
res = await booter.shell.exec(
"sh -lc 'echo \"$MY_TEST_VAR\"'",
env={"MY_TEST_VAR": "bwrap_env_value"},
)
assert res["exit_code"] == 0
assert "bwrap_env_value" in res["stdout"]
await booter.shutdown()
@pytest.mark.asyncio
async def test_bwrap_shell_exec_cwd(self, tmp_path):
booter = BwrapBooter()
await booter.boot("test_shell_cwd")
subdir = tmp_path / "subdir"
subdir.mkdir()
file_in_subdir = subdir / "here.txt"
file_in_subdir.write_text("from_subdir")
# Running in the subdirectory via cwd should see the file there
res = await booter.shell.exec("ls", cwd=str(subdir))
assert res["exit_code"] == 0
assert "here.txt" in res["stdout"]
await booter.shutdown()
@pytest.mark.asyncio
async def test_bwrap_shell_exec_timeout(self):
booter = BwrapBooter()
await booter.boot("test_shell_timeout")
# A long-running command should respect the timeout argument
with pytest.raises(asyncio.TimeoutError):
await booter.shell.exec("sleep 60", timeout=0.1)
await booter.shutdown()
@pytest.mark.asyncio
async def test_bwrap_available(self):
```
1. Ensure `asyncio` is imported at the top of `tests/unit/test_computer.py`:
- `import asyncio`
2. If `BwrapShellComponent.exec` signals timeouts differently (e.g., returns a result dict with a timeout flag or specific exit code instead of raising `asyncio.TimeoutError`), adjust `test_bwrap_shell_exec_timeout` to assert on that behavior instead of using `pytest.raises(asyncio.TimeoutError)`.
3. If the sandbox does not have `/bin/sh` or the `sleep`/`ls` utilities available, replace the shell commands with equivalents that are known to be present in your bwrap environment.
</issue_to_address>
### Comment 6
<location path="astrbot/core/computer/booters/bwrap.py" line_range="1" />
<code_context>
+from __future__ import annotations
+
+import asyncio
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring shared process execution, config initialization, filesystem operations, and validation helpers to centralize logic and remove duplicated code paths in the new bwrap booter implementation.
You can simplify and de-duplicate a few areas without changing behavior.
---
### 1. Centralize bwrap execution for Shell & Python
`BwrapShellComponent.exec` and `BwrapPythonComponent.exec` duplicate a lot of process logic. A small shared helper keeps behavior identical but reduces branching and makes it easier to change timeout/decoding semantics later.
```python
def _run_bwrap(
config: BwrapConfig,
script_cmd: list[str],
*,
cwd: str | None = None,
env: dict[str, str] | None = None,
timeout: int | None = 30,
background: bool = False,
text: bool = False,
silent: bool = False,
) -> dict[str, Any]:
run_env = os.environ.copy()
if env:
run_env.update({str(k): str(v) for k, v in env.items()})
working_dir = cwd or config.workspace_dir
bwrap_cmd = build_bwrap_cmd(config, script_cmd)
if background:
proc = subprocess.Popen(
bwrap_cmd,
cwd=working_dir,
env=run_env,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
return {"pid": proc.pid, "stdout": "", "stderr": "", "exit_code": None}
try:
result = subprocess.run(
bwrap_cmd,
cwd=working_dir,
env=run_env,
timeout=timeout,
capture_output=True,
text=text,
)
if text:
stdout = "" if silent else (result.stdout or "")
stderr = result.stderr or ""
else:
stdout = "" if silent else _decode_shell_output(result.stdout)
stderr = _decode_shell_output(result.stderr)
return {
"stdout": stdout,
"stderr": stderr,
"exit_code": result.returncode,
}
except subprocess.TimeoutExpired as e:
# Preserve existing behavior: best-effort decode
raw_stdout = e.stdout
if text:
decoded = raw_stdout or ""
else:
decoded = _decode_shell_output(raw_stdout) if raw_stdout else ""
return {
"stdout": decoded,
"stderr": f"Execution timed out after {timeout} seconds.",
"exit_code": 1,
}
```
Then both components become thin wrappers:
```python
@dataclass
class BwrapShellComponent(ShellComponent):
config: BwrapConfig
async def exec(...):
def _run() -> dict[str, Any]:
script_cmd = ["/bin/sh", "-c", command] if shell else shlex.split(command)
return _run_bwrap(
self.config,
script_cmd,
cwd=cwd,
env=env,
timeout=timeout,
background=background,
text=False,
)
return await asyncio.to_thread(_run)
@dataclass
class BwrapPythonComponent(PythonComponent):
config: BwrapConfig
async def exec(...):
def _run() -> dict[str, Any]:
script_cmd = [os.environ.get("PYTHON", "python3"), "-c", code]
return _run_bwrap(
self.config,
script_cmd,
timeout=timeout,
text=True,
silent=silent,
)
return await asyncio.to_thread(_run)
```
This also gives you a single place to tweak env handling, timeouts, etc.
---
### 2. Make decoding strategy consistent
Right now `BwrapShellComponent` uses `_decode_shell_output` while `BwrapPythonComponent` relies on `text=True` and manual decode on timeout. With `_run_bwrap` above, you can standardize:
- Use `text=True` for all cases where you want strings.
- Use `_decode_shell_output` only when you deliberately want to work with bytes.
For example, for shell you can also switch to `text=True` and drop `_decode_shell_output` entirely if you don’t need the multi-encoding fallback:
```python
# If you're comfortable with locale-based decoding:
return _run_bwrap(
self.config,
script_cmd,
cwd=cwd,
env=env,
timeout=timeout,
background=background,
text=True,
)
```
Or keep `_decode_shell_output` only inside `_run_bwrap` for the non-text path so there’s exactly one decoding path.
---
### 3. Avoid in-place mutation in `BwrapConfig.__post_init__`
`__post_init__` mutates `self.ro_binds` that comes from `BwrapBooter`, making it less obvious what’s user data vs defaults. You can preserve behavior but overwrite with a new list:
```python
@dataclass
class BwrapConfig:
workspace_dir: str
ro_binds: list[str] = field(default_factory=list)
rw_binds: list[str] = field(default_factory=list)
share_net: bool = True
def __post_init__(self):
default_ro = ["/usr", "/lib", "/lib64", "/bin", "/etc", "/opt"]
# preserve order, remove duplicates
combined = [*self.ro_binds, *default_ro]
seen = set()
self.ro_binds = [p for p in combined if not (p in seen or seen.add(p))]
```
This keeps the effective bind set identical while eliminating hidden mutation of the list passed into the constructor.
Alternatively, you can push that logic into `BwrapBooter.boot` before constructing `BwrapConfig`, but the above is minimal and localized.
---
### 4. Consolidate file operations with `HostBackedFileSystemComponent`
`BwrapBooter.upload_file`/`download_file` partially duplicate filesystem logic already encapsulated in `HostBackedFileSystemComponent`. You can route these operations through `_fs` to avoid two code paths:
```python
class HostBackedFileSystemComponent(FileSystemComponent):
...
async def copy_to_workspace(self, src: str, dest_name: str) -> dict[str, Any]:
target = os.path.join(self.workspace_dir, dest_name)
os.makedirs(os.path.dirname(target), exist_ok=True)
try:
shutil.copy2(src, target)
return {"success": True, "file_path": target}
except Exception as e:
return {"success": False, "error": str(e)}
async def copy_from_workspace(self, src: str, dest: str) -> dict[str, Any]:
if not src.startswith("/"):
src = os.path.join(self.workspace_dir, src)
os.makedirs(os.path.dirname(dest), exist_ok=True)
shutil.copy2(src, dest)
return {"success": True, "file_path": dest}
```
Then `BwrapBooter` becomes thinner:
```python
async def upload_file(self, path: str, file_name: str) -> dict:
if not self._fs:
return {"success": False, "error": "Not booted"}
return await self._fs.copy_to_workspace(path, file_name)
async def download_file(self, remote_path: str, local_path: str) -> dict:
if not self._fs:
return {"success": False, "error": "Not booted"}
return await self._fs.copy_from_workspace(remote_path, local_path)
```
This keeps all file movement in one abstraction and reduces duplication.
---
### 5. Factor `boot` into smaller helpers
`boot` currently mixes wiring, availability, and validation. You can improve readability by extracting validation without changing logic:
```python
class BwrapBooter(ComputerBooter):
...
async def _validate_sandbox(self) -> None:
test_shl = await self._shell.exec(command="ls > /dev/null")
if test_shl["exit_code"] != 0:
raise RuntimeError(
'BubbleWrap sandbox fails to exec test shell command "ls > /dev/null" '
f"with stderr:\n{test_shl['stderr']}"
)
test_py = await self._python.exec(code="print('Yes')")
if test_py["exit_code"] != 0:
raise RuntimeError(
'BubbleWrap sandbox fails to exec test python code "print(\'Yes\')" '
f"with stderr:\n{test_py['stderr']}"
)
async def boot(self, session_id: str) -> None:
workspace_dir = os.path.join(
get_astrbot_temp_path(), f"sandbox_workspace_{session_id}"
)
os.makedirs(workspace_dir, exist_ok=True)
self.config = BwrapConfig(
workspace_dir=os.path.abspath(workspace_dir),
rw_binds=self._rw_binds,
ro_binds=self._ro_binds,
)
if not await self.available():
raise RuntimeError(
"BubbleWrap sandbox unavailable on current machine for no bwrap executable."
)
self._fs = HostBackedFileSystemComponent(self.config.workspace_dir)
self._python = BwrapPythonComponent(self.config)
self._shell = BwrapShellComponent(self.config)
await self._validate_sandbox()
```
This keeps behavior identical while making `boot` easier to scan and maintain.
---
### 6. Reduce repetition in `build_bwrap_cmd`
The ro/rw bind loops repeat the same checks. A tiny helper clarifies the logic:
```python
def _append_bind(cmd: list[str], path: str, flag: str) -> None:
if flag == "--bind":
if path == "/" or path.startswith("/root"):
return
if os.path.exists(path):
cmd.extend([flag, path, path])
def build_bwrap_cmd(config: BwrapConfig, script_cmd: list[str]) -> list[str]:
cmd = ["bwrap"]
if not config.share_net:
cmd.append("--unshare-net")
for path in config.ro_binds:
_append_bind(cmd, path, "--ro-bind")
for path in config.rw_binds:
_append_bind(cmd, path, "--bind")
cmd.extend([
"--unshare-pid",
"--unshare-ipc",
"--unshare-uts",
"--die-with-parent",
"--dir", "/tmp",
"--dir", "/var/tmp",
"--proc", "/proc",
"--dev", "/dev",
"--bind", config.workspace_dir, config.workspace_dir,
])
cmd.extend(["--", *script_cmd])
return cmd
```
This keeps all existing guard conditions but cuts down on duplicated branches.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| return "" | ||
|
|
||
| frontmatter = "\n".join(lines[1:end_idx]) | ||
| frontmatter = "\\n".join(lines[1:end_idx]) |
There was a problem hiding this comment.
issue (bug_risk): Joining with a literal "\n" string likely breaks YAML frontmatter parsing.
Using "\n" will insert the literal backslash-n characters into the frontmatter instead of real line breaks, which is likely to produce invalid or misparsed YAML for multi-line values. Unless there’s a strong reason to change the format, this should continue to join with real newlines ("\n").
| def _safe_path(self, path: str) -> str: | ||
| # Simply maps it. In a stricter implementation, we could verify it's inside workspace_dir. | ||
| # But for this implementation, we trust the agent or restrict to workspace_dir. | ||
| if not path.startswith("/"): |
There was a problem hiding this comment.
🚨 issue (security): Absolute paths bypass workspace restriction, allowing arbitrary host filesystem access.
Returning absolute paths unchanged lets callers use FileSystemComponent to access arbitrary host paths (e.g. /etc/passwd, /home/...), breaking the intended sandbox isolation. You should ensure all paths resolve under workspace_dir (e.g., normalize with os.path.realpath and check it’s under the workspace root), or very clearly document and strictly limit which absolute paths are allowed.
| try: | ||
| return output.decode("utf-8") | ||
| except (LookupError, UnicodeDecodeError): | ||
| pass | ||
|
|
||
| try: | ||
| return output.decode(preferred) | ||
| except (LookupError, UnicodeDecodeError): | ||
| pass | ||
|
|
There was a problem hiding this comment.
suggestion: Timeout handler assumes stdout may be bytes while text=True guarantees str.
With text=True, e.stdout is always a str (or None), so the isinstance(e.stdout, bytes) check and .decode() branch will never run. This dead code is misleading. You can simplify and keep behavior aligned with the non-timeout path, e.g.:
stdout = e.stdout or ""
return {
"stdout": "" if silent else stdout,
"stderr": f"Execution timed out after {timeout} seconds.",
"exit_code": 1,
}| if not await self.available(): | ||
| raise RuntimeError( |
There was a problem hiding this comment.
suggestion: Check availability before constructing shell/python components and using bwrap.
You’re doing all the workspace and component setup before confirming bwrap exists, then immediately raising if it doesn’t. Please move the await self.available() check (and RuntimeError) ahead of creating HostBackedFileSystemComponent, BwrapPythonComponent, and BwrapShellComponent so we avoid unnecessary setup and keep the failure path clearer and cheaper.
| await booter.boot("test_session_123") | ||
| assert booter.config is not None | ||
| assert os.path.exists(booter.config.workspace_dir) | ||
| await booter.shutdown() |
There was a problem hiding this comment.
suggestion (testing): Add coverage for BwrapShellComponent background execution and environment handling
The current tests cover only basic execution and a single bind case. Please add tests that exercise more of BwrapShellComponent.exec’s behavior, including:
background=True(returns quickly, gives a PID,exit_codeisNone).- Merging of a custom
envdict into the process environment (e.g., assert a variable appears instdout). cwdoverrides when running in a subdirectory.- Respect for the
timeoutargument on a long-running command.
Even a couple of focused tests here would substantially reduce the risk of regressions in these paths.
Suggested implementation:
@pytest.mark.asyncio
async def test_bwrap_shell_exec_background(self):
booter = BwrapBooter()
await booter.boot("test_shell_background")
# background=True should return quickly with a PID and no exit code yet
res = await booter.shell.exec("sleep 5", background=True)
assert res.get("exit_code") is None
assert isinstance(res.get("pid"), int)
await booter.shutdown()
@pytest.mark.asyncio
async def test_bwrap_shell_exec_env_merge(self):
booter = BwrapBooter()
await booter.boot("test_shell_env")
# Custom env vars should be visible inside the sandboxed process
res = await booter.shell.exec(
"sh -lc 'echo \"$MY_TEST_VAR\"'",
env={"MY_TEST_VAR": "bwrap_env_value"},
)
assert res["exit_code"] == 0
assert "bwrap_env_value" in res["stdout"]
await booter.shutdown()
@pytest.mark.asyncio
async def test_bwrap_shell_exec_cwd(self, tmp_path):
booter = BwrapBooter()
await booter.boot("test_shell_cwd")
subdir = tmp_path / "subdir"
subdir.mkdir()
file_in_subdir = subdir / "here.txt"
file_in_subdir.write_text("from_subdir")
# Running in the subdirectory via cwd should see the file there
res = await booter.shell.exec("ls", cwd=str(subdir))
assert res["exit_code"] == 0
assert "here.txt" in res["stdout"]
await booter.shutdown()
@pytest.mark.asyncio
async def test_bwrap_shell_exec_timeout(self):
booter = BwrapBooter()
await booter.boot("test_shell_timeout")
# A long-running command should respect the timeout argument
with pytest.raises(asyncio.TimeoutError):
await booter.shell.exec("sleep 60", timeout=0.1)
await booter.shutdown()
@pytest.mark.asyncio
async def test_bwrap_available(self):- Ensure
asynciois imported at the top oftests/unit/test_computer.py:import asyncio
- If
BwrapShellComponent.execsignals timeouts differently (e.g., returns a result dict with a timeout flag or specific exit code instead of raisingasyncio.TimeoutError), adjusttest_bwrap_shell_exec_timeoutto assert on that behavior instead of usingpytest.raises(asyncio.TimeoutError). - If the sandbox does not have
/bin/shor thesleep/lsutilities available, replace the shell commands with equivalents that are known to be present in your bwrap environment.
| @@ -0,0 +1,318 @@ | |||
| from __future__ import annotations | |||
There was a problem hiding this comment.
issue (complexity): Consider refactoring shared process execution, config initialization, filesystem operations, and validation helpers to centralize logic and remove duplicated code paths in the new bwrap booter implementation.
You can simplify and de-duplicate a few areas without changing behavior.
1. Centralize bwrap execution for Shell & Python
BwrapShellComponent.exec and BwrapPythonComponent.exec duplicate a lot of process logic. A small shared helper keeps behavior identical but reduces branching and makes it easier to change timeout/decoding semantics later.
def _run_bwrap(
config: BwrapConfig,
script_cmd: list[str],
*,
cwd: str | None = None,
env: dict[str, str] | None = None,
timeout: int | None = 30,
background: bool = False,
text: bool = False,
silent: bool = False,
) -> dict[str, Any]:
run_env = os.environ.copy()
if env:
run_env.update({str(k): str(v) for k, v in env.items()})
working_dir = cwd or config.workspace_dir
bwrap_cmd = build_bwrap_cmd(config, script_cmd)
if background:
proc = subprocess.Popen(
bwrap_cmd,
cwd=working_dir,
env=run_env,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
return {"pid": proc.pid, "stdout": "", "stderr": "", "exit_code": None}
try:
result = subprocess.run(
bwrap_cmd,
cwd=working_dir,
env=run_env,
timeout=timeout,
capture_output=True,
text=text,
)
if text:
stdout = "" if silent else (result.stdout or "")
stderr = result.stderr or ""
else:
stdout = "" if silent else _decode_shell_output(result.stdout)
stderr = _decode_shell_output(result.stderr)
return {
"stdout": stdout,
"stderr": stderr,
"exit_code": result.returncode,
}
except subprocess.TimeoutExpired as e:
# Preserve existing behavior: best-effort decode
raw_stdout = e.stdout
if text:
decoded = raw_stdout or ""
else:
decoded = _decode_shell_output(raw_stdout) if raw_stdout else ""
return {
"stdout": decoded,
"stderr": f"Execution timed out after {timeout} seconds.",
"exit_code": 1,
}Then both components become thin wrappers:
@dataclass
class BwrapShellComponent(ShellComponent):
config: BwrapConfig
async def exec(...):
def _run() -> dict[str, Any]:
script_cmd = ["/bin/sh", "-c", command] if shell else shlex.split(command)
return _run_bwrap(
self.config,
script_cmd,
cwd=cwd,
env=env,
timeout=timeout,
background=background,
text=False,
)
return await asyncio.to_thread(_run)
@dataclass
class BwrapPythonComponent(PythonComponent):
config: BwrapConfig
async def exec(...):
def _run() -> dict[str, Any]:
script_cmd = [os.environ.get("PYTHON", "python3"), "-c", code]
return _run_bwrap(
self.config,
script_cmd,
timeout=timeout,
text=True,
silent=silent,
)
return await asyncio.to_thread(_run)This also gives you a single place to tweak env handling, timeouts, etc.
2. Make decoding strategy consistent
Right now BwrapShellComponent uses _decode_shell_output while BwrapPythonComponent relies on text=True and manual decode on timeout. With _run_bwrap above, you can standardize:
- Use
text=Truefor all cases where you want strings. - Use
_decode_shell_outputonly when you deliberately want to work with bytes.
For example, for shell you can also switch to text=True and drop _decode_shell_output entirely if you don’t need the multi-encoding fallback:
# If you're comfortable with locale-based decoding:
return _run_bwrap(
self.config,
script_cmd,
cwd=cwd,
env=env,
timeout=timeout,
background=background,
text=True,
)Or keep _decode_shell_output only inside _run_bwrap for the non-text path so there’s exactly one decoding path.
3. Avoid in-place mutation in BwrapConfig.__post_init__
__post_init__ mutates self.ro_binds that comes from BwrapBooter, making it less obvious what’s user data vs defaults. You can preserve behavior but overwrite with a new list:
@dataclass
class BwrapConfig:
workspace_dir: str
ro_binds: list[str] = field(default_factory=list)
rw_binds: list[str] = field(default_factory=list)
share_net: bool = True
def __post_init__(self):
default_ro = ["/usr", "/lib", "/lib64", "/bin", "/etc", "/opt"]
# preserve order, remove duplicates
combined = [*self.ro_binds, *default_ro]
seen = set()
self.ro_binds = [p for p in combined if not (p in seen or seen.add(p))]This keeps the effective bind set identical while eliminating hidden mutation of the list passed into the constructor.
Alternatively, you can push that logic into BwrapBooter.boot before constructing BwrapConfig, but the above is minimal and localized.
4. Consolidate file operations with HostBackedFileSystemComponent
BwrapBooter.upload_file/download_file partially duplicate filesystem logic already encapsulated in HostBackedFileSystemComponent. You can route these operations through _fs to avoid two code paths:
class HostBackedFileSystemComponent(FileSystemComponent):
...
async def copy_to_workspace(self, src: str, dest_name: str) -> dict[str, Any]:
target = os.path.join(self.workspace_dir, dest_name)
os.makedirs(os.path.dirname(target), exist_ok=True)
try:
shutil.copy2(src, target)
return {"success": True, "file_path": target}
except Exception as e:
return {"success": False, "error": str(e)}
async def copy_from_workspace(self, src: str, dest: str) -> dict[str, Any]:
if not src.startswith("/"):
src = os.path.join(self.workspace_dir, src)
os.makedirs(os.path.dirname(dest), exist_ok=True)
shutil.copy2(src, dest)
return {"success": True, "file_path": dest}Then BwrapBooter becomes thinner:
async def upload_file(self, path: str, file_name: str) -> dict:
if not self._fs:
return {"success": False, "error": "Not booted"}
return await self._fs.copy_to_workspace(path, file_name)
async def download_file(self, remote_path: str, local_path: str) -> dict:
if not self._fs:
return {"success": False, "error": "Not booted"}
return await self._fs.copy_from_workspace(remote_path, local_path)This keeps all file movement in one abstraction and reduces duplication.
5. Factor boot into smaller helpers
boot currently mixes wiring, availability, and validation. You can improve readability by extracting validation without changing logic:
class BwrapBooter(ComputerBooter):
...
async def _validate_sandbox(self) -> None:
test_shl = await self._shell.exec(command="ls > /dev/null")
if test_shl["exit_code"] != 0:
raise RuntimeError(
'BubbleWrap sandbox fails to exec test shell command "ls > /dev/null" '
f"with stderr:\n{test_shl['stderr']}"
)
test_py = await self._python.exec(code="print('Yes')")
if test_py["exit_code"] != 0:
raise RuntimeError(
'BubbleWrap sandbox fails to exec test python code "print(\'Yes\')" '
f"with stderr:\n{test_py['stderr']}"
)
async def boot(self, session_id: str) -> None:
workspace_dir = os.path.join(
get_astrbot_temp_path(), f"sandbox_workspace_{session_id}"
)
os.makedirs(workspace_dir, exist_ok=True)
self.config = BwrapConfig(
workspace_dir=os.path.abspath(workspace_dir),
rw_binds=self._rw_binds,
ro_binds=self._ro_binds,
)
if not await self.available():
raise RuntimeError(
"BubbleWrap sandbox unavailable on current machine for no bwrap executable."
)
self._fs = HostBackedFileSystemComponent(self.config.workspace_dir)
self._python = BwrapPythonComponent(self.config)
self._shell = BwrapShellComponent(self.config)
await self._validate_sandbox()This keeps behavior identical while making boot easier to scan and maintain.
6. Reduce repetition in build_bwrap_cmd
The ro/rw bind loops repeat the same checks. A tiny helper clarifies the logic:
def _append_bind(cmd: list[str], path: str, flag: str) -> None:
if flag == "--bind":
if path == "/" or path.startswith("/root"):
return
if os.path.exists(path):
cmd.extend([flag, path, path])
def build_bwrap_cmd(config: BwrapConfig, script_cmd: list[str]) -> list[str]:
cmd = ["bwrap"]
if not config.share_net:
cmd.append("--unshare-net")
for path in config.ro_binds:
_append_bind(cmd, path, "--ro-bind")
for path in config.rw_binds:
_append_bind(cmd, path, "--bind")
cmd.extend([
"--unshare-pid",
"--unshare-ipc",
"--unshare-uts",
"--die-with-parent",
"--dir", "/tmp",
"--dir", "/var/tmp",
"--proc", "/proc",
"--dev", "/dev",
"--bind", config.workspace_dir, config.workspace_dir,
])
cmd.extend(["--", *script_cmd])
return cmdThis keeps all existing guard conditions but cuts down on duplicated branches.
|
Warning Gemini encountered an error creating the review. You can try again by commenting |
This pull request introduces a new Bubblewrap-based sandbox environment ("bwrap") to the Astrbot framework, providing a secure and configurable isolation layer for executing code. It adds the necessary backend implementation, configuration options, and localization support for the new booter. The changes also include updates to the configuration schema and user interface metadata to expose Bubblewrap options to users.
The most important changes are:
Bubblewrap sandbox implementation:
bwrap.py, implementingBwrapBooter,BwrapConfig, and related components for secure, isolated execution of shell and Python commands using Bubblewrap. This includes support for custom read-write and read-only bind mounts, file operations, and runtime checks for Bubblewrap availability.Configuration and schema updates:
default.py) to addbwrap_rw_bindsandbwrap_ro_bindsoptions, and updated the schema to allow selecting "bwrap" as a sandbox driver with corresponding bind mount settings. [1] [2]Integration into computer client:
computer_client.pyto support selecting and instantiating the Bubblewrap booter, passing in the appropriate configuration for read-write and read-only binds.parse_descriptionto use escaped newlines for frontmatter parsing.Localization and UI support:
config-metadata.json, ensuring these settings are available and understandable in the dashboard UI. [1] [2] [3] [4] [5] [6]Testing support:
test_computer.pyto import and make available the new Bubblewrap booter and its components for unit testing. [1] [2]Modifications / 改动点
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Add a Bubblewrap-based sandbox backend and wire it into the computer client and configuration system for running code in an isolated environment.
New Features:
Enhancements:
Tests: