A lightweight, flexible Python observability framework designed for robotics research. Goggles provides structured logging, experiment tracking, performance profiling, and device-resident temporal memory management for JAX-based pipelines.
- Multi-process logging on a single machine - Synchronize logs across spawned processes via a Unix-domain-socket transport; large NumPy payloads travel through shared memory when they cross the size threshold.
- Multi-output support - Log to console, files, and remote services simultaneously.
- Experiment tracking - Native integration with Weights & Biases for metrics, images, and videos.
- Performance profiling - `@goggles.timeit` decorator for automatic runtime measurement.
- Error tracing - `@goggles.trace_on_error` auto-logs full stack traces on exceptions.
- Device-resident histories - JAX-based GPU memory management for efficient metric histories in long-running experiments.
- Graceful shutdown - Automatic cleanup of resources and handlers.
- Structured configuration - YAML-based config loading with validation.
- Extensible handlers - Plugin architecture for custom logging backends.
This framework has been battle-tested across multiple research projects:
# Basic installation
uv add robo-goggles # or pip install robo-goggles
# With Weights & Biases support
uv add "robo-goggles[wandb]"
# With JAX device-resident histories
uv add "robo-goggles[jax]"

For the development installation, see our How to contribute page.
Warning
Socket selection: Goggles uses a Unix domain socket to route events to a single host process per machine. The first process to bind becomes the host; later processes connect as clients. Two unrelated projects sharing the same socket path will end up sharing a bus, which is not what you want. Pin a per-project path in your .env via GOGGLES_SOCKET=/tmp/goggles-<project>.sock.
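For example, a per-project .env entry might look like this (the project name is a placeholder; substitute your own):

```shell
# .env: pin this project's event bus to its own socket path
# ("myproject" is a placeholder)
GOGGLES_SOCKET=/tmp/goggles-myproject.sock
```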
import goggles as gg
import logging
# Set up console logging
logger = gg.get_logger("my_experiment")
gg.attach(
    gg.ConsoleHandler(name="console", level=logging.INFO),
)
# Basic logging
logger.info("Experiment started")
logger.warning("This is a warning")
logger.error("An error occurred")
# Goggles works by default in async mode,
# to ensure all the jobs are finished use
gg.finish()

See also Example 1, which you can run after cloning the repo with

uv run examples/01_basic_run.py

import goggles as gg
import numpy as np
# Enable metrics logging. `group` and `tags` are forwarded straight to
# `wandb.init`: use `group` to keep related runs together in the W&B UI
# (see "Multiple runs in WandB" below) and `tags` to make a run easy to
# find or filter on later. Pass `tags` as a list; a bare string is
# rejected because W&B would silently iterate it character by character.
logger = gg.get_logger("experiment", with_metrics=True)
gg.attach(
    gg.WandBHandler(
        project="my_project",
        run_name="run_1",
        group="experiment_v2",
        tags=["baseline", "smoke-test"],
    ),
)
# Log metrics, images, and videos
for step in range(100):
    logger.scalar("loss", np.random.random(), step=step)
    logger.scalar("accuracy", 0.8 + 0.2 * np.random.random(), step=step)
# Log images and videos
image = np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8)
logger.image(image, name="sample_image", step=100)
video = np.random.randint(0, 255, (30, 3, 64, 64), dtype=np.uint8)
logger.video(video, name="sample_video", fps=10, step=100)
gg.finish()

import goggles as gg
import logging
class Trainer:
    @gg.timeit(severity=logging.INFO)
    def train_step(self, batch):
        # Your training logic here
        return {"loss": 0.1}

    @gg.trace_on_error()
    def risky_operation(self, data):
        # This will log the full traceback on any exception
        return data / 0  # Will trigger trace logging

trainer = Trainer()
trainer.train_step({"x": [1, 2, 3]})  # Logs execution time

try:
    trainer.risky_operation(10)
except ZeroDivisionError:
    pass  # Full traceback was automatically logged

Load and validate YAML configurations:
import goggles
# Load configuration with automatic validation
config = goggles.load_configuration("config.yaml")
print(config) # Pretty print
print(config["learning_rate"]) # Access as dict
# Pretty-print configuration
goggles.save_configuration(config, "output.yaml")

| Platform | Basic | W&B | JAX/GPU | Development |
|---|---|---|---|---|
| Linux | β | β | β | β |
| macOS | β | β | β | β |
| Windows | β | β | β | β |
GPU support requires CUDA-compatible hardware and drivers
Explore the examples/ directory for comprehensive usage patterns:
# Basic logging setup
uv run examples/01_basic_run.py
# Advanced: Multi-scope logging
uv run examples/02_multi_scope.py
# File-based logging (local storage)
uv run examples/03_local_storage.py
# Weights & Biases integration
uv run examples/04_wandb.py
# Advanced: Weights & Biases multi-run setup
uv run examples/05_wandb_multiple_runs.py
# Advanced: Custom handler
uv run examples/06_custom_handler.py
# Graceful shutdown utils
uv run examples/100_interrupt.py
# Pretty and convenient utils for configuration loading
uv run examples/101_config.py
# Advanced: Performance decorators
uv run examples/102_decorators.py
# Advanced: JAX device-resident histories
uv run examples/103_history.py
# Filters: smoothing, outlier rejection, composition
uv run examples/104_filters.py
# Benchmark: producer-side logging latency under Hydra presets
uv run examples/105_benchmark.py

This section includes some cool functionality of goggles. Enjoy!
Goggles makes it easy to set up different handlers for different scopes: one handler can be attached to multiple scopes, and one scope can have multiple handlers. Each logger is associated with a single scope (by default: global), and logging with that logger invokes all the handlers attached to that scope.
Within the same run, we may have logs that belong to different scopes. An example is training in Reinforcement Learning, where in a single training run there are multiple episodes. A complete example for this is provided in the multiple runs in WandB section.
# In this example, we attach handlers to different scopes.
handler1 = gg.ConsoleHandler(name="examples.basic.console.1", level=logging.INFO)
gg.attach(handler1, scopes=["global", "scope1"])
handler2 = gg.ConsoleHandler(name="examples.basic.console.2", level=logging.INFO)
gg.attach(handler2, scopes=["global", "scope2"])
# We need to get separate loggers for each scope
logger_scope1 = gg.get_logger("examples.basic.scope1", scope="scope1")
logger_scope2 = gg.get_logger("examples.basic.scope2")
logger_scope2.bind(scope="scope2") # You can also bind the scope after creation
logger_global = gg.get_logger("examples.basic.global", scope="global")
# Now we can log messages to different scopes, so that only the interested
# handlers will process them.
logger_scope1.info(f"This will be logged only by {handler1.name}")
logger_scope2.info(f"This will be logged only by {handler2.name}")
logger_global.info("This will be logged by both handlers.")
# The same result can be achieved using namespaces,
# which are indicated by dot notation.
logger_namespace = gg.get_logger("examples.basic.namespace", scope="namespace")
logger_namespace.info("This will be logged by both handlers.")
gg.finish()

See also examples/02_multi_scope.py for a running example.
An example of the benefit of scopes is the WandBHandler, which instantiates a separate WandB run for each scope and groups them together:
import goggles as gg
from goggles import WandBHandler
# In this example, we set up multiple runs in Weights & Biases (W&B).
# All runs created by the handler will be grouped under
# the same project and group.
logger: gg.GogglesLogger = gg.get_logger("examples.basic", with_metrics=True)
handler = WandBHandler(
    project="goggles_example", reinit="create_new", group="multiple_runs"
)
# In particular, we set up multiple runs in an RL training loop, with each
# episode being a separate W&B run and a global run tracking all episodes.
num_episodes = 3
episode_length = 10
scopes = [f"episode_{episode}" for episode in range(num_episodes)]
scopes.append("global")
gg.attach(handler, scopes=scopes)
def my_episode(index: int):
    episode_logger = gg.get_logger(scope=f"episode_{index}", with_metrics=True)
    for step in range(episode_length):
        # Supports scopes transparently and has its own step counter
        episode_logger.scalar("env/reward", index * episode_length + step, step=step)
for i in range(num_episodes):
    my_episode(i)
    logger.scalar("total_reward", i, step=i)
gg.finish()

As in the WandB example, all handlers work in the background. By default, logging calls are non-blocking; to make them blocking, set the environment variable GOGGLES_ASYNC to 0 or false. When you use async mode, remember to call gg.finish() at the end from your host process!
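For example, to force blocking (synchronous) logging for a single run, set the variable on the command line (the example script is one of those shipped in this repo):

```shell
# Run with blocking logging calls for this invocation only
GOGGLES_ASYNC=0 uv run examples/01_basic_run.py
```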
Warning
This functionality still needs thorough testing, as well as better documentation. Help is appreciated!
All processes that share the same GOGGLES_SOCKET path converge on a single EventBus. The first process to bind the socket becomes the host and runs the attached handlers; later processes connect as clients and forward events to it. Cross-machine logging is not supported in the built-in transport; if you need it, add a new implementation of goggles._core.transport.Transport.
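The first-to-bind-becomes-host handshake can be illustrated with a standard-library sketch (this shows the pattern only; it is not Goggles' actual transport code):

```python
import os
import socket
import tempfile

# A throwaway socket path standing in for GOGGLES_SOCKET
path = os.path.join(tempfile.mkdtemp(), "bus.sock")

def host_or_client(path: str):
    """Bind the socket path if it is free (host), otherwise connect to it (client)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)    # first caller to bind wins and becomes the host
        sock.listen()
        return "host", sock
    except OSError:
        sock.connect(path)  # path already taken: join the existing host as a client
        return "client", sock

role_a, _ = host_or_client(path)  # binds successfully
role_b, _ = host_or_client(path)  # bind fails, connects instead
print(role_a, role_b)  # host client
```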
At high logging frequency (≥ 1 kHz) Python's gen-2 garbage collector can cause millisecond-scale latency spikes. After you have attached handlers and finished setup, call gg.freeze() once before your hot loop:
gg.attach(...)
gg.freeze() # promote startup objects out of the GC scan set
for step in range(steps):
    logger.scalar("loss", loss, step=step)

gg.freeze() wraps gc.freeze(); the collector still runs on churn allocated after the call, but it stops rescanning the long-lived startup state.
Why this is opt-in (not automatic): gc.freeze() is process-global, not goggles-scoped. It promotes every currently-tracked Python object, including whatever your code has built so far, into a permanent generation that the GC will skip on subsequent collections. If goggles called it from inside attach() or get_logger(), it would be making that decision on objects you haven't finished allocating yet. The right call site is after your setup is done and before your hot loop starts, and only you know where that line is.
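The underlying gc.freeze() behavior is easy to verify with the standard library alone, independently of goggles:

```python
import gc

# Simulate long-lived startup state (handlers, loggers, config, ...)
startup_state = [{"handler": i} for i in range(1000)]

before = gc.get_freeze_count()  # objects already in the permanent generation
gc.freeze()                     # move everything currently tracked out of the GC scan set
after = gc.get_freeze_count()

print(after > before)  # True: the startup objects are no longer rescanned

# Objects allocated from here on are still collected as usual;
# gc.unfreeze() returns the frozen objects to the oldest generation.
gc.unfreeze()
```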
The W&B upload runs on a background thread that W&B's own SDK manages; Goggles' producer thread only calls wandb.log({...}), which enqueues locally and returns quickly. Online vs. offline mode therefore does not change the hot-path latency your training loop sees; only where the data ends up changes.
When to set WANDB_MODE=offline:
- Airgapped or flaky-network hosts (shared clusters, HPC compute nodes without outbound Internet access).
- Benchmarking / reproducible latency measurements: removes network jitter from the picture (see examples/105_benchmark.py).
- Untrusted environments where you don't want to stream data out during the run; you review and then sync.
- Faster startup: no auth round-trip at wandb.init.
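Enabling offline mode is just an environment variable; for example (train.py is a placeholder for your own entry point):

```shell
WANDB_MODE=offline uv run train.py
```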
Offline runs are written to ./wandb/offline-run-<timestamp>-<id>/ and can be uploaded later with:
wandb sync wandb/ # all offline runs
wandb sync wandb/offline-run-<id>   # a specific run

Note
Ideally, you should open a PR: We would love to integrate your work!
Adding a custom handler is straightforward:
import goggles as gg
import logging
class CustomConsoleHandler(gg.ConsoleHandler):
    """A custom console handler that adds a prefix to each log message."""

    def handle(self, event: gg.Event) -> None:
        event_dict = event.to_dict()  # avoid shadowing the builtin `dict`
        event_dict["payload"] = f"[CUSTOM PREFIX] {event_dict['payload']}"
        event = gg.Event.from_dict(event_dict)
        super().handle(event)
# Register the custom handler so it can be serialized/deserialized
gg.register_handler(CustomConsoleHandler)
# In this basic example, we set up a logger that outputs to the console.
logger = gg.get_logger("examples.custom_handler")
gg.attach(
    CustomConsoleHandler(name="examples.custom.console", level=logging.INFO),
    scopes=["global"],
)
# Because the logging level is set to INFO, the debug message will not be shown.
logger.info("Hello, world!")
logger.debug("you won't see this at INFO")
gg.finish()

See also examples/06_custom_handler.py for a complete example.
For long-running GPU experiments that need efficient temporal memory management:
During development of fluid control experiments and reinforcement learning pipelines, we needed to:
- Track detailed metrics during GPU-accelerated training
- Avoid expensive device-to-host transfers
- Maintain temporal state across episodes
- Support JIT compilation for maximum performance
- Pure functional and JIT-safe buffer updates
- Per-field history lengths with episodic reset support
- Batch-first convention: (B, T, *shape) for all tensors
- Zero host-device synchronization during updates
- Integrated with FlowGym's EstimatorState for temporal RL memory
from goggles.history import HistorySpec, create_history, update_history
import jax.numpy as jnp
# Define what to track over time
spec = HistorySpec.from_config({
    "states": {"length": 100, "shape": (64, 64, 2), "dtype": jnp.float32},
    "actions": {"length": 50, "shape": (8,), "dtype": jnp.float32},
    "rewards": {"length": 100, "shape": (), "dtype": jnp.float32},
})
# Create GPU-resident history buffers
history = create_history(spec, batch_size=32)
print(history["states"].shape) # (32, 100, 64, 64, 2)
# Update buffers during training (JIT-compiled)
new_state = jnp.ones((32, 64, 64, 2))
history = update_history(history, {"states": new_state})

See also examples/103_history.py for a running example.
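Conceptually, each update drops the oldest frame along the time axis and appends the newest. Here is a minimal NumPy sketch of that shift-and-append step (an illustration of the idea only; goggles' actual implementation is JAX-based and JIT-compiled):

```python
import numpy as np

def shift_and_append(buffer: np.ndarray, new: np.ndarray) -> np.ndarray:
    """buffer: (B, T, *shape); new: (B, *shape). Returns the updated (B, T, *shape) buffer."""
    # Drop the oldest time step, append the newest at the end of the time axis
    return np.concatenate([buffer[:, 1:], new[:, None]], axis=1)

history = np.zeros((4, 100, 8), dtype=np.float32)  # batch of 4, 100 steps of 8-dim states
step_data = np.ones((4, 8), dtype=np.float32)

history = shift_and_append(history, step_data)
print(history.shape)      # (4, 100, 8): shape is unchanged
print(history[0, -1, 0])  # 1.0: the newest frame sits at the end of the time axis
```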
We welcome contributions! Please see our Contributing Guide for detailed information on:
- Development workflow and environment setup
- Code style requirements and automated checks
- Testing standards and coverage expectations
- PR preparation and commit message conventions
This project is licensed under the MIT License - see the LICENSE file for details.