Merged
64 changes: 28 additions & 36 deletions README.md
@@ -6,14 +6,18 @@ Multi-Agent Reinforcement Learning (MARL) cybersecurity simulator mathematically
**Project:** NetForge RL
**GNN-based Policy Model:** https://github.com/elprofesoriqo/GNN-based-Policy-Model-for-MARL-Cyber

## Architectural Overhaul Notice
## Architectural Changes & State-of-the-Art Modeling

This repository represents a complete structural redesign of the original CybORG framework. I took ownership of this branch because the legacy CybORG environment was fundamentally restricted to single-agent, turn-based paradigms (utilizing nested OpenAI Gym wrappers) which artificially broke parallel gradients and hindered true Multi-Agent research.
This repository is a substantial evolution of the legacy CybORG / CAGE challenge environment. Building on the foundational work by DSTG, NetForge RL replaces the synchronous, fully observable game with a high-fidelity, physically constrained network simulation designed for Sim-to-Real transfer.

### What is Different?
1. **Parallel Execution via PettingZoo:** The core simulator is now built strictly on the `pettingzoo.ParallelEnv` standard instead of monolithic Gym wrappers. Red and Blue teams act simultaneously in each step, and the engine resolves their conflicting action intents.
2. **Abstract Action Engine:** Actions no longer mutate simulator state directly through monolithic switch statements. `BaseAction` computes an `ActionEffect` (a JSON representation of the physical network impact), which the core environment evaluates and commits.
3. **No Legacy Bloat:** I have deleted all obsolete OpenAI Gym references, redundant CAGE challenge sub-modules, and unneeded demo code.
1. **Interruptible Tick-Based Engine:** CybORG's instantaneous actions are gone. NetForge RL runs on an asynchronous `current_tick` clock, and every action has a `duration`. Actions can be interrupted while in flight: if the SOC isolates a host mid-exfiltration, the attacker's action is aborted.
2. **Strict POMDP Isolation & Fog of War:** Defenders never see the ground truth. They receive telemetry alerts from a new `siem_log_buffer`, delayed by a realistic `log_latency`, while background-noise agents obscure the genuinely malicious alerts.
3. **MultiDiscrete Tensors & Procedural Networks:** To avoid overfitting to a static topology and a combinatorial explosion of actions, action spaces use `MultiDiscrete` arrays (e.g. `[ActionType, TargetIP]`). Topologies are procedurally generated with up to 50 active nodes, and unused slots are padded and masked.
4. **Attack Economics & Cost Mechanics:** Each agent is bound by operational budgets (`agent_funds`, `agent_compute`). Reckless defensive isolation incurs large business-downtime penalties that mirror real-world SLA fines.
5. **Cyber-Physical (OT) Convergence:** The generator creates distinct `OT_Subnets` containing `PLC` nodes that model thermodynamic vulnerabilities. Red operators can inflict catastrophic kinetic impacts (`+10000/-10000` rewards) that override logical state tracking entirely.
6. **Social Engineering (Stochastics):** Red teams can bypass DMZ architectures with `SpearPhishing`, whose success is scaled against randomly rolled `human_vulnerability_score` values. Blue counters by spending budget on `SecurityAwarenessTraining`.
7. **Ray RLlib & PyTorch LSTMs:** Ships with custom PyTorch models for Ray RLlib that combine recurrent memory (LSTMs) with boolean action masking, which filters out invalid actions out of the box.
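
The interruption rule in item 1 can be sketched in a few lines. This is a minimal illustration, not the NetForge RL engine: `TickEngine`, `PendingAction`, `submit`, `isolate`, and the `Exfiltrate` action name are hypothetical, and only `current_tick` and `duration` come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class PendingAction:
    """An in-flight action that needs `duration` ticks to finish (hypothetical)."""
    agent: str
    name: str
    target: str
    started_at: int
    duration: int

    def finishes_at(self) -> int:
        return self.started_at + self.duration


@dataclass
class TickEngine:
    """Toy scheduler illustrating interruptible, tick-based actions."""
    current_tick: int = 0
    isolated_hosts: set = field(default_factory=set)
    in_flight: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def submit(self, agent: str, name: str, target: str, duration: int) -> None:
        # Actions start at the current tick and stay in flight until they finish.
        self.in_flight.append(
            PendingAction(agent, name, target, self.current_tick, duration)
        )

    def isolate(self, host: str) -> None:
        # Assumption: isolation takes effect immediately in this sketch.
        self.isolated_hosts.add(host)

    def tick(self) -> None:
        self.current_tick += 1
        still_running = []
        for action in self.in_flight:
            if action.target in self.isolated_hosts:
                # The target vanished mid-action, so the action is aborted.
                self.log.append((self.current_tick, action.name, "aborted"))
            elif action.finishes_at() <= self.current_tick:
                self.log.append((self.current_tick, action.name, "completed"))
            else:
                still_running.append(action)
        self.in_flight = still_running


engine = TickEngine()
engine.submit("Red", "Exfiltrate", "10.0.0.5", duration=3)
engine.submit("Red", "NetworkScan", "10.0.0.9", duration=2)
engine.tick()                # tick 1: both actions are still running
engine.isolate("10.0.0.5")   # the SOC isolates the host mid-exfiltration
engine.tick()                # tick 2: exfiltration aborted, scan completes
print(engine.log)
# [(2, 'Exfiltrate', 'aborted'), (2, 'NetworkScan', 'completed')]
```

In this sketch the abort check runs before the completion check, so an action whose target is isolated on the same tick it would have finished is still aborted. Whether the real engine makes the same choice is not stated in the README.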

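The `[ActionType, TargetIP]` layout from item 3 can be illustrated without any RL library. The sketch below assumes a three-action subset and a fixed padding size of 50 slots; `ACTION_TYPES`, `target_mask`, and `decode` are hypothetical names rather than part of the package. In Gymnasium terms, a space of this shape would be declared along the lines of `MultiDiscrete([3, 50])`.

```python
MAX_NODES = 50  # padded size, matching the "up to 50 active nodes" figure
ACTION_TYPES = ["NetworkScan", "ExploitRemoteService", "Impact"]  # illustrative subset


def target_mask(active_ips):
    """Return 1 for slots that hold a real host and 0 for padding slots."""
    return [1] * len(active_ips) + [0] * (MAX_NODES - len(active_ips))


def decode(action, active_ips):
    """Map an [ActionType, TargetIP] pair to (name, ip), or None if it hits padding."""
    action_type, target_slot = action
    if not target_mask(active_ips)[target_slot]:
        return None
    return ACTION_TYPES[action_type], active_ips[target_slot]


# A procedurally generated episode might expose only three live hosts.
active_ips = ["10.0.0.5", "10.0.0.9", "10.0.1.20"]

valid = decode([1, 2], active_ips)    # ('ExploitRemoteService', '10.0.1.20')
masked = decode([0, 7], active_ips)   # None, because slot 7 is padding

# A flat Discrete space needs one output per (action, target) pair,
# while the factored space needs one logit per action plus one per slot.
flat_outputs = len(ACTION_TYPES) * MAX_NODES      # 150
factored_outputs = len(ACTION_TYPES) + MAX_NODES  # 53
print(valid, masked, flat_outputs, factored_outputs)
```

The gap between the flat and factored sizes widens as actions are added, which is the combinatorial growth the `MultiDiscrete` layout is meant to avoid.
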
### Simulator Architecture Flow

@@ -37,10 +41,10 @@ graph TD
The environment is designed to be highly plug-and-play.

```python
from marl_cyborg.environment.parallel_env import ParallelMarlCyborg
from netforge_rl.environment.parallel_env import NetForgeRLEnv

# Instantiate the native PettingZoo environment
env = ParallelMarlCyborg(scenario_config={})
env = NetForgeRLEnv(scenario_config={})

# Reset to get parallel Gymnasium boxes
observations, infos = env.reset()
@@ -53,50 +57,38 @@ print("Blue Box:", observations["Blue"])

The primary reason for this fork is extensibility. Want to add an *ARP Poisoning* attack?

Subclass `BaseAction` in `marl_cyborg/actions/network/arp_poison.py`, describe the action's impact by returning an `ActionEffect`, and the engine resolves the outcome. See `marl_cyborg.actions.network.ip_fragmentation.IPFragmentationAction` for a concrete example.
Subclass `BaseAction` in `netforge_rl/actions/network/arp_poison.py`, describe the action's impact by returning an `ActionEffect`, and the engine resolves the outcome. See `netforge_rl.actions.network.ip_fragmentation.IPFragmentationAction` for a concrete example.

## License & Accreditation
This project is built upon the foundational work provided by the original CybORG contributors (CyberSecurityCRC / DSTG). The core internal simulator physics remain preserved, while the outward translation layers, action hierarchy, and Multi-Agent APIs have been entirely redesigned by Igor Jankowski.

## Repository Structure

- `marl_cyborg/`: Core simulation environment
- `netforge_rl/`: Core simulation environment
- `actions/`: Contains definitions for all `BaseAction` implementations.
- `red_actions.py`: Red team offensive actions.
- `blue_actions.py`: Blue team defensive actions.
- `core/`: State, Observation, and Action abstract base classes.
- `agents/`: Contains specialized algorithmic actors like `GreenAgent` (Background Noise simulation).
- `core/`: State, Observation, and Action abstract base classes enforcing physical constraints.
- `environment/`:
- `parallel_env.py`: The primary PettingZoo MARL environment.
- `parallel_env.py`: The primary asynchronous PettingZoo MARL environment.
- `pcap_synthesizer.py`: Generates synthetic offline `.pcap` network traffic mappings.
- `train_curriculum.py`: Example RL training script.
- `test_physics.py`: Physics unit tests.

## Available Actions

All actions are natively available to the RL models through the environment's discrete action space (`Discrete(256)`). The engine dynamically scales and maps these 11 actions per team against all available network IPs.
All actions are exposed to the RL models through the environment's `MultiDiscrete` action space, which maps onto the logits of the PyTorch policy models.

### Red Team (Offensive)
1. **NetworkScan**: Scans a target subnet for active IP addresses.
2. **DiscoverRemoteSystems**: Performs a Ping Sweep to pinpoint active hosts.
3. **DiscoverNetworkServices**: Port scans a host to enumerate running services.
4. **ExploitRemoteService**: Exploits a vulnerability on a target IP to gain User privileges.
5. **PrivilegeEscalate**: Escalates from User to Root access.
6. **Impact**: Destroys/encrypts data on a compromised host (Ransomware/Wiper).
7. **ExploitBlueKeep**: Exploits RDP (CVE-2019-0708) on Port 3389.
8. **ExploitEternalBlue**: Exploits SMB (MS17-010) on Port 445.
9. **ExploitHTTP_RFI**: Remote File Inclusion exploit targeting Port 80.
10. **JuicyPotato**: Local privilege escalation via DCOM (Windows).
11. **V4L2KernelExploit**: Local privilege escalation via Video4Linux kernel vulns (Linux).
1. **NetworkScan / DiscoverRemoteSystems / DiscoverNetworkServices**: Passive and active reconnaissance: subnet scans, ping sweeps, and port scans.
2. **SpearPhishing**: Bypasses network defenses by exploiting human error inside user networks.
3. **ExploitRemoteService / ExploitEternalBlue...**: Gains user privileges by exploiting CVEs that match the target's OS version and open ports.
4. **PrivilegeEscalate**: Escalates from a restricted user account to `Root`/`System`.
5. **Impact**: Executes ransomware on a compromised host, modeled with standard IT failure metrics.
6. **OverloadPLC (Kinetic)**: Exploits thermodynamic vulnerabilities on compromised OT networks to trigger the episode's kinetic destruction sequence.

### Blue Team (Defensive)
1. **IsolateHost**: Disconnects a host completely from the network.
2. **RestoreHost**: Brings an isolated host back online from a clean snapshot.
3. **Monitor**: Actively monitors traffic on a specific subnet or host for anomalies.
4. **Analyze**: Deep scans a specific host for malware signatures or unauthorized user activity.
5. **DeployDecoy**: Deploys a generic fake service (Apache/Tomcat/Femitter) to bait attackers.
6. **Remove**: Removes unauthorized user privileges.
7. **RestoreFromBackup**: Purges an infected host and restores it to a clean baseline from a backup.
8. **DecoyApache**: Deploys a fake Apache web server (Port 80) honeypot.
9. **DecoySSHD**: Deploys a fake SSH daemon (Port 22) honeypot.
10. **DecoyTomcat**: Deploys a fake Tomcat server (Port 8080) honeypot.
11. **Misinform**: Injects false host telemetry or alters logging to feed Red agents fake data.
1. **IsolateHost / RestoreHost**: Logically quarantines suspected nodes and brings them back online (incurs tracked SLA business-downtime penalties).
2. **Monitor / Analyze**: Asynchronous deep scans of a network or host that bypass the standard physical delays.
3. **SecurityAwarenessTraining**: Spends financial budget to lower `human_vulnerability_scores`, hardening users against phishing payloads.
4. **DeployHoneytoken (Active Deception)**: Covertly seeds RAM-based tokens that raise an unevadable, zero-delay, severity-10 SIEM alert when Red's automated lateral-movement mapping parses them.
5. **DecoyApache / DecoySSHD / DeployDecoy**: Deploys visible decoy services on ports 80 and 22 that tie up attacker compute in dead-end execution loops.
4 changes: 2 additions & 2 deletions changelog.md
@@ -1,11 +1,11 @@
# Changelog

All notable changes to the `marl_cyborg` project will be documented in this file.
All notable changes to the `netforge_rl` project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [3.0.0] - 2026-02-28
### Added
- **PettingZoo API Core Integration**: Created `marl_cyborg/environment/parallel_env.py` substituting the legacy wrapper paradigm with `pettingzoo.ParallelEnv`, explicitly allowing concurrent multi-agent action steps.
- **PettingZoo API Core Integration**: Created `netforge_rl/environment/parallel_env.py` substituting the legacy wrapper paradigm with `pettingzoo.ParallelEnv`, explicitly allowing concurrent multi-agent action steps.
- **Gymnasium Box Compatibility**: All spaces natively map to `gymnasium.spaces` APIs instead of arbitrary nested classes.
- **`BaseAction` / `BaseObservation` Abstract Hierarchy**: Abstracted action mutation. Cyber attacks no longer edit the state directly, but rather return a theoretical JSON impact via `ActionEffect` allowing the environment to resolve simultaneity conflicts natively.
- **Python 3.12 Support (Native)**: Enforced via the new `pyproject.toml` definition.
62 changes: 0 additions & 62 deletions marl_cyborg/core/action.py

This file was deleted.

4 changes: 0 additions & 4 deletions marl_cyborg/environment/__init__.py

This file was deleted.
