CybORG

Simulated cyber-defense arena based on the CAGE Challenge 3 DroneSwarm scenario.

Overview

CybORG is a cyber operations research gym for training and evaluating autonomous security agents. The CodeClash arena uses CybORG's simulated DroneSwarm scenario through the PettingZoo parallel interface. It does not run real exploit tooling, emulate external networks, or interact with live systems.

Each CodeClash player edits a blue-team CybORG agent. A round evaluates every submitted agent on the same seeded episode batch and scores players by average episode reward.

Implementation

codeclash.arenas.cyborg.cyborg.CybORGArena

CybORGArena(config: dict, *, tournament_id: str, local_output_dir: Path, keep_containers: bool = False)

Bases: CodeArena

Source code in codeclash/arenas/arena.py

def __init__(self, config: dict, *, tournament_id: str, local_output_dir: Path, keep_containers: bool = False):
    """The CodeArena class is responsible for running games, i.e., taking a list of code
    from different agents/players and running them against each other.
    It also provides the environments for the game and agents to run in.

    The central method is `run_round`, which takes a list of agents and returns the winner of the round.

    At the end of the tournament, run the `end` method to clean up the game and agents and write the metadata.

    Args:
        config: The overall config for the tournament.
        tournament_id: The id of the tournament.
        local_output_dir: The host/local directory to write logs to.
        keep_containers: Do not remove containers after games/agent finish.
    """
    self.url_gh: str = f"git@github.com:{GH_ORG}/{self.name}.git"
    self.artifacts: list[Path] = []
    """Artifact objects that we might want to clean up after the game."""
    self.config: dict = config
    self._keep_containers: bool = keep_containers
    self._metadata: dict = {
        "name": self.name,
        "config": self.config["game"],
        "game_id": tournament_id,
        "created_timestamp": int(time.time()),
    }
    self.log_env: Path = DIR_LOGS
    self.log_local: Path = local_output_dir
    self.logger = get_logger(self.name, log_path=self.log_local / "game.log", emoji="🏓")
    self.environment: DockerEnvironment = self.get_environment()
    """The running docker environment for executing the game"""

name `class-attribute` `instance-attribute`

name: str = 'CybORG'

submission `class-attribute` `instance-attribute`

submission: str = 'cyborg_agent.py'

description `class-attribute` `instance-attribute`

description: str = "CybORG is a simulated cyber-defense arena based on the CAGE Challenge 3 DroneSwarm scenario.\n\nYour bot is a Python file named `cyborg_agent.py` that defines a class named `MyAgent`.\n`MyAgent` should inherit from a CybORG BaseAgent-compatible class, for example:\n\n    from CybORG.Agents import RandomAgent\n\n    class MyAgent(RandomAgent):\n        ...\n\nEach round evaluates every submitted agent independently on the same seeded DroneSwarm episodes.\nYour agent controls the blue-team drone agents through CybORG's simulated PettingZoo interface.\nThe objective is to maximize average episode reward. This arena uses CybORG simulation only and does\n    not run real exploit tools or interact with external networks.\n    "

default_args `class-attribute` `instance-attribute`

default_args: dict = {'steps_per_episode': 30, 'num_drones': 18, 'timeout': 240}

validate_code

validate_code(agent: Player) -> tuple[bool, str | None]

Source code in codeclash/arenas/cyborg/cyborg.py

def validate_code(self, agent: Player) -> tuple[bool, str | None]:
    quoted_submission = shlex.quote(self.submission)
    file_check = agent.environment.execute(f"test -f {quoted_submission} && echo exists")
    if "exists" not in file_check["output"]:
        return False, f"Submission file `{self.submission}` not found in the workspace root"

    content = agent.environment.execute(f"cat {quoted_submission}")["output"]
    if not content.strip():
        return False, f"`{self.submission}` is empty"

    syntax_check = agent.environment.execute(f"python -m py_compile {quoted_submission}")
    if syntax_check["returncode"] != 0:
        return False, f"Python syntax error in `{self.submission}`:\n{syntax_check['output']}"

    import_check = agent.environment.execute(
        "python - <<'PY'\n"
        "import importlib.util\n"
        f"spec = importlib.util.spec_from_file_location('submission_agent', {self.submission!r})\n"
        "module = importlib.util.module_from_spec(spec)\n"
        "spec.loader.exec_module(module)\n"
        "assert hasattr(module, 'MyAgent'), 'MyAgent class not found'\n"
        "from CybORG.Agents import BaseAgent\n"
        "assert issubclass(module.MyAgent, BaseAgent), 'MyAgent must inherit from a CybORG BaseAgent class'\n"
        "PY"
    )
    if import_check["returncode"] != 0:
        return False, f"Could not import `MyAgent` from `{self.submission}`:\n{import_check['output']}"

    return True, None
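The same checks can be approximated locally before submitting. The sketch below is not part of CodeClash: `check_submission` and its `base_class` parameter are hypothetical names, and `base_class` stands in for CybORG's `BaseAgent` so the example also runs where CybORG is not installed.

```python
import importlib.util
from pathlib import Path


def check_submission(path, base_class=object):
    """Return (ok, message) for a candidate submission file.

    Hypothetical helper that mirrors the arena's checks in order:
    file exists, file is non-empty, module imports, MyAgent exists
    and inherits from `base_class` (pass CybORG's BaseAgent if available).
    """
    path = Path(path)
    if not path.is_file():
        return False, f"{path.name} not found"
    if not path.read_text().strip():
        return False, f"{path.name} is empty"
    try:
        spec = importlib.util.spec_from_file_location("submission_agent", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    except Exception as exc:  # syntax errors and import-time failures land here
        return False, f"could not import {path.name}: {exc}"
    agent_cls = getattr(module, "MyAgent", None)
    if agent_cls is None:
        return False, "MyAgent class not found"
    if not (isinstance(agent_cls, type) and issubclass(agent_cls, base_class)):
        return False, "MyAgent must inherit from the expected base class"
    return True, None
```

With CybORG installed, calling `check_submission("cyborg_agent.py", base_class=BaseAgent)` after `from CybORG.Agents import BaseAgent` should come closer to what the arena verifies.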

execute_round

execute_round(agents: list[Player]) -> None

Source code in codeclash/arenas/cyborg/cyborg.py

def execute_round(self, agents: list[Player]) -> None:
    agent_args = []
    for agent in agents:
        agent_args.extend(["--agent", f"{agent.name}=/{agent.name}/{self.submission}"])

    cmd = [
        "python",
        "run_cyborg.py",
        "--episodes",
        str(self._episodes_per_round()),
        "--steps",
        str(self._game_arg("steps_per_episode")),
        "--drones",
        str(self._game_arg("num_drones")),
        "--output",
        str(self.log_env / RESULTS_JSON),
        *agent_args,
    ]
    full_cmd = " ".join(shlex.quote(part) for part in cmd)
    self.logger.info(f"Running game: {full_cmd}")
    try:
        response = self.environment.execute(full_cmd, timeout=int(self._game_arg("timeout")))
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError("CybORG round timed out") from exc
    assert_zero_exit_code(response, logger=self.logger)
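To make the resulting command line concrete, the sketch below rebuilds the same argument list outside the arena. `build_round_command` is a hypothetical helper, and the output path used in the usage note is a placeholder rather than the arena's real log location.

```python
import shlex


def build_round_command(agent_names, episodes, steps, drones, output_path,
                        submission="cyborg_agent.py"):
    """Assemble the run_cyborg.py invocation as a single shell-safe string."""
    cmd = [
        "python", "run_cyborg.py",
        "--episodes", str(episodes),
        "--steps", str(steps),
        "--drones", str(drones),
        "--output", str(output_path),
    ]
    # One --agent NAME=PATH pair per player, pointing at /<name>/<submission>.
    for name in agent_names:
        cmd.extend(["--agent", f"{name}=/{name}/{submission}"])
    # shlex.quote protects any part that contains shell metacharacters.
    return " ".join(shlex.quote(part) for part in cmd)
```

For players `alpha` and `beta` with 2 episodes, 5 steps, 8 drones and the placeholder output `/logs/results.json`, this yields `python run_cyborg.py --episodes 2 --steps 5 --drones 8 --output /logs/results.json --agent alpha=/alpha/cyborg_agent.py --agent beta=/beta/cyborg_agent.py`.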

get_results

get_results(agents: list[Player], round_num: int, stats: RoundStats)

Source code in codeclash/arenas/cyborg/cyborg.py

def get_results(self, agents: list[Player], round_num: int, stats: RoundStats):
    result_file = self.log_round(round_num) / RESULTS_JSON
    if not result_file.exists():
        self.logger.error(f"Missing result file: {result_file}")
        stats.winner = RESULT_TIE
        for agent in agents:
            stats.scores[agent.name] = 0.0
            stats.player_stats[agent.name].score = 0.0
        return

    with open(result_file) as f:
        result = json.load(f)

    scores = {agent.name: 0.0 for agent in agents}
    for player, score in result.get("average_scores", {}).items():
        if player in scores:
            scores[player] = float(score)

    stats.scores = scores
    stats.details = result.get("details", [])
    for player, score in scores.items():
        stats.player_stats[player].score = score

    if not scores:
        stats.winner = RESULT_TIE
        return

    top_score = max(scores.values())
    winners = [player for player, score in scores.items() if score == top_score]
    stats.winner = winners[0] if len(winners) == 1 else RESULT_TIE
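The score and winner logic can be exercised in isolation. The sketch below is a standalone restatement under assumptions: `summarize_round` is a hypothetical name, and the `RESULT_TIE` value is a placeholder because the real constant is not shown on this page.

```python
RESULT_TIE = "Tie"  # placeholder; the actual CodeClash constant may differ


def summarize_round(agent_names, result):
    """Return (scores, winner) from a parsed results dictionary."""
    # Every known player starts at 0.0; entries for unknown players are ignored.
    scores = {name: 0.0 for name in agent_names}
    for player, score in result.get("average_scores", {}).items():
        if player in scores:
            scores[player] = float(score)
    if not scores:
        return scores, RESULT_TIE
    top_score = max(scores.values())
    leaders = [name for name, score in scores.items() if score == top_score]
    # A unique top score wins; any shared top score is a tie.
    winner = leaders[0] if len(leaders) == 1 else RESULT_TIE
    return scores, winner
```

One consequence worth noting: a player absent from `average_scores` keeps the default of 0.0, so if episode rewards can be negative, that player would outrank any player with a negative average.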

Agent Interface

Your bot must be a Python file named cyborg_agent.py that defines MyAgent.

MyAgent must inherit from a CybORG BaseAgent class. A valid starting point is:

from CybORG.Agents import RandomAgent


class MyAgent(RandomAgent):
    pass

The arena runs MyAgent through CybORG's PettingZoo parallel wrapper. For each episode, the same MyAgent class controls all blue-team drone agents. Its get_action(observation, action_space) method should return an action accepted by the provided CybORG action space.
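A slightly fuller agent might override get_action. The sketch below assumes the wrapper passes a Gymnasium-style action space exposing `sample()`, which this page does not confirm. The fallback `RandomAgent` class is a stand-in defined only so the example runs without CybORG; it is not CybORG's implementation.

```python
import random

try:
    from CybORG.Agents import RandomAgent
except ImportError:
    # Stand-in for environments without CybORG (illustration only).
    class RandomAgent:
        def __init__(self, name=None, np_random=None):
            self.name = name

        def get_action(self, observation, action_space):
            return random.choice(list(action_space))


class MyAgent(RandomAgent):
    def get_action(self, observation, action_space):
        # Assumption: the action space offers sample(), as Gymnasium spaces do.
        if hasattr(action_space, "sample"):
            return action_space.sample()
        # Otherwise defer to the parent class's own selection logic.
        return super().get_action(observation, action_space)
```

A real policy would inspect `observation` before choosing; the point here is only the shape of the class and the method signature.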

Configuration Example

tournament:
  rounds: 1
game:
  name: CybORG
  sims_per_round: 2
  args:
    steps_per_episode: 5
    num_drones: 8
    timeout: 240
players:
  - agent: dummy
    name: alpha
  - agent: dummy
    name: beta
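Values under game.args presumably override the arena's default_args, with missing keys falling back to the defaults. The lookup below is a guess at that behavior: `game_arg` is a hypothetical function, not the arena's `_game_arg`, whose source is not shown here.

```python
DEFAULT_ARGS = {"steps_per_episode": 30, "num_drones": 18, "timeout": 240}


def game_arg(config, key, defaults=DEFAULT_ARGS):
    """Look up a game argument, preferring config['game']['args'] over defaults."""
    args = config.get("game", {}).get("args", {})
    return args[key] if key in args else defaults[key]


# The configuration example above, written as a Python dict with timeout omitted.
config = {
    "game": {
        "name": "CybORG",
        "sims_per_round": 2,
        "args": {"steps_per_episode": 5, "num_drones": 8},
    }
}
```

Under this assumption, `game_arg(config, "num_drones")` returns 8 from the config, while `game_arg(config, "timeout")` falls back to the default of 240.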

Scoring

The arena runs sims_per_round independent simulated DroneSwarm episodes for each submitted player. A player's score for an episode is the sum of the mean blue-agent rewards earned in that episode. The final CodeClash score is the average of these episode scores across the round.
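As a worked illustration of the scoring rule, the sketch below assumes one reading of it: at each step the rewards of all blue agents are averaged, those step means are summed over the episode, and episode scores are then averaged. The function names and the reward layout (one dict per step, keyed by agent id) are hypothetical.

```python
from statistics import mean


def episode_score(step_rewards):
    """Sum, over steps, of the mean reward across blue agents at that step."""
    return sum(mean(rewards.values()) for rewards in step_rewards)


def round_score(episodes):
    """Average episode score across all episodes in the round."""
    return mean(episode_score(episode) for episode in episodes)


# Two short episodes with two blue agents each (made-up rewards).
episode_a = [{"blue_0": -1.0, "blue_1": -3.0}, {"blue_0": 0.0, "blue_1": -2.0}]
episode_b = [{"blue_0": -1.0, "blue_1": -1.0}]
```

Here `episode_a` scores (-2.0) + (-1.0) = -3.0 and `episode_b` scores -1.0, so the round score is their average, -2.0.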

The runtime pins CybORG to the upstream v3.0 code and installs it in editable mode from a checked-out repository, because the upstream package expects data files such as CybORG/version.txt to be present next to the source tree.
