What is Autonomous AI Agent CTF?
Build a fully autonomous LLM agent and run it against a multi-domain CTF challenge set (binary exploitation, reverse engineering, web, crypto, forensics, and misc). The agent must discover flags without any human intervention during evaluation.
Submission
Submit a single zip containing: agent source code, captured flags, execution logs, and a 4–8 page technical report. Only flags autonomously obtained by the agent are valid.
Scoring
Submissions are ranked by total challenge score. Ties are broken by execution efficiency: fewer tool calls and less time rank higher.
Participants may compete individually or as part of a team.
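As a rough illustration of the ranking rule under Scoring, the sketch below sorts entries by total score and then by efficiency. The `Result` record, its field names, and the order of the two tie-breakers (tool calls first, then time) are assumptions made for this example; the rules above do not say how tool calls and time are weighed against each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical per-entry summary; field names are illustrative only."""
    team: str
    score: int         # total challenge score
    tool_calls: int    # number of tool calls the agent made
    seconds: float     # time taken by the run


def rank(results):
    """Return results ordered best-first.

    Higher score wins. Among equal scores, fewer tool calls wins,
    then less time (assumed tie-break order, not stated in the rules).
    """
    return sorted(results, key=lambda r: (-r.score, r.tool_calls, r.seconds))


if __name__ == "__main__":
    entries = [
        Result("a", 500, 120, 900.0),
        Result("b", 500, 80, 1200.0),
        Result("c", 700, 300, 3000.0),
    ]
    for place, r in enumerate(rank(entries), start=1):
        print(place, r.team, r.score, r.tool_calls, r.seconds)
```

Under these assumptions, entry "c" ranks first on score, and "b" ranks ahead of "a" because it used fewer tool calls, even though it took longer. If the organizers weigh time before tool calls, or combine the two, the key tuple would change accordingly.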