CyBT-CTF: GLM 5.2
GLM 5.2 sits in the top open-weight tier in the current CyBT-CTF view, tying the frontier proprietary references opencode / Opus 4.8 and Codex / GPT-5.5 at 28/59. Qwen3.7 Plus, a proprietary Alibaba model, now shares that score at lower aggregate time and cost. GLM 5.2 remains unusually important because it shows a high agreement signal with GPT-5.5 on the blinded test: MCC 0.797 and Cohen's kappa 0.795. That is strong enough to warrant a distillation/copying review, but it is not proof by itself, because the count of exactly identical wrong texts is 0.
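For readers who want to see how the two agreement figures and the "same wrong text" count relate, here is a minimal sketch. It assumes agreement is measured on per-task solved/unsolved outcomes, which this page does not state. The function names and the toy vectors are hypothetical and are not CyBT-CTF data.

```python
import math


def _counts(a, b):
    """Return (both solved, only a, only b, neither) for two 0/1 outcome lists."""
    if len(a) != len(b):
        raise ValueError("outcome lists must have the same length")
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)
    return n11, n10, n01, n00


def mcc(a, b):
    """Matthews correlation coefficient between two binary outcome lists."""
    n11, n10, n01, n00 = _counts(a, b)
    denom = math.sqrt((n11 + n10) * (n11 + n01) * (n00 + n10) * (n00 + n01))
    # Convention chosen for this sketch: return 0.0 when a margin is empty.
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n11, n10, n01, n00 = _counts(a, b)
    n = n11 + n10 + n01 + n00
    if n == 0:
        return 0.0
    p_observed = (n11 + n00) / n
    p_chance = ((n11 + n10) * (n11 + n01) + (n00 + n01) * (n00 + n10)) / (n * n)
    # Convention chosen for this sketch: return 0.0 when chance agreement is 1.
    return 0.0 if p_chance == 1 else (p_observed - p_chance) / (1 - p_chance)


def same_wrong_text_count(texts_a, texts_b, solved_a, solved_b):
    """Count tasks where both models failed and submitted identical text."""
    return sum(
        1
        for ta, tb, sa, sb in zip(texts_a, texts_b, solved_a, solved_b)
        if not sa and not sb and ta.strip() == tb.strip()
    )


if __name__ == "__main__":
    # Hypothetical toy outcomes for eight tasks (1 = solved, 0 = unsolved).
    model_a = [1, 1, 1, 0, 0, 0, 1, 0]
    model_b = [1, 1, 0, 0, 0, 0, 1, 1]
    print(mcc(model_a, model_b), cohen_kappa(model_a, model_b))
```

Under this reading, high MCC and kappa say the two models tend to solve and miss the same tasks, while a same-wrong-text count of 0 says that where both fail, they do not fail with identical output. The first is suggestive of shared behavior; the second is the kind of direct evidence that is missing here.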