CogniCrypt: Synergistic Directed Execution and LLM-Driven Analysis for Zero-Day AI-Generated Malware Detection

Authors

George Edward, Mahdi Eslamimehr, Quandary Peak Research, USA

Abstract

The weaponization of Large Language Models (LLMs) for automated malware generation poses an existential threat to conventional detection paradigms. AI-generated malware exhibits polymorphic, metamorphic, and context-aware evasion capabilities that render signature-based and shallow heuristic defenses obsolete. This paper introduces CogniCrypt, a novel hybrid analysis framework that combines concolic execution with LLM-augmented path prioritization and deep-learning-based vulnerability classification to detect zero-day AI-generated malware with provable guarantees. We formalize the detection problem within a first-order temporal logic over program execution traces, define a lattice-theoretic abstraction for path constraint spaces, and prove both the soundness and the relative completeness of our detection algorithm under the assumption that the classifier is correct. The framework introduces three novel algorithms: (i) an LLM-guided concolic exploration strategy that reduces the average number of explored paths by 73.2% compared to depth-first search while maintaining equivalent malicious-path coverage; (ii) a transformer-based path-constraint classifier trained on symbolic execution traces; and (iii) a feedback loop that iteratively refines the LLM’s prioritization policy using reinforcement learning from detection outcomes. We provide a comprehensive implementation built upon angr 9.2, Z3 4.12, Hugging Face Transformers 4.38, and PyTorch 2.2, with full configuration details to enable reproducibility. Experimental evaluation on the EMBER, Malimg, and SOREL-20M datasets and on a novel AI-Gen-Malware benchmark comprising 2,500 LLM-synthesized samples demonstrates that CogniCrypt achieves 98.7% accuracy on conventional malware and 97.5% accuracy on AI-generated threats, outperforming the ClamAV, YARA, MalConv, and EMBER-GBDT baselines by margins of 8.4 to 52.2 percentage points on AI-generated samples.

Keywords

Concolic Execution, Large Language Models, AI-Generated Malware, Symbolic Execution, Vulnerability Discovery, Software Security, Formal Verification, Deep Learning, Zero-Day Detection, Secure Coding.

Volume 16, Number 9