OntoLogX: Ontology-Guided Knowledge Graph Extraction From Cybersecurity Logs With Large Language Models
Rula A.; Bianchini D.; Cerutti F.
2026-01-01
Abstract
System logs represent a valuable source of cyber threat intelligence (CTI), capturing attacker behaviors, exploited vulnerabilities, and traces of malicious activity. Yet their utility is often limited by a lack of structure, semantic inconsistency, and fragmentation across devices and sessions. Extracting actionable CTI from logs therefore requires approaches that can reconcile noisy, heterogeneous data into coherent and interoperable representations. We introduce OntoLogX, an autonomous AI agent that leverages large language models (LLMs) to transform raw logs into ontology-grounded knowledge graphs (KGs). OntoLogX integrates a lightweight log ontology with retrieval-augmented generation and iterative correction steps, ensuring that generated KGs are syntactically and semantically valid. Beyond event-level analysis, the system aggregates KGs into sessions and employs an LLM to predict MITRE ATT&CK tactics, linking low-level log evidence to higher-level adversarial objectives. We evaluate OntoLogX on both public and real-world honeypot datasets, demonstrating robust KG generation across multiple LLM backends and accurate mapping of adversarial activity to MITRE ATT&CK tactics. Results highlight the effectiveness of the methodology in constructing ontology-compliant KGs, along with their value in extracting actionable CTI.


