OntoLogX: Ontology-Guided Knowledge Graph Extraction From Cybersecurity Logs With Large Language Models

Rula A.; Bianchini D.; Cerutti F.
2026-01-01

Abstract

System logs are a valuable source of cyber threat intelligence (CTI), capturing attacker behaviors, exploited vulnerabilities, and traces of malicious activity. Yet their utility is often limited by a lack of structure, semantic inconsistency, and fragmentation across devices and sessions. Extracting actionable CTI from logs therefore requires approaches that can reconcile noisy, heterogeneous data into coherent and interoperable representations. We introduce OntoLogX, an autonomous AI agent that leverages large language models (LLMs) to transform raw logs into ontology-grounded knowledge graphs (KGs). OntoLogX integrates a lightweight log ontology with retrieval-augmented generation and iterative correction steps, ensuring that generated KGs are syntactically and semantically valid. Beyond event-level analysis, the system aggregates KGs into sessions and employs an LLM to predict MITRE ATT&CK tactics, linking low-level log evidence to higher-level adversarial objectives. We evaluate OntoLogX on both public and real-world honeypot datasets, demonstrating robust KG generation across multiple LLM backends and accurate mapping of adversarial activity to MITRE ATT&CK tactics. Results highlight the effectiveness of the methodology in constructing ontology-compliant KGs, along with their value in extracting actionable CTI.

Use this identifier to cite or link to this document: https://hdl.handle.net/11379/645505