LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models

Rula, Anisa;
2025-01-01

Abstract

Extracting structured information from unstructured text is crucial for modeling real-world processes, but traditional schema mining relies on semi-structured data, limiting scalability. This paper introduces schema-miner, a novel tool that combines large language models with human feedback to automate and refine schema extraction. Through an iterative workflow, it organizes properties from text, incorporates expert input, and integrates domain-specific ontologies for semantic depth. Applied to materials science—specifically atomic layer deposition—schema-miner demonstrates that expert-guided LLMs generate semantically rich schemas suitable for diverse real-world applications.
2025
Lecture Notes in Computer Science
Other foreign public institutions
English
22nd European Semantic Web Conference, ESWC 2025
2025
svn
15719 LNCS
244
261
18
9783031945779
9783031945786
Springer Science and Business Media Deutschland GmbH
Human-in-the-loop Workflow; Large Language Models; Schema Discovery; Schema Mining; Scientific Schemas
Not applicable
none
Sadruddin, Sameer; D'Souza, Jennifer; Poupaki, Eleni; Watkins, Alex; Babaei Giglou, Hamed; Rula, Anisa; Karasulu, Bora; Auer, Sören; Mackus, Adrie; Ke...
273
info:eu-repo/semantics/conferenceObject
10
4 Contribution in Conference Proceedings (Proceeding)::4.1 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/11379/628805

Citations
  • PMC: n/a
  • Scopus: 2
  • Web of Science (ISI): 2