An Empirical Approach for Clustering-Based Time Series Summarisation Assessment

Bianchini D.; Garda M.
2024-01-01

Abstract

In recent decades, the rise of Big Data solutions has significantly advanced the analysis of time series data as a representation of dynamic phenomena through sequences of observations. Recent research efforts have advocated the adoption of data summarisation techniques, such as incremental clustering, to promptly capture data evolution, helping domain experts make informed and proactive decisions based on a compact representation of the time series. Nevertheless, although incremental clustering effectively reduces data volume while preserving relevant statistical information, it is crucial to estimate the degree of approximation between the original time series data and its summarised version. This evaluation is pivotal whenever the summarisation output is the starting point for complex analytical pipelines (e.g., for pattern recognition and anomaly detection). Based on practical and empirical considerations drawn from both a synthetic and a real-world dataset, we propose in this paper a variant of a well-known quality metric for incremental clustering, to assess the extent to which the time series summary accurately captures the dynamics of the original data.
Year: 2024
ISBN: 9798350376968

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11379/614710