Multivariate Data-Driven Models for Wind Turbine Power Curves including Sub-Component Temperatures

Astolfi D.
2023-01-01

Abstract

The most commonly employed tool for wind turbine performance analysis is the power curve, which is the relation between wind speed and power output. The diffusion of SCADA systems has boosted the adoption of data-driven approaches to power curve modeling. In particular, a recent line of research involves multivariate methods, which employ further input variables in addition to the wind speed. This work investigates an innovative contribution: the inclusion of thirteen sub-component temperatures as possible covariates. The idea is discussed through a real-world test case, based on data provided by ENGIE Italia. Two models are analyzed: support vector regression with a Gaussian kernel and Gaussian process regression. The input variables are identified through a sequential feature selection algorithm. The sub-component temperatures are frequently selected as input variables, which supports the validity of the idea proposed in this work. The obtained error metrics are lower than those of benchmark models employing more typical input variables: the resulting mean absolute error is 1.35% of the rated power. The results of the two regression types are not remarkably different. This suggests that the decisive factor is not the model type but the use of, and the selection among, a potentially vast number of input variables.
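
The selection procedure and the error metric mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: a Gaussian-kernel weighted average stands in for the support vector and Gaussian process regressions, the data are synthetic (not the ENGIE Italia data), and the rated power of 2000 kW, the bandwidth, the column layout, and all function names are assumptions made only for illustration.

```python
import math
import random

RATED_POWER_KW = 2000.0  # assumed rated power, for illustration only


def mae_percent_of_rated(y_true, y_pred, rated_power):
    """Mean absolute error expressed as a percentage of rated power."""
    abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(abs_errors) / len(abs_errors) / rated_power


def kernel_predict(train_X, train_y, x, cols, bandwidth):
    """Gaussian-kernel weighted average over the chosen columns.

    A simple stand-in for the regressors named in the abstract.
    """
    num = den = 0.0
    for row, target in zip(train_X, train_y):
        d2 = sum((row[c] - x[c]) ** 2 for c in cols)
        w = math.exp(-d2 / (2.0 * bandwidth ** 2))
        num += w * target
        den += w
    # Fall back to the training mean if all weights underflow to zero.
    return num / den if den > 0.0 else sum(train_y) / len(train_y)


def forward_selection(train_X, train_y, val_X, val_y, rated_power, bandwidth=0.1):
    """Greedy forward selection: repeatedly add the feature that most
    lowers the validation error; stop when no candidate improves it."""
    n_features = len(train_X[0])
    selected, best_err = [], float("inf")
    while len(selected) < n_features:
        trial = {}
        for c in range(n_features):
            if c in selected:
                continue
            cols = selected + [c]
            preds = [kernel_predict(train_X, train_y, x, cols, bandwidth) for x in val_X]
            trial[c] = mae_percent_of_rated(val_y, preds, rated_power)
        best_c = min(trial, key=trial.get)
        if trial[best_c] >= best_err:
            break
        selected.append(best_c)
        best_err = trial[best_c]
    return selected, best_err


def make_synthetic(n, rng):
    """Synthetic rows: [wind speed, one temperature, irrelevant noise],
    all scaled to [0, 1]. Power depends only on the first two columns."""
    rows = [[rng.random(), rng.random(), rng.random()] for _ in range(n)]
    power = [RATED_POWER_KW * (0.6 * r[0] + 0.4 * r[1]) for r in rows]
    return rows, power


rng = random.Random(0)
train_X, train_y = make_synthetic(200, rng)
val_X, val_y = make_synthetic(100, rng)
selected, best_err = forward_selection(train_X, train_y, val_X, val_y, RATED_POWER_KW)
print("selected columns:", selected)
print("validation MAE (% of rated power):", round(best_err, 2))
```

In this toy setting the wind speed column is picked first and the temperature column second, mirroring the qualitative finding that temperatures carry information beyond the wind speed. With a library such as scikit-learn, the stand-in predictor could presumably be replaced by `SVR(kernel="rbf")` or `GaussianProcessRegressor` wrapped in `SequentialFeatureSelector`; whether that matches the configuration used in the paper is not stated in the abstract.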
Use this identifier to cite or link to this document: https://hdl.handle.net/11379/593306