# Oscar Perpiñán Lamigueiro


Associate Professor at UPM (Electrical Engineering, Photovoltaic and Solar Energy). R and Emacs enthusiast. Always learning.

## Books

**Displaying time series, spatial and space-time data with R** (code and figures) - A data graphic is not only a static image, but it also tells a story about the data. It activates cognitive processes that are able to detect patterns and discover information not readily available with the raw data. This is particularly true for time series, spatial, and space-time datasets. Focusing on the exploration of data with visual methods, “Displaying Time Series, Spatial, and Space-Time Data with R” presents methods and R code for producing high-quality graphics of time series, spatial, and space-time data. Practical examples using real-world datasets help you understand how to apply the methods and code. The book illustrates how to display a dataset starting with an easy and direct approach and progressively adding improvements that involve more complexity. Each of the book’s three parts is devoted to different types of data. In each part, the chapters are grouped according to the various visualization methods or data characteristics. Along with the main graphics from the text, the book’s website offers access to the datasets used in the examples as well as the full R code. This combination of freely available code and data enables you to practice with the methods and modify the code to suit your own needs.

**Energía Solar Fotovoltaica** (in Spanish) - This book consists of several chapters covering the most important aspects of this technology from a systems engineering point of view: solar geometry and radiation, photovoltaic devices, modules and generators, the three main applications (grid-connected systems, stand-alone rural electrification systems, and stand-alone water pumping systems), electrical safety in photovoltaic systems, and energy payback time. It includes a set of solved exercises for each chapter. All the material is available in this GitHub repository. The book matured in the context of the Máster de Energías Renovables y Mercado Energético at the EOI and the professional expert courses taught by the Departamento de Ingeniería Eléctrica, Electrónica y de Control of the ETSII-UNED. An adapted version of this book was published in 2012 by the publisher Progensa under the title “Diseño de Sistemas Fotovoltaicos” with a Creative Commons license.

## Journal Papers and Book Chapters

- M. Pinho Almeida, M. Muñoz, I. de la Parra, O. Perpiñán,
**Comparative study of PV power forecast using parametric and nonparametric PV models**, Solar Energy, 155, 2017: 854-866, ISSN 0038-092X, 10.1016/j.solener.2017.07.032 : pdf - Forecast procedures for large ground mounted PV plants or smaller BIPV or BAPV systems may use a parametric or a nonparametric model of the PV system. In this paper, both approaches are used independently to calculate the energy delivered to the grid on an hourly basis in forecast procedures that use meteorological variables from a Numerical Weather Prediction model as inputs, and their performances against real generation data from six PV plants are analyzed. The parametric approach relies on mathematical models with several parameters that describe the PV systems and it was implemented in MATLAB, whereas the nonparametric approach is based on Quantile Regression Forests with training and forecast stages and its code was built in R. The parametric approach presented more significant bias on its results, mostly due to the input data and the transposition model of irradiance from a horizontal surface to the plane of the PV array.
- Muñoz, J., O. Perpiñán,
**A Simple Model for the Prediction of Yearly Energy Yields for Grid-Connected PV Systems Starting from Monthly Meteorological Data**. Renewable Energy 97, 2016: 680–88. 10.1016/j.renene.2016.06.023 - This paper presents a simple model, called Clear-cloudy sky, which estimates yearly energy yields for PV systems starting from the twelve monthly values of global horizontal solar irradiation, diffuse fraction, Linke turbidity and minimum and maximum ambient temperatures. The proposed model has been included in an online and free-software simulator of PV systems, called SISIFO, which has been used to analyse the performance of the model in comparison with other synthetic models using as reference the typical meteorological years (TMY3) of more than two hundred Class I stations belonging to the NREL American National Solar Radiation database. The results of this comparison show that the model provides yearly predictions on PV system performance parameters that have low bias and uncertainty with respect to the same figures obtained with the original TMY3 hourly time series.
- M. Pinho Almeida, O. Perpiñán, L. Narvarte,
**PV Power Forecast Using a Nonparametric PV Model**. Solar Energy 115, 2015: 354–68. 10.1016/j.solener.2015.03.006 : pdf, code - Forecasting the AC power output of a PV plant accurately is important both for plant owners and electric system operators. Two main categories of PV modeling are available: the parametric and the nonparametric. In this paper, a methodology using a nonparametric PV model is proposed, using as inputs several forecasts of meteorological variables from a Numerical Weather Forecast model, and actual AC power measurements of PV plants. The methodology was built upon the R environment and uses Quantile Regression Forests as machine learning tool to forecast AC power with a confidence interval. Real data from five PV plants was used to validate the methodology, and results show that daily production is predicted with an absolute cvMBE lower than 1.3%.
- F. Antonanzas-Torres, Andres Sanz-Garcia, Javier Antonanzas-Torres, Oscar Perpiñán, and Francisco Javier Martínez-de-Pisón-Ascacibar.
**Current Status and Future Trends of the Evaluation of Solar Global Irradiation using Soft-Computing-Based Models**, Soft Computing Applications for Renewable Energy and Energy Efficiency. IGI Global, 2015. 1-22. 10.4018/978-1-4666-6631-3.ch001 - Most of the research on estimating Solar Global Irradiation (SGI) is based on the development of parametric models. However, methods based on statistics and machine-learning theories can provide a significant reduction in prediction errors. The chapter evaluates the performance of different Soft Computing (SC) methods, such as support vector regression and artificial neural networks-multilayer perceptron, in SGI modeling against classical parametric and linear models. SC methods demonstrate a higher generalization capacity applied to SGI modeling than classic parametric models. As a result, SC models offer an alternative to satellite-derived models to estimate SGI in near-to-present time in areas in which no pyranometers are installed nearby.
- F. Antonanzas-Torres, F.J. Martínez de Pisón, J. Antonanzas, O. Perpiñán,
**Downscaling of global solar irradiation in complex areas in R**, Journal of Renewable and Sustainable Energy, 6, 063105 (2014), 10.1063/1.4901539: pdf, code - A methodology for downscaling solar irradiation from satellite-derived databases is described using R software. Different packages such as raster, parallel, solaR, gstat, sp, and rasterVis are considered in this study for improving solar resource estimation in areas with complex topography, in which downscaling is a very useful tool for reducing inherent deviations in satellite-derived irradiation databases, which lack high global spatial resolution. A topographical analysis of horizon blocking and sky-view is developed with a digital elevation model to determine what fraction of hourly solar irradiation reaches the Earth’s surface. Finally, kriging with external drift is applied for a better estimation of solar irradiation throughout the region analyzed, including the use of local measurements. This methodology has been implemented as an example within the region of La Rioja in northern Spain. The mean absolute error found using the methodology proposed is 91.92 kWh/m² vs. 172.62 kWh/m² using the original satellite-derived database (a striking 46.75% lower). The code is freely available without restrictions for future replications or variations of the study at https://github.com/EDMANSolar/downscaling.
- F. Antonanzas-Torres, A. Sanz-Garcia, F. J. Martínez-de-Pisón, O. Perpiñán, J. Polo,
**Towards downscaling of aerosol gridded dataset for improving solar resource assessment. Application to Spain**, Renewable Energy, Volume 71, November 2014, Pages 534-544, ISSN 0960-1481, 10.1016/j.renene.2014.06.010: pdf - Solar radiation estimates with clear sky models require estimations of aerosol data. The low spatial resolution of current aerosol datasets, with their remarkable drift from measured data, poses a problem in solar resource estimation. This paper proposes a new downscaling methodology by combining support vector machines for regression (SVR) and kriging with external drift, with data from the MACC reanalysis datasets and temperature and rainfall measurements from 213 meteorological stations in continental Spain. The SVR technique was proven efficient in aerosol variable modeling. The Linke turbidity factor (TL) and the aerosol optical depth at 550 nm (AOD 550) estimated with SVR generated significantly lower errors in AERONET positions than MACC reanalysis estimates. The TL was estimated with relative mean absolute error (rMAE) of 10.2% (compared with AERONET), against the MACC rMAE of 18.5%. A similar behavior was seen with AOD 550, estimated with rMAE of 8.6% (compared with AERONET), against the MACC rMAE of 65.6%. Kriging using MACC data as external drift was found useful in generating high resolution maps (0.05° × 0.05°) of both aerosol variables. We created high resolution maps of aerosol variables in continental Spain for the year 2008. The proposed methodology was proven to be a valuable tool to create high resolution maps of aerosol variables (TL and AOD 550). This methodology shows meaningful improvements when compared with estimated available databases and therefore leads to more accurate solar resource estimations. This methodology could also be applied to the prediction of other atmospheric variables whose datasets are of low resolution.
- F. Antonanzas-Torres, A. Sanz-Garcia, F.J. Martínez-de-Pisón, O. Perpiñán,
**Evaluation and improvement of empirical models of global solar irradiation: Case study northern Spain**, Renewable Energy, Volume 60, December 2013, Pages 604-614, ISSN 0960-1481, 10.1016/j.renene.2013.06.008: pdf - This paper presents a new methodology to build parametric models to estimate global solar irradiation adjusted to specific on-site characteristics based on the evaluation of variable importance. Thus, those variables highly correlated to solar irradiation on a site are implemented in the model and therefore, different models might be proposed under different climates. This methodology is applied in a case study in La Rioja region (northern Spain). A new model is proposed and evaluated on stability and accuracy against a review of twenty-two already existing parametric models based on temperatures and rainfall in seventeen meteorological stations in La Rioja. The methodology of model evaluation is based on bootstrapping, which achieves a high level of confidence in model calibration and validation from short time series (in this case five years, from 2007 to 2011). The model proposed improves the estimates of the other twenty-two models with average mean absolute error (MAE) of 2.195 MJ/m² day and average confidence interval width (95% C.I., n=100) of 0.261 MJ/m² day. 41.65% of the daily residuals in the case of SIAR and 20.12% in that of SOS Rioja fall within the uncertainty tolerance of the pyranometers of the two networks (10% and 5%, respectively). Relative differences between measured and estimated irradiation on an annual cumulative basis are below 4.82%. Thus, the proposed model might be useful to estimate annual sums of global solar irradiation, reaching insignificant differences between measurements from pyranometers.
- F. Antoñanzas, F. Cañizares, O. Perpiñán,
**Comparative assessment of global irradiation from a satellite estimate model (CM SAF) and on-ground measurements (SIAR): a Spanish case study**, Renewable and Sustainable Energy Reviews, Volume 21, May 2013, Pages 248-261, 10.1016/j.rser.2012.12.033: pdf, code - An analysis and comparison of daily and yearly solar irradiation from the satellite CM SAF database and a set of 301 stations from the Spanish SIAR network is performed using data of 2010 and 2011. This analysis is completed with the comparison of the estimations of effective irradiation incident on three different tilted planes (fixed, two axis tracking, north-south horizontal axis) using irradiation from these two data sources. Finally, a new map of yearly values of irradiation both on the horizontal plane and on inclined planes is produced mixing both sources with geostatistical techniques (kriging with external drift, KED). The Mean Absolute Difference (MAD) between CM SAF and SIAR is approximately 4% for the irradiation on the horizontal plane and lies between 5% and 6% for the irradiation incident on the inclined planes. The MAD between KED and SIAR, and KED and CM SAF is approximately 3% for the irradiation on the horizontal plane and lies between 3% and 4% for the irradiation incident on the inclined planes. The methods have been implemented using free software, available as supplementary material, and the data sources are freely available without restrictions.
- O. Perpiñán, J. Marcos, E. Lorenzo,
**Electrical Power Fluctuations in a Network of DC/AC inverters in a Large PV Plant: relationship between correlation, distance and time scale**, Solar Energy, Volume 88, February 2013, 10.1016/j.solener.2012.1: pdf, code - This paper analyzes the correlation between the fluctuations of the electrical power generated by the ensemble of 70 DC/AC inverters from a 45.6 MW PV plant. The use of real electrical power time series from a large collection of photovoltaic inverters of the same plant is an important contribution in the context of models built upon simplified assumptions to overcome the absence of such data. This data set is divided into three different fluctuation categories with a clustering procedure which performs correctly with the clearness index and the wavelet variances. Afterwards, the time dependent correlation between the electrical power time series of the inverters is estimated with the wavelet transform. The wavelet correlation depends on the distance between the inverters, the wavelet time scales and the daily fluctuation level. Correlation values for time scales below one minute are low without dependence on the daily fluctuation level. For time scales above 20 minutes, positive high correlation values are obtained, and the decay rate with the distance depends on the daily fluctuation level. At intermediate time scales the correlation depends strongly on the daily fluctuation level.
- O. Perpiñán, M.A. Sánchez-Urán, F. Álvarez, J. Ortego, F. Garnacho,
**Signal analysis and feature generation for pattern identification of partial discharges in high-voltage equipment**, Electric Power Systems Research, 2013, 95:C (56-65), 10.1016/j.epsr.2012.08.016: pdf - This paper proposes a method for the identification of different partial discharge (PD) sources through the analysis of a collection of PD signals acquired with a PD measurement system. This method, robust and sensitive enough to cope with noisy data and external interferences, combines the characterization of each signal from the collection with a clustering procedure, the CLARA algorithm. Several features are proposed for the characterization of the signals, of which the wavelet variances, the frequency estimated with the Prony method, and the energy are the most relevant for the performance of the clustering procedure. The result of the unsupervised classification is a set of clusters each containing those signals which are more similar to each other than to those in other clusters. The analysis of the classification results permits both the identification of different PD sources and the discrimination between original PD signals, reflections, noise and external interferences.
- O. Perpiñán,
**solaR: Solar Radiation and Photovoltaic Systems with R**, Journal of Statistical Software, 2012. 50(9), (1-32): pdf and code - The `solaR` package allows for reproducible research both for photovoltaics systems performance and solar radiation. It includes a set of classes, methods and functions to calculate the sun geometry and the solar radiation incident on a photovoltaic generator and to simulate the performance of several applications of the photovoltaic energy. This package performs the whole calculation procedure from both daily and intradaily global horizontal irradiation to the final productivity of grid connected PV systems and water pumping PV systems. It is designed using a set of S4 classes whose core is a group of slots with multivariate time series. The classes share a variety of methods to access the information and several visualisation methods. In addition, the package provides a tool for the visual statistical analysis of the performance of a large PV plant composed of several systems. Although solaR is primarily designed for time series associated to a location defined by its latitude/longitude values and the temperature and irradiation conditions, it can be easily combined with spatial packages for space-time analysis.
- O. Perpiñán,
**Cost of energy and mutual shadows in a two-axis tracking PV system**, Renewable Energy, 2011, 10.1016/j.renene.2011.12.001: pdf, code - The performance improvement obtained from the use of trackers in a PV system cannot be separated from the higher requirement of land due to the mutual shadows between generators. Thus, the optimal choice of distances between trackers is a compromise between productivity and land use to minimize the cost of the energy produced by the PV system during its lifetime. This paper develops a method for the estimation and optimization of the cost of energy function. It is built upon a set of equations to model the mutual shadows geometry and a procedure for the optimal choice of the wire cross-section. Several examples illustrate the use of the method with a particular PV system under different conditions of land and equipment costs.
- O. Perpiñán and E. Lorenzo,
**Analysis and synthesis of the variability of irradiance and PV power time series with the wavelet transform**, Solar Energy, 85:1 (188-197), 2010, 10.1016/j.solener.2010.08.013: pdf (rev. 2011-12-26), code, data - The irradiance fluctuations and the subsequent variability of the power output of a PV system are analysed with some mathematical tools based on the wavelet transform. It can be shown that the irradiance and power time series are nonstationary processes whose behaviour resembles that of a long memory process. Besides, the long memory spectral exponent is a useful indicator of the fluctuation level of an irradiance time series. On the other hand, a time series of global irradiance on the horizontal plane can be simulated by means of the wavestrapping technique on the clearness index and the fluctuation behaviour of this simulated time series correctly resembles the original series. Moreover, a time series of global irradiance on the inclined plane can be simulated with the wavestrapping procedure applied over a signal previously detrended by a partial reconstruction with a wavelet multiresolution analysis, and, once again, the fluctuation behaviour of this simulated time series is correct. This procedure is a suitable tool for the simulation of irradiance incident over a group of distant PV plants. Finally, a wavelet variance analysis and the long memory spectral exponent show that a PV plant behaves as a low-pass filter.
- O. Perpiñán,
**Statistical analysis of the performance and simulation of a two-axis tracking PV system**, Solar Energy, 83:11(2074–2085), 2009, 10.1016/j.solener.2009.08.008: pdf - The energy produced by a photovoltaic system over a given period can be estimated from the incident radiation at the site where the Grid Connected PV System (GCPVS) is located, assuming knowledge of certain basic features of the system under study. Due to the inherently stochastic nature of solar radiation, the question “How much energy will a GCPVS produce at this location over the next few years?” involves an exercise of prediction inevitably subject to a degree of uncertainty. Moreover, during the life cycle of the GCPVS, another question arises: “Is the system working correctly?”. This paper proposes and examines several methods to cope with these questions. The daily performance of a PV system is simulated. This simulation and the interannual variability of both radiation and productivity are statistically analyzed. From the results several regression adjustments are obtained. This analysis is shown to be useful both for productivity prediction and performance checking exercises. Finally, a statistical analysis of the performance of a GCPVS is carried out as a detection method of malfunctioning parts of the system.
- O. Perpiñán, E. Lorenzo, M. A. Castro, and R. Eyras.
**Energy payback time of grid connected pv systems: comparison between tracking and fixed systems**. Progress in Photovoltaics: Research and Applications, 17:137-147, 2009: pdf - A review of existing studies about LCA of PV systems has been carried out. The data from this review have been completed with our own figures in order to calculate the Energy Payback Time of double and horizontal axis tracking and fixed systems. The results of this metric span from 2 to 5 years for the latitude and global irradiation ranges of the geographical area lying between -10° and 10° of longitude, and 30° and 45° of latitude. With the caution due to the uncertainty of the sources of information, these results mean that a GCPVS is able to produce back the energy required for its existence from 6 to 15 times during a life cycle of 30 years. When comparing tracking and fixed systems, the great importance of the PV generator makes it advisable to dedicate more energy to some components of the system in order to increase the productivity and to obtain a higher performance of the component with the highest energy requirement. Both double axis and horizontal axis trackers follow this approach, requiring more energy in metallic structure, foundations and wiring, but this higher contribution is widely compensated by the improved productivity of the system.
- O. Perpiñán, E. Lorenzo, M. A. Castro, and R. Eyras.
**On the complexity of radiation models for PV energy production calculation**. Solar Energy, 82:2 (125-131), 2008: pdf - Several authors have analysed the changes of the probability density function of the solar radiation with different time resolutions. Others have studied the significance of these changes when produced energy calculations are attempted. We have undertaken different transformations to four Spanish databases in order to clarify the interrelationship between radiation models and produced energy estimations. Our contribution is straightforward: the complexity of a solar radiation model needed for yearly energy calculations is very low. Twelve values of monthly mean of solar radiation are enough to estimate energy with errors below 3%. Time resolutions better than hourly samples do not significantly improve the result of energy estimations.
- O. Perpiñán, E. Lorenzo, and M. A. Castro.
**On the calculation of energy produced by a PV grid-connected system**. Progress in Photovoltaics: Research and Applications, 15(3):265–274, 2007: pdf - This study proposes a calculation method to estimate the energy produced by a PV grid-connected system making use of irradiance-domain integrals and the definition of statistical moments. Validation against a database of real PV plant performance data shows that acceptable energy estimation can be obtained with the first to fourth statistical moments and some basic system parameters. This way, simple calculations within the reach of pocket calculators are enough to estimate AC energy.

## Software

- `solaR`: Calculation methods of solar radiation and performance of photovoltaic systems from daily and intradaily irradiation data sources.
- `rasterVis`: Methods for enhanced visualization and interaction with raster data.
- `meteoForecast`: Provides access to forecasts published by NWP-WRF services using the NetCDF Subset Service.
- `PVF`: Non-parametric forecast of AC power produced by grid-connected PV systems. This package has been developed in the framework of the European Project PVCROPS.
- `tdr`: R implementation of Target Diagrams.
- `pxR`: Provides a set of functions for reading and writing PC-Axis files, used by different statistical organizations around the globe for data dissemination.
- `pdCluster`: Tools for feature generation, exploratory graphical analysis, clustering and variable importance quantification for partial discharge signals.

## Resources

- Meteorological Data Sources (wiki)
- Introducción a R (in Spanish)
- Gists
- Research project “Sistemas Fotovoltaicos en Redes de Distribución” (Photovoltaic Systems in Distribution Networks)