Oscar Perpiñán Lamigueiro
Photovoltaic and solar energy lecturer (EOI). Electrical engineering assistant professor (UPM). R, LaTeX, Emacs enthusiast. Always learning.
Books
 Displaying time series, spatial and space-time data with R (code and figures)
 A data graphic is not only a static image, but it also tells a story about the data. It activates cognitive processes that are able to detect patterns and discover information not readily available with the raw data. This is particularly true for time series, spatial, and space-time datasets. Focusing on the exploration of data with visual methods, “Displaying Time Series, Spatial, and Space-Time Data with R” presents methods and R code for producing high-quality graphics of time series, spatial, and space-time data. Practical examples using real-world datasets help you understand how to apply the methods and code. The book illustrates how to display a dataset starting with an easy and direct approach and progressively adding improvements that involve more complexity. Each of the book’s three parts is devoted to different types of data. In each part, the chapters are grouped according to the various visualization methods or data characteristics. Along with the main graphics from the text, the book’s website offers access to the datasets used in the examples as well as the full R code. This combination of freely available code and data enables you to practice with the methods and modify the code to suit your own needs.
 Energía Solar Fotovoltaica (in Spanish)
 This book consists of several chapters covering the most important aspects of this technology from a systems engineering point of view: solar geometry and radiation, photovoltaic devices, modules and generators, the three main applications (grid-connected systems, stand-alone rural electrification systems, and stand-alone water pumping systems), electrical safety in photovoltaic systems, and energy payback time. It includes a set of solved exercises for each chapter. All the material is available in this GitHub repository. The book matured in the context of the Máster de Energías Renovables y Mercado Energético at EOI and the professional expert courses taught by the Departamento de Ingeniería Eléctrica, Electrónica y de Control of ETSII-UNED. An adapted version of this book was published in 2012 by the publisher Progensa under the title “Diseño de Sistemas Fotovoltaicos” with a Creative Commons license.
Papers
 F. Antonanzas-Torres, A. Sanz-Garcia, F. J. Martínez-de-Pisón, O. Perpiñán, J. Polo, Towards downscaling of aerosol gridded dataset for improving solar resource assessment. Application to Spain, Renewable Energy, Volume 71, November 2014, Pages 534-544, ISSN 0960-1481, 10.1016/j.renene.2014.06.010: pdf

Solar radiation estimates with clear sky models require estimations of aerosol data. The low spatial resolution of current aerosol datasets, with their remarkable drift from measured data, poses a problem in solar resource estimation. This paper proposes a new downscaling methodology by combining support vector machines for regression (SVR) and kriging with external drift, with data from the MACC reanalysis datasets and temperature and rainfall measurements from 213 meteorological stations in continental Spain. The SVR technique was proven efficient in aerosol variable modeling. The Linke turbidity factor (TL) and the aerosol optical depth at 550 nm (AOD 550) estimated with SVR generated significantly lower errors in AERONET positions than MACC reanalysis estimates. The TL was estimated with relative mean absolute error (rMAE) of 10.2% (compared with AERONET), against the MACC rMAE of 18.5%. A similar behavior was seen with AOD 550, estimated with rMAE of 8.6% (compared with AERONET), against the MACC rMAE of 65.6%. Kriging using MACC data as external drift was found useful in generating high resolution maps (0.05° × 0.05°) of both aerosol variables. We created high resolution maps of aerosol variables in continental Spain for the year 2008. The proposed methodology was proven to be a valuable tool to create high resolution maps of aerosol variables (TL and AOD 550). This methodology shows meaningful improvements when compared with estimated available databases and therefore leads to more accurate solar resource estimations. This methodology could also be applied to the prediction of other atmospheric variables whose datasets are of low resolution.
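The rMAE figures quoted above can be illustrated with a minimal Python sketch. The abstract does not give the formula, so this assumes that rMAE is the mean absolute error divided by the mean of the reference measurements. The function name `relative_mae` and the sample values are hypothetical and are not taken from the paper.

```python
def relative_mae(estimates, measurements):
    """Relative MAE: mean |estimate - measurement| divided by the mean measurement.

    Assumption: normalisation by the mean of the reference measurements;
    the paper may define rMAE differently.
    """
    if len(estimates) != len(measurements) or not measurements:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measurements)
    mae = sum(abs(e - m) for e, m in zip(estimates, measurements)) / n
    return mae / (sum(measurements) / n)


# Hypothetical AOD 550 values at four reference sites (illustrative only).
measured = [0.10, 0.20, 0.30, 0.40]
estimated = [0.12, 0.18, 0.33, 0.37]

rmae = relative_mae(estimated, measured)
print(f"rMAE = {100 * rmae:.1f}%")
```

With a metric of this kind, a lower value means the estimates sit closer to the reference stations, which is how the SVR and MACC figures are compared in the abstract.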
 F. Antonanzas-Torres, F.J. Martínez de Pisón, J. Antonanzas, O. Perpiñán, Downscaling of global solar irradiation in R, November 2013, arXiv:1311.7235, code

A methodology for downscaling solar irradiation from satellite-derived databases is described using R software. Different packages such as raster, parallel, solaR, gstat, sp and rasterVis are considered in this study for improving solar resource estimation in areas with complex topography, where downscaling is a very useful tool for reducing the inherent deviations of satellite-derived irradiation databases, which lack high spatial resolution. A topographical analysis of horizon blocking and sky-view is developed with a digital elevation model to determine what fraction of hourly solar irradiation reaches the Earth’s surface. Finally, kriging with external drift is applied for a better estimation of solar irradiation throughout the region analyzed. This methodology has been implemented as an example within the region of La Rioja in northern Spain, and the mean absolute error found is a striking 25.5% lower than with the original database.
 F. Antonanzas-Torres, A. Sanz-Garcia, F.J. Martínez-de-Pisón, O. Perpiñán, Evaluation and improvement of empirical models of global solar irradiation: Case study northern Spain, Renewable Energy, Volume 60, December 2013, Pages 604-614, ISSN 0960-1481, 10.1016/j.renene.2013.06.008: pdf

This paper presents a new methodology to build parametric models to estimate global solar irradiation adjusted to specific on-site characteristics based on the evaluation of variable importance. Thus, those variables highly correlated to solar irradiation on a site are implemented in the model and, therefore, different models might be proposed under different climates. This methodology is applied in a case study in the La Rioja region (northern Spain). A new model is proposed and evaluated on stability and accuracy against a review of twenty-two already existing parametric models based on temperatures and rainfall in seventeen meteorological stations in La Rioja. The methodology of model evaluation is based on bootstrapping, which makes it possible to achieve a high level of confidence in model calibration and validation from short time series (in this case five years, from 2007 to 2011). The model proposed improves the estimates of the other twenty-two models with average mean absolute error (MAE) of 2.195 MJ/m² day and average confidence interval width (95% C.I., n=100) of 0.261 MJ/m² day. 41.65% of the daily residuals in the case of SIAR and 20.12% in that of SOS Rioja fall within the uncertainty tolerance of the pyranometers of the two networks (10% and 5%, respectively). Relative differences between measured and estimated irradiation on an annual cumulative basis are below 4.82%. Thus, the proposed model might be useful to estimate annual sums of global solar irradiation, with insignificant differences from pyranometer measurements.
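The idea of attaching a bootstrap confidence interval to an error metric can be sketched in a few lines of Python. This is a simplified illustration, not the paper's procedure: it assumes a plain percentile bootstrap over paired daily values with 100 resamples, and the function names and sample data are hypothetical.

```python
import random


def mean_absolute_error(measured, estimated):
    """Mean of |measured - estimated| over paired values."""
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / len(measured)


def bootstrap_mae_interval(measured, estimated, n_resamples=100, level=0.95, seed=0):
    """Percentile bootstrap interval for the MAE (simplified sketch).

    Pairs are resampled with replacement; the interval bounds are read
    from the sorted bootstrap MAE values.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(measured)
    maes = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        maes.append(mean_absolute_error([measured[i] for i in idx],
                                        [estimated[i] for i in idx]))
    maes.sort()
    tail = (1.0 - level) / 2.0
    lower = maes[int(tail * (n_resamples - 1))]
    upper = maes[int((1.0 - tail) * (n_resamples - 1))]
    return lower, upper


# Hypothetical daily irradiation values in MJ/m² day (illustrative only).
measured = [18.2, 21.5, 9.8, 25.1, 14.3, 20.0, 11.7, 23.4]
estimated = [16.9, 23.0, 11.2, 24.0, 15.5, 18.1, 12.9, 21.8]

low, high = bootstrap_mae_interval(measured, estimated)
print(f"MAE = {mean_absolute_error(measured, estimated):.3f}, "
      f"95% interval width = {high - low:.3f}")
```

A narrow interval suggests that the error estimate is stable under resampling, which is the sense in which the abstract reports an average confidence interval width next to the MAE.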
 F. Antoñanzas, F. Cañizares, O. Perpiñán, Comparative assessment of global irradiation from a satellite estimate model (CM SAF) and on-ground measurements (SIAR): a Spanish case study, Renewable and Sustainable Energy Reviews, Volume 21, May 2013, Pages 248-261, 10.1016/j.rser.2012.12.033: pdf, code

An analysis and comparison of daily and yearly solar irradiation from the satellite CM SAF database and a set of 301 stations from the Spanish SIAR network is performed using data from 2010 and 2011. This analysis is completed with the comparison of the estimations of effective irradiation incident on three different tilted planes (fixed, two-axis tracking, north-south horizontal axis) using irradiation from these two data sources. Finally, a new map of yearly values of irradiation both on the horizontal plane and on inclined planes is produced by mixing both sources with geostatistical techniques (kriging with external drift, KED). The Mean Absolute Difference (MAD) between CM SAF and SIAR is approximately 4% for the irradiation on the horizontal plane and lies between 5% and 6% for the irradiation incident on the inclined planes. The MAD between KED and SIAR, and between KED and CM SAF, is approximately 3% for the irradiation on the horizontal plane and lies between 3% and 4% for the irradiation incident on the inclined planes. The methods have been implemented using free software, available as supplementary material, and the data sources are freely available without restrictions.
 O. Perpiñán, J. Marcos, E. Lorenzo, Electrical Power Fluctuations in a Network of DC/AC inverters in a Large PV Plant: relationship between correlation, distance and time scale, Solar Energy, Volume 88, February 2013, 10.1016/j.solener.2012.1: pdf, code

This paper analyzes the correlation between the fluctuations of the electrical power generated by the ensemble of 70 DC/AC inverters from a 45.6 MW PV plant. The use of real electrical power time series from a large collection of photovoltaic inverters of the same plant is an important contribution in the context of models built upon simplified assumptions to overcome the absence of such data. This data set is divided into three different fluctuation categories with a clustering procedure which performs correctly with the clearness index and the wavelet variances. Afterwards, the time-dependent correlation between the electrical power time series of the inverters is estimated with the wavelet transform. The wavelet correlation depends on the distance between the inverters, the wavelet time scales and the daily fluctuation level. Correlation values for time scales below one minute are low and do not depend on the daily fluctuation level. For time scales above 20 minutes, high positive correlation values are obtained, and the decay rate with the distance depends on the daily fluctuation level. At intermediate time scales the correlation depends strongly on the daily fluctuation level.
 O. Perpiñán, M.A. Sánchez-Urán, F. Álvarez, J. Ortego, F. Garnacho, Signal analysis and feature generation for pattern identification of partial discharges in high-voltage equipment, Electric Power Systems Research, 2013, 95:C (56-65), 10.1016/j.epsr.2012.08.016: pdf

This paper proposes a method for the identification of different partial discharge (PD) sources through the analysis of a collection of PD signals acquired with a PD measurement system. This method, robust and sensitive enough to cope with noisy data and external interferences, combines the characterization of each signal from the collection with a clustering procedure, the CLARA algorithm. Several features are proposed for the characterization of the signals; the wavelet variances, the frequency estimated with the Prony method, and the energy are the most relevant for the performance of the clustering procedure. The result of the unsupervised classification is a set of clusters, each containing those signals which are more similar to each other than to those in other clusters. The analysis of the classification results permits both the identification of different PD sources and the discrimination between original PD signals, reflections, noise and external interferences.
 O. Perpiñán, solaR: Solar Radiation and Photovoltaic Systems with R, Journal of Statistical Software, 2012. 50(9), (1-32): pdf and code
 The solaR package allows for reproducible research both for photovoltaic systems performance and solar radiation. It includes a set of classes, methods and functions to calculate the sun geometry and the solar radiation incident on a photovoltaic generator and to simulate the performance of several applications of photovoltaic energy. This package performs the whole calculation procedure from both daily and intradaily global horizontal irradiation to the final productivity of grid-connected PV systems and water pumping PV systems. It is designed using a set of S4 classes whose core is a group of slots with multivariate time series. The classes share a variety of methods to access the information and several visualisation methods. In addition, the package provides a tool for the visual statistical analysis of the performance of a large PV plant composed of several systems. Although solaR is primarily designed for time series associated with a location defined by its latitude/longitude values and the temperature and irradiation conditions, it can be easily combined with spatial packages for space-time analysis.
 O. Perpiñán, Cost of energy and mutual shadows in a two-axis tracking PV system, Renewable Energy, 2011, 10.1016/j.renene.2011.12.001: pdf, code

The performance improvement obtained from the use of trackers in a PV system cannot be separated from the higher requirement of land due to the mutual shadows between generators. Thus, the optimal choice of distances between trackers is a compromise between productivity and land use to minimize the cost of the energy produced by the PV system during its lifetime. This paper develops a method for the estimation and optimization of the cost of energy function. It is built upon a set of equations to model the mutual shadows geometry and a procedure for the optimal choice of the wire cross-section. Several examples illustrate the use of the method with a particular PV system under different conditions of land and equipment costs.
 O. Perpiñán and E. Lorenzo, Analysis and synthesis of the variability of irradiance and PV power time series with the wavelet transform, Solar Energy, 85:1 (188-197), 2010, 10.1016/j.solener.2010.08.013: pdf (rev. 2011-12-26), code, data
 The irradiance fluctuations and the subsequent variability of the power output of a PV system are analysed with some mathematical tools based on the wavelet transform. It can be shown that the irradiance and power time series are nonstationary processes whose behaviour resembles that of a long memory process. Besides, the long memory spectral exponent is a useful indicator of the fluctuation level of an irradiance time series. On the other hand, a time series of global irradiance on the horizontal plane can be simulated by means of the wavestrapping technique on the clearness index, and the fluctuation behaviour of this simulated time series correctly resembles the original series. Moreover, a time series of global irradiance on the inclined plane can be simulated with the wavestrapping procedure applied over a signal previously detrended by a partial reconstruction with a wavelet multiresolution analysis, and, once again, the fluctuation behaviour of this simulated time series is correct. This procedure is a suitable tool for the simulation of irradiance incident over a group of distant PV plants. Finally, a wavelet variance analysis and the long memory spectral exponent show that a PV plant behaves as a low-pass filter.
 O. Perpiñán, Statistical analysis of the performance and simulation of a two-axis tracking PV system, Solar Energy, 83:11 (2074–2085), 2009, 10.1016/j.solener.2009.08.008: pdf
 The energy produced by a photovoltaic system over a given period can be estimated from the incident radiation at the site where the Grid Connected PV System (GCPVS) is located, assuming knowledge of certain basic features of the system under study. Due to the inherently stochastic nature of solar radiation, the question “How much energy will a GCPVS produce at this location over the next few years?” involves an exercise of prediction inevitably subject to a degree of uncertainty. Moreover, during the life cycle of the GCPVS, another question arises: “Is the system working correctly?”. This paper proposes and examines several methods to cope with these questions. The daily performance of a PV system is simulated. This simulation and the interannual variability of both radiation and productivity are statistically analyzed. From the results several regression adjustments are obtained. This analysis is shown to be useful both for productivity prediction and performance checking exercises. Finally, a statistical analysis of the performance of a GCPVS is carried out as a detection method of malfunctioning parts of the system.
 O. Perpiñán, E. Lorenzo, M. A. Castro, and R. Eyras. Energy payback time of grid connected PV systems: comparison between tracking and fixed systems. Progress in Photovoltaics: Research and Applications, 17:137-147, 2009: pdf
 A review of existing studies about LCA of PV systems has been carried out. The data from this review have been completed with our own figures in order to calculate the Energy Payback Time of double and horizontal axis tracking and fixed systems. The results of this metric span from 2 to 5 years for the latitude and global irradiation ranges of the geographical area between −10° and 10° of longitude, and 30° and 45° of latitude. With the caution due to the uncertainty of the sources of information, these results mean that a GCPVS is able to produce back the energy required for its existence from 6 to 15 times during a life cycle of 30 years. When comparing tracking and fixed systems, the great importance of the PV generator makes it advisable to dedicate more energy to some components of the system in order to increase the productivity and to obtain a higher performance of the component with the highest energy requirement. Both double axis and horizontal axis trackers follow this approach, requiring more energy in metallic structure, foundations and wiring, but this higher contribution is widely compensated by the improved productivity of the system.
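The link between the payback time range and the "6 to 15 times" figure is simple arithmetic: the number of times a system returns its embodied energy is its lifetime divided by its energy payback time. A minimal Python sketch, where the function name `energy_return_count` is hypothetical:

```python
def energy_return_count(lifetime_years, payback_years):
    """Number of times a system pays back its embodied energy over its lifetime."""
    if payback_years <= 0:
        raise ValueError("payback time must be positive")
    return lifetime_years / payback_years


LIFETIME_YEARS = 30  # life cycle stated in the abstract

# The two ends of the payback time range stated in the abstract (years).
for epbt in (2, 5):
    count = energy_return_count(LIFETIME_YEARS, epbt)
    print(f"EPBT = {epbt} years -> energy returned {count:g} times")
```

A payback time of 2 years gives 15 returns and a payback time of 5 years gives 6, matching the range stated in the abstract.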
 O. Perpiñán, E. Lorenzo, M. A. Castro, and R. Eyras. On the complexity of radiation models for PV energy production calculation. Solar Energy, 82:2 (125-131), 2008: pdf
 Several authors have analysed the changes of the probability density function of the solar radiation with different time resolutions. Others have studied the significance of these changes when produced energy calculations are attempted. We have applied different transformations to four Spanish databases in order to clarify the interrelationship between radiation models and produced energy estimations. Our contribution is straightforward: the complexity of a solar radiation model needed for yearly energy calculations is very low. Twelve values of monthly mean of solar radiation are enough to estimate energy with errors below 3%. Time resolutions better than hourly samples do not significantly improve the result of energy estimations.
 O. Perpiñán, E. Lorenzo, and M. A. Castro. On the calculation of energy produced by a PV grid-connected system. Progress in Photovoltaics: Research and Applications, 15(3):265–274, 2007: pdf
 This study develops a calculation method for estimating the energy produced by a PV grid-connected system, making use of irradiance-domain integrals and the definition of statistical moments. Validation against a database of real PV plant performance data shows that an acceptable energy estimate can be obtained with the first to fourth statistical moments and some basic system parameters. Thus, simple calculations within the reach of a pocket calculator are enough to estimate AC energy.
Software

solaR
 Calculation methods of solar radiation and performance of photovoltaic systems from daily and intradaily irradiation data sources.

rasterVis
 Complements raster, providing a set of methods for enhanced visualization and interaction.
meteoForecast
 Provides access to forecasts published by NWP-WRF services using the NetCDF Subset Service.

pxR
 Provides a set of functions for reading and writing PC-Axis files, used by different statistical organizations around the globe for data dissemination.

pdCluster
 Tools for feature generation, exploratory graphical analysis, clustering and variable importance quantification for partial discharge signals.
Resources
 Meteorological Data Sources (wiki)
 Introducción a R (in Spanish)
 Gists