pdCluster: Partial Discharges Clustering
Table of Contents
Partial discharge measurements analysis may determine the existence of defects. This package provides several tools for feature generation, exploratory graphical analysis, clustering and variable importance quantification for partial discharge signals.
More details available in this paper:
O. Perpiñán, M.A. Sánchez-Urán, F. Álvarez, J. Ortego, F. Garnacho,
Signal analysis and feature generation for pattern identification of partial discharges in high-voltage equipment,
Electric Power Systems Research, 2013, 95:C (56-65), 10.1016/j.epsr.2012.08.016
The development repository of pdCluster
is here. It can be installed
with:
remotes::install_github("oscarperpinan/pdcluster")
Along this webpage you will find some examples using some real datasets.
library(pdCluster)
1 The Prony’s method
A clean partial discharge signal can be regarded as a finite combination of damped complex exponentials. Under this assumption, the so-called Prony’s method allows for the estimation of frequency, amplitude, phase and damping components of the signal.
We have a collection of signals in a list
named signalList
(download).
load('data/signalList.RData')
The signals contain zeros at the beginning and at the
end. The no0
function can remove these parts.
xyplot(signalList, y.same=NA, FUN=function(x){xyplot(ts(no0(x)))})
With these cleaned signals the Prony’s method can provide their components.
signal <- signalList[[3]] pr <- prony(signal, M=10) xyplot(pr)
Since the number of components must be fixed \a priori\,
the function compProny
allows the comparison of different numbers:
compProny(signal, M=c(10, 20, 30, 40))
2 Feature generation
pdCluster
includes several functions for feature
generation. The analysis
function comprises all of them. The
results for our example signal are:
analysis(signal)
This function can be used with a list of signals in order to obtain a matrix of features:
analysisList <- lapply(signalList[1:10], analysis) pdData <- do.call(rbind, analysisList)
Now we need the angle and reflection information, available from
another different dataset (named pdSummary
, download).
load('data/pdSummary.RData')
In order to safely share the information, both data frames must be reordered by their energy values:
idxOrderSummary=order(pdSummary$sumaCuadrados) idxOrderData=order(pdData$energy) pdDataOrdered=cbind(pdData[idxOrderData,], pdSummary[idxOrderSummary,c('angulo', 'separacionOriginal')])
Later, the data frame to be used with the clustering algorithm has to
ordered by time. Thus the samples of the clara
method will
be random.
idx <- do.call(order, pdSummary[idxOrderSummary, c('segundo', 'inicio')]) pdDataOrdered <- pdDataOrdered[idx,]
We can now construct a PD
object. (The
pdCluster
package is designed with S4 classes and
methods. Two classes have been defined: PD
and PDCluster
).
pd <- df2PD(pdDataOrdered)
The results of analysis
to the whole dataset are available here.
load('data/dfHibr.RData') dfHibr <- df2PD(dfHibr)
3 Transformations
Prior to the clustering algorithm, the feature matrix has to be filtered:
dfFilter <- filterPD(dfHibr)
and transformed:
dfTrans <- transformPD(dfFilter)
The next figure compares the datasets after and before of the transformations:
nZCbefore <- as.data.frame(dfFilter)$nZC nZCafter <- as.data.frame(dfTrans)$nZC comp <- data.frame(After=nZCafter, Before=nZCbefore)
histogram(~After+Before, data=comp, scales=list(x=list(relation='free'), y=list(relation='free', draw=FALSE)), breaks=100, col='gray', xlab='', strip.names=c(TRUE, TRUE), bg='gray', fg='darkblue')
4 Graphical tools
The pdCluster
packages includes a set of graphical exploratory
tools, such as a scatterplot matrices with hexagonal binning, density
plots histograms or phase resolved partial discharge patterns, both
with partial transparency or hexagonal binning.
splom(dfTrans)
densityplot(dfTrans)
histogram(dfTrans)
xyplot(dfTrans)
hexbinplot(dfTrans)
5 Clustering
The filtered and transformed object can now be used with the clustering algorithm. The results are displayed with a phase resolved pattern with clusters in separate panels. The colors encode the distance of each point to the medoid of its cluster. The displays the same pattern with superposed clusters. Here the colors encode the membership to a certain cluster, and transparency is used to denote density of points in a region.
The results can be easily understood with the density plots of each cluster and feature or with the histograms .
dfTransCluster <- claraPD(dfTrans, noise.rm = FALSE)
xyplot(dfTransCluster)
xyplot(dfTransCluster, panelClust=FALSE)
histogram(dfTransCluster)
densityplot(dfTransCluster)