DataPreprocessing {fExtremes} | R Documentation |
A collection and description of functions for data
preprocessing of extreme values. This includes tools
to separate data beyond a threshold value, to compute
blockwise data like block maxima, and to decluster
point process data.
The functions are:
blockMaxima | Block Maxima from a vector or a time series, |
findThreshold | Upper threshold for a given number of extremes, |
pointProcess | Peaks over Threshold from a vector or a time series, |
deCluster | Declusters clustered point process data. |
blockMaxima(x, block = c("monthly", "quarterly"), doplot = FALSE)
findThreshold(x, n = floor(0.05*length(as.vector(x))), doplot = FALSE)
pointProcess(x, u = quantile(x, 0.95), doplot = FALSE)
deCluster(x, run = 20, doplot = TRUE)
block |
[blockMaxima] - the block size. A numeric value is interpreted as the number of data values in each successive block. All the data is used, so the last block may not contain block observations.
If the data has a times attribute holding the times/dates
of each observation (as an object of class "POSIXct" , or an
object that can be converted to that class; see as.POSIXct ),
then block may instead take the character values
"month" , "quarter" , "semester" or "year" .
By default monthly blocks from daily data are assumed.
|
doplot |
a logical value. Should the results be plotted? As the
usage shows, the default is FALSE for blockMaxima ,
findThreshold and pointProcess , and TRUE for deCluster .
|
n |
[findThreshold] - a numeric value or vector giving the number of extremes above the threshold. By default, n is
set to 5% of the number of observations in x ,
rounded down to an integer.
|
run |
[deCluster] - parameter to be used in the runs method; any two consecutive threshold exceedances separated by more than this number of observations/days are considered to belong to different clusters. |
u |
[pointProcess] - a numeric value giving the threshold level at which the data are truncated. By default the 95% quantile of the data is used, u=quantile(x,0.95) .
|
x |
[findThreshold][blocks][blockMaxima][deCluster] - a numeric data vector from which findThreshold and
blockMaxima determine the threshold values and block
maxima values.
For the function deCluster the argument
x represents a numeric vector of threshold exceedances
with a times attribute which should be a numeric
vector containing either the indices or the times/dates
of each exceedance (if times/dates, the attribute should
be an object of class "POSIXct" or an object that
can be converted to that class; see as.POSIXct ).
|
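The threshold argument u and the times attribute described for x can be illustrated with a small base R sketch. The helper pot_sketch below is hypothetical and is not the fExtremes implementation; it assumes that an exceedance is an observation strictly greater than u and that observation indices serve as the times attribute.

```r
# Hypothetical helper, not part of fExtremes: keep the observations that
# exceed a threshold u and record where they occurred.
pot_sketch <- function(x, u = quantile(x, 0.95)) {
  x <- as.numeric(x)
  idx <- which(x > u)              # assumption: strict exceedance of u
  structure(x[idx], times = idx)   # exceedances with their positions attached
}

obs <- c(1, 7, 3, 9, 2, 8)
pp <- pot_sketch(obs, u = 6)
as.vector(pp)      # the exceedance values
attr(pp, "times")  # the indices at which they occurred
```

A vector of this shape (exceedance values plus a numeric times attribute) is the kind of input the description of x gives for deCluster.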
Finding Thresholds:
The function findThreshold finds a threshold so that a given
number of extremes lie above it. When the data are tied, a threshold is
found so that at least the specified number of extremes lie above it.
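The treatment of ties can be made concrete with a short base R sketch. The function find_threshold_sketch is a hypothetical illustration, not the package code; it assumes that "above" means strictly greater than the threshold, and it returns the largest distinct data value that leaves at least n observations above it.

```r
# Hypothetical helper, not part of fExtremes: for each requested count n,
# return a data value with at least n observations strictly above it.
find_threshold_sketch <- function(x, n) {
  sorted <- sort(as.numeric(x), decreasing = TRUE)
  distinct <- unique(sorted)            # distinct values, still descending
  idx <- match(sorted[n], distinct)     # position of the n-th largest value
  # Step down to the next smaller distinct value (capped at the minimum)
  distinct[pmin(idx + 1, length(distinct))]
}

obs <- c(1, 2, 2, 3, 5, 8, 8, 9, 10, 12)
thr <- find_threshold_sketch(obs, n = c(3, 4))
thr
sum(obs > thr[2])   # tie at 8: more than the 4 requested values lie above
```

With the tied value 8 in the data, no threshold leaves exactly four observations above it, so the sketch settles on one that leaves five.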
Computing Block Maxima:
The function blockMaxima calculates block maxima from a vector
or a time series, whereas the function blocks is more general
and allows for the calculation of an arbitrary function FUN on blocks.
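For a numeric block size, the blockwise calculation can be sketched in base R as follows. The name block_apply_sketch and its argument layout are assumptions made for illustration; the sketch only mirrors the documented behaviour that all data are used and the last block may be shorter than block.

```r
# Hypothetical helper, not part of fExtremes: apply FUN to successive
# blocks of `block` observations; the final block may be incomplete.
block_apply_sketch <- function(x, block, FUN = max) {
  x <- as.numeric(x)
  grp <- ceiling(seq_along(x) / block)  # block number of each observation
  as.vector(tapply(x, grp, FUN))
}

obs <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
bmax <- block_apply_sketch(obs, block = 4)             # blocks of 4, 4 and 2 values
bmin <- block_apply_sketch(obs, block = 4, FUN = min)  # any summary function works
bmax
bmin
```

Block minima can equally be obtained as the negated maxima of -obs, which is the device used in the examples for the left tail.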
De-Clustering Point Processes:
The function deCluster declusters clustered point process
data so that the Poisson assumption is more tenable over a high threshold.
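The runs method controlled by the run argument can be sketched in base R. The helper decluster_sketch is hypothetical and not the package code. It assumes the exceedance times are sorted, and it keeps the largest exceedance of each cluster, which is a common convention but an assumption here, since this page does not say which value represents a cluster.

```r
# Hypothetical helper, not part of fExtremes: runs declustering.
# Exceedances separated by more than `run` time units start a new cluster.
decluster_sketch <- function(values, times, run) {
  stopifnot(length(values) >= 1,
            length(values) == length(times),
            !is.unsorted(times))
  cluster_id <- cumsum(c(TRUE, diff(times) > run))
  # Assumption: each cluster is represented by its largest exceedance
  keep <- as.vector(tapply(seq_along(values), cluster_id,
                           function(i) i[which.max(values[i])]))
  list(values = values[keep], times = times[keep])
}

exc_times  <- c(3, 5, 6, 40, 41, 90)
exc_values <- c(2.1, 3.4, 2.8, 5.0, 4.2, 2.5)
dc <- decluster_sketch(exc_values, exc_times, run = 20)
dc$values
dc$times
```

Here the six exceedances fall into three clusters (times 3 to 6, 40 to 41, and 90), so three values remain after declustering.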
blockMaxima returns a timeSeries object or a numeric vector of block
maxima data.
findThreshold returns a numeric value or vector of suitable thresholds.
pointProcess returns a timeSeries object or a numeric vector of peaks over
a threshold.
deCluster returns a timeSeries object or a numeric vector for the
declustered point process.
Some of the functions were implemented from Alec Stephenson's
R-package evir, ported from Alexander McNeil's S library
EVIS (Extreme Values in S); some from Alec Stephenson's
R-package ismev, based on Stuart Coles' code from his book,
Introduction to Statistical Modeling of Extreme Values; and
some were written by Diethelm Wuertz.
Coles, S. (2001); An Introduction to Statistical Modeling of Extreme Values, Springer.
Embrechts, P., Klueppelberg, C., Mikosch, T. (1997); Modelling Extremal Events, Springer.
## findThreshold -
   # Thresholds giving (at least) 10, 50 and 100 exceedances for Danish data
   x = as.timeSeries(data(danishClaims))
   findThreshold(x, n = c(10, 50, 100))

## blockMaxima -
   # Block Maxima (Minima) for left tail of BMW log returns:
   BMW = as.timeSeries(data(bmwRet))
   colnames(BMW) = "BMW.RET"
   head(BMW)
   x = blockMaxima( BMW, block = 65)
   head(x)
   y = blockMaxima(-BMW, block = 65)
   head(y)

## deCluster -
   # Decluster the 200 exceedances of a particular
   # threshold in the negative BMW log-return data
   PP = pointProcess(x = -BMW, u = quantile(as.vector(x), 0.75))
   PP
   dim(PP)
   DC = deCluster(x = PP, run = 15, doplot = TRUE)
   DC
   dim(DC)