PCA-MVE for JAVA v. 1.1: Minimum Volume Ellipsoid with Principal Component
Analysis extension.
© 2004 Kim van der Linde,
Florida State University
Usage of the program is free, but citation is appreciated.
The underlying JAVA classes and source code are licensed under the
GNU General Public License.
See the Applied Usage of the Minimum-Volume Ellipsoid paper for more details.
This method is used for robust outlier detection in multivariate space. The
algorithm takes subsamples of the dataset and calculates the volume of the
ellipsoid representing each subsample. Outliers increase the volume
dramatically, so the Minimum Volume Ellipsoid (MVE) corresponds to the actual
core of the dataset. Data points whose Mahalanobis distance, measured relative
to the MVE, exceeds the cut-off value are then designated outliers.
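As a rough illustration of the procedure described above, the following Java
sketch handles the two-variable case only. It is not the program's own code.
The class name MveSketch, the sample data, the exhaustive enumeration of
three-point subsamples, the inflation of each ellipse until it covers
(n + p + 1) / 2 points, the rescaling by the chi-square median, the 97.5%
chi-square cut-off and the singularity threshold are all assumptions chosen for
illustration, loosely following the resampling idea in Rousseeuw & Leroy (1987).

```java
import java.util.Arrays;

final class MveSketch {
    // Chi-square quantiles for 2 degrees of freedom have the closed form -2 ln(1 - p).
    static final double CHI2_MEDIAN = -2.0 * Math.log(0.5);   // 50% quantile, about 1.386
    static final double CHI2_CUTOFF = -2.0 * Math.log(0.025); // 97.5% quantile (assumed cut-off)

    // Illustrative data: nine core points plus one gross outlier in the last row.
    static final double[][] SAMPLE = {
        {1.0, 1.1}, {1.2, 0.9}, {0.9, 1.0}, {1.1, 1.3}, {0.8, 0.8},
        {1.3, 1.1}, {1.0, 0.7}, {0.7, 1.2}, {1.2, 1.2}, {15.0, -10.0}
    };

    /** Squared Mahalanobis distance of p from (mx, my) under [[sxx, sxy], [sxy, syy]]. */
    static double dist2(double[] p, double mx, double my,
                        double sxx, double sxy, double syy) {
        double dx = p[0] - mx, dy = p[1] - my;
        double det = sxx * syy - sxy * sxy;
        return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det;
    }

    /** Returns one flag per row; true marks a point beyond the cut-off. */
    static boolean[] flagOutliers(double[][] data) {
        int n = data.length;
        int h = (n + 3) / 2; // points the ellipse must cover: (n + p + 1) / 2 with p = 2
        double bestVolume = Double.POSITIVE_INFINITY;
        double[] bestD2 = null; // rescaled squared distances for the best subsample

        // Try every subsample of p + 1 = 3 points (feasible only for small n).
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                for (int k = j + 1; k < n; k++) {
                    double[][] sub = {data[i], data[j], data[k]};
                    double mx = 0, my = 0;
                    for (double[] p : sub) { mx += p[0] / 3.0; my += p[1] / 3.0; }
                    double sxx = 0, sxy = 0, syy = 0;
                    for (double[] p : sub) {
                        sxx += (p[0] - mx) * (p[0] - mx) / 2.0;
                        sxy += (p[0] - mx) * (p[1] - my) / 2.0;
                        syy += (p[1] - my) * (p[1] - my) / 2.0;
                    }
                    double det = sxx * syy - sxy * sxy;
                    if (det <= 1e-12) continue; // collinear subsample: singular covariance

                    double[] d2 = new double[n];
                    for (int r = 0; r < n; r++) {
                        d2[r] = dist2(data[r], mx, my, sxx, sxy, syy);
                    }
                    double[] sorted = d2.clone();
                    Arrays.sort(sorted);
                    double m2 = sorted[h - 1]; // inflation needed to cover h points

                    double volume = Math.sqrt(det) * m2; // proportional to the ellipse area
                    if (volume < bestVolume) {
                        bestVolume = volume;
                        // Rescale so distances are comparable with chi-square quantiles.
                        for (int r = 0; r < n; r++) d2[r] *= CHI2_MEDIAN / m2;
                        bestD2 = d2;
                    }
                }
            }
        }
        if (bestD2 == null) throw new IllegalArgumentException("no non-singular subsample");
        boolean[] flags = new boolean[n];
        for (int r = 0; r < n; r++) flags[r] = bestD2[r] > CHI2_CUTOFF;
        return flags;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(flagOutliers(SAMPLE)));
    }
}
```

Enumerating every subsample keeps the sketch deterministic but scales poorly.
For larger datasets a random selection of subsamples would be the usual choice.
Note that one of the three-point subsamples in the sample data is collinear and
is skipped because its covariance matrix is singular.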
This version first performs a Principal Component Analysis (PCA) and estimates
the eigenvalues. Only the factor scores based on PCs with non-zero eigenvalues
are used for the MVE procedure. This effectively eliminates the problems the
original version of MVE has with singular matrices.
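To illustrate why dropping components with zero eigenvalues avoids singular
matrices, here is a minimal Java sketch for two variables. It is an
illustration under stated assumptions, not the program's implementation: the
class name PcaRankSketch, the closed-form eigenvalue formula (valid only for a
2x2 covariance matrix) and the relative tolerance used to decide what counts as
"zero" are all hypothetical choices.

```java
final class PcaRankSketch {
    // Assumption: eigenvalues below this fraction of the largest one count as zero.
    static final double TOLERANCE = 1e-10;

    /** Eigenvalues (largest first) of the sample covariance matrix of two-column data. */
    static double[] eigenvalues(double[][] data) {
        int n = data.length;
        double mx = 0, my = 0;
        for (double[] p : data) { mx += p[0]; my += p[1]; }
        mx /= n;
        my /= n;

        double sxx = 0, sxy = 0, syy = 0;
        for (double[] p : data) {
            sxx += (p[0] - mx) * (p[0] - mx);
            sxy += (p[0] - mx) * (p[1] - my);
            syy += (p[1] - my) * (p[1] - my);
        }
        sxx /= (n - 1);
        sxy /= (n - 1);
        syy /= (n - 1);

        // Closed form for a symmetric 2x2 matrix: trace / 2 plus or minus a radius.
        double mid = (sxx + syy) / 2.0;
        double radius = Math.hypot((sxx - syy) / 2.0, sxy);
        return new double[] {mid + radius, mid - radius};
    }

    /** Number of principal components whose eigenvalue is treated as non-zero. */
    static int retainedComponents(double[] eig) {
        int count = 0;
        for (double e : eig) {
            if (e > TOLERANCE * eig[0]) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Second variable is exactly twice the first: the covariance matrix is singular.
        double[][] collinear = {{1, 2}, {2, 4}, {3, 6}, {4, 8}};
        // No exact linear dependence between the two variables.
        double[][] fullRank = {{1, 2}, {2, 1}, {3, 4}, {4, 3}};
        System.out.println(retainedComponents(eigenvalues(collinear)));
        System.out.println(retainedComponents(eigenvalues(fullRank)));
    }
}
```

For the collinear data only one component is retained, so the MVE step would
work on a single column of scores whose variance is non-zero instead of on a
2x2 covariance matrix that cannot be inverted.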
The MVE method was developed by Peter J. Rousseeuw of the Antwerp Group On
Robust & Applied Statistics and published in 1985 (Rousseeuw P.J., Multivariate
estimation with high breakdown point. In: Grossman W., Pflug G., Vincze I.,
Wertz W. (eds.) Mathematical Statistics and Applications. Reidel, Dordrecht,
The Netherlands, p. 283) and improved later in Rousseeuw P.J. & Leroy A. (1987),
Robust regression and outlier detection. New York, John Wiley.
Additional notes can be found at the MVE-page.
The program needs the Java Runtime Environment, available from the java site
of SUN.
There are two options:
- Either download it as a jar file, which enables you to load and save files.
  To execute it, use
  <path to jar file>/java -jar rma.jar
  or let, for example, Windows handle it from the download dialog box (works
  wonders under my Mozilla and Win XP system).
- Or use the Java Web Start option. As Java Web Start programs run in a
  so-called sandbox, they cannot access files on the hard disk, so this option
  works with an input dialog and cannot save the results.
Two remarks:
(1) the executables jre, javaw and java are interchangeable.
(2) the path separators '/' and '\' depend on the platform you use, so change
them accordingly.
Cite program as:
Kim van der Linde (2004) PCA-MVE: Robust Minimum Volume Ellipsoid estimation
for robust outlier detection in multivariate space, Java version. Website:
http://www.kimvdlinde.com/professional/pcamve.html.
Cite article describing this method and usage as:
Kim van der Linde & David Houle (submitted). Applied usage of the Minimum
Volume Ellipsoid. Biometrics.
Version History:
- Version 1.0 (29 October 2004):
- Version 1.1 (28 November 2004):
  - added sorted output option.