Java implementation of an information-theoretic algorithm that combines Multivariate Correlations with Early Classification
MCEC (Multivariate Correlations for Early Classification) is a Java implementation of an information-theoretic method for assessing the early classification opportunity in a dataset. The dataset contains univariate or multivariate time series together with their respective class labels. The program can be downloaded here.
The input file must be in comma-separated values (CSV) format, containing the time series and the respective class labels.
Dataset example:
X1_1, X2_1, X1_2, X2_2, class
TRUE, FALSE, FALSE, FALSE, C1
FALSE, FALSE, TRUE, FALSE, C0
TRUE, TRUE, FALSE, FALSE, C0
TRUE, FALSE, TRUE, TRUE, C1
(...)
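To make the column layout concrete, here is a minimal sketch of how one row of such a file could be split into time points. It is not taken from the MCEC source: the class `DatasetRowSketch` and its `parse` method are hypothetical, and the sketch assumes that columns are ordered time point by time point (as the header `X1_1, X2_1, X1_2, X2_2` suggests, reading `Xi_t` as feature `i` at time `t`), that the class label is the last column, and that all feature values are `TRUE`/`FALSE` as in the example.

```java
/** Hypothetical helper (not part of MCEC): one CSV row split into time points. */
class DatasetRowSketch {
    final boolean[][] values; // values[t][i] holds feature i+1 at time point t+1
    final String label;       // class label taken from the last column

    DatasetRowSketch(boolean[][] values, String label) {
        this.values = values;
        this.label = label;
    }

    /**
     * Parses one data row, assuming n features per time point and
     * columns ordered as X1_1, ..., Xn_1, X1_2, ..., Xn_2, ..., class.
     */
    static DatasetRowSketch parse(String line, int n) {
        String[] cells = line.split(",");
        int featureCells = cells.length - 1; // the last cell is the class label
        if (n <= 0 || featureCells <= 0 || featureCells % n != 0) {
            throw new IllegalArgumentException(
                "Row does not contain a whole number of time points: " + line);
        }
        int length = featureCells / n; // time series length L
        boolean[][] values = new boolean[length][n];
        for (int k = 0; k < featureCells; k++) {
            // Assumption: feature values are TRUE/FALSE, as in the example above.
            values[k / n][k % n] = Boolean.parseBoolean(cells[k].trim());
        }
        return new DatasetRowSketch(values, cells[cells.length - 1].trim());
    }

    public static void main(String[] args) {
        DatasetRowSketch row = parse("TRUE, FALSE, FALSE, FALSE, C1", 2);
        System.out.println("time points: " + row.values.length + ", class: " + row.label);
    }
}
```

With N = 2, the first example row yields two time points, (TRUE, FALSE) and (FALSE, FALSE), with class C1.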
For each n = 1, ..., L (where L is the time series length), the program computes the difference in entropy, log-likelihood, MDL score, AIC score and classification accuracy, and writes the results to text files:
The implementation also provides the Markov lag approach, an alternative to standard early classification. Instead of analysing the correlations from the first time point to the last, it proceeds in the inverse order, from the last time point to the first. The idea is to check how much information from the most recent past is needed to obtain a satisfactory prediction.
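The difference between the two orderings can be illustrated with a short sketch. It is an assumption-based illustration, not the MCEC code: the class `TimeWindowSketch` and the method `timePoints` are hypothetical names, and the sketch assumes that step n of the analysis uses exactly n time points, counted from the first one in the standard approach and from the last one in the Markov lag approach.

```java
import java.util.Arrays;

/** Hypothetical illustration (not part of MCEC) of which time points step n uses. */
class TimeWindowSketch {

    /**
     * Returns the 1-based time points considered at step n for a series of the
     * given length. Assumption: the standard approach takes 1, 2, ..., n, while
     * the Markov lag approach takes L, L-1, ..., L-n+1.
     */
    static int[] timePoints(int length, int n, boolean markovLag) {
        if (length < 1 || n < 1 || n > length) {
            throw new IllegalArgumentException("Require 1 <= n <= length");
        }
        int[] window = new int[n];
        for (int k = 0; k < n; k++) {
            window[k] = markovLag ? length - k : k + 1;
        }
        return window;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(timePoints(4, 2, false))); // [1, 2]
        System.out.println(Arrays.toString(timePoints(4, 2, true)));  // [4, 3]
    }
}
```

Under these assumptions, both approaches use the same set of time points at n = L and differ only in which end of the series they start from.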
The MCEC algorithm depends on two external libraries:
Execute the jar file:
$ java -jar MCECalgorithm.jar dataset-filename.csv N optionClass MarkovLag
where the command-line options correspond to:
dataset-filename Type: String - Name of the dataset file to be analysed.
N Type: Integer - Number of features per time point.
optionClass Type: Boolean - With classification analysis (TRUE) or without classification analysis (FALSE).
MarkovLag Type: Boolean - With Markov lag approach (TRUE) or with standard Early Classification (FALSE).
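For readers wrapping the jar in their own tooling, the four positional arguments can be checked before launching it. The sketch below is a hypothetical validator, not the parser used inside MCECalgorithm.jar: the class `CliArgsSketch` and its field names are invented for illustration, and the strict TRUE/FALSE check (case-insensitive) is an assumption about what the program accepts.

```java
/** Hypothetical validator (not part of MCEC) for the four positional arguments. */
class CliArgsSketch {
    final String datasetFile;       // dataset-filename
    final int featuresPerTimePoint; // N
    final boolean classify;         // optionClass
    final boolean markovLag;        // MarkovLag

    CliArgsSketch(String datasetFile, int n, boolean classify, boolean markovLag) {
        this.datasetFile = datasetFile;
        this.featuresPerTimePoint = n;
        this.classify = classify;
        this.markovLag = markovLag;
    }

    static CliArgsSketch parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "Expected: dataset-filename N optionClass MarkovLag");
        }
        int n = Integer.parseInt(args[1]); // throws NumberFormatException if not an integer
        if (n < 1) {
            throw new IllegalArgumentException("N must be a positive integer");
        }
        return new CliArgsSketch(args[0], n,
                parseFlag("optionClass", args[2]),
                parseFlag("MarkovLag", args[3]));
    }

    // Assumption: only TRUE or FALSE (in any letter case) are valid flag values.
    private static boolean parseFlag(String name, String value) {
        if (value.equalsIgnoreCase("TRUE")) return true;
        if (value.equalsIgnoreCase("FALSE")) return false;
        throw new IllegalArgumentException(name + " must be TRUE or FALSE, got: " + value);
    }

    public static void main(String[] args) {
        CliArgsSketch parsed = parse(new String[] {"dataset-filename.csv", "2", "TRUE", "FALSE"});
        System.out.println(parsed.datasetFile + " with N = " + parsed.featuresPerTimePoint);
    }
}
```

Rejecting anything other than TRUE or FALSE is a deliberate choice here: `Boolean.parseBoolean` would silently treat a typo as FALSE.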
The very simple syntheticTest.csv dataset example is described in the following table:
The command for analysing the early classification opportunity is
$ java -jar MCECalgorithm.jar syntheticTest.csv 1 TRUE FALSE
and produces the following files: