<div class="moz-text-flowed" style="font-family: -moz-fixed">Description of the Essex Entry on Data Set V with Raw EEG Signals

Louis C.S. Tsui and John Q. Gan
Department of Computer Science
University of Essex
Colchester CO4 3SQ
UK
Email: csltsu@essex.ac.uk, jqgan@essex.ac.uk 


We tried several methods for extracting features from the raw 
EEG signals; the method finally used is described below. In 
developing the classification methods, we paid special attention to 
the risk of overfitting. The methods used in preparing this entry 
are briefly as follows:

1. Feature Extraction
The raw EEG signals were first spatially filtered. Frequency-domain 
features were then extracted from each channel over the most recent 
second of data (512 samples). The first feature vector was extracted 
from samples 1 to 512, the second from samples 33 to 544, and so on. 
That is, a moving window 512 samples wide advances in steps of 32 
samples, and each position yields a new feature vector. This produces 
(NumberOfSamples-512)/32+1 feature vectors for a session, where 
NumberOfSamples is the total number of samples in the session. 
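
The windowing scheme above can be sketched as follows. This is a minimal 
illustration, not the code used for the entry: the function names are 
hypothetical, indices are 0-based (so the "33rd sample" is index 32), and 
floor division is an assumption for sessions whose length does not fit a 
whole number of steps.

```python
import numpy as np

WINDOW = 512  # window width in samples (from the description)
STEP = 32     # step between successive windows (from the description)

def window_starts(number_of_samples, window=WINDOW, step=STEP):
    """Return the 0-based start index of every full window."""
    # (NumberOfSamples-512)/32+1, floored, and never negative
    count = max(0, (number_of_samples - window) // step + 1)
    return np.arange(count) * step

def sliding_windows(signal, window=WINDOW, step=STEP):
    """Stack all full windows of a 1-D signal into shape (count, window)."""
    signal = np.asarray(signal)
    starts = window_starts(signal.shape[0], window, step)
    # Broadcasting builds one row of sample indices per window
    return signal[starts[:, None] + np.arange(window)]

# A session of 1024 samples gives (1024-512)/32+1 = 17 windows
windows = sliding_windows(np.arange(1024))
print(windows.shape)
```

Each row of `windows` would then be passed to whatever frequency-domain 
feature extractor is used for that channel.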

2. Feature Dimension Reduction
The dimension of the feature space was reduced in order to achieve 
better generalisation. The dimension reduction includes both channel 
selection and frequency band selection. The techniques used include two 
types of PCA methods, and cross-validation for choosing an optimal 
feature subset.
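
As a rough illustration of PCA-based dimension reduction, the sketch below 
projects feature vectors onto their leading principal directions. The two 
PCA variants actually used are not specified here, so this plain SVD-based 
PCA and the names `pca_fit` and `pca_transform` are assumptions for 
illustration only.

```python
import numpy as np

def pca_fit(features, n_components):
    """Return the feature mean and the top principal directions (as rows)."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    # SVD of the centred data: rows of vt are principal directions,
    # ordered by decreasing singular value
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(features, mean, components):
    """Project feature vectors onto the retained principal directions."""
    return (np.asarray(features, dtype=float) - mean) @ components.T

# Illustrative random data: 50 feature vectors of dimension 6
rng = np.random.default_rng(0)
demo_features = rng.normal(size=(50, 6))
demo_mean, demo_components = pca_fit(demo_features, n_components=2)
demo_reduced = pca_transform(demo_features, demo_mean, demo_components)
print(demo_reduced.shape)
```

In practice the number of retained components would be one of the choices 
made by cross-validation, fitting the projection on training sessions only.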

3. Classification
LDA (one against the rest) and neural networks were investigated 
for classification. Decision fusion was also considered, with LDA 
as the dominant classifier. 
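
One way to read "LDA (one against the rest)" is to train one two-class 
Fisher discriminant per mental task and pick the task with the highest 
score. The sketch below follows that reading. The function names, the small 
ridge term and the argmax decision rule are assumptions, not details taken 
from the entry.

```python
import numpy as np

def fit_lda_one_vs_rest(x, y):
    """Fit one Fisher discriminant per class (that class against all others)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    models = []
    for c in classes:
        pos, neg = x[y == c], x[y != c]
        mu_pos, mu_neg = pos.mean(axis=0), neg.mean(axis=0)
        # Pooled within-group covariance of "class c" and "the rest"
        centred = np.vstack([pos - mu_pos, neg - mu_neg])
        cov = centred.T @ centred / (len(x) - 2)
        # Small ridge term (an assumption) keeps the solve well conditioned
        cov += 1e-6 * np.eye(x.shape[1])
        w = np.linalg.solve(cov, mu_pos - mu_neg)
        b = -0.5 * w @ (mu_pos + mu_neg)
        models.append((w, b))
    return classes, models

def predict_lda_one_vs_rest(x, classes, models):
    """Assign each row of x to the class whose discriminant scores highest."""
    x = np.asarray(x, dtype=float)
    scores = np.column_stack([x @ w + b for w, b in models])
    return classes[np.argmax(scores, axis=1)]
```

A fused decision could then combine these scores with the outputs of other 
classifiers, giving the LDA scores the largest weight.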

4. Postprocessing
The purpose of postprocessing is to obtain reliable and robust 
classification. The techniques used include a smoothing window over 
previous classification outputs, and mental task change detection and 
confirmation. 
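
A smoothing window over previous outputs can be sketched as a majority vote 
over the most recent labels. The window length, the tie-breaking behaviour 
and the name `smooth_labels` are assumptions; the entry's actual smoothing 
and change-confirmation rules are not given here.

```python
from collections import Counter, deque

def smooth_labels(labels, window=8):
    """Replace each label by the majority vote over the last `window` outputs."""
    recent = deque(maxlen=window)  # holds only the most recent outputs
    smoothed = []
    for label in labels:
        recent.append(label)
        # most_common(1) returns the most frequent label in the window;
        # ties go to the label encountered first (the oldest in the window)
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed

raw = [1, 1, 1, 2, 1, 1, 3, 3, 3, 3]
print(smooth_labels(raw, window=3))
# prints [1, 1, 1, 1, 1, 1, 1, 3, 3, 3]
```

In this example the isolated output 2 is suppressed, and the switch to 
class 3 is accepted only once it holds the majority of the window, at the 
cost of a short delay.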

The classification results on the testing sessions of the 3 subjects 
are included in the attached file EssexEntryDataSetV_RAW.mat. The data 
set description states two requirements for the classification 
output: the first is to provide an output for every input vector, and 
the second is to provide an output every 0.5 seconds. We are not sure 
which will be used for the final evaluation of the entries. Therefore, 
EssexEntryDataSetV_RAW.mat contains two vectors for each subject, 
named subjectiTest and subjectiTest8 respectively. The number of 
estimated class labels in subjectiTest is the number of input vectors 
in the testing session of subject i (i=1,2,3), whilst the number of 
estimated class labels in subjectiTest8 is (n-1)/8, where n is the 
number of feature vectors extracted from the testing session.

In evaluating the classification accuracy of this entry, please use the 
corresponding true class labels.
</div>