Taxonomy of Feature Extraction and Translation Methods for BCI

http://www.cs.colostate.edu/eeg/taxonomy.html

Participants in the Signal Processing: Feature Extraction and Translation Workshop at the Third International Meeting on Brain-Computer Interface Technology, June 14-19, 2005, Rensselaerville Institute, NY, were asked for summaries of the feature extraction and translation methods that they have used. This document lists their summaries and contains a draft of a taxonomy of methods that is continually changing. We will discuss and modify this taxonomy at the meeting.

The goals of this effort are to discover how our work relates to the collective effort of this community, to prompt a discussion of which methods appear to be most fruitful for various applications, and to highlight new methods yet to be tried.

Please send suggestions for changes to Chuck Anderson at anderson@cs.colostate.edu

A related analysis of methods by S. Mason, A. Bashashati, M. Fatourechi, and G. Birch is being developed through an extensive survey of the literature. While the material in this web page is focused on feature extraction and translation, the work of Mason et al. encompasses all of the steps involved in BCI research and application. It is available at www.braininterface.org.

Reasons to make a taxonomy
Desirable characteristics
Summaries from workshop participants
Taxonomy

Reasons to make a taxonomy

People new to field can see overview.
Experienced people can see how their work relates to that of others.
Highlight new methods or combinations of methods.

Desirable characteristics of feature extraction and translation methods

Accuracy
1. Correct at least x% of the time for classification, or correct within e for x% of the time if continuous. x and e depend on application.
2. Robust to interference from environmental signals and non-EEG biological signals.
3. Reliable (repeatable?) from hour to hour, day to day, across different applications of electrode cap, over different environments, and different subjects.
4. Features that are easily discriminated. (orthogonal)
Fast, Responsive
1. BCI decision within y seconds, or fraction of second.
2. Computation time small.
3. Storage requirements small.
4. Need not wait for artifact-free segments.
5. Training time short, requiring reasonable amount of data.
Interpretable
1. Can explain how BCI decision is being made.
2. Relate to known electrophysiology and contribute with new knowledge.
3. Will lead to better feature extraction methods.
4. Intuitive visualization, leading to biofeedback if in real-time.
Practical
1. Inexpensive, or at least affordable.
2. Somewhat portable.
3. Open source.
4. Easy setup for subject.
5. Finely discriminating between multiple thoughts, states
6. Automaticity, little effort needed by subject

Summaries

In response to our request for information from workshop participants, we received replies from Chuck Anderson, Benjamin Blankertz, Clemens Brunner, Anna Buttfield, Mehrdad Fatourechi, Greg Gage, Xiaorong Gao, Paul Hammon, Bin He, Ruthy Kaidar, Dean Krusienski, Dennis McFarland, Alois Schloegl, Len Trejo, and Doug Weber. Replies are outlined below.

Chuck Anderson
- anderson@cs.colostate.edu
- http://www.cs.colostate.edu/eeg
- Features
  1. Multichannel EEG samples are augmented by multiple past samples lagged by single sample intervals.
  2. Artifacts filtered from decomposed signals separated by maximum signal fraction.
  3. Augmented samples are segmented into 1/2-second windows, overlapping by 1/4 second.
  4. Samples in each window are decomposed by singular-value decomposition.
  5. Each window is represented by subset of left singular vectors (dimension of each is equal to number of channels x number of lags)
- Classifier
  1. k-nearest neighbors, k = 1, 2, ... 10
  2. linear and quadratic discriminant analysis
  3. committee of decision trees
  4. neural networks
  5. generative models of multiple gaussians
- Application
  1. Classification of several 10-second trials recorded from subjects performing two mental tasks, such as mental multiplication and imagined letter writing.
- Citation
  1. Kirby, M. and Anderson, C.W. (2003) Geometric Analysis for the Characterization of Nonstationary Time-Series. In Springer Applied Mathematical Sciences Series Celebratory Volume for the Occasion of the 70th Birthday of Larry Sirovich, ed. by Kaplan, E., Marsden, J., and Sreenivasan, K.R., Springer-Verlag, Chapter 8, pp. 263--292.
  2. Anderson, C.W., and Kirby, M. (2003) EEG Subspace Representations and Feature Selection for Brain-Computer Interfaces. In Proceedings of the 1st IEEE Workshop on Computer Vision and Pattern Recognition for Human Computer Interaction (CVPRHCI), June 17, 2003, Madison, Wisconsin.
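
The lag-augmentation, windowing, and SVD steps in the feature list above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 256 Hz sampling rate, the channel count, and the number of retained singular vectors are assumptions, and the maximum-signal-fraction artifact filtering step is omitted.

```python
import numpy as np

def lag_embed(eeg, n_lags):
    """Stack each multichannel sample with its predecessors.

    eeg has shape (n_channels, n_samples). The result has shape
    (n_channels * n_lags, n_samples - n_lags + 1): lag 0 on top,
    then lags 1 .. n_lags - 1, each one sample further into the past.
    """
    n_samples = eeg.shape[1]
    blocks = [eeg[:, n_lags - 1 - k : n_samples - k] for k in range(n_lags)]
    return np.vstack(blocks)

def window_bases(augmented, fs, n_vectors, win_s=0.5, step_s=0.25):
    """SVD of each half-second window; keep the leading left singular vectors."""
    win = int(round(win_s * fs))
    step = int(round(step_s * fs))   # 1/4-second step gives 1/4-second overlap
    bases = []
    for start in range(0, augmented.shape[1] - win + 1, step):
        segment = augmented[:, start:start + win]
        u, _, _ = np.linalg.svd(segment, full_matrices=False)
        bases.append(u[:, :n_vectors])
    return bases

# Synthetic stand-in data (assumed): 6 channels, 2 s at 256 Hz.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((6, 512))
augmented = lag_embed(eeg, n_lags=3)          # 18 rows = 6 channels x 3 lags
bases = window_bases(augmented, fs=256, n_vectors=4)
```

Each entry of `bases` is a matrix whose columns have dimension channels x lags, matching the description in step 5; how many columns to keep is left to feature selection.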
Benjamin Blankertz
- blanker@first.fraunhofer.de
- Method A: Classifying Movement Intentions based on LRP features (LRP: lateralized readiness potential)
  1. Features
    1. Multichannel EEG, 100Hz DC (or high-pass below 0.1 Hz)
    2. Channelwise 128-point Fourier Transform with one-sided cosine window win(n)= 1-cos(n*pi/128)
    3. Discarding DC (first) and higher frequency bins (>4 Hz)
    4. Transform back to time domain (inverse FT)
    5. Retain the last 200 ms
    6. Subsampling by calculating the mean of non-overlapping windows of 50ms length
    7. Results in 4 dimensions per channel
  2. Classifier
    1. Regularized least mean squares regression (in principle equivalent to LDA and Fisher Discriminant). Regularization parameter selected by cross-validation.
  3. Application
    1. Discriminating upcoming movements before EMG activity starts (interval of movement intention)
    2. Offline and online discrimination of left vs. right hand index finger movements and hand vs. shoulder movements
    3. Offline classification of index vs. little finger; hand vs. foot movement; left foot vs. right foot movement
    4. other classifications, online feedback experiment, study with phantom movements of amputees: will be included in our BCI Meeting Proceedings contribution.
  4. Citation
    1. Benjamin Blankertz and Guido Dornhege and Christin Schaefer and Roman Krepki and Jens Kohlmorgen and Klaus-Robert Mueller and Volker Kunzmann and Florian Losch and Gabriel Curio, Boosting Bit Rates and Error Detection for the Classification of Fast-Paced Motor Commands Based on Single-Trial EEG Analysis, IEEE Transactions on Neural Systems and Rehabilitation Engineering 11(2), 127-131, 2003
- Method B: Classifying Imagined Movements based on ERD features (ERD: event-related desynchronization)
  1. Features
    1. For training:
      1. band-pass filter signals (order 5 butterworth IIR filter)
      2. Common Spatial Patterns (CSP) analysis, retain 2 to 6 CSP channels. (Remark. Here it is unclear whether CSP is to be listed under 'Feature Extraction' because it uses label information and acts almost like a classifier.)
      3. Calculate variance along time in training epochs and take logarithm
      4. Results in features with 1 dimension per CSP channel (total 2 to 6 dim feature)
    2. For feedback:
      1. Apply spatial filter that was determined by CSP
      2. Apply band-pass filter
      3. Take last (most recent) 500ms or 1000ms of continuous data
      4. Calculate variance along time and take logarithm
  2. Classifier
    1. LDA (no regularization)
  3. Application
    1. Discrimination between imagined left hand vs. right hand vs. foot vs. tongue movement. For feedback mostly only two classes are used. Feedback applications were, e.g., one dimensional cursor control with an asynchronous protocol (also as mental typewriter application) and simple computer games like brain pong
  4. Citation
    1. Offline results: Guido Dornhege and Benjamin Blankertz and Gabriel Curio and Klaus-Robert Mueller, Boosting bit rates in non-invasive EEG single-trial classifications by feature combination and multi-class paradigms, IEEE Transactions on Biomedical Engineering 51(6), 993-1002, 2004
    2. Results of a feedback study: Poster at the BCI Meeting. Also a Technical Report will be available at the Meeting. It will be submitted for publication at the meeting.
- Method C: For both Applications above also combined LRP/ERD features
  1. Features
    1. Separately for LRP and ERD features as described above.
  2. Classifier
    1. Special linear classifier optimized for the assumption that both features are independent
  3. Application
    1. Discriminating upcoming movements before EMG activity starts (interval of movement intention). Offline discrimination of left vs. right hand index finger movements. Discrimination between imagined left hand vs. right hand vs. foot movement.
  4. Citation
    1. Combination for classifying imagined movements: Guido Dornhege and Benjamin Blankertz and Gabriel Curio and Klaus-Robert Mueller, Boosting bit rates in non-invasive EEG single-trial classifications by feature combination and multi-class paradigms, IEEE Transactions on Biomedical Engineering 51(6), 993-1002, 2004.
    2. Combination for classifying movement intentions: will be included in our BCI Meeting Proceedings contribution.
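
As a rough check on the Method A (LRP) feature steps, the sketch below follows them at the stated 100 Hz rate and confirms that they yield 4 values per channel. It is an illustration under assumptions: the function name is hypothetical, the window index is taken to run over n = 0..127, and "higher frequency bins (>4 Hz)" is read as keeping bins up to and including 4 Hz.

```python
import numpy as np

FS = 100       # sampling rate in Hz, as stated in the feature list
N_FFT = 128    # FFT length, as stated in the feature list

def lrp_features(epoch):
    """epoch: (n_channels, 128) array holding the most recent samples.

    Returns an (n_channels, 4) array of low-pass LRP features.
    """
    n = np.arange(N_FFT)
    window = 1.0 - np.cos(n * np.pi / N_FFT)        # one-sided cosine window
    spectrum = np.fft.rfft(epoch * window, axis=1)
    freqs = np.fft.rfftfreq(N_FFT, d=1.0 / FS)
    keep = (freqs > 0) & (freqs <= 4.0)             # drop DC and bins above 4 Hz
    spectrum[:, ~keep] = 0.0
    filtered = np.fft.irfft(spectrum, n=N_FFT, axis=1)
    last = filtered[:, -20:]                        # last 200 ms = 20 samples
    # Mean over non-overlapping 50 ms windows (5 samples each) gives 4 values.
    return last.reshape(last.shape[0], 4, 5).mean(axis=2)

rng = np.random.default_rng(0)
features = lrp_features(rng.standard_normal((3, N_FFT)))   # shape (3, 4)
```

With a bin spacing of 100/128 Hz, only five non-DC bins survive the cut, so the retained signal is very smooth and the four 50 ms means summarize it compactly.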
Clemens Brunner
- clemens.brunner@tugraz.at
- Features
  1. Classical bandpower (averaged over 1 second windows)
  2. Adaptive autoregressive parameters (sample by sample)
  3. Common spatial patterns
  4. Phase features (coupling between pairs of electrodes)
  5. We've been using each kind of feature separately and also in combination with others.
- Classifier
  1. Linear discriminant analysis, for more than 2 classes we're usually using a one-versus-the-rest classification scheme with multiple classifiers.
- Application
  1. Above methods have been applied to classify up to 4 classes (motor imagery of left hand, right hand, foot, and tongue, respectively). We've analyzed different electrode setups, e.g. 60, 22, 3 or only 2 channels.
- Citation
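
Common spatial patterns appear in several summaries on this page. The sketch below shows one common two-class formulation (whitening followed by an eigendecomposition) together with log-variance features. It is a generic illustration, not the code of any group listed here; the function names, trace normalization, and synthetic data are assumptions.

```python
import numpy as np

def class_covariance(trials):
    """trials: (n_trials, n_channels, n_samples). Mean trace-normalized covariance."""
    covs = []
    for x in trials:
        c = x @ x.T
        covs.append(c / np.trace(c))
    return np.mean(covs, axis=0)

def csp_filters(trials_a, trials_b, n_filters=4):
    """Return n_filters spatial filters (rows), taken from both ends of the spectrum."""
    ca = class_covariance(trials_a)
    cb = class_covariance(trials_b)
    # Whiten the composite covariance ca + cb.
    evals, evecs = np.linalg.eigh(ca + cb)
    whitener = np.diag(evals ** -0.5) @ evecs.T
    # Eigenvectors of the whitened class-a covariance (eigenvalues ascending).
    d, b = np.linalg.eigh(whitener @ ca @ whitener.T)
    w = b.T @ whitener
    half = n_filters // 2
    idx = np.r_[np.arange(half), np.arange(len(d) - half, len(d))]
    return w[idx]

def log_variance_features(trial, filters):
    """Project one trial (n_channels, n_samples) and take log of variance over time."""
    return np.log(np.var(filters @ trial, axis=1))

# Synthetic two-class data (assumed): class a is strong on channel 0, class b on channel 1.
rng = np.random.default_rng(1)
trials_a = rng.standard_normal((20, 4, 200)) * np.array([5, 1, 1, 1])[None, :, None]
trials_b = rng.standard_normal((20, 4, 200)) * np.array([1, 5, 1, 1])[None, :, None]
filters = csp_filters(trials_a, trials_b, n_filters=2)
feat_a = np.array([log_variance_features(t, filters) for t in trials_a])
feat_b = np.array([log_variance_features(t, filters) for t in trials_b])
```

The first filter favors variance from class b and the last favors class a, so the resulting log-variance features can be passed to a linear classifier such as LDA.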
Anna Buttfield
- anbutt@idiap.ch
- Features
  1. 16 times a second we compute the power spectral density in the band 8-30 Hz over the last minute with a frequency resolution of 2Hz.
  2. A 96 element feature array is constructed by taking these PSD values for 8 electrodes.
- Classifier
  1. Gaussian mixture classifier
- Application
  1. This method has been applied to 3 class problems with subjects performing tasks such as imagination of left and right hand movement and a language task.
- Citation
  1. Brain-Actuated Interaction, J. del R. Millan, F. Renkens, J. Mouriño, and W. Gerstner, in "Artificial Intelligence", 2004.
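
The feature dimension quoted above follows from simple counting: the band 8-30 Hz at 2 Hz resolution gives 12 bins (8, 10, ..., 30 Hz), and 12 bins x 8 electrodes = 96. The sketch below reproduces that count with an averaged-periodogram estimate. The 512 Hz sampling rate, the Hann window, the 50% segment overlap, and the function name are assumptions, and the PSD is only correct up to a constant scale factor.

```python
import numpy as np

def psd_features(eeg, fs, band=(8.0, 30.0), resolution_hz=2.0):
    """eeg: (n_channels, n_samples) of recent data. Returns a flat feature vector."""
    seg_len = int(round(fs / resolution_hz))     # 2 Hz resolution needs 0.5 s segments
    step = seg_len // 2                          # 50% overlap between segments (assumed)
    window = np.hanning(seg_len)
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    keep = (freqs >= band[0]) & (freqs <= band[1])
    periodograms = []
    for start in range(0, eeg.shape[1] - seg_len + 1, step):
        seg = eeg[:, start:start + seg_len] * window
        power = np.abs(np.fft.rfft(seg, axis=1)) ** 2
        periodograms.append(power / (fs * np.sum(window ** 2)))
    psd = np.mean(periodograms, axis=0)          # average over segments
    return psd[:, keep].ravel()                  # channel-major concatenation

fs = 512                                         # assumed sampling rate
t = np.arange(fs) / fs                           # one second of synthetic data
eeg = np.tile(np.sin(2 * np.pi * 10 * t), (8, 1))
features = psd_features(eeg, fs)                 # 8 channels x 12 bins
```

Calling this 16 times per second on a sliding buffer would give the update rate described in the feature list; the one-second buffer here is only chosen to keep the example small.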
Mehrdad Fatourechi
- mehrdadf@ece.ubc.ca
- http://ipl.ece.ubc.ca/mehrdadf.html
- Features
  1. Wavelet-like function
  2. combined with PCA to reduce number of features.
  3. Select feature subset using genetic algorithm
- Classifier
  1. k-nearest neighbor, k=1
- Application
  1. classification of movement-related potentials (MRPs) associated with movement of the right index finger in an asynchronous BCI system.
- Citation
  1. Fatourechi, M.; Bashashati, A.; Ward, R.K.; Birch, G.E. A Hybrid Genetic Algorithm Approach for Improving the Performance of the LF-ASD Brain Computer Interface. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), Volume 5, March 18-23, 2005, pp. 345-348.
Greg Gage
- gagegreg@umich.edu
- Features
  1. Individual spikes and multi-unit clusters are sorted online using PCA and template matching.
  2. Spike times are collected into 90ms bins and processed in real time for neuroprosthetic control.
- Classifier
  1. A Kalman filter is used to convert binned spike data into cursor movement predictions.
  2. After each trial, the unknown parameters in the state and observation models are iteratively estimated using dynamic regression of the observed neural activity along with the "intended" movement path.
  3. The updated filter parameters are used to decode the next trial
- Application
  1. Above methods have been used to allow rats to learn cortical control of an unfamiliar auditory cursor. This method is intended to be applied in situations where (1) subjects have not received prior motor training to control a prosthetic device (naive user) and (2) the neural encoding of movement parameters in the cortex is unknown to the decoding filter (naive controller).
- Citation
  1. Naive Coadaptive Cortical Control (2005) Gregory J Gage, Kip A Ludwig, Kevin J Otto, Edward L Ionides and Daryl R Kipke J. Neural Eng. 2 (2) 52-63.
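
The binning and decoding steps above can be illustrated with a textbook Kalman filter. This is a sketch under assumptions, not the published decoder: the state and observation matrices (`A`, `W`, `H`, `Q`), the function names, and the toy data are all hypothetical, and the per-trial re-estimation of model parameters is not shown.

```python
import numpy as np

def bin_spikes(spike_times, duration, bin_s=0.09):
    """spike_times: list of arrays of spike times in seconds, one per unit.

    Returns an (n_bins, n_units) array of spike counts in 90 ms bins.
    """
    n_bins = int(duration // bin_s)
    edges = np.arange(n_bins + 1) * bin_s
    return np.column_stack([np.histogram(t, bins=edges)[0] for t in spike_times])

def kalman_step(x, P, z, A, W, H, Q):
    """One predict/update cycle for x_t = A x_{t-1} + w, z_t = H x_t + q."""
    x_pred = A @ x
    P_pred = A @ P @ A.T + W
    S = H @ P_pred @ H.T + Q
    K = P_pred @ H.T @ np.linalg.inv(S)          # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy decoding run (assumed model): a 1-D cursor state observed through 3 units.
A = np.eye(1)
W = 0.01 * np.eye(1)
H = np.ones((3, 1))
Q = np.eye(3)
x, P = np.zeros(1), np.eye(1)
for _ in range(100):
    z = H @ np.array([2.0])                      # noise-free observations of state 2.0
    x, P = kalman_step(x, P, z, A, W, H, Q)

counts = bin_spikes([np.array([0.01, 0.05, 0.10, 0.95]), np.array([0.5])], duration=1.0)
```

In an online setting each row of the binned counts would serve as the observation `z` for one filter step.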
Xiaorong Gao
- gxr-dea@tsinghua.edu.cn
- Features
  1. Multichannel EEG is filtered into two frequency bands (0-3 Hz and 8-30 Hz).
  2. Data are segmented into 2-second windows by an EMG activity trigger.
  3. The data in the two bands are then filtered in the spatial domain; CSSD (common spatial subspace decomposition) is used to design the spatial filters from training data.
  4. Two variance features, based on BP and ERD/ERS, are obtained from the spatial filter outputs as the feature vector for that window.
- Classifier
  1. A simple linear classifier (a perceptron with two inputs and one output) is used to classify the finger-movement-related EEG mental tasks.
- Application
  1. Above methods have been used to classify single-trial EEG during right/left finger movement.
- Citation
  1. Yong Li, Xiaorong Gao and Shangkai Gao (2004) Classification of single-trial Electroencephalogram during finger movement. IEEE-T-BME, Vol.51, No.6, June 2004, 1019-1025.
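
A perceptron with two inputs and one output, as named in the classifier entry above, can be sketched as follows. The training rule shown is the classic perceptron update; the function names, learning rate, and the six toy feature pairs are assumptions made for illustration only.

```python
import numpy as np

def train_perceptron(features, labels, epochs=100, lr=1.0):
    """features: (n_trials, 2); labels: +1 or -1. Returns (weights, bias)."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(features, labels):
            if y * (w @ x + b) <= 0:         # misclassified: move the boundary
                w += lr * y * x
                b += lr * y
                errors += 1
        if errors == 0:                      # converged on separable data
            break
    return w, b

def predict(features, w, b):
    return np.where(features @ w + b >= 0, 1, -1)

# Toy variance features (assumed): one value per frequency band and trial.
X = np.array([[2.0, 0.5], [3.0, 1.0], [2.5, 0.2],
              [0.5, 2.0], [1.0, 3.0], [0.2, 2.5]])
y = np.array([1, 1, 1, -1, -1, -1])
w, b = train_perceptron(X, y)
```

With only two features per trial, the decision boundary is a single line in the feature plane, which keeps both training and online classification cheap.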
Paul Hammon
- phammon@ucsd.edu
- Features
  1. Spectral data was a particularly effective feature; either autoregressive coefficients or conventional power spectral estimates (not practical in an online setup) were effective.
  2. pre-processing with an ICA transform (with a moderate amount of dimensionality reduction via PCA) was effective
- Classifier
  1. I tried both SVMs and L1-regularized logistic regression. I found that although SVMs generally win out in classification tasks, regularized logistic regression often performs well, and has the added benefit of very fast classification.
- Application
  1. data set I of the BCI Competition III, consisting of data from implanted ECoG electrodes
- Citation
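
To illustrate why L1-regularized logistic regression classifies quickly, the sketch below fits one with proximal gradient descent (soft thresholding) and shows that weights on uninformative features are driven exactly to zero. The solver, step size, penalty strength, and synthetic data are assumptions; the summary above does not say which solver was used.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fit_l1_logistic(X, y, lam=0.15, lr=0.5, n_iter=500):
    """Minimize mean log-loss + lam * ||w||_1; the bias is not penalized.

    X: (n, d) features, y: (n,) labels in {0, 1}. Returns (w, b).
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        grad_w = X.T @ (p - y) / n
        grad_b = np.mean(p - y)
        w = soft_threshold(w - lr * grad_w, lr * lam)
        b -= lr * grad_b
    return w, b

# Synthetic data (assumed): only feature 0 carries class information.
rng = np.random.default_rng(2)
X = rng.standard_normal((400, 5))
y = (X[:, 0] > 0).astype(float)
w, b = fit_l1_logistic(X, y)
accuracy = np.mean(((X @ w + b) > 0) == (y == 1))
```

Classification then costs one dot product over the nonzero weights, which is consistent with the speed advantage mentioned in the classifier entry.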
Bin He
- binhe@umn.edu
- Method A
  1. Features
    1. Multi-channel EEG time series
    2. Laplacian Spatial Filtering
    3. Frequency decomposition with overlapping frequency bands for each channel
    4. Envelope extraction on the time series that are decomposed into each frequency band
    5. The spatial pattern for each time-frequency pair to be the feature vector
  2. Classifier
    1. For a given mental state, the characteristic spatial pattern on each time-frequency pair is extracted from training set.
    2. Classification is performed on every time-frequency pair by calculating the correlation of the spatial patterns with the characteristic pattern
    3. Weighted synthesis in time and frequency
  3. Application
  4. Citation
    1. Wang T, Deng J, He B: "Classifying EEG-based Motor Imagery Tasks by means of Time-frequency Synthesized Spatial Patterns," Clinical Neurophysiology, 115(12): 2744-2753, 2004.
- Method B
  1. Features
    1. Multi-channel EEG time series
    2. Laplacian Spatial Filtering
    3. Band-pass filtering
    4. Noise normalization
    5. ICA decomposition
    6. EEG inverse technique (cortical current density estimation, single- or two-dipole fitting)
  2. Classifier
    1. Classification on continuous cortical source distribution, or locations of equivalent dipole sources
  3. Application
  4. Citation
    1. Kamousi B, Liu Z, He B: "Classification of Motor Imagery Tasks for Brain-Computer Interface Applications by means of Two Equivalent Dipoles Analysis," IEEE Transactions on Neural Systems and Rehabilitation Engineering, in press, 2005.
    2. Qin L, Ding L, He B: "Motor Imagery Classification by Means of Source Analysis for Brain Computer Interface Applications", J of Neural Engineering, 1:135-141, 2004.
Ruthy Kaidar
- ruthykdr@techunix.technion.ac.il
- Features
  1. Data is segmented into 550ms long movement and non-movement intervals
  2. Looked at single-trial amplitude of movement vs. non-movement segments in time.
    1. Found where the segments are significantly different, using statistical tests.
    2. Electrodes with the largest significant difference were used for movement detection.
  3. Spectral power estimation of movement segments was done using the Multi-taper method.
    1. Looked at the gamma-band power, and found where left and right movement segments are significantly different, using statistical tests.
    2. Electrodes with largest significant gamma-band power difference were selected for detection of laterality of movement.
- Classifier
  1. Support Vector Machines were used to classify single-trials in two separate dimensions:
    1. Time-domain features were used to classify between movement and non-movement segments
    2. Frequency-domain features were used to classify between left and right movement tasks.
- Application
  1. Above methods have been used to classify 20-channel EEG recorded from subjects in an attentive state, performing a movement task either with left or right index finger, depending on target cue.
- Citation
  1. Work not published yet.
Dean Krusienski
- dkrusien@wadsworth.org
- Features
  1. 2 large laplacian channels over opposing hemispheres of the sensorimotor cortex.
  2. Data segmented into 400 ms windows, overlapping by 20 ms.
  3. Segments are processed with a matched filter (matched to the mu-rhythm), producing a continuous feature output.
- Classifier
  1. Features are translated using a linear equation with adaptive coefficients to control the cursor.
- Application
  1. 1d and 2d cursor control via a mu-rhythm matched filter
- Citation
  1. Krusienski, DJ, Schalk, G, McFarland, DJ, Wolpaw, JF, Tracking of the Mu Rhythm using an Empirically Derived Matched Filter, Proc. IEEE International Conference on Neural Engineering, March 2005.
Dennis McFarland
- mcfarlan@wadsworth.org
- Features
  1. Multichannel (64) EEG
  2. Spatial Laplacian
  3. Data segmented into 400 msec windows with 7/8 overlap between successive windows
  4. AR spectral analysis with 16th order model
- Classifier
  1. Multiple Linear Regression to predict target position ( 1 and 2 spatial dimensions) from EEG features
- Application
- Citation
  1. Wolpaw, J.R and McFarland, D.J. (2004) Control of a two-dimensional movement signal by a non-invasive brain-computer interface in humans. Proceedings of the National Academy of Sciences, 101, 17849-17854.
  2. McFarland, D.J. and Wolpaw, J.R. Sensorimotor rhythm-based brain-computer interface (BCI): Feature selection by regression improves performance. IEEE Transactions on Neural Systems and Rehabilitation Engineering (in press).
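
Two details in the feature list above can be made concrete: 400 ms windows with 7/8 overlap advance by 50 ms per window, and a 16th-order AR model yields a smooth spectral estimate. The sketch below uses the Yule-Walker equations, which is one of several ways to fit an AR model and is an assumption here, as are the 160 Hz sampling rate, the function names, and the synthetic 12 Hz test signal.

```python
import numpy as np

def sliding_windows(signal, fs, win_s=0.4, overlap=7 / 8):
    """Split a 1-D signal into overlapping windows."""
    win = int(round(win_s * fs))
    step = int(round(win * (1 - overlap)))       # 1/8 of the window length
    return [signal[s:s + win] for s in range(0, len(signal) - win + 1, step)]

def ar_yule_walker(x, order):
    """Fit x[n] = sum_k a[k] x[n-k] + e[n]; returns (a, noise_variance)."""
    x = x - np.mean(x)
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    return a, r[0] - np.dot(a, r[1:])

def ar_psd(a, noise_var, freqs, fs):
    """AR power spectral density evaluated at the given frequencies (Hz)."""
    k = np.arange(1, len(a) + 1)
    e = np.exp(-2j * np.pi * np.outer(freqs, k) / fs)
    return noise_var / (fs * np.abs(1.0 - e @ a) ** 2)

fs = 160                                         # assumed sampling rate
rng = np.random.default_rng(0)
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 12 * t) + 0.1 * rng.standard_normal(t.size)
windows = sliding_windows(x, fs)                 # 64-sample windows, 8-sample step
a, noise_var = ar_yule_walker(x, order=16)
freqs = np.arange(0, 40.5, 0.5)
peak_hz = freqs[np.argmax(ar_psd(a, noise_var, freqs, fs))]
```

Spectral values at selected frequencies and channels would then serve as the EEG features entering the regression.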
Alois Schloegl
- alois.schloegl@tugraz.at
- Method A
  1. Features
    1. Bandpower estimated with FFT
  2. Classifier
    1. Learning Vector Quantisation. DS-LVQ (Pregenzer) was used for feature selection.
    2. That classification result ++,+,0,-,-- (5 grades from correct to wrong) was displayed at the end of one trial.
  3. Application
  4. Citation
    1. Pfurtscheller et al 1997.
- Method B
  1. Features
    1. Adaptive autoregressive (AAR) parameters are estimated continuously with Kalman filtering [in previous works we also used LMS and RLS]
  2. Classifier
    1. Linear or quadratic classifiers are applied to short window segments (typically 10-25 samples, 1/16-1/5 seconds).
    2. The best discriminating segment is used to calculate the classifier.
    3. That classifier is applied continuously to the AAR parameters (Schloegl)
  3. Application
    1. The output (continuous in time and magnitude) is used to control the horizontal position of a falling ball or the length of a horizontal bar.
  4. Citation
    1. *A. Schlogl*, K. Lugger and G. Pfurtscheller (1997) Using Adaptive Autoregressive Parameters for a Brain-Computer-Interface Experiment, Proceedings of the 19th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 19, pp. 1533-1535.
    2. C. Neuper, *A. Schlogl*, G. Pfurtscheller. Enhancement of left-right sensorimotor EEG differences during feedback-regulated motor imagery. J Clin Neurophysiol 1999 Jul;16(4):373-82.
- Method C
  1. Features
    1. Logarithmic Bandpower (BP) values estimated by bandpass filtering, squaring and smoothing (1s window)
  2. Classifier
    1. Linear discriminant analysis is applied to short window segments (1/4 - 1 seconds). The best discriminating segment is used to calculate the classifier.
    2. That classifier is applied continuously to the BP parameters
  3. Application
    1. The output (continuous in time and magnitude) is used to control the horizontal position of a falling ball or the length of a horizontal bar.
  4. Citation
    1. Scherer et al.
    2. Krausz et al.
    3. Pfurtscheller et al.
    4. Guger et al.
    5. Similar experiments were done with the following modifications
      1. Feature extraction: Common Spatial Patterns (Ramoser et al)
      2. Feature extraction and Classification: Hidden Markov Model (Obermaier et al)
- Method D
  1. Features
    1. Adaptive autoregressive (AAR) parameters are estimated continuously with Kalman filtering; logarithmic bandpower (BP) values are estimated by bandpass filtering, squaring and smoothing (1s window)
  2. Classifier
    1. Adaptive classifiers are used: linear or quadratic classifiers are updated online.
    2. That classifier is applied continuously to the AAR and/or BP parameters.
  3. Application
    1. The output (continuous in time and magnitude) is used to control the horizontal position of a falling ball.
  4. Citation
    1. Carmen Vidaurre et al. (papers submitted, see poster)
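
Sample-by-sample estimation of adaptive autoregressive parameters can be sketched with a normalized LMS update, which Method B mentions as an earlier alternative to Kalman filtering. This is a simplified stand-in, not the Kalman-based estimator described above; the function name, update coefficient, model order, and the synthetic AR(2) test signal are assumptions.

```python
import numpy as np

def aar_lms(x, order=6, mu=0.005):
    """Track AR coefficients of a 1-D signal with normalized LMS.

    Returns an (n_samples, order) array; row t holds the estimate after
    sample t, with column 0 weighting x[t-1], column 1 weighting x[t-2], etc.
    """
    n = len(x)
    a = np.zeros(order)
    history = np.zeros((n, order))
    for t in range(order, n):
        past = x[t - order:t][::-1]              # x[t-1], ..., x[t-order]
        err = x[t] - a @ past                    # one-step prediction error
        a = a + mu * err * past / (past @ past + 1e-12)
        history[t] = a
    return history

# Synthetic AR(2) process with known coefficients (assumed test signal).
rng = np.random.default_rng(3)
n = 5000
e = rng.standard_normal(n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 1.2 * x[t - 1] - 0.5 * x[t - 2] + e[t]
history = aar_lms(x, order=2, mu=0.05)
estimate = history[-1000:].mean(axis=0)          # close to [1.2, -0.5]
```

Each row of `history` is a feature vector available at that sample, which is what allows a fixed classifier to be applied continuously in time.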
Len Trejo
- ltrejo@mail.arc.nasa.gov
- http://vision.arc.nasa.gov/personnel/ltrejo/
- Features
  1. Eight-channel EEG critically filtered and decimated to 128 Hz sampling rate.
  2. Segmentation into 2-s segments overlapped 1.75 s with prior segments (250 ms update). We have used other segment lengths, up to 5 s.
  3. Application of automatic EOG correction using wavelet-shrinkage enhanced linear artifact estimator (optional; usually not needed).
  4. Estimation of power spectral density (FFT-based periodogram) with normalization of target frequencies (SSVEP stimulus fundamentals and second harmonics) using straddling reference bands.
  5. Retention of normalized bands of target frequency PSD bins.
- Classifier
  1. Concatenation of multichannel PSD bins into a single row vector
  2. Projection of row vector onto five n-component linear KPLS classifiers derived from prior training data (n is determined by cross-validation and usually ranges from 1-5). A separate model exists for each of five control commands: turn left, right, up, down, and center/stop. Each model classifies output using one versus many (eg. 1 vs other 4) method.
  3. Votes are tallied for all models into control classes (left, right, up, down, stop).
  4. Winning control class determines direction for next cursor position increment.
- Application
  1. These methods have been used to provide normal human subjects with SSVEP-based control of a moving map application. You can see a video of the whole system in operation with narrated explanation of processing steps at http://128.102.102.53/qtmedia/Media/bci_demo_2005_fullmpeg4.mp4. If that link fails find the demo link at the bottom of my home page, http://ti.arc.nasa.gov/ltrejo.
- Citation
  1. Rosipal R., Trejo L.J., Matthews B. Kernel PLS-SVC for Linear and Nonlinear Classification. In Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), 640-647, Washington DC, 2003.
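
The segmentation and normalization steps in the feature list above can be sketched as follows: 2-s segments at 128 Hz hold 256 samples and advance by 32 samples (250 ms), and each target-frequency bin is divided by the mean of reference bins on either side. The reference offset of 1 Hz, the Hann window, the function names, and the synthetic 10 Hz signal are assumptions; the KPLS classifiers are not shown.

```python
import numpy as np

def segments(eeg, fs=128, seg_s=2.0, step_s=0.25):
    """eeg: (n_channels, n_samples). Overlapping 2-s segments, 250 ms apart."""
    seg = int(seg_s * fs)
    step = int(step_s * fs)
    return [eeg[:, s:s + seg] for s in range(0, eeg.shape[1] - seg + 1, step)]

def normalized_target_power(segment, fs, target_hz, ref_offset_hz=1.0, half_width_hz=0.25):
    """Periodogram power at target_hz divided by mean power in straddling bands."""
    n = segment.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    power = np.abs(np.fft.rfft(segment * np.hanning(n), axis=1)) ** 2

    def band_mean(center):
        mask = np.abs(freqs - center) <= half_width_hz
        return power[:, mask].mean(axis=1)

    reference = 0.5 * (band_mean(target_hz - ref_offset_hz)
                       + band_mean(target_hz + ref_offset_hz))
    return band_mean(target_hz) / reference

fs = 128
rng = np.random.default_rng(4)
t = np.arange(4 * fs) / fs
# Synthetic 8-channel recording (assumed): 10 Hz SSVEP plus weak noise.
eeg = np.sin(2 * np.pi * 10 * t) + 0.2 * rng.standard_normal((8, t.size))
segs = segments(eeg, fs)
ratio = normalized_target_power(segs[0], fs, target_hz=10.0)
```

Repeating the ratio for every stimulus fundamental and second harmonic, then concatenating across channels, would give the row vector that the classifier stage projects onto its models.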
Doug Weber
- doug.weber@ualberta.ca
- http://www.physedandrec.ualberta.ca/research.cfm
- Features
  1. Multiple single unit recording of primary sensory afferents in dorsal root ganglion using chronically implanted microelectrode arrays
  2. Single unit firing rates computed by convolving spike trains with a triangular-shaped kernel (16 ms wide at base)
  3. Position and velocity coding of each afferent tested in a linear regression model and the best units (highest R-squared value) selected for decoding (translation)
- Classifier
  1. Position and velocity variables for hindlimb decoded in linear filter with 5-10 afferent neurons as inputs
- Application
  1. multiple channel, single unit afferent recording to study coding of proprioceptive signals in primary sensory neurons in cat hindlimb during walking
- Citation
  1. R. B. Stein, D. J. Weber, Y. Aoyagi, A. Prochazka, J. Wagenaar, S. Shoham, and R. A. Normann, Coding of position by simultaneously recorded sensory neurons in the cat dorsal root ganglion, J Physiol, 2004.
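
The firing-rate computation in the feature list above can be sketched by convolving a binary spike train with a triangular kernel that is 16 ms wide at its base. The 1 ms time step, the unit-area kernel normalization, the function name, and the example spike times are assumptions.

```python
import numpy as np

def firing_rate(spike_times, duration, dt=0.001, base_width=0.016):
    """Return the firing rate in spikes/s sampled every dt seconds."""
    n = int(round(duration / dt))
    train = np.zeros(n)
    idx = np.floor(np.asarray(spike_times) / dt).astype(int)
    idx = idx[(idx >= 0) & (idx < n)]
    np.add.at(train, idx, 1.0)                   # spike counts per time step
    half = int(round(base_width / (2 * dt)))     # 8 samples on each side
    kernel = np.bartlett(2 * half + 1)           # triangle, zero at both ends
    kernel /= kernel.sum() * dt                  # unit area, so output is in spikes/s
    return np.convolve(train, kernel, mode="same")

rate = firing_rate([0.1005, 0.3005, 0.6005], duration=1.0)
```

Because the kernel has unit area, the integral of the rate equals the spike count, and an isolated spike produces a peak of 125 spikes/s with these settings.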

Taxonomy

Feature extraction (not based on knowledge of desired translation result)
1. Initial filtering (based on generally-applicable knowledge about signals, not on knowledge specific to particular application)
  1. Spatial
    1. Laplacian filter of neighboring voltages (Krusienski, McFarland, He)
    2. Simple differences of voltages
  2. Temporal
    1. Single frequency passband
    2. 50 or 60 Hz notch filter
    3. Smoothing of single unit firing rates with triangular kernel (Weber)
  3. Multitrial averages synchronized to stimulus
2. Amplitude (Kaidar, Blankertz)
3. Frequency
  1. Spatial dependence---number of channels
    1. Single
      1. FFT (Trejo, Hammon, Brunner, Gao, He, Buttfield, Blankertz, Schloegl)
      2. IFFT (Blankertz)
      3. Multi-taper method, gamma-band (Kaidar)
      4. Matched filter
      5. Wavelets (Fatourechi, Trejo---EOG correction)
    2. Multiple
      1. Phase differences (Brunner)
      2. matched filters
      3. multivariate AR
      4. Bispectrum ?
4. Geometric subspaces (directions in EEG data space that capture most variation, strongest signal, strongest task-specific signal, etc.)
  1. Linear decomposition of matrix of samples
    1. Maximization of variance captured
      1. Project to components with most variance (Fatourechi, Hammon, Gage)
      2. Project to components chosen by validation
      3. Spectrum defined by two data sets, common spatial patterns (Brunner, Gao, Blankertz)
      4. SVD of lagged, multichannel samples (Anderson)
    2. Higher-order statistics
      1. Independent components analysis, ICA (Hammon, He)
5. Model (not based on knowledge of desired translation result, unsupervised)
  1. Spatial dependence---Number of channels
    1. Single channel
    2. Multi-channel
  2. Temporal dependence---Independent or dependent on history
    1. None---static model
    2. Some---dynamic model
      1. (adaptive) autoregressive model (McFarland, Hammon, Brunner)
  3. source localization (He)
  4. Complexity (nonlinear systems)
6. Feature subset selection
  1. Most variance
  2. Most significant difference, single electrode selection (Kaidar)
  3. Classification or prediction accuracy
    1. Exhaustive search (Anderson)
    2. Genetic search (Fatourechi)
    3. Sequential forward or backward
Translation, into categories for classification or real-values for prediction (based on knowledge of desired translation output, supervised)
1. Memory based
  1. k-nearest neighbors (Fatourechi, Anderson)
2. Discriminant functions
  1. Linear
    1. Linear regression (Krusienski, Fatourechi, McFarland, Blankertz)
    2. LDA (Brunner, Anderson, Blankertz, Schloegl)
    3. Partial Least Squares, PLS
    4. Perceptron (Gao)
  2. Quadratic, QDA (Anderson)
  3. Nonlinear
    1. neural network (Anderson)
    2. support vector machines (Hammon, Kaidar)
    3. decision trees (Anderson)
    4. LVQ (maybe more like feature extraction) (Fatourechi, Schloegl)
3. Models---per class
  1. Logistic regression (L1-regularized) (Hammon)
  2. Kalman filters (Gage)
  3. LDA, QDA
  4. Kernel Partial Least Squares, KPLS (Trejo)
  5. k-means
  6. mixture of gaussians (Anderson, Buttfield)
  7. HMM---hidden markov models
  8. Combinations
    1. Voting (Trejo, Anderson)
    2. Averaging