Space-Time-Frequency Bag of Words Models for Capturing EEG Variability: A Comprehensive Study




Su, Kyung Min
Robbins, Kay A.


UTSA Department of Computer Science


EEG data are multi-channel time series recorded from the scalp surface to provide spatial and temporal measurements of brain signals. Because EEG headsets vary in channel count, channel placement, and sampling rate, EEG data may have different dimensions depending on the type of headset used for signal acquisition. These differences make it difficult to combine datasets for large-scale machine learning or data mining applications.

Many traditional EEG features, including the raw signal, are channel-specific [1] and are not appropriate for processing multi-headset data with differing channel configurations. Frame-based EEG features, which extract values from a field topography [2], [3], are less channel-specific. However, they usually assume that all EEG datasets come from the same headset.

To represent EEG data regardless of headset configuration, we have investigated several variations of the classical Bag-of-Words (BOW) model, a widely used technique for extracting features from images in applications such as retrieval [4]. Images come in different sizes, shapes, and orientations, and BOW approaches are effective in mapping such data to a common feature set. Traditional BOW models build a dictionary of local features based on key points and then construct a histogram of the occurrences of these features in an image. A disadvantage of BOW features is that they lose information about the global spatial relationships of the key points in the image. However, this loss also makes the features robust to variations in scale and orientation.
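The dictionary-and-histogram idea behind a traditional BOW model can be illustrated with a minimal sketch. This is not the descriptor proposed in this report: the use of plain k-means to build the dictionary, the function names (build_dictionary, assign_words, bow_histogram), and the random arrays standing in for local feature vectors are all assumptions made for illustration. The point is only that two samples with different numbers of local features map to histograms of the same length.

```python
import numpy as np

def assign_words(features, words):
    """Return the index of the nearest codeword for every feature vector."""
    dists = np.linalg.norm(features[:, None, :] - words[None, :, :], axis=2)
    return dists.argmin(axis=1)

def build_dictionary(features, n_words, n_iter=20, seed=0):
    """Cluster local feature vectors into n_words codewords (plain k-means, an assumed choice)."""
    rng = np.random.default_rng(seed)
    # Initialize the codewords from randomly chosen feature vectors.
    start = rng.choice(len(features), size=n_words, replace=False)
    words = features[start].astype(float)
    for _ in range(n_iter):
        labels = assign_words(features, words)
        for k in range(n_words):
            members = features[labels == k]
            if len(members) > 0:  # an empty cluster keeps its previous codeword
                words[k] = members.mean(axis=0)
    return words

def bow_histogram(features, words):
    """Normalized histogram of codeword occurrences for one sample."""
    labels = assign_words(features, words)
    counts = np.bincount(labels, minlength=len(words)).astype(float)
    return counts / counts.sum()

# Hypothetical data: two samples with different numbers of local features.
rng = np.random.default_rng(1)
sample_a = rng.normal(size=(50, 8))    # 50 local feature vectors of length 8
sample_b = rng.normal(size=(200, 8))   # 200 local feature vectors of length 8

words = build_dictionary(np.vstack([sample_a, sample_b]), n_words=5)
hist_a = bow_histogram(sample_a, words)
hist_b = bow_histogram(sample_b, words)
print(hist_a.shape, hist_b.shape)      # both histograms have one bin per codeword
```

Because each histogram is normalized by the number of features in its sample, the representation depends on the relative frequency of codewords rather than on how many local features the sample contributed.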

In this document, we describe several BOW approaches for EEG data that retain some of the frequency, spatial, and temporal relationships in the signal. The proposed descriptors are relatively insensitive to the number of channels, channel placement, sampling rate, signal range, and subject response time. As a result, we can process EEG datasets of various configurations using a common dictionary of features. We have experimentally compared various approaches and parameters to provide an empirical basis for choosing optimal conditions. Section 2 describes the idea of configuration-independent EEG features based on BOW models, and Section 3 explains the implementation details and test results. Section 4 briefly discusses the implications of the results for EEG analysis.





Computer Science