High-Speed Visual Target Identification for Low-Cost Wearable Brain-Computer Interfaces

(1)

High-Speed Visual Target Identification for

Low-Cost Wearable Brain-Computer Interfaces

DOKYUN KIM1_{, (Student Member, IEEE), WOOSEOK BYUN}2_{, (Student Member, IEEE),}

YUNSEO KU3, (Member, IEEE), AND JI-HOON KIM 4, (Senior Member, IEEE)

1_{Department of Electrical Information Engineering, Seoul National University of Science and Technology, Seoul 01811, South Korea} 2_{Department of Electronics Engineering, Chungnam National University, Daejeon 34134, South Korea}

3_{Department of Biomedical Engineering, College of Medicine, Chungnam National University, Daejeon 34134, South Korea} 4_{Department of Electronic and Electrical Engineering, Ewha Womans University, Seoul 01811, South Korea}

Corresponding author: Ji-Hoon Kim ([email protected])

This work was supported in part by the National Research Foundation (NRF) of Korea through the Basic Science Research Program, funded by the Ministry of Science, ICT, and Future Planning, under Grant 2015R1D1A1A01060247, and in part by the X-Project through NRF, funded by the Ministry of Science, ICT, and Future Planning, under Grant 2017R1E1A2A02023127.

ABSTRACT Non-invasive brain-computer interfaces (BCI) have received a great deal of attention due to recent advances in signal processing. Two types of electroencephalograms (EEG), P300 and steady-state visual evoked potential (SSVEP), have been widely used to enable paralyzed patients to communicate with others. Although there have been many signal processing algorithms focusing on target identification accuracies such as power spectral density analysis (PSDA) and canonical correlation analysis (CCA), their high computational complexity drives up the cost of such systems. In the proposed lightweight target identification algorithm, we have focused on developing an improved information transfer rate (ITR) for high-quality communication and reducing overall implementation cost. The proposed algorithm, CCA-Lite, includes two algorithmic optimizations–signal binarization and on-the-fly covariance matrix calculation– which have enabled the development of a low-cost, single-channel, and wearable BCI system using SSVEP. The prototypical BCI system makes use of an ARM Cortex-M3-based low-cost microcontroller unit (MCU), which has been built for 1.5s SSVEP recordings. Compared to the state-of-the-art CCA-based target identification algorithm, CCA-Lite exhibits 25% better ITR and has reduced memory requirements by 92% and single-target identification cycle time by 26%.

INDEX TERMS Brain-computer interface (BCI), canonical correlation analysis (CCA), electroencephalo-gram (EEG), steady-state visual evoked potential (SSVEP), target identification.

I. INTRODUCTION

Many people with paralysis have difficulty communicating with others. Brain-computer interfaces (BCIs) can trans-late electrical signals from brain activity into interpretable information reflecting a user’s intent, without neuromuscu-lar control [1]–[3]. Because BCI technology only requires an electrical signal, it can provide communication and con-trol channels for people with devastating neuromuscular dis-orders such as amyotrophic lateral sclerosis (ALS), spinal cord injury, brainstem stroke, and cerebral palsy [1], [3]. For people with paralysis, a BCI speller can allow for the use of word processing programs [2], [4] or the control of devices such as a multi-dimensional cursor, robotic arm, and wheelchair [5], [6].

The associate editor coordinating the review of this manuscript and approving it for publication was Berdakh Abibullaev.

BCI technology can be classified as invasive or non-invasive, depending on whether surgery is performed. In non-invasive BCIs, the electroencephalogram (EEG)-based BCI speller has been widely used for people with paralysis, due to a high time-resolution, low cost, external electrode safety, and wide range of applications [7]–[9]. With recent advances in signal processing algorithms, the EEG-based BCI speller has improved in target identification accuracy, despite having a low signal-to-noise ratio (SNR) and spatial resolution [10]. Since a speller system for people with paralysis requires fast and intuitive communication, a steady-state visual evoked potential (SSVEP) has been widely used because of its high-speed communication and limited requirement for user training [8], [10], [11].

Recent studies on high-speed, SSVEP-based BCI

speller systems have primarily used electrode-array caps with an EEG recording instrument and external signal

(2)

processing machine. This type of BCI speller system is uncomfortable, takes a long time to install, and relies on a high-cost external computing machine [12]. Even though these systems have a fast information transfer rate (ITR), their high cost and massive form-factor limit their practicality for real world situations.

A previous study demonstrated a wearable, in-ear, SSVEP-based BCI speller system with a small form-factor [13]; however, the system could only acquire SSVEP signals and wirelessly transmit raw EEG data to an external signal-processing machine. Moreover, a subsequent study found that the power consumption of raw EEG data trans-mission could be reduced by 90% by extracting EEG features locally, in the device itself, to reduce the amount of data being wirelessly transmitted [14].

Typically, power spectral density analysis (PSDA)-based target identification has been used in single-channel SSVEPs [15], [16]; although, this approach yields a low signal-to-noise ratio (SNR) due to spontaneous EEG signals [10], [17]. Recently, canonical correlation analysis (CCA)-based target identification algorithms have been widely used in SSVEP-based BCI speller systems due to their high identification accuracy and noise robustness, compared to those of PSDA-based systems [16]. However, real time CCA-based algorithms are hard to implement in low-cost microcontroller units (MCUs) because of their complexity. Therefore, it will be crucial to develop an improved tar-get identification algorithm that is optimized for wearable devices, with negligible declines in performance over time.

Toward this end, a multi-channel, SSVEP-based, embed-ded BCI system has been proposed for performing CCA on a Cortex-M4F-based MCU [18]. Unfortunately, though, a multi-channel processing system usually cannot avoid matrix decomposition in CCA algorithms. Moreover, the computational complexity of matrix decomposition can be increased exponentially via floating point arithmetic, as opposed to a fixed-point system. Therefore, this embedded system requires high battery capacity and is very costly due to the high-performance (24-bit resolution) multi-channel ADC chip (TI ADS 1298), as well as the computational complexity of the CCA.

For a low-cost wearable BCI system, we demonstrated the possibility of utilizing single-channel SSVEP and achieved comparable accuracy performance compared to that of a multi-channel approach, as described in Fig. 4. Such a sys-tem requires lightweight, high-performance target identifica-tion algorithm to overcome poor-quality EEG signals; thus, we propose a lightweight, CCA-based target identification algorithm, CCA-Lite, which can run on a low-cost MCU with small battery capacity, without any external computing machine. In particular, the proposed CCA-Lite includes two algorithmic optimizations – signal binarization and on-the-flycovariance matrix calculations – which enable significant improvements in ITR, memory requirements, and system out-put latency, with a negligible decrease in target identification accuracy.

The rest of this paper is organized as follows. Section II introduces the overall SSVEP-based BCI speller system and well-known target identification algorithms. In section III, the proposed algorithm and its two optimizations are explained. The experimental results are described in section IV. Finally, Section V contains the conclusions from this study.

II. OVERVIEW OF TARGET IDENTIFICATION IN BCIs A. SSVEP-BASED BCI SPELLER SYSTEM

SSVEP-based BCI speller systems deliver a user’s intent in the form of electrical signals generated by visual stimuli [20], [1]. Typically, lights flicker on a screen at dif-ferent frequencies, which provokes a brain response in the visual cortex at a specific frequency. In turn, the SSVEP con-tains corresponding frequency components from the stimulus and its harmonics [21]. The proposed wearable SSVEP-based BCI speller is illustrated in Fig. 1.

1) TARGET GAZING

On the display device, several pre-defined visual targets with specific flickering frequencies are displayed using a speller GUI. When the user looks at a character, the brain response, SSVEP, is generated through the visual pathway, with a specific frequency of square-wave-modulated stimuli via joint-frequency and coding methods [25].

2) SIGNAL ACQUISITION

The single-channel SSVEP can be measured with the best signal quality in the occipital region of the scalp, where the visual cortex is located. The SSVEP dataset was recorded using a BioSemi ActiveTwo EEG system (Biosemi, Inc.) at a sampling rate of 2,048Hz, then down-sampled to 256Hz [22] and filtered using a 6-80Hz bandpass filter.

3) TARGET IDENTIFICATION

From the recorded SSVEP, the BCI speller system will find the target the user focused on. The three most widely used target identification algorithms – PSDA, CCA-Standard, and CCA-Comb – identify a target (e.g., ‘‘A,’’ ‘‘B,’’ ‘‘C,’’ etc.) from a predefined target-frequency mapping table, using the frequency components of the recorded SSVEP.

4) VISUAL FEEDBACK

The identified target information is wirelessly transmitted to the speller GUI software running on the display device.

5) GAZE SHIFTING

After the end of a flickering series, the user shifts his or her gaze to the next visual target to continue typing. The timeline for selecting a single target is shown in Fig. 1(b). To achieve high-speed communication in the proposed BCI speller sys-tem, the single-target identification cycle time, Tcyc, should

be minimized. Here, the signal processing time of the target identification algorithm, Tsel, is the only variable that can be

(3)

FIGURE 1. Proposed wearable SSVEP-based BCI speller system. (a) Single-target processing procedure and (b) timing diagram.

Propagation delay,τpd, which can be assumed to be a fixed

time duration between the onset of a visual stimulus and the SSVEP presentation in the occipital region, is typically reported as 120-140ms [19], [22]–[24]. SSVEP recording time, Tgaze, has a significant impact on the SSVEP SNR and

ITR performance, and can be considered as the optimal length via algorithmic simulation (in this paper, 1.5s). The wire-less communication delay, Tcomm, is very short and can be

ignored. According to recent studies, the minimum time for gaze shifting, Tshift, should be fixed at 0.5s [11], [19], [24].

The main performance evaluation metric for BCI spellers is the ITR, which considers various factors such as the num-ber of target frequencies, Nf, classification accuracy P, and

average target selection time T (seconds/selection) [19], can be understood through the following equation [20]:

ITR = log₂ Nf · PP·  1 − P Nf −1 1−P! × 60 T (1) The classification accuracy P was calculated using a leave-one-out cross-validation. Since the experiment was repeated 15 times under the same configuration, cross-validation was performed using 14 blocks for training and 1 block for testing.

B. TARGET IDENTIFICATION ALGORITHMS

In this sub-section, three commonly used target identification algorithms for SSVEP-based BCI speller are introduced and compared in terms of accuracy, relative signal processing time in MATLAB, and ITR.

1) POWER SPECTRAL DENSITY ANALYSIS (PSDA)

PSDA is the primary method of target identification for SSVEP-based BCI spellers. Although there are various meth-ods for implementing PSDA algorithms, they all involve finding the frequency at which a maximum magnitude in the power spectrum of the SSVEP signal occurs [15], [16].

PSDA is usually implemented using a fast Fourier trans-form (FFT) for a single-channel time-series signal.

Since the target frequencies differ by 0.5Hz (see C. Data Information), we defined the target decision criterion as ±0.25Hz from the target frequency. If the frequency of the maximum amplitude in the frequency domain belongs to a specific target decision boundary, the PSDA identifies the detected frequency as the corresponding target frequency of that region. Therefore, we performed a 1,024-point FFT for each 0.5s SSVEP segment (128 samples at a 256Hz sampling rate), with zero-padding so that the PSDA could exhibit a 0.25Hz frequency resolution.

If the FFT result is not one of the target frequencies, the larger amplitude frequency between two adjacent target frequencies becomes the identified result. Here, the SSVEP length is set to a multiple of 0.5s. Since the PSDA was designed to perform FFT only for 0.5s SSVEP, the PSDA was performed for a multiple of 0.5s SSVEP by accumulating the FFT results for each 0.5 segments. For multi-channel analysis, the PSDA result is obtained by accumulating FFT results for each channel.

2) CCA-STANDARD

Recently, SSVEP-based BCI speller studies have mostly used CCA-based algorithms [16]. In mathematics, CCA involves finding two sets of basis vectors such that the correlations between variable projections onto these basis vectors are mutually maximized [16]. In other words, CCAs find two sets of basis vectors between two variables, one for X and the other for Y , to maximize the canonical correlationρ for linear combinations x = XTwx and y = YTwy of the two

variables. X is the two-dimensional multi-channel SSVEP data matrix and Y is the set of reference sinusoidal signals for Nhharmonics of a specific target frequency fn, as described

(4)

FIGURE 2. Block diagram for the CCA-standard calculation process. The eigenvalue decomposition is computed for a multi-channel EEG.

of samples. The function to be maximized is as follows:

Yfn =        sin(2πfnt) cos(2πfnt) ... sin(2πNhfnt) cos(2πNhfnt)        , t = 1 fs ,2 fs , · · · ,Ns fs (2) ρ(x, y) = maxwx,wy ExyT q ExxT_{E yy}T =maxwx,wy EwT_xXYTwy r EwT xXXTwx E h wT yYYTwy i (3)

To calculate the canonical correlation, two variables x and y are preprocessed to be the zero-mean variables. Then, the total covariance matrix C is

C = Cxx Cxy Cyx Cyy = E x y  x y T (4) Cis a block matrix where Cxx and Cyyare within-set

covari-ance matrices of x and y, and these are transpositions of each other. The canonical correlation between x and y can be obtained by eigenvalue equations:

(

C_xx−1CxyCyy−1Cyxwˆx =ρ2wˆx

C_yy−1CyxCxx−1Cxywˆy=ρ2wˆy

(5) This calculation process is described in Fig. 2, where the parameter i indicates the number of SSVEP channels and j indicates 2 × Nh. In this paper, we refer to this original CCA

algorithm as CCA-Standard.

3) CCA-COMB

CCA-Comb is a combinatorial method for CCA-Standard and individual template-based CCA (IT-CCA) [11], [19], which detects temporal features of EEG signals using CCA-Standard between test data and individual template signals that can be obtained by averaging multiple training trials. The CCA-Comb uses the following four weight vectors

FIGURE 3. CCA-Comb calculation block diagram.

as spatial filters to enhance the SNR of SSVEPs, where n indi-cates the index of the target stimulus: (a) Wx( ˆX ¯χn) between

the test set ˆX and the individual template ¯χn, (b) Wx( ˆX Yn)

between the test set ˆXand the reference sinusoidal signals Yn,

(c) Wx( ¯χnYn) between the individual template ¯χn and the

reference sinusoidal signals Yn, and (d) Wy( ˆX Yn) between

the test set ˆX and the reference sinusoidal signals Yn. These

weight vectors are used for Pearson’s correlation calculations with three input data points, as represented in Fig. 3. The final feature for target identification is a weighted correlation coef-ficientρn, using correlation vectors rn,las follows [11], [19]:

rn =     rn,1 rn,2 rn,3 rn,4     =     r( ˆXTWx( ˆX Yn), YnTWy( ˆX Yn)) r( ˆXTWx( ˆX ¯χn), ¯χnTWx( ˆX ¯χn)) r( ˆXTWx( ˆX Yn), ¯χnTWx( ˆX Yn)) r( ˆXTWx( ¯χnYn), ¯χnTWx( ¯χnYn))     (6) ρn = X4 l=1sign(rn,l) · r 2 n,l (7)

(5)

where r(a, b) indicates the Pearson’s correlation coefficient between the two inputs a and b. As shown in Fig. 3, there are three CCA-Standard operations, four Pearson’s corre-lation calcucorre-lations, and several matrix multiplications and additions. Compared to CCA-Standard, the CCA-Comb is extremely complex, but is also known to have much better target identification accuracy at short SSVEP lengths.

4) MULTI-CHANNEL VS. SINGLE-CHANNEL

To improve ergonomics and usability, it is desirable to reduce the number of channels for SSVEP acquisition. However, given that there is a tradeoff between the number of channels and target identification accuracy, the validity of the target identification algorithm should be verified after reducing the number of channels.

FIGURE 4. Target identification accuracy of PSDA, CCA-Standard, and CCA-Comb, according to the SSVEP length between 8-channel and single-channel cases.

Fig. 4 shows the target identification accuracy for the three algorithms, according to the length of recorded 8-channel and single-channel SSVEPs. CCA-Comb exhibited the best accuracy for both 8-channel and single-channel cases, espe-cially for short SSVEPs. Unlike CCA-Standard and PSDA, CCA-Comb’s accuracy declined minimally across these cases. Thus, we adopted the single-channel system for use in further experiments. Since the CCA-Comb correlation vectors rn,2, rn,3, and rn,4 in (6) have the same value for

single-channel SSVEPs, the calculation for rn,3 and rn,4can

be omitted. This calulation omission comes from the fact that the weight vector Wx has only a single value in

single-channel SSVEP processing. On the other hand, these three correlation vectors should be calculated independently for the multi-channel SSVEPs. Although the calculation omission for rn,3and rn,4can be used for both the CCA-Comb and the

our proposed algorithm, this paper does not use it to clarify the effect of the proposed methods described in Section III.

The single-channel target identification accuracy, ITR, and relative signal processing time for the three algorithms

FIGURE 5. Target identification accuracy, ITR, and signal processing time of single-channel PSDA, CCA-Standard, and CCA-Comb.

are depicted on Fig. 5. CCA-Comb exhibited the highest target identification accuracy and ITR; although, the sig-nal processing time was more than three times that of CCA-Standard. The superiority of the single-channel SSVEP-based CCA-Comb can be emphasized since the ITR for a short SSVEP of 1 to 1.5 seconds was much better than that of the other two algorithms.

C. DATA INFORMATION

This work used single-channel (Oz) SSVEP data extracted from multi-channel SSVEP data from [22]. The BCI sys-tem used in the online experiment to acquire SSVEP data had 12 target visual stimuli (6 × 6 cm each) with a 60Hz refresh rate on a 27-inch, 1280 × 800-pixel LCD monitor (ASUS VG278). These stimuli were arranged in a 4×3 matrix with horizontal and vertical intervals between two neighbor-ing stimuli of 5cm and 1.5cm, respectively, and generated using joint-frequency and phase coding methods (12

fre-quencies: f0 = 9.25Hz; 1f = 0.5Hz; 4 phases: ∅0 = 0;

1∅ = 0.5π), as described in (8) [25],

stim(f, φ, i) = square(2πf (i/RefreshRate) + φ) (8) where square() generates a 50% duty cycle square-wave. Here, f indicates stimulus frequency with an initial phaseφ, and i indicates the frame index. This approach generates visual stimulus frequency up to half of RefreshRate [26].

The dataset was recorded with eight Ag/AgCl electrodes covering the occipital area, using a BioSemi ActiveTwo EEG system (Biosemi, Inc.) in 10 healthy subjects (9 males and 1 female, mean age: 28 years), with normal or corrected-to-normal vision. These eight electrodes were referenced to the CMS electrode, close to Cz. The subjects were asked to avoid during the stimulation period and to gaze at one of the 12 visual stimuli, generated at random, for 4s, and this single trial was repeated for all 12 targets. This experiment performed simulated online BCI speller to record data for offline analysis. In a dim room, the subjects were seated in

(6)

a comfortable chair that was placed 60cm in front of the monitor.

For each subject, the experiment consisted of 15 blocks for complete 12 trials. A single trial consisted of a 1s rest step, during which subjects were asked to shift their gaze to the target, and a 4s stimulation step. The SSVEP was originally amplified and digitized at a sampling rate of 2,048Hz, then down-sampled to 256Hz [22] and filtered from a 6-80Hz bandpass filter. SSVEP recording was initiated after the onset of a visual stimuli, with a 0.135s delay, in order to allow for visual processing.

FIGURE 6. Target identification evaluation platform.

D. TARGET IDENTIFICATION EVALUATION PLATFORM

As depicted in Fig 6., evaluating the performance of the proposed algorithm for an online wearable BCI system model involved a STM32F103ZET (STMicroelectronics Inc.) board as the MCU for target identification. This is an ARM Cortex-M3 CPU-based system-on-chip, with a 72MHz oper-ating frequency, that has 512kB of flash memory, 64kB of SRAM, and various peripherals.

All algorithms mentioned in this paper were implemented in the C language. Datasets were pre-stored in 64kB of SRAM to model individual template preparation and data acquisition.

III. PROPOSED ALGORITHM

For high-accuracy target identification in a low-cost wear-able BCI system, this algorithm should be optimized for real time processing, using minimal memory, on a low-performance MCU. If an algorithmic optimization can reduce the amount of SSVEP data, the memory requirement and computational complexity may also be reduced. In addition,

decreasing the timing-dependent operations minimizes the computational complexity and system output latency.

The proposed target identification algorithm has signif-icant improvements regarding the issues mentioned above, including signal binarization and on-the-fly covariance matrix calculation. Signal binarization reduces the amount of data, without critical information loss with respect to frequency and phase, while the on-the-fly covariance matrix calcula-tion reduces the system output latency and computacalcula-tional complexity.

FIGURE 7. Concept of signal binarization on SSVEP and reference sine signal.

A. SIGNAL BINARIZATION

Signal binarization assigns the amplitude of the SSVEP sig-nal a value of +1 or −1. Positive samples are mapped to +1 and negative or zero samples are mapped to −1, as shown in Fig. 7. Before applying signal binarization, CCA-Comb processes multi-bit EEG and multi-bit reference sinusoidal signals, which are defined as float-type variables in the C language. Accordingly, their implementation demands high memory requirements and computational complexity.

Conversely, when the eight binarized samples are assigned as single char-type variables (8-bit), with encoding of −1/+1 as 0/1, the computational complexity and memory require-ments can be reduced due to the data type change from float (4-byte) to char (1-byte) as well as the combination of eight samples into a single char-type variable.

Fig. 8 shows a frequency distortion caused by the signal binarization for SSVEPs, according to signal length. The three different vertical lines represent the peak frequency of the binarized SSVEP and the original SSVEP, and the target frequency flickered at stimulus display. A longer signal length correlates with a small frequency error between the

(7)

FIGURE 8. Peak frequency distortion of signal binarization in SSVEP (subject #8, 15th block data).

FIGURE 9. Averaged accuracy of CCA-Comb across all subjects for all dataset combinations, depending on whether or not signal binarization had been applied.

original and binarized SSVEPs. For SSVEPs longer than 1s, binarized SSVEP maintains frequency information without error.

This mapping policy can be applied to three datasets from CCA-Comb such as the individual template set, SSVEP signal set, and reference sinusoidal set. The accuracy of CCA-Comb with signal binarization depends on whether signal binarization is applied to each dataset. Fig. 9 shows the average accuracy of CCA-Comb across all subjects, with single-channel SSVEP lengths ranging from 0.5s to 3s

for all dataset combinations. This shows the rationale for our proposed signal binarization to determine which of the three input datasets it applies to and which it does not. There are total of eight signal binarization combinations for three datasets. In Fig. 9, we explain how we chose one of these eight combinations.

The number of harmonics in the reference sinusoidal set and the number of training trials were set to 4 and 14, respec-tively. Case 7 and Case 8, which have the lowest accuracy, were binarized in both the individual template set and the input SSVEP signal set. This means that one of the two EEG datasets should not have been extremely quantized, regardless of whether the signal binarization had been applied to the reference sinusoidal set.

Since this study pursued high-speed communication, we did not consider SSVEP lengths of 2s to 3s. The Case 1, Case2, Case 5, and Case 6 have huge memory requirements because the signal binarization is not applied to the individual template set which is pre-stored data in the memory. Among the Case 3 and Case 4, the Case 4 has higher accuracy. Therefore, we chose the Case 4 as the best combination for high speed, low memory requirement, and high accuracy.

Signal binarization that maintains signal phase informa-tion for individual template and reference sinusoidal sets can be represented by zero-crossing point analysis in the time domain, as shown in Fig. 10. If phase information is preserved after signal binarization, the zero-crossing points of the binarized and original signals should be located at the same position. As shown in Fig. 10, the zero-crossing points of the binarized individual template signal are at a similar position to those of the original individual template signal. This result indicates that applying signal binarization to the SSVEP signal can maintain phase information. In addition,

(8)

FIGURE 10. Zero-crossing points of the original individual template signal, binarized individual template signal, and CCA-filtered reference sinusoidal signal, using a binarized individual template signal.

the zero-crossing points were at similar positions after a CCA-Standard operation was performed on the reference sinusoidal signal, using the binarized individual template signal.

B. ON-THE-FLY COVARIANCE MATRIX CALCULATION

The CCA-Standard covariance matrix operation plays an important role in correlation analyses between two datasets. This essential operation becomes a bottleneck for fast target identification due to the internal mean calculation.

First, an input matrix M composed of two datasets, X and Y , is transformed into a matrix in which each signal is a zero-mean – i.e., the signal indicates each SSVEP vector or sinusoidal signal vector. Then, this signal-wise zero-mean matrix is multiplied with its transposed matrix. If the mean value of the signal is very close to zero, compared to the signal amplitude dynamic range, it is possible to perform a vector inner-product operation to begin receiving stream-in signal samples, without waiting for the last sample to calculate the mean value. This is key in reducing the single-target identi-fication cycle time. The covariance matrix operation can be described by C = Eh(M − E [M]) (M − E [M])Ti (9) Cij = cov Mi, Mj = E(Mi−µi) Mj−µj = EMiMj −µiµj (10)

where C is the covariance matrix, M is the input matrix consisting of two datasets, Cij is an element of covariance

matrix C, Mi is the ith row vector of M which can be a

1.5s-long SSVEP or 1.5s-long sine/cosine signal, andµi is

the mean of row vector Mi. If M is a signal-wise

close-to-zero-mean matrix, each mean value of theµiµj term can be

omitted.

Table 1 shows the three averaged SSVEP amplitude met-rics – minimum, maximum, and mean – and averaged stan-dard deviation of amplitude means across 10 subjects (upper table) and 12 targets (lower table). The amplitude mean is approximately zero, with a standard deviation of 0.13, while the SSVEP amplitude ranges from −19.48 to 19.86, on aver-age. Even though the mathematically strict covariance matrix operation involves subtracting a mean value from each vector, we omit this mean operation because the SSVEP signal has a very small amplitude mean and can be ignored. The reference sinusoidal signal inherently has an approximate zero-mean characteristic when the sinusoidal signal length is sufficiently long compared to its wavelength, much like the visual stimuli of the dataset we used.

(9)

FIGURE 11. Magnitudes of the difference between two terms that represent each element in the covariance matrix. (a) X : SSVEP signal set, Y : individual template set. (b) X : individual template set, Y : reference sinusoidal set. (c) X : SSVEP signal set, Y : reference sinusoidal set.

As indicated in (10), each element of the covariance matrix can be represented as the subtraction of two mean prod-ucts from the expectation of a two-vector multiplication. Fig. 11 shows that subtracting the multiplication of two means has a negligible effect on the final value of each element in the covariance matrix for all dataset combinations. Without the on-the-fly covariance matrix calculation, the entire covariance matrix operation must include an aver-aging operation for vector samples. Since the mean value of the vector samples cannot be obtained until the end of SSVEP recording, signal processing can only start after recording has finished. However, since the on-the-fly covariance matrix cal-culation does not require a subtracted mean value, the algo-rithm dose not need to wait for SSVEP recording to end. Therefore, the covariance matrix operation of CCA-Comb starts at the same time as SSVEP recording and is completed immediately after recording ends.

In CCA-Lite, the signal processing procedure can be divided into Stage-1 and Stage-2. Stage-1 only involves a covariance matrix calculation on the three CCA-Standrad procedures during SSVEP recording, while Stage-2 occurs after the three covariance matrix calculation in CCA-Lite. In CCA-Comb, the target selection time, Tsel, includes the

covariance matrix calculation time; whereas, the Tsel in

CCA-Lite only contains the time for Stage-2.

IV. RESULTS

The proposed optimization methods, signal binarization and on-the-fly covariance matrix calculation, can improve CCA-Comb performance with respect to memory require-ments, single-target identification cycle time, and computa-tional complexity for a low-cost wearable BCI speller.

As represented in Fig. 12, the proposed CCA-Lite has similar target identification accuracy to that of CCA-Comb, with negligible loss for 1.5s-long SSVEPs.

FIGURE 12. Comparison of target identification accuracy.

FIGURE 13. Comparison of target selection time and ITR.

While maintaining this accuracy, CCA-Lite can reduce the target selection time by 93% after the recording of 1.5s SSVEPs and can also improve the ITR by 25% compared to CCA-Comb as indicated in Fig. 13. Even though 1s SSVEPs have the highest ITRs, the target identification accuracy is under 80%. For convenient communication, a 1.5s SSVEP recording with higher target identification accuracy is more appropriate. Here, a 93% reduction in target selection time is mostly due to the on-the-fly covariance matrix calculation.

The signal processing time on the Cortex-M3, at 72MHz operating frequency, is represented in Fig. 14(a). This exper-iment was performed in a situation where all SSVEP sam-ples were stored without considering real time input from the SSVEP samples, in order to analyze the computational complexity of the MCU through the net signal processing time. The signal processing time of CCA-Lite was reduced by 47% compared to that of CCA-Comb. Since the signal binarization applied CCA-Comb (M1) has no effect on the improvement, the on-the-fly covariance matrix calculation is found to be the most effective method for reducing signal processing time.

As shown in Fig. 14(b), there was a significant reduc-tion (92%) in the memory requirements for CCA-Lite com-pared to that of CCA-Comb. For the memory requirements,

(10)

FIGURE 14. CCA-Lite performance evaluation for (a) signal processing time on the Cortex-M3 and (b) required memory, where M1 indicates the signal binarization applied CCA-Comb; M2 indicates the on-the-fly covariance matrix calculation applied CCA-Comb; and M1 + M2 is the proposed CCA-Lite.

FIGURE 15. Comparison of single-target identification cycle times.

the on-the-fly covariance matrix calculation has negligible effect on reducing memory requirements.

Fig. 15 represents the performance improvement of single-target identification cycle time compared to that of the baseline algorithm, CCA-Comb. The SSVEP recording time Tgaze, also known as visual stimulus time or signal

acquisition time, was fixed at 1.5s. The propagation delay τpd is assumed to be 135ms, and communication delay was

also fixed at negligible durations. Since the gaze shifting time should be guaranteed for at least 0.5s, Tshift.base and

Tshift.prop were fixed at 0.5s. Therefore, the 26% reduction

of single-target identification cycle time Tcycwas caused by

reducing the target selection time Tselthrough the on-the-fly

covariance matrix calculation of CCA-Lite.

These performance improvements can enable fast target identification speed and cost reductions, with a high target identification accuracy.

V. CONCLUSION

The proposed CCA-Lite for a low-cost wearable BCI speller is an improved version of CCA-Comb that results from two optimization methods: signal binarization and on-the-flycovariance matrix calculation. The memory requirements for the low-cost MCU platform can be significantly reduced by up to 92% through signal binarization, which performs extreme quantization on the input data. The computational complexity and signal processing time in the MCU can be reduced via these two optimization methods. In addition, the single-target identification cycle time can be reduced by up to 26% through the on-the-fly covariance matrix calcu-lation. This performance improvement in high-speed com-munication has increased the ITR by 25%. The proposed target identification algorithm can be used in various low-cost wearable BCI speller devices.

ACKNOWLEDGMENT

Dokyun Kim and Wooseok Byun contributed equally to this work.

REFERENCES

[1] D. J. McFarland and J. R. Wolpaw, ‘‘Brain-computer interface operation of robotic and prosthetic devices,’’ Computer, vol. 41, no. 10, pp. 52–56, Oct. 2008.

[2] N. Birbaumer et al., ‘‘A spelling device for the paralysed,’’ Nature, vol. 398, pp. 297–298, Mar. 1999.

[3] N. Birbaumer and L. G. Cohen, ‘‘Brain–computer interfaces: Communica-tion and restoraCommunica-tion of movement in paralysis,’’ J. Physiol., vol. 579, no. 3, pp. 621–636, Mar. 2007.

[4] E. W. Sellers, D. J. Krusienski, D. J. McFarland, T. M. Vaughan, and J. R. Wolpaw, ‘‘A P300 event-related potential brain-computer interface (BCI): The effects of matrix size and inter stimulus interval on perfor-mance,’’ Biol. Psychol., vol. 73, no. 3, pp. 242–252, 2006.

[5] J. R. Wolpaw and D. J. McFarland, ‘‘Control of a two-dimensional move-ment signal by a noninvasive brain-computer interface in humans,’’ Proc.

Nat. Acad. Sci. USA, vol. 101, no. 51, pp. 17849–17854, 2004. [6] M. Velliste, S. Perel, M. C. Spalding, A. S. Whitford, and A. B. Schwartz,

‘‘Cortical control of a prosthetic arm for self-feeding,’’ Nature, vol. 453, pp. 1098–1101, Jun. 2008.

[7] H. Cecotti, ‘‘Spelling with non-invasive brain–computer interfaces–current and future trends,’’ J. Physiol.-Paris, vol. 105, pp. 106–114, Jun. 2011. [8] D. Zhu, J. Bieger, G. G. Molina, and R. M. Aarts, ‘‘A survey of

stimu-lation methods used in SSVEP-based BCIs,’’ Comput. Intell. Neurosci., vol. 2010, Jan. 2010, Art. no. 702357.

[9] M. Arvaneh, C. Guan, K. K. Ang, and C. Quek, ‘‘Optimizing spatial filters by minimizing within-class dissimilarities in electroencephalogram-based brain–computer interface,’’ IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 4, pp. 610–619, Apr. 2013.

[10] M. Cheng, X. Gao, S. Gao, and D. Xu, ‘‘Design and implementation of a brain-computer interface with high transfer rates,’’ IEEE Trans. Biomed.

Eng., vol. 49, no. 10, pp. 1181–1186, Oct. 2002.

[11] Y. Wang, M. Nakanishi, Y.-T. Wang, and T.-P. Jung, ‘‘Enhancing detection of steady-state visual evoked potentials using individual training data,’’ in Proc. 36th Annu. Int. Conf. IEEE Eng. Med. Bio. Soc., Aug. 2014, pp. 3037–3040.

(11)

[12] M. Nakanishi, Y. Wang, X. Chen, Y.-T. Wang, X. Gao, and T.-P. Jung, ‘‘Enhancing detection of SSVEPs for a high-speed brain speller using task-related component analysis,’’ IEEE Trans. Biomed. Eng., vol. 65, no. 1, pp. 104–112, Jan. 2018.

[13] J. W. Ahn, Y. Ku, D. Y. Kim, J. Sohn, J.-H. Kim, and H. C. Kim, ‘‘Wear-able in-the-ear EEG system for SSVEP-based brain–computer interface,’’

Electron. Lett., vol. 54, no. 7, pp. 413–414, Mar. 2018.

[14] N. Verma, A. Shoeb, J. Bohorquez, J. Dawson, J. Guttag, and A. P. Chandrakasan, ‘‘A micro-power EEG acquisition SoC with integrated feature extraction processor for a chronic seizure detection system,’’ IEEE

J. Solid-State Circuits, vol. 45, no. 4, pp. 804–816, Apr. 2010.

[15] W. Yijun, W. Ruiping, G. Xiaorong, and G. Shangkai, ‘‘Brain-computer interface based on the high-frequency steady-state visual evoked poten-tial,’’ in Proc. 1st Int. Conf. Neural Interface Control, May 2005, pp. 37–39.

[16] Z. Lin, C. Zhang, W. Wu, and X. Gao, ‘‘Frequency recognition based on canonical correlation analysis for SSVEP-based BCIs,’’ IEEE Trans.

Biomed. Eng., vol. 54, no. 6, pp. 1172–1176, Jun. 2007.

[17] Y. Wang, R. Wang, X. Gao, B. Hong, and S. Gao, ‘‘A practical VEP-based brain-computer interface,’’ IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no. 2, pp. 234–240, 2006.

[18] M. Salvaro, S. Benatti, V. J. Kartsch, M. Guermandi, and L. Benini, ‘‘A minimally invasive low-power platform for real-time brain com-puter interaction based on canonical correlation analysis,’’ IEEE Internet

Things J., vol. 6, no. 1, pp. 967–977, Feb. 2019.

[19] M. Nakanishi, Y. Wang, Y.-T. Wang, Y. Mitsukura, and T.-P. Jung, ‘‘A high-speed brain speller using steady-state visual evoked potentials,’’ Int. J.

Neu-ral Syst., vol. 24, no. 6, Sep. 2014, Art. no. 1450019.

[20] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and T. M. Vaughan, ‘‘Brain–computer interfaces for communication and con-trol,’’ Clin. Neurophysiol., vol. 113, no. 6, pp. 767–791, Jun. 2002. [21] F.-B. Vialatte, M. Maurice, J. Dauwels, and A. Cichocki, ‘‘Steady-state

visually evoked potentials: Focus on essential paradigms and future per-spectives,’’ Prog. Neurobiol., vol. 90, no. 4, pp. 418–438, Apr. 2010. [22] M. Nakanishi, Y. Wang, Y.-T. Wang, and T.-P. Jung, ‘‘A comparison

study of canonical correlation analysis based methods for detecting steady-state visual evoked potentials,’’ PLoS ONE, vol. 10, no. 10, Oct. 2015, Art. no. e0140703

[23] M. Nakanishi, Y. Wang, Y.-T. Wang, and T.-P. Jung, ‘‘A dynamic stopping method for improving performance of steady-state visual evoked potential based brain-computer interfaces,’’ in Proc. IEEE 37th Ann. Int. Conf. Eng.

Med. Biol. Soc. (EMBC), Aug. 2015, pp. 1057–1060.

[24] X. Chen, Y. Wang, M. Nakanishi, X. Gao, T.-P. Jung, and S. Gao, ‘‘High-speed spelling with a noninvasive brain–computer interface,’’ Proc. Nat.

Acad. Sci. USA, vol. 112, no. 44, pp. E6058–E6067, 2015.

[25] X. Chen, Y. Wang, M. Nakanishi, T.-P. Jung, and X. Gao, ‘‘Hybrid fre-quency and phase coding for a high-speed SSVEP-based BCI speller,’’

Proc. 36th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., Aug. 2014, pp. 3993–3996.

[26] Y. Wang, Y.-T. Wang, and T.-P. Jung, ‘‘Visual stimulus design for high-rate SSVEP BCI,’’ Electron. Lett., vol. 46, no. 15, pp. 1057–1058, Jul. 2010.

DOKYUN KIM (S’18) received the B.S. degree in electronics engineering from Chungnam National University, Daejeon, South Korea, in 2017. He is currently pursuing the M.S. degree with the Department of Electrical and Information Engi-neering, Seoul National University of Science and Technology (SeoulTech), Seoul, South Korea.

His current research interests include brain-computer interface and biomedical wearable devices system-on-chips architecture.

WOOSEOK BYUN (S’13) received the B.S. degree in electronics engineering and the M.S. degree in electronics, radio, and information communications engineering from Chungnam National University, Daejeon, South Korea, in 2013 and 2015, respectively, where he is cur-rently pursuing the Ph.D. degree.

His current research interests include CPU/DSP, brain-computer interface, and energy-efficient neural processing engine.

YUNSEO KU (M’18) received the B.S. degree in electrical engineering and the M.S. and Ph.D. degrees in biomedical engineering from Seoul National University, Seoul, South Korea, in 2004, 2008, and 2017, respectively.

From 2008 to 2018, he was involved in research on biomedical circuits and systems for mobile healthcare applications with the Samsung Advanced Institute of Technology, Suwon, South Korea. He is currently an Assis-tant Professor with the Department of Biomedical Engineering, College of Medicine, Chungnam National University, Daejeon, South Korea. His research interests include biomedical system design and neuromodulation technologies for neurological disorders.

JI-HOON KIM (S’05–M’11–SM’17) received the B.S. (summa cum laude) and Ph.D. degrees in electrical engineering and computer science from KAIST, Daejeon, South Korea, in 2004 and 2009, respectively.

In 2009, he joined Samsung Electronics. In 2018, he joined the Faculty of the Department of Electronic and Electrical Engineering, Ewha Womans University, where he is currently an Associate Professor. His current research interests include CPU/DSP, communication modem, and low-power SoC design for security/biomedical systems. He is a Technical Committee Member of the circuits and systems for communications and VLSI systems and applications in the IEEE Circuits and Systems Society. He was a recipient of the Best Design Award from the Dongbu HiTek IP Design Contest, in 2007, and the First Place Award from the International SoC Design Conference Chip Design Contest, in 2008.