
clean_rawdata's Introduction

Clean_rawdata EEGLAB plug-in

The Clean Rawdata plug-in (version 2.0) interface has been redesigned and will soon become the default EEGLAB method for removing artifacts from EEG and related data. The plug-in detects and can separate low-frequency drifts, flatline channels, and noisy channels from the data. It can also apply ASR (artifact subspace reconstruction) to detect and reject or remove high-amplitude non-brain ('artifact') activity (produced by eye blinks, muscle activity, sensor motion, etc.) by comparing its structure to that of known artifact-free reference data, thereby revealing and recovering (possibly smaller) EEG background activity that lies outside the subspace spanned by the artifact processes.

Note: This plug-in uses the Signal Processing toolbox for pre- and post-processing of the data (removing drifts, channels and time windows); the core ASR method (clean_asr) does not require this toolbox, but you will need to supply high-pass filtered data if you use it directly.

This project needs you

We need community help to maintain this project. Please review existing issues and submit pull requests. A section of this documentation with links to all the existing methodological papers is also needed.

Credit

This plug-in, clean_rawdata, cleans raw EEG data using methods (in particular Artifact Subspace Reconstruction, ASR) designed by Christian Kothe for the BCILAB toolbox (Kothe & Makeig, 2013). These functions were first wrapped into an EEGLAB plug-in by Makoto Miyakoshi and later further developed by Arnaud Delorme with input from Scott Makeig.

The private folder contains 3rd party utilities, including:

  • findjobj.m Copyright (C) 2007-2010 Yair M. Altman
  • asr_calibrate.m and asr_process.m Copyright (C) 2013 The Regents of the University of California. Note that these functions are not free for commercial use.
  • sphericalSplineInterpolate.m Copyright (C) 2009 Jason Farquhar
  • oct_fftfilt Copyright (C) 1996, 1997 John W. Eaton
  • utility functions from the BCILAB toolbox Copyright (C) 2010-2014 Christian Kothe

The folder "manopt" contains the Matlab toolbox for optimization on manifolds.

Graphic interface

Below we detail the GUI interface. The individual functions contain additional help information.

High pass filter the data

Check checkbox (1) if the data have not been high-pass filtered yet. If you use this option, the edit box in (2) allows setting the transition band for the high-pass filter in Hz. This is formatted as [transition-start, transition-end]. The default is 0.25 to 0.75 Hz.

Reject bad channels

Check checkbox (3) to reject bad channels. Option (4) allows removal of flat channels; the edit box sets the maximum tolerated (non-rejected) flatline duration in seconds. If a channel has a longer flatline than this, it is considered abnormal and rejected. The default is 5 seconds. Option (5) sets the line-noise criterion: if a channel has more line noise relative to its signal than this value (in standard deviations based on the total channel signal), it is considered abnormal. The default is 4 standard deviations. Option (6) sets the minimum channel correlation: if a channel correlates at less than this value with an estimate based on other nearby channels, it is considered abnormal in the given time window. This method requires that channel locations be available and roughly correct; otherwise a fallback criterion is used. The default is a correlation of 0.8.
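For scripted use, these three criteria map onto pop_clean_rawdata parameters (names as they appear in the issues below). A minimal sketch using the stated defaults, with all other stages disabled:

% Channel rejection only, with the GUI defaults described above.
EEG = pop_clean_rawdata(EEG, ...
    'FlatlineCriterion', 5, ...     % option (4): max flatline duration (s)
    'LineNoiseCriterion', 4, ...    % option (5): line-noise threshold (SD)
    'ChannelCriterion', 0.8, ...    % option (6): min channel correlation
    'Highpass', 'off', 'BurstCriterion', 'off', 'WindowCriterion', 'off');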

Artifact Subspace Reconstruction

Check checkbox (7) to use Artifact Subspace Reconstruction (ASR). ASR is described in this publication. In edit box (8) you may change the standard deviation cutoff for removal of bursts via ASR. Data portions whose variance is larger than this threshold relative to the calibration data are considered missing data and will be removed. The most aggressive value that can be used without losing much EEG is 3. New users are advised to first visually inspect the difference between the raw and the cleaned data (using eegplot) to get a sense of the content that is removed at various levels of this input variable. Here, a quite conservative value is 20; this is the current default value. Use edit box (9) to use Riemannian distance instead of Euclidean distance; this is a beta option, as the advantage of this method has not yet been clearly demonstrated. Checkbox (10) allows removal, instead of correction, of artifact-laden portions of data identified by ASR. One of the strengths of ASR is its ability to detect stretches of 'bad data' before correcting them. This option allows use of ASR for data-period rejection instead of correction, and is the default for offline data processing. ASR was originally designed as an online data-cleaning algorithm, in which case 'bad data' correction may be used.
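The corresponding scripted call, sketched with the default cutoff (parameter names as used elsewhere in this document):

% ASR burst handling only: 20 SD cutoff, rejecting (checkbox 10) rather
% than correcting the flagged data portions.
EEG = pop_clean_rawdata(EEG, ...
    'BurstCriterion', 20, ...          % edit box (8): SD cutoff
    'BurstRejection', 'on', ...        % 'off' corrects instead of rejecting
    'Distance', 'Euclidian', ...       % spelling as expected by the toolbox
    'FlatlineCriterion', 'off', 'ChannelCriterion', 'off', ...
    'LineNoiseCriterion', 'off', 'Highpass', 'off', 'WindowCriterion', 'off');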

Additional removal of 'bad data' periods

Check checkbox (11) to perform additional removal of bad-data periods. Edit box (12) sets the maximum percentage of contaminated channels that are tolerated in the final output data for each considered window. Edit box (13) sets the noise threshold for labeling a channel as contaminated.

Display rejected and corrected regions

Check checkbox (14) to plot rejection results overlaid on the original data. This option is useful for visually assessing the performance of a given ASR method.

Additional parameters are accessible through the command line interface of the clean_artifacts function.
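For reference, a full scripted call mirroring the GUI defaults discussed above (a sketch only; see the help text of pop_clean_rawdata and clean_artifacts for the authoritative option list):

% All stages enabled, with the defaults described in this section.
EEG = pop_clean_rawdata(EEG, ...
    'Highpass', [0.25 0.75], ...             % (2): transition band in Hz
    'FlatlineCriterion', 5, ...              % (4): flatline duration (s)
    'ChannelCriterion', 0.8, ...             % (6): min channel correlation
    'LineNoiseCriterion', 4, ...             % (5): line-noise SD threshold
    'BurstCriterion', 20, ...                % (8): ASR cutoff (SD)
    'BurstRejection', 'on', ...              % (10): reject rather than correct
    'WindowCriterion', 0.25, ...             % (12): max fraction of bad channels
    'WindowCriterionTolerances', [-Inf 7], ...
    'Distance', 'Euclidian');
% vis_artifacts can then be used to compare the cleaned data to the original.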

Additional documentation

Makoto Miyakoshi wrote a page in the wiki section of this repository discussing ASR.

Version history

v0.34 and earlier - original versions

v1.0 - new default values for some of the rejection tools, new GUI

v2.0 - new improved GUI, compatibility with studies

v2.1 - fix issue with 'distance' variable for burst detection

v2.2 - fix history call for pop_clean_rawdata

v2.3 - add maxmem to asr_calibrate to ensure reproducibility of results

v2.4 - fix issue with running the function in parallel in Matlab 2020a

v2.5 - move asr_calibrate out of the private folder so it can be used directly

v2.6 - allow excluding channels; a variety of small bug fixes

v2.7 - allow fusing channel rejection for datasets with the same subject and session (STUDY processing)

v2.8 - better error messages, and fix excluding channels (there was a rare crash)

v2.9 - fix bug when ignoring channels and removing channels at the same time, fix plotting issue with vis_artifact

v2.91 - add support for fractional sampling rate; fix too many splits with high sampling frequencies

clean_rawdata's People

Contributors

amisepa, arnodelorme, chkothe, rgougelet, smakeig


clean_rawdata's Issues

is continuity of calibration data important?

The clean_windows function seems to introduce discontinuities in the data to be used for calibration - can this affect the quality of the ASR calibration & cleaning?

I ask because I'm working with very long EEG recordings and the calibration stage is extremely long given the amount of reference data used.
So I'm considering randomly dropping a fraction of the samples in the reference data to speed things up; this would introduce many discontinuities - is that a reasonable approach?

Alternatively I could just take the first N samples of relatively clean data for calibration, but this is less preferable due to possible differences across the recording.
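A sketch of the subsampling idea being asked about (whether the added discontinuities harm calibration is exactly the open question; keep_frac is a hypothetical parameter):

% Randomly keep ~25% of the reference samples before calibration.
% NOTE: this destroys temporal continuity within the reference data.
keep_frac = 0.25;
nsamp     = size(ref_section.data, 2);
keep      = rand(1, nsamp) < keep_frac;
ref_section.data = ref_section.data(:, keep);
state = asr_calibrate(ref_section.data, ref_section.srate, cutoff);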

Inconsistent results using ASR

Dear all,
Initially, I used "clean raw data using ASR" with default values. Different numbers of channels were rejected and different durations of clean data were obtained across multiple runs on the same dataset. Later, I manually removed bad channels and removed bad segments through "Automatic continuous rejection", and faced the same problem as before. So I am getting inconsistent results for the same dataset across multiple runs.
Kindly help me in this regard.
Thanks in advance.

errors when using clean_rawdata or clean_artifacts in eeglab2019_1

Dear Sir/Madam,
I used "clean_rawdata" function downloaded from the plugin list in the EEGlab wiki with the old eeglab version "eeglab14_1_2b",
"EEG = clean_rawdata(EEG, 5, [-1], 0.8, 4, 20, 0.25)", everything was ok, no error appears.
However, I am using the "eeglab2019_1" version now that contains clean_rawdata and clean_artifacts in the plugin folder, but the clean_rawdata and clean_artifacts functions appears the following error no matter "EEG = clean_rawdata(EEG, 5, 'off', 0.8, 4, 20, 0.25)"
and even I use " clean =clean_artifacts(EEG);"

"Error using distance (line 93)
Not enough input arguments.
Error in clean_artifacts (line 230)
if ~strcmpi(distance, 'euclidian')"

Would you please help me on this problem?
Thank you very much.
Best regards,
Jane

Memory error

Hi there

I am getting "Not enough memory" errors running clean_rawdata 2.6 via the EEGLAB 2021.1 interface to do artifact subspace reconstruction (just actual ASR burst correction, none of the other functionality of clean_rawdata).
I've tested this on two different datasets (similar length, one is 410s at 1000 Hz) on two different machines (one with 128GB RAM and one with 8GB RAM).
EEGLAB reports the error occurring in asr_process (line 132), which is where it checks maxmem.
It seems to be due to maxmem defaulting to 64 in clean_artifacts (line 195) and/or in clean_asr (line 137), which go into asr_process.
I notice that without input, asr_process (line 110) and asr_calibrate (line 121) dynamically calculate maxmem as:
maxmem = hlp_memfree/(2^21)
I've got it 'working' via the GUI by replacing the 64 with [] in clean_artifacts (line 195) and in clean_asr (line 137) and adding
|| isempty(maxmem) to the if statement on line 120 of asr_calibrate, to make the code use hlp_memfree instead.
Is this reasonable?
(Obviously it can be solved by not using the EEGLAB interface, but I would like to include ASR in a pre-processing pipeline usable by non-coders)

Cheers!
Rohan
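A sketch of the change described above (the exact guard in asr_calibrate is not quoted here, so this is illustrative only):

% In clean_artifacts (line 195) and clean_asr (line 137): pass [] for maxmem
% instead of the hard-coded 64. Then, in asr_calibrate, fall back to the
% dynamic estimate when maxmem is empty:
if isempty(maxmem)
    maxmem = hlp_memfree/(2^21);   % estimate from free physical memory
end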

ASR burst criterion default threshold

Hi there,

In our lab we noticed that there is a discrepancy between the ASR default cutoff and the suggested "conservative" criterion reported in the README. Currently, the default criterion is set to 5 SD, but the suggested one is 10.

A cutoff of 5 is very aggressive, and there is evidence that a value of at least 10 should be used to obtain a sensible reduction of artifacts without losing too much information (Chang et al., 2018, also referenced in the wiki section of this repository).

Currently, the README file states:

For new users it is recommended to first visually inspect the difference between the raw and the cleaned data (using eegplot) to get a sense of the content that is removed at various levels of this input variable. Here, a quite conservative value is 10; this is the current default value.

This might throw off some users if they do not pay attention to the actual cutoff.

Suggestions

  • Set the default value for ASR at 10, possibly even higher (maybe 20) to stay on the conservative side. According to the paper cited above, a cutoff of 10 would potentially modify just under 60% of the data.
  • Modify the help text of the functions that use ASR (clean_rawdata, clean_asr, clean_artifacts), as they currently report the following: For new users it is recommended to at first visually inspect the difference between the original and cleaned data to get a sense of the removed content at various levels. A quite conservative value is 5. Default: 5. This might be another source of confusion, as 5 is described as conservative when it is actually aggressive. This help text should be updated regardless of whether the default parameter is modified.

Consult about "vis_artifacts failed" & "Skipping visualization"

Dear all,

I would like to consult you about a failure to plot rejected data after using Clean_Rawdata and the ASR algorithm. I am going to detail (1) the setting of EEGLAB and MATLAB on my computer and (2) the whole pipeline by the application of Clean_Rawdata and the ASR algorithm.

Setting:
I am using "MATLAB R2022a" and "EEGLAB v2022.0" on a DELL Inspiron 15 5510 that uses Windows.

Procedure:
1. I open MATLAB and type EEGLAB in the command window. (Btw, I cannot figure out how to address the warning. I have checked all folders including the "Toolbox" folder, but I do not see EEGLAB in any folders in the MATLAB folder.)

Screenshot (138)

2. I import a CNT file / .set file to EEGLAB. (I tried both types of files but still got the warning "vis_artifacts failed. " on the bottom.)

Screenshot (139)

3. I load channel locations. I choose the MNI coordinate file for BEM dipfit model.

4. I rereference data to the average of the two mastoids, following steps in "https://wiki.cimec.unitn.it/tiki-index.php?page=Data+pre-processing#Re-referencing". That is, I first append the electrode used for online referencing (i.e. CPz) in the "Edit channel info -- pop_chanedit()". Then, in the "pop_reref - average reference or re-reference data" window, I choose "re-reference data to channel(s)", and I choose TP9 / TP10 next to this option. Next, I check "Retain ref. channel(s) in data", and choose "CPz" for "Add old ref. channel back to the data". After that, I go to the same window again, choose "re-reference data to channel(s)", and select both "TP9" and "TP10" for this option. Then, I click "Ok". After rereferencing, the GUI looks like this:

Screenshot (140)

5. I first low-pass filter the data at 30 Hz, and then high-pass filter the data at 1 Hz in the "Filter the data -- pop_eegfiltnew()".

Screenshot (141)

6. In this step, I apply "pop_clean_rawdata()", leaving every parameter unchanged. This means that I do not high-pass filter the data again, but I (1) remove bad channels, (2) perform artifact subspace reconstruction bad burst correction to remove bad data periods, (3) perform additional removal of bad data periods, and (4) enable "Pop up scrolling data window with rejected data highlighted".

After applying "pop_clean_rawdata()", I cannot see the pop-up window showing the rejected data, and I receive this error message:

Warning: vis_artifacts failed. Skipping visualization.
In pop_clean_rawdata (line 180)

Screenshot (143)

Could you please help me with this issue? Thank you so much in advance!!!

Best regards,
Samuel

Regarding Mixing matrix in asr_calibrate.m

In the mixing matrix:

  1. Does real() have any significance? As far as I can see, the output from reshape is always a non-imaginary matrix.
  2. Do we need the sqrtm?
  3. Can the mixing matrix contain imaginary values? Not every sqrtm(A) produces a proper real-valued matrix.

M = sqrtm(real(reshape(block_geometric_median(U/blocksize),C,C)));

If you have any document supporting the explanation of this, please add it in a comment. Thank you.
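One small illustration (my own example, not from the toolbox) of why the real() guard can matter: rounding error can push an eigenvalue of a covariance-like matrix slightly below zero, and sqrtm of a matrix with a negative eigenvalue is complex.

A = [1 1; 1 1] - 1e-12*eye(2);  % eigenvalues ~2 and ~-1e-12 (slightly negative)
M = sqrtm(A);                   % complex: square root of the negative eigenvalue
max(abs(imag(M(:))))            % small but nonzero imaginary part
M = real(M);                    % the guard used in asr_calibrate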

Problem adding back channels removed

I am trying to do a bad trial rejection using the pop_clean_rawdata function with the following parameters:

  EEG_cortical_BT = pop_clean_rawdata(EEG_cortical, 'channels_ignore', {'HEOG', 'VEOG'},  'FlatlineCriterion','off',...
        'ChannelCriterion','off',...
        'LineNoiseCriterion','off',...
        'Highpass','off',...
        'BurstCriterion',20,...   
        'WindowCriterion',0.3,... 
        'BurstRejection','on',...
        'Distance','Euclidian',...
        'WindowCriterionTolerances',[-Inf 5]);

I want the function to not consider the EOG channels, but for some reason that I can't figure out, I get the following error:

Error using clean_artifacts (line 320)
Issue with adding back channels removed. Send us your data.

Am I doing something wrong or do I have to do some previous step to avoid this error? Here is the Drive link to the signal I am currently using: https://drive.google.com/drive/folders/1LqJiUg7PxSzE_QKXlP8esi7Tl4D5teVC?usp=sharing

Error in pop_clean_rawdata when calling multiple datasets

When I try to run pop_clean_rawdata with multiple datasets as input I get an error in:

[ EEG, com ] = eeg_eval( @clean_artifacts, EEG, 'params', options );

because clean_artifacts is given as a function handle instead of a string to eeg_eval().

I would suggest changing lines 134 and 136 to:
[ EEG, com ] = eeg_eval( 'clean_artifacts', EEG, 'warning', 'on', 'params', options );
and
[ EEG, com ] = eeg_eval( 'clean_artifacts', options );
respectively.

I am using the most recent version of clean_rawdata and eeglab2021 in matlab R2020b.
Best,
Cristina Gil

vis_artifacts compatibility issue between matlab versions

Dear developers,

I am using vis_artifacts to check data before and after preprocessing. I can run vis_artifacts in MATLAB R2020b without any problem; however, when I run it in MATLAB R2020a I get an error, because the scale variable in line 220 is defined as a row vector instead of a column vector in 2020a.

scale = ((ylr(2)-ylr(1))/size(new.data,1)) ./ (opts.yscaling*iqrange); scale(~isfinite(scale)) = 0;

That affects repmat() two lines later. I would suggest including
scale = scale(:);
just before line 222 to make scale a column vector and ensure compatibility with earlier versions of matlab.
Thank you very much!
Cristina Gil

fit_eeg_distribution in asr_calibrate crash

Transferred from https://sccn.ucsd.edu/bugzilla/show_bug.cgi?id=13115

The line H = bsxfun(@times,X(1:m,:),nbins./X(m,:)); also crashes when m = 0.

Description (Tyler Grummett, 2017-07-18):

In the clean_rawdata 0.32 toolbox, I am having an issue with the fit_eeg_distribution function, which is a subfunction of asr_calibrate. The error is as follows:

Error using bsxfun
Non-singleton dimensions of the two input arrays must match each other.

Error in asr_calibrate>fit_eeg_distribution (line 377)
kl = sum(bsxfun(@times,p,bsxfun(@minus,log(p),logq(1:end-1,:)))) + log(m);

Error in asr_calibrate (line 180)
[mu(c),sig(c)] = fit_eeg_distribution(rms,min_clean_fraction,max_dropout_fraction);

Error in clean_asr (line 164)
state = asr_calibrate(ref_section.data,ref_section.srate,cutoff);

Error in clean_artifacts (line 219)
EEG = clean_asr(EEG,burst_crit,[],[],[],burst_crit_refmaxbadchns,burst_crit_reftolerances,[]);

Error in clean_rawdata (line 83)
cleanEEG = clean_artifacts(EEG, 'FlatlineCriterion', arg_flatline,...

Looking into it a little further, it appears as though logq only has one row of numbers, so it crashes when it tries to grab logq(1:end-1,:). This occurs when m = 1, n = 6 (X is a 3x6 matrix of numbers), and therefore H is a row of NaNs.

Please note that this is one set of data that didn't work out of a large number of tasks and subjects, so it is working the vast majority of the time.

Regards,
Tyler

clean_rawdata() function

Hello,

I got the error when I run the clean_rawdata() function.
Could you please let me know how to solve it?
"Error using distance (line 93)
Not enough input arguments.

Error in clean_artifacts (line 230)
if ~strcmpi(distance, 'euclidian')

Error in clean_rawdata (line 86)
cleanEEG = clean_artifacts(EEG, 'FlatlineCriterion', arg_flatline,..."

clean_rawdata and -nojvm

Hi!
I can't run clean_rawdata in headless MATLAB mode using matlab -nojvm (-nodesktop still takes one "virtual" slot of the X sessions, which are capped at 75, at least for me).

The reason is this line:
hlp_memfree, which calls a java function to get free memory.

result = java.lang.management.ManagementFactory.getOperatingSystemMXBean().getFreePhysicalMemorySize();

I cannot override this in asr_calibrate, but it is correctly skipped in other functions, e.g. asr_process_r.

I think the bug is that asr_calibrate never receives the maxmem option.

(minor: maybe refactor that helper function given that it is defined again in asr_process_r)

Best, Bene

Edit: I got confused; it is asr_calibrate, not asr_calibrate_r (or maybe both? What's the difference, except that one is in the private folder and the other is not?).
Edit 2: I just learned that there are "-nodesktop" and "-nodisplay" options, which fixes the problem for me. The bug still exists, though.

High sampling rate problem (60 000 Hz)

I have an EEG data sample which contains six channels. Two of them are scalp channels (Tp9 & Tp10); the other four are implanted electrodes located near the medial geniculate nucleus (MGB) of the patient's brain during deep brain stimulation (DBS) surgery. The exact location is not known, but these four channels are only millimeters apart, in order to extract accurate local field potentials (LFP). I added the EEG data as an EDF file and the channel locations as a LOC file. When I try to run ASR with the following code, for example, I receive an error saying:

Input -->
EEG = eeg_checkset( EEG );
EEG = pop_clean_rawdata(EEG, 'FlatlineCriterion','off','ChannelCriterion','off','LineNoiseCriterion','off','Highpass','off','BurstCriterion',20,'WindowCriterion',0.25,'BurstRejection','on','Distance','Euclidian','WindowCriterionTolerances',[-Inf 7] );
[ALLEEG EEG CURRENTSET] = pop_newset(ALLEEG, EEG, 1,'overwrite','on','gui','off');

Output-->
'Array indices must be positive integers or logical values.
occurred in:
clean_windows: 105
clean_asr: 146
clean_artifacts: 233
pop_clean_rawdata: 151'

or

'Integers can only be combined with integers of the same class, or scalar doubles.'

I tried converting EEG.data to double in order to resolve the integer errors, but this does not seem to work.
I cannot go into the detailed functions to check where it goes wrong.
Can you please help me out?

Thank you in advance,
AD stimulus AD elektrodes.zip

Sincerely,
Jelle van der Eerden

I cannot use Clean Rawdata and ASR on a single dataset

Hello, I am using EEGLAB 2021 and I'm trying to use clean rawdata on a single dataset, but the option in Tools is always disabled, regardless of the previous pre-processing steps (removing baseline, filtering, etc.).
However, if I am working with a study, the option is always enabled. The only problem is that with a study the ASR tool fails because of the N-dimensionality of the EEG object.
I mostly work from the EEGLAB GUI because I am not very familiar with Matlab; how could I solve this issue?
I attached 2 screenshots:

  • first one shows the disabled option when using a single dataset
  • second one shows the error caused by running ASR with a multidimensional EEG object (3 datasets)

Screenshot (48)
Screenshot (49)

Thanks for any help you can provide and sorry for using the issue tracker as a Stackoverflow thread :/

ASR implementation in Python

I'm using Python instead of Matlab/EEGLab for signal processing.
Is there any implementation of the Artifact Subspace Reconstruction in Python?

Thanks

Possibly misleading documentation

Manually setting 'WindowCriterion' to 0.4 instead of the default 0.25 tends to reject more data (according to Greg Perlman). This contradicts the documentation, which indicates:
"Generally a lower value makes the criterion more aggressive"

Unable to see artifact visualization

I keep trying to use pop_clean_rawdata, and have even copied the latest 2.6 version, but I continue to get errors when trying to use it for viewing the artifacts that have been edited. I thought the issue might be that the dataset I am using has 20 channels, but when I pick out both the 20 channels as well as the 19-channel subset, neither allows this to work.

Specifically I get:
Warning: vis_artifacts failed. Skipping visualization.

In pop_clean_rawdata (line 158)

Any insight/suggestions would be greatly appreciated. TIA. Also, please let me know if it is not appropriate to list this issue here; I apologize if this is not the place to do so.

How to stop clean_rawdata deleting channels?

Hello, I recently ran into a problem. When I use clean_rawdata, it tends to delete one or two channels of the data, which affects my subsequent processing (I need all channels). Is there any way I can handle this?

Sample indices to retain in Clean_windows.m function might be wrong

File: Clean_windows.m

Line 116: swz = sort(wz);

Since the Z-score values are sorted, the correspondence to the original window indices also changes. This change is probably not taken into account, leading to a wrong calculation of removed_samples (line 126).
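A sketch of the concern (an interpretation, not a confirmed diagnosis): if positions in the sorted array are later used to index the original windows, the sort permutation has to be carried along.

% sort() reorders the z-scores; without the second output, positions in
% swz no longer correspond to the original window indices.
[swz, order] = sort(wz);   % 'order' maps each sorted entry back to its source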

asr_error

Please confirm. Thanks.
Best,
Velu

memory usage

asr_process() splits the data into chunks according to available memory; however, the moving_average() function within it ends up operating on the covariance matrix, thereby multiplying the size of each chunk by the number of channels. It also appears that it then doubles the size of the covariance matrix as well. So, for example, if there are 32 channels, the chunks are 64x too large. This seems to lead to consistent out-of-memory errors.

Simply dividing the chunk size by 2*nchan yields chunks that are too small (7 samples, in my case). I attempted to input each channel-by-channel vector individually into the moving_average function, but this ends up running too slowly. Perhaps there is a simpler solution I'm not seeing.

Citation of Bad Channel Removal Methods

Hi Arno (and others),

I wonder how to properly cite the bad channel removal methods within the CRD plugin. Are these coming from the PREP toolbox (findNoisyChannels.m)?

Thanks,
Velu

design_fir crashes on data with high sampling rates

Transferred from https://sccn.ucsd.edu/bugzilla/show_bug.cgi?id=12974

This bug involves the design_fir function, which is called by the clean_channels function in the clean_rawdata toolbox.

The design_fir function crashes on data with a high sampling rate (9600 Hz) but performs perfectly well on data with a lower sampling rate (9000 Hz).

line 89 of clean_channels:

B = design_fir(100,[2*[0 45 50]/signal.srate 1],[1 1 0 0]);

line 47 of design_fir:

F = interp1(round(F*nfft),A,(0:nfft),'pchip');

The original F ([2*[0 45 50]/signal.srate 1]) is multiplied by the nFFT (512) to produce [0 5 5 512] with a sampling rate of 9600, and [0 5 6 512] with a sampling rate of 9000. As griddedInterpolant needs unique numbers, the function crashes with the 9600 sr data.

If you increase the nFFT to 1024, then it becomes [0 10 11 1024] (round([2*[0 45 50]/9600 1]*1024)). However, it seems that the nFFT cannot be specified in clean_channels and is determined by the order of the filter, which is currently hardcoded in clean_channels (N=100).

Currently I only have a try/catch around it in clean_channels, where I specify the nfft:

try
    B = design_fir(100,[2*[0 45 50]/signal.srate 1],[1 1 0 0]);
catch
    B = design_fir(100,[2*[0 45 50]/signal.srate 1],[1 1 0 0],1024);
end

However, I don't know whether this is the best fix.

Regards,
Tyler

PS loving the toolbox

Formula for correlation

Dear team,

On line 126 of the clean_channels.m script, the formula for computing correlation seems to be missing the mean components (DC offset removal):
corrs(:,o) = sum(XX.*YY)./(sqrt(sum(XX.^2)).*sqrt(sum(YY.^2)));

I was wondering if it is implemented this way for some valid reasons or if it is a bug to be reported. So, I open this thread. Kindly help me understand this. Thank you.

Cheers,
Velu
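For comparison, the textbook Pearson correlation with explicit mean removal would look like this (a sketch reusing the variable names from the quoted line; whether omitting the means is intentional, e.g. because the signals are already high-pass filtered and near zero-mean, is exactly the question):

% Pearson correlation with the DC offset removed before the dot products:
XXc = bsxfun(@minus, XX, mean(XX));
YYc = bsxfun(@minus, YY, mean(YY));
corrs(:,o) = sum(XXc.*YYc) ./ (sqrt(sum(XXc.^2)) .* sqrt(sum(YYc.^2)));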

Different results with repeated runs of clean_rawdata

Hi there,

I just followed up on this thread (https://sccn.ucsd.edu/pipermail/eeglablist/2020/015456.html), which I found on the eeglablist, about inconsistent results when using clean_artifacts for bad channel detection and removal. I encounter the same problem. After running quite a few tests with a couple of different datasets and different parameters, I think I observed the following. Results are inconsistent when:

  1. The recording is short (less than 10 minutes recorded at 1000 Hz before downsampling to 250 Hz), i.e., in general, when the data has fewer than 150,000 samples. The fewer the samples, the more inconsistent the results are.
  2. The cut-off for the high-pass filter is 0.1 Hz (as used in ERP research - I did not test lower than this).

In my testing, these two conditions both needed to be met in order to find inconsistent results. Furthermore, they were related: the lower the number of samples, the higher the cut-off necessary to create inconsistent results. I think the problem arises from the use of rand() in clean_channels, which is called in the background. Setting a seed creates stable results, but it is difficult to tell which channels are actually bad.

I do understand this is a problem that not many people might run into given these conditions, but it could be helpful to have a warning appear (or simply be displayed in the console) when someone is trying to process a short dataset.

Proposed solution

Add a warning when calling clean_artifacts or pop_clean_rawdata (as it uses the former) if the length of EEG.data is shorter than N (I would say 150,000, that is, 10 minutes sampled/downsampled at 250 Hz). A simple implementation would be:

if length(EEG.data) < 150000
    warning('clean_artifacts: The dataset is rather short. It is possible to encounter inconsistent results for the detection of bad channels. Try a few times to verify this. ')
end

This applies to both functions; in pop_clean_rawdata the message could be displayed in a window (as it is likely that people are using the GUI in that case).

Thank you,

Daniele

Providing resting-state data as the clean reference in clean_windows.m

Dear developers,

This issue is a follow up on an email exchange I got with Arnaud Delorme and Makoto Miyakoshi a year ago.

I wanted to know whether it is possible to provide (for one given subject) Clean_rawdata with resting-state data to build the clean reference, and then use it to preprocess ERP data (which are surely dirtier than resting data). Since the idea of clean_windows is to find a clean reference section, I figured a resting-state recording would provide cleaner reference sections compared to portions of the ERP data.

Here is the code I used (MATLAB R2019b & EEGLAB 2021.0 & Clean_rawdata2.3):

% EEG settings
ERPRAW = XXX % provide here a subject's ERP recording (EEGLAB dataset format)
rsEEG = XXX % provide here a resting-state recording from the same subject (EEGLAB dataset format)
[ALLEEG EEG CURRENTSET ALLCOM] = eeglab; 

% ASR settings
asr_windowlen = max(0.5,1.5*ERPRAW.nbchan/ERPRAW.srate);
BurstCriterion = 10;
asr_stepsize = [];
maxdims = 1;
MaxMemory = 8000; % This is now in MB and not GB!
usegpu = false;
ASR_rsERP = ERPRAW; % Get the EEGLAB structure

% Set seed
rng(0) % For reproducible results 

% Running ASR
% Creating a clean reference section (based on resting data)
% EEGCleanRef = clean_windows(rsEEG,0.075,[-3.5 5.5],1);   
EEGCleanRef = clean_windows(rsEEG); % using defaults

% Calibrate on the reference data
state = asr_calibrate_r(double(EEGCleanRef.data), EEGCleanRef.srate,...
    BurstCriterion, [], [], [], [], [], [], []); 

% Extrapolate last few samples of the signal
sig = [ERPRAW.data bsxfun(@minus,2*ERPRAW.data(:,end),...
    ERPRAW.data(:,(end-1):-1:end-round(asr_windowlen/2*ERPRAW.srate)))];

% Process signal using ASR
[ASR_rsERP.data,state] = asr_process_r(sig,ERPRAW.srate,state,...
    asr_windowlen,asr_windowlen/2,asr_stepsize,maxdims,MaxMemory,usegpu);

% Shift signal content back (to compensate for processing delay)
ASR_rsERP.data(:,1:size(state.carry,2)) = [];

The code runs smoothly; unfortunately, using resting-state data to select the clean reference section seems to generate more noise than it filters out (RED = Raw; BLUE = Cleaned):
image

Does anyone have a hint why this is failing? Maybe this is conceptually wrong?

I currently have this procedure registered in an upcoming Registered Report and would like to dig in deeper to see if I can make it work.

Many thanks,

Corentin

Unsolved mystery using average reference

Import this dataset.

Do average reference. Apply clean_rawdata with the defaults, but changing the filter to 0.75 to 1.25 Hz. The resulting dataset crashes because the ASR rejection is too aggressive (331 samples left).

However, if you do not apply average reference with the same settings, then the size is 16,000 samples. Also, if you use the default 0.25 to 0.75 Hz filter, the size is 16,000 (even with average reference).

options = {'FlatlineCriterion',5,'ChannelCriterion',0.85, ...
            'LineNoiseCriterion',4,'Highpass',[0.75 1.25] ,'BurstCriterion',20, ...
            'WindowCriterion',0.25,'BurstRejection','on','Distance','Euclidian', ...
            'WindowCriterionTolerances',[-Inf 7] ,'fusechanrej',1};
pop_clean_rawdata(EEG, options{:});

There must be some weird problem in ASR and an interaction between the average reference and the high-pass filter. This needs further investigation by a motivated soul.

excluding the externals from being impacted by the function

Hi,
I was wondering if there is a way to specifically exclude the external channels from this function. I want to be able to reference back to them later, but I don't see an option to add a channel range.
As an example, I used to use EEG = pop_rejchan(EEG,'elec', [1:160],'threshold',5,'norm','on','measure','kurt');

But I would prefer to use the clean_artifacts function for this.

Thank you!
Douwe

BUG - version 2.6 - seeding the RNG in MATLAB 2020a

  ---------
  Error ID:
  ---------
  'MATLAB:rng:reseedLegacy'
  --------------
  Error Details:
  --------------
  Error using rng (line 99)
  The current random number generator is the legacy generator.  This is
  because you have executed a command such as rand('state',0), which activates
  MATLAB's legacy random number behavior.  You may not use RNG to reseed the
  legacy random number generator.
  
  Use rng('default') to reinitialize the random number generator to its
  startup configuration, or call RNG using a specific generator type, such as
  rng(seed,'twister').
  
  Error in rasr_nonlinear_eigenspace (line 59)
      rng(0); % seeds the random number generator to produce a predictable
      sequence of numbers

Windows to remove in clean_windows

Dear all,

I am having trouble understanding how the windows to remove are determined in the function clean_windows, code lines 119-122.

I don't understand exactly why we look only at channel (end-max_bad_channels) to check the condition (> max(zthresholds)) and only at channel (1+max_bad_channels) to check the condition (< min(zthresholds)).

And how, in that sense, does the variable 'max_bad_channels' refer to the maximum number or fraction of bad channels that a retained window may still contain (more than this and it is removed)?

Should we, for each window, check the conditions for all the channels, and keep only windows where at most max_bad_channels channels violate the above conditions?

Thank you in advance.
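One way to read those lines (an interpretation, not a confirmed answer): because the channel z-scores are sorted within each window, a single order statistic already encodes 'more than max_bad_channels channels violate the threshold', so no per-channel loop is needed.

% With swz = sort(wz) sorted over channels within each window: if the
% (end - max_bad_channels)-th entry already exceeds the upper threshold,
% then more than max_bad_channels channels exceed it in that window;
% symmetrically for the lower threshold.
swz = sort(wz);
remove_mask = swz(end - max_bad_channels, :) > max(zthresholds) | ...
              swz(1 + max_bad_channels, :) < min(zthresholds);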

ASR with low number of channels

Dear Arno & team,

This is out of curiosity: I wonder if ASR has been tested on a low-density EEG setup (such as 8 or 16 channels) and what the implications are in that case.

In general, is there any guideline for the optimal number of channels that ASR would require to function at its fullest?

Thank you so much,

Velu

Comments by Makoto

• The default max_mem is 64 (MB), but in asr_calibrate() it says 'The recommended value is at least 256.' Which is the recommended value?

We should change the recommended value to 64.

• In asr_calibrate, you use 2^21 to convert MB to Byte. But isn't it 2^20?

I copied the code from asr_process.m. It looks like it should be 2^20, but Christian put 2^21. I think there is a reason for that (like when you type "df" on Linux you have to divide the memory blocks by 2 or something). @chkothe, any idea?

• In the main GUI, 'Acceptable [min max] power range' is incorrect. In clean_windows line 105, sqrt is taken.

The function header also mentions power. I am confused @chkothe

  • This is a reminder in case I forget it tomorrow: when we specify a large value (4096 MB, for example) for max_mem, the final result seems improved. We want to ask Christian why tomorrow.

Good point.
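A quick unit check on the 2^20 vs 2^21 question above (my arithmetic, not an answer from the authors):

% 1 MB = 2^20 bytes, so:
freeMB = hlp_memfree/(2^20);   % free physical memory in MB
maxmem = hlp_memfree/(2^21);   % the value actually used, i.e. freeMB/2
% The code therefore budgets only half of the free memory in MB -- consistent
% with an intentional 50% safety margin rather than a simple unit error.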

eegh writes a wrong channel criterion

Dear developers,

I came across an issue when trying to retrieve the code for pop_clean_rawdata via eegh after calling it from the GUI. After using the default settings of the function on a dataset with channel locations, the code from eegh looks as follows:

EEG = pop_clean_rawdata(EEG, 'FlatlineCriterion',5,...
'ChannelCriterion',5,'LineNoiseCriterion',4,...
'Highpass','off','BurstCriterion',20,'WindowCriterion',0.25,...
'BurstRejection','on','Distance','Euclidian',...
'WindowCriterionTolerances',[-Inf 7] );

By default, the ChannelCriterion should be 0.8, but at some point this value is overwritten (most likely in clean_artifacts, which calls the function clean_channels) and set to 5, leading the function to delete all channels and produce the following warning:

'Your dataset appears to lack correct channel locations; using a location-free channel cleaning method.'

This is only a minor issue that appears when working from the command window rather than the GUI, but I hope it can be resolved at some point, since it is not easy to spot. I would appreciate it a lot if you could confirm this bug.

Kind Regards,

Arnd Meiser

ASR visualization issue

Dear all,
As I am new to using MATLAB and especially EEGLAB, I have followed all the instructions for applying clean_rawdata; however, I cannot visualize the original and new data at all. I received the following output in the command window:
"Use vis_artifacts to compare the cleaned data to the original.
Warning: vis_artifacts failed. Skipping visualization.

In pop_clean_rawdata (line 151) "

Can anyone help me fix the issue?

When 'eeglab' is called before running clean_rawdata(), the output changes.

I received the following user report on July 5, 2021: "These suggested to me that ASR performs well, but when I kept digging the problem I found a way to reproduce the bug. If I initialize eeglab before first computation and then run clean_rawdata(), I get stable results no matter how many runs it does. But a simple call of 'eeglab' command before every computation changes the behavior dramatically."

`asr_process` silently skips processing if insufficient memory

In asr_process.m, the number of splits is calculated with the line:

splits = ceil((C*C*S*8*8 + C*C*8*S/stepsize + C*S*8*2 + S*8*5) / (maxmem*1024*1024 - C*C*P*8*3));

However, the denominator can be negative if maxmem indicates there is insufficient memory. The following for i=1:splits loop is then skipped entirely and no processing is done, without any error or warning being printed.
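A minimal sketch of a guard that would fail loudly instead, keeping the formula quoted above (the message text is illustrative):

denom = maxmem*1024*1024 - C*C*P*8*3;
if denom <= 0
    error('asr_process:maxmem', ...
          'maxmem = %g MB is too small for %d channels; increase maxmem.', maxmem, C);
end
splits = ceil((C*C*S*8*8 + C*C*8*S/stepsize + C*S*8*2 + S*8*5) / denom);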

Remarks/bugs? clean_windows; asr_calibrate and clean_asr

Hello,

I have noticed some bugs in clean_asr.m :

1) The output of the function clean_windows that is called in clean_asr.m returns all the data and not only the calibration data.

ref_section = clean_windows(signal,ref_maxbadchannels,ref_tolerances, ref_wndlen);

I think it should be

[ref_section,sample_mask] = clean_windows(signal,ref_maxbadchannels,ref_tolerances,ref_wndlen);
ref_section.data=ref_section.data(:,sample_mask(1:length(ref_section.data)));

2) The function asr_calibrate, as it is called in clean_asr.m:

state = asr_calibrate(ref_section.data,ref_section.srate,cutoff);

doesn't contain the input parameter WindowLength (window length that is used to check the data for artifact content; default: 0.5 s).

3) In the function asr_calibrate.m, even if we change the value of WindowLength, it will not be taken into account because of the (nargin < 8):

if nargin < 8 || isempty(window_len)
window_len = 0.5; end

I think it should be

if nargin < 7 || isempty(window_len)
window_len = 0.5; end

Can you please confirm?
Thank you
Best

Discrepancies between EEG.etc.clean_sample_mask, EEG.event boundaries, and clean file length (EEG.pnts) after using clean_rawdata and ASR

Using the GUI in EEGLab ("Reject data using CleanRawdata and ASR"), I ran your plugin on my study, opting to "remove bad data periods (instead of correcting them)." I noticed the plugin prints the %data and number of seconds kept to the command window after rejection, but I am specifically interested in the %data kept between specific events in each file, so after all datasets were processed, I wrote a script to summarize this. However, after running this script, several of my files had >100% clean data, so I dug deeper and realized I had been making assumptions that were not true of the data structures. I want to make sure 1) that I understand the data structures accurately and that these assumptions are supposed to be true, and 2) that my solution to this discrepancy makes sense and does not harm the integrity of my dataset.

1) Are my assumptions supposed to be true? Am I misunderstanding anything?
Assumption 1: The length of EEG.etc.clean_sample_mask = the length of the original, raw file (pre-rejection)
Assumption 2: The sum of EEG.etc.clean_sample_mask = the length of the new, clean file (post-rejection)
Assumption 3: I can divide the sum of EEG.etc.clean_sample_mask between 2 indices by the difference in those indices to find the %clean data in that period (e.g. sum(EEG.etc.clean_sample_mask(start_index:end_index)) / (end_index - start_index) * 100)
Assumption 4: I can also find the length of the new, clean file by subtracting the sum of all boundary durations in EEG.event from the length of the original file (i.e. length(EEG.etc.clean_sample_mask)).

Given Assumptions 2 and 4, sum(EEG.etc.clean_sample_mask) = EEG.pnts = length(EEG.etc.clean_sample_mask) - total boundary duration (I used a for loop to sum the boundary durations from EEG.event). I wrote a script to tell me if Assumptions 2 and 4 were true for each dataset in my study by comparing each computation to EEG.pnts. For some datasets both were true, for some one assumption was true and one was false, and for some both were false.

2) My solution
When Assumptions 2 and 4 are true, do not change anything. There is no error.

When Assumption 2 is true and 4 is false, there is an error in EEG.event regarding boundary latency and/or duration; therefore I should use the clean_sample_mask to overwrite the boundary latencies and durations (i.e. a boundary should start where a 1 is followed by a 0 and last until the next 1).

When Assumption 4 is true and 2 is false, there is an error in EEG.etc.clean_sample_mask, therefore I should use EEG.event boundary information to overwrite the clean_sample_mask.

When Assumption 2 and 4 are both false... I'm not sure what to do about this yet. I think I'll have to look at these on a case-by-case basis because for some, Assumption 2 is false, but sum(EEG.etc.clean_sample_mask) is only 1 point greater than EEG.pnts, which seems to be a rounding difference, making Assumption 2 essentially true, but in other cases the values differ by much more and I have not yet determined the source of the discrepancies / what is supposed to be true.

I can upload code to demonstrate this, but I wanted to first make sure my understanding and logic is sound.
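A sketch of how Assumptions 2 and 4 can be checked programmatically (field names as used above; boundary events are assumed to carry type 'boundary' and a duration field):

% Compare both length estimates against EEG.pnts for one dataset:
mask    = EEG.etc.clean_sample_mask;
isbnd   = strcmp({EEG.event.type}, 'boundary');
bnd_dur = sum([EEG.event(isbnd).duration]);       % total boundary duration
ok_assumption2 = (sum(mask) == EEG.pnts);
ok_assumption4 = (length(mask) - bnd_dur == EEG.pnts);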

Add ytick channel labels in vis_artifacts

Transferred from Bugzilla.

Hopefully I am not overstepping, but I added some code to vis_artifacts in clean_rawdata0.32 after line 189:

channel_y = (ylr(2):(ylr(1)-ylr(2))/(size(new.data,1)-1):ylr(1))';

I added the code:

% add channel locations
set( hAxis, 'ytick', flipud(channel_y));
set( hAxis, 'yticklabel', fliplr({old.chanlocs.labels}));

It changes the ytick values and labels to the channel labels in line with the data, so that you can see more clearly what clean_rawdata has done in terms of channel removal.

Regards,
Tyler

Has the availableRAM_GB option been removed or replaced?

Hi,

I recently updated to 2.2 from within EEGLAB, and 'availableRAM_GB' now gives the following error.

EEG = clean_rawdata(EEG, -1, -1, -1, -1, 20, 0.25, 'availableRAM_GB', 8);
Error using clean_rawdata
Too many input arguments.

It seems to be because asr_calibrate does not accept it as a possible input anymore. I was wondering if it has been replaced by some alternative (or a default value), or why it was removed.

Thank you,
Cyrille

"Reject data using Clean Rawdata and ASR" not available

Hi there,

I am using EEGLAB v2019.1 on MATLAB 2017b and wanted to use "Reject data using Clean Rawdata and ASR".
I'm attaching the screenshot, but I can't seem to select it whichever dataset I upload, epoched or continuous.
Is this a bug, or is there something wrong with my data?
I updated my EEGLAB to v2021.0 but the issue remains. I also tried using it on MATLAB 2019b, but no luck.

Thanks for your help in advance!

eeglab_error
