Comments (12)

cboulay commented on June 12, 2024

What's the longest duration xdf file you've had with this problem? Can you attach a problematic one that is at least 1 minute long to this issue?
And you're using the Matlab loader to import it?

If I recall correctly, the clock offsets are retrieved via UDP whereas the data come via TCP. If you have any reason to think that your network configuration might be dropping a large number of UDP packets then this could be the source of your problem.

from app-labrecorder.

dmedine commented on June 12, 2024

@garygan89, can you please confirm what Chad asked and also give details of your setup? Specifically, I am interested in what OS you are using on the outlet side and which OS you are using to host LabRecorder.

I can also say that my recent bouts with this issue have only occurred when using liblsl (or is it just LSL now?) >=14. When I have a chance I will downgrade to 13 and see if the problem persists.

garygan89 commented on June 12, 2024

> What's the longest duration xdf file you've had with this problem? Can you attach a problematic one that is at least 1 minute long to this issue?

The problem seems to be unrelated to the file size or duration, since I have seen the error in both 5 MB and 20 MB files. I will upload the problematic file when I'm at the lab later.

> And you're using the Matlab loader to import it?

Yes, I loaded it using the load_xdf function from EEGLAB (SCCN repo), from both the command line and the EEGLAB GUI.

> If I recall correctly, the clock offsets are retrieved via UDP whereas the data come via TCP. If you have any reason to think that your network configuration might be dropping a large number of UDP packets then this could be the source of your problem.

I ran both the LSL consumer and producer (SendDataC) in a closed-loop system, specifically the Freescale IMX8 SOM (ARM64/aarch64 architecture) that we mount on our custom board, with no IP assigned to eth0 because our custom board does not have a real Ethernet port. I suspected the missing IP at first, but the problem also happened on my reference board, which has an IP assigned to eth0.

garygan89 commented on June 12, 2024

> @garygan89, can you please confirm what Chad asked and also give details of your setup? Specifically, I am interested in what OS you are using on the outlet side and which OS you are using to host LabRecorder.
>
> I can also say that my recent bouts with this issue have only occurred when using liblsl (or is it just LSL now?) >=14. When I have a chance I will downgrade to 13 and see if the problem persists.

Yes, I just followed up on that with Chad. I'm running it on a Freescale IMX8 SOM on our custom board; the OS is Debian 10 (bullseye). Both the consumer (LabRecorderCLI v1.13.1, libLSL v1.14) and the producer (SendDataC from the liblsl examples) are running on the same host so that we can form a closed-loop system. I didn't assign an IP to the eth0 interface.

This issue seems to have started happening in liblsl v1.13 as reported, but I'm not sure whether it somehow crept into v1.14.

cboulay commented on June 12, 2024

I've never tried that kind of network setup. I'm happy you're using LSL and we'll try to fix this problem as best we can, but if you're running everything in a closed system on a custom platform, why not use shared memory? Shared memory will definitely have lower latency and be more efficient than LSL. LSL wins on flexibility, network synchronization across computers, and compatibility with many devices, but it sounds like you aren't using any of those features. Maybe you plan to?

As you're debugging this, please use https://github.com/xdf-modules/xdf-Matlab instead of the loader that comes with EEGLAB. Ultimately they should be the same thing, but if we provide a fix then it'll appear in xdf-Matlab before EEGLAB.

Also note in the load_xdf function there are many command line options like load_xdf(..., 'HandleClockSynchronization', false);
That'll stop it from trying to do clock synchronization, and clock synchronization is unnecessary when everything is on the same system.
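
For anyone following along in Python rather than Matlab, a rough equivalent might look like the sketch below. It assumes pyxdf is installed and treats its synchronize_clocks keyword as the counterpart of 'HandleClockSynchronization'. The helper name and the file name in the usage comment are placeholders, not something from this thread.

```python
def load_single_host_recording(path):
    """Load an XDF file without applying clock-offset correction.

    Assumption: every stream was recorded on the same host, so the
    per-stream clock offsets carry no useful information.
    """
    import pyxdf  # third-party package, imported lazily inside the helper

    # synchronize_clocks=False asks pyxdf to skip the clock-offset fit.
    streams, header = pyxdf.load_xdf(path, synchronize_clocks=False)
    return streams, header


# Usage (hypothetical file name):
# streams, header = load_single_host_recording("recording.xdf")
```

This only sidesteps the clock-offset fit; it is not a fix for whatever makes the fit fail.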

I hope to get Matlab again in a couple weeks. Until then I'll use pyxdf. Please attach the file when you can so I can try loading it in pyxdf to see if it loads and if it doesn't where the error is coming from, then maybe work backwards to find the source.

garygan89 commented on June 12, 2024

Thanks Chad. The primary motivation for using LSL in our closed-loop setting is really how LSL is able to synchronize multiple streams (we have EEG, visual stimulus presentation, and markers all running on the same board). And I reckon using LSL is the fastest way for me to pipe them together.

Here is the list of XDF files I uploaded to MediaFire: http://www.mediafire.com/folder/osblwmlc4u9at/LSL_XDF

The ones that gave the error are in the "Problematic" folder. The producer is the SendDataC code from the liblsl examples.

The -noeth in the filename just indicates that LabRecorderCLI was run on the system without any IP assigned to the eth0 interface. eth0 is the only network interface in the system.

I will further try to load them using xdf-Matlab after the weekend and see if that improves.

dmedine commented on June 12, 2024

I believe this problem results from a numerical issue. When calculating the mapping from outlet to LabRecorder inlet, load_xdf.m must perform a Cholesky decomposition of the matrix that is a combination of timestamps on the outlet PC.

What appears to be happening is that when the timestamps are very, very high (which they are when timestamps are the number of seconds since January 1, 1970), the combination A'A (where the first column of A is 1/.0001 and the second column is the timestamps/.0001) results by definition in a square, symmetric 2x2 matrix. This is theoretically guaranteed to be positive definite as long as the timestamps are not all identical, and therefore theoretically it can always be decomposed into LL* by Cholesky. But what appears to be happening is that sometimes A'A has an eigenvalue that is a very, very, very small negative (!) number. This appears to happen randomly as a result of the limits of numerical precision when performing eigenvalue and Cholesky decompositions. I suspect that this eigenvalue test, or something similar, is what Matlab's chol function does to test for positive definiteness, and this is why it is reporting an error.

For example, when I break into https://github.com/xdf-modules/xdf-Matlab/blob/master/load_xdf.m#L459, step into the robust fit function (https://github.com/xdf-modules/xdf-Matlab/blob/master/load_xdf.m#L747-L783), and examine eig(A'*A) on the sample @garygan89 provided called 'SendData-C-LabRecorderv1.13.1-noeth0-run1.xdf', I get the following:

K>> eig(A'*A)

ans =

   1.0e+27 *

   0.000000000000000
   2.312098056682916

And here chol works. However, if I do the same for the 'problematic' data in 'SendDataC-LabRecorderv1.13.1-noeth0-run3.xdf' I get this:

K>> eig(A'*A)

ans =

   1.0e+27 *

  -0.000000000000000
   3.082798078938963

Note the negative sign in the first value. In both cases the first eigenvalue should be 0, or very close to it on the positive side, but due to precision, it sometimes ends up on the negative and this (I am guessing) is what stops chol in its tracks.
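
To make the suspected cancellation concrete, here is a small Python sketch using only the standard library. It is not the code in load_xdf.m: it only builds the entries of a 2x2 A'A as described above and compares the second Cholesky pivot, c - b^2/a, in floating point against the exact value. The timestamps are made up (13 values, 5 s apart, of roughly epoch size), and the names SCALE, gram_entries and exact_pivot are illustrative.

```python
import sys
from fractions import Fraction

SCALE = 10**4  # 1 / 0.0001, the scaling described in the comment above


def gram_entries(timestamps, scale):
    """Float entries (a, b, c) of A'A = [[a, b], [b, c]],
    where A has first column `scale` and second column `scale * t`."""
    col0 = [float(scale)] * len(timestamps)
    col1 = [float(scale) * float(t) for t in timestamps]
    a = sum(x * x for x in col0)
    b = sum(x * y for x, y in zip(col0, col1))
    c = sum(y * y for y in col1)
    return a, b, c


def exact_pivot(timestamps, scale):
    """The same pivot, c - b^2/a, computed exactly with rationals."""
    n = len(timestamps)
    s1 = sum(Fraction(t) for t in timestamps)
    s2 = sum(Fraction(t) ** 2 for t in timestamps)
    return Fraction(scale) ** 2 * (s2 - s1 * s1 / n)


# Hypothetical clock-offset timestamps of epoch magnitude.
epoch_ts = [1_590_000_000 + 5 * i for i in range(13)]
a, b, c = gram_entries(epoch_ts, SCALE)
float_pivot = c - b * b / a  # a 2x2 Cholesky takes the square root of this
true_pivot = float(exact_pivot(epoch_ts, SCALE))
rounding_step = sys.float_info.epsilon * c  # relative precision at the size of c

print("true pivot   :", true_pivot)
print("float pivot  :", float_pivot)
print("rounding step:", rounding_step)

# Same data with the first timestamp subtracted before building A.
shifted_ts = [t - epoch_ts[0] for t in epoch_ts]
a0, b0, c0 = gram_entries(shifted_ts, SCALE)
shifted_pivot = c0 - b0 * b0 / a0
print("shifted pivot:", shifted_pivot)
```

With these made-up numbers the true pivot is positive but smaller than the rounding step at the magnitude of c, so the sign of the floating-point pivot is not reliable. Working with shifted timestamps keeps the entries small enough for the pivot to be recovered. Whether this is exactly what chol runs into in load_xdf.m is an assumption.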

The workaround seems to be to increase the WinsorThreshold value. I confess that I have never fully understood how this works or how this parameter truly has a Winsorizing effect on the ADMM algorithm, but when I set it to 1 (as I mentioned above, the default is .0001), the matrix A is smaller by a factor of 10^4 (so the eigenvalues of A'A are smaller by a factor of 10^8) and the numerical problem disappears:

K>> eig(A'*A)

ans =

   1.0e+19 *

   0.000000000000000
   3.082798078938964

By calling load_xdf.m with this option (s = load_xdf('SendDataC-LabRecorderv1.13.1-noeth0-run3.xdf', 'WinsorThreshold', 1.0);), I can successfully load the data set.
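
As a rough check on the scaling, the sketch below computes the entries of A'A exactly for the two threshold values, assuming (as described above) that both columns of A are divided by the threshold. The timestamps are made up, and gram_exact and relative_pivot are illustrative names, not functions from load_xdf.m.

```python
from fractions import Fraction


def gram_exact(timestamps, threshold):
    """Exact entries (a, b, c) of A'A when A = [1, t] / threshold."""
    k = 1 / Fraction(threshold)
    n = len(timestamps)
    s1 = sum(Fraction(t) for t in timestamps)
    s2 = sum(Fraction(t) ** 2 for t in timestamps)
    return k * k * n, k * k * s1, k * k * s2


def relative_pivot(a, b, c):
    """Second Cholesky pivot relative to the largest diagonal entry."""
    return (c - b * b / a) / c


# Hypothetical clock-offset timestamps of epoch magnitude.
timestamps = [1_590_000_000 + 5 * i for i in range(13)]

default = gram_exact(timestamps, Fraction(1, 10_000))  # WinsorThreshold = .0001
raised = gram_exact(timestamps, Fraction(1))           # WinsorThreshold = 1

# The trace equals the sum of the eigenvalues, so it scales the same way they do.
trace_ratio = (default[0] + default[2]) / (raised[0] + raised[2])
print("trace ratio:", float(trace_ratio))  # 100000000.0

print("relative pivot, default:", float(relative_pivot(*default)))
print("relative pivot, raised :", float(relative_pivot(*raised)))
```

The factor of 10^8 matches the change from 1.0e+27 to 1.0e+19 in the eig output. In exact arithmetic the relative size of the pivot is the same for both thresholds, so under these assumptions the rescaling changes the magnitudes but not the conditioning.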

Again, I am not sure how this affects the precision of the clock offset mapping, but this will allow you to load these problematic sets. I am also unsure what to do to fix this. If we are at the point where the time since the Epoch is so great that this is going to happen, then this whole mechanism needs to be fixed. After all, this workaround will stop working in about 100 million seconds ;-).

I am also unsure where to re-open this issue. Is it a problem with xdf-Matlab or liblsl? It is definitely not, however, a problem with LabRecorder, and that is a good thing.

dmedine commented on June 12, 2024

Also, 100 million seconds is only 3 years, so the clock is literally ticking!

garygan89 commented on June 12, 2024

Thanks for the detailed investigation @dmedine! I must admit I have little knowledge of Cholesky decomposition, but it is certainly good that these problematic XDF files are still loadable with no data loss. The loss of clock offset precision might not be as important in my case, since everything is streamed and timestamped on the same closed-loop host (I hope this is a correct statement for recording and streaming on the same host). More investigation probably needs to be done to see its effect on synchronizing multiple data streams.

As mentioned, it does sound like this is more of an issue inherent to liblsl than to LabRecorder, since the data streams are captured without any loss.

I will try your method and see if I can salvage all of those problematic ones.

dmedine commented on June 12, 2024

I have been doing some more experiments with a Raspberry Pi, and I believe this problem is quite serious. Currently, I am unable to synchronize streams between Windows and Raspbian when recording on Windows. The Winsor threshold trick is distorting the signal beyond recognition.

I am not sure how to confirm my original hypothesis that this is a numerical issue, but I will try some XDF surgery and see what I can figure out. In the meantime, I would say that you should proceed with extreme caution. Sorry.

cboulay commented on June 12, 2024

The fix has been merged in both xdf-Matlab and pyxdf.
@garygan89 , please let us know if you are still experiencing any problems.
