Giter Club home page Giter Club logo

Comments (151)

jester-xbmc avatar jester-xbmc commented on June 15, 2024

cross posting from the old closed ticket:

@mutability says:
I think it is generating the timestamps as little-endian; they should be big-endian.

ping @touil

from piaware.

mutability avatar mutability commented on June 15, 2024

The traceback indicates there might be a problem with the stream framing, too. I'd need to see the raw bytes to be sure.

20MHz means it won't work out-of-the-box with mlat-server. mlat-client should be mostly OK with it, but mlat-server will just never manage to get any sync with other receivers because the timing will be way off. I can add a new receiver type to mlat-client if it's always going to be 20MHz, but a simpler approach might be to make beast-format actually look like a Beast i.e. produce a 12MHz timestamp.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

new version fixes the endian issue (http://airspy.com/?ddownload=3753) but still this error:

02/17/2016 22:43:41 mlat(8338): Unexpected exception on connection to localhost:30005
02/17/2016 22:43:41 mlat(8338): Traceback (most recent call last):
02/17/2016 22:43:41 mlat(8338): File "/usr/lib/python3.2/asyncore.py", line 83, in read
02/17/2016 22:43:41 mlat(8338): File "/usr/lib/python3.2/asyncore.py", line 449, in handle_read_event
02/17/2016 22:43:41 mlat(8338): File "/tmp/buildd/piaware-2.1/debian/venv/lib/python3.2/site-packages/mlat/client/receiver.py", line 129, in handle_read
02/17/2016 22:43:41 mlat(8338): ValueError: Lost sync with input stream: expected 0x1A after 0x1A escape

indeed might help to either default 12 MHZ for beast streams or add parameter to select this (in airspy_adsb)

from piaware.

touil avatar touil commented on June 15, 2024

May be I'm not sending something? what's the sequence for?

from piaware.

mutability avatar mutability commented on June 15, 2024

The Beast format is originally a serial data format where you might start mid-packet, so it does the equivalent of DLE-stuffing with 1A characters to make it easier to scan for a packet start. See http://wiki.modesbeast.com/Mode-S_Beast:Data_Output_Formats. If you need to send a 1A byte within the body of the message, then it should be sent as 1A 1A.

from piaware.

mutability avatar mutability commented on June 15, 2024

Yes, it applies to everything. The only un-doubled 1A is the very first byte.

from piaware.

touil avatar touil commented on June 15, 2024

OK. Got it and fixing. Thanks.

from piaware.

touil avatar touil commented on June 15, 2024

Looks fine now.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

looks like we have a winner! i'll keep it running overnight, no errors so far.
thanks !

from piaware.

touil avatar touil commented on June 15, 2024

Now that the Beast format is implemented, we will have a firmware update that allows 12MHz MLAT. Stay tuned!

from piaware.

mutability avatar mutability commented on June 15, 2024

It would be fine to run at whatever rate you want and rescale that clock to 12MHz just for the Beast output. mlat-server just ends up scaling it by the clock frequency anyway. ADS-B based synchronization between receivers only gives about 0.5-1us accuracy so the rounding errors from rescaling aren't a big deal.

from piaware.

touil avatar touil commented on June 15, 2024

FYI. There will be another (cheaper) Airspy Mini device with can run at 12MSPS directly, so rescaling might be a good option to keep things simple.
In the other hand, I can add the re-scaling into the decoder itself and still get the nominal resolution at +/- 1 sample. So there will be no degradation in the end.
Home work for tonight!

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

After running the new version for a few hours, its pretty stable, however it seems the mlat-client doesn't sync with other nearby servers (and yes there are several nearby)

02/18/2016 12:26:35 mlat(712): Receiver status: connected
02/18/2016 12:26:35 mlat(712): Server status: not synchronized with any nearby receivers
02/18/2016 12:26:35 mlat(712): Receiver: 803.3 msg/s received 15.7kB/s from receiver
02/18/2016 12:26:35 mlat(712): Server: 0.0 kB/s from server 0.0kB/s TCP to server 2.1kB/s UDP to server
02/18/2016 12:26:35 mlat(712): Aircraft: 17 of 32 Mode S, 33 of 71 ADS-B used

from piaware.

touil avatar touil commented on June 15, 2024

That's expected. The MLAT is almost twice as fast in Airspy. Watch out for tonight's version.

from piaware.

touil avatar touil commented on June 15, 2024

Everything is implemented and working as expected.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

Getting closer but:

02/18/2016 22:31:20 mlat(5320): Receiver status: connected
02/18/2016 22:31:20 mlat(5320): Server status: clock unstable
02/18/2016 22:31:20 mlat(5320): Receiver: 383.4 msg/s received 7.5kB/s from receiver
02/18/2016 22:31:20 mlat(5320): Server: 0.0 kB/s from server 0.0kB/s TCP to server 0.8kB/s UDP to server
02/18/2016 22:31:20 mlat(5320): Aircraft: 2 of 9 Mode S, 19 of 31 ADS-B used

from piaware.

mutability avatar mutability commented on June 15, 2024

Yeah, definitely something wrong with the timing here.

The server takes pairs of even/odd DF17 position messages and pairs them up based on message contents + the interval between the messages. The message interval has to be within 1ms of the interval seen by other receivers, and they are sufficiently close that it's getting past that check. However after correcting for propagation delay from the aircraft to each receiver, the intervals disagree by a lot - 200ppm or more, and they jump around a lot.

from piaware.

touil avatar touil commented on June 15, 2024

Are you sure you are using v1.3?

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

Yup, just downloaded:
airspy_adsb v1.3

from piaware.

touil avatar touil commented on June 15, 2024

May be it's the scaling code. Basically ts_12MHz = (ts_20MHz * 12 + 20 / 2) / 20. It should be fine within 1 sample

from piaware.

mutability avatar mutability commented on June 15, 2024

I can't see anything obviously wrong with that scaling. Are you confident of the underlying timestamp?

from piaware.

mutability avatar mutability commented on June 15, 2024

I have turned on a little more debugging on the server side so I can see the actual differences in intervals, but the client reconnected at around 0115 UTC and seems to have changed behaviour at that point, now it's back to not even passing the 1ms test.

from piaware.

touil avatar touil commented on June 15, 2024

I updated the RPi build if someone want to give it a try. Reworked the whole thing to avoid accumulating errors.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

uhm, now it goes back to the "old" message:

02/19/2016 14:14:57 mlat(19379): Receiver status: connected
02/19/2016 14:14:57 mlat(19379): Server status: not synchronized with any nearby receivers
02/19/2016 14:14:57 mlat(19379): Receiver: 659.8 msg/s received 12.9kB/s from receiver
02/19/2016 14:14:57 mlat(19379): Server: 0.0 kB/s from server 0.0kB/s TCP to server 1.6kB/s UDP to server

from piaware.

touil avatar touil commented on June 15, 2024

@mutability Can you check the server side?

from piaware.

mutability avatar mutability commented on June 15, 2024

@jester-xbmc: This is for mac address b8:27:eb:50:0a:97 right?

@touil:

20160219 14:40:23.191 master INFO     clocktrack           interval out of range, 997434.8333us vs 1449631.0000us
20160219 14:40:23.193 master INFO     clocktrack           interval out of range, 685604.7500us vs 1010058.7500us
20160219 14:40:23.194 master INFO     clocktrack           interval out of range, 1460863.9167us vs 2104718.2500us
20160219 14:40:23.282 master INFO     clocktrack           interval out of range, 1048143.4167us vs 1520012.0006us
20160219 14:40:23.289 master INFO     clocktrack           interval out of range, 726760.4167us vs 1070821.6667us
20160219 14:40:23.312 master INFO     clocktrack           interval out of range, 726760.4167us vs 1070824.7500us
20160219 14:40:23.330 master INFO     clocktrack           interval out of range, 1048143.4167us vs 1520040.5000us
20160219 14:40:23.339 master INFO     clocktrack           interval out of range, 695248.8333us vs 1056890.3333us
20160219 14:40:23.517 master INFO     clocktrack           interval out of range, 352520.5000us vs 500019.3333us
20160219 14:40:23.584 master INFO     clocktrack           interval out of range, 352520.5000us vs 499971.5001us
20160219 14:40:23.673 master INFO     clocktrack           interval out of range, 352520.5000us vs 499998.9167us
20160219 14:40:23.692 master INFO     clocktrack           interval out of range, 312953.6667us vs 450621.0834us
20160219 14:40:23.693 master INFO     clocktrack           interval out of range, 1999907.2500us vs 2869857.5000us
20160219 14:40:23.695 master INFO     clocktrack           interval out of range, 372910.7500us vs 540043.5000us
20160219 14:40:23.696 master INFO     clocktrack           interval out of range, 1843207.2500us vs 2659225.5001us

Lefthand numbers are intervals as measured by the airspy, righthand are intervals as measured by other receivers that it's trying to match to. There's no particularly consistent factor there I can see. Note that some of them may be real mismatches (trying to match up transmission times that aren't for the same messages, this can happen when an identical position message is sent at different times)

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

@mutability correct.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

when running with /airspy_adsb -l 30005 -f beast -b -g 21 -p -d 2 -w 4
i'm getting MLAT data packets...

from piaware.

touil avatar touil commented on June 15, 2024

this means there were packet drops

from piaware.

mutability avatar mutability commented on June 15, 2024

Whatever you changed when you reconnected at 1515 brought the inter-message intervals back into range and the frequency offset numbers look sensible. Now the server reckons your clock is unstable because it sees periodic large jumps in the absolute clock offset:

20160219 15:16:04.897 master INFO     clocksync            fAT-b827eb280072_000084:fAT-b827eb500a97_000162: fAT-b827eb500a97_000162 clock was 142547.7us slower than predicted, drift error -0.8ppm

which if you were running dump1090 would mean that you were losing large chunks of samples before they reached dump1090 itself. (dump1090 will jump the timestamp forward if it drops a block of samples due to overload, but in extreme overload cases the USB subsystem can start dropping data too). I guess something similar can happen with the airspy.

from piaware.

touil avatar touil commented on June 15, 2024

The threading and decimation options can help a lot here. Can you check with 4 workers?

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

trying various -w and -d values, however after a number of MLAT messages it stalls, clock drift ? packet loss ?

from piaware.

touil avatar touil commented on June 15, 2024

Packets are being dropped.

from piaware.

touil avatar touil commented on June 15, 2024

I'll optimize libairspy for the RPi. More efficient unpacking to come. Stay tuned!

from piaware.

touil avatar touil commented on June 15, 2024

@mutability Anything worth mentioning in the server side?

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

As the mlat-client stalls reading directly from airspy_adsl, we've done a trick, however piping airspy_adsb output through modesmixer gives unstable results, 15 minutes of clock unstable and then 15 minutes of "normal" running but it never stalls.

02/20/2016 16:00:50 mlat(698): Receiver status: connected
02/20/2016 16:00:50 mlat(698): Server status: synchronized with 126 nearby receivers
02/20/2016 16:00:50 mlat(698): Receiver: 682.2 msg/s received 12.7kB/s from receiver
02/20/2016 16:00:50 mlat(698): Server: 0.7 kB/s from server 0.0kB/s TCP to server 2.4kB/s UDP to server
02/20/2016 16:00:50 mlat(698): Results: 331.0 positions/minute
02/20/2016 16:00:50 mlat(698): Aircraft: 14 of 28 Mode S, 25 of 55 ADS-B used
02/20/2016 16:04:32 33242 msgs recv'd from modesmixer2 (1228 in last 5m); 32602 msgs sent to FlightAware
02/20/2016 16:09:32 34592 msgs recv'd from modesmixer2 (1350 in last 5m); 33952 msgs sent to FlightAware
02/20/2016 16:14:32 35992 msgs recv'd from modesmixer2 (1400 in last 5m); 35352 msgs sent to FlightAware
02/20/2016 16:15:51 mlat(698): Receiver status: connected
02/20/2016 16:15:51 mlat(698): Server status: clock unstable
02/20/2016 16:15:51 mlat(698): Receiver: 736.3 msg/s received 13.7kB/s from receiver
02/20/2016 16:15:51 mlat(698): Server: 0.7 kB/s from server 0.0kB/s TCP to server 2.1kB/s UDP to server
02/20/2016 16:15:51 mlat(698): Results: 331.7 positions/minute
02/20/2016 16:15:51 mlat(698): Aircraft: 16 of 28 Mode S, 29 of 63 ADS-B used

from piaware.

mutability avatar mutability commented on June 15, 2024

What do you mean by "stalls reading directly from airspy_adsb"?

from piaware.

mutability avatar mutability commented on June 15, 2024

Note that the server status: is just a snapshot of the latest status reported; it only tells you something about the approx 30 second period before the report.

from piaware.

touil avatar touil commented on June 15, 2024

Is there any mechanism to compensate for packet losses? This can happen frequently with non real time kernels in tight hardware like the RPI.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

airspy_adsb -> fa-mlat-client stops "working" after about 5 minutes and server status gives clock unstable, stop working meaning no packets are given back to me (i'm monitoring using -mlatResultsFormat "basestation,listen,31003" to VRS)

with airspy_adsb -> modesmixer -> fa-mlat-client it keeps working, but still gives clock unstable

from piaware.

touil avatar touil commented on June 15, 2024

@mutability Need an airspy? ;-)

from piaware.

mutability avatar mutability commented on June 15, 2024

@jester-xbmc: that's probably just luck more than anything. The server tracks how "bad" the instability of a receiver is; over time if a receiver consistently produces unreliable results then it will trust it less and less; past a certain point it stops requesting data for mlat entirely (it keeps taking ADS-B data so it can check if the synchronization fixes itself). There is hysteresis there so once the receiver is inhibited like this, it has to produce good results for quite a long time before it gets turned back on again. Once a receiver is inhibited it's not contributing to mlat so it stops receiving results.

from piaware.

mutability avatar mutability commented on June 15, 2024

@touil: The server relies on the timestamp sequence being a continuous sequence, and being approximately the same frequency in the short term. If you drop samples and do not adjust your timestamp to compensate, then that looks to the server like the clock stopped for a while. It will handle this, after enough out of range values are seen it decides they were not just spurious outliers and starts using the new values, but that also flags the clock as unstable. If it happens often enough then the server stops using the receiver's data because it can't be sure the clock is predictable.

dump1090 has a couple of threads: a read thread that's doing the basic USB work via librtlsdr, and the main thread that's running the demodulator etc. Most of the CPU goes in the main thread. They communicate via a fixed-size FIFO of sample buffers. If the read thread needs to hand off a new sample buffer but finds the FIFO is full, then it drops the buffer, but importantly it still increments the timestamp. So in the "mild overload" case where the demodulator is not keeping up, but the read thread still gets enough CPU to at least copy the data out from USB, this avoids having the dropped buffer cause a jump in the timestamps. Once things get overloaded to the point where the read thread can't even keep up with USB, that doesn't help though.

See https://github.com/mutability/dump1090/blob/master/dump1090.c#L438 (which notices the FIFO is full and starts dropping data until it's at least half-empty) and https://github.com/mutability/dump1090/blob/master/dump1090.c#L450 ("outbuf->dropped" is the number of samples that were dropped between the last buffer that was processed and the current buffer)

Does airspy_adsb proactively drop samples when it finds it is overloaded? If so, does it adjust the timestamp for the missed samples?

from piaware.

touil avatar touil commented on June 15, 2024

Samples can be dropped in the RTL chip as well. Not whole buffers, but a few samples occasionally when the host computer fails to process the transfer for any given reason. These drops are much harder to track than the ones in the driver.
For airspy, we have counters both in the device side and the host side. It just requires some work.

from piaware.

mutability avatar mutability commented on June 15, 2024

Sure, I see exactly that behaviour where small numbers of samples are dropped in bursts (on the order of a few tens or hundreds of samples). There are a few long threads on the flightaware forums about it; it's the #1 cause of clock instability and it is very dependent on the exact setup (the particular dongle, Pi, USB cabling, ambient temperature..)

In practice it is infrequent enough that usually it only makes the receiver drop out for 30 seconds at a time, infrequently. The only case I've seen it be a real problem is with someone with a really long cheap USB cable.

However, this is not the behaviour I see with the airspy data: the airspy receiver is jumping by 100s of milliseconds (see my comment above), that's got to be a whole sample buffer or something similar, not just a failed USB transfer.

from piaware.

touil avatar touil commented on June 15, 2024

That's normal. I didn't implement them yet.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

Just for reference: its a pi2, with a fresh raspbian jessie light OS, running just the airspy, powersupply is official raspberry one, standard 1 meter USB cable that came with the airspy, added one extra clamp ferrite, temp. around 19 C, same setup runs fine with rtl-sdr.

from piaware.

mutability avatar mutability commented on June 15, 2024

I took a look at todays logs and it's currently seeing quite a lot of 4.9ms jumps which is different to what it was doing before:

20160220 02:09:33.485 master INFO     clocksync            fAT-0024d692b410_000172:fAT-b827eb500a97_000216: fAT-b827eb500a97_000216 clock was 4915.3us slower than predicted, drift error 0.1ppm
20160220 03:02:54.682 master INFO     clocksync            fAT-000c29d5f76e_000071:fAT-b827eb500a97_000216: fAT-b827eb500a97_000216 clock was 4914.9us slower than predicted, drift error 0.2ppm
20160220 03:03:15.250 master INFO     clocktrack           Inhibiting receiver fAT-b827eb500a97_000216 (stage 1: don't use mlat data)
20160220 03:07:45.237 master INFO     clocktrack           Uninhibiting receiver fAT-b827eb500a97_000216
20160220 03:10:20.460 master INFO     clocksync            fAT-000c29d5f76e_000071:fAT-b827eb500a97_000216: fAT-b827eb500a97_000216 clock was 4915.5us slower than predicted, drift error 0.2ppm
20160220 03:10:47.677 master INFO     clocktrack           Inhibiting receiver fAT-b827eb500a97_000216 (stage 1: don't use mlat data)
20160220 03:17:55.623 master INFO     clocktrack           Uninhibiting receiver fAT-b827eb500a97_000216
20160220 03:47:44.515 master INFO     clocksync            fAT-000c29d5f76e_000071:fAT-b827eb500a97_000216: fAT-b827eb500a97_000216 clock was 4915.1us slower than predicted, drift error 0.2ppm
20160220 03:48:00.665 master INFO     clocktrack           Inhibiting receiver fAT-b827eb500a97_000216 (stage 1: don't use mlat data)
20160220 03:50:05.100 master INFO     clocktrack           Uninhibiting receiver fAT-b827eb500a97_000216
20160220 03:50:24.153 master INFO     clocksync            fAT-063e380017dc_000102:fAT-b827eb500a97_000216: fAT-b827eb500a97_000216 clock was 4915.2us slower than predicted, drift error -0.5ppm
20160220 03:50:36.257 master INFO     clocktrack           Inhibiting receiver fAT-b827eb500a97_000216 (stage 1: don't use mlat data)
20160220 03:57:21.593 master INFO     clocktrack           Inhibiting receiver fAT-b827eb500a97_000216 (stage 2: disable mlat data feed)

(and it didn't recover until a reconnect)

Anything magic about that 4915us value? (It is a little under 100k samples at 20MHz)

from piaware.

touil avatar touil commented on June 15, 2024

OKI. Let's give it a try.

static int rx_callback(airspy_transfer_t* transfer) { adsb_parser_advance(transfer->ctx, transfer->dropped_samples); adsb_parser_process(transfer->ctx, transfer->samples, transfer->sample_count); return do_exit; }

from piaware.

touil avatar touil commented on June 15, 2024

4.9ms is what I call an improvement :-) I reworked all the threading yesterday. Let's see how it behaves with the host side counters.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

@mutability new version up and running, shows promise, can you check on server ?

02/20/2016 17:45:11 1882 msgs recv'd from airspy_adsb (1716 in last 5m); 1262 msgs sent to FlightAware
02/20/2016 17:50:11 3767 msgs recv'd from airspy_adsb (1885 in last 5m); 3147 msgs sent to FlightAware
02/20/2016 17:55:11 5765 msgs recv'd from airspy_adsb (1998 in last 5m); 5145 msgs sent to FlightAware
02/20/2016 17:56:33 mlat(758): Receiver status: connected
02/20/2016 17:56:33 mlat(758): Server status: synchronized with 119 nearby receivers
02/20/2016 17:56:33 mlat(758): Receiver: 936.8 msg/s received 18.5kB/s from receiver
02/20/2016 17:56:33 mlat(758): Server: 0.6 kB/s from server 0.0kB/s TCP to server 2.4kB/s UDP to server
02/20/2016 17:56:33 mlat(758): Results: 292.3 positions/minute
02/20/2016 17:56:33 mlat(758): Aircraft: 19 of 45 Mode S, 40 of 95 ADS-B used

from piaware.

touil avatar touil commented on June 15, 2024

Seems all is good now. @mutability Anything suspicious on the server side?

from piaware.

mutability avatar mutability commented on June 15, 2024

It is stable enough that it works, but I still see it dropping multiples of 4.9ms from time to time:

20160220 18:30:38.973 master INFO     clocksync            fAT-0024d692b410_000172:fAT-b827eb500a97_000327: fAT-b827eb500a97_000327 clock was 39322.2us slower than predicted, drift error -0.4ppm
20160220 18:35:29.919 master INFO     clocksync            fAT-b827eb500a97_000327:fAT-b827ebba5b45_000160: fAT-b827ebba5b45_000160 clock was 9831.4us slower than predicted, drift error -0.1ppm
20160220 19:02:12.211 master INFO     clocksync            fAT-0024d692b410_000172:fAT-b827eb500a97_000327: fAT-b827eb500a97_000327 clock was 4915.3us faster than predicted, drift error -0.1ppm

Also it seems to jump in both directions sometimes (in this case, one of the messages used for sync was 4.9ms early, so the server adjusted the offset, but then the next sync point with the same peer receiver using the adjusted offset was 4.9ms late):

20160220 18:51:32.109 master INFO     clocksync            fAT-b827eb500a97_000327:fAT-b827ebd75fb6_000002: fAT-b827ebd75fb6_000002 clock was 4915.2us slower than predicted, drift error -0.4ppm
20160220 18:51:32.875 master INFO     clocksync            fAT-b827eb500a97_000327:fAT-b827ebd75fb6_000002: fAT-b827ebd75fb6_000002 clock was 4915.3us faster than predicted, drift error -0.5ppm

I wonder if it is a problem with sample blocks being processed out of order?

from piaware.

touil avatar touil commented on June 15, 2024

Let's improve that. It looks like atomic operations in libairspy don't behave the the same way on ARM. This forces a much slower code based on mutexes...

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

@mutability new version up and running, could you entertain us with another peek at the server ?

02/20/2016 21:11:00 3049 msgs recv'd from airspy_adsb (903 in last 5m); 2602 msgs sent to FlightAware
02/20/2016 21:12:33 mlat(709): Receiver status: connected
02/20/2016 21:12:33 mlat(709): Server status: synchronized with 118 nearby receivers
02/20/2016 21:12:33 mlat(709): Receiver: 508.3 msg/s received 10.0kB/s from receiver
02/20/2016 21:12:33 mlat(709): Server: 0.1 kB/s from server 0.0kB/s TCP to server 1.2kB/s UDP to server
02/20/2016 21:12:33 mlat(709): Results: 55.3 positions/minute
02/20/2016 21:12:33 mlat(709): Aircraft: 6 of 11 Mode S, 30 of 49 ADS-B used

from piaware.

mutability avatar mutability commented on June 15, 2024

Mostly OK, a couple of 39.3ms (== 4.9ms*8) glitches:

20160220 21:03:44.474 master INFO     clocksync            fAT-063e380017dc_000102:fAT-b827eb500a97_000346: fAT-b827eb500a97_000346 clock was 39321.4us slower than predicted, drift error 0.4ppm 
(... same offset versus lots of other receivers ...)
20160220 21:24:47.143 master INFO     clocksync            fAT-b827eb500a97_000346:fAT-b827eb6d382a_000035: fAT-b827eb6d382a_000035 clock was 39321.6us faster than predicted, drift error -0.1ppm
(... same offset versus lots of other receivers ...)

from piaware.

touil avatar touil commented on June 15, 2024

Interesting... That's exactly the size of the fifo. https://github.com/airspy/host/blob/master/libairspy/src/airspy.c#L45

from piaware.

mutability avatar mutability commented on June 15, 2024

From a quick look:

You unlock the consumer_mp mutex in consumer_threadproc after decreasing the number of received buffers but before you have copied out the data you just grabbed from the FIFO, around here: https://github.com/airspy/host/blob/master/libairspy/src/airspy.c#L359

But I don't see what prevents the USB callback from reusing that buffer before you've done that copy. If the USB callback does get in first, then you end up processing samples that are FIFO-size samples "in the future" from where you think you are?

Delaying the change to received_buffer_count until after you've done the copy might work (I don't know offhand if it works out cheaper to release and reobtain the lock, or to hold the lock during the copy)

from piaware.

touil avatar touil commented on June 15, 2024

This? https://github.com/airspy/host/blob/master/libairspy/src/airspy.c#L442

from piaware.

mutability avatar mutability commented on June 15, 2024

But you release that lock here, before you have done the copy: https://github.com/airspy/host/blob/master/libairspy/src/airspy.c#L361

from piaware.

touil avatar touil commented on June 15, 2024

OK. You mean "consuming" the samples.

from piaware.

mutability avatar mutability commented on June 15, 2024

No, I mean this copy: https://github.com/airspy/host/blob/master/libairspy/src/airspy.c#L382

The sequence of events I am talking about is this (if I read the code right):

initially, assume

received_buffers_count = 8
received_samples_queue_head = 0
received_samples_queue_tail = 0
received_samples_queue[0] = some buffer B
there is a pending usb transfer callback for some buffer C

thread T1 is running in consumer_threadproc
thread T2 is running in airspy_libusb_transfer_callback

T1 acquires consumer_mp
T2 tries to acquire consumer_mp and blocks on L442
T1 sets input_samples = received_samples_queue[0] = B
T1 decrements received_buffers_count, received_buffers_count = 7,
T1 releases the lock; this preempts T1 and wakes T2
T2 acquires the lock
T2 swaps usb_transfer->buffer (C) and received_samples_queue0, now usb_transfer->buffer = B
T2 releases the lock, but doesn't get preempted
T2 submits usb_transfer to libusb
libusb submits buffer B to the kernel
the kernel fills buffer B
T1 wakes and starts processing input_samples = B, but that buffer now contains the wrong samples

from piaware.

touil avatar touil commented on June 15, 2024

I see what you mean, but the problem with that approach is a lock over a long conversion operation. It just needs another condition to skip the samples in that specific case.

from piaware.

mutability avatar mutability commented on June 15, 2024

Sure, that's what I mean about holding the lock vs. reobtaining it. All you need to make sure is that you don't update received_buffers_count until after the conversion is complete; you could do that via a separate acquire/decrement/release (or atomics) after the conversion is done.

from piaware.

touil avatar touil commented on June 15, 2024

It was relying on atomics before => not so portable. So, that's an extra acquire/decrement/release until I find something better. Thanks!

from piaware.

touil avatar touil commented on June 15, 2024

There were so many people hacking that code, including myself. But by the look of it it should be fine now. 4 x eyes rules! :-)

from piaware.

mutability avatar mutability commented on June 15, 2024

Move the lock/unlock so the mutex is held on entry to the loop, that'd eliminate the extra acquire/release.

from piaware.

touil avatar touil commented on June 15, 2024

That's worse. It will lock on a very long operation. We are only locking on fast ones now. If it was Windows only code I'd be using InterlockedIncrement/Decrement/Exchange functions. But apparently there's nothing portable with the same semantics.

from piaware.

mutability avatar mutability commented on June 15, 2024

No, it will not lock on a very long operation.

lock
while (...) {
  get next buffer, pthread_cond_wait as needed
  unlock
  process buffer
  lock
  update counter to release the buffer to the usb thread
}
unlock

If you want portable atomics, use stdatomic.h (you will need C11): http://en.cppreference.com/w/c/atomic

from piaware.

touil avatar touil commented on June 15, 2024

Yes, very valid. I thought you meant a single lock/unlock inside the while block.
Added: airspy/airspyone_host@750032e

Btw, the code should be running since yesterday. Anything in the server side?

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

I rebooted due to security update installation, however yesterday latest code is running and i'm not seeing any clock unstable messages:

02/21/2016 11:37:49 17821 msgs recv'd from airspy_adsb (2083 in last 5m); 17264 msgs sent to FlightAware
02/21/2016 11:38:54 mlat(693): Receiver status: connected
02/21/2016 11:38:54 mlat(693): Server status: synchronized with 108 nearby receivers
02/21/2016 11:38:54 mlat(693): Receiver: 1030.3 msg/s received 20.2kB/s from receiver
02/21/2016 11:38:54 mlat(693): Server: 0.6 kB/s from server 0.0kB/s TCP to server 2.1kB/s UDP to server
02/21/2016 11:38:54 mlat(693): Results: 289.4 positions/minute
02/21/2016 11:38:54 mlat(693): Aircraft: 22 of 38 Mode S, 30 of 91 ADS-B used
02/21/2016 11:42:49 19866 msgs recv'd from airspy_adsb (2045 in last 5m); 19309 msgs sent to FlightAware
02/21/2016 11:47:49 21788 msgs recv'd from airspy_adsb (1922 in last 5m); 21231 msgs sent to FlightAware
02/21/2016 11:52:49 23723 msgs recv'd from airspy_adsb (1935 in last 5m); 23166 msgs sent to FlightAware
02/21/2016 11:53:54 mlat(693): Receiver status: connected
02/21/2016 11:53:54 mlat(693): Server status: synchronized with 112 nearby receivers
02/21/2016 11:53:54 mlat(693): Receiver: 947.5 msg/s received 18.7kB/s from receiver
02/21/2016 11:53:54 mlat(693): Server: 0.5 kB/s from server 0.0kB/s TCP to server 1.9kB/s UDP to server
02/21/2016 11:53:54 mlat(693): Results: 247.4 positions/minute
02/21/2016 11:53:54 mlat(693): Aircraft: 16 of 36 Mode S, 36 of 87 ADS-B used

from piaware.

mutability avatar mutability commented on June 15, 2024

Still seems to occasionally drop or misorder data:

20160221 14:45:09.479 master INFO     clocksync            fAT-b827eb2ca9e0_000040:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 9830.2us faster than predicted, drift error -0.1ppm
20160221 14:45:09.687 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebf76967_000142: fAT-b827ebf76967_000142 clock was 4915.3us slower than predicted, drift error -0.2ppm
20160221 14:45:09.816 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebd75fb6_000002: fAT-b827ebd75fb6_000002 clock was 9830.3us slower than predicted, drift error 0.1ppm
20160221 14:45:11.499 master INFO     clocksync            fAT-b827eb2ca9e0_000040:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 9830.1us slower than predicted, drift error -0.4ppm
20160221 14:45:11.588 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebf76967_000142: fAT-b827ebf76967_000142 clock was 4915.8us faster than predicted, drift error -0.0ppm
20160221 14:45:11.687 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebd75fb6_000002: fAT-b827ebd75fb6_000002 clock was 9830.6us faster than predicted, drift error 0.2ppm
20160221 14:47:12.886 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827eba584f9_000114: fAT-b827eba584f9_000114 clock was 9830.2us slower than predicted, drift error -0.1ppm
20160221 14:47:12.944 master INFO     clocksync            fAT-b827eb2ca9e0_000040:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 14745.1us faster than predicted, drift error -0.1ppm
20160221 14:47:13.013 master INFO     clocksync            fAT-b827eb4f7376_000177:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 9829.9us faster than predicted, drift error -0.0ppm
20160221 14:47:13.401 master INFO     clocksync            fAT-b827eb3c0752_000126:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 4915.9us faster than predicted, drift error 0.0ppm
20160221 14:47:13.625 master INFO     clocksync            fAT-0024d692b410_000172:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 4916.1us faster than predicted, drift error -0.1ppm
20160221 14:47:14.227 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebbbe72c_000416: fAT-b827ebbbe72c_000416 clock was 4915.8us slower than predicted, drift error 0.3ppm
20160221 14:47:14.228 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebf76967_000142: fAT-b827ebf76967_000142 clock was 4915.3us slower than predicted, drift error 0.3ppm
20160221 14:47:14.300 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827eb71d588_000158: fAT-b827eb71d588_000158 clock was 4915.4us slower than predicted, drift error 0.2ppm
20160221 14:47:14.397 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebd75fb6_000002: fAT-b827ebd75fb6_000002 clock was 9830.3us slower than predicted, drift error 0.5ppm
20160221 14:47:14.432 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827eb6b68d8_000021: fAT-b827eb6b68d8_000021 clock was 9829.7us slower than predicted, drift error -0.0ppm
20160221 14:47:15.820 master INFO     clocksync            fAT-b827eb2ca9e0_000040:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 9830.9us slower than predicted, drift error 0.0ppm
20160221 14:47:15.821 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebd75fb6_000002: fAT-b827ebd75fb6_000002 clock was 9830.2us faster than predicted, drift error -0.0ppm
20160221 14:47:15.821 master INFO     clocksync            fAT-b827eb3c0752_000126:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 4915.6us slower than predicted, drift error 0.0ppm
20160221 14:47:15.822 master INFO     clocksync            fAT-0024d692b410_000172:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 4915.9us slower than predicted, drift error -0.0ppm
20160221 14:47:15.933 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827eba584f9_000114: fAT-b827eba584f9_000114 clock was 9830.2us faster than predicted, drift error -0.1ppm
20160221 14:47:16.240 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebf76967_000142: fAT-b827ebf76967_000142 clock was 4915.5us faster than predicted, drift error 0.0ppm
20160221 14:47:18.193 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827eb6b68d8_000021: fAT-b827eb6b68d8_000021 clock was 9829.7us faster than predicted, drift error -0.2ppm
20160221 14:47:18.921 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827eb71d588_000158: fAT-b827eb71d588_000158 clock was 4914.7us faster than predicted, drift error 0.0ppm
20160221 14:47:19.175 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebbbe72c_000416: fAT-b827ebbbe72c_000416 clock was 4915.3us faster than predicted, drift error 0.0ppm
20160221 15:17:07.940 master INFO     clocksync            fAT-b827eb2ca9e0_000040:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 9830.4us faster than predicted, drift error -0.0ppm
20160221 15:17:10.269 master INFO     clocksync            fAT-b827eb2ca9e0_000040:fAT-b827eb500a97_000398: fAT-b827eb500a97_000398 clock was 9830.5us slower than predicted, drift error -0.3ppm
20160221 15:30:19.846 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebf76967_000142: fAT-b827ebf76967_000142 clock was 4915.4us slower than predicted, drift error -0.0ppm
20160221 15:30:22.672 master INFO     clocksync            fAT-b827eb500a97_000398:fAT-b827ebf76967_000142: fAT-b827ebf76967_000142 clock was 4915.5us faster than predicted, drift error 0.1ppm

from piaware.

touil avatar touil commented on June 15, 2024

@jester-xbmc Are you sure you are using the latest build?

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

v1.5 -> 43176 Feb 21 00:21 airspy_adsb

from piaware.

touil avatar touil commented on June 15, 2024

That's not the latest with the fix...
Just to be sure: http://airspy.com/downloads/airspy_adsb-linux-x86.tgz

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

you mean: http://airspy.com/downloads/airspy_adsb-linux-arm.tgz
;)
its running now -> 43176 Feb 21 15:01 airspy_adsb

from piaware.

touil avatar touil commented on June 15, 2024

Yes. my bad.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

Everything nominal on this side: VRS reported 1 badly formatted message against 5931 good ones
02/21/2016 16:02:29 5986 msgs recv'd from airspy_adsb (2009 in last 5m); 5461 msgs sent to FlightAware
02/21/2016 16:03:22 mlat(708): Receiver status: connected
02/21/2016 16:03:22 mlat(708): Server status: synchronized with 119 nearby receivers
02/21/2016 16:03:22 mlat(708): Receiver: 918.6 msg/s received 18.0kB/s from receiver
02/21/2016 16:03:22 mlat(708): Server: 0.7 kB/s from server 0.0kB/s TCP to server 2.4kB/s UDP to server
02/21/2016 16:03:22 mlat(708): Results: 307.1 positions/minute
02/21/2016 16:03:22 mlat(708): Aircraft: 19 of 42 Mode S, 36 of 84 ADS-B used

from piaware.

mutability avatar mutability commented on June 15, 2024

Still seeing the same behaviour on the server side.

from piaware.

touil avatar touil commented on June 15, 2024

Strange. I see no obvious errors in the lib.

from piaware.

mutability avatar mutability commented on June 15, 2024

Yeah, nothing obvious, but the jumps are always multiples of 4915us which is highly suspicious.

Can you ever get short transfers from libusb in practice? The airspy lib doesn't seem to look at actual_length in the usb transfer struct, it always assumes it got a full buffer.

from piaware.

mutability avatar mutability commented on June 15, 2024

If you were using the default 256kB buffer size and libusb gave you a short transfer that only filled 64kB of it (but you assumed the whole buffer was full), then your timestamp would effectively jump by 4915.2us. Hmm...

from piaware.

touil avatar touil commented on June 15, 2024

It always gives full transfers. I can put an assert on that.

from piaware.

mutability avatar mutability commented on June 15, 2024

You may want to set LIBUSB_TRANSFER_SHORT_NOT_OK (https://github.com/libusb/libusb/blob/master/libusb/libusb.h#L1138)

from piaware.

touil avatar touil commented on June 15, 2024

OK. I added that flag and debugged the thing on rpi2 with lots of traces. Here are my observations:

  • No buffers skipped at all when using 2 or more workers.
  • No transfer errors
  • All the transfers are completely filled (usb_transfer->actual_length == usb_transfer->length)

I can open a port so you can have a look on the data.

from piaware.

mutability avatar mutability commented on June 15, 2024

I'm not sure if there is much I can debug remotely, the raw data is not all that much use as the question is "do the message timestamps accurately reflect the arrival time of the message" which you can only work out if you have other receivers to correlate with.

Since it might be USB related and that's notoriously sensitive to the particular hardware etc, it might be worth getting a version with extra paranoia (actual_length checks etc) running on jester's setup?

from piaware.

touil avatar touil commented on June 15, 2024

OK. Let's do that.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

just throwing this out there, as its USB related, doing this should not cause issues like the above I assume ? -> usbhid.quirks=0xdead:0xbeef:0x4 into /boot/cmdline.txt to stop dmesg spam should

from piaware.

touil avatar touil commented on June 15, 2024

It's still running with the assertions I mentioned earlier.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

just checked, still running.

from piaware.

touil avatar touil commented on June 15, 2024

So, definitely nothing lost in libairspy.

from piaware.

mutability avatar mutability commented on June 15, 2024

Still seeing the same behaviour in the server logs, so I guess it is probably something in airspy_adsb itself.

from piaware.

touil avatar touil commented on June 15, 2024

That's expected. Same code with added assertions only. The thing is there's nothing much left to check, apart from the re-scaling code. The 20MHz MLAT already works fine with PlanePlotter network.

I added some masks to avoid overflows in the re-scaling:

uint64_t timestamp = ((((base_timestamp + frame->position) & 0x0000ffffffffffff) * mlat_oversampling_factor + oversampling_factor / 2) / oversampling_factor) & 0x0000ffffffffffff;

Where:

  • base_timestamp is a 64bit unsigned that maintains the MLAT state
  • frame->position is the position of the frame within the currently processed buffer
  • oversampling_factor is the over sampling factor. for 20MSPS, its value is 10
  • mlat_oversampling_factor is 6 for Beast (12MHz)

When all the frames are processed, the base_timestamp is incremented by the number of received samples:

base_timestamp = (base_timestamp + count) & 0x0000ffffffffffff

Note that the masks correspond to 6bytes MLAT. The MLAT rescaling shouldn't overflow with this code.

from piaware.

mutability avatar mutability commented on June 15, 2024

It works 99% of the time. The jumps I see are

a) relatively infrequent - perhaps 3 or 4 per hour, depending on traffic
b) always a multiple of 4915us
c) only affecting the airspy receiver, not other receivers
d) often happen in both directions (i.e. the synchronization offset jumps in one direction then immediately jumps back in the other direction)

Without seeing all the airspy_adsb code I can only speculate, but I would guess that it is an infrequent problem to do with the order in which you are processing buffers (I assume you split incoming buffers between multiple worker threads) rather than anything wrong with the timestamp calculation itself.

from piaware.

touil avatar touil commented on June 15, 2024

It's easy to verify: Force one worker.
FYI, the same buffer is being processed with many threads from a threadpool. This is the high level algo:

  • The main thread enqueues N detection tasks
  • The worker threads process the tasks independently
  • The main thread wait for all the tasks to complete
  • The main thread consolidates the detected frames and performs the time-stamping

Btw, @jester-xbmc is not running the MLAT code with the masks.

from piaware.

mutability avatar mutability commented on June 15, 2024

I don't know if a Pi is going to keep up with one worker, though.

The processing method you describe sounds reasonable and it should work fine. Given that it sometimes doesn't work fine, I think there's a bug in there ;-)

from piaware.

touil avatar touil commented on June 15, 2024

The Pi doesn't seem to moan about the workload: ~180% total at 20MSPS which is not bad after all ;-) I think it was the scaling. We need our friend to try the latest code. Ping @jester-xbmc!

from piaware.

mutability avatar mutability commented on June 15, 2024

2**64 / 6 / 10MHz is around 9800 years, FWIW.

from piaware.

jester-xbmc avatar jester-xbmc commented on June 15, 2024

latest code pulled and running....
43352 Feb 22 10:43 airspy_adsb

from piaware.

mutability avatar mutability commented on June 15, 2024
20160222 14:08:34.836 child  INFO     adept                [70.42.6.198:60223] Handshake successful (fAT-b827eb500a97_000541 adept dump1090)'
20160222 14:11:07.908 master INFO     clocksync            fAT-b827eb500a97_000541:fAT-b827ebcb3b68_000007: fAT-b827ebcb3b68_000007 clock was 14734.0us faster than predicted, drift error -0.3ppm

(only the one pairing did that, so it's not conclusive, but 14734us is very close to a multiple of 4915us)

I am heading out now but I will check again when I'm back.

from piaware.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.