Comments (25)

grrtrr commented on July 29, 2024

I am not seeing the same behaviour on my x86_64 laptop -- during about 10 minutes, the memory usage remains constant at about 0.1% of 4GB (~4MB).

It has been a while since I last looked at leaks using valgrind. From that time I remember a few allocations in llist that did not seem to be free()d, but that was on the order of perhaps a few hundred bytes.

To find out whether and where there might be a problem: which page/tab was it showing while the available memory was shrinking? The scan window does a lot of allocation/deallocation. If that is the culprit, changing the update frequency of the scan window should change how quickly the RAM usage grows.

It is difficult to say, as I don't have an ARM board for comparison.

from wavemon.

joerg-krause commented on July 29, 2024

I am running wavemon showing the Info page most of the time. Which versions of libnl and ncurses does your system use?

grrtrr commented on July 29, 2024

I am using libnl 3.2.24-2 and ncurses 5.9.20140913 under debian 8.5.

It seems unlikely that ncurses is the culprit -- if that is the case, then other applications (e.g. htop) would also cause memory failure.

I just ran a quick check with valgrind; there were about 16kB not released at program exit time. When I get some time, I will look into this. (Still, it is less than the figures you are seeing.)

grrtrr commented on July 29, 2024

Hi Joerg, I hope it is not too late to ask this - do you remember if you were using the development or master branch when the above happened? The above referenced issue reported problems with hanging, on the development branch, and I would like to narrow down where the problem is coming from.

joerg-krause commented on July 29, 2024

Sorry, I somehow forgot about this issue, but I still remember 😄

I was using version 0.8.0.

grrtrr commented on July 29, 2024

Thank you. That is the nl80211 based version. So there are now two issues on ARM, both related to the info page. It seems to hint at the same area of code.

joerg-krause commented on July 29, 2024

It looks like it. If there is anything I can test, please tell me.

rofl0r commented on July 29, 2024

i know something you could test: if the leak is due to a bug in libnl, we could test alternatives, for example there's libnl-tiny: https://github.com/sabotage-linux/libnl-tiny

grrtrr commented on July 29, 2024

Hi Joerg, on the master branch there is an update which might be a fix for this problem, too. It did fix issue #21 (unresponsiveness on ARM). The cause of that other issue was that an interval timer periodically polled at fixed intervals, fixed by replacing it with a separate thread.

It may be that the problems which caused unresponsiveness could also have triggered the out-of-memory condition. One possibility would have been overlapping of calls to the same poll function at a higher refresh rate (the old implementation did not use locking to prevent that).

If you have time to test with the latest master, it would help validate if the above thought makes sense, or if the problem is somewhere else.
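The overlap scenario described above can be illustrated with a minimal sketch. The names (`sample_lock`, `sample_count`, `poll_link_stats`, `sampler`, `run_two_samplers`) are hypothetical and not taken from wavemon; the sketch only shows how a mutex serialises calls so that two invocations never run the body of the poll function at the same time.

```c
#include <pthread.h>

/* Hypothetical shared state; not wavemon's actual data structures. */
static pthread_mutex_t sample_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long sample_count;

/*
 * Hypothetical poll function. The mutex makes a second caller wait
 * until the first has finished, instead of both running the
 * sampling work at once.
 */
void poll_link_stats(void)
{
	pthread_mutex_lock(&sample_lock);
	sample_count++;		/* stands in for the real sampling work */
	pthread_mutex_unlock(&sample_lock);
}

/* Thread body: call the poll function n times. */
void *sampler(void *arg)
{
	int n = *(int *)arg;

	for (int i = 0; i < n; i++)
		poll_link_stats();
	return NULL;
}

/* Run two samplers concurrently and return the total number of samples. */
unsigned long run_two_samplers(int n)
{
	pthread_t a, b;

	sample_count = 0;
	pthread_create(&a, NULL, sampler, &n);
	pthread_create(&b, NULL, sampler, &n);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return sample_count;
}
```

With the lock in place, `run_two_samplers(n)` counts every call exactly once. Without it, the two threads could interleave inside the poll function, which is the kind of overlap suspected here.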

grrtrr commented on July 29, 2024

Joerg, can you please test with the latest master if the issue still exists? It has been a while, and the code has changed in the meantime. With the fix of #21, it is possible that this issue no longer exists. The architectures in both cases were the same, and the problem seemed to come out of the same part of the code.

I do not have the hardware to test this, and have not seen anything similar on x86_64 laptops.

If it is not possible for you to test, please let me know. I will otherwise close this ticket after some time.

joerg-krause commented on July 29, 2024

Hi Gerrit, sorry for not replying sooner! So many things to do, so little time...

Yes, I can test the master branch this week.

joerg-krause commented on July 29, 2024

wavemon failed starting for me because of this issue: #23.

grrtrr commented on July 29, 2024

I am looking into this and will get back with a tested fix for #23 within the next 2 days.

joerg-krause commented on July 29, 2024

I enabled wext support in the kernel. However, wavemon crashes after a few seconds with "Illegal instruction":

┌─Interface────────────────────────────────────────────────────────────────────┐
│wlan0 (IEEE 802.11), phy 0, reg: n/a, SSID: xxxxxxxxIllegal instruction       │
├─Levels────────────────────────────────────────────────────────────────# ──────
│                                                                              │
│link quality: 66%  (46/70)                                                    │
│===================================================                           │
│                                                                              │
│                                                                              │
│signal level: -64 dBm (0.40 nW)                                               │
│===============================                                               │
│                                                                              │
├─Statistics────────────────────────────────────────────────────────────────────
│RX: 3010 (0 B)                                                                │
│TX: 1868 (0 B)                                                                │
├─Info──────────────────────────────────────────────────────────────────────────
│mode: Managed, connected to: 34:81:C4:29:51:96, time: 4:12m, inactive: 0.0s   │
│frequency/channel: n/a                                                        │
│rx rate: 39.0 MBit/s, tx rate: 65.0 MBit/s                                    │
│station flags: WME, preamble: short, slot: short                              │
│power mgt: off,  tx-power: 31 dBm (1258.93 mW)                                │
│retry: short limit 7,  rts/cts: off,  frag: off                               │
│encryption: off (no key set)                                                  │
├─Network───────────────────────────────────────────────────────────────────────
│wlan0 (UP RUNNING BROADCAST MULTICAST)                                        │

rofl0r commented on July 29, 2024

run it under gdb so we can see what's causing this (hint: for terminal apps i usually attach gdb to the running process from another terminal with the -pid=$(pgrep wavemon) parameter, but in that case you gotta be quick to catch the SIGILL!).
most likely there's some problem with your arm toolchain (code for some not exported FPU or CPU feature or thumb vs arm code etc).

grrtrr commented on July 29, 2024

Do you think the problem may lie in the ARM wext support? One easy thing to do is to enable core dumps

bash> ulimit -c unlimited

and then check in the core dump where it crashed (SIGILL dumps core). In combination with what you suggested earlier ("no supported wireless interfaces found"), I am thinking more of removing all Wext support; even though this would mean reducing wavemon output quite a bit.
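For completeness, the core-file limit can also be raised from inside a program with `getrlimit(2)`/`setrlimit(2)`. The following is only a sketch; `enable_core_dumps` is a hypothetical helper and not part of wavemon. It raises the soft limit to the hard limit, which matches `ulimit -c unlimited` only when the hard limit itself is unlimited.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft core-file size limit to the
 * hard limit, which is as far as an unprivileged process may go.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
int enable_core_dumps(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_CORE, &rl) != 0)
		return -1;
	rl.rlim_cur = rl.rlim_max;
	return setrlimit(RLIMIT_CORE, &rl);
}
```

Calling this early in `main()` would have the same effect for that one process as setting the limit in the shell beforehand.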

joerg-krause commented on July 29, 2024

I rebuilt wavemon with debugging symbols and ran gdbserver wavemon on the target and arm-buildroot-linux-musleabi-gdb wavemon on the host. This is the output:

(gdb) target remote 192.168.178.23:2345
Remote debugging using 192.168.178.23:2345
Reading /usr/bin/wavemon from remote target...
warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
Reading /usr/bin/wavemon from remote target...
Reading symbols from target:/usr/bin/wavemon...(no debugging symbols found)...done.
Reading /lib/ld-musl-arm.so.1 from remote target...
Reading /lib/ld-musl-arm.so.1 from remote target...
Reading symbols from target:/lib/ld-musl-arm.so.1...(no debugging symbols found)...done.
0xb6fbdb74 in _dlstart () from target:/lib/ld-musl-arm.so.1
(gdb) c
Continuing.
Reading /usr/lib/libncurses.so.5 from remote target...
Reading /usr/lib/libnl-genl-3.so.200 from remote target...
Reading /usr/lib/libnl-3.so.200 from remote target...
[New Thread 657]

Program received signal SIGILL, Illegal instruction.
0xb6f839a4 in free () from target:/lib/ld-musl-arm.so.1
(gdb) disass 0xb6f839a4,0xb6f839ff
Dump of assembler code from 0xb6f839a4 to 0xb6f839ff:
=> 0xb6f839a4 <free+56>:	udf	#0
   0xb6f839a8 <free+60>:	add	sp, sp, #36	; 0x24
   0xb6f839ac <free+64>:	pop	{r4, r5, r6, r7, r8, r9, r10, r11, lr}
   0xb6f839b0 <free+68>:	b	0xb6f9a470 <munmap>
   0xb6f839b4 <free+72>:	ldr	r2, [r4, r9]
   0xb6f839b8 <free+76>:	add	r5, r4, r9
   0xb6f839bc <free+80>:	cmp	r3, r2
   0xb6f839c0 <free+84>:	beq	0xb6f839c8 <free+92>
   0xb6f839c4 <free+88>:	udf	#0
   0xb6f839c8 <free+92>:	mov	r3, #0
   0xb6f839cc <free+96>:	str	r3, [sp, #4]
   0xb6f839d0 <free+100>:	ldr	r3, [pc, #944]	; 0xb6f83d88 <free+1052>
   0xb6f839d4 <free+104>:	ldr	r10, [pc, #944]	; 0xb6f83d8c <free+1056>
   0xb6f839d8 <free+108>:	add	r3, pc, r3
   0xb6f839dc <free+112>:	add	r3, r3, #1024	; 0x400
   0xb6f839e0 <free+116>:	str	r3, [sp, #16]
   0xb6f839e4 <free+120>:	add	r10, pc, r10
   0xb6f839e8 <free+124>:	add	r3, r3, #8
   0xb6f839ec <free+128>:	str	r3, [sp, #20]
   0xb6f839f0 <free+132>:	add	r3, r10, #1024	; 0x400
   0xb6f839f4 <free+136>:	add	r3, r3, #8
   0xb6f839f8 <free+140>:	mov	r7, r9
   0xb6f839fc <free+144>:	str	r3, [sp, #28]
End of assembler dump.

Note that I built wavemon using a Buildroot toolchain with GCC 6.2.0 and musl 1.1.15.

grrtrr commented on July 29, 2024

Thank you for all the work to get this condition caught. I don't understand the details of the dump, but it looks to me as if it has to do with memory management (free()).
Perhaps this problem is related to the original memory leak. I don't know the details of how malloc/free are implemented on the ARM architecture; on glibc they use brk() to adjust the data segment.
Maybe @rofl0r has a better idea of what the cause might be?

joerg-krause commented on July 29, 2024

Looking at the free() implementation in musl's malloc.c shows that in certain cases a_crash is called, which actually maps to the udf instruction we can see in the disassembly. If I interpret the disassembly correctly, the crash happens because of the "Crash on double free" check.

Unfortunately, I did not manage to get a backtrace with gdb as it fails with "stack corrupted".
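To make the double-free check described above concrete, here is a toy sketch. It is not musl's code: `toy_alloc`, `toy_free` and the fixed pool are hypothetical, and where musl (per the analysis above) calls a_crash, this version just returns an error so the behaviour can be observed safely.

```c
#include <stddef.h>

#define TOY_SLOTS	4
#define TOY_SLOT_SIZE	64

/* Hypothetical fixed pool with one in-use flag per slot. */
static unsigned char toy_pool[TOY_SLOTS][TOY_SLOT_SIZE];
static int toy_in_use[TOY_SLOTS];

/* Hand out the first free slot, or NULL if the pool is exhausted. */
void *toy_alloc(void)
{
	for (int i = 0; i < TOY_SLOTS; i++) {
		if (!toy_in_use[i]) {
			toy_in_use[i] = 1;
			return toy_pool[i];
		}
	}
	return NULL;
}

/*
 * Release a slot. Returns 0 on success and -1 if the pointer is not
 * a live allocation from this pool.
 */
int toy_free(void *p)
{
	for (int i = 0; i < TOY_SLOTS; i++) {
		if (p == (void *)toy_pool[i]) {
			if (!toy_in_use[i])
				return -1;	/* double free detected */
			toy_in_use[i] = 0;
			return 0;
		}
	}
	return -1;	/* pointer did not come from this pool */
}
```

Freeing the same pointer twice makes the second `toy_free` call return -1. An allocator that crashes deliberately at that point turns a silent heap corruption into an immediate, visible failure, which is what the SIGILL in `free()` looks like here.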

rofl0r commented on July 29, 2024

yes, @joerg-krause 's analysis appears to be correct. i would suggest running wavemon with a similar setup on a glibc host under valgrind to find out where the double-free happens.
valgrind is currently not yet ported to musl.
if that is out of the question, building musl libc with "-g3 -O0" will at least allow getting a proper backtrace to the site in wavemon that called free(), so one can try to find the bug like that.

grrtrr commented on July 29, 2024

Thank you both. I will look into valgrind debugging during the weekend.

grrtrr commented on July 29, 2024

Update

After running various valgrind sessions, I did find a bug, fixed in 6f64d78.
There was access to uninitialized iw_range fields (ir->max_encoding_tokens),
which may have caused the double free and/or over-allocation.

The bug was introduced when I converted sampling from interval handler to a pthread,
so it may not be the cause of the original memory leak problem.

With this fixed, it is quite likely that the code will now run on arm.

I had no more valgrind errors reported. On the info/history screens, there were 0 leaks.
On the scan screen there is a constant leak of 16384/224 bytes. The trace seems to lead
into libnl (iov buffer). I was not able to find further clues at the given time; the values were
pretty much the same each time I ran valgrind.
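As an illustration of this class of bug (not the actual change in 6f64d78), here is a minimal sketch. `struct range_info`, `query_range` and `get_max_encoding_tokens` are hypothetical stand-ins. It assumes a query that fills only part of the structure, so any field it skips holds leftover stack contents unless the caller zeroes the structure first.

```c
#include <string.h>

/* Hypothetical stand-in for the driver-filled range structure. */
struct range_info {
	int num_channels;
	int max_encoding_tokens;
};

/* Hypothetical query that fills only part of the structure. */
static void query_range(struct range_info *ri)
{
	ri->num_channels = 11;
	/* max_encoding_tokens is deliberately left untouched */
}

/*
 * Zero the structure before the query so that any field the query
 * does not set reads as 0 instead of leftover stack contents.
 */
int get_max_encoding_tokens(void)
{
	struct range_info ri;

	memset(&ri, 0, sizeof(ri));
	query_range(&ri);
	return ri.max_encoding_tokens;
}
```

Without the `memset`, a value such as `max_encoding_tokens` would be indeterminate, and using it to size or free buffers could plausibly lead to the over-allocation or invalid frees mentioned above.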

joerg-krause commented on July 29, 2024

Many thanks for your investigation. I will test the updated master branch this week.

joerg-krause commented on July 29, 2024

@grrtrr Just checked the current master branch and it works fine now! No "Illegal Instructions" anymore and RAM stays constant now. Great job!

grrtrr commented on July 29, 2024

Thank you once again for all the testing - this has been a huge help.
