
Comments (10)

Syllo commented on May 22, 2024

Could you please confirm that the patch 8b56210 on the dev branch fixes your problem?

from nvtop.

Syllo commented on May 22, 2024

Hi,

It seems that there is a problem with the window initialization code. Your backtrace shows a NULL window pointer.
I think this cascades into an unsigned underflow in initialize_gpu_mem_plot, followed by a malloc that over-commits an enormous buffer.
Finally, the OS starts allocating the real pages when the buffer is accessed by the nvtop_line_plot code and 💣

Could you please provide me the gdb output after:
break interface.c:439
run
print *plot_positions

At which terminal size does it break?


daniel-j-h commented on May 22, 2024

Here's the full-screen terminal size that runs into this issue:

$ stty size
49 190

and a tmux split pane in which it works is of size 23 190.

The gdb output:

(gdb) print *plot_positions
$1 = {posX = 0, posY = 11, sizeX = 189, sizeY = 12}

Thank you! 🙇


daniel-j-h commented on May 22, 2024

Wonderful, the dev branch fixes the problem! 🎉 Thank you for this quick fix! 🤗

It shows two plots at the top and a third one below (for six GPUs).


Syllo commented on May 22, 2024

You are welcome.
The fix is now part of master.


lyu commented on May 22, 2024

Hi,

Sorry for commenting on a closed issue, but I am still hitting the same problem as the OP, this time on a system with 8 GPUs. If I reduce the size of the terminal, nvtop works correctly and shows 4 plots, each displaying 2 GPUs.

I am building nvtop from the master branch.


Syllo commented on May 22, 2024

Hello,

Can you please provide the location of the error in the same way Daniel did, the size of your terminal, and the output of the debugger for the following commands:

break interface_layout_selection.c:174
print num_plot_stacks
print plot_per_row
print *plot_types

To generate a debug build you have to specify -DCMAKE_BUILD_TYPE=Debug while running cmake.

Thanks


lyu commented on May 22, 2024

tput cols: 142
tput lines: 75
print num_plot_stacks: 3
print plot_per_row: 1
print *plot_types: plot_gpu_duo

Address sanitizer backtrace:

=================================================================
==32111==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x2ab2b8fcf7e0 at pc 0x000000440867 bp 0x7fffffffcb30 sp 0x7fffffffcb28
READ of size 8 at 0x2ab2b8fcf7e0 thread T0
    #0 0x440866 in nvtop_line_plot /dev/shm/nvtop/src/plot.c:29
    #1 0x421f1a in draw_plots /dev/shm/nvtop/src/interface.c:1604
    #2 0x4222bb in draw_gpu_info_ncurses /dev/shm/nvtop/src/interface.c:1625
    #3 0x4059a5 in main /dev/shm/nvtop/src/nvtop.c:270
    #4 0x2aaaac994504 in __libc_start_main (/lib64/libc.so.6+0x22504)
    #5 0x4048a8  (/gpfs/home/USER_NAME/.local/bin/nvtop+0x4048a8)

0x2ab2b8fcf7e0 is located 16 bytes to the right of 34359738320-byte region [0x2aaab8fcf800,0x2ab2b8fcf7d0)
allocated by thread T0 here:
    #0 0x2aaaaadd8cb8 in __interceptor_calloc ../../../../gcc-9.2.0/libsanitizer/asan/asan_malloc_linux.cc:153
    #1 0x4083d8 in initialize_gpu_mem_plot /dev/shm/nvtop/src/interface.c:364
    #2 0x4091da in alloc_plot_window /dev/shm/nvtop/src/interface.c:409
    #3 0x4098c6 in initialize_all_windows /dev/shm/nvtop/src/interface.c:439
    #4 0x40c1a6 in initialize_curses /dev/shm/nvtop/src/interface.c:564
    #5 0x4057ed in main /dev/shm/nvtop/src/nvtop.c:249
    #6 0x2aaaac994504 in __libc_start_main (/lib64/libc.so.6+0x22504)

SUMMARY: AddressSanitizer: heap-buffer-overflow /dev/shm/nvtop/src/plot.c:29 in nvtop_line_plot
Shadow bytes around the buggy address:
  0x0556d71f1ea0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0556d71f1eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0556d71f1ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0556d71f1ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0556d71f1ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0556d71f1ef0: 00 00 00 00 00 00 00 00 00 00 fa fa[fa]fa fa fa
  0x0556d71f1f00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0556d71f1f10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0556d71f1f20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0556d71f1f30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0556d71f1f40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==32111==ABORTING
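
As a side note on the report above, the region size it prints (34359738320 bytes) is consistent with the unsigned underflow suspected earlier in this thread. The sketch below only does the arithmetic; it assumes 8-byte elements (matching the "READ of size 8" line) and a 32-bit unsigned element count, neither of which is confirmed by the thread, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Number of fixed-size elements in a region of `bytes` bytes. */
uint64_t element_count(uint64_t bytes, uint64_t elem_size) {
  return bytes / elem_size;
}

/* Treat `count` (which must be below 2^32) as a wrapped 32-bit unsigned
 * value and return the signed number it would have been before wrapping. */
int64_t unwrap_u32(uint64_t count) {
  if (count > INT32_MAX)
    return (int64_t)count - ((int64_t)1 << 32);
  return (int64_t)count;
}
```

Under those assumptions, 34359738320 / 8 = 4294967290 = 2^32 - 6, so `unwrap_u32(element_count(34359738320, 8))` evaluates to -6: a small negative size that wrapped around before reaching calloc.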


Syllo commented on May 22, 2024

@lyu I think that the patch 71b7f96 should fix this problem.
Could you please tell me if it solves the problem on your system?


lyu commented on May 22, 2024

@Syllo The problem is gone, thank you!

