Comments (14)

sezaru commented on June 18, 2024

My CPU is a Ryzen 5 1600:

processor       : 11
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 8
model name      : AMD Ryzen 5 1600 Six-Core Processor
stepping        : 2
microcode       : 0x800820d
cpu MHz         : 3569.248
cache size      : 512 KB
physical id     : 0
siblings        : 12
core id         : 6
cpu cores       : 6
apicid          : 13
initial apicid  : 13
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca sev sev_es
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass retbleed smt_rsb srso div0
bogomips        : 6399.93
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

But my TrueNAS instance runs on top of Proxmox, so the CPU that the Docker container sees is a QEMU Virtual CPU version 2.5+:

processor       : 2
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 107
model name      : QEMU Virtual CPU version 2.5+
stepping        : 1
microcode       : 0x1000065
cpu MHz         : 3205.846
cache size      : 512 KB
physical id     : 0
siblings        : 12
core id         : 2
cpu cores       : 12
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm rep_good nopl cpuid extd_apicid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm cmp_legacy 3dnowprefetch vmmcall
bugs            : fxsave_leak sysret_ss_attrs null_seg swapgs_fence amd_e400 spectre_v1 spectre_v2
bogomips        : 6411.69
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

I will see if I can change that to something that doesn't crash.
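
The gap between the two flag lists above can be checked mechanically. The following is a minimal Python sketch, not something posted in the thread: it compares the guest's flags line against the x86-64-v2 and x86-64-v3 microarchitecture levels. The mapping from level to /proc/cpuinfo flag names (for example lzcnt showing up as abm, and sse3 as pni) is an assumption, and so is the idea that these levels are the relevant yardstick; the thread does not establish which instructions the precompiled NIF actually uses.

```python
# Feature sets written with the names Linux uses in the "flags" line of
# /proc/cpuinfo. ASSUMPTION: this mapping of the x86-64-v2/v3 levels to
# cpuinfo names (lzcnt -> "abm", sse3 -> "pni", osxsave -> "xsave").
X86_64_V2 = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
X86_64_V3 = X86_64_V2 | {
    "avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave",
}


def missing_features(flags_line, required):
    """Return the required features absent from a cpuinfo flags line, sorted."""
    present = set(flags_line.split())
    return sorted(required - present)


# Flags line of the QEMU virtual CPU, copied from the dump above.
GUEST_FLAGS = (
    "fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 ht syscall nx lm rep_good nopl cpuid "
    "extd_apicid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt "
    "aes hypervisor lahf_lm cmp_legacy 3dnowprefetch vmmcall"
)

print(missing_features(GUEST_FLAGS, X86_64_V2))
# → []
print(missing_features(GUEST_FLAGS, X86_64_V3))
# → ['abm', 'avx', 'avx2', 'bmi1', 'bmi2', 'f16c', 'fma', 'movbe', 'xsave']
```

The Ryzen's own flags line contains all nine of the features reported missing here, so the guest CPU model hides them rather than the hardware lacking them.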

from livebook.

sezaru commented on June 18, 2024

Ok, FYI, after I changed the CPU from KVM64 to host, installing explorer started working again.

I'm not sure if @florianb's issue is the same as mine, but if it is, then this is not a Livebook issue.

josevalim commented on June 18, 2024

What happens if you build a similar docker image (matching Elixir/Erlang versions) with iex instead and then run Mix.install [:explorer]? Because my gut feeling says it is not a Livebook issue per se.

florianb commented on June 18, 2024

Thanks for the quick response - I don't think it's a Livebook issue either - I wondered if there's a way to find out why it crashes.

I will try to create a Livebook instance in a different base container and come back as soon as I have new details.

jonatanklosko commented on June 18, 2024

Is it the whole Docker container that crashes, or only the notebook runtime?

florianb commented on June 18, 2024

Only the notebook runtime - the container keeps running and I can immediately reconnect from the GUI. Unfortunately the logs stay silent about the crash; interestingly, they do contain empty log lines.

josevalim commented on June 18, 2024

Either a segmentation fault or the OS somehow killing it because it thinks it is running out of memory?

florianb commented on June 18, 2024

Here is the log of two subsequent reconnects & setup:

Mai 06 16:07:47 livebook docker[24339]: 14:07:47.322 [debug] HANDLE EVENT "queue_cell_evaluation" in LivebookWeb.SessionLive
Mai 06 16:07:47 livebook docker[24339]:   Parameters: %{"cell_id" => "setup", "disable_dependencies_cache" => false}
Mai 06 16:07:47 livebook docker[24339]: 14:07:47.323 [debug] Replied in 194µs
Mai 06 16:07:48 livebook docker[24339]: 
Mai 06 16:07:48 livebook docker[24339]: 14:07:48.541 [debug] Copying NIF from cache and extracting to /home/livebook/.cache/mix/installs/elixir-1.15.7-erts-14.1.1/444655ddd876e676876661841d5d38eb/_build/dev/lib/explorer/priv/native/libexplorer-v0.8.2-nif-2.15-x86_64-unknown-linux-gnu.so
Mai 06 16:07:51 livebook docker[24339]: 14:07:51.891 [debug] HANDLE EVENT "queue_cell_evaluation" in LivebookWeb.SessionLive
Mai 06 16:07:51 livebook docker[24339]:   Parameters: %{"cell_id" => "setup", "disable_dependencies_cache" => false}
Mai 06 16:07:51 livebook docker[24339]: 14:07:51.891 [debug] Replied in 182µs
Mai 06 16:07:53 livebook docker[24339]: 
Mai 06 16:07:53 livebook docker[24339]: 14:07:53.100 [debug] Copying NIF from cache and extracting to /home/livebook/.cache/mix/installs/elixir-1.15.7-erts-14.1.1/444655ddd876e676876661841d5d38eb/_build/dev/lib/explorer/priv/native/libexplorer-v0.8.2-nif-2.15-x86_64-unknown-linux-gnu.so

The container runs in a VM with 12 GB of memory, and there is currently no load in the container.

sezaru commented on June 18, 2024

I have the exact same issue happening to me using livebook-dev/livebook:latest (I also tried edge, same issue).

In my case I'm running it in my TrueNAS instance.

sezaru commented on June 18, 2024

So, I opened a shell in my Docker container, started iex and ran Mix.install. The error I get is Illegal instruction:

image

josevalim commented on June 18, 2024

That's what I suspected, something going on in Docker's emulation layer. :( Does the OS version in the NIF match the system you are currently running? What is your host OS?

florianb commented on June 18, 2024

@sezaru - awesome finding! Indeed, it seems the precompiled NIFs rely on CPU features that the default virtual CPU type does not expose. I ended up using the x86-64-v3 CPU type to preserve live migration (we're using PVE as well).

So I guess the issue can be closed!

@josevalim do you want me to file a PR adding some kind of warning to the docs about this issue? It's certainly not a dedicated Livebook issue, but other LB users might run into this as well.

josevalim commented on June 18, 2024

Wouldn't this be more of an Explorer kind of issue? Are we assuming too many CPU features?

florianb commented on June 18, 2024

I think it's an issue with NIFs in general, and what boggles me is that I got no feedback from the stack. "Illegal instruction" is something to work with, but yeah, you're right - this should be pushed downwards.

I just assumed that this repo is one likely sink for this issue. But we might add a hint later, after more people have stumbled over this. In the meantime I will try to find out where the CPU restrictions sneaked in; maybe it's something that should be handled by default in rustler.

Of course, it would be best if a NIF could report such issues before it is loaded.
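
A pre-flight check of that kind could look roughly like the Python sketch below, which reads the flags line from /proc/cpuinfo on Linux and raises a readable error instead of letting the process die with "Illegal instruction". This is only an illustration of the idea, not how rustler or Explorer work: the function names `read_cpu_flags` and `preflight` are hypothetical, and the set of required features has to be supplied by whoever built the native library, since the thread does not establish which ones the Explorer NIF needs.

```python
def read_cpu_flags(cpuinfo_path="/proc/cpuinfo"):
    """Return the feature flags of the first CPU listed in a cpuinfo file."""
    with open(cpuinfo_path) as f:
        for line in f:
            key, sep, value = line.partition(":")
            if sep and key.strip() == "flags":
                return set(value.split())
    return set()  # no "flags" line found (e.g. a non-x86 system)


def preflight(required, cpuinfo_path="/proc/cpuinfo"):
    """Raise a descriptive error if any required CPU feature is missing.

    `required` is a hypothetical list supplied by the caller; it would have
    to match the target features the native library was compiled with.
    """
    missing = sorted(set(required) - read_cpu_flags(cpuinfo_path))
    if missing:
        raise RuntimeError(
            "CPU lacks features needed by the native library: "
            + ", ".join(missing)
        )


# Example call (the feature list here is only an illustration):
# preflight(["avx", "avx2", "fma"])
```

Run before the native library is loaded, a check like this would turn the silent runtime crash described above into an error message that names the missing features.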
