Comments (4)

crackcomm commented on June 20, 2024

Thank you for your commitment.

I can confirm the sequential scheduler does not cause a crash. It works on both systems.

from hdbscan.

wangyiqiu commented on June 20, 2024

Hi, the error has been fixed in a recent update. Let me know if the problem persists.

wangyiqiu commented on June 20, 2024

Hi, thanks for the feedback. I think the error is likely caused by a parallel scheduler crash on the particular system that you are using.
To verify the guess, I created a sequential version in a branch, which I believe will not crash because the parallel scheduler is disabled.
Here's the code: sequential version

I am communicating with authors of the parallel scheduler here to see if the problem can be resolved by compiling against a different scheduler. I will try to get back to you ASAP.
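
To illustrate the idea behind the sequential branch, here is a minimal Python sketch of a parallel-for helper with a sequential switch. It is not the code from the branch or from the parallel scheduler; `parallel_for`, `sequential`, and `workers` are hypothetical names chosen for this example. The point is only that the algorithm code stays the same while the scheduling strategy is swapped out, which is what makes the sequential build a useful test of the hypothesis.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for(start, end, body, *, sequential=False, workers=4):
    """Call body(i) for every i in [start, end).

    With sequential=True no worker threads are created at all, so any
    failure that only appears in the parallel path cannot occur.
    """
    if sequential or end - start <= 1:
        for i in range(start, end):
            body(i)
        return

    n = end - start
    chunk = (n + workers - 1) // workers  # ceiling division

    def run_chunk(lo):
        # Each task handles one contiguous block of indices.
        for i in range(lo, min(lo + chunk, end)):
            body(i)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every chunk and re-raises any exception.
        list(pool.map(run_chunk, range(start, end, chunk)))


if __name__ == "__main__":
    squares_par = [0] * 10
    squares_seq = [0] * 10
    parallel_for(0, 10, lambda i: squares_par.__setitem__(i, i * i))
    parallel_for(0, 10, lambda i: squares_seq.__setitem__(i, i * i), sequential=True)
    print(squares_par == squares_seq)
```

If both modes give the same result but only the parallel one crashes on a given machine, the scheduler rather than the algorithm is the likely culprit.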

mmisiewicz commented on June 20, 2024

The problem seems to be occurring on macOS too.

The source builds successfully (mkdir build; cmake ..; make -j) though there are two warnings:

CMake Warning (dev) at /opt/homebrew/Cellar/cmake/3.24.2/share/cmake/Modules/CMakeDependentOption.cmake:89 (message):
 Policy CMP0127 is not set: cmake_dependent_option() supports full Condition
 Syntax.  Run "cmake --help-policy CMP0127" for policy details.  Use the
 cmake_policy command to set the policy and suppress this warning.
Call Stack (most recent call first):
 pybindings/pybind11/CMakeLists.txt:98 (cmake_dependent_option)
This warning is for project developers.  Use -Wno-dev to suppress it.
  • ld: warning: -undefined dynamic_lookup may not work with chained fixups

Running the supplied example also results in a segfault:

➜  ./hdbscan -m 2 ../../example-data.csv
[1]    92235 segmentation fault  ./hdbscan -m 2 ../../example-data.csv

It also happens with the Python bindings. For example:

ipython3
Python 3.10.6 (main, Aug 30 2022, 04:58:14) [Clang 13.1.6 (clang-1316.0.21.2.5)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.5.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from pyhdbscan import HDBSCAN

In [2]: from sklearn.datasets import load_iris

In [3]: ir = load_iris()

In [4]: res = HDBSCAN(ir["data"], 2)
[1]    92455 segmentation fault  ipython3
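
Because the segfault takes the whole interpreter down, it can be handy to reproduce it in a child process instead. The following is a minimal sketch using only the standard library; `run_isolated` and `describe` are hypothetical helper names, and the snippet assumes `pyhdbscan` and scikit-learn are importable as in the session above. On POSIX systems, `subprocess` reports a child killed by a signal as a negative return code.

```python
import signal
import subprocess
import sys


def run_isolated(snippet, timeout=60.0):
    """Run `snippet` in a fresh Python interpreter and return the finished process."""
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe(proc):
    """Summarise how the child ended (POSIX: negative code means a signal)."""
    if proc.returncode < 0:
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with code {proc.returncode}"


# Same call as in the IPython session above.
REPRO = """
from pyhdbscan import HDBSCAN
from sklearn.datasets import load_iris
HDBSCAN(load_iris()["data"], 2)
"""

if __name__ == "__main__":
    print(describe(run_isolated(REPRO)))
```

This keeps the parent session alive, so the same script can be rerun against the parallel and sequential builds to compare outcomes.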

Judging by the crash report for this segfault, it looks like Parlay:

-------------------------------------
Translated Report (Full Report Below)
-------------------------------------

Process:               Python [92455]
Path:                  /opt/homebrew/*/Python.framework/Versions/3.10/Resources/Python.app/Contents/MacOS/Python
Identifier:            org.python.python
Version:               3.10.6 (3.10.6)
Code Type:             ARM-64 (Native)
Parent Process:        zsh [57577]
Responsible:           iTerm2 [750]
User ID:               501

Date/Time:             2022-09-30 00:25:41.6247 -0400
OS Version:            macOS 12.6 (21G115)
Report Version:        12
Anonymous UUID:        40AFB959-8F40-4439-AFA3-4FF93B2F5147


Time Awake Since Boot: 620000 seconds

System Integrity Protection: enabled

Crashed Thread:        44

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000000
Exception Codes:       0x0000000000000001, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Reason:    Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process:   exc handler [92455]

VM Region Info: 0 is not in any region.  Bytes before following region: 4297244672
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  
      __TEXT                      10022c000-100230000    [   16K] r-x/r-x SM=COW  .../MacOS/Python

Thread 0::  Dispatch queue: com.apple.main-thread
0   pyhdbscan.cpython-310-darwin.so	       0x10354db90 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 328
1   pyhdbscan.cpython-310-darwin.so	       0x10354da9c void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 84
2   pyhdbscan.cpython-310-darwin.so	       0x10354dcb4 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 620
3   pyhdbscan.cpython-310-darwin.so	       0x10354db60 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 280
4   pyhdbscan.cpython-310-darwin.so	       0x10354db60 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 280
5   pyhdbscan.cpython-310-darwin.so	       0x10354db60 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 280
6   pyhdbscan.cpython-310-darwin.so	       0x10354d970 parlay::block_allocator::initialize_list(parlay::block_allocator::block*) + 180
7   pyhdbscan.cpython-310-darwin.so	       0x10354d82c void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 240
8   pyhdbscan.cpython-310-darwin.so	       0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
9   pyhdbscan.cpython-310-darwin.so	       0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
10  pyhdbscan.cpython-310-darwin.so	       0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
11  pyhdbscan.cpython-310-darwin.so	       0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
12  pyhdbscan.cpython-310-darwin.so	       0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
13  pyhdbscan.cpython-310-darwin.so	       0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
14  pyhdbscan.cpython-310-darwin.so	       0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
15  pyhdbscan.cpython-310-darwin.so	       0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
16  pyhdbscan.cpython-310-darwin.so	       0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
17  pyhdbscan.cpython-310-darwin.so	       0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
18  pyhdbscan.cpython-310-darwin.so	       0x10354c7a8 parlay::block_allocator::block_allocator(unsigned long, unsigned long, unsigned long, unsigned long) + 360
19  pyhdbscan.cpython-310-darwin.so	       0x10354c3a4 parlay::pool_allocator::pool_allocator(std::__1::vector<unsigned long, std::__1::allocator<unsigned long> > const&) + 500
20  pyhdbscan.cpython-310-darwin.so	       0x10354be8c parlay::internal::get_default_allocator() + 96
21  pyhdbscan.cpython-310-darwin.so	       0x1035616b0 py_hdbscan(pybind11::array_t<double, 17>, unsigned long) + 460
22  pyhdbscan.cpython-310-darwin.so	       0x10357af74 void pybind11::cpp_function::initialize<pybind11::array_t<double, 16> (*&)(pybind11::array_t<double, 17>, unsigned long), pybind11::array_t<double, 16>, pybind11::array_t<double, 17>, unsigned long, pybind11::name, pybind11::scope, pybind11::sibling, char [21], pybind11::arg, pybind11::arg>(pybind11::array_t<double, 16> (*&)(pybind11::array_t<double, 17>, unsigned long), pybind11::array_t<double, 16> (*)(pybind11::array_t<double, 17>, unsigned long), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [21], pybind11::arg const&, pybind11::arg const&)::'lambda'(pybind11::detail::function_call&)::__invoke(pybind11::detail::function_call&) + 552
23  pyhdbscan.cpython-310-darwin.so	       0x10356de84 pybind11::cpp_function::dispatcher(_object*, _object*, _object*) + 3388

I tested the sequential version of the code posted here and observed these results:

➜  ./hdbscan -m 2 ../../example-data.csv
build-tree-time = 5.29289e-05
core-dist-time = 0.000156879
---
 beta = 2
 rho = -0.1 -- 2.02927
 edges = 9
 mst-edges = 9
---
 beta = 4
 rho = 2.02927 -- 3.04736
 edges = 36
 mst-edges = 36
---
 beta = 8
 rho = 3.04736 -- 1.79769e+308
 edges = 157
 mst-edges = 39
wspd-time = 0.000174046
kruskal-time = 0.000128746
mark-time = 3.93391e-05
dendrogram-time = 4.00543e-05
timing = 0.000643015

Also:

  • cmake version 3.24.2
  • macOS 12.6, Xcode 14.0.1
  • clang, according to cmake:
-- The C compiler identification is AppleClang 14.0.0.14000029
-- The CXX compiler identification is AppleClang 14.0.0.14000029
  • hardware: M1 Ultra (20 core)
