Comments (4)
Thank you for your commitment.
I can confirm the sequential scheduler does not cause a crash. It works on both systems.
from hdbscan.
Hi, the error has been fixed in a recent update. Let me know if the problem persists.
from hdbscan.
Hi, thanks for the feedback. I think the error is likely caused by a parallel scheduler crash on the particular system that you are using.
To test this guess, I created a sequential version in a branch, which I expect will not crash because the parallel scheduler is disabled.
Here's the code: sequential version
I am in touch with the authors of the parallel scheduler here to see whether the problem can be resolved by compiling against a different scheduler. I will get back to you as soon as I can.
from hdbscan.
The problem seems to be occurring on macOS too.
The source builds successfully (mkdir build; cmake ..; make -j), though there are two warnings:
CMake Warning (dev) at /opt/homebrew/Cellar/cmake/3.24.2/share/cmake/Modules/CMakeDependentOption.cmake:89 (message):
Policy CMP0127 is not set: cmake_dependent_option() supports full Condition
Syntax. Run "cmake --help-policy CMP0127" for policy details. Use the
cmake_policy command to set the policy and suppress this warning.
Call Stack (most recent call first):
pybindings/pybind11/CMakeLists.txt:98 (cmake_dependent_option)
This warning is for project developers. Use -Wno-dev to suppress it.
ld: warning: -undefined dynamic_lookup may not work with chained fixups
Running the supplied example also results in a segfault:
➜ ./hdbscan -m 2 ../../example-data.csv
[1] 92235 segmentation fault ./hdbscan -m 2 ../../example-data.csv
Also happens with the python bindings. For example:
➜ ipython3
Python 3.10.6 (main, Aug 30 2022, 04:58:14) [Clang 13.1.6 (clang-1316.0.21.2.5)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.5.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from pyhdbscan import HDBSCAN
In [2]: from sklearn.datasets import load_iris
In [3]: ir = load_iris()
In [4]: res = HDBSCAN(ir["data"], 2)
[1] 92455 segmentation fault ipython3
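Because the segfault takes the whole interpreter down, it can help to run the reproduction in a child process and inspect how it ended. The following is a minimal sketch, not part of the project: `run_isolated` and `REPRO` are hypothetical names introduced here, and the snippet assumes `pyhdbscan` and scikit-learn are importable, as in the session above.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 60.0) -> str:
    """Run a Python snippet in a child interpreter and describe how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with code {proc.returncode}"


# Mirrors the IPython session above (assumption: pyhdbscan and scikit-learn
# are installed; otherwise the child exits with an ImportError instead).
REPRO = """
from pyhdbscan import HDBSCAN
from sklearn.datasets import load_iris
HDBSCAN(load_iris()["data"], 2)
"""

if __name__ == "__main__":
    print(run_isolated(REPRO))
```

On a build that works, this should report a normal exit; on a machine showing the crash above, it would presumably report that the child was killed by SIGSEGV, while the parent shell or notebook stays alive.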
Judging by the crash report, the fault looks like it is in Parlay:
-------------------------------------
Translated Report (Full Report Below)
-------------------------------------
Process: Python [92455]
Path: /opt/homebrew/*/Python.framework/Versions/3.10/Resources/Python.app/Contents/MacOS/Python
Identifier: org.python.python
Version: 3.10.6 (3.10.6)
Code Type: ARM-64 (Native)
Parent Process: zsh [57577]
Responsible: iTerm2 [750]
User ID: 501
Date/Time: 2022-09-30 00:25:41.6247 -0400
OS Version: macOS 12.6 (21G115)
Report Version: 12
Anonymous UUID: 40AFB959-8F40-4439-AFA3-4FF93B2F5147
Time Awake Since Boot: 620000 seconds
System Integrity Protection: enabled
Crashed Thread: 44
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000000
Exception Codes: 0x0000000000000001, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY
Termination Reason: Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process: exc handler [92455]
VM Region Info: 0 is not in any region. Bytes before following region: 4297244672
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
UNUSED SPACE AT START
--->
__TEXT 10022c000-100230000 [ 16K] r-x/r-x SM=COW .../MacOS/Python
Thread 0:: Dispatch queue: com.apple.main-thread
0 pyhdbscan.cpython-310-darwin.so 0x10354db90 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 328
1 pyhdbscan.cpython-310-darwin.so 0x10354da9c void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 84
2 pyhdbscan.cpython-310-darwin.so 0x10354dcb4 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 620
3 pyhdbscan.cpython-310-darwin.so 0x10354db60 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 280
4 pyhdbscan.cpython-310-darwin.so 0x10354db60 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 280
5 pyhdbscan.cpython-310-darwin.so 0x10354db60 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::initialize_list(parlay::block_allocator::block*)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 280
6 pyhdbscan.cpython-310-darwin.so 0x10354d970 parlay::block_allocator::initialize_list(parlay::block_allocator::block*) + 180
7 pyhdbscan.cpython-310-darwin.so 0x10354d82c void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 240
8 pyhdbscan.cpython-310-darwin.so 0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
9 pyhdbscan.cpython-310-darwin.so 0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
10 pyhdbscan.cpython-310-darwin.so 0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
11 pyhdbscan.cpython-310-darwin.so 0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
12 pyhdbscan.cpython-310-darwin.so 0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
13 pyhdbscan.cpython-310-darwin.so 0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
14 pyhdbscan.cpython-310-darwin.so 0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
15 pyhdbscan.cpython-310-darwin.so 0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
16 pyhdbscan.cpython-310-darwin.so 0x10354df44 void parlay::fork_join_scheduler::pardo<void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda'(), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'()>(parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool)::'lambda0'(), bool) + 176
17 pyhdbscan.cpython-310-darwin.so 0x10354d7c8 void parlay::fork_join_scheduler::parfor_<parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long)>(unsigned long, unsigned long, parlay::block_allocator::reserve(unsigned long)::'lambda'(unsigned long), unsigned long, bool) + 140
18 pyhdbscan.cpython-310-darwin.so 0x10354c7a8 parlay::block_allocator::block_allocator(unsigned long, unsigned long, unsigned long, unsigned long) + 360
19 pyhdbscan.cpython-310-darwin.so 0x10354c3a4 parlay::pool_allocator::pool_allocator(std::__1::vector<unsigned long, std::__1::allocator<unsigned long> > const&) + 500
20 pyhdbscan.cpython-310-darwin.so 0x10354be8c parlay::internal::get_default_allocator() + 96
21 pyhdbscan.cpython-310-darwin.so 0x1035616b0 py_hdbscan(pybind11::array_t<double, 17>, unsigned long) + 460
22 pyhdbscan.cpython-310-darwin.so 0x10357af74 void pybind11::cpp_function::initialize<pybind11::array_t<double, 16> (*&)(pybind11::array_t<double, 17>, unsigned long), pybind11::array_t<double, 16>, pybind11::array_t<double, 17>, unsigned long, pybind11::name, pybind11::scope, pybind11::sibling, char [21], pybind11::arg, pybind11::arg>(pybind11::array_t<double, 16> (*&)(pybind11::array_t<double, 17>, unsigned long), pybind11::array_t<double, 16> (*)(pybind11::array_t<double, 17>, unsigned long), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [21], pybind11::arg const&, pybind11::arg const&)::'lambda'(pybind11::detail::function_call&)::__invoke(pybind11::detail::function_call&) + 552
23 pyhdbscan.cpython-310-darwin.so 0x10356de84 pybind11::cpp_function::dispatcher(_object*, _object*, _object*) + 3388
I tested the sequential version of the code posted here and observed these results:
➜ ./hdbscan -m 2 ../../example-data.csv
build-tree-time = 5.29289e-05
core-dist-time = 0.000156879
---
beta = 2
rho = -0.1 -- 2.02927
edges = 9
mst-edges = 9
---
beta = 4
rho = 2.02927 -- 3.04736
edges = 36
mst-edges = 36
---
beta = 8
rho = 3.04736 -- 1.79769e+308
edges = 157
mst-edges = 39
wspd-time = 0.000174046
kruskal-time = 0.000128746
mark-time = 3.93391e-05
dendrogram-time = 4.00543e-05
timing = 0.000643015
Also:
- cmake version 3.24.2
- macOS 12.6, Xcode 14.0.1
- clang, according to cmake:
-- The C compiler identification is AppleClang 14.0.0.14000029
-- The CXX compiler identification is AppleClang 14.0.0.14000029
- hardware: M1 Ultra (20 core)
from hdbscan.