Comments (9)
Fixed in version 3.0.4
from flexiblas.
More findings about the issue:
- If I perform a SVD decomposition with R on Fedora 33, it runs in parallel correctly.
- If I install FlexiBLAS on Fedora 32 (with this repo) and numpy, then I run the benchmark with LD_PRELOAD=/lib64/libflexiblas.so.3, decompositions run in parallel correctly.
So this seems to happen only when numpy is built against FlexiBLAS. But AFAICT, it is correctly linked and the output of ldd for the different numpy .so files seems sane to me. So I'm a bit lost here.
from flexiblas.
That sounds strange. I recall that numpy's setup.py procedure not only searches for a BLAS library, it also looks for some special symbols in that library to identify whether it is ATLAS, Goto, Open, MKL, etc. I think we need to look into the build process of the numpy package. Are the build logs for fc33 available somewhere?
from flexiblas.
That's a good pointer. It looks like flexiblas is configured using the openblas options in Fedora's numpy build: https://src.fedoraproject.org/rpms/numpy/blob/master/f/numpy.spec#_110
from flexiblas.
It is configured using the openblas key, because there's no specific key for flexiblas upstream and we thought it would be best if Numpy just thinks that it's using OpenBLAS, but this configuration points to libflexiblas (at the top of the spec, blaslib is set to flexiblas). In the build log, you can check that the flag being used is -lflexiblas, and libflexiblas.so.3()(64bit) is correctly listed as Requires at the end of the build.
from flexiblas.
Also, for Fedora 33, we have:
# ldd /usr/lib64/python3.9/site-packages/numpy/linalg/lapack_lite.cpython-39-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffcb4d80000)
libflexiblas.so.3 => /lib64/libflexiblas.so.3 (0x00007f1291ceb000)
libc.so.6 => /lib64/libc.so.6 (0x00007f1291b1f000)
libm.so.6 => /lib64/libm.so.6 (0x00007f12919d9000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f12919d2000)
libgfortran.so.5 => /lib64/libgfortran.so.5 (0x00007f1291717000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f12916f5000)
libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00007f12916a9000)
/lib64/ld-linux-x86-64.so.2 (0x00007f129209e000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f129168e000)
For Fedora 32 (linked against openblas-threads), we have
# ldd /usr/lib64/python3.8/site-packages/numpy/linalg/lapack_lite.cpython-38-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007fff9a9fa000)
libopenblasp.so.0 => /lib64/libopenblasp.so.0 (0x00007f0ce9b92000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0ce99c8000)
libm.so.6 => /lib64/libm.so.6 (0x00007f0ce9883000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0ce9861000)
libgfortran.so.5 => /lib64/libgfortran.so.5 (0x00007f0ce9596000)
/lib64/ld-linux-x86-64.so.2 (0x00007f0cebfd4000)
libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00007f0ce954c000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f0ce952f000)
The presence of libdl.so.2 is the only difference, besides some reordering. I checked SciPy too, and we have the same problem for its SVD function (which was to be expected, because they have very similar configuration and build systems).
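The comparison of the two ldd listings can be reproduced mechanically. The following is a minimal sketch that only uses the library names quoted above; the helper name parse_ldd is made up for illustration and is not part of any tool mentioned in this thread.

```python
def parse_ldd(output):
    """Return the set of library names found in ldd output.

    The first token of each line is either a soname or an absolute path;
    any directory part is dropped so both forms compare equal.
    """
    names = set()
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        names.add(line.split()[0].rsplit("/", 1)[-1])
    return names


# Library names copied from the two ldd listings above (addresses omitted).
F33 = """
linux-vdso.so.1
libflexiblas.so.3 => /lib64/libflexiblas.so.3
libc.so.6 => /lib64/libc.so.6
libm.so.6 => /lib64/libm.so.6
libdl.so.2 => /lib64/libdl.so.2
libgfortran.so.5 => /lib64/libgfortran.so.5
libpthread.so.0 => /lib64/libpthread.so.0
libquadmath.so.0 => /lib64/libquadmath.so.0
/lib64/ld-linux-x86-64.so.2
libgcc_s.so.1 => /lib64/libgcc_s.so.1
"""

F32 = """
linux-vdso.so.1
libopenblasp.so.0 => /lib64/libopenblasp.so.0
libc.so.6 => /lib64/libc.so.6
libm.so.6 => /lib64/libm.so.6
libpthread.so.0 => /lib64/libpthread.so.0
libgfortran.so.5 => /lib64/libgfortran.so.5
/lib64/ld-linux-x86-64.so.2
libquadmath.so.0 => /lib64/libquadmath.so.0
libgcc_s.so.1 => /lib64/libgcc_s.so.1
"""

f33, f32 = parse_ldd(F33), parse_ldd(F32)
only_f33 = sorted(f33 - f32)  # libraries that appear only in the Fedora 33 build
only_f32 = sorted(f32 - f33)  # libraries that appear only in the Fedora 32 build
print("only in F33:", only_f33)
print("only in F32:", only_f32)
```

Apart from the BLAS library itself, libdl.so.2 is the only name left over, which matches the observation above.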
from flexiblas.
Maybe using the buildtime openblas key changes something internally?
from flexiblas.
According to numpy's documentation, they use dgesdd from LAPACK to compute the SVD. In the benchmark, the SVD is called such that the singular values and the vectors are returned. I created a minimal working example to test this routine here: https://gist.githubusercontent.com/grisuthedragon/8956628ddde9af7505fb8c0d5c799c1e/raw/6d595f8006ef1129d16b60e82b4014e5e5a8d7ad/test_dgesdd.f90
Then I did the following on a FC33 VM / 4 Cores / Intel Sandybridge:
[root@fedora33 ~]# gfortran test_dgesdd.f90 -lflexiblas -o test_dgesdd.flexiblas
[root@fedora33 ~]# gfortran test_dgesdd.f90 -lopenblaso-r0.3.10 -o test_dgesdd.openblaso
[root@fedora33 ~]# gfortran test_dgesdd.f90 -lopenblas-r0.3.10 -o test_dgesdd.openblass
[root@fedora33 ~]# gfortran test_dgesdd.f90 -llapack -lblas -o test_dgesdd.netlib
[root@fedora33 ~]# FLEXIBLAS=OPENBLAS-OPENMP ./test_dgesdd.flexiblas
Time = 1.4351626060000000
[root@fedora33 ~]# FLEXIBLAS=OPENBLAS-SERIAL ./test_dgesdd.flexiblas
Time = 2.8791492609999998
[root@fedora33 ~]# ./test_dgesdd.openblaso
Time = 1.4242952959999999
[root@fedora33 ~]# ./test_dgesdd.openblass
Time = 2.9305672999999999
[root@fedora33 ~]# ./test_dgesdd.netlib
Time = 18.114536298000001
Regarding the runtimes, we see that, as expected, FlexiBLAS with OpenBLAS and native OpenBLAS are similar, and the Netlib time is much higher, also as expected. From this point of view it seems that in the Numpy case FlexiBLAS falls back to its internal NETLIB fallback interface. I think it has something to do with the way Python and Numpy load their shared library add-ons. Using strange flags in dlopen could be the reason. In R, as @Enchufa2 tested, this isn't a problem. I remember from my first developments with FlexiBLAS (around version 1 in 2013) that a colleague sometimes got a "symbol not found" error when using FlexiBLAS with Numpy.
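To put the timings listed above in proportion, here is a small sketch that expresses each run relative to the FlexiBLAS run with the OpenBLAS OpenMP backend. The numbers are copied from the terminal output; the dictionary keys are just labels chosen for this example.

```python
# Wall-clock times in seconds, copied from the test_dgesdd runs above.
timings = {
    "flexiblas (OPENBLAS-OPENMP)": 1.4351626060000000,
    "flexiblas (OPENBLAS-SERIAL)": 2.8791492609999998,
    "openblas openmp (direct)": 1.4242952959999999,
    "openblas serial (direct)": 2.9305672999999999,
    "netlib": 18.114536298000001,
}

baseline = timings["flexiblas (OPENBLAS-OPENMP)"]

# Ratio > 1 means slower than the baseline.
ratios = {name: seconds / baseline for name, seconds in timings.items()}

for name, ratio in ratios.items():
    print(f"{name:30s} {ratio:6.2f}x")
```

A numpy SVD whose runtime sits near the Netlib ratio rather than near 1x would be consistent with the fallback hypothesis.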
I will take a deeper look into this during this week. But if anybody has some ideas, especially on how to monitor what Python is doing under the hood, I am interested.
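One possible way to see which BLAS-related shared objects actually end up in the Python process is to inspect /proc/self/maps after importing numpy. This is only a sketch for Linux: the function names and the filename pattern are assumptions made for this example, and it shows which files are mapped, not which symbol a given call resolves to.

```python
import re

# Heuristic filename pattern; an assumption for this sketch, adjust as needed.
BLAS_PATTERN = re.compile(r"(blas|lapack|mkl|atlas)", re.IGNORECASE)


def blas_libraries(maps_text):
    """Return sorted unique paths of mapped files whose name looks BLAS-related.

    maps_text is the content of a /proc/<pid>/maps file. Lines backed by a
    file have six whitespace-separated fields, the last being the path.
    """
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        basename = path.rsplit("/", 1)[-1]
        if BLAS_PATTERN.search(basename):
            paths.add(path)
    return sorted(paths)


def loaded_blas_libraries():
    """Linux only: inspect the mappings of the current process."""
    with open("/proc/self/maps") as handle:
        return blas_libraries(handle.read())
```

Calling loaded_blas_libraries() right after `import numpy.linalg`, and again after the first SVD call, shows whether additional backend libraries were loaded in between. Setting LD_DEBUG=bindings in the environment before starting Python is another glibc facility that prints symbol bindings as they are resolved.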
from flexiblas.
I think I have found the reason. Python loads all shared objects with RTLD_LAZY | RTLD_LOCAL, and thus the first call to a LAPACK routine picks the first symbol in the GOT which fits the requirements. FlexiBLAS first integrates all fallback symbols into the global offset table during initialization, while the remaining ones (like everything from BLAS) are not looked up yet. As a result, the first call from Python/Numpy to LAPACK picks the fallback we loaded into the GOT. I am currently developing the fix for this.
A short fix would be to replace the RTLD_GLOBAL flags in src/flexiblas.c / flexiblas_init by RTLD_LOCAL, but some first tests show that this is only half of the truth.
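The flags that a given interpreter passes to dlopen when importing extension modules can be checked from Python itself. The sketch below uses sys.getdlopenflags() and the os.RTLD_* constants, both of which exist on Unix builds of Python; the helper describe_dlopen_flags is made up for this example. Because RTLD_LOCAL is zero on Linux, the helper reports it as the absence of RTLD_GLOBAL.

```python
import os
import sys


def describe_dlopen_flags(flags):
    """Render the binding-time and visibility parts of dlopen flags as text."""
    # Binding time: RTLD_NOW if its bit is set, otherwise lazy binding.
    binding = "RTLD_NOW" if flags & os.RTLD_NOW else "RTLD_LAZY"
    # Visibility: RTLD_LOCAL is the default when RTLD_GLOBAL is not set.
    visibility = "RTLD_GLOBAL" if flags & os.RTLD_GLOBAL else "RTLD_LOCAL"
    return f"{binding} | {visibility}"


if __name__ == "__main__":
    # Flags this interpreter uses when it dlopen()s extension modules.
    print(describe_dlopen_flags(sys.getdlopenflags()))
```

As an experiment only, not as a fix, calling `sys.setdlopenflags(sys.getdlopenflags() | os.RTLD_GLOBAL)` before the first `import numpy` changes the visibility used for extension modules, which may help to confirm or rule out the explanation above. The result can differ between Python versions and platforms.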
from flexiblas.