Comments (6)
@9il Thank you for the fast merge! I still have a question about my ger example for the recurrent neural network backpropagation algorithm.
git clone https://github.com/ShigekiKarita/numir-char-rnn.git
git checkout -b <branch-name> origin/<branch-name>
git submodule update --init --recursive # lubeck is submoduled
dub clean --all-packages
dub run --compiler=ldc2 -b=release-nobounds
using lubeck.mtimes (ger disabled)
https://github.com/ShigekiKarita/numir-char-rnn/blob/mir-blas-gemm/source/app.d#L63-L72
dub run --compiler=ldc2 -b=release-nobounds 59.09s user 0.94s system 395% cpu 15.180 total
using lubeck.mtimes (ger enabled)
https://github.com/ShigekiKarita/numir-char-rnn/blob/mir-blas-ger/source/app.d#L63-L72
dub run --compiler=ldc2 -b=release-nobounds 46.70s user 0.59s system 395% cpu 11.968 total
using direct call of mir.blas.ger
https://github.com/ShigekiKarita/numir-char-rnn/blob/master/source/app.d#L84-L94
dub run --compiler=ldc2 -b=release-nobounds 21.89s user 0.20s system 390% cpu 5.661 total
Do you have any comments on the reason why this ger inside mtimes is much slower than the raw ger?
from lubeck.
1. gemm vs ger: BLAS operations are not optimized for special cases like k = 1, so gemm runs the general gemm path even when it could call ger. Another possible optimization is to call gemv when only one of the matrices is a vector.
2. Lubeck vs mir-blas/mir-lapack: Lubeck allocates memory for the result, and this memory is probably not in the CPU cache. So if you preallocate all memory and use only mir-blas/mir-lapack (and avoid D's hashmaps), your algorithm will be faster. Both Python and Lubeck use a BLAS implementation internally, so if you want to beat Python you need to preallocate memory, which is not possible in numpy.
3. Take a look at this line:
auto dh = mtimes(params["Why"].transposed, dy).slice;
There are two memory allocations. The first one is mtimes, the second one is slice.
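The relationship between ger and gemm described above can be illustrated with a minimal pure-Python sketch. The functions `ger` and `gemm` here are hypothetical stand-ins that only mirror the BLAS argument order; they are not the mir-blas or Lubeck API. The sketch shows that a rank-1 update is the k = 1 case of a general matrix product, and that both can write into a preallocated buffer instead of allocating a new result on every call.

```python
def ger(alpha, x, y, a):
    """Rank-1 update in place: a += alpha * x * y^T. Allocates nothing."""
    for i, xi in enumerate(x):
        row = a[i]
        for j, yj in enumerate(y):
            row[j] += alpha * xi * yj


def gemm(alpha, a, b, beta, c):
    """General update in place: c = alpha * (a @ b) + beta * c."""
    k = len(b)
    for i in range(len(c)):
        for j in range(len(c[0])):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = alpha * acc + beta * c[i][j]


x = [1.0, 2.0]
y = [3.0, 4.0, 5.0]

# Output buffers are allocated once and could be reused on every iteration.
c_ger = [[0.0] * len(y) for _ in x]
c_gemm = [[0.0] * len(y) for _ in x]

ger(1.0, x, y, c_ger)
# The same product through gemm: x as a 2x1 matrix, y as a 1x3 matrix (k = 1).
gemm(1.0, [[xi] for xi in x], [list(y)], 1.0, c_gemm)
```

Both buffers end up holding the same outer product. In a real BLAS the two routines take different code paths, which is one plausible reason for the timing gap reported in this thread.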
Thanks, I see. Do you have a plan to add an optional argument to store the result in lubeck? As we can see it in numpy and torch, it is good for the preallocation strategy.
@9il The mtimes documentation could be adjusted to say "General matrix-matrix multiplication. Allocates result to an uninitialized slice using GC."
I don't know how many of the other functions that will be in lubeck will need to think about these sorts of issues. It probably makes sense to create an Issue for further discussion on pre-allocation or the use of alternate allocators.
> Thanks, I see. Do you have a plan to add an optional argument to store the result in lubeck? As we can see it in numpy and torch, it is good for the preallocation strategy.
The gemm wrapper for ndslice from mir-blas does the same.
> @9il The mtimes documentation could be adjusted to say "General matrix-matrix multiplication. Allocates result to an uninitialized slice using GC."
A PR is welcome. At the same time, this is a general Lubeck concept. Lubeck was originally created to port a commercial Matlab library to D. Similarity, readability, and simplicity were the key features. Speed mattered too, but it had already increased more than one hundred times compared with the original Matlab code.
> I don't know how many of the other functions that will be in lubeck will need to think about these sorts of issues. It probably makes sense to create an Issue for further discussion on pre-allocation or the use of alternate allocators.
Currently @EmTee70 is working on matrix classes that can hold different payloads (such as symmetric, diagonal, and other matrices) and have an overloaded "*" operation.
I have thought a lot about a separate RC-based Matrix type system with clever expressions that would feel like Julia or Matlab. Something like this:
Assume it can hold different types of matrices:
- dense
- symmetric
- diagonal
- sparse
- tridiagonal
and is clever enough, for example, to fuse at run time an expression like
Mat C = alpha * J * J.t - beta * B * R;
into two BLAS (OpenBLAS) calls: syrk (symmetric rank k operation) for `C = alpha * J * J.t`, and then gemm (general rank k operation) for `C -= beta * B * R`.
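The two-call evaluation of that expression can be sketched in pure Python as follows. The `syrk` and `gemm` functions are hypothetical stand-ins that only mirror the BLAS update formulas; they are not the proposed Matrix type or any existing Lubeck or mir-blas API. One assumption to note: this `syrk` mirrors its result into both triangles for simplicity, whereas a real BLAS syrk fills only one triangle, so a real implementation would need to symmetrize before the gemm step.

```python
def syrk(alpha, a, beta, c):
    """Symmetric rank-k update in place: c = alpha * (a @ a^T) + beta * c.

    Assumes c is symmetric on entry.
    """
    n, k = len(a), len(a[0])
    for i in range(n):
        for j in range(i + 1):  # compute the lower triangle only
            acc = sum(a[i][p] * a[j][p] for p in range(k))
            val = alpha * acc + beta * c[i][j]
            c[i][j] = val
            c[j][i] = val  # mirror into the upper triangle (simplification)


def gemm(alpha, a, b, beta, c):
    """General update in place: c = alpha * (a @ b) + beta * c."""
    k = len(b)
    for i in range(len(c)):
        for j in range(len(c[0])):
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            c[i][j] = alpha * acc + beta * c[i][j]


alpha, beta = 2.0, 1.0
J = [[1.0, 2.0], [3.0, 4.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
R = [[1.0, 2.0], [3.0, 4.0]]

# One preallocated result buffer is shared by both calls.
C = [[0.0, 0.0], [0.0, 0.0]]
syrk(alpha, J, 0.0, C)    # C = alpha * J * J.t
gemm(-beta, B, R, 1.0, C)  # C -= beta * B * R
```

The point of the fusion is that no temporary is created for `J * J.t` or `B * R`: the second call accumulates directly into the buffer written by the first.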
Plus it would be able to solve linear systems using lapack:
B /= A;
@Laeeth, @firoozye, possibly it may be a good concept for math programming in D.
@9il Submitted PR.
I'm glad progress is being made on general matrix classes. Would Jean-Louis Leroy's open multi-methods library be useful for these run-time features? He has Matrix examples, but I don't see any that implement operator overloading.