Comments (6)

ShigekiKarita avatar ShigekiKarita commented on June 3, 2024

@9il Thank you for the fast merge! I still have a question, though, about my ger example for the recurrent neural network backpropagation algorithm.

git clone https://github.com/ShigekiKarita/numir-char-rnn.git
cd numir-char-rnn
git checkout -b <branch-name> origin/<branch-name>
git submodule update --init --recursive # lubeck is submoduled
dub clean --all-packages
dub run --compiler=ldc2 -b=release-nobounds
dub run --compiler=ldc2 -b=release-nobounds

using lubeck.mtimes (ger disabled)

https://github.com/ShigekiKarita/numir-char-rnn/blob/mir-blas-gemm/source/app.d#L63-L72
dub run --compiler=ldc2 -b=release-nobounds 59.09s user 0.94s system 395% cpu 15.180 total

using lubeck.mtimes (ger enabled)

https://github.com/ShigekiKarita/numir-char-rnn/blob/mir-blas-ger/source/app.d#L63-L72
dub run --compiler=ldc2 -b=release-nobounds 46.70s user 0.59s system 395% cpu 11.968 total

using direct call of mir.blas.ger

https://github.com/ShigekiKarita/numir-char-rnn/blob/master/source/app.d#L84-L94
dub run --compiler=ldc2 -b=release-nobounds 21.89s user 0.20s system 390% cpu 5.661 total

Do you have any idea why ger called through mtimes is so much slower than calling ger directly?
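For readers unfamiliar with the two routines being compared, here is a minimal sketch of what they compute, written as plain D loops rather than BLAS calls. The function names `rank1Update` and `matMulAccumulate` are hypothetical and are not part of lubeck or mir-blas. It only shows that a rank-1 update is the inner-dimension-1 case of a general matrix product; it says nothing about the timings above.

```d
/// Rank-1 update A += alpha * x * y^T on a row-major m-by-n matrix stored flat.
/// This is the operation BLAS calls ger; plain loops, not a BLAS call.
void rank1Update(double alpha, const double[] x, const double[] y, double[] a)
{
    immutable m = x.length, n = y.length;
    assert(a.length == m * n);
    foreach (i; 0 .. m)
        foreach (j; 0 .. n)
            a[i * n + j] += alpha * x[i] * y[j];
}

/// General product C = alpha * A * B + beta * C (the gemm contract).
/// A is m-by-k, B is k-by-n, C is m-by-n, all row-major and flat.
void matMulAccumulate(double alpha, const double[] a, const double[] b,
                      double beta, double[] c, size_t m, size_t k, size_t n)
{
    assert(a.length == m * k && b.length == k * n && c.length == m * n);
    foreach (i; 0 .. m)
        foreach (j; 0 .. n)
        {
            double sum = 0;
            foreach (p; 0 .. k)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * sum + beta * c[i * n + j];
        }
}

unittest
{
    double[] x = [1, 2, 3];   // m = 3, viewed as a 3-by-1 matrix
    double[] y = [4, 5];      // n = 2, viewed as a 1-by-2 matrix
    auto viaGer = new double[6];
    auto viaGemm = new double[6];
    viaGer[] = 1;
    viaGemm[] = 1;
    rank1Update(2.0, x, y, viaGer);
    matMulAccumulate(2.0, x, y, 1.0, viaGemm, 3, 1, 2);  // k = 1
    assert(viaGer == viaGemm);
}
```

With k = 1 the innermost loop of the general product runs exactly once per output element, which is why the two paths agree.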

from lubeck.

9il avatar 9il commented on June 3, 2024

@ShigekiKarita

  1. gemm vs ger: BLAS operations are not optimized for special cases such as k = 1, so gemm runs the full gemm path even when it could call ger. Another possible optimization is to call gemv when only one of the matrices is a vector.
  2. Lubeck vs mir-blas/mir-lapack: Lubeck allocates memory for the result, and that memory is probably not in the CPU cache. If you preallocate all memory and use only mir-blas/mir-lapack (and avoid D's hashmaps), your algorithm will be faster. Both Python and Lubeck rely on a BLAS implementation internally, so if you want to beat Python you need to preallocate memory, which is not possible in numpy.
  3. Take a look at this line:

auto dh = mtimes(params["Why"].transposed, dy).slice; 

There are two memory allocations. The first one is mtimes, the second one is slice.
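As a rough illustration of points 2 and 3, the sketch below computes `dh = W^T * dy` into a caller-owned buffer, so nothing is allocated per call and the transpose is never materialized. It uses plain D loops and a hypothetical function name, `transposedMatVecInto`; it is not the lubeck or mir-blas API, where the corresponding routine would be the gemv mentioned in point 1.

```d
/// dh = W^T * dy without building W^T and without allocating.
/// W is rows-by-cols, row-major and flat; dy has `rows` elements;
/// dh is a caller-owned buffer with `cols` elements.
void transposedMatVecInto(const double[] w, size_t rows, size_t cols,
                          const double[] dy, double[] dh)
{
    assert(w.length == rows * cols);
    assert(dy.length == rows && dh.length == cols);
    dh[] = 0;
    foreach (i; 0 .. rows)
        foreach (j; 0 .. cols)
            dh[j] += w[i * cols + j] * dy[i];
}

unittest
{
    // W = [[1, 2, 3], [4, 5, 6]], so W^T * [1, 1] = [5, 7, 9].
    double[] w = [1, 2, 3, 4, 5, 6];
    double[] dy = [1, 1];
    auto dh = new double[3];   // allocated once, outside the training loop
    foreach (step; 0 .. 3)
    {
        transposedMatVecInto(w, 2, 3, dy, dh);   // reuses the same buffer
        assert(dh == [5.0, 7.0, 9.0]);
    }
}
```

The point of the design is that the buffer outlives the loop, so every iteration writes to memory that is likely still in cache.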

ShigekiKarita avatar ShigekiKarita commented on June 3, 2024

Thanks, I see. Do you plan to add an optional argument to lubeck for storing the result? As numpy and torch show, such an argument works well for a preallocation strategy.
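To make the request concrete, here is one possible shape for such an optional argument, sketched in plain D. The name `matMul` and its signature are hypothetical and do not describe lubeck's actual or planned API: if no buffer is passed the function allocates with the GC, otherwise it fills and returns the caller's buffer.

```d
/// Hypothetical signature, not Lubeck's API: C = A * B for row-major flat
/// matrices. If `result` is null a new array is GC-allocated; otherwise the
/// caller's buffer is filled and returned.
double[] matMul(const double[] a, const double[] b,
                size_t m, size_t k, size_t n, double[] result = null)
{
    assert(a.length == m * k && b.length == k * n);
    if (result is null)
        result = new double[m * n];
    assert(result.length == m * n);
    foreach (i; 0 .. m)
        foreach (j; 0 .. n)
        {
            double sum = 0;
            foreach (p; 0 .. k)
                sum += a[i * k + p] * b[p * n + j];
            result[i * n + j] = sum;
        }
    return result;
}

unittest
{
    double[] a = [1, 2, 3, 4];   // 2-by-2
    double[] b = [5, 6, 7, 8];   // 2-by-2
    auto fresh = matMul(a, b, 2, 2, 2);          // allocates
    assert(fresh == [19.0, 22.0, 43.0, 50.0]);

    auto buffer = new double[4];
    auto same = matMul(a, b, 2, 2, 2, buffer);   // no allocation
    assert(same is buffer);
    assert(buffer == fresh);
}
```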

jmh530 avatar jmh530 commented on June 3, 2024

@9il The mtimes documentation could be adjusted to say "General matrix-matrix multiplication. Allocates result to an uninitialized slice using GC."

I don't know how many of the other functions that will be in lubeck will need to think about these sorts of issues. It probably makes sense to create an Issue for further discussion on pre-allocation or the use of alternate allocators.

9il avatar 9il commented on June 3, 2024

> Thanks, I see. Do you have a plan to add an optional argument to store the result in lubeck? As we can see it in numpy and torch, it is good for the preallocation strategy.

The gemm wrapper for ndslice in mir-blas already does this.

> @9il The mtimes documentation could be adjusted to say "General matrix-matrix multiplication. Allocates result to an uninitialized slice using GC."

A PR is welcome. At the same time, this is a general Lubeck concept. Lubeck was originally created to port a commercial Matlab library to D. Similarity, readability, and simplicity were the key features. Speed mattered too, but it had already improved more than a hundredfold compared with the original Matlab code.

> I don't know how many of the other functions that will be in lubeck will need to think about these sorts of issues. It probably makes sense to create an Issue for further discussion on pre-allocation or the use of alternate allocators.

Currently @EmTee70 is working on matrix classes that can hold different payloads (such as symmetric, diagonal, and other matrices) and have an overloaded "*" operation.

I have thought a lot about a separate RC-based Matrix type system with clever expressions that would feel like Julia or Matlab. Something like this:

Assume it can hold different types of matrices:

  1. dense
  2. symmetric
  3. diagonal
  4. sparse
  5. tridiagonal

and is clever enough, for example, to fuse at run time an expression like

Mat C = alpha * J * J.t - beta * B * R;

into two BLAS (openblas) calls:

syrk (11 - symmetric rank k operation) for

C = alpha * J * J.t

and then

gemm (12 - general matrix-matrix multiplication) for

C -= beta * B * R
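The two-step evaluation described above can be sketched with plain D loops standing in for the BLAS calls. This is only an illustration of the decomposition, under the assumption of row-major flat storage; the names `symmetricRankK` and `subtractProduct` are hypothetical, and the run-time expression fusion itself is not shown.

```d
/// Step 1, syrk-like: C = alpha * J * J^T, J is n-by-k, C is n-by-n.
/// Only the lower triangle is computed, then mirrored, since C is symmetric.
void symmetricRankK(double alpha, const double[] j, size_t n, size_t k, double[] c)
{
    assert(j.length == n * k && c.length == n * n);
    foreach (r; 0 .. n)
        foreach (s; 0 .. r + 1)
        {
            double sum = 0;
            foreach (p; 0 .. k)
                sum += j[r * k + p] * j[s * k + p];
            c[r * n + s] = alpha * sum;
            c[s * n + r] = alpha * sum;
        }
}

/// Step 2, gemm-like: C -= beta * B * R, B is n-by-m, R is m-by-n.
void subtractProduct(double beta, const double[] b, const double[] r,
                     size_t n, size_t m, double[] c)
{
    assert(b.length == n * m && r.length == m * n && c.length == n * n);
    foreach (i; 0 .. n)
        foreach (col; 0 .. n)
        {
            double sum = 0;
            foreach (p; 0 .. m)
                sum += b[i * m + p] * r[p * n + col];
            c[i * n + col] -= beta * sum;
        }
}

unittest
{
    double[] j = [1, 2, 3, 4];   // 2-by-2
    double[] b = [1, 0, 0, 1];   // identity
    double[] r = [1, 1, 1, 1];
    auto c = new double[4];
    symmetricRankK(2.0, j, 2, 2, c);       // C = 2 * J * J^T
    subtractProduct(3.0, b, r, 2, 2, c);   // C -= 3 * B * R
    assert(c == [7.0, 19.0, 19.0, 47.0]);
}
```

The first step exploits symmetry by doing roughly half the dot products, which is the reason to prefer a syrk-style call over a general product for `J * J.t`.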

Plus it would be able to solve linear systems using lapack:

B /= A;

@Laeeth, @firoozye, possibly it may be a good concept for math programming in D.

jmh530 avatar jmh530 commented on June 3, 2024

@9il Submitted PR.

I'm glad progress is being made on general matrix classes. Would Jean-Louis Leroy's open multi-methods library be useful for these run-time features? He has Matrix examples, but I don't see any that implement operator overloading.
