
Comments (7)

miranov25 avatar miranov25 commented on September 17, 2024

An additional problem was observed in the test: the median is shifted by 1 bin.

from rootinteractive.

ehellbar avatar ehellbar commented on September 17, 2024

I have to correct the statement in the first comment, #41 (comment):

Observed time to make PDF map from histogram ~ 5 minutes (180 x 33 x 40 x 8)

I think this was for a first version in which the histograms were smaller. With the histogram size quoted above, the time to create one map is of the order of 30 min on the GSI batch farm.


miranov25 avatar miranov25 commented on September 17, 2024

Commit:
d96b620

  • speeding up median calculation - caching quantiles
  • fixing median bug


miranov25 avatar miranov25 commented on September 17, 2024

Benchmark

In the old implementation the sum was calculated inside the loop.
In the new implementation (commit d96b620) the cumulative function is calculated only once.
The code can still be sped up further.

python -m cProfile -s tottime RootInteractive/Tools/test_makePDFMaps.py

Old:

         147250797 function calls (147219265 primitive calls) in 138.137 seconds
    
   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 47292148   86.467    0.000   86.467    0.000 {method 'reduce' of 'numpy.ufunc' objects}
        1   26.240   26.240  133.420  133.420 makePDFMaps.py:3(makePdfMaps)
 47291767   10.851    0.000  104.146    0.000 {method 'sum' of 'numpy.ndarray' objects}
 47291767    6.990    0.000   93.295    0.000 _methods.py:34(_sum)
       12    2.665    0.222    2.665    0.222 {method 'astype' of 'numpy.ndarray' objects}
       77    1.210    0.016    1.210    0.016 {method 'copy' of 'numpy.ndarray' objects}

New:

         5377715 function calls (5346183 primitive calls) in 17.280 seconds
    
   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    9.322    9.322   12.237   12.237 makePDFMaps.py:3(makePdfMaps)
       12    2.896    0.241    2.896    0.241 {method 'astype' of 'numpy.ndarray' objects}
       77    1.298    0.017    1.298    0.017 {method 'copy' of 'numpy.ndarray' objects}
        1    0.538    0.538    0.835    0.835 completer.py:103(<module>)
   237600    0.455    0.000    0.455    0.000 {method 'cumsum' of 'numpy.ndarray' objects}
      208    0.292    0.001    0.292    0.001 {method 'flatten' of 'numpy.ndarray' objects}
     1102    0.176    0.000    0.176    0.000 {method 'reduce' of 'numpy.ufunc' objects}
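
The difference between the two profiles can be illustrated with a minimal sketch. This is not the code from makePDFMaps.py; the function names `median_bin_loop` and `median_bin_cumsum` are made up for illustration, and the histogram is assumed to be a 1D array of non-negative bin contents. The first variant re-sums the prefix in every iteration (many `sum` calls, as in the old profile), the second computes the cumulative sum once.

```python
import numpy as np


def median_bin_loop(counts):
    """Median bin via repeated partial sums (hypothetical slow variant)."""
    half = 0.5 * counts.sum()
    for i in range(len(counts)):
        # The prefix is summed again in every iteration: O(n^2) work overall.
        if counts[: i + 1].sum() >= half:
            return i
    return len(counts) - 1


def median_bin_cumsum(counts):
    """Median bin from a cumulative sum that is computed only once."""
    cumulative = np.cumsum(counts)
    half = 0.5 * cumulative[-1]
    for i, value in enumerate(cumulative):
        # Only a comparison per bin, no further summation.
        if value >= half:
            return i
    return len(counts) - 1


counts = np.array([1, 2, 5, 2, 1])
print(median_bin_loop(counts), median_bin_cumsum(counts))
```

Both functions return the index of the first bin at which the cumulative content reaches half of the total, so they should agree bin by bin.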


miranov25 avatar miranov25 commented on September 17, 2024

Reducing memory usage and speeding up:

commit 409d70b (HEAD -> master, miranov25/master)
Author: miranov25 [email protected]
Date: Fri Apr 24 13:00:23 2020 +0200

Removing not necessary copy of data
* smaller memory usage and faster

Benchmark:

The time is now spent mostly in the Python interpreter.
A further factor of 10 could be gained by optimizing the loops.

         5377627 function calls (5346095 primitive calls) in 15.739 seconds
    
   Ordered by: internal time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    9.400    9.400   11.122   11.122 makePDFMaps.py:3(makePdfMaps)
       12    2.679    0.223    2.679    0.223 {method 'astype' of 'numpy.ndarray' objects}
        1    0.566    0.566    0.868    0.868 completer.py:103(<module>)
   237600    0.430    0.000    0.430    0.000 {method 'cumsum' of 'numpy.ndarray' objects}
      208    0.325    0.002    0.325    0.002 {method 'flatten' of 'numpy.ndarray' objects}
     1102    0.272    0.000    0.272    0.000 {method 'reduce' of 'numpy.ufunc' objects}
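
One possible direction for removing the Python loops is sketched below. It is only an assumption about how the loop could be vectorised, not the implementation in the repository: the function name `median_bins_along_axis` is hypothetical, and the histogram is assumed to be an N-dimensional array of non-negative counts with the variable of interest on one axis. The 237600 `cumsum` calls in the profile equal 180 x 33 x 40, which would be consistent with one call per 1D slice; a single `cumsum` over the whole array replaces all of them.

```python
import numpy as np


def median_bins_along_axis(hist, axis=-1):
    """Median bin index for every 1D slice of `hist` along `axis` (sketch)."""
    # One cumulative sum over the full array instead of one per slice.
    cumulative = np.cumsum(hist, axis=axis)
    # Total content per slice, kept broadcastable against `cumulative`.
    total = hist.sum(axis=axis, keepdims=True)
    # argmax on a boolean array returns the first True along the axis,
    # i.e. the first bin where the cumulative content reaches half the total.
    return np.argmax(cumulative >= 0.5 * total, axis=axis)


hist = np.array([[1, 2, 5, 2, 1],
                 [0, 0, 0, 4, 0]])
print(median_bins_along_axis(hist))
```

The result has the shape of the input with the chosen axis removed, so a 4D histogram gives a 3D map of median bins.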


miranov25 avatar miranov25 commented on September 17, 2024

Pull request #43

Using np.searchsorted for the median calculation.
Timing improvement: factor of ~2.

         6324457 function calls (6292925 primitive calls) in 7.651 seconds
    
   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       12    2.798    0.233    2.798    0.233 {method 'astype' of 'numpy.ndarray' objects}
        1    0.736    0.736    2.938    2.938 makePDFMaps.py:3(makePdfMaps)
        1    0.578    0.578    0.883    0.883 completer.py:103(<module>)
   237600    0.424    0.000    0.424    0.000 {method 'cumsum' of 'numpy.ndarray' objects}
      208    0.336    0.002    0.336    0.002 {method 'flatten' of 'numpy.ndarray' objects}
      592    0.289    0.000    0.289    0.000 {method 'reduce' of 'numpy.ufunc' objects}
   237605    0.255    0.000    0.255    0.000 {method 'searchsorted' of 'numpy.ndarray' objects}
        5    0.173    0.035    0.173    0.035 {pandas._libs.lib.maybe_convert_objects}
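
For reference, a minimal sketch of a median lookup with np.searchsorted. It assumes a 1D histogram with non-negative bin contents, so that the cumulative sum is sorted; `median_bin_searchsorted` and `median_value` are hypothetical helper names, not functions from makePDFMaps.py. The explicit `side="left"` is worth spelling out, since the choice of side and the conversion from bin index to bin centre are the places where a shift by one bin can creep in.

```python
import numpy as np


def median_bin_searchsorted(counts):
    """Index of the bin containing the median of a 1D histogram (sketch)."""
    cumulative = np.cumsum(counts)
    half = 0.5 * cumulative[-1]
    # side="left" gives the first index i with cumulative[i] >= half.
    return int(np.searchsorted(cumulative, half, side="left"))


def median_value(counts, edges):
    """Centre of the median bin; `edges` has len(counts) + 1 entries."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[median_bin_searchsorted(counts)]


counts = np.array([1, 2, 5, 2, 1])
edges = np.linspace(0.0, 5.0, 6)
print(median_bin_searchsorted(counts), median_value(counts, edges))
```

The binary search replaces the per-bin comparison loop, so the cost per histogram slice drops from linear to logarithmic in the number of bins once the cumulative sum is available.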


miranov25 avatar miranov25 commented on September 17, 2024

Closing the issue.

In the benchmark above a speed-up factor of about 20 was achieved (138.1 s down to 7.7 s).
Next improvement: a PyTorch GPU implementation + fitting.
