
Comments (9)

robertmartin8 commented on September 13, 2024

@schneiderfelipe

I spent quite a bit of time trying different functions, e.g.:

d = 1 / np.sqrt(np.diag(cov))
return d.reshape(-1, 1) * (cov * d)

I thought this would be fast because it uses NumPy's broadcasting, but it's actually slower than the explicit np.dot calls (though it does preserve the pandas labels).

In the end, the fastest approach was a minor improvement over yours, bringing the np.sqrt out as well so it only operates on the diagonal. Then just re-add pandas labels.

def cov_to_corr(cov):
    Dinv = np.diag(1 / np.sqrt(np.diag(cov)))
    return pd.DataFrame(np.dot(Dinv, np.dot(cov, Dinv)), index=cov.index, columns=cov.index)
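
For anyone who wants to try this outside the library, here is a minimal self-contained sketch of the same idea. It is a variant rather than the exact function above: the use of `to_numpy()` and the `@` operator is an assumption made here for illustration, and the input values are the rounded covariance matrix quoted elsewhere in this thread.

```python
import numpy as np
import pandas as pd


def cov_to_corr(cov):
    """Convert a covariance DataFrame to a correlation DataFrame."""
    # Inverse standard deviations: the square root touches only the n diagonal entries.
    Dinv = np.diag(1 / np.sqrt(np.diag(cov)))
    # D^-1 @ cov @ D^-1 on the raw array, then re-attach the original labels.
    corr = Dinv @ cov.to_numpy() @ Dinv
    return pd.DataFrame(corr, index=cov.index, columns=cov.index)


# Covariance matrix quoted in this thread (rounded to 6 decimals).
tickers = ["ABEV3.SAO", "PETR4.SAO", "MGLU3.SAO"]
S = pd.DataFrame(
    [[0.059909, 0.025989, 0.018350],
     [0.025989, 0.199124, 0.055718],
     [0.018350, 0.055718, 0.298723]],
    index=tickers, columns=tickers,
)

corr = cov_to_corr(S)
print(corr.round(6))
```

The result should have ones on the diagonal and carry the ticker labels on both axes, since the labels are copied from `cov.index`.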

Let me know if your tests show any difference.

Now that I've spent so much time on this, I'm probably going to add it to risk_models haha. Thanks for all your input, including the tests!

from pyportfolioopt.

robertmartin8 commented on September 13, 2024

Hi,

Thanks for the suggestion! I'll consider adding something like that. FYI, the code can be simplified a lot using linear algebra. I think something like this should work?

Dinv = 1 / np.sqrt(cov.diagonal())
return np.dot(Dinv.T, np.dot(cov, Dinv))


schneiderfelipe commented on September 13, 2024

> I am keeping this open to get an indication of whether people would value this feature. In my mind, it doesn't really belong in the API (particularly because of the two-liner above).

This would be nice to mention in the documentation. What do you think?


schneiderfelipe commented on September 13, 2024

Nice! I believe it's as clean and fast as it can get :)

Thanks! It's an awesome feature!


robertmartin8 commented on September 13, 2024

I am keeping this open to get an indication of whether people would value this feature. In my mind, it doesn't really belong in the API (particularly because of the two-liner above).


robertmartin8 commented on September 13, 2024

@schneiderfelipe yeah I think I will. Is my linear algebra correct though?


robertmartin8 commented on September 13, 2024

@shangfr @schneiderfelipe

My linear algebra was a bit off, this will work (see math SE):

Dinv = np.diag(np.diag(1 / np.sqrt(cov)))
return np.dot(Dinv, np.dot(cov, Dinv))
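
For reference, the identity behind this snippet can be written out as follows. The symbols are introduced here for illustration and do not come from the thread: $\Sigma$ stands for `cov`, $D$ for the diagonal matrix of standard deviations, and $R$ for the correlation matrix.

```latex
D = \operatorname{diag}\!\left(\sqrt{\Sigma_{11}}, \dots, \sqrt{\Sigma_{nn}}\right),
\qquad
R = D^{-1} \Sigma D^{-1},
\qquad
R_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii}}\,\sqrt{\Sigma_{jj}}}.
```

Because $D$ is diagonal, $D^{-1}$ simply holds the reciprocals $1/\sqrt{\Sigma_{ii}}$ on its diagonal, which is what `Dinv` stores, and setting $i = j$ in the last expression shows that the diagonal of $R$ is identically 1.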


schneiderfelipe commented on September 13, 2024

> My linear algebra was a bit off, this will work (see math SE):
>
> Dinv = np.diag(np.diag(1 / np.sqrt(cov)))
> return np.dot(Dinv, np.dot(cov, Dinv))

Only a minor issue: np.diag(1 / np.sqrt(cov)) takes the reciprocal of all n² elements before extracting the diagonal, while 1 / np.diag(np.sqrt(cov)) gives the same result but takes the reciprocal of only n elements (in both cases the square root is still applied to the full matrix).

In [1]: from pypfopt import risk_models
In [2]: import numpy as np
...
In [4]: prices.tail()  # tickers from Brazilian stock exchange
Out[4]: 
            ABEV3.SAO  PETR4.SAO  MGLU3.SAO
index                                      
2019-07-25      19.49      26.89     244.48
2019-07-26      19.80      26.14     252.00
2019-07-29      20.50      26.38     263.80
2019-07-30      20.34      26.24     264.55
2019-07-31      20.05      26.34     265.33
In [5]: S = risk_models.CovarianceShrinkage(prices).ledoit_wolf()
In [6]: S
Out[6]: 
           ABEV3.SAO  PETR4.SAO  MGLU3.SAO
ABEV3.SAO   0.059909   0.025989   0.018350
PETR4.SAO   0.025989   0.199124   0.055718
MGLU3.SAO   0.018350   0.055718   0.298723

See comparisons below:

In [7]: np.diag(np.diag(1 / np.sqrt(S)))
Out[7]: 
array([[4.08559668, 0.        , 0.        ],
       [0.        , 2.24098365, 0.        ],
       [0.        , 0.        , 1.82964101]])
In [8]: np.diag(1 / np.diag(np.sqrt(S)))
Out[8]: 
array([[4.08559668, 0.        , 0.        ],
       [0.        , 2.24098365, 0.        ],
       [0.        , 0.        , 1.82964101]])
In [9]: %timeit -n 1000 np.diag(np.diag(1 / np.sqrt(S)))
1.54 ms ± 32.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [10]: %timeit -n 1000 np.diag(1 / np.diag(np.sqrt(S)))
176 µs ± 4.87 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The correlation matrix can then be obtained as you mentioned:

In [11]: Dinv = np.diag(1 / np.diag(np.sqrt(S)))
In [12]: np.dot(Dinv, np.dot(S, Dinv))
Out[12]: 
array([[1.        , 0.23794696, 0.13716936],
       [0.23794696, 1.        , 0.22845599],
       [0.13716936, 0.22845599, 1.        ]])
In [13]: Dinv @ S @ Dinv  # new notation returns a dataframe
Out[13]: 
          0         1         2
0  1.000000  0.237947  0.137169
1  0.237947  1.000000  0.228456
2  0.137169  0.228456  1.000000

By the way, the following also works (and keeps proper dataframe labeling, but is not as fast):

In [14]: d = np.sqrt(np.diag(S))
In [15]: S / np.outer(d, d)
Out[15]: 
           ABEV3.SAO  PETR4.SAO  MGLU3.SAO
ABEV3.SAO   1.000000   0.237947   0.137169
PETR4.SAO   0.237947   1.000000   0.228456
MGLU3.SAO   0.137169   0.228456   1.000000
In [36]: %timeit -n 1000 S / np.outer(d, d)
1.03 ms ± 14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
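
Outside IPython, the formulations discussed in this thread can be checked against each other with a short script. This is only a sketch under stated assumptions: the function names are hypothetical, and it works on a plain NumPy array rather than a DataFrame, so labels are dropped and the pandas overhead that shows up in the timings above is not reproduced.

```python
import timeit

import numpy as np

# Covariance values quoted above, as a plain NumPy array (labels dropped on purpose).
S = np.array([
    [0.059909, 0.025989, 0.018350],
    [0.025989, 0.199124, 0.055718],
    [0.018350, 0.055718, 0.298723],
])


def corr_matmul(cov):
    # D^-1 @ cov @ D^-1 with an explicit diagonal matrix.
    Dinv = np.diag(1 / np.sqrt(np.diag(cov)))
    return Dinv @ cov @ Dinv


def corr_outer(cov):
    # Divide element-wise by the outer product of the standard deviations.
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)


def corr_broadcast(cov):
    # Scale rows and columns by 1/std using broadcasting.
    d = 1 / np.sqrt(np.diag(cov))
    return d[:, None] * cov * d[None, :]


results = [f(S) for f in (corr_matmul, corr_outer, corr_broadcast)]
agree = all(np.allclose(results[0], r) for r in results[1:])
print("all formulations agree:", agree)

# Timings vary with machine, library versions, and matrix size; no numbers are claimed here.
for f in (corr_matmul, corr_outer, corr_broadcast):
    seconds = timeit.timeit(lambda: f(S), number=1000)
    print(f"{f.__name__}: {seconds / 1000 * 1e6:.2f} µs per call")
```

Which variant is fastest is likely to depend on the matrix size and on whether the input is a DataFrame, so it is worth rerunning the comparison on realistic data before drawing conclusions.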


robertmartin8 commented on September 13, 2024

Good catch! Let me try to see if there are any efficient ways that preserve labelling.

