Comments (12)

ForceBru commented on May 25, 2024

> Hello! I saw this project on Reddit and was skeptical about whether compiling code that mainly uses NumPy with Cython provides any speedup.

> Just curious, because I think I am just as skeptical: what is the point of using this weird syntax? I mean, wouldn't it be better if performance-critical components were written in C/C++ and exposed through a Python API?

Is the syntax that weird, though? This is basically Python syntax with slight modifications to let you specify types. (I think Cython should switch to Python's type annotations, like cdef my_var: double = 3.141, at some point.) Writing something in C or C++ means that you'll need to know C or C++, which are quite different from Python (and also from each other).

Furthermore, Cython already compiles to C, so before rewriting in C or C++ manually, you need to be sure that it'd be worth it, that your code can be faster than code generated by Cython. IMO, Cython is a middle ground between Python and "proper" compiled and statically typed languages like C, C++ and Rust. Which is great, because it's almost as simple as Python and not nearly as difficult as C++. The advantage of Cython is that it's similar to Python and lets you jump right in and get noticeable speedups without too much trouble, whereas using C or C++ implies learning C or C++ first and being able to wield it confidently, which is pretty difficult.

from sealion.

anish-lakkapragada commented on May 25, 2024

Could you please explain a little more about what Py_ssize_t is? I'm not that experienced in C or Cython. Thank you for spending the time to write this.

anish-lakkapragada commented on May 25, 2024

also 10x is a massive improvement!!!

anish-lakkapragada commented on May 25, 2024

I have just updated the source code in version 4.0.7. Thank you for your help, and feel free to keep on giving me feedback if you have the time. Please give me the green light to close this issue.

ForceBru commented on May 25, 2024

> Could you please explain a little more about what Py_ssize_t is? I'm not that experienced in C or Cython. Thank you for spending the time to write this.

Py_ssize_t is a signed integer type that CPython uses for sizes and indices; it is the signed counterpart of C's size_t. It's big enough to store any valid index, and since arrays can be pretty large, I used this type to make sure that their sizes are represented correctly.
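To make that concrete, here is a small sketch in plain Python (not Cython) that inspects the type from the interpreter side. It assumes CPython, where ctypes.c_ssize_t mirrors Py_ssize_t and sys.maxsize is documented as the largest value a Py_ssize_t can hold. The variable names are only for illustration.

```python
import ctypes
import sys

# Width of the signed size type in bytes (8 on typical 64-bit builds).
ssize_bytes = ctypes.sizeof(ctypes.c_ssize_t)

# Largest value a signed integer of that width can represent.
max_index = 2 ** (8 * ssize_bytes - 1) - 1

print("Py_ssize_t width in bytes:", ssize_bytes)
print("largest representable index:", max_index)
print("matches sys.maxsize:", max_index == sys.maxsize)
```

A plain C int is usually 32 bits even on 64-bit platforms, which is why loop counters over large arrays are better declared as Py_ssize_t.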

Also, I could've written the last function in a cleaner way using memory views instead of np.array:

import numpy as np

cpdef r2_score_cython(double[:] y_pred, double[:] y_test):
    # Typed memoryviews accept any 1-D buffer of doubles, not only np.ndarray.
    assert tuple(y_pred.shape) == tuple(y_test.shape), "Shape mismatch"

    return __r2_score_cython(
        np.array(y_pred),  # make a copy!
        y_test
    )
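For checking a compiled version like the one above, a plain NumPy reference can be handy. The sketch below uses the textbook definition of R² (1 minus the residual sum of squares over the total sum of squares); the name r2_score_numpy is hypothetical, and this is not necessarily how sealion computes it internally. It also shows the difference the "make a copy!" comment relies on: np.array copies by default, while np.asarray reuses the existing buffer when it can.

```python
import numpy as np

def r2_score_numpy(y_pred, y_test):
    """Textbook R^2: 1 - SS_res / SS_tot (hypothetical reference helper)."""
    y_pred = np.asarray(y_pred, dtype=np.float64)
    y_test = np.asarray(y_test, dtype=np.float64)
    if y_pred.shape != y_test.shape:
        raise ValueError("Shape mismatch")
    ss_res = np.sum((y_test - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_test - y_test.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

y_test = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
score = r2_score_numpy(y_pred, y_test)
print(score)

# np.array makes an independent copy; np.asarray shares the original buffer.
copied = np.array(y_pred)
shared = np.asarray(y_pred)
print(np.shares_memory(copied, y_pred), np.shares_memory(shared, y_pred))
```

If the inner function modifies its first argument in place, passing the copy keeps the caller's array untouched.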

BTW, it would be really useful if your library provided the EM algorithm for working with mixture models. I've been messing around with it for some time and found that sklearn.mixture.GaussianMixture could be sped up significantly. My simple implementation in the Julia language is already about 2 times faster. Maybe Cython can give the same speedup for Python. Good luck with your project!

 avatar commented on May 25, 2024

> Hello! I saw this project on Reddit and was skeptical about whether compiling code that mainly uses NumPy with Cython provides any speedup.

Just curious, because I think I am just as skeptical: what is the point of using this weird syntax? I mean, wouldn't it be better if performance-critical components were written in C/C++ and exposed through a Python API?

anish-lakkapragada commented on May 25, 2024

Yeah, using C(++) and providing a Python API like the big ML libraries do is probably what I should have done, if I had known what C(++) was at the time and how to use and build it. Unfortunately I didn't, so I used Cython.

BTW, I built this project for fun and of course there's a thousand things that SWEs can point out I should have done better. thanks!

 avatar commented on May 25, 2024

Good job anyway; you always get to learn something as you go. I understand this is a fun project for learning purposes. I'm questioning the practice of using Cython, which I never saw the point of, rather than criticizing your project, which is probably a great job for your level of experience.

anish-lakkapragada commented on May 25, 2024

Oh yeah, 100% agree with you. Using Cython isn't as good as C++.

anish-lakkapragada commented on May 25, 2024

would have been better to write python APIs and bind them but at the time Cython was all I knew

anish-lakkapragada commented on May 25, 2024

but because like 55% of the code runs on Cython in this project I might as well keep it in there. unfortunately not planning to update this project anymore.

 avatar commented on May 25, 2024

It's not that weird, I'm just using weird because I wrote a few things in C++, which makes it more familiar to me than Cython. Generally I agree, learning C/C++ is more difficult than simply changing defs to cdefs, declaring additional stuff every now and then, and so on.
