mojo-arrays's Introduction

NuMojo

NuMojo is a library for numerical computing in Mojo 🔥 similar to NumPy, SciPy in Python.
Explore the docs»
Check out our Discord»

Table of Contents
  1. About the project
  2. Goals / Roadmap
  3. Usage
  4. How to install
  5. Contributing
  6. Warnings
  7. License
  8. Acknowledgements

About the project

What NuMojo is

NuMojo intends to capture a wide swath of numerics capability present in the Python packages NumPy, SciPy and Scikit.

NuMojo aims to get the most out of Mojo's capabilities, including vectorization, parallelization, and GPU acceleration (once available). Currently, NuMojo extends (most of) the standard library math functions to work on array inputs.

NuMojo intends to be a building block for other Mojo packages that need fast math under the hood, without the added weight of an ML back- and forward-propagation system.

What NuMojo is not

NuMojo is not a machine learning library; it will never include back-propagation in the base library.

Goals / Roadmap

For a detailed roadmap, please refer to the Roadmap.md file.

Our main goal is to implement a fast, comprehensive numerics library in Mojo. Some of our long-term goals are listed below.

Long term goals

  • Linear Algebra
    • Native n-dimensional array types
    • Vectorized, Parallelized math operations
    • Array manipulation - vstack, slicing, concat etc.
  • Calculus
    • Integration & Derivatives etc
  • Optimizers
  • Function approximators
  • Sorting

Usage

An example goes as follows.

import numojo as nm

fn main() raises:
    # Generate two 1000x1000 matrices with random float64 values
    var A = nm.NDArray[nm.f64](shape=List[Int](1000,1000), random=True)
    var B = nm.NDArray[nm.f64](1000,1000, random=True)

    # Print AB
    print(nm.linalg.matmul_parallelized(A, B))

Please find all the available functions here

How to install

There are two ways to install and use the NuMojo package.

Build package

This approach involves building a standalone mojopkg package file.

  1. Clone the repository.
  2. Build the package using mojo package numojo
  3. Move numojo.mojopkg into the directory containing your code.

Include NuMojo's path for compiler and LSP

This approach does not require building a package file. Instead, when you compile your code, you can include the path of the NuMojo repository with the following command:

mojo run -I "../NuMojo" example.mojo

This is more flexible as you are able to edit the NuMojo source files when testing your code.

In order to allow VSCode LSP to resolve the imported numojo package, you can:

  1. Go to the preferences page of VSCode.
  2. Go to Mojo › Lsp: Include Dirs
  3. Click add item and enter the path where the NuMojo repository is located, e.g. /Users/Name/Programs/NuMojo.
  4. Restart the Mojo LSP server.

Now VSCode can show function hints for the NuMojo package!

Contributing

Any contributions you make are greatly appreciated. For more details and guidelines on contributions, please check here

Warnings

This library is still very much a work in progress and may change at any time.

License

Distributed under the Apache 2.0 License with LLVM Exceptions. See LICENSE and the LLVM License for more information.

Acknowledgements

mojo-arrays's People

Contributors

shivasankarka, madalex1997, forfudan, mmenendezg, sandstromviktor


mojo-arrays's Issues

Error using slices

Dear all. Thanks for creating NuMojo. I have some issues, and perhaps this is just due to the early stage of this project, but I'll submit this anyway.

Describe the bug
All array operations involving slices seem to fail.

To Reproduce

import numojo as nm


def main():
    var A = nm.NDArray[nm.f64](shape=List[Int](3,3), random=True)

    # These work as expected
    print(A.mean())
    print(nm.math.stats.meanall(A))
    print(nm.math.stats.sumall(A))
    print(A.cumsum())
    
    # All these operations throw the same error
    #nm.math.stats.min(A) # Does not work
    #nm.math.stats.prod(A,axis=0) # Does not work
    #nm.math.stats.sum(A, axis=0) # Does not work
    #A.sum(axis=0) # Does not work
    #nm.math.stats.mean(A) # Does not work

The resulting error message is:

/NuMojo/numojo/core/ndarray.mojo:1341:9: error: @stdlib::@builtin::@builtin_slice::@Slice::@"__copyinit__(stdlib::builtin::builtin_slice::Slice=&,stdlib::builtin::builtin_slice::Slice)_thunk" does not reference a KGEN declaration
mojo: error: failed to run the pass manager

Expected behavior
Being able to calculate mean along a given axis, fetching elements using slices etc.
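
For reference, the axis-wise behaviour being asked for can be sketched in plain Python. This is not NuMojo code: mean_along_axis is a hypothetical helper working on nested lists, shown only to pin down what "mean along a given axis" is expected to return for a 2-D input.

```python
def mean_along_axis(matrix, axis):
    """Mean of a 2-D nested list along axis 0 (per column) or axis 1 (per row)."""
    if axis == 0:
        # zip(*matrix) regroups the rows into columns.
        return [sum(col) / len(col) for col in zip(*matrix)]
    if axis == 1:
        return [sum(row) / len(row) for row in matrix]
    raise ValueError("axis must be 0 or 1 for a 2-D input")


A = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]

col_means = mean_along_axis(A, axis=0)  # one value per column
row_means = mean_along_axis(A, axis=1)  # one value per row
print(col_means, row_means)
```

With a 3x3 input, both calls return three values: one per column for axis=0 and one per row for axis=1.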

Desktop (please complete the following information):

  • OS: MacOS 14.6.1
  • Latest Mojo version
  • VSCode

[lib] Make the default `dtype` of NDArray as `DType.float64`

Currently, the default data type of an NDArray is DType.float32. This differs from NumPy, which uses float64 as its default type (for example, in numpy.ones()).

I am wondering whether it would be good to change the default data type of NDArray from float32 to float64 in NuMojo as well, so that it is aligned with NumPy and offers higher precision.
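
To make the precision difference concrete, here is a small Python sketch (standard library only, not NuMojo code). It rounds a float64 value to float32 and back using struct; to_float32 is a hypothetical helper name.

```python
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 and convert back."""
    return struct.unpack("f", struct.pack("f", x))[0]


x = 0.1               # stored as float64
x32 = to_float32(x)   # the same value after a float32 round trip
err = abs(x32 - x)    # rounding error introduced by float32

print(x, x32, err)
```

The round trip changes the value by roughly 1e-9, which is the kind of error a float32 default introduces before any arithmetic is done.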

I do not think it is a top priority but we can have an open discussion on this.

Migrating from `DTypePointer` to `UnsafePointer`

The Mojo team has confirmed in the nightly changelog that they will deprecate the DTypePointer, and they suggest migrating to UnsafePointer.

They have also changed the methods of the UnsafePointer.

This will require changing the way we initialize the pointers for the data of the NDArrays.

I can help on the migration using Max Nightly for the NDArray struct.

Make dimension of an ndarray a parameter and known at compile time

Current situation

Currently, the dimension of the ndarray is inferred from the shape at run time, which incurs extra cost.

Proposal

As the dimension is known at initialisation, it might be nice to make ndim a parameter and known at compile time. Example:

var A = NDArray[2, DType.float64](1000, 1000, random=True)  # Creating a 1000x1000 2-D array (matrix)

Additional benefits

Making the dimension a compile-time parameter may have additional benefits. For example, we can define aliases for NDArray types more easily.

alias Vector = NDArray[1, _]
alias Matrix = NDArray[2, _]

Merge NDArray inits with fill and random

To reduce the number of initializer overloads for ndarray, we could take the __init__ overloads that are identical except for taking either random or fill, and merge each pair into a single initializer. The merged initializer would default to filling with zeros and would not allow a nonzero fill when the random argument is true.

fn __init__( inout self, *shape: Int, random: Bool = False, order: String = "C" ) raises:

and

fn __init__( inout self, *shape: Int, fill: Scalar[dtype], order: String = "C" ) raises:

becomes

fn __init__( inout self, *shape: Int, fill: Scalar[dtype] = 0, random: Bool = False, order: String = "C" ) raises:
    if random == True and fill != 0:
        raise Error("numojo/core/ndarray:NDArray: __init__(*shape, fill, random): Error: if random is true you cannot set a fill value")

This would turn 8 of our initializers into 4 thus reducing our number of initializers from 14 to 10.
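
The validation rule of the merged initializer can be sketched in plain Python as follows. This only illustrates the argument check, under the assumption that the data is a flat list; make_data is a hypothetical name and not part of NuMojo.

```python
import random as _random


def make_data(size, fill=0.0, random=False):
    """Build a flat buffer: zeros by default, a constant fill, or random values."""
    # Mirror the proposed rule: a non-zero fill and random=True are exclusive.
    if random and fill != 0:
        raise ValueError("if random is true you cannot set a fill value")
    if random:
        return [_random.random() for _ in range(size)]
    return [fill] * size


zeros = make_data(3)                 # default: filled with zeros
filled = make_data(2, fill=1.5)      # constant fill
noisy = make_data(4, random=True)    # random values in [0, 1)

try:
    make_data(2, fill=1.0, random=True)
    rejected = False
except ValueError:
    rejected = True
```

One signature covers all three cases, and the conflicting combination is rejected up front instead of being silently resolved in favour of one argument.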

Thoughts?

Error creating new NDArray: unknown keyword arguments: 'data', 'shape'

Describe the bug
Creating an NDArray by specifying the data and shape raises an error saying that both keyword arguments are unknown.

To Reproduce

var array = nm.NDArray[nm.f32](data=List[Int](1,2,3,4,5,6), shape=List[Int](2,3))

Error raised:

error: Expression [4]:2:31: no matching function in initialization
var array = nm.NDArray[nm.f32](data=List[Int](1,2,3,4,5,6), shape=List[Int](2,3))
            ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Expression [0] wrapper:17:5: candidate not viable: unknown keyword arguments: 'data', 'shape'


Reduce number of `NDArray.__init__()` overloads and make it a pure container

This is a long-term proposal.

Situation

Currently, we have many overloads for __init__() of the NDArray struct. On the one hand, overloading is a good feature that allows more flexibility. On the other hand, it prevents developers and users from conveniently checking the arguments while programming, and it means that different overloads are called from other functions and modules. This causes ambiguity and makes maintenance harder. Thus, we should try to avoid overloads unless they are necessary.

Solution

For numpy, the solution is quite straightforward: there is only one initializer, which reads in (1) shape (tuple), (2) buffer (can be left empty), (3) order (C or F), etc. Other constructors (fill with a value, randomize, from string) are separate construction functions.

This makes the function signature clear and makes the ndarray type a pure container.

I think we can also follow this. For NDArray, we only keep ONE initializer: fn __init__[dtype](shape, buffer, order). All other constructors should be included in the module array_creation_routines.
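
A rough Python sketch of this layout is given below. It is an illustration of the idea only, not the NuMojo implementation: the class body, the flat list buffer, and the zeros and full helpers are assumptions made for the example.

```python
from math import prod


class NDArray:
    """Pure container: one initializer taking shape, buffer, and order."""

    def __init__(self, shape, buffer=None, order="C"):
        if order not in ("C", "F"):
            raise ValueError("order must be 'C' or 'F'")
        size = prod(shape)
        if buffer is None:
            buffer = [0.0] * size  # an empty buffer defaults to zeros here
        if len(buffer) != size:
            raise ValueError("buffer length does not match shape")
        self.shape = tuple(shape)
        self.buffer = list(buffer)
        self.order = order
        self.size = size


# Creation routines live outside the container and all go through
# the single initializer.
def zeros(shape, order="C"):
    return NDArray(shape, None, order)


def full(shape, fill_value, order="C"):
    return NDArray(shape, [fill_value] * prod(shape), order)


a = zeros((2, 3))
b = full((2, 2), 7.0)
```

Every creation routine ends in the same call, so the container signature stays small and new ways of building arrays do not add overloads.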

Obstacle

The biggest obstacle at the moment is that the Tuple type is not iterable. This means that we cannot calculate size, strides, etc, from the shape (as Tuple). In future, when Tuple becomes iterable, we can completely align the arguments with numpy.
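
For context, the quantities that have to be derived by iterating over the shape are the total size and the per-axis strides. A minimal Python sketch follows; strides_from_shape is a hypothetical helper, and strides are counted in elements rather than bytes.

```python
def strides_from_shape(shape, order="C"):
    """Return (strides, size) for a shape, with strides counted in elements."""
    if order == "C":
        # Row-major: the last axis varies fastest.
        axes = range(len(shape) - 1, -1, -1)
    elif order == "F":
        # Column-major: the first axis varies fastest.
        axes = range(len(shape))
    else:
        raise ValueError("order must be 'C' or 'F'")

    strides = [0] * len(shape)
    step = 1
    for axis in axes:
        strides[axis] = step
        step *= shape[axis]
    # After the loop, step equals the total number of elements.
    return tuple(strides), step


c_strides, c_size = strides_from_shape((2, 3, 4), order="C")
f_strides, f_size = strides_from_shape((2, 3, 4), order="F")
print(c_strides, f_strides, c_size)
```

Both results require visiting every entry of the shape, which is why a non-iterable Tuple blocks this design.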

Bug in calling array.dim_list.__getitem__()

Hello,

First off, thanks for taking on the project of making Mojo's version of Numpy! For the issue I'm seeing, there seems to be something going on when calling the array.dim_list.__getitem__() method for the ndarray implementation. After calling the function once, the next call balloons to a large number (e.g. 1.4e14 in my case). This becomes extremely problematic when calling the array.arr_print() function, as the loop is now out of bounds for the size of the array. Below is the code I used to reproduce the issue:

from ndarray.arraynd import Array
alias array = Array[DType.float64, 4]
let f64funcs = array()

var x = f64funcs.zeros(10)

print(x.size) # -> 10

print(x.tshape[0], x.tshape[1]) # -> 10, 0

# first call to __getitem__
print(x.dim_list[0]) # -> 1

# second call to __getitem__
print(x.dim_list[0]) # -> 140378715213677

# third call to __getitem__
print(x.dim_list[0]) # -> 140378715213677

I'm not sure if this is an issue with the VariadicList during initialization or during each function call, or something else. I know this package is in its infancy, but I just wanted to bring this to your attention.

Thank you for your time and please let me know if there's any additional information you need. (First time submitting one of these!)

Make `sort()` method an inplace method

I would like to know if it is possible to change the behavior of sort() on an array to make it an in-place method.

The expected behavior would be like:

var array = nm.NDArray[nm.i64](List[Int64](8, 3, 5, 7, 2, 1, 4, 6), shape=List[Int](8))
array.sort()
print(array)

[Output]: 
[	1	2	3	...	6	7	8	]
Shape: [8]  DType: int64

I understand that there are times when you don't want to change the original array and the intention is to create a new array, but renaming the current sort() to sorted() may be an option.
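
Python's built-in list type follows this same naming split, which may serve as a reference point. The snippet below uses only Python built-ins and is not NuMojo code.

```python
values = [8, 3, 5, 7, 2, 1, 4, 6]

# sorted() returns a new list and leaves the original untouched.
copy_sorted = sorted(values)
unchanged = list(values)

# list.sort() reorders the list in place and returns None.
result = values.sort()

print(copy_sorted, unchanged, values, result)
```

Having the in-place method return None keeps it from being mistaken for the copying version.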

Let me know what you think.
