Giter Club home page Giter Club logo

Comments (7)

shoyer avatar shoyer commented on May 11, 2024

This is actually useful enough that I wrote a whole special class to encapsulate this logic for xray, LazilyIndexedArray. There may be some relevant logic there for task 1 (e.g., how to slice a slice object).

from dask.

jcrist avatar jcrist commented on May 11, 2024

Well, matching can easily find these terms, RewriteRule could be generalized to take in an arbitrary right hand side. If it's a function, it accepts a dict representing the match. If it's a term, a specific subs function is created and linked in, so that rewriting is always match -> validate (for non linear patterns) -> apply func. I have something like this already written up for SymPy, I'll port it over the weekend/early next week.

from dask.

mrocklin avatar mrocklin commented on May 11, 2024

Here is a first pass on problem 1

from operator import getitem

def test_fuse_getitem():
    pairs = [((getitem, (getitem, 'x', slice(1000, 2000)), slice(15, 20)),
              (getitem, 'x', slice(1015, 1020))),

             ((getitem, (getitem, 'x', (slice(1000, 2000), slice(100, 200))),
                        (slice(15, 20), slice(50, 60))),
              (getitem, 'x', (slice(1015, 1020), slice(150, 160)))),

             ((getitem, (getitem, 'x', slice(1000, 2000)), 10),
              (getitem, 'x', 1010)),

             ((getitem, (getitem, 'x', (slice(1000, 2000), 10)),
                        (slice(15, 20),)),
              (getitem, 'x', (slice(1015, 1020), 10))),

             ((getitem, (getitem, 'x', (10, slice(1000, 2000))),
                        (slice(15, 20),)),
              (getitem, 'x', (10, slice(1015, 1020))))

        ]

    for inp, expected in pairs:
        result = fuse_getitem(inp)
        assert result == expected


def fuse_getitem(term):
    (_, (_, x, a), b) = term

    c = fuse_slice(a, b)

    return (getitem, x, c)


def normalize_slice(s):
    start, stop, step = s.start, s.stop, s.step
    if start is None:
        start = 0
    if step is None:
        step = 1
    return slice(start, stop, step)

def fuse_slice(a, b):
    """

    >>> fuse_slice(slice(1000, 2000), slice(10, 15))
    slice(1010, 1015, None)

    >>> fuse_slice(slice(1000, 2000), 10)
    1010

    >>> fuse_slice(slice(1000, 2000, 5), 10)
    1050

    >>> fuse_slice(slice(1000, 2000, 5), slice(10, 20, 2))
    slice(1050, 1100, 10)
    """

    if isinstance(a, slice):
        a = normalize_slice(a)
    if isinstance(b, slice):
        b = normalize_slice(b)

    if isinstance(a, slice) and isinstance(b, int):
        return a.start + b*(a.step if a.step is not None else 1)
    if isinstance(a, slice) and isinstance(b, slice):
        start = a.start + a.step * b.start
        if b.stop is not None:
            stop = a.start + a.step * b.stop
        else:
            stop = None
        if a.stop is not None:
            if stop is not None:
                stop = min(a.stop, stop)
            else:
                stop = a.stop
            stop = stop - ((stop - start) % b.step)
        step = a.step * b.step
        if step == 1:
            step = None
        return slice(start, stop, step)
    if isinstance(a, tuple) and not isinstance(b, tuple):
        b = (b,)
    if isinstance(a, tuple) and isinstance(b, tuple):
        j = 0
        result = list()
        for i in range(len(a)):
            if isinstance(a[i], int):
                result.append(a[i])
                continue
            result.append(fuse_slice(a[i], b[j]))
            j += 1
        return tuple(result)
    raise NotImplementedError()

from dask.

mrocklin avatar mrocklin commented on May 11, 2024

Doesn't handle the stranger slicing conditions (e.g. negative steps) nor is it exhaustively tested.

from dask.

jcrist avatar jcrist commented on May 11, 2024

Addressed problem 2 with #106. Just make a function that accepts a dict of matches and returns a term, and use it as the rhs of the rule.

from dask.

shoyer avatar shoyer commented on May 11, 2024

@jcrist @mrocklin Very nice!

I have a pretty exhaustive test suite that I wrote for xray -- it basically just checks all permutations of a long set of slices against the output that would be generated from numpy array:
https://github.com/xray/xray/blob/0d164d848401209971ded33aea2880c1fdc892cb/xray/test/test_indexing.py#L85

Happy to help put things together but it looks like you already have a good start.

RE rewrite rules: what about optimizing cases like x.mean(axis=1)[:5]? Ideally, there would be some notion of operations that can commute along a given axis.

from dask.

mrocklin avatar mrocklin commented on May 11, 2024

I'm fooling around with this in #107 . Feedback and test cases heartily welcomed there. It's getting a bit crazy.

from dask.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.