
Comments (2)

jpivarski commented on June 19, 2024

I thought about that—there are cases where you'd want to bin a space in a non-rectangular way. For instance, in x < 0 the y bins are finely spaced, in 0 <= x < 1, the y bins are broadly spaced, and in 1 <= x, the y bins cover a different range, etc.

This can be expressed in the current framework as a combination of Histograms and Collections. Both of these have a list of Axis that divide the space in a Cartesian product, but the Collections also define a set of children that do not have to have the same Axis list. That way, you could have a Collection of three Histograms: one finely binned in y, filled with x < 0, another widely binned in y with 0 <= x < 1, and another with y bins in a different range for 1 <= x. The non-rectangularness is expressible, though the user-facing library might call these a single histogram while Aghast calls it three Histograms.
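As a rough illustration of the three-Histogram idea, here is a minimal sketch in plain Python. It does not use the Aghast API: `RegularHist1D`, `x_edges`, `children`, and `fill` are hypothetical names, and the bin counts and y ranges are made-up values chosen only to show finely spaced, broadly spaced, and shifted binnings.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class RegularHist1D:
    """Hypothetical stand-in for a 1D histogram in y (not an Aghast class)."""
    num: int
    low: float
    high: float
    counts: list = field(init=False)

    def __post_init__(self):
        self.counts = [0] * self.num

    def fill(self, y):
        # Values outside [low, high) are dropped in this sketch.
        if self.low <= y < self.high:
            index = int((y - self.low) / (self.high - self.low) * self.num)
            self.counts[min(index, self.num - 1)] += 1


# Three x regions split at x = 0 and x = 1, each with its own y binning.
x_edges = [0.0, 1.0]
children = [
    RegularHist1D(100, -5.0, 5.0),  # x < 0: finely spaced y bins
    RegularHist1D(5, -5.0, 5.0),    # 0 <= x < 1: broadly spaced y bins
    RegularHist1D(10, 20.0, 30.0),  # 1 <= x: y bins cover a different range
]


def fill(x, y):
    # bisect_right maps x < 0 to 0, 0 <= x < 1 to 1, and 1 <= x to 2.
    children[bisect_right(x_edges, x)].fill(y)


for x, y in [(-2.0, 0.03), (0.5, 0.03), (3.0, 25.0)]:
    fill(x, y)
```

A user-facing library could present `children` as a single non-rectangular histogram, while the underlying description stays three rectangular ones.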

Ah, but in that case, you'd really prefer the children of the Collection to be "named" with elements of a PredicateBinning, rather than strings. Maybe I should add a sibling of Collection that does that: instead of keying the things it contains with strings, it should key them with a binning. That would carry more semantic information.

from aghast.

jpivarski avatar jpivarski commented on June 19, 2024

I had been thinking about this, and although a sister to Collection would add the functionality in a backwards-compatible way, it would be simpler (though a breaking change) to generalize the Collection members from a string → objects mapping to a binning → objects mapping, where the binning is usually a CategoryBinning. The case you want would be a PredicateBinning.
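To make the string → objects versus binning → objects distinction concrete, here is a small sketch in plain Python. None of these classes are Aghast's: `CategoryLookup`, `PredicateLookup`, and `KeyedCollection` are hypothetical, and representing predicates as Python callables is an assumption made for the example, not a statement about how PredicateBinning stores them.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class CategoryLookup:
    """Hypothetical: bins identified by string labels, like string keys."""
    categories: List[str]

    def index(self, key: str) -> int:
        return self.categories.index(key)


@dataclass
class PredicateLookup:
    """Hypothetical: bins identified by boolean tests on a datum."""
    predicates: List[Callable[[Dict[str, float]], bool]]

    def index(self, datum: Dict[str, float]) -> int:
        # First matching predicate wins; the sketch assumes no overlaps.
        for i, predicate in enumerate(self.predicates):
            if predicate(datum):
                return i
        raise LookupError("no predicate matched")


@dataclass
class KeyedCollection:
    """Hypothetical container holding one member per bin of `lookup`."""
    lookup: Any
    members: List[Any]

    def member_for(self, key: Any) -> Any:
        return self.members[self.lookup.index(key)]


# Keyed by category: equivalent to the string -> objects mapping.
by_name = KeyedCollection(
    CategoryLookup(["signal", "background"]),
    ["hist A", "hist B"],
)

# Keyed by predicate: each member is tied to the region it describes.
by_region = KeyedCollection(
    PredicateLookup([
        lambda d: d["x"] < 0,
        lambda d: 0 <= d["x"] < 1,
        lambda d: 1 <= d["x"],
    ]),
    ["fine y bins", "broad y bins", "shifted y range"],
)
```

With the same container type, the keying carries the semantics: `by_region` records which region each member covers, which plain string names would not.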

Since it's still the early days, I'm going to change that. A lot of tests will need to be touched, but it will be worth it in the end.

The Axis system would be like this:

  • Collection has a sequence of Axis that are the outermost Cartesian splits.
  • Collection lookup has a single Axis that is a binning for its individually defined objects—one Histogram definition for each bin.
  • Histograms have a sequence of Axis that are the innermost Cartesian splits.

That way, you can build arbitrary nestings of "ands" and "ors" for splitting, by nesting several layers of Collections.
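One way to picture the "ands" and "ors" is to count bins: Cartesian axes multiply, while the lookup's members add. The sketch below assumes that reading of the three-layer scheme. `Hist` and `Coll` are hypothetical stand-ins, not Aghast classes, and each axis is reduced to just its number of bins.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Union


@dataclass
class Hist:
    """Hypothetical leaf: a list of axis sizes, the innermost Cartesian splits."""
    axis_sizes: List[int]

    def num_bins(self) -> int:
        return prod(self.axis_sizes)


@dataclass
class Coll:
    """Hypothetical node: outer Cartesian axes plus one member per lookup bin."""
    axis_sizes: List[int]                # outermost Cartesian splits ("and")
    members: List[Union["Coll", Hist]]   # one definition per lookup bin ("or")

    def num_bins(self) -> int:
        # Outer axes apply to every member, so they multiply the summed members.
        return prod(self.axis_sizes) * sum(m.num_bins() for m in self.members)


# Three x regions, each with its own y binning (the sizes are illustrative).
regions = Coll([], [Hist([100]), Hist([5]), Hist([10])])

# A second layer: a 2-bin outer axis applied to both the region split and
# an unrelated 4-by-3 histogram.
nested = Coll([2], [regions, Hist([4, 3])])
```

Here `regions.num_bins()` is 100 + 5 + 10 = 115, and `nested.num_bins()` is 2 * (115 + 12) = 254.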

