Comments (13)

ablaom commented on July 28, 2024
julia> scitype([1.0, 2.3, missing, 3.5])
AbstractVector{Union{Missing, Continuous}} (alias for AbstractArray{Union{Missing, Continuous}, 1})

from tabletransforms.jl.

juliohm commented on July 28, 2024

Yes, I guess we could generalize assert_continuous to work in the presence of missing values.

juliohm commented on July 28, 2024

Let's postpone this discussion until after v1.0 is out. We are working on the missing transforms in #13 and will soon release this first major version. The API is quite stable and we don't foresee any major changes. Support for missing values could change the API, so it will be considered in a future major release.

ceferisbarov commented on July 28, 2024

I generalized assert_continuous to work in the presence of missing values, but this results in errors during the transformation process. For example:

MethodError: Cannot `convert` an object of type Missing to an object of type Float64

I can think of several alternatives:

  • Automatically deal with missing values before the transformation to avoid these errors
  • Simply generalize assert_continuous and let the rest of the transform throw these errors
  • Emit a helpful warning message from the assert_continuous function
  • Do nothing and let assert_continuous report that the table is not continuous

Should I implement one of these? Or maybe something else?
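
For reference, the MethodError reported above can be reproduced with Base alone. This is a minimal sketch, assuming the transform writes its results into a buffer whose element type is Float64; the variable names are made up for illustration and are not taken from TableTransforms.jl.

```julia
# Assumption: the transform stores results in a Float64 buffer.
out = Vector{Float64}(undef, 3)

# Storing `missing` calls convert(Float64, missing), which has no method.
err = try
  out[1] = missing
  nothing
catch e
  e
end
err isa MethodError  # true

# A buffer whose element type admits `missing` accepts the same assignment.
out2 = Vector{Union{Missing,Float64}}(undef, 3)
out2[1] = missing
out2[2] = 1.5
```

So relaxing the assertion alone is not enough; the code that stores the results also has to allow Missing in its element type.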

juliohm commented on July 28, 2024

That is a great suggestion, @ablaom. If I understood correctly, you are suggesting that we use scitype on the column and then check whether the result is <: AbstractVector{Union{Missing,Continuous}}?

juliohm commented on July 28, 2024

@ceferisbarov, try using scitype as @ablaom suggested. Happy to review a PR.

ceferisbarov commented on July 28, 2024

@juliohm Great, I am working on it. The problem is that AbstractVector{Continuous} <: AbstractVector{Union{Missing,Continuous}} returns false, but
Continuous <: Union{Missing, Continuous} returns true. Is there a reason why I shouldn't use the latter? The following line:

@assert all(T <: Continuous for T in types) "columns must hold continuous variables"

would be replaced with:

@assert all(T <: Union{Missing, Continuous} for T in types) "columns must hold continuous variables"
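
The difference between the two assertions can be tried out with Base types only. The sketch below uses AbstractFloat as a stand-in for Continuous, and the function names are hypothetical; it illustrates the element-type check and is not the package's code.

```julia
# Stand-in for the scientific type `Continuous` (an assumption for this sketch).
const Cont = AbstractFloat

# Strict check: every element type must be continuous, so `missing` is rejected.
is_continuous(types) = all(T <: Cont for T in types)

# Relaxed check: element types may also admit `missing`.
is_continuous_or_missing(types) = all(T <: Union{Missing,Cont} for T in types)

types = (Float64, Union{Missing,Float64}, Float32)

is_continuous(types)             # false
is_continuous_or_missing(types)  # true

# Caveat: a column that is entirely missing also passes the relaxed check.
is_continuous_or_missing((Missing,))  # true
```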

juliohm commented on July 28, 2024

I think we need two distinct assertion functions. The one we have is more strict in the sense that it asserts that we don't have missing values. Maybe we should add a new function assert_continuous_or_missing if a transform supports missing values. Can you please remind me why we started discussing this generalization? Do we really need to allow missing values in our currently implemented transforms? As far as I remember none of our statistical transforms, which require continuous values, support missing values.

ceferisbarov commented on July 28, 2024

@juliohm The original request was to skip missing values. This can be done at least for the Center transform. I believe we can create assert_continuous_or_missing as you said, and use it for transforms where we can skip missing values. We would then have to add an option to skip missing values to those transforms.

ceferisbarov commented on July 28, 2024

An example:

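using TableTransforms
import TypedTables

# x2 contains a missing value; x3 is constant
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [missing, 2.0, 3.0, 4.0, 5.0]
x3 = [5.0, 5.0, 5.0, 5.0, 5.0]
t = TypedTables.Table(; x1, x2, x3)

# skipmissing is the option proposed in this thread
t |> Center(skipmissing=true)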

Output:

Table with 3 columns and 5 rows:
     x1    x2       x3
   ┌───────────────────
 1 │ -2.0  missing  0.0
 2 │ -1.0  -1.5     0.0
 3 │ 0.0   -0.5     0.0
 4 │ 1.0   0.5      0.0
 5 │ 2.0   1.5      0.0
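
The x2 column above can be checked by hand with the Statistics standard library. This is only a sketch of the idea, assuming the column mean is taken over the non-missing entries; center_skipmissing is a hypothetical helper, not the TableTransforms.jl implementation.

```julia
using Statistics

# Hypothetical helper: subtract the mean of the non-missing entries.
# Missing entries stay missing because `missing - μ` is `missing`.
function center_skipmissing(x)
  μ = mean(skipmissing(x))
  x .- μ
end

x2 = [missing, 2.0, 3.0, 4.0, 5.0]
y2 = center_skipmissing(x2)

ismissing(y2[1])         # true
collect(skipmissing(y2)) # [-1.5, -0.5, 0.5, 1.5]
```

The result keeps the element type Union{Missing, Float64}, so the missing entry passes through unchanged.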

ablaom commented on July 28, 2024

Quoting @ceferisbarov: "The problem is that AbstractVector{Continuous} <: AbstractVector{Union{Missing,Continuous}} returns false"

Maybe this is what you're after:

julia> AbstractVector{Continuous} <: AbstractVector{<:Union{Missing,Continuous}}
true
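
The same behaviour shows up with Base types, because Julia's parametric types are invariant in their parameters. In the sketch below Float64 stands in for Continuous and accepts_missing is a made-up function name, used only to show the `<:` form in a method signature.

```julia
# The element types are related by subtyping...
Float64 <: Union{Missing,Float64}                                    # true

# ...but the container types are not, because type parameters are invariant.
AbstractVector{Float64} <: AbstractVector{Union{Missing,Float64}}    # false

# `<:` inside the braces matches any element type below the union.
AbstractVector{Float64} <: AbstractVector{<:Union{Missing,Float64}}  # true

# The same pattern in a method signature (hypothetical function).
accepts_missing(::AbstractVector{<:Union{Missing,Float64}}) = true
accepts_missing(::Any) = false

accepts_missing([1.0, 2.0])      # true
accepts_missing([missing, 2.0])  # true
accepts_missing([1, 2])          # false, the element type is Int
```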

ceferisbarov commented on July 28, 2024

@ablaom This works, thanks! Do you mind taking a look at the PR?

juliohm commented on July 28, 2024

This is no longer an issue with the migration to DataScienceTraits.jl.