Comments (11)

falexwolf avatar falexwolf commented on August 16, 2024 1

Sorry about the late response: yes, your .iloc/.loc analogy is correct. You'll find a similar behavior to anndata (without .iloc/.loc) also in numpy structured arrays. I'll add your analogy to the docs.

Thank you!

And, yes, we should also throw an error for the integer indexing. Leaving this open for now.
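
For readers unfamiliar with the structured-array behavior mentioned above, here is a minimal sketch. The field names `x` and `y` are made up for illustration. An integer key selects a record by position, while a string key selects a field by name, with no `.iloc`/`.loc` accessor involved.

```python
import numpy as np

# A structured array with two named fields (field names are illustrative).
arr = np.array([(1, 2.0), (3, 4.0)], dtype=[("x", "i4"), ("y", "f8")])

first_record = arr[0]   # integer key: positional access to a record
x_column = arr["x"]     # string key: access to a field by name
```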

from anndata.

ivirshup avatar ivirshup commented on August 16, 2024

Just because my current workaround feels silly (converting the integer obs_names to strings), I was looking into why this is happening. Am I correct in thinking that the assertion I'm triggering was added because indexing into an AnnData object with a number is supposed to be positional, like the pandas .iloc method, while indexing with a string/categorical is supposed to behave like .loc?
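
The pandas behavior this question refers to can be sketched as follows. The data frame is invented for illustration, with integer labels that deliberately differ from the row positions so the two lookups give different rows.

```python
import pandas as pd

# Integer labels that do not match the row positions.
df = pd.DataFrame({"value": ["a", "b", "c"]}, index=[2, 0, 1])

by_position = df["value"].iloc[0]  # first row, whatever its label is
by_label = df["value"].loc[0]      # the row whose label equals 0
```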

falexwolf avatar falexwolf commented on August 16, 2024

I added this warning when people try to work with non-string indices: 51be5d8.

ltosti avatar ltosti commented on August 16, 2024

Hi @falexwolf, @ivirshup,
I was trying to use BBKNN and got this error when concatenating different datasets (see here). Could you please clarify how you managed to fix it?

Thank you!

falexwolf avatar falexwolf commented on August 16, 2024

Sorry about this bug! It might be related to the objects having integer indices: we shouldn't have an error thrown, I guess. But previously, this was presumably inconsistent behavior. @flying-sheep, did you add the following line at some point to _normalize_index, which triggers the error:

assert names.dtype != float and names.dtype != int, \
    'Don’t call _normalize_index with non-categorical/string names'

flying-sheep avatar flying-sheep commented on August 16, 2024

am I missing something? AFAIK datasets still can’t have integer var_names or obs_names. else it’s impossible to distinguish if e.g. adata[0, :] means adata[adata.obs_names == 0, :] or adata[np.arange(adata.shape[0]) == 0, :].
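
The two readings of `adata[0, :]` described here can be illustrated with plain NumPy, without an AnnData object. The `obs_names` array below is a made-up stand-in for integer observation names.

```python
import numpy as np

# Hypothetical integer observation names that differ from the positions.
obs_names = np.array([2, 0, 1])

# Reading 1: 0 is a label, so select the observation named 0.
label_mask = obs_names == 0

# Reading 2: 0 is a position, so select the first observation.
position_mask = np.arange(obs_names.shape[0]) == 0
```

The two masks select different rows, which is why an integer key is ambiguous once the names are integers too.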

ltosti avatar ltosti commented on August 16, 2024

Thank you for the clarification! I think this error was not thrown in scanpy 1.2.2 (for example here), but it does occur now (scanpy 1.3.6). I guess a solution could be to force the var_names to string.
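
A minimal sketch of that workaround, shown on a bare pandas index rather than on an AnnData object. The assumption is that the same conversion is then assigned back to the object, e.g. `adata.var_names = adata.var_names.astype(str)`.

```python
import pandas as pd

# Stand-in for integer var_names (values are illustrative).
var_names = pd.Index([0, 1, 2])

# Convert the labels to strings so they can no longer be confused with positions.
var_names_str = var_names.astype(str)
```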

falexwolf avatar falexwolf commented on August 16, 2024

am I missing something? AFAIK datasets still can’t have integer var_names or obs_names.

@flying-sheep we never threw an error when people set non-string indices, so there will be objects out there that have non-string indices. The assert statement, which was probably inserted by you, was the first to actually throw an error. Before that, integer keys were interpreted .iloc-style even if the object had an integer index... that's the only way I can imagine the notebook above broke.

flying-sheep avatar flying-sheep commented on August 16, 2024

Well, I think it’s a useful error to throw. We could add a tip on how to resolve it by removing the index…

falexwolf avatar falexwolf commented on August 16, 2024

Yes, it's a useful error, but it shouldn't be thrown in getitem and it broke the notebook! Instead, an error should be thrown when setting a non-string index; but in that case I accounted for it via a warning, knowing that I would otherwise break things that previously worked.

We shouldn't allow people to have objects in an invalid state and then be surprised by throwing an error later on.

It would be better if we output a warning here saying that integer keys are matched .iloc-style and not against the elements of the index. This is what happened before; it's OK in many cases, and it allowed the pancreas notebook referenced above to run through. There might be many such notebooks out there, and it would be nice to re-enable them.
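
A rough sketch of the warn-instead-of-assert behavior proposed here. `normalize_index_sketch` is a hypothetical helper written for this illustration. It is not anndata's `_normalize_index` and handles only single integer or string keys.

```python
import warnings

import numpy as np


def normalize_index_sketch(key, names):
    """Hypothetical helper: map a single key to a row position."""
    names = np.asarray(names)
    if isinstance(key, (int, np.integer)):
        if np.issubdtype(names.dtype, np.number):
            # Numeric names make the key ambiguous; warn and fall back to position.
            warnings.warn(
                "Names are numeric; the integer key is used as a position "
                "(.iloc-style), not matched against the names.",
                UserWarning,
                stacklevel=2,
            )
        return int(key)
    # Non-integer keys are looked up among the names (.loc-style).
    matches = np.flatnonzero(names == key)
    if matches.size == 0:
        raise KeyError(key)
    return int(matches[0])
```

With numeric names, `normalize_index_sketch(0, [2, 0, 1])` returns position 0 and emits a warning, while a string key such as `"b"` against `["a", "b"]` is resolved by name.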

flying-sheep avatar flying-sheep commented on August 16, 2024

OK! So throwing an error here was a bugfix itself:

e83f9db...6a6f13e

I assume that _normalize_index is just not called correctly when there are integer indices. The fix is therefore surely more involved than just removing that line or converting it into a warning.

I’m pretty sure that integer indexing on anndata objects with integer names was already broken before that and I just added a nicer error message.
