
Comments (8)

LuckyMD commented on August 15, 2024

Hi @anikaliu,

Sorry for the delayed reply here. This looks like something we have encountered before and should work. Could I ask which version of scanorama you are running? It seems that you have more cells after integration than before.

Have you compared the /data/al862/scib_pipeline/data/test_data/integration/unscaled/full_feature/scanorama.h5ad output to your input data file? Are there any cells that seem to be duplicated? Or is it the case that we have cells with zero expression in the test object?

@mumichae this shouldn't have to do with the test object, no?
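For anyone following along, the two checks suggested above (duplicated cells and cells with zero expression) could be sketched roughly as below. This is only an illustration on toy data: the helper names `duplicated_cells` and `zero_expression_cells`, the barcodes, and the counts are all made up for the example and are not part of scib-pipeline.

```python
import numpy as np
import pandas as pd


def duplicated_cells(obs_names):
    """Return the cell barcodes that occur more than once."""
    index = pd.Index(obs_names)
    return index[index.duplicated(keep=False)].unique().tolist()


def zero_expression_cells(counts, obs_names):
    """Return the barcodes of cells whose total expression is zero.

    `counts` is a dense cells-by-genes array.
    """
    totals = np.asarray(counts).sum(axis=1)
    return [name for name, total in zip(obs_names, totals) if total == 0]


# Toy stand-ins for the real objects (hypothetical barcodes and counts).
names_in = ["c1", "c2", "c3"]
names_out = ["c1", "c2", "c2", "c3"]  # one barcode appears twice after "integration"
counts_in = np.array([[1, 0], [0, 0], [2, 3]])

dups = duplicated_cells(names_out)
empty = zero_expression_cells(counts_in, names_in)
print("duplicated:", dups)
print("zero expression:", empty)
```

With the real files, the inputs would presumably come from `anndata.read_h5ad(...)`, passing `adata.obs_names` and a dense copy of `adata.X` (or row sums of the sparse matrix).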

from scib-pipeline.

anikaliu commented on August 15, 2024

Hi @LuckyMD,
thanks for your reply! I hadn't checked any of the things you mentioned but have done it now:

  • The scanorama version is 1.7 (through bioconda).
  • The number of cells per batch is identical in the input and output h5ad (also for adata_output.obsm['X_scanorama'])
  • There aren't any cells without expressed genes in the input object

Please let me know if there's anything else I can check or send over to sort this out!:)


LuckyMD commented on August 15, 2024

Hmm... and the adata.obs['batch'] before and after integration are exactly the same?


mumichae commented on August 15, 2024

The only time we encountered this error was when scanorama overwrote the batch column in adata.obs, which should have been fixed in version 1.7. Are you checking the version in the scIB-python environment?
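One way to check the version from inside a given environment is to ask the Python interpreter of that environment directly, rather than relying on a package listing taken elsewhere. A minimal sketch using only the standard library follows; `installed_version` and `at_least` are hypothetical helper names, and the comparison only handles plain dotted numeric versions.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def at_least(found, minimum):
    """Compare dotted numeric versions such as '1.7' and '1.7.1'."""
    def as_tuple(text):
        return tuple(int(part) for part in text.split(".") if part.isdigit())
    return as_tuple(found) >= as_tuple(minimum)


found = installed_version("scanorama")
if found is None:
    print("scanorama is not installed in this environment")
else:
    print(f"scanorama {found}; at least 1.7: {at_least(found, '1.7')}")
```

Running this after activating the environment the pipeline uses for integration (scIB-python here) shows which scanorama that environment actually sees.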

Another thing you could try is switching to the update_scanpy branch and setting up the environments scib-pipeline and scib-R. The dependencies there should be more up-to-date.

P.S. Make sure you delete any intermediate files before rerunning the pipeline, in case the issue lies with wrong batch names in the scanorama output.


anikaliu commented on August 15, 2024

I've now re-run after deleting the intermediate files and noticed that the batches are actually mixed up! I hadn't noticed it because the frequencies were the same. I've checked that the same samples are still grouped together in the same batch, so it's only the batch IDs that are somehow different. Does this information help?
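Since identical batch frequencies can hide swapped labels, a cross-tabulation of the batch column before and after integration makes this kind of relabeling visible. A rough sketch on toy labels is shown below; `batch_relabeling` is a hypothetical helper, and it assumes both label vectors are already in the same cell order.

```python
import pandas as pd


def batch_relabeling(before, after):
    """Describe how batch labels changed between two aligned label vectors.

    Returns (identical, mapping). `mapping` maps each old label to its new
    label when the relabeling is one-to-one, and is None when cells were
    actually moved between batches.
    """
    before = pd.Series(list(before)).astype(str)
    after = pd.Series(list(after)).astype(str)
    if len(before) != len(after):
        raise ValueError("label vectors must describe the same cells")
    identical = bool((before == after).all())
    table = pd.crosstab(before, after)
    # One-to-one relabeling: each row and each column has exactly one non-zero entry.
    nonzero = table.to_numpy() > 0
    one_to_one = bool(
        (nonzero.sum(axis=1) == 1).all() and (nonzero.sum(axis=0) == 1).all()
    )
    mapping = table.idxmax(axis=1).to_dict() if one_to_one else None
    return identical, mapping


# Toy labels: same batch sizes, same grouping of cells, swapped IDs.
batch_before = ["A", "A", "B", "B"]
batch_after = ["B", "B", "A", "A"]

identical, mapping = batch_relabeling(batch_before, batch_after)
print("identical:", identical)
print("old -> new:", mapping)
```

On real data the two vectors would be `adata.obs['batch']` from the input and output objects, with the output reindexed to the input's `obs_names` first so that the rows line up.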


mumichae commented on August 15, 2024

Yes, it sounds very much like a bug that was in an older scanorama version. In the unintegrated adata object (adata_norm.h5ad) what data type does the batch column have? It should be categorical or String, but not integer.
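A quick way to inspect the data type, and to convert an integer batch column if needed, might look like the following. It is a sketch on a toy `obs` table; `batch_dtype_ok` and `as_string_categories` are hypothetical helper names, and treating any non-numeric dtype as acceptable is an assumption based on the description above.

```python
import pandas as pd


def batch_dtype_ok(column):
    """True if the batch column is categorical or string-like, not numeric."""
    if isinstance(column.dtype, pd.CategoricalDtype):
        return True
    return not pd.api.types.is_numeric_dtype(column)


def as_string_categories(column):
    """Convert a batch column to a categorical with string labels."""
    return column.astype(str).astype("category")


# Toy stand-in for adata.obs with an integer batch column.
obs = pd.DataFrame({"batch": [0, 0, 1]})
print(obs["batch"].dtype, batch_dtype_ok(obs["batch"]))

obs["batch"] = as_string_categories(obs["batch"])
print(obs["batch"].dtype, batch_dtype_ok(obs["batch"]))
```

For the real object, the same calls would be applied to `adata.obs['batch']` loaded from adata_norm.h5ad.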

Could you also confirm that you are checking the scanorama version in the scIB-python environment (or the scib-pipeline environment, if working on #6), and whether you're running the pipeline from this environment? The integration methods are explicitly run in this conda environment.

Assuming that you're working with the same data and environments as described in the repo, we should be getting the same behaviour (scanorama works for me).


mumichae commented on August 15, 2024

@anikaliu Did you resolve your issue or are you still getting mixed-up batches with scanorama 1.7?


anikaliu commented on August 15, 2024

Really sorry for not getting back to you! I haven't been able to sort it out, but I also haven't tried the update_scanpy branch yet. I will try what you suggested, but it might take me a while to get to it. (Totally understand if you prefer to close the issue for now.)

