Comments (8)

pcarbo commented on August 23, 2024

@GreenGilad There is only one strict requirement: the count data should be non-negative numbers. Normally I would use GetAssayData(object,"counts") from Seurat as the X input to fit_poisson_nmf or fit_topic_model. So hopefully you plan to do something similar? Also, please note that we have a Seurat wrapper in development here.
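
A minimal sketch of that workflow follows. To keep it runnable without Seurat, a simulated matrix stands in for the output of GetAssayData(object,"counts"); the object names and the choice k = 3 are illustrative assumptions, not recommendations. Seurat stores genes in rows and cells in columns, whereas fastTopics expects cells in rows, hence the transpose.

```r
# Simulated stand-in for GetAssayData(object, "counts"): genes x cells.
set.seed(1)
counts <- matrix(rpois(200 * 50, lambda = 2), nrow = 200, ncol = 50,
                 dimnames = list(paste0("gene", 1:200),
                                 paste0("cell", 1:50)))

# fastTopics expects cells in rows and genes in columns.
X <- t(counts)

# The one strict requirement: no negative entries.
stopifnot(all(X >= 0))

# Drop genes that are zero in every cell; they carry no information
# and, as far as I know, fastTopics may reject all-zero columns.
X <- X[, colSums(X) > 0, drop = FALSE]

# Fit the topic model only if fastTopics is installed.
# k = 3 is an arbitrary choice for this toy example.
if (requireNamespace("fastTopics", quietly = TRUE)) {
  fit <- fastTopics::fit_topic_model(X, k = 3)
}
```

With a real Seurat object, the simulated `counts` would be replaced by the raw count matrix extracted from the object.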

from fasttopics.

GreenGilad commented on August 23, 2024

@pcarbo Thanks for the quick reply!
Exactly, for a single Seurat object I plan to do something like this. The question is what would be a good approach for an integrated dataset. In that case we do not have the counts, only the normalized data. By shifting the values in the matrix so that none are negative, I would be able to fit the topics to the normalized data, but the question is:

  • Does it make sense to do so? The raw counts are natural numbers, whereas the shifted normalized data can be any non-negative rational number. The EM algorithm estimates the Poisson rates (lambda) by maximizing the likelihood, and the Poisson distribution is defined over the natural numbers.
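
For reference, the concern can be made concrete with the textbook Poisson NMF log-likelihood. This is the standard form, not something taken from the fastTopics source; the symbols $l_{ik}$ (loadings) and $f_{jk}$ (factors) are notation assumed here.

```latex
\[
\begin{aligned}
x_{ij} &\sim \mathrm{Poisson}(\lambda_{ij}), \qquad
\lambda_{ij} = \sum_{k=1}^{K} l_{ik} f_{jk}, \\
\ell(L, F) &= \sum_{i,j} \left[ x_{ij} \log \lambda_{ij} - \lambda_{ij} - \log (x_{ij}!) \right]
\end{aligned}
\]
```

Only the last term requires integer $x_{ij}$, and it does not depend on $L$ or $F$, so the optimization itself remains well defined for non-negative real values. What is lost is the Poisson interpretation: shifted, normalized values need not have variance equal to their mean, so a fit can be computable without being statistically meaningful.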

pcarbo commented on August 23, 2024

@GreenGilad I suggest following up by email. fastTopics may or may not be appropriate for your setting; we have not yet tested fastTopics for joint analysis of multiple data sets (this is something we are actively exploring). If the differences between the data sets are "small enough", then I think it would be reasonable to run fastTopics directly on the raw counts. A simple thing to do would be to run fastTopics separately on the individual data sets and on the combined data set and compare the results (there are however some subtleties in comparing the results effectively).
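
One possible way to line up topics from two separate fits is to correlate their gene-level factors over the shared genes. The sketch below is only an illustration under stated assumptions: `F1` and `F2` are simulated stand-ins for the genes x topics matrices of two fits, and Pearson correlation with a greedy best match is one choice among several, not a method endorsed in this thread.

```r
set.seed(1)
genes <- paste0("gene", 1:100)
k <- 3

# Hypothetical stand-in for the factors (genes x topics) of fit 1.
F1 <- matrix(rexp(100 * k), nrow = 100, ncol = k,
             dimnames = list(genes, paste0("k", 1:k)))

# Stand-in for fit 2: the same topics in a different order,
# perturbed by a little multiplicative noise.
perm <- c(3, 1, 2)
F2 <- F1[, perm] * matrix(exp(rnorm(100 * k, sd = 0.1)), 100, k)
colnames(F2) <- paste0("k", 1:k)

# Restrict to genes present in both fits.
shared <- intersect(rownames(F1), rownames(F2))

# Scale each topic to sum to 1 so the two fits are on the same scale.
norm_cols <- function(Fmat) sweep(Fmat, 2, colSums(Fmat), "/")
A <- norm_cols(F1[shared, , drop = FALSE])
B <- norm_cols(F2[shared, , drop = FALSE])

# S[a, b] is the correlation between topic a of fit 1 and topic b of fit 2.
S <- cor(A, B)

# For each topic in fit 1, the index of the most similar topic in fit 2.
best <- apply(S, 1, which.max)
print(round(S, 2))
print(best)
```

A greedy match like this can assign two topics to the same partner, and a low maximum correlation suggests a topic with no counterpart in the other fit; these are among the subtleties that need care in a real comparison.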

pcarbo commented on August 23, 2024

Look in the DESCRIPTION file.

inbarsh2 commented on August 23, 2024

Hi,
I encountered a similar problem when trying to run fastTopics on integrated data. I would like to run fastTopics on each dataset separately, as you suggested, but I am not sure how to compare the results effectively.
Thank you!

pcarbo commented on August 23, 2024

@inbarsh2 Could you explain in more detail what you mean by "compare the results"?

inbarsh2 commented on August 23, 2024

Sure.
I have data from patients, with high variability between samples that I need to overcome. I would like to run fastTopics on each sample separately and then find common expression programs or genes. My question is: what is the best way to do so?
I also tried integrating the data and then running fastTopics, but the integrated matrix is not the correct input for the algorithm because it is scaled (as described above).
Thank you.

pcarbo commented on August 23, 2024

@inbarsh2 I would start by running fastTopics on the raw count data for all the samples and see what the results look like. For example, are some topics capturing sample-specific effects? Do you have access to the raw count data?
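
One simple way to look for sample-specific topics is to average the topic proportions within each sample. The sketch below uses simulated data: `L` is a hypothetical stand-in for the cells x topics proportions of a fitted model (rows summing to 1), and `sample_id` is an assumed per-cell sample label.

```r
set.seed(1)
n <- 60
sample_id <- factor(rep(c("patientA", "patientB", "patientC"), each = 20))

# Hypothetical topic proportions (cells x topics).
L <- matrix(rexp(n * 3), nrow = n, ncol = 3)
colnames(L) <- paste0("k", 1:3)

# Make topic k3 dominate in patientC to mimic a sample-specific effect.
is_c <- sample_id == "patientC"
L[is_c, "k3"] <- L[is_c, "k3"] + 10

# Rescale so that each cell's proportions sum to 1.
L <- L / rowSums(L)

# Mean proportion of each topic within each sample (samples x topics).
mean_by_sample <- apply(L, 2, function(x) tapply(x, sample_id, mean))
print(round(mean_by_sample, 2))
```

A topic whose mean proportion is high in one sample and low in all others, like `k3` here, is a candidate sample-specific effect rather than a shared expression program.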
