Comments (9)
This can happen if your userSets have regions that overlap multiple universe regions, which means you haven't defined your userSets in terms of your universe. One giant region in the userSet could cover 15 universe regions, scoring 15 overlaps, but it is only registered as a single entity in the userSet, which can mess up the contingency table.
Try setting redefineUserSets=TRUE in runLOLA. Does it solve it?
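To make the contingency-table problem concrete, here is a minimal base R sketch of the arithmetic. All counts are hypothetical, and the 2x2 layout (a, b, c, d) is an assumption made for illustration, not code taken from the LOLA source. It only shows how counting overlaps per universe region while counting the user set per user region can push one cell below zero.

```r
# Hypothetical counts, chosen only to illustrate the arithmetic.
universe_size    <- 1000  # regions in the universe
user_set_size    <- 10    # regions in the user set, one of them very wide
db_hits_universe <- 40    # universe regions overlapping the database set
support          <- 24    # overlaps credited to the user set, counted per
                          # universe region (the wide region contributes 15)

# Assumed 2x2 layout (an illustration, not the package's implementation).
a      <- support                           # in user set, in db set
b      <- db_hits_universe - support        # not in user set, in db set
c_cell <- user_set_size - support           # in user set, not in db set ("c")
d      <- universe_size - a - b - c_cell    # in neither

tab <- matrix(c(a, c_cell, b, d), nrow = 2, byrow = TRUE,
              dimnames = list(c("in user set", "not in user set"),
                              c("in db set", "not in db set")))
print(tab)

# c_cell is 10 - 24 = -14: more overlaps than user set regions.
any(tab < 0)
```

A table with a negative cell is not a valid input for a Fisher-style test, which is why the negative entry is a symptom of the set definitions rather than something to filter out afterwards.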
from lola.
No, unfortunately it doesn't. I get the same warning and output with:
res <- runLOLA(userSets, userUniverse, lolaCustom, cores=detectCores(), redefineUserSets=TRUE)
The regions in my userSets are all FIMO scans of motifs and therefore quite short in length (between 8 and 31 bp in fact).
I don't really understand what could possibly be wrong with this data. Any idea?
Thinking out loud:
- Would disjoin(), resize() or some other operation on userSets help in any way?
- What about removing all entries that have a negative c value from the scoreTable?
Thanks,
from lola.
Your userSets should already be disjoint, or else you would get an error.
Removing negative c entries isn't the best because the fact that you have negative c entries is warning you that something is wrong with the way the test is constructed; eliminating these doesn't make the other tests accurate.
Ok, another idea. Try running reduce instead of disjoin on your database. I think maybe you are getting lots of little overlapping regions in your database because of the way you're combining potentially overlapping stuff and then disjoining.
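For reference, a small sketch of how the two operations differ on overlapping input, using the GenomicRanges functions disjoin() and reduce(). The coordinates are made up for illustration and have nothing to do with the actual database.

```r
library(GenomicRanges)

# Three toy regions on chr1; the first two overlap each other.
gr <- GRanges("chr1", IRanges(start = c(1, 5, 20), end = c(10, 15, 30)))

# disjoin() cuts at every boundary, so the overlapping pair becomes
# three fragments (1-4, 5-10, 11-15) plus the untouched 20-30.
fragments <- disjoin(gr)

# reduce() merges overlapping regions into one, giving 1-15 and 20-30.
merged <- reduce(gr)

length(fragments)  # 4
length(merged)     # 2
```

With many overlapping motif hits, disjoin() can therefore multiply the number of regions, while reduce() can only keep it the same or shrink it.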
from lola.
OK, I made some tests and randomly stumbled upon a solution (that I don't understand).
The custom LOLA database I built initially contains three collections: JASPAR, Taipale and UniProbe (see here: http://www.uwencode.org/proj/CATO/).
Running runLOLA() with the whole database gives me the error above.
However, rather unexpectedly, if I create three databases with one collection each, runLOLA() works flawlessly!
Any idea why that would happen? Maybe it has to do with the length of the object? Each of the collections has several hundred files.
from lola.
Side question. Is:
sheffield_dnase <- unlist(lolaCore$regionGRL[which(lolaCore$regionAnno$collection == "sheffield_dnase")])
userUniverse <- disjoin(c(sheffield_dnase, unlist(userSets)))
res <- runLOLA(userSets, userUniverse, regionDB, cores=detectCores(), redefineUserSets=FALSE)
equivalent to:
sheffield_dnase <- unlist(lolaCore$regionGRL[which(lolaCore$regionAnno$collection == "sheffield_dnase")])
userUniverse <- disjoin(sheffield_dnase)
res <- runLOLA(userSets, userUniverse, regionDB, cores=detectCores(), redefineUserSets=TRUE)
Why/Why not?
The first approach (i.e., including your own regions in the universe) makes more sense to me, but maybe I misunderstood what redefineUserSets=TRUE does?
from lola.
Do you mind putting the database somewhere where I can download it? Or emailing it to me (contact info)?
from lola.
And for your side question: No, these are very different.
Option 1: just puts your user sets into the universe (and then disjoins them so there are no overlaps, separating regions out).
Option 2: your user sets will not be in the universe. Then you redefine the userSets; any that were not in the universe would no longer be in your user sets.
Try combining your options. Put your user sets in the universe, and then redefine the userSets.
You can check out ?redefineUserSets:
This function will take the user sets, overlap them with the universe, and redefine the user sets as the set of regions in the user universe that overlap at least one region in the user sets. This makes for a more appropriate statistical enrichment comparison, as the user sets are then exactly the same regions found in the universe. Otherwise, you can get some weird artifacts from the many-to-many relationship between user set regions and universe regions.
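A rough sketch of the behavior described above, written with plain GenomicRanges calls rather than the LOLA function itself. The toy coordinates are invented, and subsetByOverlaps() is used here as a stand-in for the idea, not as a claim about how redefineUserSets is implemented.

```r
library(GenomicRanges)

# Toy universe: four non-overlapping regions on chr1.
universe <- GRanges("chr1", IRanges(start = c(1, 11, 21, 100),
                                    end   = c(10, 20, 30, 110)))

# Toy user set: one wide region spanning three universe regions,
# and one region that is not covered by the universe at all.
userSet <- GRanges("chr1", IRanges(start = c(5, 200), end = c(25, 210)))

# Many-to-many relationship before redefinition:
# the first user region hits 3 universe regions, the second hits none.
countOverlaps(userSet, universe)

# Redefine: keep the universe regions that overlap at least one user region.
redefined <- subsetByOverlaps(universe, userSet)

# Each redefined region now corresponds to exactly one universe region,
# and the user region outside the universe has dropped out.
length(redefined)
countOverlaps(redefined, universe)
```

This also shows the difference between the two options in the side question: without adding the user sets to the universe first, the region at 200-210 simply disappears from the redefined set.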
from lola.
Sorry, I can't, as it also contains some unpublished data that I don't own.
So, would you recommend always using redefineUserSets=TRUE then?
Also, can you please confirm that including userSets in userUniverse is a sensible thing to do?
from lola.
> Sorry, I can't, as it also contains some unpublished data that I don't own.
Just a reproducible minimal example would be fine, if you can concoct one.
> So, would you recommend always using redefineUserSets=TRUE then?
Ideally you don't need to redefine user sets because you have already defined them in the same way that you defined your universe. If you defined them in almost the same way, you probably don't have to worry about it. If they're very different, then I would at least see how things change when you redefine them. So, it's like a helper function in case you are constructing a universe on the fly. The short answer is: it probably doesn't hurt, but it may not be necessary and it adds compute time. Look also at ?buildRestrictedUniverse.
> Also, can you please confirm that including userSets in userUniverse is a sensible thing to do?
Well, any userSets that are not in the universe must be ignored, because sort of by definition a userSet must be in the universe (that's what the universe is: the regions that could have been included in userSets). Again, ideally, your universe is defined in a way that already includes all your userSets. If not, you can sort of hack it by just adding the userSets into the universe (as you do above). I think this is a reasonable thing to do because sometimes it's difficult to come up with the exact appropriate universe for your test. But if there are many userSets not in the universe you want to use, you're probably using the wrong universe, and just adding the userSets may not be what you want. So, in some cases it's sensible, but you should think a bit about the universe you are using and what it means for your contingency table.
from lola.