Comments (4)
Yeah, this has been on my radar for a while. I suppose I've hesitated because Tomtom is so often unable to identify which proteins are most relevant to the motif. I'll post here if I code up a solution.
from basset.
Yeah, but I can definitely say that this is absolutely crucial for Basset to gain real traction: it needs to connect to previous people's work. I started working on this, and it should just be a matter of feeding the model the sequence of a specific location, right?
from basset.
Just to make sure that we're on the same page: you want to annotate the motifs that pop up with loss scores in the saturated mutagenesis heat maps, right? For example, the sequence in Figure 5 in the paper would be annotated as CTCF. That's how I've been thinking about it.
Obviously, there are already tools for annotating sequences with motifs, like Tomtom. Basset can help suggest which motifs are actually relevant. So maybe the easiest approach would be to add an option to basset_sat.py to query the sequence for significant motifs with Tomtom, but filter the list for only those that overlap a nucleotide with a loss score above some threshold.
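The filtering step described above could be sketched roughly as follows. This is only a minimal illustration, not code from basset_sat.py: `MotifHit`, `filter_hits_by_loss`, and the threshold value are hypothetical names, the hits are assumed to have already been parsed from Tomtom output into 0-based half-open coordinates, and a larger loss score is assumed to mean a more important nucleotide.

```python
from typing import List, NamedTuple, Sequence


class MotifHit(NamedTuple):
    """A motif match on the input sequence (hypothetical structure)."""
    name: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def filter_hits_by_loss(
    hits: Sequence[MotifHit],
    loss_scores: Sequence[float],
    threshold: float,
) -> List[MotifHit]:
    """Keep hits that overlap at least one nucleotide whose loss
    score exceeds the threshold (assumes larger = more important)."""
    kept = []
    for hit in hits:
        window = loss_scores[hit.start:hit.end]
        if any(score > threshold for score in window):
            kept.append(hit)
    return kept


# Toy data: one loss score per nucleotide of a 6 bp sequence.
loss_scores = [0.0, 0.1, 0.9, 0.2, 0.0, 0.0]
hits = [MotifHit("CTCF", 1, 4), MotifHit("SP1", 4, 6)]
relevant = filter_hits_by_loss(hits, loss_scores, threshold=0.5)
```

Here only the hit covering the high-loss position would be kept; the threshold itself would need tuning against real saturated mutagenesis output.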
from basset.
Yeah, that's exactly right (CTCF in Figure 5).
Although the method you point out would show the identified motifs, wouldn't it show only those? I think there is definitely a ton of value in also showing the motifs that Tomtom can't find. That would mean starting with an all-inclusive list of motifs that cause a significant loss score and just labeling those that Tomtom is able to label, without discarding any (we know that current motifs suck).
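One way to picture this inclusive approach, as a rough sketch under stated assumptions: first collect every contiguous run of nucleotides whose loss score exceeds a threshold, then attach whatever motif names overlap each run and keep runs with no match as unannotated. The function names, the `(name, start, end)` hit tuples, and the sign convention (larger loss = more important) are all assumptions for illustration, not part of Basset or Tomtom.

```python
from typing import List, Sequence, Tuple

Region = Tuple[int, int]    # 0-based, half-open (start, end)
Hit = Tuple[str, int, int]  # (motif name, start, end), half-open


def high_loss_regions(loss_scores: Sequence[float], threshold: float) -> List[Region]:
    """Return contiguous runs of positions with loss above the threshold."""
    regions = []
    start = None
    for i, score in enumerate(loss_scores):
        if score > threshold:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            regions.append((start, i))  # the run ended at i
            start = None
    if start is not None:
        regions.append((start, len(loss_scores)))
    return regions


def label_regions(regions: Sequence[Region], hits: Sequence[Hit]) -> List[Tuple[int, int, List[str]]]:
    """Attach overlapping motif names to every region; none are discarded."""
    labeled = []
    for r_start, r_end in regions:
        names = [name for name, h_start, h_end in hits
                 if h_start < r_end and r_start < h_end]
        labeled.append((r_start, r_end, names))  # empty list = unannotated
    return labeled


loss_scores = [0.0, 0.8, 0.9, 0.0, 0.0, 0.7, 0.0]
regions = high_loss_regions(loss_scores, threshold=0.5)
labeled = label_regions(regions, [("CTCF", 0, 3)])
```

Regions that come back with an empty name list are the ones Tomtom could not label, which is exactly the set this comment argues should still be shown.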
from basset.