
Comments (8)

haubold commented on August 21, 2024

Yes, that package is available. A common problem with fur is mixture between target and neighbor genomes. To make sure this doesn't happen, one should reconstruct a phylogeny from the genomes. If you'd like to see how this is done, take a look at the tutorial in the neighbors package.

from fur.

haubold commented on August 21, 2024

Good question. Here is a sketch of an answer: Let's say the leaves in the tree are labelled with the prefix 'n' for taxonomic neighbors and the prefix 't' for taxonomic targets, as explained in the neighbors tutorial. We are looking for the node, v, that maximizes the sum of the number of targets in the subtree rooted on v and the number of neighbors elsewhere.
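To make the score concrete, here is a minimal sketch on a toy tree. The file clades.txt and its layout (node label followed by the leaves in its subtree) are made up for illustration and are not a format used by the neighbors package; node 1 is assumed to be the root, so its line lists every leaf.

```shell
# Toy input (hypothetical format): node label, then the leaves in its subtree.
cat > clades.txt <<'EOF'
1 t1 t2 t3 n1 n2
2 t1 t2 t3
3 t1 t2
4 n1 n2
EOF
# Total number of neighbors in the tree, taken from the root clade (node 1).
total_n=$(head -n 1 clades.txt | tr ' ' '\n' | grep -c '^n')
# Score = targets inside the clade + neighbors outside it.
awk -v total_n="$total_n" '{
    t = 0; n = 0
    for (i = 2; i <= NF; i++) {
        if ($i ~ /^t/) t++
        else if ($i ~ /^n/) n++
    }
    print $1, t + (total_n - n)
}' clades.txt > toy_scores.txt
cat toy_scores.txt
```

In this toy tree, node 2 gets the highest score, because its subtree holds all three targets and leaves both neighbors outside.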

For example, the tree file o157.txt contains 449 E. coli genomes, with the pathogen O157:H7 as the target. We iterate over the 448 internal nodes, which were labeled with lande, and for each node calculate the node score, which is the number of targets found in the subtree of that node plus the number of neighbors found outside it:

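for a in $(seq 448); do
    # targets among the leaves picked for node $a
    t=$(pickle "$a" o157.txt | grep -c '^t')
    # neighbors among the leaves picked with -c (those outside node $a)
    n=$(pickle -c "$a" o157.txt | grep -c '^n')
    echo "$a" $((t + n))
done > score.txt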

We sort the scores to find that the desired node is 300, which is also the one I'd pick from visual inspection.

sort -n -k 2 -r score.txt | head
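Several nodes might share the top score, so it can be worth listing all of them before settling on one. The following is a rough sketch under that assumption; the file example_score.txt and its values are invented, and only the two-column layout (node label, score) is taken from score.txt above.

```shell
# Hypothetical scores: node label, then score (same layout as score.txt).
cat > example_score.txt <<'EOF'
1 4
2 7
3 9
4 9
5 2
EOF
# Highest score first; for equal scores, smaller node label first.
best=$(sort -k2,2nr -k1,1n example_score.txt | head -n 1)
best_score=${best#* }
# List every node that reaches the top score.
awk -v s="$best_score" '$2 == s { print $1 }' example_score.txt > top_nodes.txt
cat top_nodes.txt
```

Here nodes 3 and 4 tie, which would be a cue to look at the tree before choosing between them.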

Hope this helps.


haubold commented on August 21, 2024

Just in case this is still of interest, the Neighbors package now also contains the program fintac for finding the target clade according to the procedure sketched above.


haubold commented on August 21, 2024

We are developing the package neighbors for that. The tutorial in its documentation might help you complete a given set of targets and find the corresponding neighbors.


wangzhichao1990 commented on August 21, 2024

We are developing the package neighbors for that. The tutorial in its documentation might help you complete a given set of targets and find the corresponding neighbors.

Thank you for your reply. Is this package currently available? I basically choose other genomes of the same genus as the neighbors; maybe I need to change the -q parameter.


wangzhichao1990 commented on August 21, 2024

Yes, that package is available. A common problem with fur is mixture between target and neighbor genomes. To make sure this doesn't happen, one should reconstruct a phylogeny from the genomes. If you'd like to see how this is done, take a look at the tutorial in the neighbors package.

Is there a way to automatically correct the assignment of target and neighbor genomes?
Manual inspection is time-consuming, and I have many species for which to assemble these genome sets. Looking forward to your reply. Thank you.


haubold commented on August 21, 2024

There's a semi-automatic way, which scales well to hundreds of bacterial genomes. It is described in the tutorial of the neighbors package. It consists of the following steps:

  1. Download target and neighbor genomes as listed by neighbors
  2. Construct tree from genomes
  3. Label tree nodes and inspect tree to pick the target clade
  4. List the leaves in the target clade and automatically create symbolic links for them to the targets directory
  5. Repeat for the neighbors
  6. Run makeFurDb followed by fur

If this sounds promising to you, take a look at the neighbors tutorial for the details.
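As a rough illustration of step 4, the sketch below links the genomes named in a leaf list into a targets directory. The directory names, the .fasta suffix, and the file target_leaves.txt are placeholders chosen for this example; the neighbors tutorial may organize its files differently.

```shell
# Hypothetical layout: all genomes in all/, one file per leaf label.
work=$(mktemp -d)
mkdir -p "$work/all" "$work/targets"
touch "$work/all/t1.fasta" "$work/all/t2.fasta" "$work/all/n1.fasta"
# Stand-in for the list of leaves in the chosen target clade.
printf 't1\nt2\n' > "$work/target_leaves.txt"
# Link each listed genome into the targets directory.
while read -r leaf; do
    ln -s "$work/all/$leaf.fasta" "$work/targets/$leaf.fasta"
done < "$work/target_leaves.txt"
ls "$work/targets"
```

Step 5 would follow the same pattern, with a list of neighbor leaves and a neighbors directory in place of the target ones.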


wangzhichao1990 commented on August 21, 2024

Thank you for your reply.
I have read the documentation for this package. Actually, I want to automate step 3. At present, I don't have a good idea.

