Python Automated Retrieval of TimeTree data

Introduction

PAReTT is a menu-driven module used to interact with the TimeTree resource, specifically designed to automate batch retrieval of data for a list of species. Three main types of data can be retrieved using PAReTT: (1) divergence times, between an individual pair or between all species in a list, (2) evolutionary timelines, for an individual species or a list of species, and (3) time trees of the divergence times, either for all available species within a specified taxon or between individual species supplied as a list. When working with a list of species, the best place to start is the first menu option: check the data availability of the species in the list and remove any species for which data is not available.

Cite as: Le Clercq L.S., Kotze A., Grobler J.P., and Dalton D.L. PAReTT: a Python package for the Automated Retrieval and management of divergence time data from the TimeTree resource for downstream analyses (2023). Journal of Molecular Evolution. DOI: 10.1007/s00239-023-10106-3

Dependencies:

  • OS: Windows
  • Python >= 3.6
  • Numpy >= 1.20.1
  • Pandas >= 1.2.4
  • Math
  • Bio >= 1.3.9 (for handling newick trees)
  • Splinter >= 0.17.0 (for interacting with the server)
  • Selenium >= 4.1.5
  • Geckodriver >= 0.31.0 (Added to PATH)
  • Firefox browser
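
Before the first run it can help to confirm that the Python packages are installed and that geckodriver is reachable. The snippet below is a minimal sketch and not part of PAReTT: the function name `environment_report` is hypothetical, and the distribution names in `PACKAGES` are assumptions based on the dependency list above. It relies on `importlib.metadata`, which is in the standard library from Python 3.8.

```python
import shutil
from importlib import metadata  # standard library from Python 3.8 onwards

# Assumed distribution names, taken from the dependency list above.
PACKAGES = ["numpy", "pandas", "bio", "splinter", "selenium"]


def environment_report(packages=PACKAGES):
    """Return installed versions (None if absent) and the geckodriver path."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    # shutil.which returns the full path when the executable is on PATH, else None.
    report["geckodriver"] = shutil.which("geckodriver")
    return report


if __name__ == "__main__":
    for item, value in environment_report().items():
        print(f"{item:12s} {value if value else 'NOT FOUND'}")
```

Comparing the printed versions against the minimum versions listed above is a quick first check when a browser-related error appears.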

Installation:

After downloading and extracting the zip archive, PAReTT can be run by navigating to the directory and using one of two methods:

python parett.py

or

python setup.py install
python -m parett

-> This option will install the relevant dependencies automatically

A pre-compiled stand-alone Windows executable is also available.

Main menu

The following options are available through the main menu:

MAIN MENU:
----------------------------------------
Choose one of the following options?
   *) Check data availability
   a) Get Divergence Times (pair)
   b) Get Divergence Times (batch)
   c) Get Evolutionary Timeline
   d) Build a Time Tree
   e) Print citation
   f) Validate datafile
   g) Calculate Diversification rate (r)
   q) Quit
----------------------------------------

input is given as lower case '*','a','b','c','d','e','f', 'g', or 'q' e.g.

Choice: a

*) Check data availability

Brings up the menu to first check the TimeTree.org website for availability of divergence time data of your study species.

AVAILABILITY MENU:
----------------------------------------
Choose one of the following options?
     i) Individual
     l) List
     m) Main menu
----------------------------------------

input is given as lower case 'i', 'l', or 'm' (return to main menu)

i) Individual

  • Takes an individual species as input to look up data availability e.g. Passer montanus

  • Prints availability on screen

l) List

  • Takes a list of species as input from a .txt file e.g. Species.txt

  • Prints availability on screen

  • Provides option to save results to a file in .csv format e.g. Availability.csv

e.g.

Species              TimeTree.Data
Setophaga ruticilla  Available
Hirundo rustica      Available
Setophaga striata    Available
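
The clean-up step recommended in the introduction (dropping species without data before a batch run) can be sketched as follows. This is an illustration under assumptions, not PAReTT's own code: it assumes the .txt input holds one species name per line, that available species are flagged with the status `Available` as in the example above, and the helper names `read_species` and `keep_available` are hypothetical.

```python
from pathlib import Path


def read_species(path):
    """Read one species name per line, skipping blank lines and duplicates."""
    seen = set()
    species = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        name = " ".join(line.split())  # collapse stray whitespace
        if name and name not in seen:
            seen.add(name)
            species.append(name)
    return species


def keep_available(species, availability):
    """Keep only species whose status in the availability mapping is 'Available'."""
    return [name for name in species if availability.get(name) == "Available"]
```

Here `availability` is a plain dictionary mapping species name to status, for instance built from the two columns of the saved availability file.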

a) Get Divergence Times (pair)

  • Takes a pair of species as input to look up divergence times e.g. Taxon a: Passer montanus, Taxon b: Halcyon senegalensis
  • Prints divergence time of pair on screen

b) Get Divergence Times (batch)

  • Takes a list of species as input to look up divergence times from a .txt input file e.g. Species.txt
  • Prints divergence time of each pair on screen
  • Provides option to save results to a file in .csv format e.g. Output.csv

e.g.

Taxa1                Taxa2                Div.Time
Setophaga ruticilla  Setophaga ruticilla  0
Setophaga ruticilla  Hirundo rustica      35
Setophaga ruticilla  Setophaga striata    3.52
Hirundo rustica      Setophaga ruticilla  35
Hirundo rustica      Hirundo rustica      0
Hirundo rustica      Setophaga striata    35
Setophaga striata    Setophaga ruticilla  3.52
Setophaga striata    Hirundo rustica      35
Setophaga striata    Setophaga striata    0

  • When retrieving data for longer lists (more than 5-10 species), server issues may result in missing values (NA), which can be checked and replaced using the data validation menu option after the run.
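
The example output above contains every ordered pair of species, including each species paired with itself, so a list of n species yields n × n rows. The sketch below only illustrates that row count with the standard library; it does not claim to show how PAReTT builds its pairs internally.

```python
from itertools import combinations, product

species = ["Setophaga ruticilla", "Hirundo rustica", "Setophaga striata"]

# Every ordered pair, self-pairs included: the shape of the example output.
ordered_pairs = list(product(species, repeat=2))

# Unordered pairs without self-pairs: the distinct divergence times involved.
unique_pairs = list(combinations(species, 2))

print(len(ordered_pairs))  # 9
print(len(unique_pairs))   # 3
```

The quadratic growth in rows is worth keeping in mind when estimating how long a batch run over a long list will take.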

c) Get Evolutionary Timeline

Brings up the menu options to retrieve the evolutionary timeline:

TIMELINE MENU:
----------------------------------------
Choose one of the following options?
     i) Individual
     l) List
     m) Main menu
----------------------------------------

input is given as lower case 'i', 'l', or 'm' (return to main menu)

i) Individual

  • Takes an individual species as input to look up evolutionary timeline e.g. Passer montanus

  • Downloads .jpg result

l) List

  • Takes a list of species as input from a .txt file e.g. Species.txt

  • Downloads .jpg result for each species in the list

d) Build a Time Tree

Brings up the time tree menu options

TIME TREE MENU:
----------------------------------------
Choose one of the following options?
     t) Taxon
     s) Species list
     m) Main menu
----------------------------------------

input is given as lower case 't', 's', or 'm' (return to main menu)

t) Taxon

  • Takes the name for a taxon to get a time tree of all available species within the taxon e.g. Saxicola

s) Species list

  • Takes a list of species as input from a .txt file to generate a time tree e.g. Species.txt

  • Downloads the resulting time tree in the Newick format

  • Stores replaced or missing species to a .txt file e.g. replacements.txt
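
To see which requested species did not end up in a downloaded tree, the tip labels of the Newick file can be compared with the input list. The following is a rough sketch, not PAReTT's method: it handles only simple Newick strings without quoted labels, assumes that underscores in tip labels stand for spaces, and uses hypothetical helper names (`tip_labels`, `missing_from_tree`). The Newick string in the usage lines is illustrative, built from the example divergence times shown earlier.

```python
import re


def tip_labels(newick):
    """Return tip labels from a simple Newick string (no quoted labels)."""
    # A tip label directly follows '(' or ',' and runs up to the next delimiter.
    labels = re.findall(r"[(,]\s*([^\s(),:;][^(),:;]*)", newick)
    # Assumption: underscores in labels stand for spaces in species names.
    return [label.strip().replace("_", " ") for label in labels]


def missing_from_tree(requested, newick):
    """List requested species that do not appear among the tree's tips."""
    in_tree = set(tip_labels(newick))
    return [name for name in requested if name not in in_tree]


# Illustrative tree only, not real TimeTree output.
example = "((Setophaga_ruticilla:3.52,Setophaga_striata:3.52):31.48,Hirundo_rustica:35);"
print(tip_labels(example))
print(missing_from_tree(["Hirundo rustica", "Passer montanus"], example))
```

For trees with quoted labels or comments, a full parser such as the one in the Bio dependency listed above is the safer choice.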

e) Print citation

Prints the citation for the TimeTree resource

S. Kumar, G. Stecher, M. Suleski, and S.B. Hedges, 2017. TimeTree: a resource for timelines, timetrees, and divergence times. Molecular Biology and Evolution 34: 1812-1819, DOI: 10.1093/molbev/msx116

f) Validate datafile

Brings up the datafile validation menu options

VALIDATE MENU:
----------------------------------------
      a) Check missing
      b) Replace missing
      c) View tree
      m) Main menu
----------------------------------------

input is given as lower case 'a', 'b', 'c', or 'm' (return to main menu)

a) Check missing

  • Used to check for missing values from running a long list of species (>10 Species)

  • Takes the output file (.csv) from the divergence time function and checks for any missing values

    e.g.

    Taxa1                Taxa2                Div.Time
    Setophaga ruticilla  Setophaga ruticilla  0
    Setophaga ruticilla  Hirundo rustica      NA
    Setophaga ruticilla  Setophaga striata    3.52
    Hirundo rustica      Setophaga ruticilla  35
    Hirundo rustica      Hirundo rustica      0
    Hirundo rustica      Setophaga striata    NA
    Setophaga striata    Setophaga ruticilla  3.52
    Setophaga striata    Hirundo rustica      35
    Setophaga striata    Setophaga striata    0

  • If no missing values are detected, will print 'No missing values'

  • If missing values are detected they are printed to the screen and an attempt will be made to look up those values

  • Asks for file name to store the missing values as a .csv file e.g. missing.csv

    e.g.

    Taxa1                Taxa2                Div.Time
    Setophaga ruticilla  Hirundo rustica      35
    Hirundo rustica      Setophaga striata    35
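
The check itself amounts to scanning the divergence-time file for rows without a value. The sketch below shows the idea outside of PAReTT, assuming a comma-separated file with the header `Taxa1,Taxa2,Div.Time` and `NA` as the missing-value marker; the function name `find_missing` is hypothetical.

```python
import csv


def find_missing(csv_path, value_column="Div.Time", missing_token="NA"):
    """Return the rows whose divergence time is empty or equals the NA marker."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [
            row
            for row in csv.DictReader(handle)
            if (row.get(value_column) or "").strip() in ("", missing_token)
        ]
```

Each returned row is a dictionary, so the affected pairs can be read from `row["Taxa1"]` and `row["Taxa2"]`.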

b) Replace missing

  • Used to replace the missing values (divergence times) from a long list of species

  • Takes two input files, one with the divergence times and one with the missing values detected using 'Check missing'

  • Asks for file name to store the validated dataset of divergence times

    e.g.

    Taxa1                Taxa2                Div.Time
    Setophaga ruticilla  Setophaga ruticilla  0
    Setophaga ruticilla  Hirundo rustica      35
    Setophaga ruticilla  Setophaga striata    3.52
    Hirundo rustica      Setophaga ruticilla  35
    Hirundo rustica      Hirundo rustica      0
    Hirundo rustica      Setophaga striata    35
    Setophaga striata    Setophaga ruticilla  3.52
    Setophaga striata    Hirundo rustica      35
    Setophaga striata    Setophaga striata    0
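
Conceptually, the replacement step fills each NA row from the file of recovered values by matching on the species pair. A minimal sketch of that merge, assuming both files have already been read into lists of dictionaries with the keys `Taxa1`, `Taxa2` and `Div.Time` (for example with `csv.DictReader`); the function name `replace_missing` is hypothetical and this is not PAReTT's own implementation.

```python
def replace_missing(rows, fixes, missing_token="NA"):
    """Fill NA divergence times in rows using values from fixes.

    Rows are matched on the exact (Taxa1, Taxa2) pair; rows without a
    matching fix are returned unchanged.
    """
    lookup = {(fix["Taxa1"], fix["Taxa2"]): fix["Div.Time"] for fix in fixes}
    merged = []
    for row in rows:
        key = (row["Taxa1"], row["Taxa2"])
        if row["Div.Time"] == missing_token and key in lookup:
            row = {**row, "Div.Time": lookup[key]}  # copy, do not mutate the input
        merged.append(row)
    return merged
```

The input lists are left untouched, so the original file contents remain available for comparison after the merge.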

c) View tree

  • Takes a newick tree as input and renders a basic display of tree topology

g) Calculate Diversification rate (r)

  • Calculates the diversification rate using the Magallon-Sanderson equation (Magallón and Sanderson, 2001)
  • Takes three variables as input:

~ Species number (n)

~ Epsilon or Extinction rate fraction

~ Divergence time (t) as crown/node age
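
For reference, the crown-group estimator from Magallón and Sanderson (2001) can be written in a few lines. This is a sketch under the assumption that the crown-group form is the one intended here, since t is given as a crown/node age; the function name `diversification_rate_crown` is hypothetical and the code is not taken from PAReTT.

```python
import math


def diversification_rate_crown(n, epsilon, t):
    """Crown-group net diversification rate r (Magallon and Sanderson, 2001).

    n       -- number of extant species in the clade (at least 2)
    epsilon -- relative extinction fraction, 0 <= epsilon < 1
    t       -- crown age of the clade (greater than 0)
    """
    if n < 2 or t <= 0 or not 0 <= epsilon < 1:
        raise ValueError("need n >= 2, t > 0 and 0 <= epsilon < 1")
    root = math.sqrt(n * (n * epsilon**2 - 8 * epsilon + 2 * n * epsilon + n))
    inner = 0.5 * n * (1 - epsilon**2) + 2 * epsilon + 0.5 * (1 - epsilon) * root
    return (math.log(inner) - math.log(2)) / t


# With no extinction (epsilon = 0) the estimator reduces to ln(n / 2) / t.
print(diversification_rate_crown(n=8, epsilon=0.0, t=1.0))
```

With epsilon = 0 the expression simplifies to (ln n − ln 2) / t, which is a convenient sanity check on any implementation.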

q) Quit

Exits program

Publications:

Le Clercq, L.-S., Bazzi, G., Cecere, J.G., Gianfranceschi, L., Grobler, J.P., Kotzé, A., Rubolini, D., Liedvogel, M. and Dalton, D.L. Time trees and clock genes: a systematic review and comparative analysis of contemporary avian migration genetics (2023). Biological Reviews, 98: 1051-1080. DOI: 10.1111/brv.12943

© 2022 Le Clercq

parett's Issues

Divergence Function Troubleshooting

Hey there! Trying to run PAReTT for the first time, but it seems like the program is having trouble with the drivers. I have all of the dependencies installed, but still get this error when I attempt to use the divergence function. GeckoDriver is installed and listed under PATH. Thanks!

Traceback (most recent call last):
  File "C:\Users\ljouflas\Downloads\PAReTT-main\PAReTT-main\parett.py", line 616, in <module>
    main()
  File "C:\Users\ljouflas\Downloads\PAReTT-main\PAReTT-main\parett.py", line 599, in main
    div_times_sing()
  File "C:\Users\ljouflas\Downloads\PAReTT-main\PAReTT-main\parett.py", line 145, in div_times_sing
    with Browser('firefox', headless=True) as browser:
  File "C:\Users\ljouflas\AppData\Local\anaconda3\envs\tryingshitout\Lib\site-packages\splinter\browser.py", line 130, in Browser
    return get_driver(driver, retry_count=retry_count, config=config, *args, **kwargs)
  File "C:\Users\ljouflas\AppData\Local\anaconda3\envs\tryingshitout\Lib\site-packages\splinter\browser.py", line 92, in get_driver
    return driver(config=config, *args, **kwargs)
  File "C:\Users\ljouflas\AppData\Local\anaconda3\envs\tryingshitout\Lib\site-packages\splinter\driver\webdriver\firefox.py", line 83, in __init__
    driver = _setup_firefox(
  File "C:\Users\ljouflas\AppData\Local\anaconda3\envs\tryingshitout\Lib\site-packages\splinter\driver\webdriver\setup.py", line 86, in _setup_firefox
    rv = driver_class(options=options, service=service, **kwargs)
TypeError: WebDriver.__init__() got an unexpected keyword argument 'service'
