This workflow calculates dN/dS ratios for RSV-A and RSV-B, both for each gene and for different regions of the RSV G gene. The nonsynonymous and synonymous mutation counts are scaled by the number of possible nonsynonymous and synonymous mutation sites, respectively, in the given reference CDS regions.
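As a rough illustration of the scaling described above, the sketch below counts possible synonymous and nonsynonymous sites in a CDS and then forms the dN/dS ratio. It is not taken from the workflow's code: the function names `count_sites` and `dn_ds` are hypothetical, the site count uses a simple scheme that weights each single-nucleotide change by 1/3 and treats changes to a stop codon as nonsynonymous, and the workflow itself may count sites differently.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_sites(cds):
    """Return (synonymous_sites, nonsynonymous_sites) for a CDS string.

    Assumption: each of the 9 single-nucleotide changes per codon
    contributes 1/3 of a site to its class. Codons containing
    ambiguous bases and any trailing partial codon are skipped.
    """
    syn_changes = 0
    nonsyn_changes = 0
    cds = cds.upper()
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        if codon not in CODON_TABLE:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    syn_changes += 1
                else:
                    nonsyn_changes += 1
    return syn_changes / 3, nonsyn_changes / 3


def dn_ds(n_nonsyn, n_syn, nonsyn_sites, syn_sites):
    """Scale mutation counts by site counts and return dN/dS.

    Returns NaN when the ratio is undefined (no sites or dS == 0).
    """
    if nonsyn_sites == 0 or syn_sites == 0 or n_syn == 0:
        return float("nan")
    dn = n_nonsyn / nonsyn_sites
    ds = n_syn / syn_sites
    return dn / ds
```

For example, `count_sites("ATGGGG")` gives the site counts for a toy two-codon sequence, and those counts can be passed to `dn_ds` together with observed mutation counts for the same region.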
Mutation rates per codon are also calculated for each gene: the number of mutations in the gene is divided by the tree length and the gene length, then multiplied by three to express the rate per codon.
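The per-codon rate can be written as a one-line calculation. The sketch below is only an illustration of the arithmetic; the function name `mutations_per_codon` is hypothetical, and it assumes the gene length is given in nucleotides and the tree length is the sum of branch lengths in whatever units the tree file uses.

```python
def mutations_per_codon(n_mutations, tree_length, gene_length_nt):
    """Mutations per codon for one gene, normalized by tree length.

    Assumptions: gene_length_nt is in nucleotides, so multiplying
    by 3 converts a per-nucleotide rate to a per-codon rate;
    tree_length is the total branch length of the tree.
    """
    if tree_length <= 0 or gene_length_nt <= 0:
        raise ValueError("tree length and gene length must be positive")
    return n_mutations / tree_length / gene_length_nt * 3
```

With made-up numbers, a gene of 900 nucleotides with 30 mutations on a tree of total length 2.0 would be computed as `mutations_per_codon(30, 2.0, 900)`.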
The inputs required for the workflow are the following:
- amino acid mutations by tree branch (json)
- nucleotide mutations by tree branch (json)
- phylogenetic tree for the genome of the RSV subtype of interest (nwk)
- reference file (genbank)
The amino acid, nucleotide and tree files can be generated by the [RSV Nextstrain](https://github.com/nextstrain/rsv) workflow.
The outputs are generated in the results folder and include the following:
- CSV file with dN/dS ratios for each gene
- CSV file with dN/dS ratios for each G gene region
- PNG graph of mutations per codon for each gene
The workflow can be run from the command line using Snakemake:

    snakemake --cores all