
wangdi2014's Projects

covid-19-data

An ongoing repository of data on coronavirus cases and deaths in the U.S.

covid-genome-ds

This project is based on a complete genome analysis of the SARS-CoV-2 virus (the cause of COVID-19), using the Wuhan-Hu-1 isolate sample. I cleaned the genome sample to obtain an RNA sequence and verified the number of base pairs in the virus. Using the concept of Kolmogorov complexity, I estimated how compactly the genome can be represented: the "LZMA" algorithm compressed it into an 8.412 kb file, which gives an upper bound on that complexity. Then I converted the RNA sequence into a DNA string in order to apply the concept of codons. This helped me to find the 20 amino acids used to express the genome as a protein sequence. Further, I made a decoder that translates the genome into reading-frame sequences. With the help of these reading-frame sequences, I was able to extract the polypeptides and long-chain polypeptides in the virus. Then I analyzed the open reading frames (ORFs) of SARS-CoV-2, which the project counts as 10 different proteins responsible for the synthesis and catalytic processes of the virus in the human body. Finally, I verified the lengths of these proteins (including ORF1a, ORF1b, Spike Glycoprotein, Membrane, ORF6, ORF7a, ORF8, ORF10), so this project confirms the scientific findings using data science concepts.
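The pipeline described above (compress with LZMA, convert RNA to DNA, translate codons, split reading frames into polypeptides) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names `compressed_size`, `rna_to_dna`, `translate`, and `polypeptides` are hypothetical, and the codon table is the standard genetic code written out in TCAG order.

```python
import lzma

# Standard genetic code, DNA codons in T, C, A, G order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def compressed_size(sequence: str) -> int:
    """Size in bytes of the LZMA-compressed sequence.

    The compressed size is only an upper bound on the Kolmogorov
    complexity; a better compressor could do better.
    """
    return len(lzma.compress(sequence.encode("ascii")))


def rna_to_dna(rna: str) -> str:
    """Replace uracil with thymine to get the DNA-alphabet string."""
    return rna.upper().replace("U", "T")


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward reading frame (0, 1 or 2) into amino acids."""
    protein = []
    for pos in range(frame, len(dna) - 2, 3):
        # Codons with ambiguous bases (e.g. 'N') are rendered as 'X'.
        protein.append(CODON_TABLE.get(dna[pos:pos + 3], "X"))
    return "".join(protein)


def polypeptides(dna: str, min_len: int = 1) -> list[str]:
    """Collect stop-delimited peptide chains from the three forward frames."""
    chains = []
    for frame in range(3):
        for chain in translate(dna, frame).split("*"):
            if len(chain) >= min_len:
                chains.append(chain)
    return chains


if __name__ == "__main__":
    toy_rna = "AUGGCCUAAAUGAAAUGA"  # toy input, not the real genome
    toy_dna = rna_to_dna(toy_rna)
    print(compressed_size(toy_dna), translate(toy_dna), polypeptides(toy_dna, 2))
```

To run this on the real genome, the sequence would need to be loaded from a file first; the description does not say how the project does that, so it is left out here.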

covid19

ACE2 expression and cigarette exposure

cpgstats

cpgStats is a C application that parses a CpG dinucleotide file and produces summary statistics and BED file annotations.

cpi_prediction

Code for compound-protein interaction (CPI) prediction based on a graph neural network (GNN) for compounds and a convolutional neural network (CNN) for proteins.

crc_meta

Code and analysis results for the CRC shotgun meta-analysis

create-pptc-pdx-oncoprints

As part of an overall strategy for improving therapies for childhood cancers, the PPTC seeks to develop models for the types of tumors that will be encountered in early-phase clinical testing by establishing patient-derived xenografts (PDXs) from high-risk childhood cancers refractory to current standard-of-care treatments. Genomic profiling of these models is required to enable PPTC investigators to develop robust "responder hypotheses" when drug activity is observed. With funding provided by Alex's Lemonade Stand Foundation, we genomically characterize a major subset of 286 PDX models. We use whole exome sequencing, transcriptome sequencing, and SNP arrays to characterize the tumor models. The focus on DNA and RNA sequencing data mirrors the current standard practice in most clinical diagnostics labs, which use these technologies to detect the spectrum of targetable mutations, gene amplifications, and gene fusion events relevant to preclinical drug development.

cris.py

Analyze NGS data for CRISPR (or any engineered endonuclease) activity and screen for clones. Screen for NHEJ or multiple HDR events concurrently.

crisprcasfinder

A Perl script that identifies CRISPR arrays and associated Cas proteins in DNA sequences

cruzdb

Python access to the UCSC genomes database

csa

Cyclic DNA Sequence Aligner

csama

Course material for CSAMA: Statistical Data Analysis for Genome Scale Biology

csar

A contig scaffolding tool using algebraic rearrangements.

csi

Complementary Sequence Index (CSI)

csvkit

A suite of utilities for converting to and working with CSV, the king of tabular file formats.

csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang

cutfree

Random nucleotide sequences without restriction sites

cwltool

Common Workflow Language reference implementation

cyntenator

Cyntenator is software for identifying conserved syntenic blocks between multiple genomes.

daligner

Find all significant local alignments between reads
