wangdi2014
Type: User
Location: Potomac, Maryland
An ongoing repository of data on coronavirus cases and deaths in the U.S.
This project is a complete genome analysis of the SARS-CoV-2 virus (COVID-19), taken from the Wuhan-Hu-1 isolate sample. I cleaned the genome sample to obtain an RNA sequence and verified the number of base pairs in the virus. Using the concept of Kolmogorov complexity, I estimated how small a compressed version of the genome can be: the "LZMA" algorithm compressed it into an 8.412 kb file. Then I converted the RNA sequence into a DNA string so that it could be read as codons. This let me map the codons to the 20 amino acids from which the genome's protein sequences are built. Further, I wrote a decoder that translates the genome into its reading-frame sequences. From these reading-frame sequences, I extracted the polypeptides and long-chain polypeptides in the virus. Then I analyzed the Open Reading Frames (ORFs) of the SARS-CoV-2 virus, which encode 10 different proteins responsible for the synthesis and catalytic processes of COVID-19 in the human body. Finally, I verified the lengths of these proteins (ORF1a, ORF1b, Spike Glycoprotein, Membrane, ORF6, ORF7a, ORF8, ORF10), so the project confirms the scientific findings using data science concepts.
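The steps described above (compression as a complexity estimate, RNA-to-DNA conversion, codon translation, and polypeptide extraction) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `rna_to_dna`, `compressed_size`, `translate`, and `polypeptides` are hypothetical, the input is a toy sequence rather than the Wuhan-Hu-1 genome, and the standard genetic code is assumed. Note that a compressed size is only an upper bound on Kolmogorov complexity, which itself is not computable.

```python
import lzma

# Standard genetic code (assumed), built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rna_to_dna(rna: str) -> str:
    """Convert an RNA string to its DNA spelling (U -> T)."""
    return rna.upper().replace("U", "T")


def compressed_size(seq: str) -> int:
    """Size in bytes of the LZMA-compressed sequence.

    This is an upper bound on the sequence's Kolmogorov complexity,
    up to the constant cost of the decompressor.
    """
    return len(lzma.compress(seq.encode("ascii")))


def translate(dna: str, frame: int = 0) -> str:
    """Translate DNA in the given reading frame (0, 1, or 2).

    Stop codons become '*'; codons with unknown bases become 'X'.
    """
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def polypeptides(protein: str, min_len: int = 1) -> list[str]:
    """Split a translated frame at stop codons and keep the longer chains."""
    return [p for p in protein.split("*") if len(p) >= min_len]


if __name__ == "__main__":
    dna = rna_to_dna("AUGGCCUAA")   # toy input, not the real genome
    print(translate(dna))           # "MA*"
    print(polypeptides(translate(dna)))
    print(compressed_size(dna * 1000))
```

Raising `min_len` in `polypeptides` is one simple way to keep only the long chains that are candidate proteins; a fuller ORF finder would also require a start codon and scan all three frames.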
ACE2 expression and cigarette exposure
COVID-19 detection in chest X-ray images using a Convolutional Neural Network.
cpgStats is a C application that parses a CpG dinucleotide file and produces summary statistics and BED file annotation.
Code for compound-protein interaction (CPI) prediction based on a graph neural network (GNN) for compounds and a convolutional neural network (CNN) for proteins.
C++ High Performance, published by Packt
Gene/transcript expression; Fusion; de novo assembly
CPTAC3 RNA-seq splicing pipeline
Methylation array analysis pipeline for CPTAC
Code and analysis results for the CRC shotgun meta-analysis
As part of an overall strategy for improving therapies for childhood cancers, the PPTC seeks to develop models for the types of tumors that will be encountered in early phase clinical testing by establishing patient-derived xenografts (PDXs) from high-risk childhood cancers refractory to current standard-of-care treatments. Genomic profiling of these models is required to enable PPTC investigators to develop robust "responder hypotheses" when drug activity is observed. With funding provided by Alex's Lemonade Stand Foundation, we genomically characterize a major subset of 286 PDX models. We use whole exome sequencing, transcriptome sequencing, and SNP arrays to characterize the tumor models. The focus on DNA and RNA sequencing data mirrors the current standard practice in most clinical diagnostics labs that use these technologies to detect the spectrum of targetable mutations, gene amplifications, and gene fusion events relevant to preclinical drug development.
Analyze NGS data for CRISPR (or any engineered endonuclease) activity and screen for clones. Screen for NHEJ or multiple HDR events concurrently.
CRISPR/Cas9 guide RNA Design
Scripts that make PDI IDI calculations
A Perl script for identifying CRISPR arrays and associated Cas proteins in DNA sequences
CRISPRDetect: A flexible algorithm to define CRISPR arrays
CRISPR discovery pipeline
Python access to the UCSC genomes database
Cyclic DNA Sequence Aligner
Course material for CSAMA: Statistical Data Analysis for Genome Scale Biology
A contig scaffolding tool using algebraic rearrangements.
Complementary Sequence Index (CSI)
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
A cross-platform, efficient and practical CSV/TSV toolkit in Golang
Random nucleotide sequences without restriction sites
Common Workflow Language reference implementation
Cyntenator is software for identifying conserved syntenic blocks between multiple genomes.
Find all significant local alignments between reads