scannotationt-paper's Introduction

Improved annotation with scRNA-seq and long reads

Recovering significant 10x Genomics single-cell RNA-seq signal through improved annotation with Oxford Nanopore (ONT) bulk long reads. Applied to chick embryo neuro-epithelial progenitors at 66 h of development.

GitHub repository started on 2021-04-02.

Repository content and structure

Root folders

  • data - Data used for the analyses (e.g. BAM, RDS, tsv files, ...)
  • docs - Rendered analysis reports (as html files) and figures generated by the notebook
  • env - Code used to create the conda environment for this study (e.g. bash script) and the corresponding YAML file.
  • figs - Figures created before the notebook creation and used as input
  • rmd - R Markdown analysis files
  • src - Reusable code (e.g. functions)

Root files

Files at the root of the repository serve general purposes:

  • _bookdown.yml - Configuration YAML file for the notebook
  • _build.log - Log file (stdout and stderr) generated when running the analysis
  • _build.sh - Bash script to run the analysis
  • _deploy.sh - Bash script to deploy the analysis on GitHub
  • index.Rmd - R Markdown file used for the setup of the analysis (load libraries, define variables, paths...)
  • README.md - This file
  • _workflowr.yml - Workflowr configuration file

Data

The data folder is currently organized as follows:

- notebook    # JavaScript code for bookdown output (do not touch)
- processed   # output generated when running the notebook
- raw         # raw data (input)
  - references
    - annotations
      - ensembl
      - ncbi
      - ucsc
  - rna-seq
    - single-cell
    - long-read
R Markdown files

Current notebook files include:

  • 01-Impact-ref-annotation-scRNA.Rmd

Here we explore the discrepancies between the reference annotations (Ensembl and RefSeq) and their impact on common scRNA-seq analyses.

  • 02-Incomplete-annotations-induce-signal-loss.Rmd

Here we study the loss of scRNA-seq signal (e.g. genes) caused by significant deficiencies in the reference annotation, especially in the 3' UTR annotations.

  • 03-Approaches-to-improve-transcriptome-with-Long-Reads.Rmd

We compare various tools dedicated to transcriptome reconstruction in bulk RNA-seq (StringTie2, Scallop), a dedicated signal detection approach, and broad 3' UTR extension, and apply them to scRNA-seq data and ONT bulk long reads.

  • 04-Impact-reannotation-scRNA.Rmd

We assess the impact of our various reannotations on common scRNA-seq analyses.

  • 05-Validation-of-novel-genes-with-scRNA.Rmd

We evaluate the ability of our approach to identify novel genes and use scRNA-seq analyses as a filter to highlight genes of biological interest in chick embryo neuro-epithelial progenitors.

  • 06-A-tool-and-pipeline-to-improve-annotation-for-scRNA.Rmd

Description of our pipeline and recommendations for using it on other scRNA-seq data.

  • 07-Session-info.Rmd

Session info output.
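
As an illustration of the broad 3' UTR extension mentioned for notebook 03, here is a minimal sketch of a strand-aware extension of gene 3' ends. It is not the implementation used in this study: the `Gene` class, the `extend_three_prime` function, the clipping rules (chromosome bounds and the nearest downstream gene on the same strand) and the extension length are all assumptions made for the example. Coordinates are 1-based and inclusive, as in GTF files.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Gene:
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"
    gene_id: str


def extend_three_prime(genes, extension, chrom_sizes):
    """Extend the 3' end of each gene by up to `extension` bases.

    The extension is clipped so that it neither leaves the chromosome
    nor overlaps the nearest downstream gene on the same strand.
    """
    extended = []
    for g in genes:
        same_strand = [
            o for o in genes
            if o is not g and o.chrom == g.chrom and o.strand == g.strand
        ]
        if g.strand == "+":
            # The 3' end is `end`; downstream means larger coordinates.
            limit = chrom_sizes[g.chrom]
            downstream = [o.start for o in same_strand if o.start > g.end]
            if downstream:
                limit = min(limit, min(downstream) - 1)
            new_end = max(g.end, min(g.end + extension, limit))
            extended.append(replace(g, end=new_end))
        else:
            # The 3' end is `start`; downstream means smaller coordinates.
            limit = 1
            downstream = [o.end for o in same_strand if o.end < g.start]
            if downstream:
                limit = max(limit, max(downstream) + 1)
            new_start = min(g.start, max(g.start - extension, limit))
            extended.append(replace(g, start=new_start))
    return extended
```

For example, `extend_three_prime(genes, 100, {"chr1": 2000})` would lengthen each gene in `genes` by at most 100 bases at its 3' end. A real pipeline would read and write GTF records and would likely handle overlapping genes and transcript-level features as well.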

External scripts

Current code files include:

- analysis
- pipeline
- preprocessing
- utilities

Run the code

To run the notebook and create the corresponding HTML files, you have two options:

  • In RStudio, click the Knit button (you may need to change the knit directory)
  • In a Linux terminal, run the script _build.sh with the command:

    bash _build.sh

The output will be stored in the docs folder.
