cfdnapro's Issues

Excluding chromosomes from analysis

I am trying to run cfDNAPro on data aligned to a "synthetic" genome containing puc and lambda sequences to check for methylation conversion.
I only want to look at reads aligned to chromosomes 1-22, X and Y, but I am struggling to read the BAM file into cfDNAPro:

read_bam_insert_metrics(bamfile = file.bam, genome_label="hg38-NCBI",chromosome_to_keep =append(1:22,c("X","Y")))

bamfile was supplied.
Reading bam into galp...
Curating seqnames and strand information...
Removing outward facing fragments ...
Correcting start and end coordinates of fragments ...
Error in .normarg_seqlengths(value, seqnames(x)) :
the length of the supplied 'seqlengths' vector must be equal to the
number of sequences
Calls: read_bam_insert_metrics ... seqlengths<- -> seqlengths<- -> .normarg_seqlengths
In addition: Warning message:
In .merge_two_Seqinfo_objects(x, y) :
Each of the 2 combined objects has sequence levels not in the other:

  • in 'x': KI270728.1, KI270727.1, KI270442.1, KI270729.1, GL000225.1, KI270743.1, GL000008.2, GL000009.2, KI270747.1, KI270722.1, GL000194.1, KI270742.1, GL000205.2, GL000195.1, KI270736.1, KI270733.1, GL000224.1, GL000219.1, KI270719.1, GL000216.2, KI270712.1, KI270706.1, KI270725.1, KI270744.1, KI270734.1, GL000213.1, GL000220.1, KI270715.1, GL000218.1, KI270749.1, KI270741.1, GL000221.1, KI270716.1, KI270731.1, KI270751.1, KI270750.1, KI270519.1, GL000214.1, KI270708.1, KI270730.1, KI270438.1, KI270737.1, KI270721.1, KI270738.1, KI270748.1, KI270435.1, GL000208.1, KI270538.1, KI270756.1, KI270739.1, KI270757.1, KI270709.1, KI270746.1, KI270753.1, KI270589.1, KI270726.1, KI270735.1, KI270711.1, KI270745.1, KI270714.1, KI270732.1, KI270713.1, KI270754.1, KI270710.1, KI270717.1, KI270724.1, KI270720.1, KI270723.1, KI270718.1, KI270317.1, KI270740.1, KI270755.1, KI270707.1, KI270579.1, KI270752.1, KI270512.1, KI27032 [... truncated]
    Execution halted

I get the same error when I subset the BAM file to only the relevant chromosomes using samtools view and then re-index it. I believe this is because the header retains the old chromosome names:


9 138394717 1374698 0
MT 16569 0 0
X 156040895 1739164 0
Y 57227415 7892 0
KI270728.1 1872759 0 0
KI270727.1 448248 0 0
KI270442.1 392061 0 0

Is there a way around this?
many thanks!
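
One way to check the suspicion about the header is to compare the contig names stored in the BAM header with the chromosomes you want to keep. The sketch below is only illustrative: `extra_contigs` and `toy_targets` are hypothetical names, and the toy values are copied from the per-contig lines quoted above. On a real file, `Rsamtools::scanBamHeader()` returns the header contigs as a named vector in its `targets` element.

```r
# Contigs to keep: autosomes 1-22 plus X and Y, as character names.
keep <- c(as.character(1:22), "X", "Y")

# Given the named vector of contig lengths from a BAM header,
# return the contig names that are not in `keep`.
extra_contigs <- function(targets, keep) {
  setdiff(names(targets), keep)
}

# Toy header built from some of the per-contig lines quoted in the post.
toy_targets <- c("9" = 138394717L, "MT" = 16569L, "X" = 156040895L,
                 "Y" = 57227415L, "KI270728.1" = 1872759L)
leftover <- extra_contigs(toy_targets, keep)
print(leftover)

# On a real file (needs the Rsamtools package; "file.bam" is a placeholder path):
# targets <- Rsamtools::scanBamHeader("file.bam")[[1]]$targets
# extra_contigs(targets, keep)
```

If this returns any names, the header still lists contigs outside the requested set, which would be consistent with the seqlengths mismatch reported in the error.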

Vignette Build Fails

Building the vignettes on Windows appears to fail:

── R CMD build ───────────────────────────────────────────────────────────────
✔ checking for file 'C:\Users\XXXXXX\AppData\Local\Temp\RtmpWCknBL\remotes43486a6b318f\hw538-cfDNAPro-e11dd61/DESCRIPTION' (537ms)
─ preparing 'cfDNAPro': (1.5s)
✔ checking DESCRIPTION meta-information ...
─ installing the package to build vignettes
E creating vignettes (55.5s)
--- re-building 'cfDNAPro.Rmd' using rmarkdown
Loading required package: magrittr

Registered S3 method overwritten by 'quantmod':
method from
as.zoo.data.frame zoo
Loading required package: ggplot2

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

   filter, lag

The following objects are masked from 'package:base':

   intersect, setdiff, setequal, union

Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Quitting from lines 277-295 (cfDNAPro.Rmd)
Error: processing vignette 'cfDNAPro.Rmd' failed with diagnostics:
Problem while computing count = n().
ℹ The error occurred in group 1: group = "cohort_1", insert_size = 142.
Caused by error in n():
! This function should not be called directly
--- failed re-building 'cfDNAPro.Rmd'

SUMMARY: processing the following file failed:
'cfDNAPro.Rmd'

Error: Vignette re-building failed.
Execution halted
Error: Failed to install 'cfDNAPro' from GitHub:
! System command 'Rcmd.exe' failed

sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.utf8
[2] LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] devtools_2.4.5 usethis_2.1.6

loaded via a namespace (and not attached):
[1] tidyselect_1.2.0 remotes_2.4.2 purrr_0.3.5 carData_3.0-5
[5] colorspace_2.0-3 vctrs_0.5.1 generics_0.1.3 miniUI_0.1.1.1
[9] htmltools_0.5.3 utf8_1.2.2 rlang_1.0.6 pkgbuild_1.4.0
[13] urlchecker_1.0.1 ggpubr_0.5.0 pillar_1.8.1 later_1.3.0
[17] withr_2.5.0 glue_1.6.2 DBI_1.1.3 sessioninfo_1.2.2
[21] lifecycle_1.0.3 stringr_1.5.0 munsell_0.5.0 ggsignif_0.6.4
[25] gtable_0.3.1 htmlwidgets_1.5.4 memoise_2.0.1 callr_3.7.3
[29] fastmap_1.1.0 httpuv_1.6.6 ps_1.7.2 curl_4.3.3
[33] fansi_1.0.3 broom_1.0.1 Rcpp_1.0.9 xtable_1.8-4
[37] backports_1.4.1 scales_1.2.1 promises_1.2.0.1 cachem_1.0.6
[41] desc_1.4.2 pkgload_1.3.2 abind_1.4-5 mime_0.12
[45] fs_1.5.2 ggplot2_3.4.0 digest_0.6.30 stringi_1.7.8
[49] rstatix_0.7.1 processx_3.8.0 dplyr_1.0.10 shiny_1.7.3
[53] rprojroot_2.0.3 cowplot_1.1.1 grid_4.2.2 cli_3.4.1
[57] tools_4.2.2 magrittr_2.0.3 tibble_3.1.8 profvis_0.3.7
[61] crayon_1.5.2 car_3.1-1 tidyr_1.2.1 pkgconfig_2.0.3
[65] ellipsis_0.3.2 prettyunits_1.1.1 assertthat_0.2.1 rstudioapi_0.14
[69] R6_2.5.1 compiler_4.2.2
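
A possible workaround while the vignette fails to build is to install the package without building vignettes. This is a sketch under assumptions: the repository name `hw538/cfDNAPro` is inferred from the temporary path in the log, the remotes package is installed, and the wrapper name `install_without_vignettes` is hypothetical. It does not address the underlying `n()` error in the vignette.

```r
# Install from GitHub but skip the vignette build step that fails above.
# Assumption: the repository is hw538/cfDNAPro, as the temp path in the log suggests.
install_without_vignettes <- function(repo = "hw538/cfDNAPro") {
  remotes::install_github(repo, build_vignettes = FALSE)
}

# Uncomment to run (needs network access and the remotes package):
# install_without_vignettes()
```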

Problems plotting cohorts

Hi,

First of all, thank you for your contribution. I am following the demo described in https://bioconductor.org/packages/release/bioc/vignettes/cfDNAPro/inst/doc/cfDNAPro.html and it is not working for me. Sorry, but I'm very new to bioinformatics.

I have a folder called bam_metrics_cfDNA where I have two subfolders: cohort_1 and cohort_2. They both have different .txt files generated using picard CollectInsertSizeMetrics (the input for cfDNAPro).

First, when I run the code for plotting only one plot for cohort_1:

data_path <- "/Users/abelgd/Desktop/training_bioinfo/cfDNAs/bam_metrics_cfDNA"

cohort1_plot <- cfDNAPro::callSize(path = data_path) %>%
    dplyr::filter(group == as.character("cohort_1")) %>%
    cfDNAPro::plotSingleGroup()

I get this error:

cohort1_plot
$prop_plot
Error in draw_axis(break_positions = guide$key[[aesthetic]], break_labels = guide$key$.label, :
lazy-load database '/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/gtable/R/gtable.rdb' is corrupt
In addition: Warning messages:
1: In draw_axis(break_positions = guide$key[[aesthetic]], break_labels = guide$key$.label, :
restarting interrupted promise evaluation
2: In draw_axis(break_positions = guide$key[[aesthetic]], break_labels = guide$key$.label, :
internal error -3 in R_decompress1

I read that cohort1_plot is an S3 object with 3 ggplot objects so, how can I plot the data within it?
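
The `$prop_plot` line in the output suggests the returned object prints like a named list, so its elements can probably be reached by name. The error itself points at a corrupt gtable installation rather than at cfDNAPro; a commonly suggested remedy for a corrupt lazy-load database is to restart R and reinstall the affected package. A minimal sketch with a stand-in list (`plots` and its contents are hypothetical, not real cfDNAPro output):

```r
# Stand-in for the object returned by plotSingleGroup();
# the real one holds ggplot objects rather than a string.
plots <- list(prop_plot = "placeholder for a ggplot object")

# List the element names, then pull one element out by name.
plot_names <- names(plots)
first_plot <- plots[["prop_plot"]]  # same as plots$prop_plot

# With the real object, after restarting R and running install.packages("gtable"):
# names(cohort1_plot)
# print(cohort1_plot$prop_plot)
```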

Error multiplexing plots

Following the explanation in https://bioconductor.org/packages/release/bioc/vignettes/cfDNAPro/inst/doc/cfDNAPro.html I got an error when plotting the multiplex. I have previously organized the .txt files generated with picard CollectInsertSizeMetrics into a folder containing two subfolders: cohort_1 and cohort_2. They both are in:
data_path <- '/Users/abelgd/Desktop/training_bioinfo/cfDNAs/bam_metrics_cfDNA'

When running this:

grp_list<-list("cohort_1"="cohort_1",
                 "cohort_2"="cohort_2")
  
  result<-sapply(grp_list, function(x){
    result <-callSize(path = data_path) %>% 
      dplyr::filter(group==as.character(x)) %>% 
      plotSingleGroup()
  }, simplify = FALSE)  

I got the following error:

Error in plotSingleGroup(.) : could not find function "plotSingleGroup"

Also it gives me the option to Show Traceback or Rerun with Debug.

Does anyone know what is happening here?
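
A "could not find function" error usually means the package that provides the function has not been attached in the current session. Either call `library(cfDNAPro)` first or qualify each call with `cfDNAPro::`. The sketch below takes the second route; `plot_groups` is a hypothetical helper name, and it assumes cfDNAPro and dplyr are installed and that `callSize()` returns a data frame with a `group` column, as the filter in the post implies.

```r
# Build one set of plots per cohort, with every package call fully qualified
# so that nothing depends on library() having been run beforehand.
plot_groups <- function(data_path, groups) {
  # Read the insert size metrics once instead of once per group.
  sizes <- cfDNAPro::callSize(path = data_path)
  # Name the input so the returned list is named by cohort.
  lapply(stats::setNames(groups, groups), function(g) {
    cfDNAPro::plotSingleGroup(dplyr::filter(sizes, group == g))
  })
}

# Usage with the folder layout described in the post:
# result <- plot_groups(data_path, c("cohort_1", "cohort_2"))
```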
