Giter Club home page Giter Club logo

sequoia's People

Contributors

jiscah avatar olivroy avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

sequoia's Issues

Error: Sibship number out of bounds

Hi Jisca,

I'm taking your advice and simulating pedigrees to test the use of sequoia for my intended use case. I'm running many many replicates across some parameters of interest (depth of simulated pedigree, probability of a female choosing to reproduce asexually, tolerance for sib-mating) and I've tested my pedigree simulation function quite well.

I have encountered the following error while trying to compile a markdown file for some results:

Quitting from lines 2031-2148 (analysis.Rmd) 
Error in SeqParSib(ParSib = "sib", FortPARAM, GenoM, LhIN = LifeHistData,  : 
    ERROR! *** setParTmp: Sibship number out of bounds *** Please report this bug.
Calls: <Anonymous> ... suppressMessages -> withCallingHandlers -> sequoia -> SeqParSib
Execution halted

This does not happen every time. In this case, it happened on the 41st replicate scenario. Unfortunately because it occurred while compiling a markdown file, I cannot retrieve the genotype matrix and pedigree than produced the error.

I am not at all sure what this means but it does instuct me to report the bug, so here I am!

Best,
Ollie

Option suggestion "Overlapping=T or F" & Analysis questions

Dear Jisca,

I know that you are working hard on a new version of sequoia. Maybe It is the time to suggest you an option :

1- Option suggestion :
Sometimes i works with population without overlapping generations (hatchery process)... Indeed, I am always confident that parents are contained N generation and individuals of N+1 are offsprings of N individuals. One potential problem of such design is that as the generation goes the inbreeding increase and decrease our power of reconstructing the pedigree.

2- My trick to analyse without overlapping generations :
Some context elements to understand the design. I am analyzing a 4 generation massale selection populations. Individuals at generation N are challenged against a virus, and the survivors are mated together (the exact pedigree is unknow) to produce N+1 individuals and so on.

Actually, I analyze the dataset by take the individuals of generation N (I set the "By" column in Lfh table to 1 and sex with 4) and for N+1 individuals (I assign set the "By" column in Lfh table to 2 and sex with 4). By iteration, I get the whole pedigree with less wholes compare to the strategy where i analyze the dataset with everybody with all the generation.

Do you have an smart/better idea ?

I hope my frenchy english is good enough to explain my problem and solution.

Error from R when trying to export dataset after a sequoia run.

Dear Jisca,

Again It is me.
Everything is ok with the new version of Sequoia, but I remark that is impossible to export a .Rdata after a sequoia run.
The error message is weird :

##-----------------------------------------------------------------------------

EstimateAssignmentPerformance <- function (geno=NA,ped=NA,lfh=NA,nb_repetition=NA,nb_loci=
[... more lines with Sequoia in it ...]
}

Run the function with small numbers

test <- EstimateAssignmentPerformance(geno=geno, ped=ped1,lfh=lfh1, nb_repetition=1 , nb_loci=c(5,10), ErroR=0.01 )
[... output from the function ...]

Trying to export test dataset as test.Rdata

name <- paste("test",version,".Rdata",sep="")
mypath4 <- file.path("G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug",name,fsep = .Platform$file.sep)
print(mypath4)

the output of this line is not OK... I should have only one line

[1] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/testx86_64-w64-mingw32.Rdata"
[2] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/testx86_64.Rdata"
[3] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/testmingw32.Rdata"
[4] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/testx86_64, mingw32.Rdata"
[5] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/test.Rdata"
[6] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/test3.Rdata"
[7] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/test4.1.Rdata"
[8] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/test2017.Rdata"
[9] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/test06.Rdata"
[10] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/test30.Rdata"
[11] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/test72865.Rdata"
[12] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/testR.Rdata"
[13] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/testR version 3.4.1 (2017-06-30).Rdata"
[14] "G:/IFREMER/02-PROJET/GENOYSTER/07_Sequoia_bug/ReproduceBug/testSingle Candle.Rdata"

save(test,file=mypath4)
Error in gzfile(file, "wb") : invalid 'description' argument
In addition: Warning message:
In if (!nzchar(file)) stop("'file' must be non-empty string") :
the condition has length > 1 and only the first element will be used
##-----------------------------------------------------------------------------

I attach folder to reproduce the bug.
Cheers,

JB

Fortran runtime error on Ubuntu 16.04

Hi,

I'm trying to run Sequoia on an Ubuntu server (16.04, running Microsoft R Open 3.3.2, with the current github version of Sequoia, 0.5).
This is my command:
ParOUT <- sequoia(GenoFile = GenoDataFile,LifeHistFile = "LifeHistFile_assigned_SA.txt", NumSibRounds=10, MaxSibshipSize=10000, MaxMismatch = 20, WriteFiles = TRUE)

And that's the output:

At line 7488 of file Sequoia.f90
Fortran runtime error: No such file or directory

I do get AgePriors.txt and SequoiaSpecs.txt in sequoiaOUT/ folder, so these functions work.

Same function with the same data works on windows.

Thanks, Ido

ErrM not found with DuplicateCheck

From the example in the vignette, the call to sequoia with MaxSubIter=1, you get an error that it can't find ErrM in the DuplicateCheck function. However, ErrM is an element of ParOUT, so you should make sure to pass ParOUT$ErrM to DuplicateCheck. Here is code that gets the error and shows a work-around.

`
library(sequoia")
data(Ped_HSg5, LH_HSg5)
set.seed(1000)
Geno <- SimGeno(Ped = Ped_HSg5, nSnp = 200)
Geno[1:5,1:5]
ParOUT <- sequoia(GenoM = Geno, LifeHistData = LH_HSg5, MaxSibIter = 0)
names(ParOUT)
table(is.na(ParOUT$PedigreePar[,2]))
SeqOUT <- sequoia(GenoM = Geno, SeqList = ParOUT, MaxSibIter = 1)

gets error

SeqOUT <- sequoia(GenoM = Geno, SeqList = ParOUT, MaxSibIter = 1, Err=1e-4)

gets error, but this allows ErrM to be found so it can run

ErrM <- ParOUT$ErrM
SeqOUT <- sequoia(GenoM = Geno, SeqList = ParOUT, MaxSibIter = 1)
`

Sequoia no longer on CRAN

The package has been removed as seen here: link

The page says this:

Package ‘sequoia’ was removed from the CRAN repository.

Formerly available versions can be obtained from the archive.

Archived on 2021-12-06 as check problems were not corrected in time.

R version for sequoia

Hi,
Could you please clarify what version of R I should install to run sequoia?
I have tried 4.2.0 and 4.0.0 and in both cases I get the message "package ‘sequoia’ is not available (for R version 4.0.0)".
Thank you so much!!!!!!!!
Leo

Sequoia running forever

Hi.

I am running sequoia for 4702 genotyped animals with 30K SNP, 500 SNP, and 200 SNP.
4702 animals are the same birth year and have no pedigree, they are from fish cages, produced by m x n parents but I did not know how many. I would like to test if there is a ped construct for the data.
It seemed Sequoied is running non-stop for almost a week without any output with 200SNPs even in my Mac (16 GBs) and HPC (5 cores, 124 Gbs). I am using Sequoia_notR.

Could you please help me with how to faster the run or/and parameter setting for my data?

Much appreciated!

Vu Nguyen

Sequoia crashes when embedded in "loop for" in R

Dear Jisca,

I am trying to make power analysis on various data quality (mendelian errors and segregation distortions).
My "R loops for" are ok, but as soon as I uncomment sequoia part, the program crashes sometimes.
You could download ".Rdata" (geno + lfh +ped) and the R code to reproduce this error.

For example, I able to run very large problem using sequoia (2000 snp across 200 individuals) alone... but as soon sequoia is embledded in a loop for, bug could appear with a rather unpredictable fashion. This make bug tracking quite tedious.

Initially I was suspecting a problem with Rstudio 0.9 and R 3.4.0. This morning, I reinstall R 3.4.1 and the lastest Rstudio 1.0.153.
With the previous install, R session is crashing when I embedded sequoia in a for loop, the bug is not stable sometimes it happen, sometimes not.

My ideas (I am not a specialist) :

  • some heavy objects are stored in memory (it will crashes the program when trying to re-test something).
  • conflicts between globals variables

I am searching stables solutions to make a power analysis (avoid to crashes a the end of 1000 iterations).
1- I will re-install sequoia using " install_github("JiscaH/sequoia")".
2- Use the same code on different computer (linux).
3- Use sequoia from Fortran version (on linux wrapped in bash script)... But the documentation is scarce, especially for a fortran newbies as me. Could you provide even rough scripts of fortran equivalent of :
-sequoia(GenoM = GM, LifeHistData = lfh, MaxSibIter = 0, Err=0.1, MaxMismatch = threshold,quiet=TRUE)
-sequoia(GenoM = GM, SeqList=ParOUT, MaxSibIter=10 , MaxSibshipSize=30,quiet=TRUE)
-PedCompare(Ped1=ped,Ped2=SeqOUT$Pedigree).

Best regards,

JB

ReproduceBug.zip

Silly question for a very usefull package

Hi Jisca,

I am sure that you have a R trick to access to dam object stored in compareOUT$Counts...

get assignation statistic for dam

compareOUT <- PedCompare(Ped1=ped1,Ped2=SeqOUT$Pedigree)
dam <- compareOUT$Counts[1] ## NOT WORKING
dam <- unlist(compareOUT$Counts)[1] ## NOT WORKING

best regards,

JB

segmentation fault error in Sequoia

Hi,
I'm running a large dataset in sequoia that is exiting with the following error:
Sibships - Initial Total LL :
[1] -232355
Round 01 end, Total LogLik:
[1] -228607.6
No. dams, sires (incl. dummies):
[1] 362 362

*** caught segfault ***
address (nil), cause 'unknown'

Traceback:
1: SeqParSib(ParSib = "sib", Specs = Specs, GenoM = GenoM, LhIN = LifeHistData[, 1:3], AgePriors = Ag$
2: sequoia(GenoM = GenoM, LifeHistData = LH, MaxSibIter = 10, MaxMismatch = 10, UseAge = "yes", FindM$
An irrecoverable exception occurred. R is aborting now ...
/var/spool/slurmd/job34109895/slurm_script: line 25: 164736 Segmentation fault (core dumped) Rscript $

Other, smaller, data files complete normally. I have reproduced this error across R console v3.5.1 and RStudio v1.1.456 on my MAC running Sierra, as well as on an HPC. In all cases, the package runs normally most of the way through the analysis before exiting with the seg fault error. Hoping to find a fix for this, thanks for any suggestions!

Incorrect "impossible" relationships in CalcPairLL

Dear Jisca,

First, thanks for writing such a useful package.

I have encountered an unusual behaviour which I don't know how to resolve. I have genotypes of parents and offspring, and for the vast majority of them (48/51) sequoia correctly assigns the offspring using the main sequoia() function. The scenario is simple: several controlled monogamous crosses where maternal and paternal ID is exactly known. There are three for whom the mother cannot be assigned even though the father is correctly inferred and all of the other offspring in that family are assignable.

When I use the CalcPairLL() function specifying "PO" as the focal relationship for all pairs (each pair being a unique combination of possible parents and the orphan offspring) the output shows that for most pairs all relationships are considered impossible (LL = 777.00). For instance here is the output table for one such offspring:

ID1    ID2 Sex1 Sex2 AgeDif focal patmat dropPar1 dropPar2      PO      FS      HS  GP      FA      HA       U TopRel  LLR
C9_310521_04 C01220    3    1      1    PO      1     none     none  777.00  999.00  999.00 999  999.00  999.00 -242.01      U   NA
C9_310521_04 C10223    3    2      1    PO      2     none     none  777.00  999.00  999.00 999  999.00  999.00 -222.22      U   NA
C9_310521_04 C01222    3    1      1    PO      1     none     none  777.00  999.00  999.00 999  999.00  999.00 -232.22      U   NA
C9_310521_04 C01232    3    2      1    PO      2     none     none  777.00  999.00  999.00 999  999.00  999.00 -229.85      U   NA
C9_310521_04 C01223    3    2      1    PO      2     none     none  777.00  999.00  999.00 999  999.00  999.00 -228.17      U   NA
C9_310521_04 C01213    3    1      1    PO      1     none     none  777.00  999.00  999.00 999  999.00  999.00 -236.77      U   NA
C9_310521_04 C01230    3    2      1    PO      2     none     none  777.00  999.00  999.00 999  999.00  999.00 -217.27      U   NA
C9_310521_04 C01224    3    1      1    PO      1     none     none  777.00  999.00  999.00 999  999.00  999.00 -227.89      U   NA
C9_310521_04 C01211    3    1      1    PO      1     none     none  777.00  999.00  999.00 999  999.00  999.00 -222.45      U   NA
C9_310521_04 C01231    3    1      1    PO      1     none     none  777.00  999.00  999.00 999  999.00  999.00 -224.64      U   NA
C9_310521_04 C10133    3    1      1    PO      1     none     none  777.00  999.00  999.00 999  999.00  999.00 -243.79      U   NA
C9_310521_04 C01221    3    2      1    PO      2     none     none  777.00  999.00  999.00 999  999.00  999.00 -231.01      U   NA
C9_310521_04 C01219    3    2      1    PO      2     none     none  777.00  999.00  999.00 999  999.00  999.00 -248.07      U   NA
C9_310521_04 C01227    3    2      1    PO      2     none     none -205.81 -211.12 -211.18 777 -211.32 -211.18 -224.29     PO 5.31
C9_310521_04 C01233    3    1      1    PO      1     none     none  777.00  999.00  999.00 999  999.00  999.00 -215.57      U   NA
C9_310521_04 C01228    3    1      1    PO      1     none     none  777.00  999.00  999.00 999  999.00  999.00 -219.18      U   NA
C9_310521_04 C01234    3    2      1    PO      2     none     none  777.00  999.00  999.00 999  999.00  999.00 -227.95      U   NA
C9_310521_04 C01210    3    2      1    PO      2     none     none  777.00  999.00  999.00 999  999.00  999.00 -251.84      U   NA

In this case, the father is correctly assigned (C01227 in column ID2) but for any other potential parent, the relationship is considered impossible and no LL is calculated. The true mother here is C01228. I confess I have no idea what this means for how the software is behaving. Can you please advise?

Best,
Oliver

Edit:

I should add that when using the primary function sequoia() these three offspring have no mother assigned to them, rather than an incorrect mother. Here is a sample output:

$PedigreePar
id    dam   sire LLRdam LLRsire LLRpair OHdam OHsire MEpair
C9_310521_04   <NA> C01227     NA    4.60      NA    NA      0     NA

Feature Request: Generation Time

I have a dataset of tortoises that I'm running through sequoia, then plotting with pedtools. I'm using the true birth years of the tortoises as a prior (birth years range from 1960-2020), but unfortunately the sibship analysis run by sequoia allows parent-offspring relations between tortoises that are only 1-3 years apart. In reality, we know it takes ~10 years for a tortoise to become reproductively active. When I change birth year to "generation" (e.g. 1, 2, 3, 4, etc.), I get many more spurious relationships (five tortoises become their own ancestors).

It seems like using the true birth years as a prior, then just manually changing the resulting pedigree from impossible parent-offspring relationships to sibling relationships resolves this issue. I wonder, however, how complicated it would be to include generation time or minimum age at reproduction, as another optional parameter.

Thanks for the great package. It's proving very useful in my work.

sequoia error can't find ErrM

From the example in the vignette, the call to sequoia with MaxSubIter=1, you get an error that it can't find ErrM in the DuplicateCheck function. However, ErrM is an element of ParOUT, so you should make sure to pass ParOUT$ErrM to DuplicateCheck. Here is code that gets the error and shows a work-around.

library(sequoia") data(Ped_HSg5, LH_HSg5) set.seed(1000) Geno <- SimGeno(Ped = Ped_HSg5, nSnp = 200) Geno[1:5,1:5] ParOUT <- sequoia(GenoM = Geno, LifeHistData = LH_HSg5, MaxSibIter = 0) MaxSibIter = 0) names(ParOUT) table(is.na(ParOUT$PedigreePar[,2])) args(sequoia) SeqOUT <- sequoia(GenoM = Geno, SeqList = ParOUT, MaxSibIter = 1) SeqOUT <- sequoia(GenoM = Geno, SeqList = ParOUT, MaxSibIter = 1, Err=1e-4) ErrM <- ParOUT$ErrM SeqOUT <- sequoia(GenoM = Geno, SeqList = ParOUT, MaxSibIter = 1)

Error in SeqParSib : 'AgePriors' matrix should have size nAgeClasses * 8

Dear Jisca,

I am trying to reconstruct pedigree (at least parents-offsprings relationship) in a lab population under experimental infection during 4 generation. I only know the age of individual... In some cases (limited) I know the sex of individual. The good thing is that the design is not with overlapping generation.

My intuition is telling me, that is a format error... but I am not more idea.

Error in SeqParSib("par", Specs, GenoM, LifeHistData, AgePriors, Parents = NULL, :
'AgePriors' matrix should have size nAgeClasses * 8
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf

I added to files to reproduce the error (I am on window 7 64 bit, using sequoia 0.9.3).

geno2_debug.txt
lfh2_debug.txt

Cheers,
JB

Known crosses as prior info

Dear, thank you for releasing this package which has been very helpful for my analysis.
I have searched for information on this and have not been able to find out if it is possible to tell sequoia in any way cross pairs as a priori information. I have pairs of males and females that we know were crossed, but we need to assign which offspring were generated by those crosses. In doing so without prior, and unfortunately due to a low amount of high quality SNPs we receive (app 50), the generated assignment does not correspond correctly to the crosses we know a priori, i.e. offspring are assigned to mothers and fathers that we know were not outcrossed.

Is there a way to incorporate this information into the analysis? Thank you in advance for your help.

installation of package ‘/var/folders/dc/5j3hs53s2cn0lhr1ps49slfw0000gn/T//RtmpNYHr2z/file1c0916e8a227/sequoia_2.3.3.tar.gz’ had non-zero exit status

Hi,
I am trying to get sequoia for R, and I am running into some errors. Any help would be appreciated.
Thank you!

Downloading GitHub repo JiscaH/sequoia@stable
✔ checking DESCRIPTION meta-information ... /5j3hs53s2cn0lhr1ps49slfw0000gn/T/RtmpNYHr2z/remotes1c093ea83438/JiscaH-sequoia-85ea741/DESCRIPTION’ ...
─ cleaning src
Warning: file 'sequoia/cleanup' did not have execute permissions: corrected
Warning: file 'sequoia/cleanup' did not have execute permissions: corrected

  • installing source package ‘sequoia’ ...
    ** using staged installation
    ** libs
    gfortran -mmacosx-version-min=10.13 -fno-optimize-sibling-calls -fPIC -Wall -g -O2 -c Sequoia.f90 -o Sequoia.o
    make: gfortran: No such file or directory
    make: *** [Sequoia.o] Error 1
    ERROR: compilation failed for package ‘sequoia’
  • removing ‘/Library/Frameworks/R.framework/Versions/4.2/Resources/library/sequoia’
    Warning message:
    In i.p(...) :
    installation of package ‘/var/folders/dc/5j3hs53s2cn0lhr1ps49slfw0000gn/T//RtmpNYHr2z/file1c0916e8a227/sequoia_2.3.3.tar.gz’ had non-zero exit status

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.