alphaimpute2's Introduction

AlphaImpute2

AlphaImpute2 is a phasing and imputation algorithm for massive livestock populations. The method uses an approximate version of multi-locus iterative peeling for pedigree-based imputation, and a novel imputation algorithm that uses the Positional Burrows Wheeler Transform for population imputation. AlphaImpute2 has been successfully used to perform imputation in populations of hundreds of thousands of individuals. AlphaImpute2 was developed by Andrew Whalen, and is currently supported by Andrew Whalen and Steve Thorn.

Availability

The AlphaImpute2.zip file contains a python3 wheel file, a manual, and an example dataset.

Conditions of use

AlphaImpute2 is part of a suite of software that our group has developed. It is fully and freely available for all use under the MIT License.

Suggested Citation

Whalen, A., and J.M. Hickey (2020). AlphaImpute2: Fast and accurate pedigree and population based imputation for hundreds of thousands of individuals in livestock populations. bioRxiv. https://doi.org/10.1101/2020.09.16.299677

alphaimpute2's People

Contributors

alphagenes-admin, andrew-whalen-roslin, gregorgorjanc, sgavril, xingertang, yqiqichen

alphaimpute2's Issues

Unclear input file formats

The distributed AlphaImpute2.pdf manual states, in the Genotype File subsection of the Input File Format section:

"The remaining values are the genotypes of the individual at each locus, either 0, 1, or 2 (or 9 if missing)."

It is unclear whether "0" stands for a homozygous genotype for the first allele, "1" for a heterozygous genotype, and "2" for a homozygous genotype for the other allele. This is the usual convention in other software, but it can be confusing for newcomers to your software.

Please clarify this in the documentation.
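
To make the reading described above concrete, here is a minimal Python sketch that tallies the codes on one genotype line. It assumes the common allele-dosage convention (0 and 2 homozygous, 1 heterozygous), which the quoted manual text does not confirm; only 9 = missing is stated there. The function name `summarise_genotype_line` and the sample line are hypothetical.

```python
from collections import Counter

# Assumed interpretation (not confirmed by the manual): allele-dosage coding.
CODE_LABELS = {
    "0": "homozygous (first allele)",
    "1": "heterozygous",
    "2": "homozygous (other allele)",
    "9": "missing",  # the only meaning the manual states explicitly
}


def summarise_genotype_line(line):
    """Split 'ID g1 g2 ...' and count how often each genotype code occurs."""
    fields = line.split()
    if not fields:
        raise ValueError("empty genotype line")
    individual, codes = fields[0], fields[1:]
    unknown = sorted(set(codes) - set(CODE_LABELS))
    if unknown:
        raise ValueError(f"unexpected genotype codes for {individual}: {unknown}")
    counts = Counter(codes)
    # Report only the labels that actually occur on this line.
    return individual, {CODE_LABELS[c]: counts[c] for c in CODE_LABELS if counts[c]}


if __name__ == "__main__":
    print(summarise_genotype_line("id1 0 1 2 9 1 0"))
```

Rejecting any code outside 0, 1, 2, and 9 is a design choice of this sketch, so that a mis-coded file fails loudly instead of being imputed silently.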

Successful installation but cannot run program

Hello,

I just read the pre-print for AlphaImpute2 and I am excited to test it out. After installing, I tried to run AlphaImpute2 without any arguments and was met with the following error:

[stefan@stefan-81x2 AlphaImpute2]$ pip uninstall AlphaImpute2 -y
Found existing installation: AlphaImpute2 0.0.3
Uninstalling AlphaImpute2-0.0.3:
  Successfully uninstalled AlphaImpute2-0.0.3
[stefan@stefan-81x2 AlphaImpute2]$ pip install dist/AlphaImpute2-0.0.3-py3-none-any.whl 
Defaulting to user installation because normal site-packages is not writeable
Processing ./dist/AlphaImpute2-0.0.3-py3-none-any.whl
Requirement already satisfied: numba>=0.49.0 in /home/stefan/.local/lib/python3.8/site-packages (from AlphaImpute2==0.0.3) (0.52.0)
Requirement already satisfied: numpy>=1.19 in /home/stefan/.local/lib/python3.8/site-packages (from AlphaImpute2==0.0.3) (1.19.4)
Requirement already satisfied: llvmlite<0.36,>=0.35.0 in /home/stefan/.local/lib/python3.8/site-packages (from numba>=0.49.0->AlphaImpute2==0.0.3) (0.35.0)
Requirement already satisfied: setuptools in /usr/lib/python3.8/site-packages (from numba>=0.49.0->AlphaImpute2==0.0.3) (50.3.2)
Installing collected packages: AlphaImpute2
Successfully installed AlphaImpute2-0.0.3
[stefan@stefan-81x2 AlphaImpute2]$ AlphaImpute2
Traceback (most recent call last):
  File "/usr/bin/AlphaImpute2", line 5, in <module>
    from alphaimpute2.alphaimpute2 import main
  File "/home/stefan/.local/lib/python3.8/site-packages/alphaimpute2/alphaimpute2.py", line 4, in <module>
    from .Imputation import ParticlePhasing
  File "/home/stefan/.local/lib/python3.8/site-packages/alphaimpute2/Imputation/ParticlePhasing.py", line 19, in <module>
    from . import PhasingObjects
  File "/home/stefan/.local/lib/python3.8/site-packages/alphaimpute2/Imputation/PhasingObjects.py", line 17, in <module>
    from . import Imputation
  File "/home/stefan/.local/lib/python3.8/site-packages/alphaimpute2/Imputation/Imputation.py", line 2, in <module>
    from numba import jit, int8, int32, boolean, jitclass, float32, int64
ImportError: cannot import name 'jitclass' from 'numba' (/home/stefan/.local/lib/python3.8/site-packages/numba/__init__.py)

I'm not sure what is going wrong with the installation, and any suggestions would be greatly appreciated. If it is relevant, I am running Manjaro 20.2. Many thanks.
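
For context: the log above shows numba 0.52.0, and in recent Numba releases `jitclass` is provided by `numba.experimental` rather than by the top-level `numba` package. The sketch below shows one way a compatibility lookup could be written; it is not the maintainers' fix, and the helper name `resolve_jitclass` is hypothetical.

```python
import importlib


def resolve_jitclass():
    """Return the jitclass decorator from wherever this Numba exposes it.

    Returns None when Numba is missing or exposes jitclass in neither place.
    """
    # Try the newer location first, then the legacy top-level name.
    for module_name in ("numba.experimental", "numba"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        decorator = getattr(module, "jitclass", None)
        if decorator is not None:
            return decorator
    return None


if __name__ == "__main__":
    jitclass = resolve_jitclass()
    print("jitclass found" if jitclass is not None else "jitclass not available")
```

A package importing `jitclass` through such a lookup would work on either side of the move, at the cost of one extra indirection at import time.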

How does AlphaImpute2 handle sex chromosomes?

Hi all,

I've been working more closely with AlphaImpute2, and a question that has come up is how the software handles X chromosomes in males. I couldn't find anything about this in the bioRxiv paper or in the PDF user manual. Would the male X chromosome just be treated as diploid and homozygous at every SNP? Apologies if this has been answered elsewhere.

Impute non-genotyped individuals

Dear author,
I am very interested in this software. I want to impute individuals that are not genotyped but have phenotype and pedigree records. Does the software support this? The instruction manual does not cover it, so if it is supported, how should it be done? Thank you very much.

Start and End chromosome position

Hi,

I am trying to use AlphaPeel to impute some animals from 0 (ungenotyped) to 50K markers. To do this, I've included both the startsnp and endsnp options in my card.
Here is an example:

nSnp         ,       40417
InputFilePath, genoBullsImputation
pedigree, pedigreeGeno.txt
OutputFilePath, ImpResult14
nCycles, 20
runType, multi
startsnp, 23742
endsnp, 25166

Although it runs without any error message, the results seem very off, with an imputation accuracy (correlation) of around 0.10.
However, when running chromosome one (startsnp, 1; endsnp, 2628), the results look much better, with an accuracy of around 0.95.

My input files look like this:
genoBullsImputation

A00471368 1 0 0 0 1 1 1 1 2 0 0 2 0 0
A00471389 1 0 0 0 0 0 1 1 2 0 0 2 0 0 
A00471406 0 2 0 1 1 1 1 1 1 2 1 2 0 0 
A00471475 2 0 1 1 1 1 0 1 2 0 0 1 1 1

pedigreeGeno.txt

A11548968 A09601098 A11395392
A12225412 A09687794 A11381161
A11608547 A08687697 A11608143
A11608556 A09460260 A11341519
A11383528 A09939244 A10786854
A11383532 A08275585 A10481355

I also tried the files in the example folder, but they do not seem to work properly either.

I'd appreciate any suggestions about what might be wrong with my files.

Thank you.

Output posterior probabilities for alleles and genotypes

Discussed in #24

Originally posted by Leo4Luffy July 19, 2023
Good morning, I hope you are well. I have a quick question. I have imputed genomic data with AlphaImpute2 using all three of its approaches: imputation based on the pedigree, imputation based on the population, and imputation using both sources of information. From these runs I obtained the phased genotypes and the imputed genotypes. Does AlphaImpute2 also output genotype probabilities, as AlphaImpute does? I would appreciate any information you can give me. Have a nice day.

Issue with Phasing in AlphaImpute2

I have been running AlphaImpute2 today. Regardless of which dataset I use (my own or the example provided in the GitHub package), I receive the same error message. The imputation seems to run OK and writes a .genotype file, but the phasing always errors out at the very end and does not write the .phase file. My inputs are just a genotype file and a pedigree file, with the -phase_output flag set.

Below is the example:

AlphaImpute2 -genotypes ./AlphaImpute2-main/example/data/genotypes.txt -pedigree ./AlphaImpute2-main/example/data/pedigree.txt -out test -phase_output

I get the error AttributeError: 'tuple' has no attribute 'ndim'. The files are simply the example files in the package. I can provide others if needed, but figured the example files are a good start.

Any help would be much appreciated!

Josh

Are writekey and onlykeyed output options implemented?

Hi,

I tried running AlphaImpute2 with the following code:
AlphaImpute2 -genotypes data/Genotypes_all.txt \
    -pedigree data/ped_certain.txt \
    -out outputs/ai2_SW_geno -writekey genotypes -onlykeyed \
    -cycles 10 -maxthreads 12

I was hoping to get back imputed genotypes only for the individuals that we have sequenced and whose positions in our PLINK file we know. However, the -writekey and -onlykeyed flags are ignored: the run outputs every individual it can find in the pedigree, plus dummy individuals, in alphanumeric order rather than in the order of the genotype file. Looking at the AlphaImpute2 script, I noticed that these flags are never used. Is this feature not implemented as described in the manual? I appreciate any information.

Kind regards,

Sergio
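
Until the flags behave as documented, one possible workaround is to post-process the output. The sketch below keeps only individuals that appear in the input genotype file and restores that file's order. It assumes the output is whitespace-separated with the individual ID in the first column, like the input format; the function names and all paths are hypothetical.

```python
def read_ids(genotype_path):
    """Return individual IDs (first column) in file order."""
    ids = []
    with open(genotype_path) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                ids.append(fields[0])
    return ids


def filter_output(output_path, genotype_path, filtered_path):
    """Write only the rows of output_path whose ID is in genotype_path.

    Rows are written in the order of the genotype file. All rows sharing an
    ID are kept together, so two-row-per-individual files are handled too.
    Returns the number of individuals written.
    """
    rows = {}
    with open(output_path) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                rows.setdefault(fields[0], []).append(line.rstrip("\n"))
    kept = 0
    with open(filtered_path, "w") as out:
        for individual in read_ids(genotype_path):
            if individual in rows:
                out.write("\n".join(rows[individual]) + "\n")
                kept += 1
    return kept
```

This only trims the written output. It does not reduce run time or memory, because every pedigree member is still imputed.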

AlphaImpute2 limitations

Hello,
When I try to impute large files (~500K lines), it does not work.
The error message is:
raise ValueError(f"Incorrect number of values from {fileName}. Expected {self.nLoci} got {nLoci}.")
NameError: name 'fileName' is not defined

When I split these files into smaller chunks (50K lines each), it works fine. Does the program really have a limit on file size? I could not find this information in the manual or on GitHub. I thought tuning the 'length' parameter might help, but it does not.
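
The `ValueError` that the code was trying to raise (before the `NameError` masked it) reports a row whose number of values differs from the expected number of loci. That may point to an inconsistent row rather than a size limit, although this is only an inference from the message. A minimal sketch for checking a genotype file before running, assuming one whitespace-separated row per individual with the ID first; the function name `check_locus_counts` is hypothetical:

```python
def check_locus_counts(path):
    """Compare every row's genotype count with that of the first row.

    Returns (expected, problems), where problems is a list of
    (line_number, individual_id, n_values) for rows that differ.
    """
    expected = None
    problems = []
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            n_values = len(fields) - 1  # first field is the individual ID
            if expected is None:
                expected = n_values
            elif n_values != expected:
                problems.append((line_number, fields[0], n_values))
    return expected, problems
```

An empty `problems` list means every row has the same number of values as the first one; any entries identify the rows worth inspecting.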

Data conversion?

Hello!
Are there any guidelines on how to convert the AlphaImpute output back to a VCF file?

Thanks!
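
As a rough illustration of what such a conversion involves, the sketch below turns 0/1/2/9 genotype rows into unphased VCF genotype calls. It is not an official converter. It assumes that the codes count copies of the ALT allele, and that marker details (chromosome, position, ID, REF, ALT) come from a separate source such as a map file, since the genotype rows do not carry them. All names here are hypothetical.

```python
# Assumed mapping: the code is the number of ALT alleles; 9 is missing.
GT_BY_CODE = {"0": "0/0", "1": "0/1", "2": "1/1", "9": "./."}


def read_genotypes(handle):
    """Read 'ID g1 g2 ...' rows; return (ids, rows of genotype codes)."""
    ids, rows = [], []
    for line in handle:
        fields = line.split()
        if fields:
            ids.append(fields[0])
            rows.append(fields[1:])
    return ids, rows


def write_vcf(ids, rows, markers, out):
    """Write a minimal VCF with one record per marker.

    markers is a list of (chrom, pos, marker_id, ref, alt) tuples in the
    same order as the genotype columns.
    """
    if any(len(row) != len(markers) for row in rows):
        raise ValueError("every individual needs one genotype per marker")
    out.write("##fileformat=VCFv4.2\n")
    out.write('##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n')
    header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]
    out.write("\t".join(header + ids) + "\n")
    for locus, (chrom, pos, marker_id, ref, alt) in enumerate(markers):
        # Transpose: one VCF row per locus, one column per individual.
        calls = [GT_BY_CODE[row[locus]] for row in rows]
        fixed = [chrom, str(pos), marker_id, ref, alt, ".", ".", ".", "GT"]
        out.write("\t".join(fixed + calls) + "\n")
```

If the REF/ALT assignment differs from the allele that the codes count, the 0/0 and 1/1 calls would need to be swapped, so it is worth checking this against the file the genotypes were originally derived from.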

Is there a way to make AlphaImpute2 impute only the samples in the input file?

AlphaImpute2 takes a long time to run and uses a huge amount of memory because it imputes genotypes for every sample in the pedigree file. Is there a way to impute genotypes only for the samples present in the genotype input file? I tried the -onlykeyed flag, but it does not work.

Binary plink file input error

Hello,

I have binary input files generated by plink (.bed and .bim). I read in the documentation that AlphaImpute2 supports binary plink file input through alphaplinkpython, so I downloaded the most recent version (0.0.8). When I run AlphaImpute2, I get a file not found error.

First as a sanity check, the files are present:

(python3.6) [stefan@stefan-81x2 05_AlphaImpute2]$ ls
alphaImpute2.sh alphaplinkpython-0.0.8-cp36-cp36m-manylinux1_x86_64.whl testFile.bed testFile.bim

The command I executed:

AlphaImpute2 -bfile testFile -out testOut -maxthreads 4

And the output:
alphaImpute2output.txt

I am not sure how to go about fixing this. I first tried this with the file extension name in the input and that gave me similar results. Does anyone have any thoughts?

Still getting error: `cannot import name 'jitclass' from 'numba'`

Hi there,

I see that you have already tried to address this issue. However, after installing the pip wheel inside a Python 3.8 Singularity container on Linux, or in a conda environment on macOS, I still get the following error when trying to start the software:

username@hostname:~$ AlphaImpute2

Traceback (most recent call last):
  File "/usr/local/bin/AlphaImpute2", line 5, in <module>
    from alphaimpute2.alphaimpute2 import main
  File "/usr/local/lib/python3.8/site-packages/alphaimpute2/alphaimpute2.py", line 4, in <module>
    from .Imputation import ParticlePhasing
  File "/usr/local/lib/python3.8/site-packages/alphaimpute2/Imputation/ParticlePhasing.py", line 19, in <module>
    from . import PhasingObjects
  File "/usr/local/lib/python3.8/site-packages/alphaimpute2/Imputation/PhasingObjects.py", line 17, in <module>
    from . import Imputation
  File "/usr/local/lib/python3.8/site-packages/alphaimpute2/Imputation/Imputation.py", line 2, in <module>
    from numba import jit, int8, int32, boolean, jitclass, float32, int64
ImportError: cannot import name 'jitclass' from 'numba' (/usr/local/lib/python3.8/site-packages/numba/__init__.py)

Your assistance would be appreciated.

Best,

Ian

IndexError: tuple index out of range

Hi
I'm trying to check imputation accuracy by masking genotype data. The run works well with the original file, but I always get an error when I use a file in which I have randomly replaced genotypes with 9 (the missing value):
File "/home1/p314480/.local/lib/python3.10/site-packages/alphaimpute2/Imputation/BurrowsWheelerLibrary.py", line 65, in setup_library
bw_loci = np.array(range(self.haplotypes.shape[1]), dtype = np.int64)
IndexError: tuple index out of range
How could I solve this problem? Thank you!
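
The cause of this error cannot be determined from the traceback alone. One thing worth ruling out is the masking step itself changing the file layout, for example by altering the number of fields per row. A minimal masking sketch that keeps each row's ID and field count intact, assuming whitespace-separated rows with the ID first; the function name, the `fraction` parameter, and the fixed seed are hypothetical choices:

```python
import random


def mask_genotypes(in_path, out_path, fraction, seed=0):
    """Copy a genotype file, setting a random share of calls to 9 (missing).

    Each non-missing genotype is masked independently with probability
    `fraction`. Returns the number of genotypes that were masked.
    """
    rng = random.Random(seed)  # seeded so the masking is reproducible
    masked = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            codes = fields[1:]
            for i, code in enumerate(codes):
                if code != "9" and rng.random() < fraction:
                    codes[i] = "9"
                    masked += 1
            dst.write(" ".join([fields[0]] + codes) + "\n")
    return masked
```

Keeping the unmasked original alongside the masked copy allows imputed values to be compared with the true genotypes afterwards.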

Issue with OutputOnlyGenotypedAnimals option

Dear AlphaGenes team,
I'm using AlphaImpute2 to impute SNP array data (~60K SNPs) from offspring to WGS (~7M SNPs) to the parents. I also have a pedigree file going back 15 generations (~85K animals).
My issue is that the generated output contains imputed data for all the fish, including those that appear only in the pedigree. However, I only need the imputation for the genotyped animals.
In the AlphaImpute v1.9 manual (I can't find the manual for v2.0) there is an option called OutputOnlyGenotypedAnimals. I’ve tried this option, but it does not seem to work in the latest version of AlphaImpute (at least for me). I’ve also tried running the imputation without the pedigree, and that works in the sense that I only get output for the offspring. However, I would like to use the pedigree information as well, to improve the accuracy of the analysis.
Is it possible to use the OutputOnlyGenotypedAnimals option in v2? I wrote this option in the last box (BOX 8; Output), as shown below:
= BOX 8: Output =============================================================
WellPhasedThreshold ,99.0
ResultFolderPath ,Results
OutputOnlyGenotypedAnimals ,Yes

I appreciate your help

Cheers

Test example error

Running the test example raises the following error:

Traceback (most recent call last):
  File "/Users/evie/mambaforge/bin/AlphaImpute2", line 8, in <module>
    sys.exit(main())
  File "/Users/evie/mambaforge/lib/python3.10/site-packages/alphaimpute2/tinyhouse/Utils.py", line 8, in timer
    values = func(*args, **kwargs)
  File "/Users/evie/mambaforge/lib/python3.10/site-packages/alphaimpute2/alphaimpute2.py", line 304, in main
    write_out_data(pedigree, args)
  File "/Users/evie/mambaforge/lib/python3.10/site-packages/alphaimpute2/tinyhouse/Utils.py", line 8, in timer
    values = func(*args, **kwargs)
  File "/Users/evie/mambaforge/lib/python3.10/site-packages/alphaimpute2/alphaimpute2.py", line 255, in write_out_data
    pedigree.writePhase(args.out + ".haplotypes")
  File "/Users/evie/mambaforge/lib/python3.10/site-packages/alphaimpute2/tinyhouse/Pedigree.py", line 943, in writePhase
    if ind.haplotypes.ndim == 2:  # diploid
AttributeError: 'tuple' object has no attribute 'ndim'
