Comments (14)

chrchang commented on June 17, 2024

Ok. Can you post the full .log file for the failed run?

from plink-ng.

schmarki commented on June 17, 2024

PLINK v2.00a2LM 64-bit Intel (15 Feb 2018) www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to testchr22.log.
Options in effect:
--bgen ukb_imp_chr22_v2.bgen
--make-pgen
--out testchr22
--sample ukb672_imp_chr22_v2_s487406.sample

Start time: Fri Feb 16 12:08:33 2018
257847 MB RAM detected; reserving 128923 MB for main workspace.
Allocated 7259 MB successfully, after larger attempt(s) failed.
Using up to 32 threads (change this with --threads).
--bgen: 1255680 variants detected, format v1.2.

Error: File read failure.
End time: Fri Feb 16 12:08:34 2018

chrchang commented on June 17, 2024

Okay, pretty sure I know what the problem is, posting what I think is a fix to GitHub in a few minutes.

schmarki commented on June 17, 2024

Thanks a lot & again, thanks for developing this great tool!

ttasa commented on June 17, 2024

Similar issue with the 30 Jul 2018 build: it fails at seemingly random positions.
Below are three runs with slightly different parameters and three different points of failure.
Is this a bug, or something that has already been addressed in newer alpha releases?
1)
PLINK v2.00a2LM 64-bit Intel (30 Jul 2018) www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to ukb_hDMg_QC_14.log.
Options in effect:
--bgen b_imp_chr14_v3.bgen
--hardy
--memory 30000
--missing
--out b_hDMg_QC_14
--sample Bgen.sample
--threads 1

Start time: Tue Sep 11 14:18:15 2018
1031229 MiB RAM detected; reserving 30000 MiB for main workspace.
Using 1 compute thread.
--bgen: 3037521 variants detected, format v1.2.
487409 samples imported from .sample file to ukb_hDMg_QC_14-temporary.psam .
--bgen: 312k variants scanned.
Error: File read failure.
End time: Tue Sep 11 14:31:03 2018

2)
PLINK v2.00a2LM 64-bit Intel (30 Jul 2018) www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to ukb_hDMg_QC_14.log.
Options in effect:
--bgen b_imp_chr14_v3.bgen
--hardy
--missing
--out b_hDMg_QC_14
--sample Bgen.sample
--threads 4

Start time: Tue Sep 11 14:48:48 2018
257272 MiB RAM detected; reserving 128636 MiB for main workspace.
Using up to 4 compute threads.
--bgen: 3037521 variants detected, format v1.2.
487409 samples imported from .sample file to ukb_hDMg_QC_14-temporary.psam .
--bgen: 26k variants scanned.
Error: File read failure.
End time: Tue Sep 11 14:57:56 2018

3)
PLINK v2.00a2LM 64-bit Intel (30 Jul 2018) www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to ukb_hDMg_QC_14.log.
Options in effect:
--bgen b_imp_chr14_v3.bgen
--hardy
--missing
--out b_hDMg_QC_14
--sample Bgen.sample
--threads 1

Start time: Tue Sep 11 15:00:36 2018
257272 MiB RAM detected; reserving 128636 MiB for main workspace.
Using 1 compute thread.
--bgen: 3037521 variants detected, format v1.2.
487409 samples imported from .sample file to ukb_hDMg_QC_14-temporary.psam .
--bgen: 24k variants scanned.
Error: File read failure.
End time: Tue Sep 11 15:11:56 2018

chrchang commented on June 17, 2024

This looks like an unfixed bug; will try to replicate it today. Are there any differences between your b_imp_chr14_v3.bgen and the raw ukb_imp_chr14_v3.bgen file that I should be aware of?

ttasa commented on June 17, 2024

None. They are exactly the same.
The issue comes up with both --make-pgen and --make-bed.
However, if I subset just the first 700k variants on that chromosome, the run gets well past the variant counts at which the runs above failed.

Job log as of right now and still processing:

PLINK v2.00a2LM 64-bit Intel (30 Jul 2018) www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to ukb_hDMg_QC_14.log.
Options in effect:
--bgen SubsetChr14_700k.bgen
--hardy
--missing
--out b_hDMg_QC_14
--sample Bgen.sample
--threads 10

Start time: Tue Sep 11 17:32:30 2018
64225 MiB RAM detected; reserving 32112 MiB for main workspace.
Using up to 10 threads (change this with --threads).
--bgen: 697895 variants detected, format v1.2.
487409 samples imported from .sample file to ukb_hDMg_QC_14-temporary.psam .
--bgen: 537k variants converted.

ttasa commented on June 17, 2024

Successful .pgen conversion with a subset. Could this be a memory-handling or dataset-size issue?

chrchang commented on June 17, 2024

I'm primarily interested in sets of parameters that maximize the chance of a crash right now so I can investigate this properly. How quickly does --memory 30000 --threads 4 on the original (non-subsetted) dataset crash?

ttasa commented on June 17, 2024

I am running these jobs under the Slurm queue management system on our HPC cluster, where tasks are dispatched to different nodes. The failing jobs all seem to land on a particular set of hosts. Let me investigate the specifics of the failing machines and I'll get back to you soon.
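
One way to check whether failures cluster on particular hosts is to record the host alongside each run. The following is a minimal sketch, not part of PLINK: the helper name `run_context` is hypothetical, and it assumes the job runs under Slurm, which sets the `SLURM_JOB_ID` and `SLURMD_NODENAME` environment variables inside a job (both are simply `None` elsewhere).

```python
import os
import socket


def run_context() -> dict:
    """Collect the host name and Slurm identifiers for the current process."""
    return {
        "host": socket.gethostname(),
        # Set by Slurm inside a job; None when run outside Slurm.
        "slurm_job_id": os.environ.get("SLURM_JOB_ID"),
        "slurm_node": os.environ.get("SLURMD_NODENAME"),
    }


if __name__ == "__main__":
    ctx = run_context()
    # Emit one line that can be grepped next to the PLINK .log file.
    print(" ".join(f"{key}={value}" for key, value in ctx.items()))
```

Running this at the start of each job script and keeping its output next to the PLINK .log makes it straightforward to group failed runs by node.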

chrchang commented on June 17, 2024

Hi,

Do you have any more information on this? I haven't attempted to replicate the crash yet, since if you only observe it on one type of machine it's important for me to match that.

ttasa commented on June 17, 2024

Sorry about the delay. I investigated it further and I think the "File read failure" and "Sample file not found" errors only occurred if I performed the analysis (read/write operations) on a server disk that was around, unbeknown to me, 95% full. So, I would presume that if storage capacity limits interfere with read/write operations then Plink outputs these cryptic messages. Though, I wouldn't really know what to do or how to update code based on this error. After moving analyses to a different disk, the problems disappeared.
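
A pre-flight check along these lines could flag a nearly full disk before a long run starts. This is only a sketch under stated assumptions: the function names are hypothetical, and the 0.95 threshold is taken from the observation above, not from any documented PLINK limit.

```python
import shutil


def disk_fill_fraction(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem containing path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report zero size; treat them as empty.
        return 0.0
    return usage.used / usage.total


def is_nearly_full(path: str, threshold: float = 0.95) -> bool:
    """True when the filesystem holding path is at or above the threshold."""
    return disk_fill_fraction(path) >= threshold


if __name__ == "__main__":
    work_dir = "."  # assumption: the directory holding the .bgen and output files
    fraction = disk_fill_fraction(work_dir)
    print(f"{work_dir}: {fraction:.1%} used")
    if is_nearly_full(work_dir):
        print("Warning: disk is nearly full; consider moving the analysis.")
```

Note that this only reports how full the disk is; it does not establish that a full disk is what causes the read failure.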

chrchang commented on June 17, 2024

Okay. Plink doesn't write anything to disk at all during the .bgen scanning phase, so the "read error" message is probably accurate as far as it goes; the question is what's happening on the system that's causing the error to occur during reading, instead of writing as one would expect. It's probably virtual memory/swapping-related.

I will look into modifying plink2's read- (and write-)error messages so that if any more information is available about the error, that is also logged.

chrchang commented on June 17, 2024

Read- and write-error messages now surface the original error message reported by the OS, as of the 9 Oct 2019 build.
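
To illustrate the general idea of attaching the OS-reported reason to a generic read error, here is a minimal Python sketch. It is not plink2's implementation (plink2 is written in C/C++); the function name `read_exact` and the message wording are assumptions made for the example.

```python
def read_exact(path: str, n_bytes: int) -> bytes:
    """Read exactly n_bytes from path, attaching the OS-level reason on failure."""
    try:
        with open(path, "rb") as handle:
            data = handle.read(n_bytes)
    except OSError as exc:
        # exc.strerror carries the operating system's own description of the error.
        raise RuntimeError(
            f"File read failure: {exc.strerror} (errno {exc.errno})"
        ) from exc
    if len(data) != n_bytes:
        # A short read is reported separately, since no OS error was raised.
        raise RuntimeError(
            f"File read failure: expected {n_bytes} bytes, got {len(data)}"
        )
    return data
```

With a message like this, a failure caused by a missing file, an I/O error, or a stale network mount can be told apart from the log alone.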
