Comments (6)
Hello - it seems you either already have files in the output location, or have several files with the same name in your input?
Can you show me the tree of files you are trying to convert in the input/output, and the exact command you are running?
Thanks,
- George
from pod5-file-format.
Hi, yes, files in the input have the same names; that was the point I was making. This is (was?) the default when MinKNOW splits reads into fast5_fail and fast5_pass.
I couldn't find any documentation on whether it is better to keep 4000 reads per file or to store more reads per pod5 file.
In this case input files are named like this:
20191118_1537_MN19470_FAK64715_2f44e93d/fast5_fail/FAK64715_3b9c09954054f5908e63cf500553a769d9b98cea_0.fast5
20191118_1537_MN19470_FAK64715_2f44e93d/fast5_pass/FAK64715_3b9c09954054f5908e63cf500553a769d9b98cea_0.fast5
So I assumed this is what caused the error, but it could also come from the duplicated folders.
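To make the collision concrete, here is a minimal sketch (not the pod5 converter's actual code) that groups the two input paths above by file name and shows how mirroring the input subfolders keeps the outputs distinct. The function names and the `pod5_out` output folder are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_name_collisions(paths):
    """Return {file name: [paths]} for names that occur more than once."""
    by_name = defaultdict(list)
    for p in paths:
        by_name[PurePosixPath(p).name].append(p)
    return {name: ps for name, ps in by_name.items() if len(ps) > 1}


def mirrored_output(path, input_root, output_root):
    """Map an input .fast5 path to a .pod5 path, keeping its subfolders."""
    rel = PurePosixPath(path).relative_to(input_root)
    return str(PurePosixPath(output_root) / rel.with_suffix(".pod5"))


run = "20191118_1537_MN19470_FAK64715_2f44e93d"
name = "FAK64715_3b9c09954054f5908e63cf500553a769d9b98cea_0.fast5"
inputs = [f"{run}/fast5_fail/{name}", f"{run}/fast5_pass/{name}"]

# Both inputs share one file name, so a flat output folder would clash.
collisions = find_name_collisions(inputs)

# "pod5_out" is a hypothetical output folder; subfolders are preserved.
outputs = [mirrored_output(p, run, "pod5_out") for p in inputs]
```

With a flat output folder both files would map to the same name; with the mirrored layout they land in `pod5_out/fast5_fail/` and `pod5_out/fast5_pass/` respectively.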
OK - I'll have a look at respecting the input hierarchy shortly.
> I couldn't find any documentation on whether it is better to keep 4000 reads per file or to store more reads per pod5 file.
It's really up to you and how you want to use the data; there is no performance downside to putting all the data into one file.
If you are doing the conversion to archive the data, it may be more convenient to have one large file.
- George
Ok, thanks. Having too many reads per file was a problem for some tools with fast5, but perhaps this is one of the reasons the format is being replaced. For large runs it may still be better to split reads into multiple files, so it would be good if there were an option to define how many reads to store per pod5 file.
Hi @olawa,
Please try the latest release: the conversion scripts now have --file-read-count as an option, and they maintain the input folder hierarchy.
- George
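As a rough illustration of what a per-file read count means, the sketch below splits a list of read IDs into batches of at most N reads, one batch per output file. This is an assumption about the general idea only, not the converter's implementation; `batch_reads` and the synthetic read IDs are hypothetical.

```python
from itertools import islice


def batch_reads(read_ids, reads_per_file):
    """Yield consecutive lists of at most reads_per_file read IDs."""
    if reads_per_file < 1:
        raise ValueError("reads_per_file must be at least 1")
    it = iter(read_ids)
    while True:
        batch = list(islice(it, reads_per_file))
        if not batch:
            return
        yield batch


# Synthetic read IDs standing in for a run of 10000 reads.
read_ids = [f"read_{i}" for i in range(10000)]

# 4000 reads per file, matching the per-file count mentioned in this thread.
batches = list(batch_reads(read_ids, 4000))
sizes = [len(b) for b in batches]  # sizes is [4000, 4000, 2000]
```

The last batch is simply smaller than the others when the total is not a multiple of the requested count.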