fred-2 / optitype
Precision HLA typing from next-generation sequencing data
License: BSD 3-Clause "New" or "Revised" License
Looking at the commits, this might already be fixed. However, can we get a release so I can update the Bioconda package?
+ OptiTypePipeline.py --config /tmp/tmp.7tWxKjFp3A/tmp.giJXfvqJlQ/tmp.d/optitype.ini --verbose --input /tmp/tmp.7tWxKjFp3A/tmp.giJXfvqJlQ/tmp.d/reads_left.fastq /tmp/tmp.7tWxKjFp3A/tmp.giJXfvqJlQ/tmp.d/reads_right.fastq --dna --outdir /tmp/tmp.7tWxKjFp3A/tmp.giJXfvqJlQ/out.tmp
Traceback (most recent call last):
File "/tmp/tmp.7tWxKjFp3A/tmpl00sjz26/.snakemake/conda/7a1aa8ba/bin/OptiTypePipeline.py", line 427, in <module>
coverage_mat = ht.calculate_coverage(plot_variables, features, hlatype, features_used)
File "/tmp/tmp.7tWxKjFp3A/tmpl00sjz26/.snakemake/conda/7a1aa8ba/share/optitype-1.2-0/hlatyper.py", line 626, in calculate_coverage
coverage[bool(i_mismatches)][i_pairing-1][i_hitcount-1][i_pos-1:i_pos-1+i_read_length] += 1
IndexError: in the future, 0-d boolean arrays will be interpreted as a valid boolean index
+ rm -rf /tmp/tmp.7tWxKjFp3A/tmp.giJXfvqJlQ
Traceback (most recent call last):
File "/home/mholtgre/Development/open_pipeline/tools/cubi_wrappers/cubi_wrappers/wrappers/optitype/.snakemake.nq3mkf2a.wrapper.py", line 119, in <module>
""")
File "/bioconda/2017-03/miniconda3/lib/python3.4/site-packages/snakemake/shell.py", line 80, in __new__
raise sp.CalledProcessError(retcode, cmd)
Hello,
I received the above error running on example data. Do you have any advice about how I can proceed? Thanks!
Command:
python OptiTypePipeline.py -i test/exome/NA11995_SRR766010_1_fished.fastq --verbose --dna --outdir /home/ubuntu/OptiType-master
Config:
[MAPPING]
RAZERS3=/usr/bin/razers3
THREADS=4
[LIBRARIES]
RNA_REF=/home/ubuntu/OptiType-master/data/hla_reference_rna.fasta
DNA_REF=/home/ubuntu/OptiType-master/data/hla_reference_dna.fasta
ALLELES=/home/ubuntu/OptiType-master/data/alleles.h5
[OPTIMIZATION]
SOLVER=cbc
THREADS=1
Log:
0:00:00.33 Mapping NA11995_SRR766010_1_fished.fastq to GEN reference...
0:00:15.47 Generating binary hit matrix.
0:00:15.47 Loading alleles and read IDs from /home/ubuntu/OptiType-master/2015_09_27_22_16_19/2015_09_27_22_16_19_0.sam...
0:00:16.37 11179 alleles and 1909 reads found.
0:00:16.37 Initializing mapping matrix...
0:00:16.38 1909x11179 mapping matrix initialized. Populating 1344422 hits from SAM file...
10% completed
20% completed
30% completed
40% completed
50% completed
60% completed
70% completed
80% completed
90% completed
100% completed
0:03:20.32 1344422 elements filled. Matrix sparsity: 1 in 15.87
0:03:22.42 temporary pruning of identical rows and columns
0:03:22.51 Size of mtx with unique rows and columns: (434, 1021)
0:03:22.51 determining minimal set of non-overshadowed alleles
/home/ubuntu/.local/lib/python2.7/site-packages/pandas/util/decorators.py:13: FutureWarning: diff is deprecated. Use difference instead
FutureWarning)
0:03:23.70 Keeping only the minimal number of required alleles (125,)
0:03:23.70 Creating compact model...
0:03:23.83 Initializing OptiType model...
Traceback (most recent call last):
File "OptiTypePipeline.py", line 322, in <module>
result = op.solve(args.enumerate)
File "/home/ubuntu/OptiType-master/model.py", line 142, in solve
self.__instance.x.reset()
AttributeError: 'IndexedVarWithDomain' object has no attribute 'reset'
One of the samples not correctly predicted by OptiType is ERR031857, where A*02:06 is misclassified as A*02:01.
Even when expanding the results (-e 5), the correct solution is not found:
A1 A2 B1 B2 C1 C2 Reads Objective
0 A*02:01 A*11:01 B*07:05 B*54:01 C*07:02 C*01:02 413 394.4049999999999
1 A*02:01 A*11:01 B*07:05 B*55:02 C*07:02 C*01:02 407 388.6749999999999
2 A*02:01 A*11:01 B*54:01 B*07:02 C*07:02 C*01:02 400 381.97999999999985
3 A*02:01 A*11:01 B*55:02 B*07:02 C*07:02 C*01:02 394 376.24999999999983
4 A*02:01 A*11:01 B*07:05 B*55:04 C*07:02 C*01:02 392 374.34000000000003
Curiously, others (e.g. Major et al., 2013) were able to predict the correct HLA types.
Does anybody have an idea why this happens?
(I would be interested if Optitype2 can handle this case.)
Would you consider a patch adding Python 3 support to OptiType? This would greatly simplify our lives at the place I work as this would make OptiType integrate better with the rest of our stack.
Hello,
I received the following error. I am not familiar with Python, so I am unable to investigate further. I have run OptiType successfully on many other similar files and encountered this error only once. Thank you in advance for your consideration.
Traceback (most recent call last):
File "/home/ubuntu/OptiType-master/OptiTypePipeline.py", line 342, in <module>
coverage_mat = ht.calculate_coverage(plot_variables, features, hlatype, features_used)
File "/home/ubuntu/OptiType-master/hlatyper.py", line 505, in calculate_coverage
hit_counts[reads]):
File "/home/ubuntu/.local/lib/python2.7/site-packages/pandas/core/series.py", line 561, in __getitem__
return self._get_with(key)
File "/home/ubuntu/.local/lib/python2.7/site-packages/pandas/core/series.py", line 604, in _get_with
return self.reindex(key)
File "/home/ubuntu/.local/lib/python2.7/site-packages/pandas/core/series.py", line 2151, in reindex
return super(Series, self).reindex(index=index, **kwargs)
File "/home/ubuntu/.local/lib/python2.7/site-packages/pandas/core/generic.py", line 1773, in reindex
method, fill_value, copy).finalize(self)
File "/home/ubuntu/.local/lib/python2.7/site-packages/pandas/core/generic.py", line 1790, in _reindex_axes
fill_value=fill_value, copy=copy, allow_dups=False)
File "/home/ubuntu/.local/lib/python2.7/site-packages/pandas/core/generic.py", line 1876, in _reindex_with_indexers
copy=copy)
File "/home/ubuntu/.local/lib/python2.7/site-packages/pandas/core/internals.py", line 3150, in reindex_indexer
self.axes[axis]._can_reindex(indexer)
File "/home/ubuntu/.local/lib/python2.7/site-packages/pandas/core/index.py", line 1860, in _can_reindex
raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis
System information:
Linux version 3.19.0-31-generic (buildd@lcy01-07) (gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13) ) #36-Ubuntu SMP Wed Oct 7 15:04:02 UTC 2015
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 15.04
Release: 15.04
Codename: vivid
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
Stepping: 2
CPU MHz: 1220.812
CPU max MHz: 3000.0000
CPU min MHz: 1200.0000
BogoMIPS: 4862.53
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 30720K
NUMA node0 CPU(s): 0-9,20-29
NUMA node1 CPU(s): 10-19,30-39
MemTotal: 165051848 kB
MemFree: 7424336 kB
MemAvailable: 111855276 kB
Buffers: 111620 kB
Cached: 102348516 kB
SwapCached: 9496 kB
Active: 91282240 kB
Inactive: 63390396 kB
Active(anon): 51573908 kB
Inactive(anon): 640368 kB
Active(file): 39708332 kB
Inactive(file): 62750028 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 1048572 kB
SwapFree: 766880 kB
Dirty: 120940 kB
Writeback: 0 kB
AnonPages: 52205360 kB
Mapped: 84072 kB
Shmem: 428 kB
Slab: 2388540 kB
SReclaimable: 2310464 kB
SUnreclaim: 78076 kB
KernelStack: 11584 kB
PageTables: 111768 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 83574496 kB
Committed_AS: 77935112 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 439564 kB
VmallocChunk: 34274108676 kB
HardwareCorrupted: 0 kB
AnonHugePages: 17944576 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 133116 kB
DirectMap2M: 167770112 kB
siblings : 20
core id : 8
cpu cores : 10
apicid : 49
initial apicid : 49
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq monitor est ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm ida fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt
bugs :
bogomips : 4862.53
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
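pandas raises `cannot reindex from a duplicate axis` when the index being selected from contains repeated labels. One possible cause, offered here as an assumption rather than a confirmed diagnosis, is that the same read ID occurs more than once in the input FASTQ. A minimal sketch for checking that, using only the standard library (`duplicate_read_ids` is a hypothetical helper, not part of OptiType):

```python
from collections import Counter
import io


def duplicate_read_ids(handle):
    """Return {read_id: count} for read IDs that occur more than once.

    Assumes plain 4-line FASTQ records; the ID is the first
    whitespace-delimited token of the header line, without the leading '@'.
    """
    counts = Counter()
    for line_no, line in enumerate(handle):
        if line_no % 4 == 0:  # header line of each record
            tokens = line[1:].split()
            if tokens:
                counts[tokens[0]] += 1
    return {rid: n for rid, n in counts.items() if n > 1}


# Tiny in-memory example; for a real file, pass an open file handle instead.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\nIIII\n"
    "@read1\nTTTT\n+\nIIII\n"
)
dups = duplicate_read_ids(example)
print(dups)
```

If this reports duplicates for the failing sample, deduplicating or renaming the reads before typing would be the next thing to try.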
We'd like to experiment with OptiType, and I'd like to pin a specific version of the code. Instead of relying on a commit, would it be possible to have a tagged version?
I'm getting the following error message. Any idea what is causing it?
mapping with 8 threads...
0:01:18.24 Mapping filtered_fished.fastq to GEN reference...
0:01:45.95 Generating binary hit matrix.
0:01:45.97 Loading optitype_outdir/2017_04_25_02_50_30/2017_04_25_02_50_30_1.bam started. Number of HLA reads loaded (updated every thousand):
0:01:46.04 672 reads loaded. Creating dataframe...
0:01:46.12 Dataframes created. Shape: 672 x 11179, hits: 6952 (11622), sparsity: 1 in 646.39
0:01:51.76 temporary pruning of identical rows and columns
0:01:52.06 Size of mtx with unique rows and columns: (43, 54)
0:01:52.06 determining minimal set of non-overshadowed alleles
0:01:52.19 Keeping only the minimal number of required alleles (4,)
0:01:52.19 Creating compact model...
starting ilp solver with 1 threads...
0:01:52.20 Initializing OptiType model...
Welcome to the CBC MILP Solver
Version: 2.9
Build Date: Mar 7 2017
command line - /n/sw/fasrcsw/apps/Core/Cbc/2.9-fasrc01/bin/cbc -printingOptions all -import /tmp/tmpm4vfz6ss.pyomo.lp -import -stat=1 -solve -solu /tmp/tmpm4vfz6ss.pyomo.soln (default strategy 1)
Option for printingOptions changed from normal to all
CoinLpIO::readLp(): Maximization problem reformulated as minimization
Current default (if $ as parameter) for import is /tmp/tmpm4vfz6ss.pyomo.lp
Presolve 16 (-8) rows, 11 (-3) columns and 37 (-13) elements
Statistics for presolved model
Original problem has 8 integers (8 of which binary)
Presolved problem has 6 integers (6 of which binary)
==== 3 zero objective 7 different
1 variables have objective of -116
1 variables have objective of -101
2 variables have objective of -5
3 variables have objective of 0
2 variables have objective of 0.045
1 variables have objective of 0.909
1 variables have objective of 1.044
==== absolute objective values 7 different
3 variables have objective of 0
2 variables have objective of 0.045
1 variables have objective of 0.909
1 variables have objective of 1.044
2 variables have objective of 5
1 variables have objective of 101
1 variables have objective of 116
==== for integers 2 zero objective 4 different
1 variables have objective of -116
1 variables have objective of -101
2 variables have objective of -5
2 variables have objective of 0
==== for integers absolute objective values 4 different
2 variables have objective of 0
2 variables have objective of 5
1 variables have objective of 101
1 variables have objective of 116
===== end objective counts
Problem has 16 rows, 11 columns (8 with objective) and 37 elements
Column breakdown:
4 of type 0.0->inf, 1 of type 0.0->up, 0 of type lo->inf,
0 of type lo->up, 0 of type free, 0 of type fixed,
0 of type -inf->0.0, 0 of type -inf->up, 6 of type 0.0->1.0
Row breakdown:
0 of type E 0.0, 0 of type E 1.0, 0 of type E -1.0,
0 of type E other, 0 of type G 0.0, 1 of type G 1.0,
0 of type G other, 10 of type L 0.0, 1 of type L 1.0,
4 of type L other, 0 of type Range 0.0->1.0, 0 of type Range other,
0 of type Free
Continuous objective value is -224.957 - 0.00 seconds
Cgl0004I processed model has 16 rows, 11 columns (6 integer (6 of which binary)) and 37 elements
Cbc0038I Initial state - 0 integers unsatisfied sum - 0
Cbc0038I Solution found of -224.957
Cbc0038I Relaxing continuous gives -224.957
Cbc0038I Before mini branch and bound, 6 integers at bound fixed and 0 continuous
Cbc0038I Mini branch and bound did not improve solution (0.00 seconds)
Cbc0038I After 0.00 seconds - Feasibility pump exiting with objective of -224.957 - took 0.00 seconds
Cbc0012I Integer solution of -224.957 found by feasibility pump after 0 iterations and 0 nodes (0.00 seconds)
Cbc0001I Search completed - best objective -224.957, took 0 iterations and 0 nodes (0.00 seconds)
Cbc0035I Maximum depth 0, 0 variables fixed on reduced cost
Cuts at root node changed objective from -224.957 to -224.957
Probing was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Gomory was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Knapsack was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Clique was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
MixedIntegerRounding2 was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
FlowCover was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
TwoMirCuts was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Result - Optimal solution found
Objective value: -224.95700000
Enumerated nodes: 0
Total iterations: 0
Time (CPU seconds): 0.00
Time (Wallclock seconds): 0.01
Total time (CPU seconds): 0.00 (Wallclock seconds): 0.01
Traceback (most recent call last):
File "/n/regal/nowak_lab/immunotherapy/OptiType/OptiTypePipeline.py", line 405, in <module>
result = op.solve(args.enumerate)
File "/n/regal/nowak_lab/immunotherapy/OptiType/model.py", line 188, in solve
self.__instance.c.add(expr >= 1)
File "/n/scrb152/Software/Python/py35/lib/python3.5/site-packages/pyomo/core/base/constraint.py", line 1188, in add
cdata = self._check_skip_add(self._nconstraints + 1, expr)
File "/n/scrb152/Software/Python/py35/lib/python3.5/site-packages/pyomo/core/base/constraint.py", line 895, in _check_skip_add
self._data[index].name))
ValueError: Invalid constraint expression. The constraint expression resolved to a trivial Boolean (False) instead of a Pyomo object. Please modify your rule to return Constraint.Infeasible instead of False.
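Pyomo reports this error when the expression passed to `add` is a plain Python `bool` rather than a Pyomo expression. A common way for that to happen is a left-hand side built by summing over an empty collection: the sum is the integer `0`, so `0 >= 1` is evaluated by Python itself and yields `False`. Whether that is what happens at `model.py` line 188 is an assumption. The sketch below uses plain Python (no Pyomo), and `should_add_constraint` is a hypothetical guard:

```python
def build_lhs(terms):
    """Sum the given terms; with no terms this is the plain integer 0."""
    return sum(terms)


def should_add_constraint(terms):
    """Hypothetical guard: only add 'sum(terms) >= 1' if there are terms."""
    return len(terms) > 0


empty_terms = []
lhs = build_lhs(empty_terms)
trivial = lhs >= 1  # evaluated by Python itself: 0 >= 1

print(type(trivial).__name__, trivial)
print(should_add_constraint(empty_terms))
```

With only 4 alleles kept after pruning in the log above, an empty term list for some constraint seems plausible, but that has not been verified against the code.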
This is my first time running OptiType, so forgive me if I missed something obvious. I am getting an odd parsing error. It also occurs when running the test dataset.
Command:
python OptiTypePipeline.py -i ./sample_opti.1.fq ./sample_opti.2.fq --dna -c ./config.ini -v -o ./optitype_work/
Output:
...
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 24641
Solving LP relaxation...
GLPK Simplex Optimizer, v4.59
24641 rows, 13057 columns, 387578 non-zeros
0: obj = -0.000000000e+00 inf = 6.000e+00 (6)
6: obj = -5.000000000e-02 inf = 0.000e+00 (0)
* 500: obj = 3.181086340e+04 inf = 2.501e-14 (5968)
* 1000: obj = 3.630040867e+04 inf = 2.998e-15 (5832) 1
* 1500: obj = 3.858366231e+04 inf = 1.865e-14 (5697) 1
* 2000: obj = 4.039086154e+04 inf = 8.253e-15 (5604)
* 2500: obj = 4.164897250e+04 inf = 8.357e-15 (5524) 1
* 3000: obj = 4.272261500e+04 inf = 5.965e-15 (5427) 1
* 3500: obj = 4.343465417e+04 inf = 8.882e-15 (5356) 2
* 4000: obj = 4.420333917e+04 inf = 6.217e-15 (5259) 1
* 4500: obj = 4.513628286e+04 inf = 1.753e-14 (5178) 4
* 5000: obj = 4.557089286e+04 inf = 0.000e+00 (5148) 1
* 5500: obj = 4.586324125e+04 inf = 0.000e+00 (5136) 1
* 6000: obj = 4.677771714e+04 inf = 4.199e-14 (5013) 1
* 6500: obj = 4.804854833e+04 inf = 1.110e-16 (4827) 3
* 7000: obj = 4.926832833e+04 inf = 0.000e+00 (4595) 2
* 7500: obj = 4.933832833e+04 inf = 0.000e+00 (4564)
* 8000: obj = 4.940799500e+04 inf = 3.331e-15 (4515)
* 8500: obj = 4.947966167e+04 inf = 0.000e+00 (4492) 1
* 9000: obj = 4.955066167e+04 inf = 0.000e+00 (4462)
* 9500: obj = 5.095554333e+04 inf = 0.000e+00 (4141) 2
* 10000: obj = 5.227048333e+04 inf = 0.000e+00 (3768) 1
* 10500: obj = 5.347663333e+04 inf = 0.000e+00 (3333) 1
* 11000: obj = 5.448885333e+04 inf = 0.000e+00 (2891)
* 11500: obj = 5.521412000e+04 inf = 0.000e+00 (2429) 1
* 12000: obj = 5.594569333e+04 inf = 0.000e+00 (1965)
* 12500: obj = 5.638794833e+04 inf = 0.000e+00 (1485)
* 13000: obj = 5.676556000e+04 inf = 0.000e+00 (1004)
* 13500: obj = 5.715105500e+04 inf = 0.000e+00 (516)
* 14000: obj = 5.752945500e+04 inf = 0.000e+00 (34) 1
* 14037: obj = 5.755625833e+04 inf = 0.000e+00 (0)
OPTIMAL LP SOLUTION FOUND
Integer optimization begins...
+ 14037: mip = not found yet <= +inf (1; 0)
+ 14038: >>>>> 5.755594000e+04 <= 5.755594000e+04 0.0% (2; 0)
+ 14038: mip = 5.755594000e+04 <= tree is empty 0.0% (0; 3)
INTEGER OPTIMAL SOLUTION FOUND
Time used: 12.8 secs
Memory used: 75.9 Mb (79626336 bytes)
Writing MIP solution to '/var/folders/76/zt8rzbc5077bnxl8pvhpw6d4wvz933/T/tmpH4rygQ.glpk.raw'...
37709 lines were written
invalid literal for int() with base 10: 'c'
Traceback (most recent call last):
File "~/GitHub/OptiType/OptiTypePipeline.py", line 373, in <module>
result = op.solve(args.enumerate)
File "~/GitHub/OptiType/model.py", line 149, in solve
res = self.__solver.solve(self.__instance, options={}, tee=self.__verbosity)
File "~/anaconda2/lib/python2.7/site-packages/pyomo/opt/base/solvers.py", line 578, in solve
result = self._postsolve()
File "~/anaconda2/lib/python2.7/site-packages/pyomo/opt/solver/shellcmd.py", line 161, in _postsolve
results = self.process_output(self._rc)
File "~/anaconda2/lib/python2.7/site-packages/pyomo/opt/solver/shellcmd.py", line 220, in process_output
self.process_soln_file(results)
File "~/anaconda2/lib/python2.7/site-packages/pyomo/solvers/plugins/solvers/GLPK.py", line 445, in process_soln_file
raise ValueError(msg)
ValueError: Error parsing solution data file, line 1
When I open up the MIP solution tmp file it has the format of:
c Problem:
c Rows: 24642
c Columns: 13058
c Non-zeros: 387579
c Status: INTEGER OPTIMAL
c Objective: x13058 = 57555.94 (MAXimum)
c
s mip 24642 13058 o 57555.9400000032
i 1 2
i 2 2
i 3 2
i 4 2
...
I guess the parser is trying to read the leading 'c' in the file as an integer, but I am not sure why.
Any ideas?
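The file shown above is line-oriented: `c` lines are comments, the `s` line carries the summary, and `i` lines carry per-row entries. A parser that expects an integer at the start of line 1 would fail on the leading `c` exactly as in the traceback, which would be consistent with the installed Pyomo plugin predating this GLPK output layout (an assumption, not checked against the Pyomo source). A minimal sketch that reads only the line types visible in the excerpt; the field meanings of the `s` line are inferred from the header comments, and `parse_glpk_mip_solution` is a hypothetical helper:

```python
def parse_glpk_mip_solution(text):
    """Parse the 'c' / 's' / 'i' lines of the solution excerpt above."""
    summary = {}
    rows = {}
    for raw in text.splitlines():
        fields = raw.split()
        if not fields or fields[0] == "c":
            continue  # blank line or comment
        if fields[0] == "s":
            # s <kind> <rows> <columns> <status> <objective>
            summary = {
                "kind": fields[1],
                "rows": int(fields[2]),
                "columns": int(fields[3]),
                "status": fields[4],
                "objective": float(fields[5]),
            }
        elif fields[0] == "i":
            # i <row index> <value>
            rows[int(fields[1])] = float(fields[2])
    return summary, rows


sample = """c Problem:
c Rows: 24642
c Status: INTEGER OPTIMAL
c
s mip 24642 13058 o 57555.9400000032
i 1 2
i 2 2
"""
summary, rows = parse_glpk_mip_solution(sample)
print(summary["status"], summary["objective"], rows)
```

If the mismatch is indeed between GLPK and Pyomo versions, upgrading Pyomo or using an older GLPK (or a different solver such as CBC) would be the practical options to test.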
Hi,
When running this, I'm trying to figure out why the tmp files that are supposed to be created as input for the solver (I'm using Cbc) do not get written. Before I go hunting around the code, does anyone know what creates these tmp files, and can anyone venture a guess as to why they are not being written? I'll post code if people think it's needed.
Thanks,
-todd
Hi,
Is there any way for OptiType to take BAM files directly without having to run bam2fq?
Thanks,
-todd
The Docker image can't be rebuilt locally because the source image biodckr/biodocker does not exist anymore. Also, setting USER biodocker means that a biodocker user needs to exist on whatever machine the container runs on, or the data and folders have to be readable and writable by everyone.
Hello,
I was looking through the FASTA files under the data directory and noticed that some of the sequences provided there are combinations of coding sequence and non-coding sequence.
What puzzles me is that some of them combine alleles whose first 2 digits differ (e.g. HLA07296_HLA00097 HLA-A_33:53 (introns from HLA-A_31:01:02)). It also seems that the list of exon and intron combinations sharing the first two digits is not exhaustive.
Is there a logic to how these alleles are combined? (And if I were to update this FASTA file with more recent alleles from the HLA database, what logic should I use to combine the alleles?)
Thanks.
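To audit which reference entries borrow introns from a different allele group, the headers can be parsed mechanically. This is a minimal sketch that assumes every combined entry follows the single header pattern quoted above; `parse_header` and `allele_group` are hypothetical helpers, and the regular expression has only been checked against that one example:

```python
import re

# Pattern inferred from one header in the question; other entries may differ.
HEADER_RE = re.compile(
    r"^>?(?P<ids>\S+)\s+(?P<allele>HLA-[A-Z0-9]+_[0-9:A-Z]+)\s+"
    r"\(introns from (?P<donor>HLA-[A-Z0-9]+_[0-9:A-Z]+)\)$"
)


def allele_group(name):
    """'HLA-A_33:53' -> ('HLA-A', '33'): locus and first field."""
    locus, fields = name.split("_", 1)
    return locus, fields.split(":")[0]


def parse_header(header):
    """Return allele, intron donor, and whether both share locus and first field."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None  # not a combined exon/intron entry in this format
    allele, donor = match.group("allele"), match.group("donor")
    return {
        "ids": match.group("ids"),
        "allele": allele,
        "intron_donor": donor,
        "same_group": allele_group(allele) == allele_group(donor),
    }


info = parse_header("HLA07296_HLA00097 HLA-A_33:53 (introns from HLA-A_31:01:02)")
print(info)
```

Running this over all headers and counting entries with `same_group` set to `False` would show how often introns come from a different two-digit group; it does not by itself explain the rule used to pick the donor.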
I'm using optitype/1.3.1 with solver=cbc in config.ini and hla_reference_dna.fasta. I prefilter my reads using razers3. This is the output I'm getting with my own data:
A1 A2 B1 B2 C1 C2 Reads Objective
0 HLA00001 HLA00037 HLA00344 HLA00180 HLA00433 HLA00401 4305 4072.53
EDIT:
This is the output I get using the test data from the code:
A1 A2 B1 B2 C1 C2 Reads Objective
0 HLA00001 HLA00001 HLA00146 HLA00381 HLA00433 HLA00430 1156 1135.192
The stdout
filtering for hla region reads for R1
convert filtered bam1 to fastq1
filtering for hla region reads for R2
convert filtered bam1 to fastq1
run hla typing
mapping with 16 threads...
0:00:00.51 Mapping EVL35_1.fastq to GEN reference...
0:00:25.57 Mapping EVL35_2.fastq to GEN reference...
0:00:58.46 Generating binary hit matrix.
Warning: PySam not available on the system. Falling back to primitive SAM parsing.
0:00:58.46 Loading alleles and read IDs from 01-filter-hla-read/output/EVL35/2018_01_18_18_39_13/2018_01_18_18_39_13_1.sam...
0:01:03.47 11179 alleles and 6968 reads found.
0:01:03.47 Initializing mapping matrix...
0:01:03.47 6968x11179 mapping matrix initialized. Populating 1549982 hits from SAM file...
10% completed
20% completed
0:04:55.05 1549982 elements filled. Matrix sparsity: 1 in 50.26
Warning: PySam not available on the system. Falling back to primitive SAM parsing.
0:04:55.42 Loading alleles and read IDs from 01-filter-hla-read/output/EVL35/2018_01_18_18_39_13/2018_01_18_18_39_13_2.sam...
0:05:00.29 11179 alleles and 6989 reads found.
0:05:00.29 Initializing mapping matrix...
0:05:00.30 6989x11179 mapping matrix initialized. Populating 1519093 hits from SAM file...
10% completed
0:08:47.45 1519093 elements filled. Matrix sparsity: 1 in 51.43
0:08:48.94 Alignment pairing completed. 6164 paired, 1561 unpaired, 34 discordant
0:08:52.71 temporary pruning of identical rows and columns
0:08:52.96 Size of mtx with unique rows and columns: (983, 890)
0:08:52.96 determining minimal set of non-overshadowed alleles
0:08:55.61 Keeping only the minimal number of required alleles (77,)
0:08:55.61 Creating compact model...
starting ilp solver with 1 threads...
0:08:55.93 Initializing OptiType model...
Welcome to the CBC MILP Solver
Version: 2.8
Build Date: Aug 5 2015
Revision Number: 2210
command line - /risapps/rhel6/cbc/2.8/bin/cbc -printingOptions all -import /tmp/tmp2d8uhj.pyomo.lp -import -stat=1 -solve -solu /tmp/tmp2d8uhj.pyomo.soln (default strategy 1)
Option for printingOptions changed from normal to all
Coin0009I CoinLpIO::readLp(): Maximization problem reformulated as minimization
Current default (if $ as parameter) for import is /tmp/tmp2d8uhj.pyomo.lp
Presolve 845 (-1) rows, 494 (-1) columns and 3059 (-1) elements
Statistics for presolved model
Problem has 845 rows, 494 columns (458 with objective) and 3059 elements
Column breakdown:
208 of type 0.0->inf, 1 of type 0.0->up, 0 of type lo->inf,
0 of type lo->up, 0 of type free, 0 of type fixed,
0 of type -inf->0.0, 0 of type -inf->up, 285 of type 0.0->1.0
Row breakdown:
0 of type E 0.0, 0 of type E 1.0, 0 of type E -1.0,
0 of type E other, 0 of type G 0.0, 6 of type G 1.0,
0 of type G other, 624 of type L 0.0, 0 of type L 1.0,
215 of type L other, 0 of type Range 0.0->1.0, 0 of type Range other,
0 of type Free
Continuous objective value is -4072.53 - 0.01 seconds
Cgl0004I processed model has 839 rows, 494 columns (285 integer) and 2982 elements
Cbc0038I Solution found of -4072.53
Cbc0038I Before mini branch and bound, 285 integers at bound fixed and 25 continuous
Cbc0038I Mini branch and bound did not improve solution (0.02 seconds)
Cbc0038I After 0.02 seconds - Feasibility pump exiting with objective of -4072.53 - took 0.00 seconds
Cbc0012I Integer solution of -4072.53 found by feasibility pump after 0 iterations and 0 nodes (0.02 seconds)
Cbc0001I Search completed - best objective -4072.530000000001, took 0 iterations and 0 nodes (0.02 seconds)
Cbc0035I Maximum depth 0, 0 variables fixed on reduced cost
Cuts at root node changed objective from -4072.53 to -4072.53
Probing was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Gomory was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Knapsack was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Clique was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
MixedIntegerRounding2 was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
FlowCover was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
TwoMirCuts was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Result - Optimal solution found
Objective value: -4072.53000000
Enumerated nodes: 0
Total iterations: 0
Time (CPU seconds): 0.03
Time (Wallclock seconds): 0.03
Total time (CPU seconds): 0.03 (Wallclock seconds): 0.04
0:08:56.29 Result dataframe has been constructed...
Hi,
I've been running OptiType mostly successfully; however, a KeyError is sometimes raised, which causes the program to stop. It only affects some samples, but I haven't found a common link between them. I've included the error message and was wondering if you had encountered this before.
Traceback (most recent call last):
File "[base]/OptiType/OptiTypePipeline.py", line 315, in <module>
r = result_4digit[["A1", "A2", "B1", "B2", "C1", "C2", "nof_reads", "obj"]]
File "[base]/python/lib/python2.7/site-packages/pandas/core/frame.py", line 1672, in __getitem__
return self._getitem_array(key)
File "[base]/python/lib/python2.7/site-packages/pandas/core/frame.py", line 1716, in _getitem_array
indexer = self.ix._convert_to_indexer(key, axis=1)
File "[base]/python/lib/python2.7/site-packages/pandas/core/indexing.py", line 1085, in _convert_to_indexer
raise KeyError('%s not in index' % objarr[mask])
KeyError: "['A1' 'A2'] not in index"
Thanks
The files in the data folder are more than 3 years old. Packaging an up-to-date version of IMGT/HLA would be appreciated. Mis-annotated alleles have been fixed and new alleles added since 2014.
Hi,
I have used both razers3 and the yara mapper as the aligner for OptiType, and I am wondering whether it would work with other aligners. Is there a specific setting or sweet spot for mismatches or clipping?
It seems that OptiType is sensitive to spurious alignments, i.e. reads that align badly and lead to a false HLA type once OptiType counts the reads. On the other hand, it can benefit from aligners that include more reads and therefore more information to discriminate between HLA types.
Would there be any issues if BWA-MEM were used to align the reads instead, and would the scripts have a problem given that they are designed to work with razers3 and yara?
I have seen some differences in the HLA types predicted with different aligners, and I think comparing them is a good way to show the robustness of OptiType.
I ran optitype, and everything looked fine, but the output file did not give any predictions for the HLA-C alleles. Here is the output log:
mapping with 8 threads...
0:01:14.27 Mapping filtered_fished.fastq to GEN reference...
0:03:26.66 Generating binary hit matrix.
0:03:26.70 Loading optitype_outdir/2017_04_25_02_02_55/2017_04_25_02_02_55_1.bam started. Number of HLA reads loaded (updated every thousand):
1K...
0:03:26.97 1255 reads loaded. Creating dataframe...
0:03:27.18 Dataframes created. Shape: 1255 x 11179, hits: 22073 (22250), sparsity: 1 in 630.55
0:03:33.52 temporary pruning of identical rows and columns
0:03:33.63 Size of mtx with unique rows and columns: (50, 59)
0:03:33.63 determining minimal set of non-overshadowed alleles
0:03:33.78 Keeping only the minimal number of required alleles (12,)
0:03:33.78 Creating compact model...
starting ilp solver with 1 threads...
0:03:33.83 Initializing OptiType model...
Welcome to the CBC MILP Solver
Version: 2.9
Build Date: Mar 7 2017
command line - /n/sw/fasrcsw/apps/Core/Cbc/2.9-fasrc01/bin/cbc -printingOptions all -import /tmp/tmpext6iepo.pyomo.lp -import -stat=1 -solve -solu /tmp/tmpext6iepo.pyomo.soln (default strategy 1)
Option for printingOptions changed from normal to all
CoinLpIO::readLp(): Maximization problem reformulated as minimization
Current default (if $ as parameter) for import is /tmp/tmpext6iepo.pyomo.lp
Presolve 103 (-9) rows, 61 (-3) columns and 285 (-15) elements
Statistics for presolved model
Original problem has 37 integers (37 of which binary)
Presolved problem has 35 integers (35 of which binary)
==== 8 zero objective 32 different
==== absolute objective values 32 different
==== for integers 7 zero objective 17 different
==== for integers absolute objective values 17 different
===== end objective counts
Problem has 103 rows, 61 columns (53 with objective) and 285 elements
Column breakdown:
25 of type 0.0->inf, 1 of type 0.0->up, 0 of type lo->inf,
0 of type lo->up, 0 of type free, 0 of type fixed,
0 of type -inf->0.0, 0 of type -inf->up, 35 of type 0.0->1.0
Row breakdown:
0 of type E 0.0, 0 of type E 1.0, 0 of type E -1.0,
0 of type E other, 0 of type G 0.0, 3 of type G 1.0,
0 of type G other, 73 of type L 0.0, 0 of type L 1.0,
27 of type L other, 0 of type Range 0.0->1.0, 0 of type Range other,
0 of type Free
Continuous objective value is -429.083 - 0.00 seconds
Cgl0004I processed model has 102 rows, 61 columns (35 integer (35 of which binary)) and 279 elements
Cbc0038I Initial state - 0 integers unsatisfied sum - 1.44329e-15
Cbc0038I Solution found of -429.083
Cbc0038I Relaxing continuous gives -429.083
Cbc0038I Before mini branch and bound, 35 integers at bound fixed and 5 continuous
Cbc0038I Mini branch and bound did not improve solution (0.01 seconds)
Cbc0038I After 0.01 seconds - Feasibility pump exiting with objective of -429.083 - took 0.00 seconds
Cbc0012I Integer solution of -429.083 found by feasibility pump after 0 iterations and 0 nodes (0.01 seconds)
Cbc0001I Search completed - best objective -429.083, took 0 iterations and 0 nodes (0.01 seconds)
Cbc0035I Maximum depth 0, 0 variables fixed on reduced cost
Cuts at root node changed objective from -429.083 to -429.083
Probing was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Gomory was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Knapsack was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Clique was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
MixedIntegerRounding2 was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
FlowCover was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
TwoMirCuts was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Result - Optimal solution found
Objective value: -429.08300000
Enumerated nodes: 0
Total iterations: 0
Time (CPU seconds): 0.01
Time (Wallclock seconds): 0.02
Total time (CPU seconds): 0.02 (Wallclock seconds): 0.03
The output file looks like this:
A1 A2 B1 B2 C1 C2 Reads Objective
0 A*11:01 A*11:01 B*07:02 B*07:02 433 429.083
Any idea why this is?
Hello,
My colleague is getting this error:
File "/apps/RH7U2/gnu/OptiType/1.3.1/OptiTypePipeline.py", line 415, in <module>
r = result_4digit[["A1", "A2", "B1", "B2", "C1", "C2", "nof_reads", "obj"]]
File "/apps/RH7U2/gnu/python/2.7.13/lib/python2.7/site-packages/pandas/core/frame.py", line 1958, in __getitem__
return self._getitem_array(key)
File "/apps/RH7U2/gnu/python/2.7.13/lib/python2.7/site-packages/pandas/core/frame.py", line 2002, in _getitem_array
indexer = self.loc._convert_to_indexer(key, axis=1)
File "/apps/RH7U2/gnu/python/2.7.13/lib/python2.7/site-packages/pandas/core/indexing.py", line 1231, in _convert_to_indexer
raise KeyError('%s not in index' % objarr[mask])
KeyError: "['nof_reads' 'obj'] not in index"
pandas 0.20.3
Please advise.
Thank you.
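The `KeyError` above means the result table does not contain the named columns (`nof_reads` and `obj` here; `A1` and `A2` in an earlier report). Before selecting columns, it can help to print which of the expected ones are actually absent. A minimal sketch in plain Python, where `missing_columns` is a hypothetical helper and the column list is copied from the traceback:

```python
EXPECTED = ["A1", "A2", "B1", "B2", "C1", "C2", "nof_reads", "obj"]


def missing_columns(present, expected=EXPECTED):
    """Return the expected column names that are not in `present`, in order."""
    present = set(present)
    return [name for name in expected if name not in present]


# With a pandas DataFrame one would pass list(result_4digit.columns) here.
absent = missing_columns(["A1", "A2", "B1", "B2", "C1", "C2"])
print(absent)
```

This only narrows down what is missing; why the columns are absent for particular samples would still need to be traced in the pipeline.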
prateek@cpu:~/dhwani$ sudo docker run -v test:/test -t fred2/optitype -i NA11995_SRR766010_1_fished.fastq NA11995_SRR766010_1_fished.fastq -d -o out
sudo: unable to resolve host cpu: Connection timed out
[E::hts_open_format] Failed to open file out/2017_11_18_07_05_37/2017_11_18_07_05_37_1.bam
Traceback (most recent call last):
File "/usr/local/bin/OptiType/OptiTypePipeline.py", line 299, in <module>
pos, read_details = ht.pysam_to_hdf(bam_paths[0])
File "/usr/local/bin/OptiType/hlatyper.py", line 186, in pysam_to_hdf
sam = pysam.AlignmentFile(samfile, sam_or_bam)
File "pysam/libcalignmentfile.pyx", line 444, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 621, in pysam.libcalignmentfile.AlignmentFile._open
IOError: [Errno 2] could not open alignment file out/2017_11_18_07_05_37/2017_11_18_07_05_37_1.bam
: No such file or directory
Hi, I was wondering if you have a fix for the following issue. I get this error:
0:00:14.95 Mapping 4.R1.fished.fastq to NUC reference...
0:00:22.52 Mapping 4.R2.fished.fastq to NUC reference...
0:00:31.25 Generating binary hit matrix.
0:00:31.26 Loading OptiType_RNA/2017_03_16_11_54_33/2017_03_16_11_54_33_1.bam started. Number of HLA reads loaded (updated every thousand):
0:00:31.26 0 reads loaded. Creating dataframe...
Traceback (most recent call last):
File "optitype-1.0/OptiType/OptiTypePipeline.py", line 267, in <module>
pos, read_details = ht.pysam_to_hdf(bam_paths[0])
File "optitype-1.0/OptiType/hlatyper.py", line 230, in pysam_to_hdf
pos_df = pd.DataFrame.from_items(hits.iteritems()).T
File "python-2.7.10/lib/python2.7/site-packages/pandas/core/frame.py", line 1046, in from_items
keys, values = lzip(*items)
ValueError: need more than 0 values to unpack
This happens for two of five datasets, with the following command:
python2.7 OptiTypePipeline.py \
--config optitype_config.txt \
-i 4.R1.fished.fastq \
4.R2.fished.fastq \
--rna \
-v \
-o ~/OptiType_RNA
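The log above reports `0 reads loaded` right before the crash, which suggests the hit table was empty when it was unpacked into keys and values. Below is a minimal sketch of that failure mode with a guard in front of it. `split_hits` is a hypothetical helper for illustration, not a function from OptiType or pandas.

```python
def split_hits(items):
    """Split (read_id, hit_row) pairs into two parallel lists.

    Hypothetical helper: it mirrors the `keys, values = zip(*items)` pattern
    from the traceback and fails early with a readable message when no reads
    were loaded, instead of failing inside the unpacking.
    """
    items = list(items)
    if not items:
        # With an empty list, zip(*items) yields nothing to unpack.
        raise ValueError("no reads were loaded; check the mapping step and input files")
    keys, values = zip(*items)
    return list(keys), list(values)


if __name__ == "__main__":
    print(split_hits([("read1", [0, 1]), ("read2", [1, 0])]))
    try:
        split_hits([])
    except ValueError as exc:
        print("guard triggered:", exc)
```

If the guard fires, the place to look is upstream: whether the mapper produced any alignments for that sample at all.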
Hi, I have been getting
Traceback (most recent call last):
File "/data/rozencompute2/a0073895/optitest/OptiType/OptiTypePipeline.py", line 304, in <module>
"in your config file (currently %.3f), because you may need to resort to using unpaired reads.") % unpaired_weight
TypeError: not enough arguments for format string
I get this even after changing the unpaired weight in the config.ini file to 1. I am wondering whether this error appears when there are few reads in the input FASTQ, or whether something else is happening.
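For what it's worth, `TypeError: not enough arguments for format string` is raised by Python's `%` formatting whenever a string contains more placeholders than the values supplied, so it probably comes from how the message is built rather than from the value in the config. Here is a small sketch that reproduces it; the message text and the `render` helper are made up for illustration and are not taken from OptiType.

```python
def render(template, *values):
    """Apply printf-style formatting and report a mismatch instead of raising."""
    try:
        return template % values
    except TypeError as exc:
        return "formatting failed: %s" % exc


# Hypothetical message with two placeholders.
TEMPLATE = "%d reads were unpaired; unpaired_weight is currently %.3f"

# One value for two placeholders reproduces the TypeError from the traceback.
broken = render(TEMPLATE, 0.0)

# Supplying one value per placeholder formats cleanly.
fixed = render(TEMPLATE, 359, 0.0)

if __name__ == "__main__":
    print(broken)
    print(fixed)
```

So the warning branch was presumably reached because of the data (few paired reads), and the crash itself happened while printing the warning.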
Hi,
I'm running optitype in rna mode for fastq.gz files. I get a large cryptic error that I'm not really able to figure out. Can someone help me translate this? See the command and error below:
Thanks,
-todd
OptiType-master/OptiTypePipeline.py --rna --verbose --config /Biomarker/ngs/software/OptiType/OptiType-master/config.ini -i /ts19/ngs/studies/ngs_000230/fastq/TB2-EM11595_R1.fastq.gz /ts19/ngs/studies/ngs_000230/fastq/TB2-EM11595_R2.fastq.gz -o /ts19/ngs/studies/ngs_000230/neoantigen_analysis/optitype
0:00:03.82 Mapping TB2-EM11595_R1.fastq.gz to NUC reference...
/home/mi/esiragusa/seqan/include/seqan/basic/basic_exception.h:368 FAILED! (Uncaught exception of type seqan::UnexpectedEnd: Unexpected end of input.)
stack trace:
0 [0xa9d3f7] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
1 [0xab37a6] __cxxabiv1::terminate(void (*)()) + 0x6
2 [0xab37d3] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
3 [0xab491e] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
4 [0x76b253] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
5 [0x7ce297] void seqan::readRecord<seqan::String<char, seqan::Alloc >, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q>, seqan::Alloc >, seqan::String<char, seqan::Alloc >, seqan::Iter<seqan::VirtualStream<char, seqan::Tagseqan::Input_, std::char_traits >, seqan::StreamIterator<seqan::Tagseqan::Input_ > > >(seqan::String<char, seqan::Alloc >&, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q>, seqan::Alloc >&, seqan::String<char, seqan::Alloc >&, seqan::Iter<seqan::VirtualStream<char, seqan::Tagseqan::Input_, std::char_traits >, seqan::StreamIterator<seqan::Tagseqan::Input_ > >&, seqan::Tagseqan::TagFastq_) + 0x77
6 [0x81924c] bool seqan::loadReads<MyFragStoreConfig, seqan::FragmentStoreConfig, seqan::RazerSOptions<seqan::RazerSSpec<false, false> > >(seqan::FragmentStore<MyFragStoreConfig, seqan::FragmentStoreConfig >&, seqan::FormattedFile<seqan::Tagseqan::TagFastq_, seqan::Tagseqan::Input_, void>&, seqan::RazerSOptions<seqan::RazerSSpec<false, false> >&) + 0x4ec
7 [0xa9c205] int mapReads<seqan::RazerSSpec<false, false> >(seqan::StringSet<seqan::String<char, seqan::Alloc >, seqan::Owner<seqan::Tagseqan::Default_ > >&, seqan::StringSet<seqan::String<char, seqan::Alloc >, seqan::Owner<seqan::Tagseqan::Default_ > >&, seqan::RazerSOptions<seqan::RazerSSpec<false, false> >&) + 0x4c5
8 [0x76287f] main + 0x22f
9 [0x31e101ed1d] __libc_start_main + 0xfd
10 [0x763039] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
0:00:03.99 Mapping TB2-EM11595_R2.fastq.gz to NUC reference...
/home/mi/esiragusa/seqan/include/seqan/basic/basic_exception.h:368 FAILED! (Uncaught exception of type seqan::UnexpectedEnd: Unexpected end of input.)
stack trace:
0 [0xa9d3f7] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
1 [0xab37a6] __cxxabiv1::terminate(void (*)()) + 0x6
2 [0xab37d3] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
3 [0xab491e] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
4 [0x76b253] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
5 [0x7ce297] void seqan::readRecord<seqan::String<char, seqan::Alloc >, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q>, seqan::Alloc >, seqan::String<char, seqan::Alloc >, seqan::Iter<seqan::VirtualStream<char, seqan::Tagseqan::Input_, std::char_traits >, seqan::StreamIterator<seqan::Tagseqan::Input_ > > >(seqan::String<char, seqan::Alloc >&, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q>, seqan::Alloc >&, seqan::String<char, seqan::Alloc >&, seqan::Iter<seqan::VirtualStream<char, seqan::Tagseqan::Input_, std::char_traits >, seqan::StreamIterator<seqan::Tagseqan::Input_ > >&, seqan::Tagseqan::TagFastq_) + 0x77
6 [0x81924c] bool seqan::loadReads<MyFragStoreConfig, seqan::FragmentStoreConfig, seqan::RazerSOptions<seqan::RazerSSpec<false, false> > >(seqan::FragmentStore<MyFragStoreConfig, seqan::FragmentStoreConfig >&, seqan::FormattedFile<seqan::Tagseqan::TagFastq_, seqan::Tagseqan::Input_, void>&, seqan::RazerSOptions<seqan::RazerSSpec<false, false> >&) + 0x4ec
7 [0xa9c205] int mapReads<seqan::RazerSSpec<false, false> >(seqan::StringSet<seqan::String<char, seqan::Alloc >, seqan::Owner<seqan::Tagseqan::Default_ > >&, seqan::StringSet<seqan::String<char, seqan::Alloc >, seqan::Owner<seqan::Tagseqan::Default_ > >&, seqan::RazerSOptions<seqan::RazerSSpec<false, false> >&) + 0x4c5
8 [0x76287f] main + 0x22f
9 [0x31e101ed1d] __libc_start_main + 0xfd
10 [0x763039] /Biomarker/ngs/software/razers3/razers3-3.4.0-Linux-x86_64/bin/razers3()
0:00:04.85 Generating binary hit matrix.
Traceback (most recent call last):
File "/Biomarker/ngs/software/OptiType/OptiType-master/OptiTypePipeline.py", line 275, in <module>
pos, read_details = ht.pysam_to_hdf(bam_paths[0])
File "/Biomarker/ngs/software/OptiType/OptiType-master/hlatyper.py", line 177, in pysam_to_hdf
sam = pysam.AlignmentFile(samfile, sam_or_bam)
File "pysam/calignmentfile.pyx", line 311, in pysam.calignmentfile.AlignmentFile.__cinit__ (pysam/calignmentfile.c:4929)
File "pysam/calignmentfile.pyx", line 480, in pysam.calignmentfile.AlignmentFile._open (pysam/calignmentfile.c:6905)
IOError: file /ts19/ngs/studies/ngs_000230/neoantigen_analysis/optitype/TB2-EM11595/2018_03_02_10_16_01_1.bam
not found
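The final IOError looks like a consequence of the mapper crashing, since no BAM file was written. One thing worth ruling out for the `Unexpected end of input` message is a truncated or incomplete input file. This is a minimal sketch, assuming plain four-line FASTQ records inside the gzip file; `count_fastq_gz_records` is a hypothetical helper, and passing the check does not prove the mapper can read compressed input.

```python
import gzip


def count_fastq_gz_records(path):
    """Read a gzipped FASTQ to the end and return the number of records.

    Raises EOFError if the gzip stream is truncated, and ValueError if the
    number of lines is not a multiple of four.
    """
    n_lines = 0
    with gzip.open(path, "rt") as handle:
        for _ in handle:
            n_lines += 1
    if n_lines % 4 != 0:
        raise ValueError("%s has %d lines, not a multiple of 4" % (path, n_lines))
    return n_lines // 4


if __name__ == "__main__":
    import sys

    for fastq in sys.argv[1:]:
        print(fastq, count_fastq_gz_records(fastq))
```

Running it on both R1 and R2 also shows quickly whether the two files contain the same number of records.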
I have a 28GB bam file, and I'm running into memory issues with Razers3 even when I allocate 32GB of RAM for this job. Do you have any recommendations on what to do? Thanks.
Dear all,
As far as I can tell, I have installed OptiType and the required software and libraries; however, I get the following error:
C02NQ30CG3QT:OptiType-master jimene01$ python OptiTypePipeline.py --help
Error loading 'pyutilib.component' entry points: 'type object 'PluginGlobals' has no attribute 'add_env''
Traceback (most recent call last):
File "OptiTypePipeline.py", line 108, in <module>
from model import OptiType
File "/Users/jimene01/Documents/ResearchPlacement/Project/OptiType/OptiType-master/model.py", line 15, in <module>
import coopr.environ
File "/Users/jimene01/anaconda/lib/python2.7/site-packages/coopr/environ/__init__.py", line 48, in <module>
import_packages()
File "/Users/jimene01/anaconda/lib/python2.7/site-packages/coopr/environ/__init__.py", line 38, in import_packages
do_import(pname)
File "/Users/jimene01/anaconda/lib/python2.7/site-packages/coopr/environ/__init__.py", line 20, in do_import
__import__(pname, globals(), locals(), [], -1)
File "/Users/jimene01/anaconda/lib/python2.7/site-packages/coopr/pyomo/__init__.py", line 16, in <module>
from pyomo.environ import *
File "/Users/jimene01/anaconda/lib/python2.7/site-packages/pyomo/environ/__init__.py", line 13, in <module>
import pyomo.core
File "/Users/jimene01/anaconda/lib/python2.7/site-packages/pyomo/core/__init__.py", line 10, in <module>
from pyomo.util.plugin import PluginGlobals
File "/Users/jimene01/anaconda/lib/python2.7/site-packages/pyomo/util/__init__.py", line 10, in <module>
from pyomo.util._task import pyomo_api, PyomoAPIData, PyomoAPIFactory
File "/Users/jimene01/anaconda/lib/python2.7/site-packages/pyomo/util/_task.py", line 26, in <module>
plugin.PluginGlobals.add_env("pyomo")
AttributeError: type object 'PluginGlobals' has no attribute 'add_env'
I would greatly appreciate help troubleshooting my installation. I have all the required software, but for many packages the installed version differs from the one listed in the requirements, and I do not know whether that affects OptiType.
Moreover, I am not sure whether RazerS3 and Cbc work correctly. To be honest, this is the first time I have installed something that depends on so many other programs, and I am not sure everything works together correctly.
I would greatly appreciate any help,
Alejandro
When using the Docker image, I get an error that looks like a permission problem in the mkdir call. I searched online and tried adding '--privileged=true', but it still doesn't work. My system is Ubuntu 16.04.3 with Docker CE. Is there anything I can do to solve it?
Traceback (most recent call last):
File "/usr/local/bin/OptiType/OptiTypePipeline.py", line 264, in <module>
os.makedirs(out_dir)
File "/usr/lib/python2.7/os.py", line 157, in makedirs
mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/data/2017_12_31_08_56_44'
Another error occurs with the installation from Bioconda:
0:12:16.95 Generating binary hit matrix.
Traceback (most recent call last):
File "/home/zjd/miniconda3/envs/python2.7/bin/OptiTypePipeline.py", line 303, in <module>
pos, read_details = ht.pysam_to_hdf(bam_paths[0])
File "/home/zjd/miniconda3/envs/python2.7/share/optitype-1.2.1-0/hlatyper.py", line 186, in pysam_to_hdf
sam = pysam.AlignmentFile(samfile, sam_or_bam)
AttributeError: 'module' object has no attribute 'AlignmentFile'
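As far as I know, older pysam releases exposed the BAM reader under the name `Samfile` and only later added `AlignmentFile`, so this AttributeError may simply mean the environment has an outdated pysam. Here is a quick check, as a sketch; `bam_reader_class` is a hypothetical helper and only inspects whichever module gets imported.

```python
import importlib


def bam_reader_class(module_name="pysam"):
    """Return the name of the BAM reader class the module exposes, or None.

    None means the module could not be imported or offers neither class name.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    for name in ("AlignmentFile", "Samfile"):
        if hasattr(module, name):
            return name
    return None


if __name__ == "__main__":
    found = bam_reader_class()
    if found == "AlignmentFile":
        print("pysam provides AlignmentFile")
    elif found == "Samfile":
        print("pysam only provides Samfile; it is probably too old for this script")
    else:
        print("pysam is missing or exposes neither class")
```

Run it with the same interpreter and conda environment that runs OptiTypePipeline.py, otherwise it may inspect a different pysam.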
OptiType is crashing with the following error message from the log:
mapping with 8 threads...
0:00:59.80 Mapping PGDX3144N_HLA_filtered_fished.fastq to GEN reference...
0:03:48.28 Generating binary hit matrix.
0:03:48.31 Loading ../OptiType_output/2017_03_07_16_43_12/2017_03_07_16_43_12_1.bam started. Number of HLA reads loaded (updated every thousand):
1K...2K...3K...4K...5K...6K...7K...8K...9K...10K...11K...12K...13K...14K...
0:04:18.66 14383 reads loaded. Creating dataframe...
0:04:21.58 Dataframes created. Shape: 14383 x 11179, hits: 4063907 (4063907), sparsity: 1 in 39.56
0:04:37.63 temporary pruning of identical rows and columns
0:04:39.04 Size of mtx with unique rows and columns: (767, 1332)
0:04:39.04 determining minimal set of non-overshadowed alleles
0:04:55.87 Keeping only the minimal number of required alleles (168,)
0:04:55.91 Creating compact model...
starting ilp solver with 1 threads...
0:04:58.69 Initializing OptiType model...
ERROR: Rule failed when generating expression for objective read_cov:
RuntimeError: Expression entered generate_expression() with too few references (-1<0); this is indicative of a SERIOUS ERROR in the expression reuse detection scheme.
ERROR: Constructing component 'read_cov' from data=None failed:
RuntimeError: Expression entered generate_expression() with too few references (-1<0); this is indicative of a SERIOUS ERROR in the expression reuse detection scheme.
The explicit error message is:
Traceback (most recent call last):
File "OptiTypePipeline.py", line 404, in <module>
config.get("ilp", "solver"), threads, verbosity=VERBOSE)
File "/n/regal/nowak_lab/immunotherapy/OptiType/model.py", line 97, in __init__
model.reconst[a] * model.x[a] for a in model.L), sense=maximize)
File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/pyomo/core/base/block.py", line 484, in __setattr__
self.add_component(name, val)
File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/pyomo/core/base/block.py", line 890, in add_component
val.construct(data)
File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/pyomo/core/base/objective.py", line 307, in construct
tmp = _init_rule(_self_parent)
File "/n/regal/nowak_lab/immunotherapy/OptiType/model.py", line 96, in <lambda>
rule=lambda model: sum(model.occ[r] * (model.y[r] - model.beta * (model.re[r])) for r in model.R) - sum(
File "/n/regal/nowak_lab/immunotherapy/OptiType/model.py", line 96, in <genexpr>
rule=lambda model: sum(model.occ[r] * (model.y[r] - model.beta * (model.re[r])) for r in model.R) - sum(
File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/pyomo/core/base/numvalue.py", line 460, in __sub__
return generate_expression(_sub,self,other)
File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/pyomo/core/base/expr_coopr3.py", line 1028, in generate_expression
other = _generate_expression__clone_if_needed(other, 0)
File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/pyomo/core/base/expr_coopr3.py", line 918, in _generate_expression__clone_if_needed
% ( getrefcount(obj) - UNREFERENCED_EXPR_COUNT, ))
RuntimeError: Expression entered generate_expression() with too few references (-1<0); this is indicative of a SERIOUS ERROR in the expression reuse detection scheme.
Do you know what the problem could be?
Dear main contributors @andras86 @b-schubert
Thank you for releasing this efficient open-source code.
From my few experiments, it works well for Class I. Nice! :-)
Obviously, more is better :-) and Class II support would be even nicer.
From a quick read of your paper and your code, the proposed strategy should work (more or less) with any class. Right?
"Less" in the sense that the integer linear program becomes harder to solve as the number of alleles increases, since the constraint matrix grows large (or extremely large).
Therefore, I would like to ask:
Is an updated version still planned?
All the best
Hello,
OptiType returned the following message while calling the module model.py:
Invalid option '-s'; try /group/bioinformatics/software/GLPK/4.61/bin/glpsol --help
ERROR: "[base]/site-packages/pyomo/opt/base/solvers.py", 599, solve
Solver (asl) returned non-zero return code (1)
ERROR: "[base]/site-packages/pyomo/opt/base/solvers.py", 602, solve
See the solver log above for diagnostic information.
This is my command:
module load anaconda3
source activate python2.7
module load GLPK/4.61 HDF5/1.10.0-patch1 samtools/1.2 bwa/0.7.15 sambamba/0.5.6
python ./OptiTypePipeline.py -i ./test/exome/NA11995_SRR766010_1_fished.fastq ./test/exome/NA11995_SRR766010_2_fished.fastq -o test -d -v
source deactivate
I was not able to install RazerS and CPLEX on our server, so I used bwa and GLPK instead. I modified OptiTypePipeline.py slightly to run bwa alignment commands instead of the default RazerS. The command-line solver for GLPK is glpsol, which I also set in config.ini. Could that be the reason for the error? The modified OptiTypePipeline.py, config, commands and log files are attached here in case they help: myfiles.zip
Any help would be highly appreciated. Thanks much in advance :)
Best,
Riyue (Sunny)
The University of Chicago
Hello!
In the readme the version is 1.3.1 (2014), but the releases page only lists 1.2.1. I think it would be better to fix this.
And inside OptiTypePipeline.py we can see:
Date: April 2014
Version: 1.0
Hi,
When I installed OptiType, I found that the RazerS3 website could not be accessed, and there seems to be no other way to download RazerS3. Could you help?
Hi
I know somebody else here had the same problem.
I'm trying to use OptiType in Docker:
docker run -v /local/pVACtools/:/data/ -t fred2/optitype --input SRR2672972_1.fastq SRR2672972_2.fastq --rna -o /local/pVACtools/Optitype/RNA_control
And I get this:
Traceback (most recent call last):
File "/usr/local/bin/OptiType/OptiTypePipeline.py", line 235, in <module>
os.makedirs(args.outdir)
File "/usr/lib/python2.7/os.py", line 150, in makedirs
makedirs(head, mode)
File "/usr/lib/python2.7/os.py", line 150, in makedirs
makedirs(head, mode)
File "/usr/lib/python2.7/os.py", line 150, in makedirs
makedirs(head, mode)
File "/usr/lib/python2.7/os.py", line 150, in makedirs
makedirs(head, mode)
File "/usr/lib/python2.7/os.py", line 157, in makedirs
mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/local'
I assume that Python 2.7 is also in the Docker image. I've tried running this with sudo and it keeps showing the same error.
Thanks
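One possible explanation: the command mounts `/local/pVACtools/` at `/data/` inside the container, but the `-o` argument is still a host path. Inside the container `/local` does not exist, so `os.makedirs` tries to create it at the filesystem root and is refused. Below is a minimal sketch of translating host paths to their in-container equivalents. `to_container_path` is a hypothetical helper, and it assumes the bind mount from the command above.

```python
import posixpath


def to_container_path(host_path, host_root, container_root):
    """Map an absolute host path under host_root to the matching container path.

    Raises ValueError if host_path is not inside the mounted directory,
    because the container cannot see such a path at all.
    """
    host_path = posixpath.normpath(host_path)
    host_root = posixpath.normpath(host_root)
    prefix = host_root.rstrip("/") + "/"
    if host_path != host_root and not host_path.startswith(prefix):
        raise ValueError("%s is outside the mounted directory %s" % (host_path, host_root))
    relative = posixpath.relpath(host_path, host_root)
    return posixpath.normpath(posixpath.join(container_root, relative))


if __name__ == "__main__":
    # Mount taken from the command above: -v /local/pVACtools/:/data/
    print(to_container_path("/local/pVACtools/Optitype/RNA_control",
                            "/local/pVACtools/", "/data/"))
```

With that mount, the output directory would be passed as `/data/Optitype/RNA_control`, and the input FASTQ files would need `/data/...` paths as well.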
Steps I've taken:
Here is the error I'm getting:
docker run -v /home/biodocker:/data/ -t fred2/optitype -i /home/biodocker/subject1.mhc-raz1.fastq /home/biodocker/subject1.mhc-raz2.fastq -d -o /home/biodocker
Traceback (most recent call last):
File "/usr/local/bin/OptiType/OptiTypePipeline.py", line 299, in <module>
pos, read_details = ht.pysam_to_hdf(bam_paths[0])
File "/usr/local/bin/OptiType/hlatyper.py", line 186, in pysam_to_hdf
sam = pysam.AlignmentFile(samfile, sam_or_bam)
File "pysam/libcalignmentfile.pyx", line 397, in pysam.libcalignmentfile.AlignmentFile.__cinit__ (pysam/libcalignmentfile.c:5831)
File "pysam/libcalignmentfile.pyx", line 558, in pysam.libcalignmentfile.AlignmentFile._open (pysam/libcalignmentfile.c:7556)
IOError: file `/home/biodocker/2017_05_10_23_04_02/2017_05_10_23_04_02_1.bam` not found
Any thoughts on what I may be doing wrong? Any advice or suggestions would be greatly appreciated.
Hey guys.
I'm developing a genotyper with a similar purpose to OptiType. When comparing it to OptiType, I noticed some inconsistencies between the verification data I found on the 1000 Genomes FTP site ( ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/working/20140725_hla_genotypes/20140702_hla_diversity.txt ) and the data you use in your supplementary table S2.
For example: In your table you have...
What's the reason behind this inconsistency? I'm sorry if you mention it somewhere in your article, but I couldn't find it.
Cheers,
Hannes
Hello,
I ran the script with the provided exome test files and got these error messages. I would greatly appreciate it if you can have a look and help me troubleshoot the issues.
[jc2545@login-0-1 exome]$ python ~/programs/OptiType-master/OptiTypePipeline.py -i NA11995_SRR766010_1_fished.fastq NA11995_SRR766010_2_fished.fastq -d -v -o .
Traceback (most recent call last):
File "/home/jc2545/programs/OptiType-master/OptiTypePipeline.py", line 195, in <module>
ALLELE_HDF = config.get("LIBRARIES", "ALLELES")
File "/home/jc2545/python/lib/python2.7/ConfigParser.py", line 607, in get
raise NoSectionError(section)
ConfigParser.NoSectionError: No section: 'LIBRARIES'
Here is my config.ini file
[jc2545@login-0-0 OptiType-master]$ cat config.ini
[MAPPING]
#please specify the razerS3 binary path
RAZERS3=/home/jc2545/programs/razers3-3.4.0-Linux-x86_64/bin/razers3
THREADS=8
[LIBRARIES]
RNA_REF=./data/hla_reference_rna.fasta
DNA_REF=./data/hla_reference_dna.fasta
ALLELES=./data/alleles.h5
[OPTIMIZATION]
#the solver has to be supported by Coopr
SOLVER=cbc
THREADS=1
I believe all the required software and libraries are installed.
[jc2545@login-0-1 exome]$ python ~/programs/OptiType-master/OptiTypePipeline.py --help
usage: OptiType [-h] --input INPUT [INPUT ...] (--rna | --dna) [--beta BETA]
[--enumerate ENUMERATE] --outdir OUTDIR [--verbose]
OptiType: 4-digit HLA typer
optional arguments:
-h, --help show this help message and exit
--input INPUT [INPUT ...], -i INPUT [INPUT ...]
Fastq files with fished HLA reads. Max two files (for
paired-end)
--rna, -r Specifiying the mapped data as RNA.
--dna, -d Specifiying the mapped data as DNA.
--beta BETA, -b BETA The beta value for for homozygosity detection.
--enumerate ENUMERATE, -e ENUMERATE
The number of enumerations.
--outdir OUTDIR, -o OUTDIR
Specifies the out directory to which all files should
be written
--verbose, -v Set verbose mode on.
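A `NoSectionError` for a section that clearly exists in the file often means the file was never read: Python's config parser silently skips paths it cannot open, and a relative path such as `config.ini` is resolved against the current working directory (here `exome`, not `OptiType-master`). Where this OptiType version actually looks for its config is an assumption on my part. The sketch below uses Python 3's `configparser` (the traceback shows the Python 2 `ConfigParser` module, which behaves the same way in this respect), and `load_config` is a hypothetical helper.

```python
import configparser


def load_config(path):
    """Parse an INI file and fail loudly if it could not be read."""
    parser = configparser.ConfigParser()
    # read() returns the list of files it parsed; missing files are skipped silently.
    loaded = parser.read(path)
    if not loaded:
        raise FileNotFoundError("config file not found or unreadable: %s" % path)
    return parser


if __name__ == "__main__":
    import sys

    config = load_config(sys.argv[1] if len(sys.argv) > 1 else "config.ini")
    print(config.get("LIBRARIES", "ALLELES"))
```

Note that the relative `./data/...` entries inside the config are resolved against the working directory too, so absolute paths are the safer choice there.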
Hello,
I am using OptiType with Python 2.7.10. After installing some modules, I could run the analysis:
python /OptiType/OptiTypePipeline.py -d -v -i
until the "determining minimal set of non-overshadowed alleles" step:
` 0:00:00.38 Mapping Sample_214310406_T-AL-O_1.fastq to GEN reference...
0:00:19.11 Mapping Sample_214310406_T-AL-O_2.fastq to GEN reference...
0:00:38.76 Generating binary hit matrix.
0:00:38.76 Loading alleles and read IDs from /data/Analysis/NeoEpitopePrediction/HLA/2017_04_25_11_59_59/2017_04_25_11_59_59_0.sam...
0:00:40.06 11179 alleles and 2016 reads found.
0:00:40.06 Initializing mapping matrix...
0:00:40.07 2016x11179 mapping matrix initialized. Populating 1077618 hits from SAM file...
10% completed
20% completed
30% completed
40% completed
50% completed
60% completed
70% completed
80% completed
90% completed
100% completed
0:03:35.02 1077618 elements filled. Matrix sparsity: 1 in 20.91
0:03:44.25 Loading alleles and read IDs from /data/Analysis/NeoEpitopePrediction/HLA/2017_04_25_11_59_59/2017_04_25_11_59_59_1.sam...
0:03:45.11 11179 alleles and 2177 reads found.
0:03:45.11 Initializing mapping matrix...
0:03:45.12 2177x11179 mapping matrix initialized. Populating 992781 hits from SAM file...
10% completed
20% completed
30% completed
40% completed
50% completed
60% completed
70% completed
80% completed
90% completed
100% completed
0:06:28.36 992781 elements filled. Matrix sparsity: 1 in 24.51
0:06:40.68 temporary pruning of identical rows and columns
0:06:40.71 Size of mtx with unique rows and columns: (312, 446)
0:06:40.71 determining minimal set of non-overshadowed alleles `
Could this problem be related to the solver? I tried both solvers, cbc and glpk, which I added to my $PATH.
Thank you in advance for your help
Hello,
I get the following error when I use BAM files as an input for OptiType. Is there a restriction on the chromosome notation or does this error have a different cause?
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1434, in _has_valid_type
error()
File "/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1429, in error
(key, self.obj._get_axis_name(axis)))
KeyError: 'the label [chr1] is not in the [index]'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/software/install/OptiType/OptiTypePipeline.py", line 355, in <module>
alleles_to_keep = list(filter(is_frequent, binary.columns))
File "/home/software/install/OptiType/OptiTypePipeline.py", line 142, in is_frequent
return table.loc[allele_id]['4digit'] in freq_alleles and table.loc[allele_id]['flags'] == 0 or (table.loc[allele_id]['locus'] in 'HGJ')
File "/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1328, in __getitem__
return self._getitem_axis(key, axis=0)
File "/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1551, in _getitem_axis
self._has_valid_type(key, axis)
File "/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1442, in _has_valid_type
error()
File "/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1429, in error
(key, self.obj._get_axis_name(axis)))
KeyError: 'the label [chr1] is not in the [index]'
Hello all,
I was wondering how OptiType handles ambiguous results. I would expect that the solver returns something along the lines of "no single best solution found" or similar. Or will all best results be reported?
Further, is this a problem at all? That is, you have benchmarked with quite a lot of datasets: have you ever seen such a case, or is it too rare to worry about? I just saw the -e switch, so I can't say yet for our benchmarks.
From "Ambiguous allele combinations in HLA Class I and Class II sequence-based typing: when precise nucleotide sequencing leads to imprecise allele identification" (http://dx.doi.org/10.1186%2F1479-5876-2-30):
"However, one of the inherent problems with this typing method is the interpretation of ambiguous allele combinations which occur when two or more different allele combinations produce identical sequences."
Example: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC517951/figure/F1/
The complete list can be found here: http://www.ebi.ac.uk/ipd/imgt/hla/ambig.html
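To make the quoted problem concrete, here is a toy sketch of how two different allele pairs can give the same unphased result. The allele names and two-base "sequences" are invented for illustration, and this is not how OptiType scores solutions; it only shows the kind of ambiguity the paper describes.

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def unphased_genotype(seq_a, seq_b):
    """Per-position set of bases seen when two alleles are read without phase."""
    return tuple(frozenset(pair) for pair in zip(seq_a, seq_b))


def ambiguous_groups(alleles):
    """Return groups of allele pairs that share the same unphased genotype."""
    groups = defaultdict(list)
    named = sorted(alleles.items())
    for (name_a, seq_a), (name_b, seq_b) in combinations_with_replacement(named, 2):
        groups[unphased_genotype(seq_a, seq_b)].append((name_a, name_b))
    return [pairs for pairs in groups.values() if len(pairs) > 1]


# Invented alleles: the pairs X*01/X*02 and X*03/X*04 both show {A,G} then {C,T}.
TOY_ALLELES = {"X*01": "AC", "X*02": "GT", "X*03": "AT", "X*04": "GC"}

if __name__ == "__main__":
    for group in ambiguous_groups(TOY_ALLELES):
        print(group)
```

In this toy set, only the pairs X*01/X*02 and X*03/X*04 collide; every other combination remains distinguishable.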
With Pyomo 5.x the following error occurs for GLPK <= 4.5x (see Pyomo/pyomo#146)
Writing MIP solution to `/tmp/tmpDUGZvJ.glpk.raw'...
474 lines were written
ERROR: Expecting 's' row after 'c' rows
Traceback (most recent call last):
File "OptiTypePipeline.py", line 405, in <module>
result = op.solve(args.enumerate)
File "/home/travis/build/FRED-2/OptiType/model.py", line 153, in solve
res = self.__solver.solve(self.__instance, options={}, tee=self.__verbosity)
File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/pyomo/opt/base/solvers.py", line 610, in solve
result = self._postsolve()
File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/pyomo/opt/solver/shellcmd.py", line 268, in _postsolve
results = self.process_output(self._rc)
File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/pyomo/opt/solver/shellcmd.py", line 330, in process_output
self.process_soln_file(results)
File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/pyomo/solvers/plugins/solvers/GLPK.py", line 362, in process_soln_file
raise ValueError(msg)
ValueError: Error parsing solution data file, line 2
biopython==1.64
Coopr 3.5.8787 (CPython 2.7.6 on Linux 3.13.0-37-generic)
matplotlib==1.3.1
pandas==0.13.1
solver: glpk 4.35
python OptiTypePipeline.py -i ./test/exome/NA11995_SRR766010_1_fished.fastq ./test/exome/NA11995_SRR766010_2_fished.fastq -d -v -o ./test/exome/
WARNING: No construction rule or expression specified for constraint 'c'
Invalid option `--threads'; try /usr/local/bin/glpsol --help
ERROR: "[base]/dist-packages/coopr/opt/base/solvers.py", 448, solve
Solver (glpk) returned non-zero return code (1)
ERROR: "[base]/dist-packages/coopr/opt/base/solvers.py", 451, solve
See the solver log above for diagnostic information.
glp_read_lp: reading problem data from `/tmp/tmpA9lW95.pyomo.lp'...
/tmp/tmpA9lW95.pyomo.lp:3620: warning: lower bound of variable `x481' redefined
/tmp/tmpA9lW95.pyomo.lp:3620: warning: upper bound of variable `x481' redefined
... (multiple lines similar warnings)
A1 A2 B1 B2 C1 C2 Reads Objective
0 A*01:01 A*01:01 B*08:01 B*57:01 C*07:01 C*06:02 1156 1135.192
Hi,
When running against the test data, I'm getting the below error that I think traces back to GLPK but I'm not sure. Can someone help me possibly debug this issue?
Here's the shell script I used to run it:
`
export SAMTOOLS=/Biomarker/ngs/software/samtools/samtools-1.2/bin
export GLPK=/Biomarker/ngs/software/glpk/glpk-4.59/bin
export PATH=$SAMTOOLS:$GLPK:$PATH
export HDF5_DIR=/Biomarker/ngs/software/HD5/hdf5-1.8.16-linux-centos7-x86_64-gcc483-shared
export LD_LIBRARY_PATH=/Biomarker/ngs/software/HD5/hdf5-1.8.16-linux-centos7-x86_64-gcc483-shared/lib
/Biomarker/ngs/software/bin/python OptiType-master/OptiTypePipeline.py -i OptiType-master/test/exome/NA11995_SRR766010_1_fished.fastq OptiType-master/test/exome/NA11995_SRR766010_2_fished.fastq --dna --verbose --config OptiType-master/config.ini -o OptiType-master/test/exome/
`
The head of the .raw file looks like this:
c Problem:
c Rows: 450
c Columns: 282
c Non-zeros: 1715
c Status: INTEGER OPTIMAL
c Objective: x282 = 1135.192 (MAXimum)
c
s mip 450 282 o 1135.192
i 1 1
i 2 2
i 3 2
i 4 1
i 5 1
i 6 1
i 7 1
i 8 2
i 9 2
i 10 1
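For anyone who wants to check what the solver actually wrote, the excerpt above can be read with a few lines of Python. This is only a sketch: the field meanings (`c` for comment lines, `s mip <rows> <cols> <status> <objective>` for the summary, `i <row> <value>` for row entries) are inferred from the excerpt itself, and `parse_raw_head` is a hypothetical helper, not part of Pyomo or GLPK.

```python
def parse_raw_head(lines):
    """Extract the summary and row entries from the start of a .raw file."""
    summary = {"comments": [], "rows": {}}
    for raw_line in lines:
        line = raw_line.strip()
        fields = line.split()
        if not fields:
            continue
        tag = fields[0]
        if tag == "c":
            # Comment line: keep the text after the leading "c".
            summary["comments"].append(line[1:].strip())
        elif tag == "s":
            # Summary line, layout inferred from the excerpt.
            summary["kind"] = fields[1]
            summary["n_rows"] = int(fields[2])
            summary["n_cols"] = int(fields[3])
            summary["status"] = fields[4]
            summary["objective"] = float(fields[5])
        elif tag == "i":
            summary["rows"][int(fields[1])] = float(fields[2])
    return summary


EXCERPT = """c Status: INTEGER OPTIMAL
c Objective: x282 = 1135.192 (MAXimum)
c
s mip 450 282 o 1135.192
i 1 1
i 2 2
"""

if __name__ == "__main__":
    print(parse_raw_head(EXCERPT.splitlines()))
```

Since the file reports an optimal objective of 1135.192, the solve itself seems to have succeeded, and the error would more likely come from reading the result back.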
ERROR (at the bottom):
0:00:01.08 Mapping NA11995_SRR766010_1_fished.fastq to GEN reference...
0:00:31.21 Mapping NA11995_SRR766010_2_fished.fastq to GEN reference...
0:00:57.64 Generating binary hit matrix.
0:00:57.66 Loading OptiType-master/test/exome/2016_03_23_16_57_45/2016_03_23_16_57_45_1.bam started. Number of HLA reads loaded (updated every thousand):
1K...
0:01:00.97 1909 reads loaded. Creating dataframe...
0:01:01.22 Dataframes created. Shape: 1909 x 11179, hits: 688669 (1249465), sparsity: 1 in 17.08
0:01:01.60 Loading OptiType-master/test/exome/2016_03_23_16_57_45/2016_03_23_16_57_45_2.bam started. Number of HLA reads loaded (updated every thousand):
1K...
0:01:04.73 1876 reads loaded. Creating dataframe...
0:01:04.92 Dataframes created. Shape: 1876 x 11179, hits: 657359 (1192811), sparsity: 1 in 17.58
0:01:05.67 Alignment pairing completed. 1681 paired, 359 unpaired, 32 discordant
0:01:11.14 temporary pruning of identical rows and columns
0:01:11.32 Size of mtx with unique rows and columns: (496, 776)
0:01:11.32 determining minimal set of non-overshadowed alleles
0:01:13.67 Keeping only the minimal number of required alleles (62,)
0:01:13.67 Creating compact model...
0:01:13.82 Initializing OptiType model...
GLPSOL: GLPK LP/MIP Solver, v4.59
Parameter(s) specified in the command line:
--write /tmp/tmpGZXIuT.glpk.raw --wglp /tmp/tmpmXCoNz.glpk.glp --cpxlp /tmp/tmpWPTOBn.pyomo.lp
Reading problem data from '/tmp/tmpWPTOBn.pyomo.lp'...
/tmp/tmpWPTOBn.pyomo.lp:3620: warning: lower bound of variable 'x1' redefined
/tmp/tmpWPTOBn.pyomo.lp:3620: warning: upper bound of variable 'x1' redefined
450 rows, 282 columns, 1715 non-zeros
171 integer variables, all of which are binary
3791 lines were read
Writing problem data to '/tmp/tmpmXCoNz.glpk.glp'...
3276 lines were written
GLPK Integer Optimizer, v4.59
450 rows, 282 columns, 1715 non-zeros
171 integer variables, all of which are binary
Preprocessing...
2 hidden packing inequaliti(es) were detected
95 hidden covering inequaliti(es) were detected
444 rows, 280 columns, 1705 non-zeros
170 integer variables, all of which are binary
Scaling...
A: min|aij| = 1.000e+00 max|aij| = 6.000e+00 ratio = 6.000e+00
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 444
Solving LP relaxation...
GLPK Simplex Optimizer, v4.59
444 rows, 280 columns, 1705 non-zeros
0: obj = -0.000000000e+00 inf = 5.000e+00 (5)
5: obj = -3.000000000e-02 inf = 0.000e+00 (0)
Hello!
I'm running into an error at the "Result dataframe has been constructed..." stage.
Here is the command I used:
python ~/src/OptiType/OptiTypePipeline.py -v -i nebula_finished.fastq --dna -o . &> run_log.txt
My config.ini:
[MAPPING]
#please specify the razerS3 binary path
RAZERS3=/home/ubuntu/src/razers3-3.4.0-Linux-x86_64/bin/razers3
THREADS=8
[LIBRARIES]
RNA_REF=./data/hla_reference_rna.fasta
DNA_REF=./data/hla_reference_dna.fasta
ALLELES=./data/alleles.h5
[OPTIMIZATION]
#the solver has to be supported by Coopr
SOLVER=cbc
THREADS=1
And the run log:
0:00:02.66 Mapping nebula_finished.fastq to GEN reference...
0:05:23.24 Generating binary hit matrix.
0:05:23.24 Loading alleles and read IDs from ./2015_02_23_21_31_27/2015_02_23_21_31_27_0.sam...
0:05:27.01 11179 alleles and 13842 reads found.
0:05:27.01 Initializing mapping matrix...
0:05:27.02 13842x11179 mapping matrix initialized. Populating 4135583 hits from SAM file...
10% completed
20% completed
30% completed
40% completed
50% completed
60% completed
70% completed
80% completed
90% completed
100% completed
0:49:01.96 4135583 elements filled. Matrix sparsity: 1 in 37.42
0:50:19.10 temporary pruning of identical rows and columns
0:50:19.82 Size of mtx with unique rows and columns: (2163, 1384)
0:50:19.82 determining minimal set of non-overshadowed alleles
0:50:24.99 Keeping only the minimal number of required alleles (184,)
0:50:24.99 Creating compact model...
0:50:25.35 Initializing OptiType model...
WARNING: No construction rule or expression specified for constraint 'c'
Welcome to the CBC MILP Solver
Version: 2.8.7
Build Date: Dec 28 2013
command line - /usr/bin/cbc -printingOptions all -import /tmp/tmp8qUjdt.pyomo.lp -import -stat=1 -solve -solu /tmp/tmp8qUjdt.pyomo.soln (default strategy 1)
Option for printingOptions changed from normal to all
Coin0009I CoinLpIO::readLp(): Maximization problem reformulated as minimization
Current default (if $ as parameter) for import is /tmp/tmp8qUjdt.pyomo.lp
Presolve 2401 (-1) rows, 1379 (-1) columns and 19535 (-1) elements
Statistics for presolved model
Problem has 2401 rows, 1379 columns (1323 with objective) and 19535 elements
Column breakdown:
597 of type 0.0->inf, 1 of type 0.0->up, 0 of type lo->inf,
0 of type lo->up, 0 of type free, 0 of type fixed,
0 of type -inf->0.0, 0 of type -inf->up, 781 of type 0.0->1.0
Row breakdown:
0 of type E 0.0, 0 of type E 1.0, 0 of type E -1.0,
0 of type E other, 0 of type G 0.0, 6 of type G 1.0,
0 of type G other, 1791 of type L 0.0, 0 of type L 1.0,
604 of type L other, 0 of type Range 0.0->1.0, 0 of type Range other,
0 of type Free
Continuous objective value is -6923.74 - 0.04 seconds
Cgl0004I processed model has 2395 rows, 1379 columns (781 integer) and 19351 elements
Cbc0038I Solution found of -6923.74
Cbc0038I Before mini branch and bound, 781 integers at bound fixed and 26 continuous
Cbc0038I Mini branch and bound did not improve solution (0.08 seconds)
Cbc0038I After 0.08 seconds - Feasibility pump exiting with objective of -6923.74 - took 0.01 seconds
Cbc0012I Integer solution of -6923.74 found by feasibility pump after 0 iterations and 0 nodes (0.08 seconds)
Cbc0001I Search completed - best objective -6923.739999999943, took 0 iterations and 0 nodes (0.09 seconds)
Cbc0035I Maximum depth 0, 0 variables fixed on reduced cost
Cuts at root node changed objective from -6923.74 to -6923.74
Probing was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Gomory was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Knapsack was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Clique was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
MixedIntegerRounding2 was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
FlowCover was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
TwoMirCuts was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Result - Optimal solution found
Objective value: -6923.74000000
Enumerated nodes: 0
Total iterations: 0
Time (CPU seconds): 0.11
Time (Wallclock seconds): 0.11
Total time (CPU seconds): 0.12 (Wallclock seconds): 0.12
0:50:26.55 Result dataframe has been constructed...
Traceback (most recent call last):
File "/home/ubuntu/src/OptiType/OptiTypePipeline.py", line 325, in <module>
coverage_mat = ht.calculate_coverage(plot_variables, features, hlatype, features_used)
File "/home/ubuntu/src/OptiType/hlatyper.py", line 505, in calculate_coverage
hit_counts[reads]):
File "/home/ubuntu/env/optitype/local/lib/python2.7/site-packages/pandas/core/series.py", line 641, in __getitem__
return self._get_with(key)
File "/home/ubuntu/env/optitype/local/lib/python2.7/site-packages/pandas/core/series.py", line 688, in _get_with
return self.reindex(key)
File "/home/ubuntu/env/optitype/local/lib/python2.7/site-packages/pandas/core/series.py", line 2646, in reindex
return self._reindex_with_indexers(new_index, indexer, copy=copy, fill_value=fill_value)
File "/home/ubuntu/env/optitype/local/lib/python2.7/site-packages/pandas/core/series.py", line 2650, in _reindex_with_indexers
return Series(new_values, index=index, name=self.name)
File "/home/ubuntu/env/optitype/local/lib/python2.7/site-packages/pandas/core/series.py", line 492, in __new__
subarr.index = index
File "properties.pyx", line 74, in pandas.lib.SeriesIndex.__set__ (pandas/lib.c:29541)
AssertionError: Index length did not match values
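The traceback above ends with pandas reindexing a Series by read ID. One thing worth ruling out is whether the input FASTQ contains repeated read IDs; this is an assumption, not something the log confirms. A minimal standard-library sketch follows; `find_duplicate_read_ids` is a hypothetical helper and not part of OptiType:

```python
from collections import Counter


def find_duplicate_read_ids(fastq_lines):
    """Return read IDs occurring more than once in 4-line-per-record FASTQ text.

    `fastq_lines` is any iterable of text lines (for example an open file).
    The ID is the first whitespace-delimited token of each header line,
    without the leading '@'.
    """
    counts = Counter()
    for line_no, line in enumerate(fastq_lines):
        if line_no % 4 != 0:  # only every fourth line is a header
            continue
        tokens = line.strip()[1:].split()
        if tokens:
            counts[tokens[0]] += 1
    return sorted(read_id for read_id, n in counts.items() if n > 1)


# Tiny in-memory example: "read1" appears twice.
example = [
    "@read1 extra\n", "ACGT\n", "+\n", "IIII\n",
    "@read2\n", "ACGT\n", "+\n", "IIII\n",
    "@read1\n", "TTTT\n", "+\n", "IIII\n",
]
print(find_duplicate_read_ids(example))
```

To run it on real data, pass an open file handle for the FASTQ instead of the `example` list. An empty result does not rule out other causes.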
Hi
I am trying to use OptiType and get the error:
Error loading 'pyutilib.component' entry points: 'type object 'PluginGlobals' has no attribute 'push_env''
Traceback (most recent call last):
File "OptiTypePipeline.py", line 124, in <module>
from model import OptiType
File "/home/usr/OptiType/model.py", line 19, in <module>
from pyomo.environ import ConcreteModel, Set, Param, Var, Binary, Objective, Constraint, ConstraintList, maximize
File "/home/python2.7/site-packages/pyomo/environ/__init__.py", line 14, in <module>
from pyomo.core import *
File "/home/python2.7/site-packages/pyomo/core/__init__.py", line 11, in <module>
from pyomo.util.plugin import PluginGlobals
File "/home/python2.7/site-packages/pyomo/util/__init__.py", line 11, in <module>
from pyomo.util._task import pyomo_api, PyomoAPIData, PyomoAPIFactory
File "/home/python2.7/site-packages/pyomo/util/_task.py", line 21, in <module>
import pyutilib.workflow
File "/home/python2.7/site-packages/pyutilib/workflow/__init__.py", line 11, in <module>
pyutilib.component.core.PluginGlobals.push_env("pyutilib.workflow")
AttributeError: type object 'PluginGlobals' has no attribute 'push_env'
All dependencies are already installed
Thanks for the help
Eli
OptiType was installed with conda install optitype, and it can print the help message, but it fails on small data (HLA_1.fastq = 31k and HLA1_2.fastq = 31k).
The command is python OptiTypePipeline.py --input HLA1_1.fastq HLA1_2.fastq --rna -v -o ./test/rna/
and error information is as follows:
mapping with 4 threads...
0:00:04.45 Mapping HLA1_1.fastq to NUC reference...
0:00:10.85 Mapping HLA1_2.fastq to NUC reference...
Segmentation fault (core dumped)
how can I deal with this? thanks
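The log does not show why the mapping step crashed. A cheap first check is whether both input files are structurally valid FASTQ and contain the same number of records. The sketch below is a standalone sanity check under that assumption; `count_fastq_records` and `check_pair` are hypothetical helpers, not OptiType functions:

```python
def count_fastq_records(lines):
    """Count records in 4-line FASTQ text; raise ValueError if malformed."""
    n_lines = 0
    for n_lines, line in enumerate(lines, start=1):
        position = (n_lines - 1) % 4
        if position == 0 and not line.startswith("@"):
            raise ValueError("line %d: expected '@' header" % n_lines)
        if position == 2 and not line.startswith("+"):
            raise ValueError("line %d: expected '+' separator" % n_lines)
    if n_lines % 4 != 0:
        raise ValueError("truncated record: %d lines in total" % n_lines)
    return n_lines // 4


def check_pair(lines_1, lines_2):
    """Return (counts_match, n_records_1, n_records_2) for a read pair."""
    n1 = count_fastq_records(lines_1)
    n2 = count_fastq_records(lines_2)
    return n1 == n2, n1, n2


one_record = ["@r1\n", "ACGT\n", "+\n", "IIII\n"]
print(check_pair(one_record, one_record * 2))
```

The check is deliberately strict: trailing blank lines or multi-line sequences are reported as malformed. With real files, pass two open file handles instead of the lists.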
Hi.
I'm trying to use OptiType, but I get some warnings and errors. I assume they are related to Pyomo, but I'm not sure how to solve them. Should I downgrade the version? I'm using version 5.2.
WARNING: Constant objective detected, replacing with a placeholder to prevent solver failure.
ERROR: Expecting 's' row after 'c' rows
WARNING: Solver does not support multi-threading. Please change the config file accordingly. Falling back to single-threading.
WARNING: Constant objective detected, replacing with a placeholder to prevent solver failure.
ERROR: Expecting 's' row after 'c' rows
Traceback (most recent call last):
File "/home/.local/lib/python3.5/site-packages/pyomo/solvers/plugins/solvers/GLPK.py", line 351, in process_soln_file
raise ValueError("Expecting 's' row after 'c' rows")
ValueError: Expecting 's' row after 'c' rows
Thanks for your time.
Hello,
OptiType keeps giving me the same error for a group of samples. This is single-end, 100 bp data. Normally the software works great for me, so I don't think it's my version or setup. I can send you a ~1 MB FASTQ file to reproduce it if you like.
Command
python OptiTypePipeline.py -i ${path}/temp2_fished1.fastq --rna --o ${path} -v
Error:
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part = 408
Solving LP relaxation...
GLPK Simplex Optimizer, v4.49
408 rows, 248 columns, 2006 non-zeros
0: obj = 0.000000000e+00 infeas = 1.000e+00 (0)
* 1: obj = 0.000000000e+00 infeas = 0.000e+00 (0)
* 192: obj = 3.090929000e+03 infeas = 0.000e+00 (0)
OPTIMAL SOLUTION FOUND
Integer optimization begins...
+ 192: mip = not found yet <= +inf (1; 0)
+ 192: >>>>> 3.090929000e+03 <= 3.090929000e+03 0.0% (1; 0)
+ 192: mip = 3.090929000e+03 <= tree is empty 0.0% (0; 1)
INTEGER OPTIMAL SOLUTION FOUND
Time used: 0.0 secs
Memory used: 0.7 Mb (727283 bytes)
Writing MIP solution to `/tmp/tmpGkiOZD.glpk.raw'...
672 lines were written
0:14:51.33 Result dataframe has been constructed...
Traceback (most recent call last):
File "OptiTypePipeline.py", line 315, in <module>
r = result_4digit[["A1", "A2", "B1", "B2", "C1", "C2", "nof_reads", "obj"]]
File "~/common/python/2.7.6/lib/python2.7/site-packages/pandas/core/frame.py", line 1781, in __getitem__
return self._getitem_array(key)
File "~/common/python/2.7.6/lib/python2.7/site-packages/pandas/core/frame.py", line 1825, in _getitem_array
indexer = self.ix._convert_to_indexer(key, axis=1)
File "~/common/python/2.7.6/lib/python2.7/site-packages/pandas/core/indexing.py", line 1140, in _convert_to_indexer
raise KeyError('%s not in index' % objarr[mask])
KeyError: "['B1' 'B2'] not in index"
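The KeyError says the result table has no `B1`/`B2` columns. A plausible reading, though the log does not confirm it, is that no HLA-B alleles were selected for these samples. The sketch below shows a defensive way to find which of the expected columns are actually present before selecting them; the column names come from the traceback, while `split_columns` is a hypothetical helper:

```python
# Column names taken from the failing line in the traceback.
EXPECTED_COLUMNS = ["A1", "A2", "B1", "B2", "C1", "C2", "nof_reads", "obj"]


def split_columns(available, expected=EXPECTED_COLUMNS):
    """Split `expected` into (present, missing) relative to `available`."""
    available = set(available)
    present = [name for name in expected if name in available]
    missing = [name for name in expected if name not in available]
    return present, missing


# Example: a result table that lacks the HLA-B columns.
present, missing = split_columns(["A1", "A2", "C1", "C2", "nof_reads", "obj"])
print("present:", present)
print("missing:", missing)
```

With a pandas DataFrame, `split_columns(result_4digit.columns)` should work the same way, and selecting only `present` avoids the exception. It does not explain why the locus is missing in the first place.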
This is Yijia Li from Yunnan Province Stem Cell Bank, China.
We are very interested in using OptiType for HLA typing.
However, may I ask whether I can run it on an already mapped, SNP-called VCF file, without the original FASTQ files?
Thanks very much.
Hello again,
The OptiType paper says that "[OptiType] can be easily adapted to predict genotypes for loci other than HLA-I such as HLA-II". Is this just a matter of changing some parameters in OptiTypePipeline.py, or is more involved? I am very interested in predicting HLA class II, so I would like to know how achievable this is.
Best,
Alejandro
Hi,
I was wondering if you could please help. I have attempted to install and run OptiType as per the guidance. Unfortunately, this error is generated:
python /Users/markg14/software/OptiType-master/OptiTypePipeline.py -i ./test/rna/CRC_81_N_1_fished.fastq ./test/rna/CRC_81_N_2_fished.fastq --rna -v -o ./test/rna/
mapping with 4 threads...
0:00:02.83 Mapping CRC_81_N_1_fished.fastq to NUC reference...
0:00:04.49 Mapping CRC_81_N_2_fished.fastq to NUC reference...
0:00:05.09 Generating binary hit matrix.
Traceback (most recent call last):
File "/Users/markg14/software/OptiType-master/OptiTypePipeline.py", line 298, in <module>
pos, read_details = ht.pysam_to_hdf(bam_paths[0])
File "/Users/markg14/software/OptiType-master/hlatyper.py", line 186, in pysam_to_hdf
sam = pysam.AlignmentFile(samfile, sam_or_bam)
File "pysam/libcalignmentfile.pyx", line 351, in pysam.libcalignmentfile.AlignmentFile.__cinit__ (pysam/libcalignmentfile.c:5200)
File "pysam/libcalignmentfile.pyx", line 544, in pysam.libcalignmentfile.AlignmentFile._open (pysam/libcalignmentfile.c:7366)
IOError: file ./test/rna/2017_03_02_15_59_39/2017_03_02_15_59_39_1.bam not found
Any help would be greatly appreciated.
Thank you
Mark
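In the log above, both mapping steps finish within seconds and the BAM file is then missing. One plausible cause, which is an assumption and not confirmed by the log, is that the razers3 binary named in the config could not be run. The sketch below reads the `RAZERS3` entry from the `[MAPPING]` section (both names appear in the configs posted in this thread) and checks that it resolves to an executable; `check_razers3` is a hypothetical helper:

```python
import configparser
import shutil


def check_razers3(config_text):
    """Return (path, problem) for the RAZERS3 entry of an OptiType-style config.

    `problem` is None when the entry resolves to an executable file.
    """
    # interpolation=None so that a '%' in a path is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_option("MAPPING", "RAZERS3"):
        return None, "no RAZERS3 entry in [MAPPING]"
    path = parser.get("MAPPING", "RAZERS3")
    resolved = shutil.which(path)  # accepts a bare name or a full path
    if resolved is None:
        return path, "not found or not executable"
    return resolved, None


example_config = "[MAPPING]\nRAZERS3=/definitely/not/here/razers3\nTHREADS=4\n"
print(check_razers3(example_config))
```

For a real check, read the text of your own config file and pass it in. If the binary resolves correctly, the next step would be running the mapping command by hand to see its error output.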
HDF5 resources are unavailable