gagneurlab / concise
Concise: Keras extension for regulatory genomics
Home Page: https://i12g-gagneurweb.in.tum.de/public/docs/concise/
License: MIT License
Hi,
I am using the metrics (fpr, fnr, tpr, tnr) for my model in Keras (Tensorflow.python keras v2.1.2), as you can see:
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['binary_accuracy', tpr, tnr, fpr, fnr])
But when I run this with history = model.fit_generator(..), I get the accuracy for tpr and fpr, and 1-acc for tnr and fnr. For debugging I also returned the outputs of contingency_table, and tp = tn = all true predictions and fp = fn = all false predictions.
Do you have any suggestions how to fix this issue and get the right values? Thanks in advance!
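One possible explanation, offered as an assumption rather than a confirmed diagnosis: with categorical_crossentropy the targets are one-hot, and if the counts are taken over both one-hot columns together, every correctly classified sample contributes one true positive and one true negative, while every misclassified sample contributes one false positive and one false negative. The pure-Python sketch below reproduces that symmetry on made-up data; count_outcomes, one_hot and flatten are hypothetical helpers, not functions from concise or Keras.

```python
def count_outcomes(y_true, y_pred):
    """Count (tp, tn, fp, fn) over two flat sequences of 0/1 values."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def one_hot(classes):
    """Encode binary class labels as [is_class_0, is_class_1] rows."""
    return [[1 - c, c] for c in classes]


def flatten(rows):
    """Concatenate all rows into a single flat list."""
    return [value for row in rows for value in row]


# Made-up example: 5 samples, 3 classified correctly, 2 wrongly.
labels = [1, 0, 1, 1, 0]
preds = [1, 0, 0, 1, 1]

# Counting over BOTH one-hot columns: tp == tn and fp == fn.
flat_counts = count_outcomes(flatten(one_hot(labels)), flatten(one_hot(preds)))
print(flat_counts)  # (3, 3, 2, 2)

# Counting over the positive-class column only gives the usual table.
pos_counts = count_outcomes(labels, preds)
print(pos_counts)  # (2, 1, 1, 1)
```

If this is indeed what happens in the model above, evaluating the metrics on the positive-class column only, or using a single sigmoid output with binary_crossentropy, should break the symmetry.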
Hello, while trying to reproduce the other paper I encountered exceptions from within this package, so I wanted to check that everything was working correctly and ran pytest. After installing the required packages, I got the following errors:
MBP-di-Luca:concise-master lucacappelletti$ pytest
========================================================= test session starts =========================================================
platform darwin -- Python 3.6.6, pytest-3.3.2, py-1.5.2, pluggy-0.6.0
rootdir: /Users/lucacappelletti/Downloads/concise-master, inifile:
plugins: pep8-1.0.6, cov-2.5.1
collected 81 items
tests/test_GAMSmooth.py .. [ 2%]
tests/test_concise_keras.py ....... [ 11%]
tests/test_conv1d.py .. [ 13%]
tests/test_effects.py ....... [ 22%]
tests/test_eval.py . [ 23%]
tests/test_initializers.py ...... [ 30%]
tests/test_layers.py .....F... [ 41%]
tests/test_losses.py .. [ 44%]
tests/test_metrics.py ... [ 48%]
tests/data/test_attract.py . [ 49%]
tests/data/test_encode.py . [ 50%]
tests/data/test_hocomoco.py . [ 51%]
tests/layers/test_SplineT.py ............. [ 67%]
tests/preprocessing/test_sequence.py ..... [ 74%]
tests/preprocessing/test_splines.py ....... [ 82%]
tests/preprocessing/test_structure.py FFF [ 86%]
tests/utils/test_model_data.py ... [ 90%]
tests/utils/test_position.py .. [ 92%]
tests/utils/test_utils_pwm.py ...... [100%]
============================================================== FAILURES ===============================================================
_____________________________________ test_all_layers[seq3-encodeSEQ3-InputSEQ3-ConvRNAStructure] _____________________________________
seq = ['ACTTGAATA'], encodeSEQ = <function encodeRNAStructure at 0x1a1bb1d268>, InputSEQ = <function InputRNAStructure at 0x1a1c665378>
ConvSEQ = <class 'concise.layers.ConvRNAStructure'>
tmpdir = local('/private/var/folders/97/ljz_bpdd5qq6m3kxf6vdn0640000gn/T/pytest-of-lucacappelletti/pytest-0/test_all_layers_seq3_encodeSEQ0')
@pytest.mark.parametrize("seq, encodeSEQ, InputSEQ, ConvSEQ", [
(["ACTTGAATA"], encodeDNA, cl.InputDNA, cl.ConvDNA),
(["ACUUGAAUA"], encodeRNA, cl.InputRNA, cl.ConvRNA),
(["ACTTGAATA"], encodeCodon, cl.InputCodon, cl.ConvCodon),
(["ACTTGAATA"], encodeRNAStructure, cl.InputRNAStructure, cl.ConvRNAStructure),
(["ARNBCEQ"], encodeAA, cl.InputAA, cl.ConvAA),
(np.array([[1, 2, 3, 4, 5]]), encodeSplines, cl.InputSplines, cl.ConvSplines),
])
def test_all_layers(seq, encodeSEQ, InputSEQ, ConvSEQ, tmpdir):
seq_length = len(seq[0])
# pre-process
> train_x = encodeSEQ(seq)
tests/test_layers.py:70:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
concise/preprocessing/structure.py:134: in encodeRNAStructure
run_RNAplfold(fasta_path, tmpdir, W=W, L=L, U=U)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
input_fasta = '/tmp/RNAplfold//5158acaf-cb4f-49b2-a590-8771eae37917//input.fasta'
tmpdir = '/tmp/RNAplfold//5158acaf-cb4f-49b2-a590-8771eae37917/', W = 240, L = 160, U = 1
def run_RNAplfold(input_fasta, tmpdir, W=240, L=160, U=1):
"""
Arguments:
W, Int: span - window length
L, Int, maxiumm span
U, Int, size of unpaired region
"""
profiles = RNAplfold_PROFILES_EXECUTE
for i, P in enumerate(profiles):
print("running {P}_RNAplfold... ({i}/{N})".format(P=P, i=i + 1, N=len(profiles)))
command = "{bin}/{P}_RNAplfold".format(bin=RNAplfold_BIN_DIR, P=P)
file_out = "{tmp}/{P}_profile.fa".format(tmp=tmpdir, P=P)
args = " -W {W} -L {L} -u {U} < {fa} > {file_out}".format(W=W, L=L, U=U, fa=input_fasta, file_out=file_out)
os.system(command + args)
# check if the file is empty
if os.path.getsize(file_out) == 0:
> raise Exception("command wrote an empty file: {0}".format(file_out))
E Exception: command wrote an empty file: /tmp/RNAplfold//5158acaf-cb4f-49b2-a590-8771eae37917//H_profile.fa
concise/preprocessing/structure.py:38: Exception
-------------------------------------------------------- Captured stdout call ---------------------------------------------------------
running H_RNAplfold... (1/4)
-------------------------------------------------------- Captured stderr call ---------------------------------------------------------
sh: /Users/lucacappelletti/Downloads/concise-master/concise/resources/RNAplfold/H_RNAplfold: cannot execute binary file
_______________________________________________________ test_encodeRNAstructure _______________________________________________________
def test_encodeRNAstructure():
with cd("/tmp/"):
# what we want: seqs, values, chanells?
seq = ["TATTATGTATATGTATA", "TATGTATAT"]
> arr = encodeRNAStructure(seq)
tests/preprocessing/test_structure.py:38:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
concise/preprocessing/structure.py:134: in encodeRNAStructure
run_RNAplfold(fasta_path, tmpdir, W=W, L=L, U=U)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
input_fasta = '/tmp/RNAplfold//6e620cf5-fe37-46f7-9959-11407f171fdf//input.fasta'
tmpdir = '/tmp/RNAplfold//6e620cf5-fe37-46f7-9959-11407f171fdf/', W = 240, L = 160, U = 1
def run_RNAplfold(input_fasta, tmpdir, W=240, L=160, U=1):
"""
Arguments:
W, Int: span - window length
L, Int, maxiumm span
U, Int, size of unpaired region
"""
profiles = RNAplfold_PROFILES_EXECUTE
for i, P in enumerate(profiles):
print("running {P}_RNAplfold... ({i}/{N})".format(P=P, i=i + 1, N=len(profiles)))
command = "{bin}/{P}_RNAplfold".format(bin=RNAplfold_BIN_DIR, P=P)
file_out = "{tmp}/{P}_profile.fa".format(tmp=tmpdir, P=P)
args = " -W {W} -L {L} -u {U} < {fa} > {file_out}".format(W=W, L=L, U=U, fa=input_fasta, file_out=file_out)
os.system(command + args)
# check if the file is empty
if os.path.getsize(file_out) == 0:
> raise Exception("command wrote an empty file: {0}".format(file_out))
E Exception: command wrote an empty file: /tmp/RNAplfold//6e620cf5-fe37-46f7-9959-11407f171fdf//H_profile.fa
concise/preprocessing/structure.py:38: Exception
-------------------------------------------------------- Captured stdout call ---------------------------------------------------------
running H_RNAplfold... (1/4)
-------------------------------------------------------- Captured stderr call ---------------------------------------------------------
sh: /Users/lucacappelletti/Downloads/concise-master/concise/resources/RNAplfold/H_RNAplfold: cannot execute binary file
_________________________________________________________ test_other_objects __________________________________________________________
def test_other_objects():
seq = np.array(["TATTATGTATATGTATA", "TATGTATAT"])
> arr = encodeRNAStructure(seq)
tests/preprocessing/test_structure.py:46:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
concise/preprocessing/structure.py:134: in encodeRNAStructure
run_RNAplfold(fasta_path, tmpdir, W=W, L=L, U=U)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
input_fasta = '/tmp/RNAplfold//2035953c-fd3f-4c12-b76a-0fe72b0658a5//input.fasta'
tmpdir = '/tmp/RNAplfold//2035953c-fd3f-4c12-b76a-0fe72b0658a5/', W = 240, L = 160, U = 1
def run_RNAplfold(input_fasta, tmpdir, W=240, L=160, U=1):
"""
Arguments:
W, Int: span - window length
L, Int, maxiumm span
U, Int, size of unpaired region
"""
profiles = RNAplfold_PROFILES_EXECUTE
for i, P in enumerate(profiles):
print("running {P}_RNAplfold... ({i}/{N})".format(P=P, i=i + 1, N=len(profiles)))
command = "{bin}/{P}_RNAplfold".format(bin=RNAplfold_BIN_DIR, P=P)
file_out = "{tmp}/{P}_profile.fa".format(tmp=tmpdir, P=P)
args = " -W {W} -L {L} -u {U} < {fa} > {file_out}".format(W=W, L=L, U=U, fa=input_fasta, file_out=file_out)
os.system(command + args)
# check if the file is empty
if os.path.getsize(file_out) == 0:
> raise Exception("command wrote an empty file: {0}".format(file_out))
E Exception: command wrote an empty file: /tmp/RNAplfold//2035953c-fd3f-4c12-b76a-0fe72b0658a5//H_profile.fa
concise/preprocessing/structure.py:38: Exception
-------------------------------------------------------- Captured stdout call ---------------------------------------------------------
running H_RNAplfold... (1/4)
-------------------------------------------------------- Captured stderr call ---------------------------------------------------------
sh: /Users/lucacappelletti/Downloads/concise-master/concise/resources/RNAplfold/H_RNAplfold: cannot execute binary file
___________________________________________________________ test_real_data ____________________________________________________________
def test_real_data():
csv_file_path = "data/pombe_half-life_UTR3.csv"
dt = pd.read_csv(csv_file_path)
seq_vec = dt["seq"][:6]
> a = encodeRNAStructure(seq_vec)
tests/preprocessing/test_structure.py:55:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
concise/preprocessing/structure.py:134: in encodeRNAStructure
run_RNAplfold(fasta_path, tmpdir, W=W, L=L, U=U)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
input_fasta = '/tmp/RNAplfold//4c0c20c2-ca9f-45a0-b2e7-a5933a2528b1//input.fasta'
tmpdir = '/tmp/RNAplfold//4c0c20c2-ca9f-45a0-b2e7-a5933a2528b1/', W = 240, L = 160, U = 1
def run_RNAplfold(input_fasta, tmpdir, W=240, L=160, U=1):
"""
Arguments:
W, Int: span - window length
L, Int, maxiumm span
U, Int, size of unpaired region
"""
profiles = RNAplfold_PROFILES_EXECUTE
for i, P in enumerate(profiles):
print("running {P}_RNAplfold... ({i}/{N})".format(P=P, i=i + 1, N=len(profiles)))
command = "{bin}/{P}_RNAplfold".format(bin=RNAplfold_BIN_DIR, P=P)
file_out = "{tmp}/{P}_profile.fa".format(tmp=tmpdir, P=P)
args = " -W {W} -L {L} -u {U} < {fa} > {file_out}".format(W=W, L=L, U=U, fa=input_fasta, file_out=file_out)
os.system(command + args)
# check if the file is empty
if os.path.getsize(file_out) == 0:
> raise Exception("command wrote an empty file: {0}".format(file_out))
E Exception: command wrote an empty file: /tmp/RNAplfold//4c0c20c2-ca9f-45a0-b2e7-a5933a2528b1//H_profile.fa
concise/preprocessing/structure.py:38: Exception
-------------------------------------------------------- Captured stdout call ---------------------------------------------------------
running H_RNAplfold... (1/4)
-------------------------------------------------------- Captured stderr call ---------------------------------------------------------
sh: /Users/lucacappelletti/Downloads/concise-master/concise/resources/RNAplfold/H_RNAplfold: cannot execute binary file
========================================================== warnings summary ===========================================================
tests/test_effects.py::test_dropout
/anaconda3/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in greater
return (self.a < x) & (x < self.b)
/anaconda3/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in less
return (self.a < x) & (x < self.b)
/anaconda3/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:1821: RuntimeWarning: invalid value encountered in less_equal
cond2 = cond0 & (x <= self.a)
/Users/lucacappelletti/Downloads/concise-master/concise/effects/dropout.py:250: RuntimeWarning: invalid value encountered in greater
sel = (np.abs(prob) > np.abs(prob_rc)).astype(np.int) # Select the LOWER p-value among fwd and rc
/anaconda3/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in greater
return (self.a < x) & (x < self.b)
/anaconda3/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in less
return (self.a < x) & (x < self.b)
/anaconda3/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:1821: RuntimeWarning: invalid value encountered in less_equal
cond2 = cond0 & (x <= self.a)
/Users/lucacappelletti/Downloads/concise-master/concise/effects/dropout.py:250: RuntimeWarning: invalid value encountered in greater
sel = (np.abs(prob) > np.abs(prob_rc)).astype(np.int) # Select the LOWER p-value among fwd and rc
tests/test_effects.py::test_ism
/Users/lucacappelletti/Downloads/concise-master/concise/effects/ism.py:69: UserWarning: Using log_odds on model outputs that are not bound [0,1]
warnings.warn("Using log_odds on model outputs that are not bound [0,1]")
/Users/lucacappelletti/Downloads/concise-master/concise/effects/ism.py:70: RuntimeWarning: invalid value encountered in log
diffs = np.log(preds["alt"] / (1 - preds["alt"])) - np.log(preds["ref"] / (1 - preds["ref"]))
/Users/lucacappelletti/Downloads/concise-master/concise/effects/ism.py:71: RuntimeWarning: invalid value encountered in log
diffs_rc = np.log(preds["alt_rc"] / (1 - preds["alt_rc"])) - np.log(preds["ref_rc"] / (1 - preds["ref_rc"]))
/Users/lucacappelletti/Downloads/concise-master/concise/effects/ism.py:79: RuntimeWarning: invalid value encountered in less
replace_filt = np.abs(diffs) < np.abs(diffs_rc)
tests/test_effects.py::test_effect_from_model
/anaconda3/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in greater
return (self.a < x) & (x < self.b)
/anaconda3/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in less
return (self.a < x) & (x < self.b)
/anaconda3/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:1821: RuntimeWarning: invalid value encountered in less_equal
cond2 = cond0 & (x <= self.a)
/Users/lucacappelletti/Downloads/concise-master/concise/effects/dropout.py:250: RuntimeWarning: invalid value encountered in greater
sel = (np.abs(prob) > np.abs(prob_rc)).astype(np.int) # Select the LOWER p-value among fwd and rc
/Users/lucacappelletti/Downloads/concise-master/concise/effects/ism.py:69: UserWarning: Using log_odds on model outputs that are not bound [0,1]
warnings.warn("Using log_odds on model outputs that are not bound [0,1]")
/Users/lucacappelletti/Downloads/concise-master/concise/effects/ism.py:70: RuntimeWarning: invalid value encountered in log
diffs = np.log(preds["alt"] / (1 - preds["alt"])) - np.log(preds["ref"] / (1 - preds["ref"]))
/Users/lucacappelletti/Downloads/concise-master/concise/effects/ism.py:71: RuntimeWarning: invalid value encountered in log
diffs_rc = np.log(preds["alt_rc"] / (1 - preds["alt_rc"])) - np.log(preds["ref_rc"] / (1 - preds["ref_rc"]))
/Users/lucacappelletti/Downloads/concise-master/concise/effects/ism.py:79: RuntimeWarning: invalid value encountered in less
replace_filt = np.abs(diffs) < np.abs(diffs_rc)
-- Docs: http://doc.pytest.org/en/latest/warnings.html
========================================= 4 failed, 77 passed, 20 warnings in 536.29 seconds ==========================================
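In all four failures the captured stderr says "cannot execute binary file", but the exception that surfaces only reports an empty output file, because os.system does not raise when the command fails. A plausible reading (an assumption, not confirmed in this thread) is that the bundled RNAplfold binaries were built for a different platform than macOS. The sketch below shows one way such a call could surface the real error; run_to_file is a hypothetical helper and not part of concise, and the demo uses the Python interpreter as a stand-in for the RNAplfold binary.

```python
import os
import subprocess
import sys
import tempfile


def run_to_file(cmd, stdin_path, out_path):
    """Run cmd with stdin/stdout redirected to files and fail loudly."""
    with open(stdin_path, "rb") as fin, open(out_path, "wb") as fout:
        try:
            proc = subprocess.run(cmd, stdin=fin, stdout=fout,
                                  stderr=subprocess.PIPE)
        except OSError as exc:
            # Raised when the file is missing or is not a valid executable
            # for this platform (e.g. "Exec format error").
            raise RuntimeError("could not execute {0}: {1}".format(cmd[0], exc))
    if proc.returncode != 0:
        raise RuntimeError("{0} exited with status {1}: {2}".format(
            cmd[0], proc.returncode, proc.stderr.decode(errors="replace")))
    if os.path.getsize(out_path) == 0:
        raise RuntimeError("command wrote an empty file: {0}".format(out_path))


# Demo with a stand-in command that copies stdin to stdout.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "input.fasta")
    dst = os.path.join(tmp, "out.fa")
    with open(src, "w") as f:
        f.write(">seq1\nACTTGAATA\n")
    echo = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read())"]
    run_to_file(echo, src, dst)
    with open(dst) as f:
        copied = f.read()

    # A binary that cannot be started now produces a descriptive error.
    try:
        run_to_file([os.path.join(tmp, "missing_binary")], src, dst)
        error_message = None
    except RuntimeError as exc:
        error_message = str(exc)
```

Passing the command as a list avoids the shell, so a failure to start the program is reported as an exception at the call site instead of only appearing on stderr.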
Should be done soon.
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
20 train_exclude_features
21
[Sat Dec 1 13:52:44 2018]
rule train_exclude_features:
input: data/eclip/processed/design_matrix/train/TBRG4_extended.csv, data/eclip/processed/design_matrix/valid/TBRG4_extended.csv, data/eclip/processed/design_matrix/test/TBRG4_extended.csv, Scripts/RBP/Eclip/predictive_models/train_exclude_features.py
output: data/eclip/processed/feature_exclusion_exp/results/TBRG4/DeepNN_scalar_position_ext_gam-excl-polya,gene_end.json
jobid: 26
wildcards: rbp_name=TBRG4, exp=DeepNN_scalar_position_ext_gam, fset=polya,gene_end
Using TensorFlow backend.
INFO:2018-12-01 13:52:57,891:excl_f] used_features: ['tss', 'exon_intron', 'intron_exon', 'start_codon', 'stop_codon', 'gene_start']
2018-12-01 13:52:57,891 [INFO] used_features: ['tss', 'exon_intron', 'intron_exon', 'start_codon', 'stop_codon', 'gene_start']
INFO:2018-12-01 13:52:57,891:excl_f] get the best hyper-parameters for a model
2018-12-01 13:52:57,891 [INFO] get the best hyper-parameters for a model
INFO:2018-12-01 13:52:57,892:excl_f] c_exp_name: DeepNN_scalar_position_ext_gam_TBRG4
2018-12-01 13:52:57,892 [INFO] c_exp_name: DeepNN_scalar_position_ext_gam_TBRG4
2018-12-01 13:52:57,892 [INFO] PROTOCOL mongo
2018-12-01 13:52:57,892 [INFO] USERNAME None
2018-12-01 13:52:57,892 [INFO] HOSTNAME localhost
2018-12-01 13:52:57,892 [INFO] PORT 27017
2018-12-01 13:52:57,892 [INFO] PATH /RBP__Eclip/jobs
2018-12-01 13:52:57,892 [INFO] AUTH DB None
2018-12-01 13:52:57,892 [INFO] DB RBP__Eclip
2018-12-01 13:52:57,892 [INFO] COLLECTION jobs
Traceback (most recent call last):
File "Scripts/RBP/Eclip/predictive_models/train_exclude_features.py", line 75, in <module>
tid = tr.best_trial_tid()
File "/home/cappelletti/code/.virtualenvs/virtual-py36gpu/lib/python3.6/site-packages/concise/hyopt.py", line 137, in best_trial_tid
lid = np.where(np.argsort(losses).argsort() == rank)[0][0]
IndexError: index 0 is out of bounds for axis 0 with size 0
[Sat Dec 1 13:52:58 2018]
Error in rule train_exclude_features:
jobid: 26
output: data/eclip/processed/feature_exclusion_exp/results/TBRG4/DeepNN_scalar_position_ext_gam-excl-polya,gene_end.json
RuleException:
CalledProcessError in line 123 of /data/Avsec/automated/Manuscript_Avsec_Bioinformatics_2017/Scripts/RBP/Eclip/Snakefile:
Command ' set -euo pipefail; python Scripts/RBP/Eclip/predictive_models/train_exclude_features.py --rbp=TBRG4 --feature_set=polya,gene_end --exp=DeepNN_scalar_position_ext_gam ' returned non-zero exit status 1.
File "/data/Avsec/automated/Manuscript_Avsec_Bioinformatics_2017/Scripts/RBP/Eclip/Snakefile", line 123, in __rule_train_exclude_features
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /data/Avsec/automated/Manuscript_Avsec_Bioinformatics_2017/Scripts/RBP/Eclip/.snakemake/log/2018-12-01T135243.477113.snakemake.log
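The IndexError comes from indexing the result of np.where when nothing matches the requested rank, which is what happens if the list of losses is empty. That would be consistent with the MongoDB collection holding no finished trials for this experiment, although the log does not confirm it. The pure-Python sketch below illustrates a guarded version of the same rank lookup; best_trial_index is a hypothetical stand-in and not the concise implementation.

```python
def best_trial_index(losses, rank=0):
    """Return the index of the trial with the rank-th smallest loss.

    rank=0 selects the best (lowest-loss) trial.
    """
    if not losses:
        raise ValueError(
            "no finished trials found; run the hyper-parameter search first")
    if not 0 <= rank < len(losses):
        raise ValueError("rank {0} is out of range for {1} trials".format(
            rank, len(losses)))
    # Indices of the trials ordered by increasing loss.
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return order[rank]


print(best_trial_index([0.5, 0.2, 0.9]))          # 1
print(best_trial_index([0.5, 0.2, 0.9], rank=2))  # 2
```

With an empty list the function raises a ValueError that names the likely problem, rather than an IndexError from deep inside the ranking expression.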