
princeton-vl / coqgym

369 stars · 50 forks · 32.78 MB

A Learning Environment for Theorem Proving with the Coq proof assistant

Home Page: https://arxiv.org/abs/1905.09381

License: GNU Lesser General Public License v2.1

Languages: Python 0.31%, Makefile 0.22%, Shell 0.15%, OCaml 8.35%, C 2.95%, Coq 52.18%, Standard ML 0.03%, C++ 0.08%, JavaScript 0.85%, Emacs Lisp 0.01%, Nix 0.01%, Batchfile 0.02%, NSIS 0.02%, Dockerfile 0.01%, TeX 1.06%, sed 0.01%, CSS 0.07%, HTML 33.67%, GAP 0.01%, Scheme 0.01%

Topics: icml-2019, machine-learning, theorem-proving

coqgym's People

Contributors

brando90, desperatesonic, yangky11


coqgym's Issues

Error when installing SerAPI

Hi, I followed your instructions to install CoqGym from scratch, but I encountered a problem when installing SerAPI. When I run sudo ./install.sh in CoqGym, the installation always fails with the following errors:

File "serapi/serapi_protocol.mli", line 135, characters 23-32:
Error (warning 3): deprecated: Sexplib.Conv.sexp_list
[since 2019-03] use [@sexp.list] instead
File "serapi/serapi_protocol.mli", line 136, characters 16-27:
Error (warning 3): deprecated: Sexplib.Conv.sexp_option
[since 2019-03] use [@sexp.option] instead
File "serapi/serapi_protocol.mli", line 180, characters 23-34:
Error (warning 3): deprecated: Sexplib.Conv.sexp_option
[since 2019-03] use [@sexp.option] instead
File "serapi/serapi_protocol.mli", line 183, characters 21-32:
Error (warning 3): deprecated: Sexplib.Conv.sexp_option
[since 2019-03] use [@sexp.option] instead
File "serapi/serapi_protocol.mli", line 184, characters 21-32:
Error (warning 3): deprecated: Sexplib.Conv.sexp_option
[since 2019-03] use [@sexp.option] instead
File "serapi/serapi_protocol.mli", line 185, characters 21-32:
Error (warning 3): deprecated: Sexplib.Conv.sexp_option
[since 2019-03] use [@sexp.option] instead
File "serapi/serapi_protocol.mli", line 199, characters 37-48:
Error (warning 3): deprecated: Sexplib.Conv.sexp_option
[since 2019-03] use [@sexp.option] instead
File "serapi/serapi_protocol.mli", line 202, characters 61-72:
Error (warning 3): deprecated: Sexplib.Conv.sexp_option
[since 2019-03] use [@sexp.option] instead
Makefile:19: recipe for target 'build' failed
make: *** [build] Error 1
Error: The following .install are missing:

  • _build/default/coq-serapi.install

From https://gitter.im/coq-serapi/Lobby?at=5e0cbdddeaa2cd096fd8bcfb, it seems that the latest versions of sexplib and ppx_import don't support the version of SerAPI used in your project. Do you have any idea what causes this problem and how to fix it, if possible? Thank you for your help.

What are the train, val, and test splits?

Do you have a description of which Coq projects are used for training, validation, and testing without having to build the datasets? Or can we only see this via extract_proof_steps.py?

ASTactic/extract_proof_steps.py process killed after 17%

I have followed the README instructions for setting up Coq and CoqGym (with some difficulty!) and am now trying to reproduce the decoder model results. As a first step I run python ASTactic/extract_proof_steps.py; it progresses to 17% without issue, but then the process is killed with no logs or error message.

I am running Ubuntu 20.04 and built CoqGym from source (not from the container).

Do you know what is happening here?

Error when installing

I tried to install this project using the instructions in the readme. This is the terminal output I got:

21:56:27 Project #=# mkdir coqgym
21:56:45 Project #=# cd coqgym
21:56:50 coqgym #=# opam switch create 4.07.1+flambda && eval $(opam env)

<><> Gathering sources ><><><><><><><><><><><><><><><><><><><><><><><><><><>  🐫 
[ocaml-variants.4.07.1+flambda] downloaded from cache at https://opam.ocaml.org/cache

<><> Processing actions <><><><><><><><><><><><><><><><><><><><><><><><><><>  🐫 
∗ installed base-bigarray.base
∗ installed base-threads.base
∗ installed base-unix.base
∗ installed ocaml-variants.4.07.1+flambda
∗ installed ocaml-config.1
∗ installed ocaml.4.07.1
Done.
# Run eval $(opam env) to update the current shell environment
21:59:48 coqgym #=# opam upgrade && eval $(opam env)
Everything as up-to-date as possible (run with --verbose to show unavailable upgrades).
However, you may "opam upgrade" these packages explicitly, which will ask permission to downgrade or uninstall the conflicting packages.
Nothing to do.
22:00:06 coqgym #=# git clone https://github.com/princeton-vl/CoqGym
Cloning into 'CoqGym'...
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 11740 (delta 0), reused 0 (delta 0), pack-reused 11736
Receiving objects: 100% (11740/11740), 32.70 MiB | 11.09 MiB/s, done.
Resolving deltas: 100% (1683/1683), done.
Updating files: 100% (11496/11496), done.
22:00:28 coqgym #=# cd CoqGym && source install.sh
Installing Dependencies..
opam: PACKAGES... arguments: Invalid character in package name "dune=1.10.0
      cmdliner=1.0.4 ppx_sexp_conv=v0.12.0 ppx_deriving=4.3 sexplib=v0.12.0
      ppx_import=1.6.2 camlp5=7.08 coq=8.9.1"
Usage: opam install [OPTION]... [PACKAGES]...
Try `opam install --help' or `opam --help' for more information.
Dependencies installed
Installing Coq..
Error: cannot find 'ocamlfind' in your path!
Please adjust your path or use the -ocamlfind option of ./configure
Configuration script failed!
Coq installed
Installing SerAPI..
dune build
make: dune: No such file or directory
make: *** [build] Error 1
install.sh:22: command not found: dune
SerAPI installed
Installing CoqHammer..
coq_makefile -f _CoqProject -o Makefile.coq
/Library/Developer/CommandLineTools/usr/bin/make -f Makefile.coq
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqtop: No such file or directory
CAMLDEP src/provers.mli
CAMLDEP src/features.mli
CAMLDEP src/coq_transl.mli
CAMLDEP src/tptp_out.mli
CAMLDEP src/coq_convert.mli
CAMLDEP src/hashing.mli
CAMLDEP src/coq_typing.mli
CAMLDEP src/defhash.mli
COQDEP src/hammer_plugin.mllib
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqdep: No such file or directory
CAMLDEP src/partac.ml
CAMLDEP src/provers.ml
File "src/provers.ml", line 193, characters 143-145:
Warning 14: illegal backslash escape in string.
CAMLDEP src/features.ml
CAMLDEP src/parallel.ml
CAMLDEP src/opt.ml
CAMLDEP src/coq_transl.ml
CAMLDEP src/tptp_out.ml
CAMLDEP src/coq_convert.ml
CAMLDEP src/hashing.ml
CAMLDEP src/coq_typing.ml
CAMLDEP src/defhash.ml
CAMLDEP src/coqterms.ml
CAMLDEP src/coq_transl_opts.ml
CAMLDEP src/timeout.ml
CAMLDEP src/msg.ml
CAMLDEP src/hh_term.ml
CAMLDEP src/hhlib.ml
CAMLDEP src/hammer_errors.ml
make[1]: printconf: Command not found
CAMLDEP -pp src/hammer.ml4
sh: -I: command not found
File "src/hammer.ml4", line 1:
Error: Error while running external preprocessor
Command line:  -I  -I /grammar compat5.cmo  grammar.cma  -impl 'src/hammer.ml4' > /var/folders/bk/jzgxbvbj67x8kjdptbj814y80000gn/T/ocamlppbeacc8

COQDEP theories/Hammer.v
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqdep: No such file or directory
COQDEP theories/Reconstr.v
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqdep: No such file or directory
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqtop: No such file or directory
COQDEP src/hammer_plugin.mllib
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqdep: No such file or directory
make[1]: printconf: Command not found
CAMLDEP -pp src/hammer.ml4
sh: -I: command not found
File "src/hammer.ml4", line 1:
Error: Error while running external preprocessor
Command line:  -I  -I /grammar compat5.cmo  grammar.cma  -impl 'src/hammer.ml4' > /var/folders/bk/jzgxbvbj67x8kjdptbj814y80000gn/T/ocamlppccf733

COQDEP theories/Hammer.v
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqdep: No such file or directory
COQDEP theories/Reconstr.v
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqdep: No such file or directory
COQC theories/Reconstr.v
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqc: No such file or directory
make[1]: *** [theories/Reconstr.vo] Error 127
make: *** [all] Error 2
/Library/Developer/CommandLineTools/usr/bin/make -f Makefile.coq install
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqtop: No such file or directory
COQDEP src/hammer_plugin.mllib
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqdep: No such file or directory
make[1]: printconf: Command not found
CAMLDEP -pp src/hammer.ml4
sh: -I: command not found
File "src/hammer.ml4", line 1:
Error: Error while running external preprocessor
Command line:  -I  -I /grammar compat5.cmo  grammar.cma  -impl 'src/hammer.ml4' > /var/folders/bk/jzgxbvbj67x8kjdptbj814y80000gn/T/ocamlpp430043

COQDEP theories/Hammer.v
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqdep: No such file or directory
COQDEP theories/Reconstr.v
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqdep: No such file or directory
make[1]: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqtop: Command not found
make[1]: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqtop: Command not found
make[1]: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqtop: Command not found
make[1]: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqtop: Command not found
cd "src" && for i in hammer_plugin.cma coq_convert.cmi coq_transl.cmi coq_transl_opts.cmi coq_typing.cmi coqterms.cmi defhash.cmi features.cmi hammer.cmi hammer_errors.cmi hashing.cmi hh_term.cmi hhlib.cmi msg.cmi opt.cmi parallel.cmi partac.cmi provers.cmi timeout.cmi tptp_out.cmi hammer.cmo hammer_errors.cmo hhlib.cmo hh_term.cmo msg.cmo timeout.cmo coq_transl_opts.cmo coqterms.cmo defhash.cmo coq_typing.cmo hashing.cmo coq_convert.cmo tptp_out.cmo coq_transl.cmo opt.cmo parallel.cmo features.cmo provers.cmo partac.cmo; do \
	 install -d "`dirname """user-contrib"/Hammer/$i`"; \
	 install -m 0644 $i """user-contrib"/Hammer/$i; \
	done
install: hammer_plugin.cma: No such file or directory
install: coq_convert.cmi: No such file or directory
install: coq_transl.cmi: No such file or directory
install: coq_transl_opts.cmi: No such file or directory
install: coq_typing.cmi: No such file or directory
install: coqterms.cmi: No such file or directory
install: defhash.cmi: No such file or directory
install: features.cmi: No such file or directory
install: hammer.cmi: No such file or directory
install: hammer_errors.cmi: No such file or directory
install: hashing.cmi: No such file or directory
install: hh_term.cmi: No such file or directory
install: hhlib.cmi: No such file or directory
install: msg.cmi: No such file or directory
install: opt.cmi: No such file or directory
install: parallel.cmi: No such file or directory
install: partac.cmi: No such file or directory
install: provers.cmi: No such file or directory
install: timeout.cmi: No such file or directory
install: tptp_out.cmi: No such file or directory
install: hammer.cmo: No such file or directory
install: hammer_errors.cmo: No such file or directory
install: hhlib.cmo: No such file or directory
install: hh_term.cmo: No such file or directory
install: msg.cmo: No such file or directory
install: timeout.cmo: No such file or directory
install: coq_transl_opts.cmo: No such file or directory
install: coqterms.cmo: No such file or directory
install: defhash.cmo: No such file or directory
install: coq_typing.cmo: No such file or directory
install: hashing.cmo: No such file or directory
install: coq_convert.cmo: No such file or directory
install: tptp_out.cmo: No such file or directory
install: coq_transl.cmo: No such file or directory
install: opt.cmo: No such file or directory
install: parallel.cmo: No such file or directory
install: features.cmo: No such file or directory
install: provers.cmo: No such file or directory
install: partac.cmo: No such file or directory
make[1]: *** [install] Error 71
make: *** [install] Error 2
CoqHammer installed
22:00:43 CoqGym #=# cd coq_projects && make && cd ..
/Library/Developer/CommandLineTools/usr/bin/make core-dependencies
/Library/Developer/CommandLineTools/usr/bin/make math-comp
cd math-comp/mathcomp/ && make && make install
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqtop: No such file or directory
/bin/sh: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coqtop: No such file or directory
/bin/sh: line 0: [: =: unary operator expected
/Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coq_makefile  -f Make -o Makefile.coq
make[3]: /Users/boltonbailey/Documents/MyDocuments/UIUC/Classes/FourthYear/Fall/CS598YW/Project/coqgym/CoqGym/coq/bin/coq_makefile: No such file or directory
make[3]: *** [Makefile.coq] Error 1
make[2]: *** [math-comp] Error 2
make[1]: *** [core-dependencies] Error 2
make: *** [all] Error 2
22:01:02 coq_projects #=# 

I am on macOS 10.15.6

Usage of pre-trained model

Hi,
I am looking for the most direct way to simply apply the pre-trained model to a new dataset that I am preparing. I believe the pre-trained model has to be loaded using torch.load_state_dict(), which means that I need to include the class definition. This has produced numerous issues getting the dependencies right so that the code runs without errors. (I am on a Windows machine, and the git clone failed because of some issue with a file named Aux.V).

All of that is to say, would it be possible to get any additional guidance on the simplest way to apply the pre-trained model to new data? Thank you for any advice!
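
For reference, here is a minimal sketch of the loading step being described, in plain PyTorch rather than CoqGym-specific API; whether the released checkpoint stores the weights directly or under a key such as 'state_dict' is an assumption to verify, and the function name is my own:

    import torch

    def load_pretrained(model, ckpt_path="model.pth"):
        """Load a released checkpoint into an already constructed model.
        `model` must be an instance of the same class the checkpoint was
        saved from, which is why the class definition is needed."""
        checkpoint = torch.load(ckpt_path, map_location="cpu")
        # Some checkpoints wrap the weights in a dict; fall back to using it directly.
        state_dict = checkpoint["state_dict"] if "state_dict" in checkpoint else checkpoint
        model.load_state_dict(state_dict)
        model.eval()  # inference mode for applying the model to new data
        return model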

A bug in the dataset of CoqGym

Hi,

I found a bug in the dataset that affects training. In agent.py, the qualid attribute of a const is the full name of the constant (e.g., Coq.Arith.PeanoNat.Nat.mul_wd); however, it should be the short_ident (PeanoNat.Nat.mul_wd), since the tactic arguments extracted from the Coq files use the short name, not the full name. As a result, there is no ground-truth label for tactic arguments during training.
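
For illustration, a minimal sketch (not the dataset's actual fix) of one way to reconcile the two forms: take the fully qualified qualid and match any of its dot-separated suffixes against the short names that appear as tactic arguments. The function names and the matching policy are assumptions for illustration only.

    def short_forms(qualid):
        """Yield progressively shorter suffixes of a fully qualified name,
        e.g. 'Coq.Arith.PeanoNat.Nat.mul_wd' -> 'Arith.PeanoNat.Nat.mul_wd'
        -> ... -> 'mul_wd'."""
        parts = qualid.split(".")
        for i in range(len(parts)):
            yield ".".join(parts[i:])

    def refers_to(tactic_arg, qualid):
        """True if a tactic argument such as 'PeanoNat.Nat.mul_wd' could
        refer to the constant whose full name is qualid."""
        return tactic_arg in set(short_forms(qualid))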

Also, may I ask whether the provided ASTactic checkpoint was trained on only the training set or on both the training and test sets, and whether it was trained for 4 epochs or 5 (the default is 4, but the paper states 5)?

How to Use Singularity Containers for Parallel Computing?

It seems that the majority of the time is spent in the authors' modified version of Coq, and that the Singularity containers are meant to spread some of that work across multiple instances to speed up training. I'm currently able to train ASTactic on AWS; however, it takes a few days.

Do you guys have any advice on how we can speed this up?

CC: @daniel-zeng, @zyx0wu

CoqExn Error when starting next proof

Hi Kaiyu, I reinstalled CoqGym on a new machine, but when I run it, a new problem occurs when starting some proofs:

Traceback (most recent call last):
  File "evaluate.py", line 89, in <module>
    results.extend(agent.evaluate(f, opts.proof))
  File "/home/lizhaoyu/CoqGym/ASTactic/agent.py", line 173, in evaluate
    for proof_env in file_env:  # start a proof
  File "/home/lizhaoyu/CoqGym/eval_env.py", line 187, in __next__
    self.serapi.execute(cmd)
  File "/home/lizhaoyu/CoqGym/serapi.py", line 343, in execute
    state_id, ast = self.send_add(cmd, return_ast)
  File "/home/lizhaoyu/CoqGym/serapi.py", line 151, in send_add
    responses, raw_responses = self.send('(Add () "%s")' % escape(cmd))
  File "/home/lizhaoyu/CoqGym/serapi.py", line 134, in send
    raise CoqExn(sexpdata.dumps(parsed_item[2][4]), sexpdata.dumps(parsed_item[2]))
serapi.CoqExn: (CErrors.UserError ("" "Compiled library Logic.lib.RelationPairs_ext (in file /home/lizhaoyu/CoqGym/coq_projects/UnifySL/lib/RelationPairs_ext.vo) makes inconsistent assumptions over library Coq.Structures.GenericMinMax"))

Specifically, maybe you can run python evaluate.py ours astactic_CoqGym --path runs/demo/checkpoints/model.pth --file ../data/UnifySL/GeneralLogic/KripkeModel.json to see this problem. Can you please help me figure this out?

ASTactic/extract_proof_steps.py not working with --filter flag

Follow-up from issue #80. Sorry to revisit, but using the --filter flag with extract_proof_steps.py does not produce and populate a proof_steps/ output directory.

Running in debug mode, I can see the issue has something to do with the utils.iter_proofs call on line 150. After it runs, our proof_steps object is still empty, so the output logic (lines 154-167) doesn't write anything.

Parsing error

What does this parsing error mean?

Traceback (most recent call last):
  File "/Users/brando/anaconda3/envs/automl-meta-learning/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3343, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-b0f6a80bf848>", line 59, in <module>
    test_seperate_term_from_hash_lasses_data()
  File "<ipython-input-3-b0f6a80bf848>", line 51, in test_seperate_term_from_hash_lasses_data
    coqgym_parser = gallina.GallinaTermParser()
  File "/Users/brando/CoqGym/gallina.py", line 34, in __init__
    self.parser = Lark(StringIO(self.grammar), start='constr__constr', parser='lalr')
  File "/Users/brando/anaconda3/envs/automl-meta-learning/lib/python3.8/site-packages/lark/lark.py", line 300, in __init__
    self.grammar = load_grammar(grammar, self.source_path, self.options.import_paths, self.options.keep_all_tokens)
  File "/Users/brando/anaconda3/envs/automl-meta-learning/lib/python3.8/site-packages/lark/load_grammar.py", line 1061, in load_grammar
    return GrammarLoader(global_keep_all_tokens).load_grammar(grammar, source, import_paths)
  File "/Users/brando/anaconda3/envs/automl-meta-learning/lib/python3.8/site-packages/lark/load_grammar.py", line 979, in load_grammar
    new_td, new_rd = import_from_grammar_into_namespace(g, '__'.join(dotted_path), aliases)
  File "/Users/brando/anaconda3/envs/automl-meta-learning/lib/python3.8/site-packages/lark/load_grammar.py", line 751, in import_from_grammar_into_namespace
    term_defs.append([get_namespace_name(symbol, None), imported_terms[symbol]])
KeyError: Token('TERMINAL', 'STRING_INNER')

License Compatibility Review Suggested for Dataset

Dataset on Zenodo: Record ID: 8101883 [1]
Dataset License: Creative Commons Attribution 2.0 Generic (CC BY 2.0)

Overview:
The dataset incorporates components under various licenses, including:

  • Commercial
  • GPL
  • LGPL
  • MIT
  • BSD
  • Apache-2.0

Given the diversity of licenses, especially those with specific conditions like GPL and LGPL, alongside more permissive licenses such as Apache-2.0, BSD, and MIT, there's a potential for license compatibility considerations.

Suggestion:
It may be beneficial to review the dataset to align with the diverse licensing requirements effectively. This could involve:

  • Assessing the possibility of re-licensing the dataset to harmonize with the varied licenses, particularly focusing on GPL and LGPL requirements.
  • Exploring permission from copyright holders for their code's inclusion under CC BY 2.0, where applicable.
  • Considering the adjustment of content to maintain the dataset's licensing as CC BY 2.0, if possible.

This approach aims to ensure the dataset's compliance with open-source licensing standards and respect for copyright holders' intentions.

[1] https://zenodo.org/records/8101883

Usage instructions?

I found the ICML 2019 paper and I wanted to try out the tool. There is extensive documentation about building and about the data format, but how do I actually use it to see when it is able to predict tactics in Coq proofs?

I managed to run eval_env.py and it runs on data/StructTact/Assoc.json and indeed it terminates without error as described, but it produces the result GIVEN_UP for all the lemmas. What else can I try?

line 6: unrecognized keyword ignored: db_pagesize

This might not be a big deal, but I wanted to let you know that the unzip Python script gives this warning:

tar -xvzf sexp_cache.tar.gz
x sexp_cache.lmdb
mdb_load -f sexp_cache.lmdb sexp_cache
mdb_load: line 6: unrecognized keyword ignored: db_pagesize

extract_steps doesn't work if you change location of the data

Hi Kaiyu,

I (eventually) saw:

proj = filename.split(os.path.sep)[2]

which doesn't allow users to keep the data at a path of their own choosing (e.g., some HPC services require us to put it in specific locations).

For my particular path this worked:

    proj = filename.split(os.path.sep)[6]

but we might need to think of a different way of processing the data.
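
For example, here is a hedged sketch of a root-relative alternative; the function name and the idea of passing the data root explicitly are my own suggestion, not something the script currently exposes:

    import os

    def project_of(filename, data_root):
        """Return the Coq project name for a data file by splitting the path
        relative to a user-supplied data root, instead of relying on a fixed
        component index such as [2] or [6]."""
        rel = os.path.relpath(os.path.abspath(filename), os.path.abspath(data_root))
        return rel.split(os.path.sep)[0]

    # e.g. project_of('/scratch/me/coqgym/data/StructTact/Assoc.json',
    #                 '/scratch/me/coqgym/data') == 'StructTact'   (hypothetical paths)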

(Also, that file has global variables whose values are set outside the functions that use them. That sort of thing makes it really hard to debug; perhaps using classes, or passing the values as explicit arguments, would be better? I'm a big fan of functional programming and of making arguments explicit, since that makes it easier for others to edit and build on your code. Anyway, just trying to help.)

Embeddings for tactics

In the function beam_search in Prover.py, it looks like embeddings for environment, context, and goal are created using TermEncoder.py. Then these lines are called:

environment = {'idents': [v['qualid'] for v in environment],
               'embeddings': environment_embeddings[0],
               'quantified_idents': [v['ast'].quantified_idents for v in environment]}
local_context = {'idents': [v['ident'] for v in local_context],
                 'embeddings': context_embeddings[0],
                 'quantified_idents': [v['ast'].quantified_idents for v in local_context]}
goal = {'embeddings': goal_embeddings, 'quantified_idents': goal.quantified_idents}

Finally, a tactic is outputted with

asts = self.tactic_decoder.beam_search(environment, local_context, goal)

I can't seem to find whether the tactic itself is ever embedded in that process. I have two questions:

  1. Is it possible to create embeddings for tactics using TermEncoder.py? (Could I take the output of the decoder and use TermEncoder.py to create an embedding for it?)
  2. What do those lines above in the function beam_search do?

Location of meta files for dataset

I was trying to generate the dataset using the relevant instructions. However, there were no .meta files in any of the directories, and I was also unable to generate them using the makefiles for each Coq project.
Where can I access the .meta files used to build the dataset?

EOF error while running eval_env.py

Hi, I have followed the setup instructions (install Coq, SerAPI, and CoqHammer; build the Coq projects; download the pre-extracted data and unzip it; set up the opam and conda environments; install the other packages). I am using an Ubuntu 18 system. However, when running eval_env.py, the terminal always reports an EOF error:

(coq_gym) robin@robin-Blade-Stealth:~/Workspace/DiffCoqGym$ python eval_env.py
/home/robin/Workspace/DiffCoqGym/data/additions/binary_strat.json
Traceback (most recent call last):
  File "eval_env.py", line 232, in <module>
    with FileEnv(f, max_num_tactics=100, timeout=600) as file_env:
  File "eval_env.py", line 133, in __init__
    self.serapi = self.initialize_serapi()
  File "eval_env.py", line 137, in initialize_serapi
    serapi = SerAPI(timeout=1200)
  File "/home/robin/Workspace/DiffCoqGym/serapi.py", line 73, in __init__
    self.proc.expect_exact('(Feedback((doc_id 0)(span_id 1)(route 0)(contents Processed)))\0')
  File "/home/robin/.conda/envs/coq_gym/lib/python3.7/site-packages/pexpect/spawnbase.py", line 421, in expect_exact
    return exp.expect_loop(timeout)
  File "/home/robin/.conda/envs/coq_gym/lib/python3.7/site-packages/pexpect/expect.py", line 179, in expect_loop
    return self.eof(e)
  File "/home/robin/.conda/envs/coq_gym/lib/python3.7/site-packages/pexpect/expect.py", line 122, in eof
    raise exc
pexpect.exceptions.EOF: End Of File (EOF).
<pexpect.popen_spawn.PopenSpawn object at 0x7f8897bd3240>
searcher: searcher_string:
    0: '(Feedback((doc_id 0)(span_id 1)(route 0)(contents Processed)))\x00'

I am guessing this might be an issue with the pexpect package or the Coq version, but I still get this message even after uninstalling my Coq 8.11.0, and I've never seen a similar error message online. Do you have any idea what accounts for the problem and how to fix it, if possible? Thank you for your help.

How is the mapping of individual words (or symbols) to vectors done?

I was wondering what vocabulary CoqGym uses. I read in the paper that it is mostly one-hot vectors that are fed in directly. Can I:

  1. get confirmation of whether this is true?
  2. get a link to the code that maps a raw (string) word/symbol to an embedding, i.e. F: Word -> Vector (even if it's one-hot; a generic sketch is given below)?
  3. see where the vocabulary is collected?

Thanks!
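
For reference, a generic illustration of such a mapping F: Word -> Vector. This is not CoqGym's code: the vocabulary entries below are placeholders, and how and where the real vocabulary is collected is exactly the question being asked.

    import numpy as np

    vocab = {"<unk>": 0, "forall": 1, "nat": 2}  # hypothetical entries

    def one_hot(word):
        """Map a word/symbol to a one-hot vector over the vocabulary,
        falling back to the <unk> index for out-of-vocabulary tokens."""
        vec = np.zeros(len(vocab), dtype=np.float32)
        vec[vocab.get(word, vocab["<unk>"])] = 1.0
        return vec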

Testing fails with EOF on test folder

I have run the following command: python evaluate.py ours+hammer ours+hammer-TEST --path runs/astactic/checkpoints/model_003.pth --filter zfc and it ran for ~14 hours before failing with the following:
(screenshot of the failure attached to the original issue)
Can you please advise what this is and how it may be remedied? We are trying to reproduce your results.

A little question about CoqGym

Hi, I found that there are some constants (premises) in the environment or goal that have no term or type. For example, there is this constant:
{'physical_path': 'coq_projects/coq/plugins/romega/ReflOmegaCore.vo:I.le_lt_int', 'short_ident': 'le_lt_int', 'qualid': 'SerTop.I.le_lt_int', 'term': None, 'type': None, 'sort': None, 'opaque': None, 'sexp': 'fd7fcdd748b0572ea9a408d07d29016aee494c5b'}

Is there something wrong? How can I get its type and other attributes? Thanks.
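
One thing worth checking (an assumption, not a confirmed answer): the 'sexp' field looks like a key into the sexp_cache LMDB database shipped with the dataset (the same database loaded by mdb_load during unzipping), so the full s-expression of the term may be retrievable from there, e.g.:

    import lmdb
    import sexpdata

    # The path to the extracted sexp_cache database is a placeholder; adjust as needed.
    env = lmdb.open("sexp_cache", readonly=True, lock=False)
    with env.begin() as txn:
        raw = txn.get(b"fd7fcdd748b0572ea9a408d07d29016aee494c5b")  # the 'sexp' hash above
    sexp = sexpdata.loads(raw.decode()) if raw is not None else None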

How to efficiently process a batch of terms in one go for Tree Neural Networks

I know that for vision one can process the entire batch in one go by having the batch be a dimension in the tensor. Is something like this possible for tree-like NNs? I am worried that having to do a forward pass for each example in my TNN might slow things down, but perhaps there is no other way. How do you do this, simply one term at a time?

(I assume multithreading might be possible, but I don't know how that would work with multiple threads/processes using the same GPU, and debugging sounds like a nightmare. But I am curious what the standard practice for this is, since TNN research exists.)
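
For what it's worth, here is a minimal sketch of the straightforward per-example approach, using placeholder names: `encoder` maps one tree to a vector and `loss_fn` scores it against a target. The forward passes run one tree at a time, but summing the per-example losses and calling backward() once still yields a single batched gradient step.

    import torch

    def batch_loss(encoder, loss_fn, batch):
        """batch is a list of (tree, target) pairs; trees are encoded one at
        a time, then the per-example losses are averaged into one scalar."""
        losses = []
        for tree, target in batch:
            embedding = encoder(tree)              # one forward pass per tree
            losses.append(loss_fn(embedding, target))
        return torch.stack(losses).mean()          # backward() on this gives a batched update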

Documentation for how to reproduce benchmark

I wanted to re-run the CoqGym benchmarks (e.g., 30% for ASTactic + hammer) before developing further. From my understanding, the ASTactic testing scripts need to be fed one proof at a time (e.g., "get_set_same"). How would you recommend passing the whole test dataset (or whatever was used for the benchmarks)?

python -u evaluate.py ours+hammer ours+hammer-TEST --path runs/astactic/checkpoints/model_003.pth --file ../data/StructTact/Assoc.json --proof "get_set_same"
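
In case it helps, here is a hedged sketch of driving evaluate.py over many files with the flags shown above; omitting --proof appears to run every proof in a file (as in the UnifySL example in an earlier issue), and which projects constitute the test split, plus the output naming, are assumptions to fill in.

    import glob
    import subprocess

    test_projects = ["zfc"]  # placeholder: list the projects in the test split
    for proj in test_projects:
        for f in sorted(glob.glob("../data/%s/*.json" % proj)):
            subprocess.run(
                ["python", "evaluate.py", "ours+hammer", "ours+hammer-TEST",
                 "--path", "runs/astactic/checkpoints/model_003.pth",
                 "--file", f],
                check=False,  # keep going even if one file fails
            )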

Hidden states in TermEncoder accessed before initialized

In the TermEncoder class, the hidden_states dictionary is declared at line 148
hidden_states = {} # node -> hidden state,
accessed at line 168
h = hidden_states[c],
and initialized at line 185
hidden_states[node] = hiddens[i].
I seem to be getting an error because hidden_states is empty when the code tries to access it at line 168.
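
For context, here is a minimal sketch (not the actual TermEncoder code) of the bottom-up pattern such an encoder typically follows, where every child's hidden state is stored before its parent is processed. A KeyError at the hidden_states[c] step usually means a node was visited before its children, e.g. because the term's AST came out in an unexpected shape.

    def encode_tree(node, cell, hidden_states):
        """Post-order traversal: children first, so hidden_states[c] is
        guaranteed to exist by the time the parent reads it. `cell` is a
        placeholder for whatever combines a node with its children's states."""
        for c in node.children:
            encode_tree(c, cell, hidden_states)
        child_states = [hidden_states[c] for c in node.children]
        hidden_states[node] = cell(node, child_states)
        return hidden_states[node]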
