llemma formal2formal

Scripts for the Lean formal2formal (tactic prediction) experiments in
Llemma: An Open Language Model for Mathematics [Azerbayev et al., 2023].

Setup

Install Python packages:

pip install -r requirements.txt

Install Lean:

# from https://leanprover-community.github.io/install/linux.html
# After running this command, select (2), then `nightly`, then `y`:
curl https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh -sSf | sh
source $HOME/.elan/env
lake

Configure LeanDojo:

export CONTAINER="native"
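If the experiments are launched from a Python entry point rather than a shell, the same setting can be applied inside the process. This is a minimal sketch, assuming LeanDojo reads the CONTAINER variable when it is first imported, so the assignment has to happen before any `import lean_dojo` line:

```python
import os

# In-process equivalent of `export CONTAINER="native"`.
# Assumption: LeanDojo picks this variable up at import time, so set it
# before importing lean_dojo anywhere in the program.
os.environ["CONTAINER"] = "native"

container_mode = os.environ["CONTAINER"]
print(f"LeanDojo container mode: {container_mode}")
```

Exporting the variable in the shell, as shown above, remains the simplest option when running the provided scripts directly.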

Run

See scripts

Compute metrics

python compute_metrics.py
==>

codellama7b_minif2f_test     0.20491803278688525   50   244
codellama34b_minif2f_test    0.22131147540983606   54   244
llemma7b_minif2f_test        0.26229508196721313   64   244
llemma34b_minif2f_test       0.2581967213114754    63   244
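The output columns are not labeled. Arithmetically, the second column equals the third divided by the fourth, so a reasonable reading is pass rate, theorems proved, and total theorems. The sketch below re-derives the rates under that assumption; the `pass_rate` helper is illustrative and is not taken from compute_metrics.py:

```python
# (run name, theorems proved, total theorems), copied from the output above.
# Assumption: the last two columns are "proved" and "total".
results = [
    ("codellama7b_minif2f_test", 50, 244),
    ("codellama34b_minif2f_test", 54, 244),
    ("llemma7b_minif2f_test", 64, 244),
    ("llemma34b_minif2f_test", 63, 244),
]


def pass_rate(proved: int, total: int) -> float:
    """Fraction of theorems proved (hypothetical helper for illustration)."""
    return proved / total


for name, proved, total in results:
    print(f"{name:<28}{pass_rate(proved, total):.4f}  ({proved}/{total})")
```

Under this reading, llemma7b proves 64 of the 244 minif2f test theorems, about 26.2%.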

Troubleshooting

  • We observe a Ray error when running the 34b script (with vLLM --tp-degree > 1) on a LeanDojo repo that has not yet been traced. As a workaround, first run the 7b script with --tp-degree 1 so that LeanDojo finishes tracing the repo, then run the 34b script with --tp-degree > 1.

Citation

Please cite the following:

@misc{azerbayev2023llemma,
      title={Llemma: An Open Language Model For Mathematics}, 
      author={Zhangir Azerbayev and Hailey Schoelkopf and Keiran Paster and Marco Dos Santos and Stephen McAleer and Albert Q. Jiang and Jia Deng and Stella Biderman and Sean Welleck},
      year={2023},
      eprint={2310.10631},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

llemma_formal2formal's Issues

A better architecture might let the language model determine possible methods for solving mathematical problems. Has anyone thought about this direction?

A better architecture might be one in which the language model first determines possible methods for solving a mathematical problem, then transforms the solution into a series of thoughts and fills in the thought chain. The transformation step should be more general so that the thinking is more divergent, while the step that fills in the thought chain needs to be more accurate.

This is somewhat like the ChatGPT + Mathematica architecture. The problem with that architecture is that Mathematica relies too heavily on hard-coded rules and often reports errors, and ChatGPT is not specifically trained to split a mathematical problem into step-by-step solutions.

Performance on Minif2f-valid

Thank you for your valuable contribution to formal theorem proving. I would like to cite your work, but I could not find a reported pass rate on minif2f-valid. I do not currently have the resources to reproduce the results, so I wonder whether you have evaluated on minif2f-valid and, if so, what the performance is. Thank you in advance!
