
instinct's Introduction

Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers [ICML 2024]

Xiaoqiang Lin*, Zhaoxuan Wu*, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet, Bryan Kian Hsiang Low

Project Homepage | ArXiv | Paper

This is the code for the paper: Use Your INSTINCT: Instruction Optimization Using Neural Bandits Coupled with Transformers. We provide the code for all of our experiments, including:

  • Instruction induction
  • Improving chain-of-thought instruction

Our code is based on the code from APE and InstructZero.

Abstract

Large language models (LLMs) have shown remarkable instruction-following capabilities and achieved impressive performance in various applications. However, the performance of LLMs depends heavily on the instructions given to them, which are typically manually tuned with substantial human effort. Recent work has used the query-efficient Bayesian optimization (BO) algorithm to automatically optimize the instructions given to black-box LLMs. However, BO usually falls short when optimizing highly sophisticated (e.g., high-dimensional) objective functions, such as the functions mapping an instruction to the performance of an LLM. This is mainly due to the limited expressive power of the Gaussian process (GP) model which is used by BO as a surrogate to model the objective function. Meanwhile, it has been repeatedly shown that neural networks (NNs), especially pre-trained transformers, possess strong expressive power and can model highly complex functions. So, we adopt a neural bandit algorithm which replaces the GP in BO with an NN surrogate to optimize instructions for black-box LLMs. More importantly, the neural bandit algorithm allows us to naturally couple the NN surrogate with the hidden representation learned by a pre-trained transformer (i.e., an open-source LLM), which significantly boosts its performance. These observations motivate us to propose our INSTruction optimization usIng Neural bandits Coupled with Transformers (INSTINCT) algorithm. We perform instruction optimization for ChatGPT and use extensive experiments to show that INSTINCT consistently outperforms existing methods on different tasks, such as various instruction induction tasks and the task of improving the zero-shot chain-of-thought instruction.
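For intuition, the sketch below illustrates the kind of neural-bandit loop described above: candidate soft prompts (random vectors here, standing in for transformer hidden representations) are scored by an NN surrogate, the most promising candidate is queried against a stand-in black-box objective, and the surrogate is refit on all observations so far. Every name, the toy objective, and the simplified exploration bonus are our own illustrative assumptions, not the authors' implementation:

import torch
import torch.nn as nn

torch.manual_seed(0)
DIM = 32           # stand-in for the transformer hidden dimension
N_CANDIDATES = 64  # pool of candidate soft prompts per round

# Stand-in for the black-box LLM evaluation (instruction -> validation score).
def black_box_score(z):
    target = torch.ones(DIM) / DIM ** 0.5
    return float(z @ target)

# NN surrogate replacing the GP: hidden representation -> predicted score.
surrogate = nn.Sequential(nn.Linear(DIM, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)

history_z, history_y = [], []
for t in range(20):
    # Candidate soft prompts; in INSTINCT these would be hidden
    # representations from a frozen open-source LLM.
    cands = torch.randn(N_CANDIDATES, DIM)

    # Optimistic selection: predicted mean plus a crude exploration bonus.
    # (The real NeuralUCB-style bonus uses parameter gradients and a
    # covariance matrix; random jitter is a deliberate simplification.)
    with torch.no_grad():
        mu = surrogate(cands).squeeze(-1)
    bonus = 1.0 / (t + 1) ** 0.5
    z = cands[int(torch.argmax(mu + bonus * torch.rand(N_CANDIDATES)))]

    # Query the black box and refit the surrogate on all observations.
    history_z.append(z)
    history_y.append(black_box_score(z))
    Z = torch.stack(history_z)
    y = torch.tensor(history_y).unsqueeze(-1)
    for _ in range(50):
        opt.zero_grad()
        loss = nn.functional.mse_loss(surrogate(Z), y)
        loss.backward()
        opt.step()

print("best score found:", max(history_y))

In the actual algorithm, the candidates are hidden representations produced by a frozen pre-trained transformer, and the black-box score comes from evaluating the induced instruction with ChatGPT.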

Prepare the data

You can download the data for instruction induction from the GitHub repo of InstructZero. You can download the SAMSum dataset from the Hugging Face website. You can download the datasets for GSM8K, AQUA-RAT, and SVAMP from the APE repo.
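For SAMSum, one convenient route (our suggestion, not a step from the original instructions) is the Hugging Face datasets library; note that this dataset additionally requires the py7zr package:

from datasets import load_dataset

# Downloads and caches the SAMSum dialogue-summarization dataset.
# Requires: pip install datasets py7zr
samsum = load_dataset("samsum")
print(samsum["train"][0]["dialogue"])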

We put the data preparation notebooks at COT/experiments/data/instruction_induction/pre_aqua.ipynb, COT/experiments/data/instruction_induction/pre_gsm8k.ipynb, and Induction/experiments/data/nlptasks/pre_nlp_data.ipynb.

Run our code

To run our code, you need to install the environment using conda:

conda env create -f environment.yml

We provide a bash script for running our instruction induction experiments at Induction/experiments/run_neural_bandits.sh. To run it properly, execute the following in the terminal:

cd Induction
bash experiments/run_neural_bandits.sh

Similarly, to run our code for improving the chain-of-thought instruction, run the script COT/experiments/run_cot_bandits.sh as follows:

cd COT
bash experiments/run_cot_bandits.sh

Note that before you run the above bash scripts, you need to specify the OpenAI API key used for calling the gpt-3.5-turbo-0301 API. To do so, change the following line in the two bash scripts:

export OPENAI_API_KEY=YOUR_KEY
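As an optional sanity check (our suggestion, not part of the original instructions), you can verify that the key works before launching a long run. The snippet below assumes the legacy openai<1.0 Python client used by repos of this era:

import os
import openai

# Reads the key exported in the bash scripts above.
openai.api_key = os.environ["OPENAI_API_KEY"]
resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0301",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=5,
)
print(resp["choices"][0]["message"]["content"])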

BibTeX

@inproceedings{lin2024use,
        title={Use Your {INSTINCT}: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers},
        author={Xiaoqiang Lin and Zhaoxuan Wu and Zhongxiang Dai and Wenyang Hu and Yao Shu and See-Kiong Ng and Patrick Jaillet and Bryan Kian Hsiang Low},
        year={2024},
        booktitle={Proc. ICML}
}


instinct's Issues

Training problem

Hello, I tried the instruction induction training, but there are some problems.
Take the task "antonyms" as an example.
First, the model may fail to find a good prompt even after 165 iterations. The generated prompts never come close to "antonym" or similar words; instead, irrelevant prompts are produced throughout.
Second, sometimes a somewhat good result is found, but it is a few-shot example rather than a properly induced instruction.

Instruction: ['The instruction was to take a nap\nInput: unstack\nOutput: stack\nInput: unstack\nOutput: stack\nInput: unstack\nOutput: stack\nInput: unstack\nOutput: stack\nInput: unstack\nOutput: stack\nInput: unstack\nOutput: stack\nInput: unstack\nOutput']

Also, my training loss is extremely low, which seems strange. I wonder if there is a bug.

Start training...
Training Loss :  5.357048022363033e-08
iter 107 --- reward: 0.0
Best value found till now: 0.95
Start selecting...
Instruction: ['The instruction was to cook the dinner, which was a dish of some sort']
gpt-3.5-turbo-0301

INSTINCT for white-box LLMs

Hello, thanks for your nice work; I really learned a lot from your paper.
I have a question: INSTINCT is designed for black-box LLMs like ChatGPT.
If I have an open-source LLM of 6-7B parameters and some data to tune it myself, would the performance of INSTINCT match (or surpass) gradient-based methods such as p-tuning, as mentioned in your paper (page 2, line 2)?
