This repository is provided to replicate the results of our paper "Prompt Learning for Developing Software Exploits".
The Python and Assembly datasets after the Parser are in \py-IP and \data_shell_gen_IP.
Run the command below:
python shell_prompt_t5.py --save_init --do_train --do_eval --do_test --train_filename data_shell_gen_IP\\decoder-train.json.seq2seq --dev_filename data_shell_gen_IP\\decoder-dev.json.seq2seq --test_filename data_shell_gen_IP\\decoder-test.json.seq2seq --model_name Salesforce/codet5-base --loss_filename loss/demo.csv --num_train_epochs 20 --visible_gpu <GPU> --max_source_length 256 --max_target_length 128 --train_batch_size 4 --eval_batch_size 4 --log_name=./log/demo.log --output_dir=demo_output
bash evaluate.sh demo_output data_shell_gen_IP
In this case, demo_output is the [eval_data] and data_shell_gen_IP is the [data_dir].
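The two commands above can also be assembled programmatically, which avoids hard-coding the Windows-style `\\` path separators. The sketch below is only an illustration: the helper names `build_train_command` and `build_eval_command` are hypothetical and not part of this repo, every flag is copied from the commands above, and it assumes `shell_prompt_t5.py` accepts dataset paths written with the native separator of the operating system.

```python
import os
import sys


def build_train_command(data_dir, gpu, output_dir="demo_output"):
    """Return the training command shown above as an argument list.

    Hypothetical helper: only the path separator differs from the README command.
    """
    def split_file(split):
        # e.g. data_shell_gen_IP/decoder-train.json.seq2seq on Linux
        return os.path.join(data_dir, "decoder-{}.json.seq2seq".format(split))

    return [
        "python", "shell_prompt_t5.py",
        "--save_init", "--do_train", "--do_eval", "--do_test",
        "--train_filename", split_file("train"),
        "--dev_filename", split_file("dev"),
        "--test_filename", split_file("test"),
        "--model_name", "Salesforce/codet5-base",
        "--loss_filename", "loss/demo.csv",
        "--num_train_epochs", "20",
        "--visible_gpu", str(gpu),
        "--max_source_length", "256",
        "--max_target_length", "128",
        "--train_batch_size", "4",
        "--eval_batch_size", "4",
        "--log_name=./log/demo.log",
        "--output_dir={}".format(output_dir),
    ]


def build_eval_command(eval_data="demo_output", data_dir="data_shell_gen_IP"):
    """Return the evaluation command: bash evaluate.sh [eval_data] [data_dir]."""
    return ["bash", "evaluate.sh", eval_data, data_dir]


if __name__ == "__main__":
    # The GPU id is taken from the command line; the <GPU> placeholder is kept otherwise.
    gpu = sys.argv[1] if len(sys.argv) > 1 else "<GPU>"
    print(" ".join(build_train_command("data_shell_gen_IP", gpu)))
    print(" ".join(build_eval_command()))
```

The returned lists can be passed to `subprocess.run(cmd, check=True)` from the repository root once a real GPU id has been supplied.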
The human evaluation can be found in \human_eval.
All the generated results are in \generated_samples.
python 3.7
pytorch 1.10.0
openprompt 0.1.1
rouge 0.3.0
nlg-eval 2.3
nltk 3.7
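The package versions listed above can be compared against the current environment with a short script. This is a minimal sketch, not part of the repo: the function names are hypothetical, and the pip distribution names are assumptions (`torch` for pytorch, `nlg-eval` for nlg-eval). On Python 3.7 it also assumes the `importlib_metadata` backport is installed, because `importlib.metadata` only exists from Python 3.8.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: the importlib_metadata backport is available on Python 3.7
    from importlib_metadata import PackageNotFoundError, version

# Versions from the list above; keys are assumed pip distribution names.
PINNED = {
    "torch": "1.10.0",
    "openprompt": "0.1.1",
    "rouge": "0.3.0",
    "nlg-eval": "2.3",
    "nltk": "3.7",
}


def matches(installed, pinned):
    """True if installed equals pinned, ignoring a local suffix such as +cu113."""
    base = installed.split("+")[0]
    return base == pinned or base.startswith(pinned + ".")


def find_mismatches(pinned, installed):
    """Map each missing or mismatching package to (pinned, installed)."""
    problems = {}
    for name, wanted in pinned.items():
        found = installed.get(name)
        if found is None or not matches(found, wanted):
            problems[name] = (wanted, found)
    return problems


def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    report = find_mismatches(PINNED, installed_versions(PINNED))
    for name, (wanted, found) in report.items():
        print("{}: expected {}, found {}".format(name, wanted, found))
```

The Python version itself (3.7) is not checked here; it can be read from `sys.version_info`.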