Comments (3)
Thanks for your interest in LMFlow! The LMFlow benchmark doesn't support automatic PubMedQA evaluation yet, but adding it should not be that difficult. @2003pro, could you take a look?
from lmflow.
Here is the key regex for extracting the answer from responses generated by the LoRA-tuned model. You can adapt it for your evaluation script:
elif args.dataset == "pubmedqa":
    # pattern = "Output: (yes|no|maybe)"
    # sttr = re.search(pattern, temp)
    # answer = sttr.group(0)[8:] if sttr is not None else "N/A"
    answer_map = {"a":"yes","b":"no","c":"maybe","A":"yes","B":"no","C":"maybe","N/A":"N/A"}
    # Raw strings keep "\(" and "\s" as regex escapes rather than string escapes.
    pattern = r"(answer|Answer|ANSWER|output|Output|OUTPUT|A): \(*(A|B|C|a|b|c)"
    sttr = re.search(pattern, pred)
    if sttr is not None:
        mid_answer = sttr.group(0)
        answer = mid_answer[-1].lower()
    else:
        pattern = r"\(*(A|B|C|a|b|c)\)*(\.|\s)"
        sttr = re.search(pattern, pred)
        if sttr is not None:
            if '(' in sttr.group(0):
                answer = sttr.group(0)[1].lower()
            else:
                answer = sttr.group(0)[0].lower()
        else:
            answer = "N/A"
    return answer_map[answer]
elif args.dataset == "medmcqa":
    # pattern = "Output: (A|B|C|D)."
    # sttr = re.search(pattern, temp)
    # answer = sttr.group(0)[8:-1].lower() if sttr is not None else "N/A"
    pattern = r"(answer|Answer|ANSWER|output|Output|OUTPUT|A): \(*(A|B|C|D|a|b|c|d)"
    sttr = re.search(pattern, pred)
    if sttr is not None:
        mid_answer = sttr.group(0)
        answer = mid_answer[-1].lower()
    else:
        pattern = r"\(*(A|B|C|D|a|b|c|d)\)*(\.|\s)"
        sttr = re.search(pattern, pred)
        if sttr is not None:
            if '(' in sttr.group(0):
                answer = sttr.group(0)[1].lower()
            else:
                answer = sttr.group(0)[0].lower()
        else:
            answer = "N/A"
    return answer
elif args.dataset == "usmle":
    # pattern = "Output: (A|B|C|D)."
    # sttr = re.search(pattern, temp)
    # answer = sttr.group(0)[8:-1].lower() if sttr is not None else "N/A"
    pattern = r"(Answer|Output|A): \(*(A|B|C|D|a|b|c|d)"
    sttr = re.search(pattern, pred)
    if sttr is not None:
        mid_answer = sttr.group(0)
        answer = mid_answer[-1].lower()
    else:
        pattern = r"\(*(A|B|C|D|a|b|c|d)\)*(\.|\s)"
        sttr = re.search(pattern, pred)
        if sttr is not None:
            if '(' in sttr.group(0):
                answer = sttr.group(0)[1].lower()
            else:
                answer = sttr.group(0)[0].lower()
        else:
            answer = "N/A"
    return answer
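If you want to try the same idea outside the benchmark script, here is a minimal standalone sketch. It is not LMFlow code: the names `extract_choice`, `pubmedqa_accuracy` and `PUBMEDQA_MAP` are made up for illustration, and it reads the option letter from a capture group instead of indexing into the matched string, so inputs with repeated parentheses behave slightly differently from the snippet above. It also assumes the gold labels are the strings "yes", "no" and "maybe".

```python
import re

# Hypothetical mapping from option letter to PubMedQA label (mirrors answer_map above).
PUBMEDQA_MAP = {"a": "yes", "b": "no", "c": "maybe"}


def extract_choice(pred, letters="ABC"):
    """Return the lowercase option letter found in pred, or "N/A" if none is found."""
    options = letters.upper() + letters.lower()
    # First try an explicit marker such as "Answer: (B" or "Output: c".
    explicit = re.search(
        r"(?:answer|Answer|ANSWER|output|Output|OUTPUT|A): \(*([%s])" % options,
        pred,
    )
    if explicit is not None:
        return explicit.group(1).lower()
    # Fall back to a bare option letter followed by "." or whitespace,
    # optionally wrapped in parentheses, e.g. "(A) " or "D.".
    bare = re.search(r"\(*([%s])\)*(?:\.|\s)" % options, pred)
    if bare is not None:
        return bare.group(1).lower()
    return "N/A"


def pubmedqa_accuracy(preds, golds):
    """Fraction of predictions whose extracted label equals the gold label."""
    answers = [PUBMEDQA_MAP.get(extract_choice(p), "N/A") for p in preds]
    correct = sum(a == g for a, g in zip(answers, golds))
    return correct / len(golds) if golds else 0.0
```

For MedMCQA or USMLE style questions you would call `extract_choice(pred, letters="ABCD")` and compare the letter directly. Note that the fallback pattern is loose: any "a", "b" or "c" followed by whitespace or a period (including one at the end of an ordinary word) will match, so free-form answers can be misread.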
Thanks for your reply. I can use lm_eval to evaluate.