tannonk / llm_inference

LLM inference with HuggingFace (experimental)

Python 14.18% Shell 1.65% Jupyter Notebook 84.17%

llm_inference's People

dennlinger municipalnoir alisonhc lmvasque sweta20

llm_inference's Issues

generated checklists are incorrect

Checklists generated with scripts/get_results.py are currently incorrect.

The script should not expect cross-dataset test/valid set combinations, e.g.

bloom,asset-test,med-easi-validation,3,p0,1,287,random,No
bloom,asset-test,med-easi-validation,3,p0,1,489,random,No
bloom,asset-test,med-easi-validation,3,p0,1,723,random,No
bloom,asset-test,med-easi-validation,3,p0,1,732,random,No

@lmvasque
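One possible fix, sketched under assumptions: drop any checklist row whose test set and validation set come from different datasets. The column positions (second and third field), the split suffixes, and the helper names `dataset_name` and `same_dataset_rows` are all hypothetical and are not taken from scripts/get_results.py. The second sample row is made up to show a row that should survive.

```python
import csv
import io


def dataset_name(split_id: str) -> str:
    """Strip a trailing split suffix, e.g. 'med-easi-validation' -> 'med-easi'."""
    # Assumed suffixes; adjust to whatever the real split identifiers look like.
    for suffix in ("-validation", "-valid", "-test", "-train"):
        if split_id.endswith(suffix):
            return split_id[: -len(suffix)]
    return split_id


def same_dataset_rows(rows):
    """Keep only rows whose test set (field 2) and validation set (field 3) share a dataset."""
    return [row for row in rows if dataset_name(row[1]) == dataset_name(row[2])]


# First row is from the issue above; second row is an invented same-dataset example.
raw = """bloom,asset-test,med-easi-validation,3,p0,1,287,random,No
bloom,asset-test,asset-validation,3,p0,1,287,random,No
"""

rows = list(csv.reader(io.StringIO(raw)))
kept = same_dataset_rows(rows)
```

Here `kept` holds only the row that pairs asset-test with asset-validation. Whether the filter belongs at generation time or as a post-processing step depends on how the checklist is built.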

Replace `assert` with `raise SomeError`

One of my nitpicks in Python code is that assert statements are skipped when Python is run with the -O flag. Although that is unlikely here, it is better practice to replace them with raise SomeError(""), which always triggers.
This also forces us to write (more or less specific) error messages, so that users ideally have a clear idea of why the code failed.
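A minimal sketch of the pattern, using a made-up `check_batch_size` helper and `ValueError` as a stand-in for whatever error type fits the actual check:

```python
def check_batch_size(batch_size: int) -> int:
    """Validate a (hypothetical) batch size argument."""
    # An `assert batch_size > 0` here would be stripped under `python -O`.
    # An explicit check always runs and carries a descriptive message.
    if batch_size <= 0:
        raise ValueError(f"batch_size must be positive, got {batch_size}")
    return batch_size
```

Calling `check_batch_size(0)` raises the `ValueError` regardless of interpreter flags.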

Sample `prompts/p0.json`

The example script for running inference.py lists a prompts/p0.json file path, which does not seem to be covered by the installation instructions. I have run the scripts/fetch_data.sh script, but the file is still missing.

Would it be possible to share this as a dummy file so that users can build on top of this template?

Allow data loading from archival sources (e.g., Huggingface datasets)

I wanted to collect a number of open tasks that I can think of as "issues", which hopefully makes it easier for people to collaborate on the code base.

To efficiently run experiments later on, we should probably look into writing a loader class that can generalize beyond file inputs to something like Huggingface datasets or .csv/.tsv files. This would also be good practice to enable a wider adoption of this script after whatever experiments we run.
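As a starting point for discussion, such a loader might look like the sketch below. The function name `iter_records`, the dispatch on file extension, and the fallback of treating any other string as a Huggingface dataset name are assumptions, not existing code in this repository. The `datasets` import is kept lazy so that file-based loading works without the optional dependency.

```python
import csv
import json
from pathlib import Path
from typing import Dict, Iterator


def iter_records(source: str, split: str = "test") -> Iterator[Dict[str, str]]:
    """Yield one dict per example from a .csv/.tsv/.jsonl file or a Huggingface dataset."""
    path = Path(source)
    suffix = path.suffix.lower()

    if suffix in {".csv", ".tsv"}:
        delimiter = "\t" if suffix == ".tsv" else ","
        with path.open(newline="", encoding="utf-8") as f:
            # The first line is assumed to be a header row.
            yield from csv.DictReader(f, delimiter=delimiter)
    elif suffix == ".jsonl":
        with path.open(encoding="utf-8") as f:
            for line in f:
                if line.strip():  # skip blank lines
                    yield json.loads(line)
    else:
        # Assumption: anything else is a dataset name on the Huggingface Hub.
        from datasets import load_dataset  # optional dependency

        for record in load_dataset(source, split=split):
            yield dict(record)
```

A call such as `list(iter_records("data/example.tsv"))` (path made up) would return the rows as dictionaries keyed by the header columns.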

[Potential Bug] Typing incompatibility in util functions

Hi,
just noticed that there is a potential type conflict between serialize_to_jsonl (which expects a List[str] as its second input) and the output of postprocess_model_outputs (List[List[str]]).
This is relevant for line 107 in inference.py, but I am not sure whether this is just a mistake on the coding end or something that could actually cause an error when running the script.
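To make the mismatch concrete, here is a sketch with hypothetical stand-ins: `to_jsonl_lines` is not the real serialize_to_jsonl, and `flatten_outputs` is just one way to reconcile the two types. The alternative would be to widen the annotation on the serializer.

```python
import json
from typing import List


def flatten_outputs(outputs: List[List[str]]) -> List[str]:
    """Collapse per-input lists of generations into one flat list of strings."""
    return [text for group in outputs for text in group]


def to_jsonl_lines(items: List[str]) -> List[str]:
    """Hypothetical stand-in for a serializer annotated to take List[str]."""
    return [json.dumps(item, ensure_ascii=False) for item in items]


nested = [["a", "b"], ["c"]]  # shape of a List[List[str]] output
lines = to_jsonl_lines(flatten_outputs(nested))
```

In this stand-in, passing `nested` directly would not raise, because `json.dumps` also accepts lists; it would write one JSON array per line instead of one string. Whether the real function behaves the same way depends on its body.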
