
zeno-build's People

Contributors

amangupt01, cabreraalex, krrishdholakia, lindiatjuatja, macabdul9, maksimstw, neubig, pogayo, sparkier, zhaochenyang20, zorazrw, zwhe99


zeno-build's Issues

Create chatbots demo page

We now have chatbots implemented as of #30

We should:

  • Create a publicly accessible page demonstrating the interesting findings we get by using Zeno Build to compare chatbots
  • Add a link to this page from the main README

This will require finishing several sub-issues, which we can create and link to this one.

Make it possible to explore model confidences

Most generative models can provide an uncertainty level, and it would be interesting to explore questions such as the correlation between model confidence and accuracy.

In order to do this, we would first need to modify the generate functions (such as the one in chat_generate.py) to output model uncertainties as well as strings.

This is certainly possible for Hugging Face models, and may be possible for API-based models.

Once these confidences are returned by the generate function, they would need to be passed on to the dataframe that is fed into Zeno in each individual example, such as the chatbot or summarization examples. For reference, the dataframe for the chatbot example is constructed here.
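As a sketch of what the generate functions could return (the `GenerationOutput` type and its fields are hypothetical, not the actual zeno-build API), a length-normalized confidence can be derived from per-token log-probabilities:

```python
import math
from dataclasses import dataclass


@dataclass
class GenerationOutput:
    """Hypothetical return type: generated text plus per-token log-probs."""

    text: str
    token_logprobs: list  # one log-probability per generated token

    @property
    def confidence(self) -> float:
        # Geometric mean of token probabilities (length-normalized),
        # so short and long outputs are comparable.
        if not self.token_logprobs:
            return 0.0
        return math.exp(sum(self.token_logprobs) / len(self.token_logprobs))
```

A column of these confidence values could then be added to the dataframe alongside each output string.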

ChatBots: Support uni_eval

For the chatbots demo, we could support other evaluation metrics such as uni_eval for dialog evaluation, which may give us better insights.

Task: Create folder with Zeno functions

Many Zeno functions will be reusable across tasks, and can be passed into the visualize function.

For example, metrics like text length and unique-word count could be useful distill functions.

We might want to categorize this folder by task or data type.
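As a sketch of the kind of reusable functions this folder could hold (names are hypothetical; in practice these would be wrapped as Zeno distill functions operating on the dataframe):

```python
def text_length(text: str) -> int:
    """Number of characters in the text."""
    return len(text)


def unique_word_ratio(text: str) -> float:
    """Fraction of distinct words; low values suggest repetitive output."""
    words = text.lower().split()
    return len(set(words)) / len(words) if words else 0.0
```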

Think of a good way to display model names in Zeno

Currently model names are displayed as model0, model1, but it'd be nice to have a better way of displaying them.

One suggestion is to display all of the non-constant parameters.
For example in the chatbots example, the prompt, model, and temperature are variable, so we could display those three parameters.

Core: Deduplicate Metrics

Currently metrics are implemented in multiple places. This is a potential source of bugs and inconsistency, so we should deduplicate them and rely on the Zeno implementations.

Core: Debug duplicate models

For some reason, there are duplicate models in the results. This should be investigated.

For instance, there are 10 models in the "results" file, but only 4 after deduplication.

Visualizing 10 models
-1960601404797368550 {'training_dataset': 'sst2', 'base_model': 'bert-base-uncased', 'learning_rate': 7.032465170166586e-05, 'num_train_epochs': 3, 'weight_decay': 0.0013326587635397158, 'bias': 0.9619960134236489}
8087707616790777082 {'training_dataset': 'imdb', 'base_model': 'distilbert-base-uncased', 'learning_rate': 0.0007441349947622346, 'num_train_epochs': 2, 'weight_decay': 0.0022321073814882274, 'bias': 0.4729424283280248}
974984753089123939 {'training_dataset': 'imdb', 'base_model': 'bert-base-uncased', 'learning_rate': 4.1464852686965755e-05, 'num_train_epochs': 1, 'weight_decay': 0.0021863797480360337, 'bias': 0.010710576206724776}
-5709593449973764166 {'training_dataset': 'imdb', 'base_model': 'distilbert-base-uncased', 'learning_rate': 0.0007188594167931795, 'num_train_epochs': 4, 'weight_decay': 0.002204406220406967, 'bias': 0.17853136775181744}
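One way to confirm whether these are true duplicates is to key models by a canonical serialization of their parameters rather than by hash; a sketch (function name hypothetical):

```python
import json


def dedup_models(models):
    """Group model IDs by a canonical serialization of their parameters.

    models: mapping of model ID -> parameter dict. Returns one
    representative ID per distinct parameter configuration.
    """
    seen = {}
    for model_id, params in models.items():
        key = json.dumps(params, sort_keys=True)  # order-independent key
        seen.setdefault(key, model_id)
    return seen
```

If this yields 4 configurations from the 10 entries in the results file, the extra entries have identical parameters and the ID generation (rather than the experiment grid) is the likely culprit.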

Add an exhaustive optimizer

In some experiments we just want to do a complete search over the entire search space and enumerate all of the possibilities.

Currently we only have the RandomOptimizer, but we should also have an ExhaustiveOptimizer that does an exhaustive search.

This optimizer will be incompatible with Float search spaces, as they cannot be enumerated.
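A minimal sketch of the enumeration logic (names hypothetical; the real ExhaustiveOptimizer would plug into the existing optimizer interface), taking the product over discrete dimensions and rejecting non-enumerable ones:

```python
import itertools


def exhaustive_configs(space):
    """Enumerate every combination of discrete parameter values.

    space: mapping of parameter name -> list of candidate values.
    Raises TypeError for non-enumerable (e.g. Float) dimensions.
    """
    for name, values in space.items():
        if not isinstance(values, (list, tuple)):
            raise TypeError(f"dimension {name!r} is not enumerable")
    names = list(space)
    return [dict(zip(names, combo))
            for combo in itertools.product(*(space[n] for n in names))]
```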

Support multi-GPU inference for the Hugging Face provider

For locally hosted models from Hugging Face, it would be good to support multi-GPU inference, including:

  • Model parallelism to make sure that larger models fit in memory
  • Data parallelism for improved inference speed

Currently inference is handled by the Hugging Face provider, in `generate_from_huggingface`. Any code to support multi-GPU inference would have to be added there. Contributions are welcome!
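For the model-parallel side, Hugging Face models can typically be sharded across GPUs with `device_map="auto"` (via the accelerate library). The data-parallel side is mostly scheduling; a sketch of the shard logic (function name hypothetical):

```python
def shard_prompts(prompts, num_devices):
    """Round-robin partition of prompts across devices; each shard would
    be sent to a separate model replica for data-parallel inference."""
    shards = [[] for _ in range(num_devices)]
    for i, prompt in enumerate(prompts):
        shards[i % num_devices].append(prompt)
    return shards
```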

Error report: beginner's problem

I'm not sure what to do about this. I made the following changes to the original file and ran the code in VS Code. do_prediction works, but do_visualization does not.

I modified these hyperparameters:
[screenshot]

[screenshot]
I removed the `l`, turning it into:
[screenshot]
I changed the address:
[screenshot]
I added this code at the very beginning of the file:
[screenshot]

The earlier error no longer appears, but now I don't know how to solve this one:
[screenshot]

Auto-generate documentation

Currently we do not have any automatically generated documentation, including API docs. It would be nice to have this.

The main Zeno page has this, so perhaps we could use the same method.

openai_utils.py throws Unclosed connector Error

I was using openai_utils.py for ChatGPT inference. It always worked fine for the first couple hundred samples, and then it always crashed with the following error. I tried lowering request_per_minute, but the problem persists.

Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x000001FC0B9CFD10>
 92%|████████████████████████████████████████████████████████████████████████      | 1360/1473 [10:27<00:52,  2.17it/s]
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x000001FC0C137AF0>, 482899.734), (<aiohttp.client_proto.ResponseHandler object at 0x000001FC0C2A5550>, 482900.89)]']
connector: <aiohttp.connector.TCPConnector object at 0x000001FC0B9CFD50>
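The "Unclosed client session" warning usually means an aiohttp.ClientSession was garbage-collected without being closed, e.g. when an exception skips the close() call. Wrapping the session in `async with` (or closing it in a `finally` block) guarantees cleanup; a sketch of the pattern, using a stand-in class so it runs without aiohttp:

```python
import asyncio


class FakeSession:
    """Stand-in for aiohttp.ClientSession, to illustrate the pattern."""

    def __init__(self):
        self.closed = False

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        self.closed = True  # close() runs even if requests raised


async def run_requests():
    async with FakeSession() as session:
        pass  # issue the API requests here
    return session.closed  # the session is closed on exit
```

Whether this also explains the crash at sample 1360/1473 would need separate investigation; the warning itself only reports the missing cleanup.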

Create email address

Some people may feel more comfortable getting in touch through private channels such as email. Perhaps we should create an email address for Zeno Build, or for Zeno in general.

Task: Chat Your Data

We can create an example task for "chat your data", similar to the one implemented in LangChain.

ChatBots: Find interesting and catchy trends

Find at least 3 (ideally 5) interesting trends that can be demonstrated by browsing the results.
Write these up in a doc, e.g. a Google Doc or Notion page, so they can be posted as tweets.

Make modeling library dependencies conditional

Currently libraries such as OpenAI, Cohere, and Hugging Face are imported unconditionally in the core library code. However, it would be better for at least OpenAI and Cohere to be required only when they are actually used. To do this, we can use dynamic imports and warn users that they need to install the missing library.
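A sketch of the dynamic-import pattern (function name hypothetical):

```python
import importlib


def optional_import(module_name, pip_name):
    """Import a provider library on demand, with an actionable error."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{module_name!r} is required for this provider; "
            f"install it with `pip install {pip_name}`"
        ) from err
```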

UX: Model names are opaque numbers

In the text classification demo, right now it seems that the model names are just numbers, so it's hard to tell which model is which.

[screenshot: Screen Shot 2023-04-19 at 6 13 08 PM]

We should think about model naming. Here are some ideas:

  1. "model1", "model2", "model3". Not very useful, but if the parameters can be found somewhere in the Zeno interface (is this a possibility?) then we could look up each model.
  2. Find all parameters that vary between models, and print out a model name consisting of the concatenation of the parameters. For example, if we have "learning_rate", "training_data", and "batch_size", where "batch_size" is constant across all models, we could make the name "learning_rate=xxx,training_data=yyy".
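Option 2 can be implemented generically; a sketch (function name hypothetical):

```python
def model_name(params, all_params):
    """Name a model by the parameters that actually vary across models."""
    varying = [k for k in params
               if len({str(p.get(k)) for p in all_params}) > 1]
    return ",".join(f"{k}={params[k]}" for k in varying) or "default"
```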

Error report

I ran the code in the directory 'zeno-build-main\examples\text_classification\main', but PyCharm keeps reporting errors like this, and I don't know how to solve it:
[screenshot]

Add search space that supports multiple experiments

Right now Zeno Build supports CombinatorialSearchSpace, which takes the cross product between all parameter configurations.

However, in many cases it's common to run multiple experiments, where you explore some part of the experiment space in the first experiment, and another part of the search space in the second experiment.

A current workaround is to create two different configuration files and either choose which one to use or run both sequentially. Another option is to specify this directly in the search space, with something like:

space = CompositeSearchSpace([
   CombinatorialSearchSpace({...}),  # experiment 1
   CombinatorialSearchSpace({...}),  # experiment 2
])
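A minimal sketch of how CompositeSearchSpace could work (the class body and the FixedSpace stub are hypothetical illustrations, not the zeno-build API):

```python
import random


class CompositeSearchSpace:
    """Union of sub-spaces: sampling picks a sub-space, then delegates."""

    def __init__(self, spaces, weights=None):
        self.spaces = spaces
        self.weights = weights  # optional probability of each sub-space

    def sample(self):
        (space,) = random.choices(self.spaces, weights=self.weights)
        return space.sample()


class FixedSpace:
    """Stand-in sub-space that always yields one configuration."""

    def __init__(self, config):
        self.config = config

    def sample(self):
        return dict(self.config)
```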

Doc: Main README

We need to write up the main README. I can take a first stab at it.

Core: Consolidate code

We now have three example tasks, so we can start consolidating the code to reduce copy-pasting across them.

Zeno error: DataFrame columns must be unique for orient='records'.

When running Zeno and trying to visualize results, I get the following difficult-to-understand error.
@cabreraalex any ideas about how to fix this?

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 429, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/routing.py", line 443, in handle
    await self.app(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/fastapi/routing.py", line 237, in app
    raw_response = await run_endpoint_function(
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/fastapi/routing.py", line 165, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/zeno/server.py", line 116, in get_filtered_table
    return zeno.get_filtered_table(req)
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/zeno/backend.py", line 587, in get_filtered_table
    return filt_df[[str(col) for col in req.columns]].to_json(orient="records")
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/pandas/core/generic.py", line 2532, in to_json
    return json.to_json(
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/pandas/io/json/_json.py", line 181, in to_json
    s = writer(
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/pandas/io/json/_json.py", line 237, in __init__
    self._format_axes()
  File "/Users/gneubig/opt/anaconda3/envs/llm_compare/lib/python3.10/site-packages/pandas/io/json/_json.py", line 301, in _format_axes
    raise ValueError(
ValueError: DataFrame columns must be unique for orient='records'.
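The error means the dataframe fed to Zeno has two columns with the same name, so pandas refuses to serialize it with orient="records". A quick way to find and drop the offending columns:

```python
import pandas as pd

# to_json(orient="records") requires unique column names; find the
# offenders, then drop (or rename) the duplicates before visualizing.
df = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "a"])
dupes = df.columns[df.columns.duplicated()].tolist()
deduped = df.loc[:, ~df.columns.duplicated()]  # keeps the first "a"
```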

ChatBots: Improve example display

Currently the chatbot example display looks like this:

[screenshot: Screen Shot 2023-05-06 at 2 11 10 PM]

There are two problems:

  1. All system outputs are listed as "negative" for some reason, which is strange.
  2. It is only displaying one previous utterance, although in most cases two previous utterances worth of context are available.
