flare's Issues

Could you share WikiAsp dataset used in the experiments?

Hi all, I am Jihyuk, a PhD student interested in retrieval-augmented LLMs.
Thank you for open-sourcing the code!

I am wondering whether the WikiAsp dataset used in the experiments could also be shared.

I noticed that the original open-source WikiAsp dataset includes only summaries and reference documents.
It does not include the inputs, e.g., "Generate a summary about Joe Biden", which FLARE needs.

Best regards,
Jihyuk

So many errors about JSONDecodeError

I get many JSONDecodeError errors when running ./openai.sh wikiasp configs/wikiasp_flare_config.json
Has anyone else run into the same problem?

Process Process-3:
Process Process-4:
Process Process-2:
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/requests/models.py", line 971, in json
return complexjson.loads(self.text, **kwargs)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/init.py", line 357, in loads
return _default_decoder.decode(s)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/requests/models.py", line 971, in json
return complexjson.loads(self.text, **kwargs)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/init.py", line 357, in loads
return _default_decoder.decode(s)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/requests/models.py", line 971, in json
return complexjson.loads(self.text, **kwargs)
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/init.py", line 357, in loads
return _default_decoder.decode(s)
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None

During handling of the above exception, another exception occurred:

File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 382, in call
result = fn(*args, **kwargs)
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 200, in retrieve
ctx_ids, ctx_texts = self.retriever.retrieve(
Traceback (most recent call last):
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/retriever.py", line 98, in retrieve
results: Dict[str, Dict[str, Tuple[float, str]]] = self.retriever.retrieve(
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/retriever.py", line 39, in retrieve
all_results = search_bing_batch(
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/bing.py", line 47, in search_bing_batch
results.append(search_bing_api(query, **kwargs))
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/bing.py", line 23, in search_bing_api
response = response.json()
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/requests/models.py", line 975, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 382, in call
result = fn(*args, **kwargs)

The above exception was the direct cause of the following exception:

File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 200, in retrieve
ctx_ids, ctx_texts = self.retriever.retrieve(
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/retriever.py", line 98, in retrieve
results: Dict[str, Dict[str, Tuple[float, str]]] = self.retriever.retrieve(
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/retriever.py", line 39, in retrieve
all_results = search_bing_batch(
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/bing.py", line 47, in search_bing_batch
results.append(search_bing_api(query, **kwargs))
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/bing.py", line 23, in search_bing_api
response = response.json()
Traceback (most recent call last):
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/requests/models.py", line 975, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

The above exception was the direct cause of the following exception:

File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
Traceback (most recent call last):
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 382, in call
result = fn(*args, **kwargs)
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 596, in query_agent_worker
generations, probs, retrievals, traces = qagent.prompt(prompts, api_key=(get_key_func, return_key_func))
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 200, in retrieve
ctx_ids, ctx_texts = self.retriever.retrieve(
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 338, in prompt
return self.ret_prompt(queries, api_key=api_key)
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 378, in ret_prompt
ctx_ids, ctx_texts = self.retrieve(queries_to_issue, is_question=first_ret)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 379, in call
do = self.iter(retry_state=retry_state)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/retriever.py", line 98, in retrieve
results: Dict[str, Dict[str, Tuple[float, str]]] = self.retriever.retrieve(
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
line 326, in iter
raise retry_exc from fut.exception()
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/retriever.py", line 39, in retrieve
all_results = search_bing_batch(
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 596, in query_agent_worker
generations, probs, retrievals, traces = qagent.prompt(prompts, api_key=(get_key_func, return_key_func))
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 338, in prompt
return self.ret_prompt(queries, api_key=api_key)
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 378, in ret_prompt
ctx_ids, ctx_texts = self.retrieve(queries_to_issue, is_question=first_ret)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 379, in call
do = self.iter(retry_state=retry_state)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 326, in iter
raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7faebc5a9bb0 state=finished raised JSONDecodeError>]
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/bing.py", line 47, in search_bing_batch
results.append(search_bing_api(query, **kwargs))
tenacity.RetryError: RetryError[<Future at 0x7f8f9bf6abb0 state=finished raised JSONDecodeError>]
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/bing.py", line 23, in search_bing_api
response = response.json()
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/requests/models.py", line 975, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 596, in query_agent_worker
generations, probs, retrievals, traces = qagent.prompt(prompts, api_key=(get_key_func, return_key_func))
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 338, in prompt
return self.ret_prompt(queries, api_key=api_key)
File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 378, in ret_prompt
ctx_ids, ctx_texts = self.retrieve(queries_to_issue, is_question=first_ret)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 379, in call
do = self.iter(retry_state=retry_state)
File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/init.py", line 326, in iter
raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7fc435619b20 state=finished raised JSONDecodeError>]
0%| | 0/500 [00:05<?, ?it/s]
INFO:root:keys performance
/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3432: RuntimeWarning: Mean of empty slice.
return _methods._mean(a, axis=axis, dtype=dtype,
/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/numpy/core/_methods.py:190: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
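
The message "Expecting value: line 1 column 1 (char 0)" means the body passed to the JSON decoder did not start with JSON at all, which usually points to an empty body or a plain-text/HTML error page rather than a search result. With search APIs a missing or invalid key or a rate limit is a common cause, though nothing in the log confirms which applies here. One way to find out is to check the status code and body before decoding. The sketch below is not the code in src/bing.py; `parse_search_response` and `SearchResponseError` are hypothetical names introduced for illustration.

```python
import json


class SearchResponseError(RuntimeError):
    """Raised when a search response cannot be decoded (hypothetical helper)."""


def parse_search_response(status_code: int, body: str) -> object:
    """Decode a response body as JSON, failing with a readable message.

    Takes the status code and raw text so the real cause (auth failure,
    rate limit, empty body) is visible instead of a bare decode error.
    """
    if status_code != 200:
        raise SearchResponseError(f"HTTP {status_code}: {body[:200]!r}")
    if not body.strip():
        raise SearchResponseError("HTTP 200 but the body is empty")
    try:
        return json.loads(body)
    except json.JSONDecodeError as err:
        raise SearchResponseError(f"body is not JSON: {body[:200]!r}") from err
```

With requests, this could be called as parse_search_response(response.status_code, response.text) in place of response.json(), so the log shows the status and the first characters of the body.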

About confidence-based Active Retrieval

Hi, nice work!
I have a question about how to obtain a token's probability for confidence-based active retrieval. Can it be obtained from OpenAI's API, or do we need a separate white-box model to compute it?
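
OpenAI's completions endpoint can return per-token log-probabilities through its `logprobs` request parameter, so a separate white-box model is not necessarily required; whether a particular model still returns them should be checked against the current API reference. Given those log-probabilities, the confidence check itself is small. The sketch below assumes retrieval is triggered when any token in the tentative sentence falls below a probability threshold; the function names and the threshold value are illustrative, not taken from the repository.

```python
import math
from typing import List, Sequence, Tuple


def low_confidence_tokens(
    tokens: Sequence[str], logprobs: Sequence[float], threshold: float
) -> List[Tuple[str, float]]:
    """Return (token, probability) pairs whose probability is below threshold."""
    pairs = ((tok, math.exp(lp)) for tok, lp in zip(tokens, logprobs))
    return [(tok, p) for tok, p in pairs if p < threshold]


def should_retrieve(
    tokens: Sequence[str], logprobs: Sequence[float], threshold: float
) -> bool:
    """Trigger retrieval if at least one token is below the threshold."""
    return bool(low_confidence_tokens(tokens, logprobs, threshold))


# Made-up log-probabilities for illustration only.
tokens = ["Joe", " Biden", " was", " born", " in", " 1942"]
logprobs = [-0.01, -0.02, -0.05, -0.10, -0.03, -1.60]
print(should_retrieve(tokens, logprobs, threshold=0.4))
```

The low-probability tokens returned by low_confidence_tokens are also the natural candidates to mask or turn into a question when forming the retrieval query.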

How to add a new dataset?

I would like to try a new dataset in place of the wiki dump. It seems that I need to convert the documents into .tsv format and make some adjustments to the code. Could you provide some simple instructions?
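
For the conversion step, a minimal sketch follows. It assumes the three-column layout used by DPR-style Wikipedia dumps (a header row `id`, `text`, `title`, tab-separated); whether FLARE's indexing code expects exactly this layout is an assumption that should be checked against the repository before use. The function names are hypothetical.

```python
import csv
import io
from typing import Iterable, TextIO, Tuple


def write_passages_tsv(passages: Iterable[Tuple[str, str]], out: TextIO) -> int:
    """Write (title, text) pairs as id<TAB>text<TAB>title rows; return the row count."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "text", "title"])  # assumed header, see note above
    count = 0
    for count, (title, text) in enumerate(passages, start=1):
        # Collapse tabs and newlines so each passage stays on one line.
        clean = " ".join(text.split())
        writer.writerow([count, clean, title])
    return count


def passages_to_tsv(passages: Iterable[Tuple[str, str]]) -> str:
    """Convenience wrapper that returns the TSV content as a string."""
    buf = io.StringIO()
    write_passages_tsv(passages, buf)
    return buf.getvalue()
```

To write a file instead, open it with open(path, "w", encoding="utf-8", newline="") and pass the handle to write_passages_tsv.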

Code for evaluation

Good work! Could you provide the evaluation code for the results presented in the paper (exact match, F1, recall, and precision)? Thanks!
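
Until the official scripts are available, these metrics can be approximated with the widely used SQuAD-style definitions: answers are lowercased, stripped of punctuation and English articles, and compared as token multisets. The sketch below is not the authors' evaluation code, and the paper's numbers may rely on different normalization or answer aliases.

```python
import re
import string
from collections import Counter
from typing import Tuple

_PUNCT = set(string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in _PUNCT)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> bool:
    """True when the normalized strings are identical."""
    return normalize(prediction) == normalize(gold)


def token_prf(prediction: str, gold: str) -> Tuple[float, float, float]:
    """Token-level (precision, recall, F1) against a single gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

When a question has several gold answers, the usual convention is to score the prediction against each and keep the maximum.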

How to merge document lists retrieved from multiple queries?

Hi,
The paper says: "We retrieve using each generated question and interleave the returned documents into a single ranking list to aid future generations."
I am curious how the document lists retrieved for multiple queries are merged into a single ranking list.
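
One plausible reading of "interleave" is a round-robin merge: take the top document of each query's list, then the second of each, and so on, skipping documents already taken. The sketch below implements that reading only; it is an assumption about the procedure, not a description of the repository's code, and `interleave_rankings` is a hypothetical name.

```python
from itertools import zip_longest
from typing import List, Optional, Sequence


def interleave_rankings(
    rankings: Sequence[Sequence[str]], top_k: Optional[int] = None
) -> List[str]:
    """Round-robin merge of per-query ranked document ids, without duplicates."""
    merged: List[str] = []
    seen = set()
    # zip_longest walks rank 1 of every list, then rank 2, and so on,
    # padding shorter lists with None.
    for same_rank in zip_longest(*rankings):
        for doc_id in same_rank:
            if doc_id is None or doc_id in seen:
                continue
            seen.add(doc_id)
            merged.append(doc_id)
    return merged[:top_k]


print(interleave_rankings([["d1", "d2", "d3"], ["d2", "d4"], ["d5"]]))
# → ['d1', 'd2', 'd5', 'd4', 'd3']
```

A score-based merge (sorting all documents by retrieval score) is an alternative, but scores from different queries are not always comparable, which is one reason a rank-based interleave is attractive.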
