jzbjyb / flare
Forward-Looking Active REtrieval-augmented generation (FLARE)
License: MIT License
The text-davinci-003 model seems to be unavailable now; which model should I use to run the experiments?
Hello, could you please provide information on where to obtain the StrategyQA and ASQA datasets?
What is the models module imported in prep.py?
Hi all, I am Jihyuk, a PhD student interested in retrieval-augmented LLMs.
I appreciate the open-sourcing of the code!
I am wondering whether the WikiAsp dataset used in the experiments can also be shared.
I noticed that the original, open-sourced WikiAsp dataset only includes summaries and reference documents.
However, it does not include the inputs, e.g., "Generate a summary about Joe Biden", which are needed for FLARE.
Best regards,
Jihyuk
As per the title.
The dump downloads as a file named psgs_w100.tsv, but it is actually gzip-compressed (psgs_w100.tsv.gz) and needs to be decompressed before use.
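As a workaround until the download script is fixed, the file can be decompressed from Python. A minimal sketch; decompress_dump and the output filename are illustrative, not part of the repository:

```python
import gzip
import shutil

def decompress_dump(src_path: str, dst_path: str) -> None:
    """Stream-decompress a gzip file without loading it all into memory."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

Renaming the downloaded file to psgs_w100.tsv.gz first also lets plain `gunzip` handle it.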
As per the title.
Thanks.
'text-davinci-002' and 'text-davinci-003' have been deprecated (https://platform.openai.com/docs/deprecations).
In templates.py, is only the turbo model supported? It seems that 'text-davinci-002' and 'text-davinci-003' can no longer generate tokens.
def truncate_at_prob(self, low: float):
    assert self.has_tokens, 'not supported'
    if self.num_tokens <= 1:
        return self
I ran into many JSONDecodeError errors when running ./openai.sh wikiasp configs/wikiasp_flare_config.json.
Has anyone else hit the same problem?
Process-2, Process-3, and Process-4 all fail with the same interleaved traceback; one representative copy is shown:

Traceback (most recent call last):
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 200, in retrieve
    ctx_ids, ctx_texts = self.retriever.retrieve(
  File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/retriever.py", line 98, in retrieve
    results: Dict[str, Dict[str, Tuple[float, str]]] = self.retriever.retrieve(
  File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/retriever.py", line 39, in retrieve
    all_results = search_bing_batch(
  File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/bing.py", line 47, in search_bing_batch
    results.append(search_bing_api(query, **kwargs))
  File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/bing.py", line 23, in search_bing_api
    response = response.json()
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 596, in query_agent_worker
    generations, probs, retrievals, traces = qagent.prompt(prompts, api_key=(get_key_func, return_key_func))
  File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 338, in prompt
    return self.ret_prompt(queries, api_key=api_key)
  File "/Users/takerufang/Desktop/MyProject/FLARE-main/src/openai_api.py", line 378, in ret_prompt
    ctx_ids, ctx_texts = self.retrieve(queries_to_issue, is_question=first_ret)
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/tenacity/__init__.py", line 326, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7faebc5a9bb0 state=finished raised JSONDecodeError>]

  0%|          | 0/500 [00:05<?, ?it/s]
INFO:root:keys performance
/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3432: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/Users/takerufang/miniconda3/envs/torch/lib/python3.8/site-packages/numpy/core/_methods.py:190: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
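"Expecting value: line 1 column 1 (char 0)" means the Bing endpoint returned a non-JSON body, which typically happens when the search API key is missing or invalid or the quota is exhausted; bing.py calls response.json() without checking the status code first. A hedged sketch of a guard that surfaces the raw body instead of an opaque JSONDecodeError (parse_bing_response is an illustrative helper, not part of the repository):

```python
import json

def parse_bing_response(status_code: int, body: str) -> dict:
    """Parse an API response, raising an informative error for non-JSON bodies."""
    if status_code != 200:
        raise RuntimeError(f"Bing API returned HTTP {status_code}: {body[:200]}")
    try:
        return json.loads(body)
    except json.JSONDecodeError as e:
        raise RuntimeError(f"Non-JSON response from Bing API: {body[:200]}") from e
```

Calling this with response.status_code and response.text in search_bing_api would make the failure visible before tenacity wraps it in a RetryError.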
The TSV file is tab-separated, but prep.py treats each file as JSON.
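For reference, a TSV passage file can be read with the csv module rather than json. A minimal sketch, assuming the usual DPR dump layout with id, text, and title columns (an assumption about the schema, not taken from prep.py):

```python
import csv

def read_passages(tsv_path):
    """Read a tab-separated passage file into a list of row dicts,
    using the first line as the header."""
    with open(tsv_path, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return list(reader)
```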
Hi, nice work!
I have a question about how to get a token's probability for confidence-based active retrieval. Can it be obtained from OpenAI's API, or do we need another white-box model to compute this probability?
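OpenAI's Completions API exposes per-token log probabilities via the `logprobs` request parameter, so no separate white-box model is needed; a token's probability is then exp(logprob). The thresholding step that confidence-based active retrieval needs can be sketched as follows (the function name and default threshold are illustrative, not taken from the repository):

```python
import math

def low_confidence_tokens(token_logprobs, theta=0.5):
    """Return indices of tokens whose probability exp(logprob) falls
    below theta -- the condition that triggers a retrieval step."""
    return [i for i, lp in enumerate(token_logprobs) if math.exp(lp) < theta]
```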
Hello, I'd like to inquire about where I can find the 500 examples from the experimental 2WikiMultihopQA dataset.
How do you get token probabilities through OpenAI's API?
I would like to try a new dataset in place of the Wikipedia dump. It seems that I need to convert the documents into the .tsv format and make some adjustments to the code. Could you give some brief instructions?
Good work! Could you provide the evaluation code for the results presented in the paper (exact match, F1, recall, and precision)? Thanks!
Hi~
In the paper, it is mentioned that "We retrieve using each generated question and interleave the returned documents into a single ranking list to aid future generations."
I am curious how the document lists returned for multiple queries are merged into a single ranking list.
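One common reading of "interleave" is a round-robin merge with de-duplication: take the rank-1 document of every query, then the rank-2 documents, and so on, keeping only the first occurrence of each document. A minimal sketch of that interpretation (not necessarily the repository's exact implementation):

```python
def interleave_rankings(ranked_lists):
    """Round-robin merge several ranked document-ID lists into one,
    skipping duplicates so the earliest occurrence keeps its rank."""
    merged, seen = [], set()
    for rank in range(max(map(len, ranked_lists), default=0)):
        for lst in ranked_lists:
            if rank < len(lst) and lst[rank] not in seen:
                seen.add(lst[rank])
                merged.append(lst[rank])
    return merged
```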