Comments (8)
Probably needs a supports_prefill = True option on the class here too:
Lines 243 to 248 in 9ad9ac6
Or should I call that can_prefill? Need a consistent naming convention here.
The image branch is currently using supports_images:
Lines 255 to 261 in eaf50d8
from llm.
I like the term "prefill" for this. I think it's a CLI option:
llm -m claude-3-opus 'JSON list of US state names' --prefill '["'
And a Python argument:
model = llm.get_model("claude-3-opus")
response = model.prompt("JSON list of US state names", prefill='["')
from llm.
I'm tempted to switch can_stream to supports_streaming for consistency with the new options. can_images doesn't sound as good as supports_images.
I also want a supports_system option, since some models don't support system prompts.
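A minimal sketch of how these capability flags might look on a model class, using the supports_* naming discussed above -- the attribute names and the example subclass are assumptions for illustration, not llm's actual API:

```python
# Hypothetical capability flags on a base model class, following the
# supports_* naming convention discussed above. Not llm's real API.
class Model:
    supports_streaming = False
    supports_images = False
    supports_system = True   # most models accept system prompts
    supports_prefill = False

class ClaudeOpus(Model):
    # An example model that opts in to streaming and prefill
    supports_streaming = True
    supports_prefill = True
```

A CLI front end could then check model.supports_prefill before accepting a --prefill option, and raise an error for models that leave it False.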
from llm.
A couple of things to consider:
- Some APIs might include the prefill in the response, whereas some might not (currently, Claude does not)
- It's possible to mock prefill through prompting, eg OpenAI:
In the latter case, you don't actually know whether the model will include the prefill in the response or not (OpenAI picks one or the other somewhat randomly).
Therefore, perhaps the design of this lib should be that the assistant response always includes the prefill -- and if it's not included in the API response, then it's added. And for APIs that don't support prefill (which I guess is anything that doesn't have a completion-based API -- for now, is that just OpenAI?) we modify the prompt to mock it.
Does that seem reasonable?
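For chat-style APIs without native prefill support, the mocking idea above could look something like this -- a sketch assuming OpenAI-style message dicts; the helper name is hypothetical:

```python
# Mock prefill through prompting: append a partial assistant message so a
# chat model continues from where it left off. Helper name is hypothetical.
def build_messages(prompt, prefill=None):
    messages = [{"role": "user", "content": prompt}]
    if prefill:
        # The model may or may not echo this prefill back in its response,
        # so the caller still needs to normalize the output afterwards.
        messages.append({"role": "assistant", "content": prefill})
    return messages
```

For example, build_messages('JSON list of US state names', prefill='["') would send the user prompt followed by an assistant message containing just '["'.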
from llm.
Another thought @simonw -- I think there's 4 possibilities for API support:
- Supported by the API, and it eats the prefill so you have to add it back
- Supported by the API, and it includes the prefill in the response
- Not supported by API, but reliably adds the prefill when asked (but may or may not include it in the response)
- Not supported by API, and we haven't found any prompt that results in it reliably adding a prefill
So instead of a supports_prefill bool, how about an enum with these 4 options? For 1 it adds back the prefill, for 2 it doesn't, for 3 it checks whether it's there and adds it if not, and for 4 it raises an exception if a prefill is requested.
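Sketching that enum, with a hypothetical normalize() helper showing how each of the four cases might be handled -- the names are illustrative, not a real llm API:

```python
from enum import Enum

# Hypothetical enum for the four levels of prefill support described above.
class PrefillSupport(Enum):
    NATIVE_STRIPPED = 1   # API supports prefill but omits it from the response
    NATIVE_INCLUDED = 2   # API supports prefill and echoes it back
    MOCKED = 3            # no API support; prompting works, echo is uncertain
    UNSUPPORTED = 4       # no reliable way to prefill at all

def normalize(support, prefill, response_text):
    """Return the response text with the prefill guaranteed to be present."""
    if support is PrefillSupport.NATIVE_STRIPPED:
        return prefill + response_text
    if support is PrefillSupport.NATIVE_INCLUDED:
        return response_text
    if support is PrefillSupport.MOCKED:
        # Case 3: check whether the prefill is already there, add it if not
        if response_text.startswith(prefill):
            return response_text
        return prefill + response_text
    raise ValueError("This model does not support prefill")
```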
from llm.
In an interesting twist... some of the OpenAI models apparently support this too! https://twitter.com/HamelHusain/status/1782149471624888512
I've found that many OpenAI users do not know about pre-fill with the Chat API
But... it looks like they are a little bit inconsistent about whether they continue the prompt without the prefill or if they answer with the prefill included:
https://twitter.com/HamelHusain/status/1782154898102211053
I found inconsistent behavior with the newest gpt-4-turbo that doesn't conform though (this is consistent across many runs)
from llm.
Another thought @simonw -- I think there's 4 possibilities for API support:
Supported by the API, and it eats the prefill so you have to add it back
Supported by the API, and it includes the prefill in the response
Not supported by API, but reliably adds the prefill when asked (but may or may not include it in the response)
Not supported by API, and we haven't found any prompt that results in it reliably adding a prefill
So instead of a supports_prefill bool, how about an enum with these 4 options? For 1 it adds back the prefill, for 2 it doesn't, for 3 it checks whether it's there and adds it if not, and for 4 it raises an exception if a prefill is requested.
The supports_prefill boolean will actually just let the CLI tool know if it should throw an error if the user passes --prefill "something" - I'll leave it to custom Python code in each model implementation to handle whether or not that prefill needs to be added to the response. I expect this will be a bit fiddly for the GPT ones - might even need to say "if the response starts with an exact match for the prefill then don't prepend the prefill again".
from llm.
I expect this will be a bit fiddly for the GPT ones - might even need to say "if the response starts with an exact match for the prefill then don't prepend the prefill again".
This is what we decided to do for OpenAI. I think it is a nice approach actually.
from llm.