Comments (13)
Env vars' naming conventions for initialising models:
- In general it should follow DOCQ_{VENDOR}_{USECASE}_{OBJECT} in all uppercase, for instance DOCQ_OPENAI_API_KEY.
- Sticking to the above format avoids using any default env var names expected by LlamaIndex or LangChain - being explicit here is intentional.
For 3rd-party model vendors, available {VENDOR} values are a single vendor string, such as:
- OPENAI
- COHERE
- ANTHROPIC
For cloud vendors, {VENDOR} consists of two parts, {CLOUDVENDOR} and {MODELVENDOR}, separated by a _, such as:
- AWS_TITAN
- GCP_PALM
- AZURE_OPENAI
- AWS_LLAMA
- AWS_FALCON
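The convention above can be sketched in a few lines. This is a minimal illustration, not actual Docq code; the helper name docq_env_var is hypothetical:

```python
import os

def docq_env_var(vendor: str, usecase: str, obj: str) -> str:
    """Compose an env var name per the DOCQ_{VENDOR}_{USECASE}_{OBJECT} convention."""
    return "_".join(["DOCQ", vendor, usecase, obj]).upper()

# A 3rd-party vendor is a single vendor string:
chat_key_var = docq_env_var("openai", "api", "key")           # "DOCQ_OPENAI_API_KEY"
# A cloud vendor is {CLOUDVENDOR}_{MODELVENDOR}:
azure_base_var = docq_env_var("azure_openai", "api", "base")  # "DOCQ_AZURE_OPENAI_API_BASE"

# Look up the value at startup (None if unset):
api_key = os.environ.get(chat_key_var)
```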
from docq.
Decide on env var naming convention for models
from docq.
So for Azure OpenAI it will be the following - make sense?
DOCQ_AZURE_OPENAI_API_KEY1
DOCQ_AZURE_OPENAI_API_KEY2
DOCQ_AZURE_OPENAI_API_BASE
DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME or DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENTNAME?
DOCQ_AZURE_OPENAI_CHAT_MODEL_NAME
DOCQ_AZURE_OPENAI_TEXT_DEPLOYMENT_NAME
DOCQ_AZURE_OPENAI_TEXT_MODEL_NAME
Probably a good idea to have the name of the deployed model passed through explicitly.
LangChain also has the following as env vars, but I don't think they're needed because neither has an explicit implication on infra resources, so I will not pass them in.
DOCQ_AZURE_OPENAI_API_TYPE
DOCQ_AZURE_OPENAI_API_VERSION
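At startup the app could gather these proposed vars and fail fast if one is missing. A sketch only, assuming the env var names proposed in this comment; the helper is hypothetical, not actual Docq code:

```python
import os

# Env var names as proposed above (chat-only subset for brevity).
REQUIRED_AZURE_VARS = [
    "DOCQ_AZURE_OPENAI_API_KEY1",
    "DOCQ_AZURE_OPENAI_API_BASE",
    "DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "DOCQ_AZURE_OPENAI_CHAT_MODEL_NAME",
]

def load_azure_openai_config(env=os.environ) -> dict:
    """Collect required settings, raising early when any are missing."""
    missing = [name for name in REQUIRED_AZURE_VARS if name not in env]
    if missing:
        raise ValueError(f"Missing env vars: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_AZURE_VARS}
```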
from docq.
DOCQ_AZURE_OPENAI_API_KEY1
DOCQ_AZURE_OPENAI_API_KEY2
DOCQ_AZURE_OPENAI_API_BASE
DOCQ_AZURE_OPENAI_API_VERSION
all good above
Re model name, no need to supply it - the same way that with OpenAI, we select an actual model on the fly within the application code.
Re deployment name, it depends on the relationship between models and deployments. If it's 1:1 then there's no need, but we do need a separate convention internally in the application code to assume the same (or similar) model/deployment name. If it's 1 deployment with N models, then yes to a deployment name, however it should be
DOCQ_AZURE_OPENAI_DEPLOYMENT_NAME
Happy to chat to clarify
from docq.
Re model name - a deployment is tied to one model. The deployment name (the Azure resource name) can be anything, but at the moment I'm setting the deployment name = model name. This will work fine unless we need to deploy two instances of the same model, like a dev and a prod, or need to partition for some other reason. I don't know what these use cases could be.
Re deployment name - if we go with DOCQ_AZURE_OPENAI_DEPLOYMENT_NAME, are we not going to support multiple models at the same time? I understood that we needed this.
So have gpt-35-turbo and text-davinci-003 models available in parallel so app code can use both?
from docq.
Also, why include DOCQ_AZURE_OPENAI_API_VERSION when there's no infra dependency? From what I can tell you can switch to an older OpenAI API version client-side with no changes on the infra.
To be clear, this is separate from the model version - those are set as infra config.
from docq.
Also, why include DOCQ_AZURE_OPENAI_API_VERSION when there's no infra dependency? From what I can tell you can switch to an older OpenAI API version client-side with no changes on the infra.
If this is not attached to the OpenAI service or deployment at creation time, then we can lose it.
If it's 1:1 between a deployment and an actual model, then my view is don't set it, just to be consistent, and let application code define which model (which in this case implies the deployment name) to use.
from docq.
@cwang I think you missed answering my second question.
Are we going to support two or more models at the same time from a single provider? I understood that we needed this.
Example: have 'gpt-35-turbo' and 'text-davinci-003' models available in parallel from Azure OpenAI so app functionality can be built that use both?
from docq.
Yes, because it's 1:N between API key and deployments, I believe?
from docq.
OK, then we have two options (at least for Azure):
- A deployment name env var per model (each deployment is a single model, 1:1), something like DOCQ_AZURE_OPENAI_TEXT_DEPLOYMENT_NAME and DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME. The model name is implied, but we can use the model name to set the deployment name. This limits us to only having one deployment per model (probably fine).
- A single env var for deployments holding JSON with all the deployment names. A little unconventional, but this is nice because it can also explicitly contain the model name. We are only talking about a small JSON string, so env var char limits shouldn't be an issue. E.g.:
DOCQ_AZURE_OPENAI_DEPLOYMENT_NAMES = [{"dname": "textdavinci-dep1", "mname": "text-davinci-003"}, {"dname": "gpt35turbo-dep1", "mname": "gpt-35-turbo"}]
dname* - deployment name
mname - model name
*keeping names short to save on chars. PaaS's like Azure App Service pass in a load of env vars, and the char limit is shared.
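Option 2 would be straightforward to consume in app code. A sketch, assuming the env var name and the dname/mname keys proposed above (note the value must use double quotes to be valid JSON):

```python
import json
import os

# Fallback value mirroring the example in this thread; used if the env var is unset.
default = (
    '[{"dname": "textdavinci-dep1", "mname": "text-davinci-003"},'
    ' {"dname": "gpt35turbo-dep1", "mname": "gpt-35-turbo"}]'
)
deployments = json.loads(os.environ.get("DOCQ_AZURE_OPENAI_DEPLOYMENT_NAMES", default))

# App code can then resolve a deployment from a model name:
model_to_deployment = {d["mname"]: d["dname"] for d in deployments}
```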
from docq.
deployment name env var per model (each deployment is a single model, 1:1), something like DOCQ_AZURE_OPENAI_TEXT_DEPLOYMENT_NAME and DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME. The model name is implied, but we can use the model name to set the deployment name. This limits us to only having one deployment per model (probably fine).
The answer to this is my previous question about the relationship between a deployment and the models belonging to it. Is it 1:1 or 1:N? Can we get a definitive answer? Also, I believe API keys are at the OpenAI service level, therefore it's 1:N between keys and deployments. Can we also confirm that?
I don't think we should worry about deployment names and models - as I suggested earlier, use naming conventions in application code to handle it, e.g. with the assumption that 1 deployment contains 1 model, both named identically. Think about how we provision OpenAI (the 3rd-party one) - again, we don't specify actual models in env vars.
In short, env vars should be considered just enough to get everything started in this case. This relaxed approach leaves the application (system settings) to dictate which model(s) to initiate.
from docq.
deployment:model is 1:1.
Per PR #34, chose option 1 from above.
Changed to using the deployment name = model name convention. Therefore the deployment name will not be passed in as an env var. Only the following three are set for Azure:
DOCQ_AZURE_OPENAI_API_KEY1
DOCQ_AZURE_OPENAI_API_KEY2
DOCQ_AZURE_OPENAI_API_BASE
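With the deployment name = model name convention, app code derives the deployment from whichever model it selects. A minimal sketch, assuming the three env vars above; the helper names are hypothetical, not actual Docq code:

```python
import os

def azure_deployment_for(model_name: str) -> str:
    """Deployment name == model name under the chosen 1:1 convention."""
    return model_name

def azure_openai_settings(model_name: str) -> dict:
    """Assemble client settings from the three env vars set for Azure."""
    return {
        "api_key": os.environ["DOCQ_AZURE_OPENAI_API_KEY1"],
        "api_base": os.environ["DOCQ_AZURE_OPENAI_API_BASE"],
        "deployment_name": azure_deployment_for(model_name),
    }
```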
from docq.
HuggingFace Inference Client
Client works across HF (free) Inference API or self-hosted Inference Endpoints.
class huggingface_hub.InferenceClient(model: typing.Optional[str] = None, token: typing.Optional[str] = None, timeout: typing.Optional[float] = None)
Parameters:
model
(str, optional) — The model to run inference with. Can be a model id hosted on the Hugging Face Hub, e.g. bigcode/starcoder or a URL to a deployed Inference Endpoint. Defaults to None, in which case a recommended model is automatically selected for the task.
token
(str, optional) — Hugging Face token. Will default to the locally saved token.
timeout
(float, optional) — The maximum number of seconds to wait for a response from the server. Loading a new model in Inference API can take up to several minutes. Defaults to None, meaning it will loop until the server is available.
Initialize a new Inference Client.
InferenceClient aims to provide a unified experience to perform inference. The client can be used seamlessly with either the (free) Inference API or self-hosted Inference Endpoints.
conversational(text: str, generated_responses: typing.Optional[typing.List[str]] = None, past_user_inputs: typing.Optional[typing.List[str]] = None, parameters: typing.Optional[typing.Dict[str, typing.Any]] = None, model: typing.Optional[str] = None) → Dict
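For illustration, the conversational task sends a payload built from those same arguments. A sketch of assembling it by hand, assuming the input keys documented for HF's legacy conversational task (text, past_user_inputs, generated_responses) - treat the exact shape as an assumption, since the client normally builds this for you:

```python
from typing import Any, Dict, List, Optional

def build_conversational_payload(
    text: str,
    generated_responses: Optional[List[str]] = None,
    past_user_inputs: Optional[List[str]] = None,
    parameters: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Mirror conversational()'s arguments as an Inference API request body."""
    payload: Dict[str, Any] = {
        "inputs": {
            "text": text,
            "past_user_inputs": past_user_inputs or [],
            "generated_responses": generated_responses or [],
        }
    }
    if parameters:
        payload["parameters"] = parameters
    return payload

payload = build_conversational_payload(
    "Which movie do you recommend?",
    past_user_inputs=["Hi there"],
    generated_responses=["Hello! How can I help?"],
)
```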
from docq.