Comments (13)
Env vars' naming conventions for initialising models:
- In general it should follow DOCQ_{VENDOR}_{USECASE}_{OBJECT} in all uppercase, for instance DOCQ_OPENAI_API_KEY.
- Sticking to the above format avoids using any default env var names expected by LlamaIndex or LangChain - being explicit here is intentional.
For 3rd-party model vendors, available {VENDOR} values are a single vendor string, such as:
- OPENAI
- COHERE
- ANTHROPIC
For cloud vendors, {VENDOR} consists of two parts, {CLOUDVENDOR} and {MODELVENDOR}, separated by a _, such as:
- AWS_TITAN
- GCP_PALM
- AZURE_OPENAI
- AWS_LLAMA
- AWS_FALCON
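The convention above can be sketched in a few lines. This is a minimal illustration, not actual Docq code; the helper name docq_env_var is hypothetical:

```python
import os

def docq_env_var(vendor: str, usecase: str, obj: str) -> str:
    """Compose an env var name per the DOCQ_{VENDOR}_{USECASE}_{OBJECT} convention."""
    return "_".join(["DOCQ", vendor, usecase, obj]).upper()

# A 3rd-party vendor is a single vendor string:
chat_key_var = docq_env_var("openai", "api", "key")           # "DOCQ_OPENAI_API_KEY"
# A cloud vendor is {CLOUDVENDOR}_{MODELVENDOR}:
azure_base_var = docq_env_var("azure_openai", "api", "base")  # "DOCQ_AZURE_OPENAI_API_BASE"

# Look up the value at startup (None if unset):
api_key = os.environ.get(chat_key_var)
```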
from docq.
Decide on env var naming convention for models
from docq.
So for Azure OpenAI it will be the following - make sense?
DOCQ_AZURE_OPENAI_API_KEY1
DOCQ_AZURE_OPENAI_API_KEY2
DOCQ_AZURE_OPENAI_API_BASE
DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME or DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENTNAME?
DOCQ_AZURE_OPENAI_CHAT_MODEL_NAME
DOCQ_AZURE_OPENAI_TEXT_DEPLOYMENT_NAME
DOCQ_AZURE_OPENAI_TEXT_MODEL_NAME
Probably a good idea to have the name of the deployed model passed through explicitly.
LangChain also has the following as env vars, but I don't think they're needed because neither has an explicit implication on infra resources, so I will not pass them in.
DOCQ_AZURE_OPENAI_API_TYPE
DOCQ_AZURE_OPENAI_API_VERSION
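At startup the app could gather these proposed vars and fail fast if one is missing. A sketch only, assuming the env var names proposed in this comment; the helper is hypothetical, not actual Docq code:

```python
import os

# Env var names as proposed above (chat-only subset for brevity).
REQUIRED_AZURE_VARS = [
    "DOCQ_AZURE_OPENAI_API_KEY1",
    "DOCQ_AZURE_OPENAI_API_BASE",
    "DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "DOCQ_AZURE_OPENAI_CHAT_MODEL_NAME",
]

def load_azure_openai_config(env=os.environ) -> dict:
    """Collect required settings, raising early when any are missing."""
    missing = [name for name in REQUIRED_AZURE_VARS if name not in env]
    if missing:
        raise ValueError(f"Missing env vars: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_AZURE_VARS}
```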
from docq.
DOCQ_AZURE_OPENAI_API_KEY1
DOCQ_AZURE_OPENAI_API_KEY2
DOCQ_AZURE_OPENAI_API_BASE
DOCQ_AZURE_OPENAI_API_VERSION
all good above
Re model name, no need to supply it - the same way that with OpenAI, we select an actual model on the fly within the application code.
Re deployment name, it depends on the relationship between models and deployments. If it's 1:1 then there's no need, but we do need a separate convention internally in the application code to assume the same (or similar) model/deployment name. If it's 1 deployment with N models, then yes to a deployment name, however it should be
DOCQ_AZURE_OPENAI_DEPLOYMENT_NAME
Happy to chat to clarify
from docq.
Re model name - a deployment is tied to one model. The deployment name (the Azure resource name) can be anything, but at the moment I'm setting the deployment name = model name. This will work fine unless we need to deploy two instances of the same model, like a dev and a prod, or need to partition for some other reason. I don't know what these use cases could be.
Re deployment name - if we go with DOCQ_AZURE_OPENAI_DEPLOYMENT_NAME, are we not going to support multiple models at the same time? I understood that we needed this.
So have gpt-35-turbo and text-davinci-003 models available in parallel so app code can use both?
from docq.
Also, why include DOCQ_AZURE_OPENAI_API_VERSION when there's no infra dependency? From what I can tell you can switch to an older OpenAI API version client-side with no changes on the infra.
To be clear, this is separate from the model version - those are set as infra config.
from docq.
Also, why include DOCQ_AZURE_OPENAI_API_VERSION when there's no infra dependency? From what I can tell you can switch to an older OpenAI API version client-side with no changes on the infra.
If this is not attached to the OpenAI service or deployment at creation time, then we can lose it.
If it's 1:1 between a deployment and an actual model, then my view is don't set it, just to be consistent, and let application code define which model (which in this case implies the deployment name) to use.
from docq.
@cwang I think you missed answering my second question.
Are we going to support two or more models at the same time from a single provider? I understood that we needed this.
Example: have 'gpt-35-turbo' and 'text-davinci-003' models available in parallel from Azure OpenAI so app functionality can be built that use both?
from docq.
Yes, because it's 1:N between API key and deployments, I believe?
from docq.
OK, then we have two options (at least for Azure):
- A deployment name env var per model (each deployment is a single model, 1:1), something like DOCQ_AZURE_OPENAI_TEXT_DEPLOYMENT_NAME and DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME. The model name is implied, but we can use the model name to set the deployment name. This limits us to only having one deployment per model (probably fine).
- A single env var for deployments holding JSON with all the deployment names. A little unconventional, but this is nice because it can also explicitly contain the model name. We are only talking about a small JSON string, so env var char limits shouldn't be an issue. E.g.:
DOCQ_AZURE_OPENAI_DEPLOYMENT_NAMES = [{"dname": "textdavinci-dep1", "mname": "text-davinci-003"}, {"dname": "gpt35turbo-dep1", "mname": "gpt-35-turbo"}]
dname* - deployment name
mname - model name
*keeping names short to save on chars. PaaS's like Azure App Service pass in a load of env vars, and the char limit is shared.
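Option 2 would be straightforward to consume in app code. A sketch, assuming the env var name and the dname/mname keys proposed above (note the value must use double quotes to be valid JSON):

```python
import json
import os

# Fallback value mirroring the example in this thread; used if the env var is unset.
default = (
    '[{"dname": "textdavinci-dep1", "mname": "text-davinci-003"},'
    ' {"dname": "gpt35turbo-dep1", "mname": "gpt-35-turbo"}]'
)
deployments = json.loads(os.environ.get("DOCQ_AZURE_OPENAI_DEPLOYMENT_NAMES", default))

# App code can then resolve a deployment from a model name:
model_to_deployment = {d["mname"]: d["dname"] for d in deployments}
```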
from docq.
deployment name env var per model (each deployment is a single model, 1:1), something like DOCQ_AZURE_OPENAI_TEXT_DEPLOYMENT_NAME and DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME. The model name is implied, but we can use the model name to set the deployment name. This limits us to only having one deployment per model (probably fine).
The answer to this is my previous question about the relationship between a deployment and the models belonging to it. Is it 1:1 or 1:N? Can we get a definitive answer? Also, I believe API keys are at the OpenAI service level, therefore it's 1:N between keys and deployments. Can we also confirm that?
I don't think we should worry about deployment names and models - as I suggested earlier, use naming conventions in application code to handle it, e.g. with the assumption that 1 deployment contains 1 model, both named identically. Think about how we provision OpenAI (the 3rd-party one) - again, we don't specify actual models in env vars.
In short, env vars should be considered just enough to get everything started in this case. This relaxed approach leaves the application (system settings) to dictate which model(s) to initiate.
from docq.
deployment:model is 1:1.
Per PR #34, chose option 1 from above.
Changed to using the deployment name = model name convention. Therefore the deployment name will not be passed in as an env var. Only the following three are set for Azure:
DOCQ_AZURE_OPENAI_API_KEY1
DOCQ_AZURE_OPENAI_API_KEY2
DOCQ_AZURE_OPENAI_API_BASE
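With the deployment name = model name convention, app code derives the deployment from whichever model it selects. A minimal sketch, assuming the three env vars above; the helper names are hypothetical, not actual Docq code:

```python
import os

def azure_deployment_for(model_name: str) -> str:
    """Deployment name == model name under the chosen 1:1 convention."""
    return model_name

def azure_openai_settings(model_name: str) -> dict:
    """Assemble client settings from the three env vars set for Azure."""
    return {
        "api_key": os.environ["DOCQ_AZURE_OPENAI_API_KEY1"],
        "api_base": os.environ["DOCQ_AZURE_OPENAI_API_BASE"],
        "deployment_name": azure_deployment_for(model_name),
    }
```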
from docq.
HuggingFace Inference Client
Client works across HF (free) Inference API or self-hosted Inference Endpoints.
class huggingface_hub.InferenceClient(model: typing.Optional[str] = None, token: typing.Optional[str] = None, timeout: typing.Optional[float] = None)
Parameters:
model
(str, optional) — The model to run inference with. Can be a model id hosted on the Hugging Face Hub, e.g. bigcode/starcoder or a URL to a deployed Inference Endpoint. Defaults to None, in which case a recommended model is automatically selected for the task.
token
(str, optional) — Hugging Face token. Will default to the locally saved token.
timeout
(float, optional) — The maximum number of seconds to wait for a response from the server. Loading a new model in Inference API can take up to several minutes. Defaults to None, meaning it will loop until the server is available.
Initialize a new Inference Client.
InferenceClient aims to provide a unified experience to perform inference. The client can be used seamlessly with either the (free) Inference API or self-hosted Inference Endpoints.
conversational(text: str, generated_responses: typing.Optional[typing.List[str]] = None, past_user_inputs: typing.Optional[typing.List[str]] = None, parameters: typing.Optional[typing.Dict[str, typing.Any]] = None, model: typing.Optional[str] = None) → Dict
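For illustration, the conversational task sends a payload built from those same arguments. A sketch of assembling it by hand, assuming the input keys documented for HF's legacy conversational task (text, past_user_inputs, generated_responses) - treat the exact shape as an assumption, since the client normally builds this for you:

```python
from typing import Any, Dict, List, Optional

def build_conversational_payload(
    text: str,
    generated_responses: Optional[List[str]] = None,
    past_user_inputs: Optional[List[str]] = None,
    parameters: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Mirror conversational()'s arguments as an Inference API request body."""
    payload: Dict[str, Any] = {
        "inputs": {
            "text": text,
            "past_user_inputs": past_user_inputs or [],
            "generated_responses": generated_responses or [],
        }
    }
    if parameters:
        payload["parameters"] = parameters
    return payload

payload = build_conversational_payload(
    "Which movie do you recommend?",
    past_user_inputs=["Hi there"],
    generated_responses=["Hello! How can I help?"],
)
```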
from docq.