[BUG] Exception During Model Logging with Custom Ollama Class

sreekarreddydfci commented on June 9, 2024

[BUG] Exception During Model Logging with Custom Ollama Class

Comments (6)

harupy commented on June 9, 2024

@sreekarreddydfci A quick fix for this is to explicitly specify pip requirements, which skips the requirement inference:

mlflow.langchain.log_model(..., pip_requirements=[...])
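A minimal sketch of how the explicit list might be built, assuming the packages named below are the ones the chain needs. The package list and the `pinned_requirements` helper are illustrative and not part of MLflow; `chain` and `"model"` in the commented call are placeholders, and the call itself is left as a comment because it needs a live MLflow setup.

```python
from importlib import metadata


def pinned_requirements(packages):
    """Pin each package to its installed version; fall back to the bare name."""
    pins = []
    for name in packages:
        try:
            pins.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Not installed in this environment: leave it unpinned.
            pins.append(name)
    return pins


# Hypothetical package list; adjust to whatever the chain actually imports.
requirements = pinned_requirements(["mlflow", "langchain", "langchain-community"])
print(requirements)

# Passing the list skips requirement inference, as suggested above:
# mlflow.langchain.log_model(chain, "model", pip_requirements=requirements)
```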

harupy commented on June 9, 2024

@sreekarreddydfci What are the extra fields that pydantic complains about?

sreekarreddydfci commented on June 9, 2024

@harupy I used the quick fix, and it resolved the pip requirements inference issue.

When I tried loading the logged model,

loaded_model = mlflow.langchain.load_model(logged_model.model_uri)

Got the following error:

ValidationError: 1 validation error for Ollama
options
  extra fields not permitted (type=value_error.extra)
File <command-3710765546539638>, line 2
      1 # Load model as a PyFuncModel.
----> 2 loaded_model = mlflow.langchain.load_model(logged_model.model_uri)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-6fdefbd4-710b-40a8-8764-423b3dffa1a0/lib/python3.10/site-packages/mlflow/langchain/__init__.py:934, in load_model(model_uri, dst_path)
    912 """
    913 Load a LangChain model from a local file or a run.
    914 
   (...)
    931     A LangChain model instance.
    932 """
    933 local_model_path = _download_artifact_from_uri(artifact_uri=model_uri, output_path=dst_path)
--> 934 return _load_model_from_local_fs(local_model_path)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-6fdefbd4-710b-40a8-8764-423b3dffa1a0/lib/python3.10/site-packages/mlflow/langchain/__init__.py:907, in _load_model_from_local_fs(local_model_path)
    905 _add_code_from_conf_to_system_path(local_model_path, flavor_conf)
    906 with patch_langchain_type_to_cls_dict():
--> 907     return _load_model(local_model_path, flavor_conf)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-6fdefbd4-710b-40a8-8764-423b3dffa1a0/lib/python3.10/site-packages/mlflow/langchain/__init__.py:616, in _load_model(local_model_path, flavor_conf)
    614     model = _load_runnables(local_model_path, flavor_conf)
    615 elif model_load_fn == _BASE_LOAD_KEY:
--> 616     model = _load_base_lcs(local_model_path, flavor_conf)
    617 else:
    618     raise mlflow.MlflowException(
    619         "Failed to load LangChain model. Unknown model type: "
    620         f"{flavor_conf.get(_MODEL_TYPE_KEY)}"
    621     )
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-6fdefbd4-710b-40a8-8764-423b3dffa1a0/lib/python3.10/site-packages/mlflow/langchain/utils.py:580, in _load_base_lcs(local_model_path, conf)
    578         model = _patch_loader(load_chain)(lc_model_path, **kwargs)
    579 elif agent_path is None and tools_path is None:
--> 580     model = _patch_loader(load_chain)(lc_model_path)
    581 else:
    582     from langchain.agents import initialize_agent
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-6fdefbd4-710b-40a8-8764-423b3dffa1a0/lib/python3.10/site-packages/mlflow/langchain/utils.py:525, in _patch_loader.<locals>.patched_loader(*args, **kwargs)
    524 def patched_loader(*args, **kwargs):
--> 525     return loader_func(*args, **kwargs, allow_dangerous_deserialization=True)
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/langchain/chains/loading.py:631, in load_chain(path, **kwargs)
    625 if isinstance(path, str) and path.startswith("lc://"):
    626     raise RuntimeError(
    627         "Loading from the deprecated github-based Hub is no longer supported. "
    628         "Please use the new LangChain Hub at https://smith.langchain.com/hub "
    629         "instead."
    630     )
--> 631 return _load_chain_from_file(path, **kwargs)
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/langchain/chains/loading.py:658, in _load_chain_from_file(file, **kwargs)
    655     config["memory"] = kwargs.pop("memory")
    657 # Load the chain from the config now.
--> 658 return load_chain_from_config(config, **kwargs)
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/langchain/chains/loading.py:620, in load_chain_from_config(config, **kwargs)
    617     raise ValueError(f"Loading {config_type} chain not supported")
    619 chain_loader = type_to_loader_dict[config_type]
--> 620 return chain_loader(config, **kwargs)
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/langchain/chains/loading.py:40, in _load_llm_chain(config, **kwargs)
     38 if "llm" in config:
     39     llm_config = config.pop("llm")
---> 40     llm = load_llm_from_config(llm_config, **kwargs)
     41 elif "llm_path" in config:
     42     llm = load_llm(config.pop("llm_path"), **kwargs)
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/langchain_community/llms/loading.py:33, in load_llm_from_config(config, **kwargs)
     28 if _ALLOW_DANGEROUS_DESERIALIZATION_ARG in llm_cls.__fields__:
     29     load_kwargs[_ALLOW_DANGEROUS_DESERIALIZATION_ARG] = kwargs.get(
     30         _ALLOW_DANGEROUS_DESERIALIZATION_ARG, False
     31     )
---> 33 return llm_cls(**config, **load_kwargs)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-6fdefbd4-710b-40a8-8764-423b3dffa1a0/lib/python3.10/site-packages/pydantic/v1/main.py:341, in BaseModel.__init__(__pydantic_self__, **data)
    339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
    340 if validation_error:
--> 341     raise validation_error
    342 try:
    343     object_setattr(__pydantic_self__, '__dict__', values)

Here's the model.yaml file of Ollama:

{'_type': 'ollama',
 'format': None,
 'keep_alive': None,
 'model': 'phi3:instruct',
 'options': {'mirostat': None,
  'mirostat_eta': None,
  'mirostat_tau': None,
  'num_ctx': None,
  'num_gpu': None,
  'num_predict': None,
  'num_thread': None,
  'repeat_last_n': None,
  'repeat_penalty': None,
  'stop': None,
  'temperature': 0.0,
  'tfs_z': None,
  'top_k': None,
  'top_p': None},
 'system': None,
 'template': None}

There's no mention of the extra fields that pydantic is complaining about.
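One reading of the error, offered as an assumption rather than a confirmed diagnosis: the only field pydantic names is `options` itself, so the nested `options` key in the saved config may be the extra field, with its entries expected as top-level arguments of the `Ollama` class. The sketch below uses a hypothetical `flatten_options` helper on a trimmed copy of the config above to show what a flattened config would look like.

```python
def flatten_options(config):
    """Move the entries of a nested 'options' dict to the top level."""
    flat = {key: value for key, value in config.items() if key != "options"}
    for key, value in (config.get("options") or {}).items():
        # Keep an existing top-level value if the same key appears twice.
        flat.setdefault(key, value)
    return flat


# Trimmed copy of the saved config, for illustration only.
saved = {
    "_type": "ollama",
    "model": "phi3:instruct",
    "options": {"temperature": 0.0, "top_k": None},
}
print(flatten_options(saved))
```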

Thanks.

harupy commented on June 9, 2024

@sreekarreddydfci Have you checked the langchain source code to see which fields are extra?

sreekarreddydfci commented on June 9, 2024

@harupy,

I've made some significant advancements in deploying Ollama models as Databricks model serving endpoints, similar to how existing LLMs are handled. Below, I outline the progress made and the current challenges.

Progress Made:

Langchain Code Modification:
- I've made changes to the Langchain code to resolve a validation error, which can be reviewed here: Langchain Commit.
Endpoint Setup and Querying:
- Successfully created and queried a model serving endpoint on Databricks without any initial issues. The default querying script and modifications are detailed below.

Current Issues and Code Snippets:

Issue 1: Customizing the Querying Script

I need guidance on customizing the default querying script to match the existing querying format of dbrx-instruct or other LLM models on Databricks endpoints.

Default Querying Script After Endpoint Setup:

import os
import requests
import numpy as np
import pandas as pd
import json

def create_tf_serving_json(data):
    return {'inputs': {name: data[name].tolist() for name in data.keys()} if isinstance(data, dict) else data.tolist()}

def score_model(dataset):
    url = 'https://<url>/serving-endpoints/LangchainTest1/invocations'
    headers = {'Authorization': f'Bearer {os.environ.get("DATABRICKS_TOKEN")}', 'Content-Type': 'application/json'}
    ds_dict = {'dataframe_split': dataset.to_dict(orient='split')} if isinstance(dataset, pd.DataFrame) else create_tf_serving_json(dataset)
    data_json = json.dumps(ds_dict, allow_nan=True)
    response = requests.request(method='POST', headers=headers, url=url, data=data_json)
    if response.status_code != 200:
        raise Exception(f'Request failed with status {response.status_code}, {response.text}')
    return response.json()

Comparison with dbrx-instruct Script:

from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://<url>/serving-endpoints"
)

chat_completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are an AI assistant"},
        {"role": "user", "content": "Tell me about Large Language Models"}
    ],
    model="databricks-dbrx-instruct",
    max_tokens=256
)

print(chat_completion.choices[0].message.content)
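The two request shapes differ mainly in the body: the generated script sends a dataframe-style payload, while the dbrx-instruct example sends chat messages. A minimal client-side sketch, assuming the logged chain takes a single `topic` input (as in the `dataframe_records` payload used under Issue 2), that converts chat-style messages into that payload. `messages_to_records` is a hypothetical helper, not a Databricks or MLflow API.

```python
import json


def messages_to_records(messages, column="topic"):
    """Wrap the last user message in a 'dataframe_records' payload."""
    user_turns = [m["content"] for m in messages if m.get("role") == "user"]
    if not user_turns:
        raise ValueError("expected at least one message with role 'user'")
    return {"dataframe_records": [{column: user_turns[-1]}]}


messages = [
    {"role": "system", "content": "You are an AI assistant"},
    {"role": "user", "content": "Tell me about Large Language Models"},
]
payload = messages_to_records(messages)
print(json.dumps(payload))
```

Note that this adapter drops the system message; a chain that needs it would have to expose a second input for it.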

Issue 2: Connection Error

I encountered a connection issue when querying the endpoint with the script below:

import os
import requests
import json

# Read the token from the environment rather than hard-coding it
DATABRICKS_TOKEN = os.environ.get("DATABRICKS_TOKEN")

def create_simplified_json(topic):
    # Simplifying the JSON creation to fit the expected 'dataframe_records' format with 'topic'
    return {'dataframe_records': [{'topic': topic}]}

def score_model(topic):
    url = 'https://<url>/serving-endpoints/LangchainTest1/invocations'
    headers = {'Authorization': f'Bearer {DATABRICKS_TOKEN}',
               'Content-Type': 'application/json'}
    # Generate the payload using the simplified function
    data_json = json.dumps(create_simplified_json(topic))
    
    response = requests.post(url, headers=headers, data=data_json)
    if response.status_code != 200:
        raise Exception(f'Request failed with status {response.status_code}, {response.text}')
    
    return response.json()

# Example usage
try:
    topic = "What is machine learning?"  # Assign the topic here
    result = score_model(topic)
    print("Model response:", result)
except Exception as e:
    print("Error querying the model:", str(e))

Here's the error log:

Error querying the model: Request failed with status 400, {"error_code": "BAD_REQUEST", "message": "1 tasks failed. Errors: {0: 'error: ConnectionError(MaxRetryError(\"HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcad9276aa0>: Failed to establish a new connection: [Errno 111] Connection refused'))\"))...

This error indicates a connectivity issue: the served chain tries to reach Ollama at localhost:11434 (Ollama serves on port 11434 by default) and the connection is refused. This is likely due to a configuration that points to localhost instead of the correct remote server or endpoint.
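A small check along these lines, assuming the chain still uses the default Ollama address `http://localhost:11434`, can confirm whether anything is listening where the chain expects it. `split_base_url` and `is_reachable` are illustrative helpers, not part of any library.

```python
import socket
from urllib.parse import urlparse


def split_base_url(base_url, default_port=11434):
    """Return (host, port) for an Ollama-style base URL."""
    parsed = urlparse(base_url)
    return parsed.hostname, parsed.port or default_port


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


host, port = split_base_url("http://localhost:11434")
print(f"{host}:{port} reachable: {is_reachable(host, port)}")
```

If the check fails inside the serving environment, pointing the `Ollama` object at a reachable server through its `base_url` argument before logging the model is one option to explore.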

How can I adjust the default querying script to align with the structured input expected by other Databricks endpoints, such as dbrx-instruct?
What steps can be taken to resolve the connectivity issues indicated in the error log? Is there a configuration step I might have missed?

Are there any other ways to serve these LLMs as endpoints?

Thank you.

github-actions commented on June 9, 2024

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.
