deepset-ai / hayhooks
Deploy Haystack pipelines behind a REST API.
Home Page: https://haystack.deepset.ai
License: Apache License 2.0
I created a hayhooks, Haystack, and OpenSearch setup (see this GitHub repo).
Both the query and the indexing YAML files were created by dumping a working pipeline config from Python.
The problem seems to be caused by indexing.yml: once I leave this file out, the query pipeline works without a problem.
Could you help me find the issue here? The error message I am getting is:
2024-04-29 13:38:03 json_schema = generate_for_schema_type(schema_or_field)
2024-04-29 13:38:03 File "/opt/venv/lib/python3.10/site-packages/pydantic/json_schema.py", line 765, in is_instance_schema
2024-04-29 13:38:03 return self.handle_invalid_for_json_schema(schema, f'core_schema.IsInstanceSchema ({schema["cls"]})')
2024-04-29 13:38:03 File "/opt/venv/lib/python3.10/site-packages/pydantic/json_schema.py", line 2093, in handle_invalid_for_json_schema
2024-04-29 13:38:03 raise PydanticInvalidForJsonSchema(f'Cannot generate a JsonSchema for {error_info}')
2024-04-29 13:38:03 pydantic.errors.PydanticInvalidForJsonSchema: Cannot generate a JsonSchema for core_schema.IsInstanceSchema (<class 'pandas.core.frame.DataFrame'>)
2024-04-29 13:38:03
2024-04-29 13:38:03 For further information visit https://errors.pydantic.dev/2.6/u/invalid-for-json-schema
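For context, this error can be reproduced outside hayhooks: Pydantic v2 cannot generate a JSON schema for any arbitrary class it validates only via isinstance, which includes pandas DataFrames. A minimal sketch of the failure mode (the model name and field are made up for illustration; they are not hayhooks code):

```python
import pandas as pd
from pydantic import BaseModel, ConfigDict
from pydantic.errors import PydanticInvalidForJsonSchema

# Hypothetical request model: any field typed as a DataFrame triggers the
# error, because Pydantic can only check it with isinstance and has no way
# to describe it as JSON Schema for the OpenAPI docs.
class HypotheticalRunRequest(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    dataframe: pd.DataFrame

try:
    HypotheticalRunRequest.model_json_schema()
except PydanticInvalidForJsonSchema as exc:
    print(f"schema generation failed: {exc}")
```

This suggests the indexing pipeline exposes a component input or output typed as a DataFrame, which breaks the OpenAPI schema that hayhooks builds for the endpoint.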
In order to have a good LLM chat UX, we need to stream the response to the client. LangServe does this with a dedicated endpoint; hayhooks could do the same (pseudocode):
async def pipeline_stream(pipeline_run_req: PipelineRunRequest) -> StreamingResponse:
    buffer = ...
    result = pipe.run(data=pipeline_run_req.dict())
    return StreamingResponse(buffer_generator)

app.add_api_route(
    path=f"/{pipeline_def.name}/stream",
    endpoint=pipeline_stream,
    methods=["POST"],
    name=pipeline_def.name,
    response_model=PipelineRunResponse,
)
Additionally, Haystack should provide a special streaming_callback
that writes each chunk's content to a buffer available to hayhooks. Maybe the Pipeline could add this logic and provide a pipe.stream method that returns a generator, or something like this.
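One way to sketch that buffer without touching Haystack internals is a queue that the streaming_callback writes into while the pipeline runs in a background thread; the generator side is what a StreamingResponse would consume. All names below are made up for illustration, not actual hayhooks or Haystack APIs:

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the stream

def run_pipeline_streaming(run_pipeline):
    """Yield chunks produced by a pipeline as they arrive.

    `run_pipeline` is any blocking callable that accepts a
    streaming_callback(chunk) and invokes it once per generated chunk.
    """
    buf = queue.Queue()

    def callback(chunk):
        buf.put(chunk)

    def worker():
        try:
            run_pipeline(callback)
        finally:
            buf.put(_SENTINEL)  # always unblock the consumer

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buf.get()
        if item is _SENTINEL:
            break
        yield item

# Usage: simulate an LLM emitting three chunks.
def fake_pipeline(cb):
    for token in ["Hel", "lo", "!"]:
        cb(token)

print("".join(run_pipeline_streaming(fake_pipeline)))  # prints Hello!
```

In a FastAPI endpoint, the generator returned here could be passed directly to StreamingResponse.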
Currently, hayhooks assumes deepset-ai/haystack#6651 was merged and released.
Hayhooks fails with chat messages because the Pydantic conversion to a dict removes the ChatMessage class.
I had to work around it like this:
dict_local = pipeline_run_req.dict()
# restore the original ChatMessage objects that .dict() flattened away
dict_local["prompt_builder"]["prompt_source"] = pipeline_run_req.prompt_builder.prompt_source
result = pipe.run(data=dict_local)
Currently, when the server starts and I go to http://localhost:1416/
(the root of the server), I get an empty response, which is unsatisfying. It would be better to show a welcome message there.
Modifications to be done here:
hayhooks/src/hayhooks/server/app.py
Line 37 in ee9150c
I would like to take up this issue if that's fine.
We need to add a Docker Compose deployment file to support fast launch of the hayhooks server in a local Docker environment; the current deployment files only support Kubernetes.
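A minimal sketch of what such a file could look like, assuming a published hayhooks image and the default port 1416 (the image name, tag, and volume mount point are assumptions to be checked against the project's Dockerfile):

```yaml
version: "3.8"
services:
  hayhooks:
    image: deepset/hayhooks:main   # assumed image name and tag
    ports:
      - "1416:1416"                # default hayhooks port
    volumes:
      # assumed mount point for pipeline YAML files loaded at startup
      - ./pipelines:/opt/pipelines
    restart: unless-stopped
```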
Currently, I am testing hayhooks, and I get an Internal Server Error.
example1.yml
components:
converter:
init_parameters:
extractor_type: DefaultExtractor
type: haystack.components.converters.html.HTMLToDocument
fetcher:
init_parameters:
raise_on_failure: true
retry_attempts: 2
timeout: 3
user_agents:
- haystack/LinkContentFetcher/2.0.1
type: haystack.components.fetchers.link_content.LinkContentFetcher
llm:
init_parameters:
generation_kwargs: {}
model: orca-mini
raw: false
streaming_callback: null
system_prompt: null
template: null
timeout: 1200
url: http://localhost:11434/api/generate
type: haystack_integrations.components.generators.ollama.generator.OllamaGenerator
prompt:
init_parameters:
template: |
"According to the contents of this website:
{% for document in documents %}
{{document.content}}
{% endfor %}
Answer the given question: {{query}}
Answer:
"
type: haystack.components.builders.prompt_builder.PromptBuilder
connections:
- receiver: converter.sources
sender: fetcher.streams
- receiver: prompt.documents
sender: converter.documents
- receiver: llm.prompt
sender: prompt.prompt
metadata: {}
Request body
{
"converter": {
"meta": {}
},
"fetcher": {
"urls": [
"https://haystack.deepset.ai/overview/quick-start"
]
},
"llm": {
"generation_kwargs": {}
},
"prompt": {
"query": "Which components do I need for a RAG pipeline?"
}
}
Could you indicate the correct curl command?
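Assuming the pipeline was deployed under the name example1 and hayhooks is listening on its default port 1416, a request could look like the sketch below (the route shape is an assumption; check GET /status or the interactive docs at /docs for the exact path of your deployed pipeline):

```shell
curl -X POST http://localhost:1416/example1 \
  -H "Content-Type: application/json" \
  -d '{
    "fetcher": {"urls": ["https://haystack.deepset.ai/overview/quick-start"]},
    "converter": {"meta": {}},
    "prompt": {"query": "Which components do I need for a RAG pipeline?"},
    "llm": {"generation_kwargs": {}}
  }'
```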
example2.yml
components:
llm:
init_parameters:
generation_kwargs: {}
model: orca-mini
raw: false
streaming_callback: null
system_prompt: null
template: null
timeout: 1200
url: http://localhost:11434/api/generate
type: haystack_integrations.components.generators.ollama.generator.OllamaGenerator
prompt_builder:
init_parameters:
template: |
"Given these documents, answer the question.
Documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{query}}
Answer:"
type: haystack.components.builders.prompt_builder.PromptBuilder
retriever:
init_parameters:
document_store:
init_parameters:
collection_name: documents
embedding_function: default
persist_path: .
type: haystack_integrations.document_stores.chroma.document_store.ChromaDocumentStore
filters: null
top_k: 10
type: haystack_integrations.components.retrievers.chroma.retriever.ChromaEmbeddingRetriever
text_embedder:
init_parameters:
generation_kwargs: {}
model: orca-mini
timeout: 1200
url: http://localhost:11434/api/embeddings
type: haystack_integrations.components.embedders.ollama.text_embedder.OllamaTextEmbedder
connections:
- receiver: retriever.query_embedding
sender: text_embedder.embedding
- receiver: prompt_builder.documents
sender: retriever.documents
- receiver: llm.prompt
sender: prompt_builder.prompt
max_loops_allowed: 100
metadata: {}
Request body
{
"llm": {
"generation_kwargs": {}
},
"prompt_builder": {
"query": "How old was he when he died?"
},
"retriever": {
"filters": {},
"top_k": 3
},
"text_embedder": {
"text": "How old was he when he died?",
"generation_kwargs": {}
}
}
Could you indicate the correct curl command?
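Assuming this pipeline was deployed under the name example2 on the default port 1416, a request could look like the sketch below (the route shape is an assumption; verify it against GET /status or the /docs page of the running server):

```shell
curl -X POST http://localhost:1416/example2 \
  -H "Content-Type: application/json" \
  -d '{
    "text_embedder": {"text": "How old was he when he died?", "generation_kwargs": {}},
    "retriever": {"filters": {}, "top_k": 3},
    "prompt_builder": {"query": "How old was he when he died?"},
    "llm": {"generation_kwargs": {}}
  }'
```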
I have a chatbot web app, and I am trying to rewrite its RAG logic using Haystack.
Now I am looking for a solution to wrap these pipelines into APIs for my frontend server to call.
Can I use this project for that? Does it have a stable version suitable for production?
Or should I implement it on my own?
Any suggestions?
The examples under the deploy
folder are not NLP-related; it would be nice to show a proper pipeline.