Comments (3)
Cool, I see. So for PDF files I can basically start from backend/core/quivr_core/processor/implementations/megaparse_processor.py and modify it as I want.
from quivr.
CORE-196 Enabling data ingestion pipelines
quivr-core basically uses the registry to find the processor that matches a file extension.
The internal implementation of the processor can be as complex or as simple as we want. As long as the processor implements ProcessorBase, it can be added to the registry 👍🏼 The internal steps and parsing are all internal to the processor and can use any external dependencies we want.
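To make the registry idea concrete, here is a rough sketch of the pattern described above. It is not quivr-core's actual code: the `ProcessorBase` below is a simplified stand-in for the real base class, and `register_processor`, `get_processor`, `process`, and `MyPdfProcessor` are hypothetical names picked for illustration. Check the real interface in the repo before copying anything.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class ProcessorBase(ABC):
    """Simplified stand-in; the real quivr-core base class may look different."""

    @abstractmethod
    def process(self, path: Path) -> list[str]:
        """Return the text chunks extracted from the file."""


# Hypothetical registry: lowercase file extension -> processor class.
_registry: dict[str, type[ProcessorBase]] = {}


def register_processor(extension: str, processor_cls: type[ProcessorBase]) -> None:
    """Add a processor to the registry, provided it implements ProcessorBase."""
    if not issubclass(processor_cls, ProcessorBase):
        raise TypeError(f"{processor_cls.__name__} must implement ProcessorBase")
    _registry[extension.lower()] = processor_cls


def get_processor(path: Path) -> ProcessorBase:
    """Instantiate the processor registered for the file's extension."""
    try:
        return _registry[path.suffix.lower()]()
    except KeyError:
        raise ValueError(f"no processor registered for {path.suffix!r}") from None


class MyPdfProcessor(ProcessorBase):
    """Custom processor; its internals are free to use any parsing library."""

    def process(self, path: Path) -> list[str]:
        # Placeholder for the real parsing steps (OCR, layout analysis, ...).
        return [f"parsed {path.name}"]


register_processor(".pdf", MyPdfProcessor)
```

With this in place, `get_processor(Path("report.pdf"))` returns a `MyPdfProcessor` instance, and everything the processor does internally stays hidden from the caller.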