Comments (8)
Hey, I was asked to help someone trying to use your project who was getting the same error. Below is the reply I gave them, which includes the likely cause.
https://github.com/StanGirard/quiver/blob/adbb41eb40f20fc264dbd68df2079649518e381d/loaders/common.py#L14
https://github.com/StanGirard/quiver/blob/adbb41eb40f20fc264dbd68df2079649518e381d/loaders/common.py#L20
https://github.com/StanGirard/quiver/blob/adbb41eb40f20fc264dbd68df2079649518e381d/utils.py#L4
Looks like they create a temp file, then pass its file name to a function that tries to open it.
https://docs.python.org/3.9/library/tempfile.html#tempfile.NamedTemporaryFile
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows)
(and I knew what to look for thanks to https://stackoverflow.com/questions/23212435/permission-denied-to-write-to-my-temporary-file)
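One possible way around this, sketched here as an illustration and not taken from the project's code, is to create the file with `delete=False`, close it, and only then reopen it by name. The function name `roundtrip_via_temp_file` and its parameters are made up for this example.

```python
import os
import tempfile


def roundtrip_via_temp_file(data: bytes, suffix: str = ".bin") -> bytes:
    """Write data to a temp file, close it, then reopen it by name.

    Hypothetical helper for illustration only.
    """
    # delete=False keeps the file on disk after close(), so it can be
    # reopened by name once the first handle has been released.
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        tmp.write(data)
        tmp.close()  # release the handle before opening the file again
        with open(tmp.name, "rb") as reopened:
            return reopened.read()
    finally:
        tmp.close()  # harmless if the file is already closed
        os.remove(tmp.name)  # with delete=False, cleanup is manual
```

Because the first handle is closed before the second `open`, this should behave the same on Windows and Unix; the cost is that the file has to be removed by hand.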
from quivr.
This worked for me:
import os
import tempfile
import time

import streamlit as st
from langchain.schema import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from utils import compute_sha1_from_file


def process_file(vector_store, file, loader_class, file_suffix):
    file_name = file.name
    file_size = file.size
    dateshort = time.strftime("%Y%m%d")

    # Create a temporary file using mkstemp
    fd, tmp_file_name = tempfile.mkstemp(suffix=file_suffix)
    with os.fdopen(fd, 'wb') as tmp_file:
        tmp_file.write(file.getvalue())

    # The file is closed at this point, so it can be reopened by name
    loader = loader_class(tmp_file_name)
    documents = loader.load()
    file_sha1 = compute_sha1_from_file(tmp_file_name)

    chunk_size = st.session_state['chunk_size']
    chunk_overlap = st.session_state['chunk_overlap']
    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    documents = text_splitter.split_documents(documents)

    # Add the document sha1 as metadata to each document
    docs_with_metadata = [
        Document(
            page_content=doc.page_content,
            metadata={"file_sha1": file_sha1, "file_size": file_size, "file_name": file_name, "chunk_size": chunk_size, "chunk_overlap": chunk_overlap, "date": dateshort},
        )
        for doc in documents
    ]
    vector_store.add_documents(docs_with_metadata)

    # Don't forget to remove the temporary file when you're done with it
    os.remove(tmp_file_name)
    return
This version of common.py should avoid the permission issue you were encountering on Windows.
from quivr.
Ouch something about windows probably 😬
Where did you install quiver, and do you have access to the D folder mentioned?
from quivr.
I think I followed all the instructions, but once streamlit is running, I drag in a PDF and click on Add to Database, and this error is shown. Any idea?
THANK YOU !!!
I can see three drive letters in your answer. That's probably the issue.
When you upload a file, it's going to a folder in the app, and after it is uploaded as embeddings, it's deleted. I don't know why this "duplication" is needed.
from quivr.
This is what is shown in the console:
2023-05-13 18:19:16.063 Uncaught app exception
Traceback (most recent call last):
  File "M:\Working- ENVS\Python3.10B\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "N:- GoogleDrive USAL\Working\PYTHON\quiver-main\main.py", line 57, in <module>
    file_uploader(supabase, openai_api_key, vector_store)
  File "n:- GoogleDrive USAL\Working\PYTHON\quiver-main\files.py", line 37, in file_uploader
    file_processors[file_extension](vector_store, file)
  File "n:- GoogleDrive USAL\Working\PYTHON\quiver-main\loaders\pdf.py", line 6, in process_pdf
    return process_file(vector_store, file, PyPDFLoader, ".pdf")
  File "n:- GoogleDrive USAL\Working\PYTHON\quiver-main\loaders\common.py", line 19, in process_file
    documents = loader.load()
  File "M:\Working- ENVS\Python3.10B\lib\site-packages\langchain\document_loaders\pdf.py", line 113, in load
    return list(self.lazy_load())
  File "M:\Working- ENVS\Python3.10B\lib\site-packages\langchain\document_loaders\pdf.py", line 120, in lazy_load
    yield from self.parser.parse(blob)
  File "M:\Working- ENVS\Python3.10B\lib\site-packages\langchain\document_loaders\base.py", line 87, in parse
    return list(self.lazy_parse(blob))
  File "M:\Working- ENVS\Python3.10B\lib\site-packages\langchain\document_loaders\parsers\pdf.py", line 16, in lazy_parse
    with blob.as_bytes_io() as pdf_file_obj:
  File "C:\Program Files\Python310\lib\contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "M:\Working- ENVS\Python3.10B\lib\site-packages\langchain\document_loaders\blob_loaders\schema.py", line 86, in as_bytes_io
    with open(str(self.path), "rb") as f:
PermissionError: [Errno 13] Permission denied: 'D:\TEMP\tmpim3u4796.pdf'
D:\TEMP has no problem with permissions. It's the system's temporary directory, and all programs and users have permission to use it.
from quivr.
Hey, I was asked to help someone trying to use your project who was getting the same error. Below is the reply I gave them, which includes the likely cause.
https://github.com/StanGirard/quiver/blob/adbb41eb40f20fc264dbd68df2079649518e381d/utils.py#L4
Looks like they create a temp file, then pass its file name to a function that tries to open it.
https://docs.python.org/3.9/library/tempfile.html#tempfile.NamedTemporaryFile
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows)
(and I knew what to look for thanks to https://stackoverflow.com/questions/23212435/permission-denied-to-write-to-my-temporary-file)
That looks exactly like the problem I have. Any idea how to catch the error?
from quivr.
I have the same problem, on Windows as well.
from quivr.
I encountered a PermissionError when trying to open a temporary file on a Windows platform. The issue originates from this block of code in common.py:
with tempfile.NamedTemporaryFile(delete=True, suffix=file_suffix) as tmp_file:
    tmp_file.write(file.getvalue())
    tmp_file.flush()
    loader = loader_class(tmp_file.name)
    documents = loader.load()
    file_sha1 = compute_sha1_from_file(tmp_file.name)
The PermissionError arises because, on Windows, a file created by tempfile.NamedTemporaryFile() cannot be opened a second time by name while the original handle is still open. Unix-based systems allow the second open, which is why the same code works there.
To resolve this issue, I modified the code to use tempfile.mkstemp() instead. It returns an OS-level file descriptor together with the file's path and does not delete the file automatically. That lets the code close the file after writing and only then reopen it by name, which works on Windows and Unix alike. The trade-off is that the code must delete the file itself.
Here's the modified block of code:
# Create a temporary file using `tempfile.mkstemp`.
tmp_fd, tmp_file_name = tempfile.mkstemp(suffix=file_suffix)
try:
    # Write to the temporary file.
    with os.fdopen(tmp_fd, 'wb') as tmp_file:
        tmp_file.write(file.getvalue())
        tmp_file.flush()
    # Now you can pass the temporary file's name to `loader_class` and `compute_sha1_from_file`.
    loader = loader_class(tmp_file_name)
    documents = loader.load()
    file_sha1 = compute_sha1_from_file(tmp_file_name)
finally:
    # Clean up the temporary file.
    if os.path.exists(tmp_file_name):
        os.remove(tmp_file_name)
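If the same pattern is needed in more than one loader, it could be wrapped in a small context manager. The following is only a sketch under that assumption; `closed_temp_file` is a hypothetical name and is not part of the project.

```python
import contextlib
import os
import tempfile
from typing import Iterator


@contextlib.contextmanager
def closed_temp_file(data: bytes, suffix: str = "") -> Iterator[str]:
    """Yield the path of a temp file that holds `data` and is already closed.

    Hypothetical helper: the file is removed when the `with` block exits.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Write and close first, so callers can reopen the file by name.
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
        yield path
    finally:
        # Clean up even if the caller's code raised an exception.
        if os.path.exists(path):
            os.remove(path)
```

Inside `process_file`, this might be used as `with closed_temp_file(file.getvalue(), suffix=file_suffix) as tmp_file_name:` with the loader and SHA-1 calls placed in the body of the `with` block.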
from quivr.