Comments (4)
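Thanks for the pointer! Loading other text files works fine. After some digging, the cause appears to be the HTML links at the top of the ChatGLM README; deleting the following text fixes it: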
<p align="center">
🌐 <a href="https://chatglm.cn/blog" target="_blank">Blog</a> • 🤗 <a href="https://huggingface.co/THUDM/chatglm-6b" target="_blank">HF Repo</a> • 🐦 <a href="https://twitter.com/thukeg" target="_blank">Twitter</a> • 📃 <a href="https://arxiv.org/abs/2103.10360" target="_blank">[GLM@ACL 22]</a> <a href="https://github.com/THUDM/GLM" target="_blank">[GitHub]</a> • 📃 <a href="https://arxiv.org/abs/2210.02414" target="_blank">[GLM-130B@ICLR 23]</a> <a href="https://github.com/THUDM/GLM-130B" target="_blank">[GitHub]</a> <br>
</p>
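For anyone who would rather not edit the README by hand, here is a minimal sketch of stripping that block before loading. It uses only the standard library; the name `strip_centered_links` is hypothetical, and the pattern assumes the block is a non-nested `<p align="center">` element like the one quoted above.

```python
import re

# Matches a centered <p> block such as the link bar at the top of the ChatGLM README.
# Assumption: the block is not nested and is closed by the first following </p>.
CENTERED_P = re.compile(
    r'<p\s+align="center">.*?</p>\s*',
    re.DOTALL | re.IGNORECASE,
)


def strip_centered_links(text: str) -> str:
    """Remove every <p align="center">...</p> block from README text."""
    return CENTERED_P.sub("", text)


# Small illustrative input, not the full README.
sample = (
    '<p align="center">\n'
    '<a href="https://chatglm.cn/blog" target="_blank">Blog</a> <br>\n'
    "</p>\n"
    "# ChatGLM-6B\n"
)
cleaned = strip_centered_links(sample)
print(cleaned)
```

The cleaned text can then be written to a temporary file and loaded as usual. I have not checked whether other HTML fragments further down the README cause the same problem.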
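Also, opening the file directly in rb mode still causes problems (the text is still garbled), but opening it with an explicit encoding first and passing the resulting file IO object to the loader seems to work: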
from langchain.document_loaders import UnstructuredFileIOLoader

# filepath: path to the downloaded README
with open(filepath, "r", encoding="utf8") as f:
    loader = UnstructuredFileIOLoader(file=f, mode="elements")
    docs = loader.load()  # load while the file is still open
from langchain-chatchat.
Did you download the README locally and then load it with UnstructuredFileLoader?
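Did you download the README locally and then load it with UnstructuredFileLoader?
Yes. My local GPU does not have enough VRAM, so I am working on a cloud VM on the AutoDL platform. I have confirmed that the downloaded file is UTF-8 encoded. Relevant library versions: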
langchain 0.0.128
transformers 4.26.1
unstructured 0.5.8
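A quick way to confirm that a downloaded file really is UTF-8, sketched with the standard library only. The helper name `is_valid_utf8` and the demo file names are hypothetical; the check simply attempts a strict decode of the raw bytes.

```python
import os
import tempfile


def is_valid_utf8(path: str) -> bool:
    """Return True if the whole file decodes as strict UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Demo with two throwaway files: one UTF-8, one GBK.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.md")
    bad = os.path.join(tmp, "bad.md")
    with open(good, "wb") as f:
        f.write("感谢".encode("utf-8"))
    with open(bad, "wb") as f:
        f.write("感谢".encode("gbk"))
    good_ok = is_valid_utf8(good)
    bad_ok = is_valid_utf8(bad)

print(good_ok, bad_ok)
```

A strict decode only shows the bytes are valid UTF-8; it does not rule out a loader re-decoding them with a different codec later.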