Comments (3)

timgogochen commented on July 3, 2024

What happened? Is this caused by a language limitation?

rthiiyer82 commented on July 3, 2024

Hey @timgogochen: I'm unable to reproduce the issue you mentioned in Weaviate v1.25.4, which is the latest version available. I used the script below; please take a look and let me know if I am missing anything. I am not sure what you are trying to achieve, as I don't know the language used in the text. Which version of the Python client are you using to interact with Weaviate?

import weaviate
import weaviate.classes.config as wvc
from weaviate.collections.classes.data import DataObject

client = weaviate.connect_to_local()


chunked_text = "人力资源社会保障部规章 人力资源服务机构管理规定 (2023 年 6 月 29 日人力资源社会保障部令第 50 号公布 自 2023 年 8 月 1 日起施行) 第一条 为了加强对人力资源服务机构的管理,规范人力资 源服务活动,健全统一开放、竞争有序的人力资源市场体系,促 进高质量充分就业和优化人力资源流动配置,根据《中华人民共 和国就业促进法》《人力资源市场暂行条例》等法律、行政法规, 第二条 在中华人民共和国境内的人力资源服务机构从事 人力资源服务活动,适用本规定。 第三条 县级以上人力资源社会保障行政部门依法开展本 行政区域内的人力资源服务机构管理工作。 人力资源社会保障部发布 人力资源社会保障部规章 第四条 人力资源社会保障行政部门应当加强人力资源服 务标准化、信息化建设,指导人力资源服务行业协会加强行业自 第二章 行政许可和备案"
chunks_list = list()


# Recreate the collection from scratch.
# WARNING: this deletes the "chunk_test" collection and all of its data.
if client.collections.exists("chunk_test"):
    client.collections.delete("chunk_test")

collection = client.collections.create(
    name="chunk_test",
    vector_index_config=wvc.Configure.VectorIndex.hnsw(),
    properties=[
        wvc.Property(
            name="chapter_title",
            data_type=wvc.DataType.TEXT,
            vectorize_property_name=True,
            tokenization=wvc.Tokenization.WORD,
        ),
        wvc.Property(name="chunk", data_type=wvc.DataType.TEXT, vectorize_property_name=True),
        wvc.Property(name="chunk_index", data_type=wvc.DataType.NUMBER),
    ],
)

# Note: chunked_text is a single string, so enumerate() yields one character
# per iteration and each object stores a single character in "chunk".
for i, chunk in enumerate(chunked_text):
    data_properties = {
        "chapter_title": "What is Git",
        "chunk": chunk,
        "chunk_index": i,
    }
    data_object = DataObject(properties=data_properties)
    chunks_list.append(data_object)
print(chunks_list[0])


def load_records():
    # Insert all prepared objects in a single batch call.
    collection.data.insert_many(chunks_list)


load_records()

And here is the output:
DataObject(properties={'chapter_title': 'What is Git', 'chunk': '人', 'chunk_index': 0}, uuid=None, vector=None, references=None)
sys:1: ResourceWarning: unclosed <socket.socket fd=5, family=30, type=1, proto=6, laddr=('::1', 54361, 0, 0), raddr=('::1', 8080, 0, 0)>

Note that no error is raised when inserting the objects.
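As a side note on the script above: because chunked_text is a plain string, looping over it with enumerate() produces one character per object, which is why the printed chunk is just '人'. The following is a minimal sketch of the difference, assuming fixed-size character chunks are acceptable. The helper name chunk_text, the shortened sample string, and the chunk size of 10 are illustrative assumptions, not part of the original script or of the Weaviate client.

```python
def chunk_text(text: str, size: int) -> list[str]:
    """Split text into consecutive pieces of at most `size` characters.

    Hypothetical helper for illustration only; not part of Weaviate.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[start:start + size] for start in range(0, len(text), size)]


# Shortened sample taken from the beginning of the text in the script above.
sample = "人力资源社会保障部规章 人力资源服务机构管理规定"

# Iterating a string directly yields single characters, as in the script.
per_character = [chunk for _, chunk in enumerate(sample)]

# Slicing yields multi-character chunks instead (size 10 is an arbitrary choice).
per_slice = chunk_text(sample, 10)

print(per_character[0])
print(per_slice[0])
```

Each element of per_slice could then be used as the "chunk" property in place of the single characters, if multi-character chunks are what is intended.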

rthiiyer82 commented on July 3, 2024

I also tried the code below, which inserts the whole text as a single object, and that works as well:

data_properties = {
    "chapter_title": "What is Git",
    "chunk": chunked_text,
    "chunk_index": 1,
}
data_object = DataObject(properties=data_properties)
chunks_list.append(data_object)
print(chunks_list[0])


def load_records():
    collection.data.insert_many(chunks_list)


load_records()

The output with no errors:

DataObject(properties={'chapter_title': 'What is Git', 'chunk': '人力资源社会保障部规章 人力资源服务机构管理规定 (2023 年 6 月 29 日人力资源社会保障部令第 50 号公布 自 2023 年 8 月 1 日起施行) 第一条 为了加强对人力资源服务机构的管理,规范人力资 源服务活动,健全统一开放、竞争有序的人力资源市场体系,促 进高质量充分就业和优化人力资源流动配置,根据《中华人民共 和国就业促进法》《人力资源市场暂行条例》等法律、行政法规, 第二条 在中华人民共和国境内的人力资源服务机构从事 人力资源服务活动,适用本规定。 第三条 县级以上人力资源社会保障行政部门依法开展本 行政区域内的人力资源服务机构管理工作。 人力资源社会保障部发布 人力资源社会保障部规章 第四条 人力资源社会保障行政部门应当加强人力资源服 务标准化、信息化建设,指导人力资源服务行业协会加强行业自 第二章 行政许可和备案', 'chunk_index': 1}, uuid=None, vector=None, references=None)
sys:1: ResourceWarning: unclosed <socket.socket fd=5, family=30, type=1, proto=6, laddr=('::1', 54538, 0, 0), raddr=('::1', 8080, 0, 0)>
