
Comments (8)

matthewsun214 commented on September 2, 2024

Also, when I run version 2, I get a CUDA out-of-memory error while training kobe-full on 2×P100 GPUs. How can I solve it?

Thank you.

from kobe.

qibinc commented on September 2, 2024

Hi @matthewsun214, thanks for your interest in this work. In general (not specific to this work), when you run into OOM errors, you can decrease the batch size (e.g., --batch-size=32 instead of the default 64). You can also use gradient accumulation, which keeps the effective batch size unchanged while lowering the memory needed per step.
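To illustrate the gradient accumulation idea, here is a minimal, framework-free sketch. It is not code from this repository: the one-parameter model, the toy data, and the helper names `mse_grad` and `accumulated_grad` are all assumptions made for illustration. It only shows that averaging the gradients of two micro-batches of 32 gives the same gradient as one batch of 64.

```python
# Illustrative sketch only: a 1-parameter linear model y_hat = w * x
# trained with mean-squared-error loss. All names are hypothetical.

def mse_grad(w, batch):
    """Gradient of mean((w*x - y)^2) over `batch` with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, micro_batch_size):
    """Average the gradients of equally sized micro-batches."""
    assert len(batch) % micro_batch_size == 0
    num_steps = len(batch) // micro_batch_size
    grad = 0.0
    for i in range(num_steps):
        micro = batch[i * micro_batch_size:(i + 1) * micro_batch_size]
        # Scale each micro-batch gradient by 1/num_steps so the sum
        # equals the gradient of the mean loss over the full batch.
        grad += mse_grad(w, micro) / num_steps
    return grad


# Toy data (assumed): 64 points on the line y = 3x + 1.
data = [(float(x), 3.0 * x + 1.0) for x in range(64)]
w = 0.5

full = mse_grad(w, data)                                # one batch of 64
accum = accumulated_grad(w, data, micro_batch_size=32)  # two batches of 32
```

In a real training loop the same pattern applies: run several smaller forward/backward passes, scale each loss by the number of accumulation steps, and call the optimizer step only once the gradients of all micro-batches have been summed.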

Hope this helps!


qibinc commented on September 2, 2024

To build the facts, we use BM25 with the title as the query to retrieve knowledge from a knowledge base, which is a traditional sparse retrieval method. More recently, there are works like DPR (https://arxiv.org/abs/2004.04906), which learns dense, differentiable retrieval, and even WebGPT (https://arxiv.org/pdf/2112.09332.pdf), which uses existing search engines to retrieve knowledge.
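As a rough illustration of title-based BM25 retrieval, here is a small self-contained sketch. It is not the pipeline used to build the released fact files: the toy knowledge base, the whitespace tokenization, the function name `bm25_scores`, and the parameters `k1=1.5` and `b=0.75` (common defaults) are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in docs_tokens:
        df.update(set(doc))

    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Hypothetical knowledge base and product title, for illustration only.
knowledge_base = [
    "cotton is a soft natural fiber used in clothing",
    "stainless steel resists rust and corrosion",
    "down jackets use feathers for insulation",
]
title = "cotton shirt"

docs = [entry.split() for entry in knowledge_base]
scores = bm25_scores(title.split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Here `knowledge_base[best]` is the entry that would be attached to the title as a fact. For Chinese titles, the whitespace split would need to be replaced by a tokenizer suited to the text.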


qibinc commented on September 2, 2024

[Image: screenshot of training progress]

Here is a screenshot of the current version's training progress. Please feel free to reach out if you run into unexpected problems or performance issues!


matthewsun214 commented on September 2, 2024

Thank you for your reply. I am trying to train the model with the data you provided. Also, I would like to know which tools you use for handling Chinese words, such as Jieba?


qibinc commented on September 2, 2024

Hi @matthewsun214, we use the same pre-trained tokenizer as bert-base-chinese (see https://github.com/THUDM/KOBE#tokenization), which basically tokenizes Chinese into characters and numbers/alphabetic strings into words. This has been shown to be more effective than a word-level LM with word segmentation tools for Chinese.
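To make the splitting behavior concrete, here is a simplified sketch that approximates it with a regular expression. It is not the bert-base-chinese tokenizer itself: the real tokenizer also applies a WordPiece vocabulary, so it may break alphanumeric words into subwords, and the function name `tokenize` and the sample string are assumptions for illustration.

```python
import re

# One token per CJK character, one token per run of ASCII letters/digits,
# and one token per remaining non-space symbol. Whitespace is dropped.
_TOKEN_PATTERN = re.compile(
    r"[\u4e00-\u9fff]"                      # a single Chinese character
    r"|[A-Za-z0-9]+"                        # a whole alphanumeric word
    r"|[^\sA-Za-z0-9\u4e00-\u9fff]"         # any other single symbol
)


def tokenize(text):
    """Approximate character-level tokenization for mixed Chinese text."""
    return _TOKEN_PATTERN.findall(text)


tokens = tokenize("纯棉T恤 2024新款")
# tokens == ['纯', '棉', 'T', '恤', '2024', '新', '款']
```

Unlike Jieba, this kind of tokenization needs no word segmentation dictionary, since every Chinese character becomes its own token.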

By the way, I would suggest going through the README and the code before asking for help in the issues (it will probably get you answers much faster).


matthewsun214 commented on September 2, 2024

Hi @qibinc, we have already gone through the README, the code, and your paper. You mention the methodology for retrieving knowledge in the paper, but we ran into several technical difficulties in data preprocessing, mainly in building the fact file. I would suggest adding some instructions to the README about building the fact file; otherwise, it is really hard for people to guess how to build it.


qibinc commented on September 2, 2024

Hi @matthewsun214, thanks for your suggestions. I was referring to the tokenization (word segmentation) question you asked, which is clearly covered in the README and the code. Retrieving the knowledge goes beyond the scope of the paper, as we focus on utilizing the knowledge. I wish you good luck in your work.

