
bert-coie's Introduction

BERT-COIE

A BERT-based Chinese open information extraction method.

Requirements

python==3.6.5
Tensorflow>=1.12.0
pyltp==0.2.1

Usage:

Step 1: Download the BERT pre-trained checkpoint

Download the BERT-base Chinese checkpoint from https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip to ./bert-model/
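As a quick sanity check after downloading, the sketch below lists which of the usual BERT checkpoint files are missing from a directory. The helper name missing_checkpoint_files is hypothetical and not part of this repository. Whether BERT_COIE.py expects the files directly under ./bert-model/ or in the chinese_L-12_H-768_A-12 subfolder created by unzipping is an assumption to verify against the script.

```python
import os

# Files shipped in the official BERT-base Chinese archive.
EXPECTED_FILES = ["bert_config.json", "vocab.txt", "bert_model.ckpt.index"]


def missing_checkpoint_files(model_dir):
    """Return the expected checkpoint files that are absent from model_dir."""
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


# Assumed location; adjust if the archive was unzipped into a subfolder.
missing = missing_checkpoint_files("./bert-model")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("Checkpoint looks complete.")
```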

Step 2: Download the LTP data

Download the LTP model ltp_data_v3.4.0.zip from http://ltp.ai/download.html and unzip it

Step 3: Prepare your dataset

Put one record on each line, with each record in JSON format.

Each record should contain a field called "natural", which holds the original sentence, and a field called "tag_seq", which holds the tag sequence of the sentence, one tag per Chinese character. The tags follow the BIO scheme: B-E1 marks the beginning of argument 1 and I-E1 marks its following characters. E2 (argument 2) and R (relation) are tagged the same way.
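The format described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's own tooling: the helper make_record and the example sentence are hypothetical, "tag_seq" is assumed to be stored as a JSON list of strings (the scripts may instead expect a delimited string), and characters outside any span are assumed to take the standard BIO "O" tag.

```python
import json


def make_record(sentence, spans):
    """Build one record. spans maps a label (E1, E2, R) to (start, end),
    with character offsets and an exclusive end."""
    tags = ["O"] * len(sentence)  # assumed tag for characters outside any span
    for label, (start, end) in spans.items():
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return {"natural": sentence, "tag_seq": tags}


# Hypothetical example: E1 = 北京, R = 是, E2 = 中国的首都
sentence = "北京是中国的首都"
record = make_record(sentence, {"E1": (0, 2), "R": (2, 3), "E2": (3, 8)})

# One JSON object per line; ensure_ascii=False keeps the Chinese text readable.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Each such line would be appended to the dataset file, so that the file holds one JSON object per row.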

Put the full dataset in ./data/

Then run add_features/additional_features.py to add the part-of-speech (POS) and dependency parsing (DP) features:

python add_features/additional_features.py -sp -rf data/saoke.json -train data/train.json -test data/test.json -ltp ../ltp_data_v3.4.0

Step 4: Training and testing

Run BERT_COIE.py to train and test the model.

Step 5: Post-process the output and compute P, R, F1

Run utils/post_process.py:

python post_processing.py -data_dir <path/to/the/testset> -output_dir <path/to/the_output_dir_of_the_model>
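For readers unfamiliar with the metrics, the sketch below shows one common way to get precision (P), recall (R) and F1 from BIO-tagged output. It is not the repository's post-processing script. The function names are hypothetical, and two things are assumed: each sentence yields at most one (E1, R, E2) triple, and a predicted triple counts as correct only on an exact match with the gold triple.

```python
def extract_triple(sentence, tags):
    """Collect the text of the E1, R and E2 spans from a BIO tag sequence."""
    spans = {}
    current = None
    for char, tag in zip(sentence, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            spans[current] = char
        elif tag.startswith("I-") and current == tag[2:]:
            spans[current] += char
        else:
            current = None  # "O" tag or an I- tag without a matching B- tag
    return spans.get("E1", ""), spans.get("R", ""), spans.get("E2", "")


def precision_recall_f1(predicted, gold):
    """Exact-match P, R and F1 over two collections of hashable items."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


sentence = "北京是中国的首都"
gold_tags = ["B-E1", "I-E1", "B-R", "B-E2", "I-E2", "I-E2", "I-E2", "I-E2"]
triple = extract_triple(sentence, gold_tags)

# Pair each triple with its sentence index so that identical triples
# from different sentences are counted separately.
gold = [(0, triple), (1, ("A", "r", "B"))]
predicted = [(0, triple), (1, ("A", "r", "C"))]
p, r, f1 = precision_recall_f1(predicted, gold)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```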

This repository will continue to be updated.


bert-coie's Issues

About the data format

Hello, I would like to use my own data. What format should the data be in? Could you give an example?
