Giter Club home page Giter Club logo

bertsimilarity's Introduction

BERTSimilarity

A BERT Embedding library for sentence semantic similarity measurement ๐Ÿค–

This library is a sentence semantic measurement tool based on BERT Embeddings. It uses the forward pass of the BERT (bert-base-uncased) model for estimating the embedding vectors and then applies the generic cosine formulation for distance measurement. The distance metric can be changed and the intermediate sentence and word embedding vectors can be attained as well. The model has been abstracted from the Google Research's BERT implementation.The pytorch wrapper over BERT is credited to Chris McCormick.

Dependencies

Pytorch

Transformers

Scipy

Usability

Installation is carried out using the pip command as follows:

pip install BERTSimilarity==0.1

For using inside the Jupyter Notebook or Python IDE:

import BERTSimilarity.BERTSimilarity as bertsimilarity

The 'Similarity_Test.py' file contains an example of using the Library in this context.

Samples

A sample of semantic similarity measurement with 4 different sentences , 2 of which are vaguely similar is provided below:

This Colab Notebook can be used as well for experimentation.

A Kaggle Kernel for Question Pair Similarity detection is also provided which uses this library.

The Notebook is featured in QuantumStat.com

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT

bertsimilarity's People

Contributors

abhilash1910 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.