Ancient greek word2vec

This is a latent space for Ancient Greek, trained on 149883 sentences from the First1K project.

The vocabulary was lemmatized (534258 words + 96443 words not lemmatizable). Latent spaces of 10, 25, 50, 75, 100 and 300 dimensions are provided (gr...vec). For comparison, another model (Nov22_RW), based on Ryder Wishart's repository (see References), is also provided. Models whose names end in mc3 only take into account words that occur more than 3 times.
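The frequency cutoff behind the mc3 models can be pictured as a filter applied to the lemmatized sentences before training. The sketch below is only an illustration under that assumption: the function name `filter_rare_lemmas`, its `threshold` parameter and the toy sentences are hypothetical and not taken from this repository, which may apply the cutoff inside the training library instead.

```python
from collections import Counter


def filter_rare_lemmas(sentences, threshold=3):
    """Keep only lemmas that occur more than `threshold` times in the corpus.

    `sentences` is a list of lists of lemmas (hypothetical input shape).
    """
    # Count every lemma across the whole corpus.
    counts = Counter(lemma for sentence in sentences for lemma in sentence)
    # Rebuild each sentence without the rare lemmas.
    return [
        [lemma for lemma in sentence if counts[lemma] > threshold]
        for sentence in sentences
    ]


# Toy corpus: "ἀνήρ" and "λέγω" occur 4 times each, "γυνή" only once.
toy_sentences = [
    ["ἀνήρ", "λέγω"],
    ["ἀνήρ", "γυνή"],
    ["ἀνήρ", "λέγω"],
    ["ἀνήρ", "λέγω"],
    ["λέγω"],
]
filtered = filter_rare_lemmas(toy_sentences)
```

With a threshold of 3, a lemma seen exactly 3 times is dropped and one seen 4 times is kept, which matches "more than 3 times" as worded above.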

Online demonstration

Running a binder instance

Binder

Graph Output sample

Display a sample graph: this page shows an example of the graph output. Double-click a node to query the dictionary.

Precomputed standalone

A precomputed app is available for the 300-dimension latent space. It shows the 10 closest words for each of the 20000 most frequent Greek words. It only displays the graph and does not compute distances. Access to the app
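As a rough illustration of what "closest words" means here, the sketch below builds a neighbour list for each word using cosine similarity. It is a minimal pure-Python sketch, not the code behind the precomputed app: the helper names `cosine` and `nearest_neighbours`, the choice of cosine similarity as the distance, and the two-dimensional toy vectors are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbours(vectors, k=10):
    """Map each word to its k most similar other words (hypothetical helper)."""
    graph = {}
    for word, vec in vectors.items():
        # Score every other word against the current one.
        scored = [
            (cosine(vec, other), candidate)
            for candidate, other in vectors.items()
            if candidate != word
        ]
        scored.sort(reverse=True)  # highest similarity first
        graph[word] = [candidate for _, candidate in scored[:k]]
    return graph


# Toy vectors, invented for the example (real models have 10 to 300 dimensions).
toy_vectors = {
    "ἀνήρ": [1.0, 0.1],
    "γυνή": [0.9, 0.3],
    "βασιλεύς": [0.7, 0.7],
    "θάλασσα": [-0.2, 1.0],
}
graph = nearest_neighbours(toy_vectors, k=2)
```

Each key of `graph` is a node and each listed word is an edge target. This all-pairs loop is quadratic in the vocabulary size, so for 20000 words a vectorised library would be the practical choice.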

aner

Sense addition

The classical example king+woman-man=queen doesn't work properly with the First1KGreek dataset, maybe because queen (βασίλισσα) appears only 4 times. It works with Ryder Wishart's dataset (selected automatically in the example).
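The arithmetic behind king+woman-man can be sketched as follows: add and subtract the word vectors, then look for the remaining word whose vector is closest to the result. This is a simplified pure-Python sketch, not the query used by the app. The helper name `analogy` and the three-dimensional toy vectors are hypothetical and were chosen by hand so that the example resolves cleanly. Libraries such as gensim offer a comparable query (`most_similar` with `positive` and `negative` arguments), which also normalizes the vectors first.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(vectors, positive, negative):
    """Return the word closest to sum(positive) - sum(negative).

    Query words themselves are excluded from the candidates.
    """
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        target = [t + x for t, x in zip(target, vectors[word])]
    for word in negative:
        target = [t - x for t, x in zip(target, vectors[word])]
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Hand-made toy vectors: axes are roughly (royalty, femaleness, personhood).
toy_vectors = {
    "ἀνήρ": [0.0, 0.0, 1.0],       # man
    "γυνή": [0.0, 1.0, 1.0],       # woman
    "βασιλεύς": [1.0, 0.0, 1.0],   # king
    "βασίλισσα": [1.0, 1.0, 1.0],  # queen
    "θάλασσα": [0.1, 0.2, -1.0],   # sea
}
result = analogy(toy_vectors, positive=["βασιλεύς", "γυνή"], negative=["ἀνήρ"])
```

On real embeddings the outcome depends on how well each word is represented, which is consistent with the remark above that a word seen only 4 times may not land where expected.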

Installation

Dev

Clone this repo and create the environment from environment.yml.

Production

You can build the docker container:

docker build -t yourtag/latentgreek .
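or, for a multi-platform image that is also compatible with Apple silicon:

docker buildx build --platform linux/amd64,linux/arm64 -t yourtag/latentgreek --push .

Then start the container:

docker run -p 8888:8888 yourtag/latentgreek

This runs a server with the GUI, published on port 8888 of the host.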

References

Řehůřek, Radim, and Petr Sojka. "Software Framework for Topic Modelling with Large Corpora". In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, 45-50. Valletta, Malta: ELRA, 2010.

Muellner, Leonard. "The Free First Thousand Years of Greek". Digital Classical Philology: Ancient Greek and Latin in the Digital Revolution, edited by Monica Berti, Berlin, Boston: De Gruyter Saur, 2019, pp. 7-18. https://doi.org/10.1515/9783110599572-002

https://github.com/ryderwishart/ancient-greek-word2vec
