cngal-to-vndb's Introduction

CnGal -> VNDB

Introduction

This is a very simple implementation of the initial filtering solution I described on the VNDB discussion board. It has three parts (the implementation differs slightly from the list below due to technical limitations):

  1. Compare VNDB release extlinks (Steam) with the CnGal SteamId, and pick out the CnGal entries without a matching VNDB release (either the Steam release is missing on VNDB or the game was not released on Steam at all).
  2. Compare the VNDB alttitle with the CnGal name, and again pick out the CnGal entries without a match.
  3. Compare release dates (!). This could give wrong results due to a bug on the CnGal side, but the number of VNs left to check should be significantly smaller.
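The three passes above can be sketched roughly as follows. This is only an illustration under assumptions: the Entry record, its field names, and split_candidates are hypothetical and are not taken from the scripts in this repo. The date pass here only separates the remaining entries into two groups, since the date comparison is described as unreliable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical normalized record; the field names are assumptions."""
    title: str
    steam_id: Optional[int] = None
    released: Optional[str] = None  # ISO date string, e.g. "2021-05-01"


def split_candidates(cngal: list[Entry], vndb: list[Entry]):
    """Return (likely_missing, same_date) lists of CnGal entries."""
    steam_ids = {v.steam_id for v in vndb if v.steam_id is not None}
    titles = {v.title.casefold() for v in vndb}
    dates = {v.released for v in vndb if v.released}

    # Pass 1: drop entries whose Steam ID already appears on a VNDB release.
    rest = [c for c in cngal if c.steam_id is None or c.steam_id not in steam_ids]
    # Pass 2: drop entries whose name equals a known VNDB title.
    rest = [c for c in rest if c.title.casefold() not in titles]
    # Pass 3: use the release date only as a hint, not as proof of a match.
    likely_missing = [c for c in rest if c.released not in dates]
    same_date = [c for c in rest if c.released in dates]
    return likely_missing, same_date
```

Entries in same_date share a release date with some VNDB release, so they may already exist under a different title and are worth a manual check before adding.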

Components

  • zh-rel-on-vndb.py: filters zh-Hans & zh-Hant releases on VNDB whose parent VN's original language is Chinese.
  • cngal-data-format.py: converts exported CnGal entries to match the VNDB format. Requires the JSON exported from the CnGal data page.
  • diff-cngal-vndb.py: compares CnGal data with the existing Chinese VNs on VNDB and splits the results into groups for later checking.
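As an illustration of the formatting step, here is a minimal sketch of reshaping one exported record into a VNDB-like shape. The key names on both sides (name, steamId, publishTime, title, steam_id, released) are assumptions made for the example; the real CnGal export and the real script may use different keys.

```python
import json

# Hypothetical mapping from CnGal export keys to VNDB-like keys.
KEY_MAP = {"name": "title", "steamId": "steam_id", "publishTime": "released"}


def to_vndb_shape(entry: dict) -> dict:
    """Rename keys and trim timestamps so both data sets can be compared."""
    out = {new: entry.get(old) for old, new in KEY_MAP.items()}
    # Keep only the date part of an ISO timestamp (assumed input format).
    if isinstance(out["released"], str):
        out["released"] = out["released"][:10]
    return out


raw = json.loads(
    '[{"name": "示例游戏", "steamId": 123, "publishTime": "2021-05-01T00:00:00"}]'
)
formatted = [to_vndb_shape(e) for e in raw]
```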

Usage

Easy Way

Install Python and GNU Make, clone the repo, and run make. That is all; check output for the results. To clean up the data and start over, run make clean.

Vanilla Way

pip install -r requirements.txt

# Format CnGal data
python cngal-data-format.py

# Get VNDB data
# Get only Steam releases
python zh-rel-on-vndb.py -p 7 -s 1
# Get all zh-Hans & zh-Hant releases
python zh-rel-on-vndb.py -p 14 -s 0

# Diff CnGal & VNDB data
# Perform a fuzzy comparison
python diff-cngal-vndb.py -m 75 -n 50

Output

  • cngal-release-*: formatted CnGal data
  • vndb-release-*: formatted VNDB data
  • miss-*: CnGal entries missing from VNDB; add these first
  • fuzzy-*: CnGal entries possibly missing from VNDB; check these next
  • match-*: CnGal entries that already exist on VNDB; verify these last
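One way the miss / fuzzy / match split could be produced is with two similarity thresholds. The sketch below uses difflib from the standard library; the function names are hypothetical, and the default thresholds 75 and 50 are borrowed from the example command above on the assumption that -m and -n are an upper and a lower bound, which is not confirmed here. The actual script may use a different similarity measure.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity on a 0-100 scale."""
    return 100 * SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def classify(name: str, vndb_titles: list[str],
             upper: float = 75.0, lower: float = 50.0):
    """Return (bucket, best_score) for one CnGal name."""
    best = max((similarity(name, t) for t in vndb_titles), default=0.0)
    if best >= upper:
        return "match", best   # very likely already on VNDB
    if best >= lower:
        return "fuzzy", best   # needs a manual check
    return "miss", best        # probably missing from VNDB
```

Keeping the best score next to the bucket makes it easy to review the borderline cases first.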

Todo

  • Add glob support in cngal-data-format.py
  • Make Steam filter optional in zh-rel-on-vndb.py for better fuzzy finding
  • Add Makefile
  • Use VNDB database dump for more complete matching (or update filters wisely)
  • Sort fuzzy output in descending order of similarity
  • Make metadata more informative
  • Support more metadata like producer, staff & character
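For the sorting item in the list above, a minimal sketch, assuming each fuzzy result is a dict with a numeric score key (a hypothetical layout, not the current output format):

```python
# Hypothetical fuzzy results; the keys are assumptions for illustration.
fuzzy = [
    {"cngal": "Game A", "vndb": "Game A+", "score": 62.5},
    {"cngal": "Game B", "vndb": "Game Bee", "score": 71.0},
    {"cngal": "Game C", "vndb": "Gaem C", "score": 55.0},
]

# Highest similarity first, so the most likely duplicates are reviewed first.
fuzzy_sorted = sorted(fuzzy, key=lambda r: r["score"], reverse=True)
```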

Data from CnGal and VNDB is covered by their respective licenses; look them up on each website or source code repository. The scripts included in this repo are licensed under MIT.

Contributors

vinfall
