Japanese Handwriting Classification with Fragment Shaders

NOTE: This was built and tested with Unity 2019.4.31f1 using the built-in render pipeline; there may be shader compatibility issues with other versions.

Table of Contents

  • Overview
  • ConvMixer
  • Problems
  • Setup
  • Python, C++ Code
  • Model Architecture
  • Resources
  • Datasets

Overview

Japanese handwriting classification using a new, compact convolution model converted into fragment shaders for use inside VRChat or Unity. This is a general image classification system and does not use stroke order to predict characters. It recognizes up to 3225 Japanese characters.

This is part of my Language Translation System where it will serve as input to the Japanese translation model.

ConvMixer

The name comes from the paper Patches Are All You Need, which mixes ideas from Vision Transformers into a pure convolutional network.

The basic idea is to break the input image into patches and run them through layers of depth-wise and point-wise convolutions. But unlike standard convolutional models, the ConvMixer architecture does not downsample between successive layers; every layer keeps the same spatial resolution.

This JP handwriting classification network uses ConvMixer-144/4, with a kernel size of 5.
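For reference, a ConvMixer of this size looks roughly like the following in Keras, mirroring the tutorial linked in the Python section below. This is a minimal sketch rather than the exact training code; in particular, the patch size of 2 and the single grayscale input channel are assumptions not stated in this README.

import tensorflow as tf
from tensorflow.keras import layers

def conv_mixer_block(x, filters, kernel_size):
    # Depth-wise convolution with a residual connection (spatial mixing)
    residual = x
    x = layers.DepthwiseConv2D(kernel_size=kernel_size, padding="same")(x)
    x = layers.Activation("gelu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([x, residual])
    # Point-wise (1x1) convolution (channel mixing)
    x = layers.Conv2D(filters, kernel_size=1)(x)
    x = layers.Activation("gelu")(x)
    x = layers.BatchNormalization()(x)
    return x

def build_conv_mixer(image_size=64, filters=144, depth=4, kernel_size=5,
                     num_classes=3225, patch_size=2):
    inputs = tf.keras.Input((image_size, image_size, 1))
    # Patch embedding: a strided convolution breaks the image into patches
    x = layers.Conv2D(filters, kernel_size=patch_size, strides=patch_size)(inputs)
    x = layers.Activation("gelu")(x)
    x = layers.BatchNormalization()(x)
    # No downsampling: every block keeps the same spatial resolution
    for _ in range(depth):
        x = conv_mixer_block(x, filters, kernel_size)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)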

Problems

  • The ETL Handwritten Japanese Character Database I used to train this did not cover all 3225 characters the translator can recognize. To fill in the missing characters, I used different Japanese fonts, but handwriting and fonts are not always the same.
  • The model overfits the training data too easily; I had to lower the learning rate and weight decay and add Dropout layers (see the sketch after this list).
  • Not OCR; it recognizes only one character at a time.
  • I made it a very simple network to keep performance high in VR.
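As a rough illustration of those regularization tweaks, the snippet below adds Dropout before the classification head and uses the AdamW optimizer (as in the Keras ConvMixer tutorial) with a reduced learning rate and weight decay. The actual values used to train the released model are not listed here, so treat the numbers as placeholders.

from tensorflow.keras import layers
import tensorflow_addons as tfa

def classification_head(x, num_classes=3225, dropout_rate=0.3):
    # Dropout added to fight overfitting; the rate is a placeholder
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(dropout_rate)(x)
    return layers.Dense(num_classes, activation="softmax")(x)

# Lowered learning rate and weight decay (placeholder values)
optimizer = tfa.optimizers.AdamW(learning_rate=1e-4, weight_decay=1e-5)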

Setup

  1. Download the latest .unitypackage from Releases and import it.
  2. Look in the Prefab folder and add JPHandwriteClassifier.prefab to your world or avatar.
  3. Check that the network works in Play Mode.

The class predictions are rendered to Textures\RenderTextures\OutputLayersBuffer.renderTexture. You can read them in a shader by including Shaders\ConvMixer\ConvMixerModel.cginc

#include "ConvMixerModel.cginc"

and assuming _RenderTexture contains OutputLayersBuffer.renderTexture:

// txTop1 through txTop5 come from ConvMixerModel.cginc and index the pixels
// holding the top-5 class predictions
Texture2D<float> _RenderTexture;

...

float firstPrediction = _RenderTexture[txTop1.xy];
float secondPrediction = _RenderTexture[txTop2.xy];
float thirdPrediction = _RenderTexture[txTop3.xy];
float fourthPrediction = _RenderTexture[txTop4.xy];
float fifthPrediction = _RenderTexture[txTop5.xy];

The mapping from predicted class indexes to Japanese characters can be found in jp_seq2text.tsv
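Outside of Unity, a predicted class index can be turned into its character along these lines. This is a minimal sketch that assumes jp_seq2text.tsv has one entry per line in class-index order, with the character in the last tab-separated field; verify the actual file layout before relying on it.

# Map a predicted class index to its character (assumed TSV layout, see above)
with open("jp_seq2text.tsv", encoding="utf-8") as f:
    index_to_char = [line.rstrip("\n").split("\t")[-1] for line in f]

predicted_index = 1234  # e.g. the top-1 value read back from OutputLayersBuffer
print(index_to_char[predicted_index])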

This is only needed if you want to edit the shaders; the prefabs should work out of the box.

Python, C++ Code

  • Python
    • https://keras.io/examples/vision/convmixer/
    • My Python code is a copy of the tutorial linked above, modified to spit out intermediate layer outputs and trained on a different dataset. It's better to follow the tutorial than to try to run mine.
    • Model weights in Keras .h5 format: here
    • It's weights only, so you'll need to set up the model like this before loading the weights (see the loading sketch after this list):
conv_mixer_model = get_conv_mixer_256_8(image_size=64,
                                        filters=144,
                                        depth=4,
                                        kernel_size=5,
                                        num_classes=3225)
  • C++
    • Requires stb_image.h
    • Written for Windows; for non-Windows platforms, remove the Windows.h include.
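Loading the released weights into the model built above is then a single call; the path below is a placeholder for wherever the downloaded .h5 file is saved.

# Placeholder path for the downloaded weights file
conv_mixer_model.load_weights("path/to/jp_handwriting_convmixer.h5")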

Model Architecture

Resources

Datasets

Thanks to orels1 for setting up VMs for me to train on.

If you have questions or comments, you can reach me on Discord: SCRN#8008 or Twitter: https://twitter.com/SCRNinVR
