hiertext

A PyTorch implementation of Unified Detector for scene text detection and layout analysis.

I am still working on it in my spare time, and any advice or discussion is welcome!

update

2023-8-24: training on HierText was not successful; the model outputs random strip-like shapes. Training the MaxDeepLab model (from Max Deeplab) on the COCO Panoptic 2017 dataset was not successful either. I will keep working on it.

purpose of this project

This project is for personal interest and self-study.

The Unified Detector model is originally from Google's TensorFlow project Unified Detector. This project tries to build a similar model in PyTorch for further study in my spare time, inspired by a PyTorch implementation of Max Deeplab.

TaskList

  • added a paragraph head to original maxdeeplab net
  • defined a paragraph grouping loss
  • modified loss computing code to support both raw and balanced style loss.
  • modified original maxdeeplab code to solve some OOM issue
  • defined a simple Hiertext dataset for torch dataloader
  • verify the model
  • train the model on Hiertext dataset

reference

remark

  • the dims in the original project config need a huge amount of memory/framebuffer. E.g. the mask instance output has shape [256, 384, 1024, 1024]; in fp32 it takes 256 x 384 x 1024 x 1024 x 4 bytes = 384 GiB alone. Since the best device I can access is an NVIDIA GPU with 32 GB of framebuffer, I have to reduce the number of masks to under 40 and limit the batch size to 4.
  • there are some code changes to reduce memory consumption; hopefully they lead to the same results. E.g. use matmul to replace a reduce-sum over an expanded-dim elementwise multiply, and index the matched masks instead of multiplying by the full matching matrix.
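
The memory figure in the first remark can be reproduced with a few lines of plain Python. This is only a rough sketch: the helper name `tensor_gib` is made up for illustration, reading the shape as [batch, masks, height, width] is an assumption, and so is the reduced shape (batch size 4, 40 masks as an upper bound, resolution left at 1024 x 1024). It counts a single dense fp32 tensor, not gradients or other activations.

```python
from math import prod

GIB = 1024 ** 3  # bytes per GiB


def tensor_gib(shape, bytes_per_element=4):
    """Size in GiB of one dense tensor (hypothetical helper, fp32 by default)."""
    return prod(shape) * bytes_per_element / GIB


# Shape quoted in the remark; assumed layout: [batch, masks, height, width].
original = tensor_gib((256, 384, 1024, 1024))

# Assumed reduced setting: batch size 4, 40 masks, same resolution.
reduced = tensor_gib((4, 40, 1024, 1024))

print(f"original mask output: {original:.0f} GiB")
print(f"reduced mask output:  {reduced:.3f} GiB")
```

Under these assumptions the mask output alone drops from 384 GiB to well under 1 GiB, which leaves the rest of the 32 GB for the backbone, gradients, and the other heads.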