Comments (4)
Hi, I'd like to ask which techniques can be used to implement mathematical formula recognition.
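This project is clearly out of date. For recent progress, see my blog post https://chungkwong.cc/crohme.html, as well as the reports of past CROHME competitions and the papers that cite them. In short, the mainstream approach now models the task as an image-to-text problem (an image mapped to a sequence of LaTeX tokens) and solves it with an encoder-decoder neural network. The encoder is usually a backbone network based on a CNN (especially DenseNet variants) or ViT, while the decoder is typically an autoregressive RNN (with some form of attention mechanism) or a Transformer decoder. The key to improving accuracy is data augmentation.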
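To make the "image to a sequence of LaTeX tokens" framing concrete, here is a minimal sketch of how a LaTeX string might be turned into the target sequence that a decoder is trained to emit. The tokenization rule, the special tokens `<pad>`, `<sos>`, `<eos>`, and the function names are all assumptions for illustration; real systems each define their own vocabulary.

```python
import re
from typing import Dict, Iterable, List

# Assumed rule: a token is a control word (\frac), a control symbol (\{),
# or any single non-whitespace character. Whitespace is dropped.
_TOKEN_RE = re.compile(r"\\[A-Za-z]+|\\.|\S")


def tokenize_latex(src: str) -> List[str]:
    """Split a LaTeX string into tokens using the assumed rule above."""
    return _TOKEN_RE.findall(src)


def build_vocab(formulas: Iterable[str]) -> Dict[str, int]:
    """Assign an integer id to every token seen in the training formulas."""
    vocab = {"<pad>": 0, "<sos>": 1, "<eos>": 2}  # hypothetical special tokens
    for formula in formulas:
        for tok in tokenize_latex(formula):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(src: str, vocab: Dict[str, int]) -> List[int]:
    """Produce the id sequence an autoregressive decoder would be trained on."""
    ids = [vocab[tok] for tok in tokenize_latex(src)]
    return [vocab["<sos>"]] + ids + [vocab["<eos>"]]


if __name__ == "__main__":
    vocab = build_vocab([r"\frac{a}{b} + x^2"])
    print(tokenize_latex(r"\frac{a}{b} + x^2"))
    print(encode(r"x^2", vocab))
```

The decoder then predicts these ids one at a time, each conditioned on the image features and the previously emitted tokens.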
from mathocr.
Thanks for the reply. I have read that post. Where can I find the related papers, and which technical approach does iFLYTEK use?
The paper for another formula recognition competition, the ICDAR 2023 Competition on Recognition of Multi-line Handwritten Mathematical Expressions, describes iFLYTEK's winning system as follows: “iFLYTEK-OCR team uses an encoder-decoder architecture that formulates HMER as an image-to-sequence translation problem. Specifically, the Conv2Former is employed as the image encoder, and a bi-directional trained Transformer decoder with Attention Refinement Module is utilized as the latex sequence decoder. A Beam Search Ensemble is proposed to ensemble the models trained with different sizes of characters. Specifically, at each decoding step, probability distributions produced by all member models are averaged by certain weights, and the top-k candidate characters to be output are decided by the averaged probability distribution. As for the data augmentation, blur, random, color jitter, scale, and TIA Transform are applied to improve the generalization ability of the model.” iFLYTEK has published quite a few papers on mathematical formula recognition over the past few years, and they are easy to find by searching. Papers on mathematical formula recognition generally cite the CROHME 2016 paper, so following the citations from there will lead you to the important papers in this field.
Got it, thank you so much!