rapidai / rapidstructure
Layout analysis | Table recognition | Document orientation classification
License: Apache License 2.0
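Hello, I found your work under a paddleocr issue. Honestly, paddleocr's messy codebase has worn me out. Your lightweight version looks good. Is there example code for layout reconstruction? The flowchart in the README is very clear, and having the matching code would make it even more convenient.
Would recognition work better if the table were segmented first?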
How to use GPU for inference?
Is the layout model from PP-Structurev2-layout or from PP-Structurev1?
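For the GPU question, here is a minimal sketch using plain onnxruntime. It assumes the models are loaded through `onnxruntime.InferenceSession` and that the `onnxruntime-gpu` package is installed so that `CUDAExecutionProvider` is available. Whether RapidStructure itself exposes a switch for this is not stated in this thread, and `pick_providers`, `create_session` and `model_path` are hypothetical names used only for illustration.

```python
from typing import List


def pick_providers(available: List[str]) -> List[str]:
    """Prefer CUDA when it is available and always keep CPU as a fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]


def create_session(model_path: str):
    """Create an ONNX Runtime session on GPU if possible (model_path is hypothetical)."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

If `ort.get_available_providers()` does not list `CUDAExecutionProvider`, the session silently runs on CPU, so printing that list is a quick first check.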
Please describe in detail the new function or new feature you want to add
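rapidocr_onnxruntime feels much slower than rapidocr_json. It would be best if users could switch between onnx and json.
Also, is there a demo that combines all three steps: orientation first, then layout, then table?
Could this table detection model and recognition model be converted to ONNX and added to RapidStructure? https://github.com/microsoft/table-transformer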
Documentation error: # bbox: [top-left x0, top-left y0, bottom-right x0, top-right x1]
It should be: # bbox: [top-left x0, top-left y0, bottom-right x1, bottom-right y1]
https://github.com/RapidAI/RapidStructure/blob/main/docs/README_Layout.md
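A short sketch of what the corrected convention means in practice, assuming each layout result is a dict with a `bbox` in `[x0, y0, x1, y1]` order and a `label`, as in the printed outputs elsewhere on this page. The helper names are hypothetical and the sample values are copied from one of those outputs.

```python
from typing import Dict, List, Sequence, Tuple


def to_int_box(bbox: Sequence[float]) -> List[int]:
    """Round a [x0, y0, x1, y1] box (top-left, bottom-right) to integer pixels."""
    x0, y0, x1, y1 = bbox
    return [round(x0), round(y0), round(x1), round(y1)]


def box_size(bbox: Sequence[float]) -> Tuple[float, float]:
    """Return (width, height) of a [x0, y0, x1, y1] box."""
    x0, y0, x1, y1 = bbox
    return x1 - x0, y1 - y0


def boxes_with_label(results: List[Dict], label: str) -> List[List[int]]:
    """Collect the integer boxes of every result carrying the given label."""
    return [to_int_box(r["bbox"]) for r in results if r["label"] == label]


sample = [
    {"bbox": [56.84690797, 234.85342862, 565.57721203, 491.91314587], "label": "table"},
    {"bbox": [60.03535906, 91.66820167, 249.37660473, 101.83736896], "label": "title"},
]
table_boxes = boxes_with_label(sample, "table")
```

With a NumPy image, a box in this order would be cropped as `img[y0:y1, x0:x1]`, rows first.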
(python) C:\Users\jft\RapidStructure>python rapid_main.py
[{'bbox': array([321.4160495 , 91.53214898, 562.06141263, 199.85522603]), 'label': 'text'}, {'bbox': array([ 58.67292211, 107.29000663, 300.25448676, 199.68142785]), 'label': 'text'}, {'bbox': array([321.70215978, 696.30348399, 561.81311168, 804.21204653]), 'label': 'text'}, {'bbox': array([ 56.13823644, 662.91114981, 305.76807089, 722.45002322]), 'label': 'text'}, {'bbox': array([ 60.03535906, 91.66820167, 249.37660473, 101.83736896]), 'label': 'title'}, {'bbox': array([ 61.19376827, 730.4047943 , 298.66428215, 787.99057095]), 'label': 'figure'}, {'bbox': array([ 80.07031778, 793.84156441, 278.21548591, 803.10274913]), 'label': 'figure_caption'}, {'bbox': array([355.79846928, 664.8275131 , 527.14342833, 689.68965201]), 'label': 'figure_caption'}, {'bbox': array([ 56.84690797, 234.85342862, 565.57721203, 491.91314587]), 'label': 'table'}, {'bbox': array([ 56.29500252, 525.46281841, 566.19082647, 647.91365501]), 'label': 'table'}, {'bbox': array([168.15167659, 205.71483414, 453.00579995, 232.4013521 ]), 'label': 'table_caption'}, {'bbox': array([230.31146447, 499.74774696, 455.9769955 , 526.09080024]), 'label': 'table_caption'}, {'bbox': array([356.1792743 , 664.69523244, 526.8971846 , 689.50478026]), 'label': 'table_caption'}]
[[ 57 235 566 492]
[ 56 525 566 648]]
Traceback (most recent call last):
  File "C:\Users\jft\RapidStructure\rapid_main.py", line 106, in <module>
    test_input()
  File "C:\Users\jft\RapidStructure\rapid_main.py", line 94, in test_input
    table_html_str, _ = table_engine(cropped_img, ocr_result)
    ^^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 2)
I used the image from the test folder.
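The `ValueError` above means the call returned more than two values while the script unpacks exactly two. What the installed `table_engine` actually returns is not shown in this thread, so the following is only a sketch of a tolerant unpacking pattern. `fake_table_engine` is a hypothetical stand-in, not the real engine, and its return values are made up.

```python
def first_and_rest(result):
    """Split a tuple-like return value into its first item and the remainder."""
    first, *rest = result
    return first, rest


def fake_table_engine(img, ocr_result):
    """Hypothetical stand-in that returns three values instead of two."""
    return "<table></table>", [[0, 0, 1, 1]], 0.01


# Keeps working whether the engine returns two, three, or more values.
table_html_str, extras = first_and_rest(fake_table_engine(None, None))
```

Printing the length of the real return value once would show how many items the installed version produces, after which the unpacking can be written explicitly.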
Please provide the following information to quickly locate the problem
Examples of incorrect segmentation are shown below:
{'bbox': array([ 10.66207372, 2231.28837461, 1073.79497789, 2344.79886966]), 'label': 'text'}
{'bbox': array([ 500.75091953, 12.21875445, 1613.09947578, 70.05863229]), 'label': 'text'}
{'bbox': array([ 7.98665632, 2364.51628573, 2081.99992877, 2901.74100257]), 'label': 'figure'}
{'bbox': array([2.87512385e-01, 7.39058257e+02, 2.08199993e+03, 2.22017507e+03]), 'label': 'table'}
{'bbox': array([ 4.60234904, 334.93956088, 2062.68867985, 693.52744136]), 'label': 'table'}
{'bbox': array([2.03777382e+00, 8.42523906e+01, 2.05676338e+03, 4.27240529e+02]), 'label': 'table'}
{'bbox': array([ 84.74693863, 712.07451807, 846.26481067, 750.32085518]), 'label': 'table_caption'}
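One way to spot suspicious results like these is to check whether same-label boxes overlap. The sketch below computes intersection over union for `[x0, y0, x1, y1]` boxes. It is a generic diagnostic written for illustration, not something the project is known to ship, and the box values are copied from the list above.

```python
def area(box):
    """Area of a [x0, y0, x1, y1] box, clamped at zero for degenerate boxes."""
    return max(box[2] - box[0], 0.0) * max(box[3] - box[1], 0.0)


def iou(a, b):
    """Intersection over union of two [x0, y0, x1, y1] boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(w, 0.0) * max(h, 0.0)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


table_big = [2.87512385e-01, 7.39058257e+02, 2.08199993e+03, 2.22017507e+03]
table_mid = [4.60234904, 334.93956088, 2062.68867985, 693.52744136]
table_top = [2.03777382e+00, 8.42523906e+01, 2.05676338e+03, 4.27240529e+02]

# table_mid and table_top share the vertical band from y of about 335 to 427.
overlap_mid_top = iou(table_mid, table_top)
overlap_mid_big = iou(table_mid, table_big)
```

A nonzero value between two boxes that share the `table` label is a hint that one region was split or detected twice.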
Please describe in detail the new function or new feature you want to add
When outputting the table HTML, how can the cell tags include coordinates?
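Hello, sorry to bother you. I trained a layout analysis model following the procedure specified by paddlestructure and obtained the inference model. I then converted it to ONNX with paddle2onnx and imported it into this project. At inference time I checked the inputs and outputs:
Inputs: ['image', 'scale_factor']
Outputs: ['multiclass_nms3_0.tmp_0', 'multiclass_nms3_0.tmp_2']
The model file you provide has these inputs and outputs:
Inputs: ['image']
Outputs: ['transpose_0.tmp_0', 'transpose_2.tmp_0', 'transpose_4.tmp_0', 'transpose_6.tmp_0', 'transpose_1.tmp_0', 'transpose_3.tmp_0', 'transpose_5.tmp_0',
'transpose_7.tmp_0']
What other steps are needed during the ONNX conversion so that the model is fully compatible with this project?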
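The two signatures can be compared programmatically. The sketch below lists input and output names via onnxruntime's `get_inputs()` and `get_outputs()`, and flags outputs whose names contain `nms`. Reading such a name as "post-processing is baked into the graph" is an assumption based only on the names quoted above, and both helper functions are hypothetical.

```python
def io_names(session):
    """Return (input_names, output_names) of an onnxruntime InferenceSession."""
    inputs = [node.name for node in session.get_inputs()]
    outputs = [node.name for node in session.get_outputs()]
    return inputs, outputs


def has_embedded_nms(output_names):
    """Name-based heuristic: does any output look like an NMS result?"""
    return any("nms" in name.lower() for name in output_names)


# Output names quoted in the question above.
custom_outputs = ["multiclass_nms3_0.tmp_0", "multiclass_nms3_0.tmp_2"]
project_outputs = ["transpose_0.tmp_0", "transpose_2.tmp_0", "transpose_4.tmp_0",
                   "transpose_6.tmp_0", "transpose_1.tmp_0", "transpose_3.tmp_0",
                   "transpose_5.tmp_0", "transpose_7.tmp_0"]

custom_has_nms = has_embedded_nms(custom_outputs)
project_has_nms = has_embedded_nms(project_outputs)
```

If the heuristic is right, the custom export ends in an NMS stage while the project's model stops at raw head outputs, so the difference would come from the export settings rather than from paddle2onnx itself.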