dhavalpotdar / graph-convolution-on-structured-documents Goto Github PK
View Code? Open in Web Editor NEWThis repo contains code to convert Structured Documents to Graphs and implement a Graph Convolution Neural Network for node classification
This repo contains code to convert Structured Documents to Graphs and implement a Graph Convolution Neural Network for node classification
Hi,
Amazing work to understand structure documents. Is it possible to extract actual value for given entity. For example invoice #, total amount, company name, etc...
Thank you
See here
graph_dict = {}
for src_id, row in df.iterrows():
if row['below_obj_index'] != -1:
graph_dict[src_id] = [row['below_obj_index']]
if row['side_obj_index'] != -1:
graph_dict[src_id] = [row['side_obj_index']]
Hello, first of all thank you for your valuable work, i wanted to ask you on how should i proceed after generating the graph png image and the connections.csv file, how do i feed that to the Graph Convolutional Neural Network. Thanks in a dvance.
df['below_object'] = df.loc[nearest_dest_ids_vert, 'Object'].values
Because of this line I am getting this error. Any idea for avoiding this error ?
Hello ! I.m very interesting in your research. Nice work !
when I run model.compile(), the flow arror generate:
assert adj_list[0].shape[0] == self.w0.shape[0], f'The number of rows
AttributeError: 'Adjacency' object has no attribute 'w0'
Hello,
After formation of adjacency and feature matrix, we are looking to train our model but for that, how can we include the labels and how will it give the data extraction.
Kindly explain this query.
Thanks in advance.!
@dhavalpotdar
In some of the images, the dimension of the adjacency matrix is not matching with the no of words in the image. For example -
Suppose the image has 91 words then the adjacency matrix was of shape (90, 90) instead of (91, 91).
One word/node is cut off from the adjacency matrix.
I try to run the code grapher.py, and the next line
# ==================== vertical ===================================== #
# create df for plotting lines
df['below_object'] = df.loc[nearest_dest_ids_vert, 'Object'].values
gives the next error
KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Int64Index([-1, -1, -1, -1, -1, -1], dtype='int64'). See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"
What is the solution?
the code A, X = graph.make_graph_data(graph_dict, text_list)
is throwing an assertion error
Expected type <class 'str'>. Received <class 'float'>
:
graph_dict appears to be having float values: {0: [4.0], 4: [5.0], 5: [1.0]}
After debugging all the open issues error this appears to be the last one occuring at the last line of the code. Will appreciate anyhelp to run that Grapher.py successfully.
can you please tell how to create object_map.csv for images if want use Pytesseract only instead of Commercial OCR techniques.
when i run the model.py i got this error. could you please help to solve this?
By any chance, has anyone managed to train and test this model?
Will you please help me out with training script!
The model will be able to extract address, buyers name, amount, due date etc from the invoice?
Thanks in advance!
How many classes were you used to fix that error?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.