Comments (12)
You can find this function in the commode-utils package. You can also install it with pip.
from embeddings-for-trees.
Hi
Can you please help me with this.
Thanks.
Hi!
.ckpt files correspond to model checkpoints with weights.
If you want to extract embeddings, you need to write some code :)
- First of all, you need to load the checkpoint. See this example.
- Since you don't need the decoder, extract the embedding and encoder modules:
node_emb = model._TreeLSTM2Seq__embedding
tree_enc = model._TreeLSTM2Seq__encoder
Sorry for the inconvenient code (these attributes are private, so Python's name mangling requires the _TreeLSTM2Seq__ prefix); I plan to provide a cleaner interface.
- Retrieve the tree embeddings:
batched_trees.ndata["x"] = self.__embedding(batched_trees)
encoded_nodes = self.__encoder(batched_trees)
batched_encoded_nodes, mask = cut_into_segments(
encoded_nodes, batched_trees.batch_num_nodes(), False
)
- Now you have the embeddings of all nodes in each tree; you can aggregate them, for example by taking the mean.
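The aggregation in the last step can be sketched as below. This is only an illustration on plain Python lists, not code from the library: the function name mean_pool_trees is hypothetical, and it assumes a [batch size; max tree size; hidden size] layout with a mask that is True at padded positions. Check the actual output of cut_into_segments before relying on either assumption.

```python
from typing import List


def mean_pool_trees(
    batched_nodes: List[List[List[float]]],
    padding_mask: List[List[bool]],
) -> List[List[float]]:
    """Average the node vectors of each tree, skipping padded slots.

    Assumed layout (not confirmed by the library):
    batched_nodes[tree][node] is a hidden vector and
    padding_mask[tree][node] is True where the slot is padding.
    """
    pooled = []
    for nodes, mask in zip(batched_nodes, padding_mask):
        # Keep only the vectors that belong to real nodes.
        real = [vec for vec, is_pad in zip(nodes, mask) if not is_pad]
        if not real:
            raise ValueError("tree has no real nodes")
        hidden = len(real[0])
        pooled.append(
            [sum(vec[i] for vec in real) / len(real) for i in range(hidden)]
        )
    return pooled


# Two trees: the first has 2 nodes, the second has 1 node and 1 padded slot.
nodes = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [0.0, 0.0]],
]
mask = [[False, False], [False, True]]
tree_embeddings = mean_pool_trees(nodes, mask)
print(tree_embeddings)  # [[2.0, 3.0], [5.0, 6.0]]
```

Skipping the padded slots matters: averaging over them would pull the embeddings of small trees toward zero.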
Thanks a lot for the detailed explanation. I'll try the steps above and let you know.
Sorry to disturb you again.
batched_encoded_nodes, mask = cut_into_segments(
encoded_nodes, batched_trees.batch_num_nodes(), False
)
Where should I put this code snippet?
Put it right after computing encoded_nodes.
To speed up computation, all trees in the batch are collated into a single batched graph. This function cuts that graph back apart, so you get the encoded nodes for each tree in the batch, plus a mask for slicing them properly.
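To make the idea concrete, here is a rough sketch of what such a cutting step does. It is not the implementation from the library: cut_into_segments_sketch is a hypothetical name, it works on plain lists instead of tensors, it ignores the third argument of the real function, and it assumes that the mask marks padded positions with True.

```python
from typing import List, Tuple

Vector = List[float]


def cut_into_segments_sketch(
    flat_nodes: List[Vector],
    nodes_per_tree: List[int],
) -> Tuple[List[List[Vector]], List[List[bool]]]:
    """Split a flat list of node vectors into per-tree rows.

    Every row is padded with zero vectors up to the size of the
    largest tree; the mask is True at padded positions (an assumption).
    """
    if sum(nodes_per_tree) != len(flat_nodes):
        raise ValueError("tree sizes must add up to the number of nodes")
    hidden = len(flat_nodes[0]) if flat_nodes else 0
    max_size = max(nodes_per_tree, default=0)
    segments, mask = [], []
    start = 0
    for size in nodes_per_tree:
        tree_nodes = flat_nodes[start:start + size]
        pad = max_size - size
        segments.append(tree_nodes + [[0.0] * hidden for _ in range(pad)])
        mask.append([False] * size + [True] * pad)
        start += size
    return segments, mask


# Five nodes in total: the first tree owns 2 of them, the second owns 3.
flat = [[1.0], [2.0], [3.0], [4.0], [5.0]]
segments, mask = cut_into_segments_sketch(flat, [2, 3])
print(segments)  # [[[1.0], [2.0], [0.0]], [[3.0], [4.0], [5.0]]]
print(mask)      # [[False, False, True], [False, False, False]]
```

The per-tree sizes play the role of batched_trees.batch_num_nodes() in the snippet above.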
The cut_into_segments function is undefined. Do I need to write it myself, or is it provided by some library?
Thank you.
Hi,
I created vector embeddings for Java files.
Basically, I want to train a model for a program repair task: it takes the embeddings of a buggy line as input and learns to generate the fixed line as output.
I gave a buggy line as input and created embeddings as shown below. But I'm confused about how to convert these vectors back into source code. Which value in the vectors corresponds to which token?
Sorry for the silly question, but any pointers would be really helpful.
Thanks.
Wow, it is quite surprising for me that this even works :)
I use PyTorch as the framework to train the model, and you are using TensorFlow functions to operate on torch tensors...
Answering your questions:
- You may see that encoded_nodes is passed into the decoder module. This module generates a sequence of output_length size. output_logits is a matrix with the shape [sequence length; batch size; vocab size]. So, if you want to generate some code, you should: (1) create a vocabulary with the possible tokens to generate (label vocabulary), (2) pass the required output length to the forward method.
- The shape of batched_encoded_nodes should be something like [batch size; size of max tree in the batch]. The first vector corresponds to the root, and the following vectors correspond to the remaining nodes in the order they are defined in the tree.
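As a rough illustration of how output_logits relates to tokens, the sketch below picks the highest-scoring vocabulary entry at every step (greedy decoding). This is one simple choice made for the example, not necessarily what the library does; the function greedy_decode and the toy vocabulary are hypothetical, and plain nested lists stand in for a tensor of shape [sequence length; batch size; vocab size].

```python
from typing import List


def greedy_decode(
    output_logits: List[List[List[float]]],
    id_to_token: List[str],
) -> List[List[str]]:
    """Turn logits into tokens by taking the argmax at every step.

    output_logits[step][sample][token_id] is a score; the result
    holds one list of tokens per sample in the batch.
    """
    if not output_logits:
        return []
    batch_size = len(output_logits[0])
    decoded: List[List[str]] = [[] for _ in range(batch_size)]
    for step_logits in output_logits:
        for sample_idx, scores in enumerate(step_logits):
            # Index of the largest score is the predicted token id.
            best_id = max(range(len(scores)), key=scores.__getitem__)
            decoded[sample_idx].append(id_to_token[best_id])
    return decoded


# Hypothetical label vocabulary: a token's position is its id.
vocab = ["<PAD>", "int", "x", ";"]
# 3 decoding steps, batch of 2 samples, 4 vocabulary entries.
logits = [
    [[0.1, 2.0, 0.3, 0.0], [0.0, 0.1, 3.0, 0.2]],
    [[0.0, 0.1, 1.5, 0.2], [0.0, 0.2, 0.1, 4.0]],
    [[0.0, 0.1, 0.2, 2.5], [5.0, 0.1, 0.2, 0.3]],
]
tokens = greedy_decode(logits, vocab)
print(tokens)  # [['int', 'x', ';'], ['x', ';', '<PAD>']]
```

So individual values in an embedding vector do not map to tokens; tokens come from the decoder's scores over the label vocabulary.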
Alright, thanks for the help. I'll try the points above.
I'm closing this due to inactivity, but feel free to reopen it or create a new issue if you have any questions!