
Comments (6)

classicsong commented on August 22, 2024

Thank you for using DGL-KE.
For negative sampling with type constraints, we can set the seed edges of the EdgeSampler to only those edges that belong to the constrained edge types. The sampled positive edges will then contain only those edge types.
On the negative sampling side, we (the C++ sampler) just corrupt the positive edges (head/tail pairs) and combine them with negative heads/tails. The edge types are not changed at this point.
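As a rough illustration of the two points above (this is not DGL-KE's actual sampler code), the sketch below filters seed edges by relation type and then corrupts only the tail, leaving the relation untouched. The `(head, rel, tail)` triple layout and the helper names `select_seed_edges` and `corrupt_tails` are assumptions made for this example.

```python
import random

def select_seed_edges(triples, allowed_rels):
    """Return the indices of triples whose relation is in allowed_rels."""
    allowed = set(allowed_rels)
    return [i for i, (_, rel, _) in enumerate(triples) if rel in allowed]

def corrupt_tails(triples, seed_ids, num_entities, rng):
    """Replace the tail of each seed triple with a random entity id.

    The head and the relation are copied unchanged, mirroring the point
    that corruption does not alter edge types.
    """
    negatives = []
    for i in seed_ids:
        head, rel, _ = triples[i]
        negatives.append((head, rel, rng.randrange(num_entities)))
    return negatives

# Toy graph: 4 entities, 3 relation types (hypothetical data).
triples = [(0, 0, 1), (1, 1, 2), (2, 0, 3), (3, 2, 0)]
seed_ids = select_seed_edges(triples, allowed_rels=[0])
negatives = corrupt_tails(triples, seed_ids, num_entities=4, rng=random.Random(0))
```

Note that the random tail here is drawn from all entities, so a negative may happen to coincide with a true triple; this sketch does not filter such cases.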

It would be great if you could contribute this feature!

from dgl-ke.

zheng-da commented on August 22, 2024

Thanks for the feature request. This is definitely something we should support. DGL-KE does joint negative sampling for efficiency. That is, instead of creating negative edges for each positive edge independently, we corrupt the head/tail nodes of a group of edges all together and replace them with a new set of nodes randomly sampled from the graph. We need to extend joint negative sampling to the type-constrained setting, which means maintaining the head/tail entities for each relation type. Potentially, we also need to control the number of relations in a batch to achieve good efficiency.
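To make the idea of joint negative sampling concrete, here is a minimal sketch in plain Python. It is not the DGL-KE implementation; the function names, the `chunk_size` and `neg_sample_size` parameters, and the triple layout are assumptions for illustration. Each chunk of positive edges shares one randomly drawn set of negative tail nodes.

```python
import random

def joint_negative_sample(pos_edges, num_nodes, chunk_size, neg_sample_size, rng):
    """Corrupt tails chunk by chunk, sharing one negative node set per chunk.

    pos_edges is a list of (head, rel, tail) triples. Returns a list of
    (chunk, neg_tails) pairs; every edge in `chunk` is paired with every
    node in `neg_tails`.
    """
    out = []
    for start in range(0, len(pos_edges), chunk_size):
        chunk = pos_edges[start:start + chunk_size]
        # One draw of negative nodes is reused by the whole chunk.
        neg_tails = [rng.randrange(num_nodes) for _ in range(neg_sample_size)]
        out.append((chunk, neg_tails))
    return out

def expand(chunk, neg_tails):
    """Materialise the negative triples implied by one chunk."""
    return [(h, r, t_neg) for (h, r, _) in chunk for t_neg in neg_tails]

# Hypothetical toy data: 8 positive edges, 2 relation types.
pos_edges = [(i, i % 2, i + 1) for i in range(8)]
chunks = joint_negative_sample(pos_edges, num_nodes=100, chunk_size=4,
                               neg_sample_size=3, rng=random.Random(0))
neg_triples = expand(*chunks[0])
```

In this toy setup, sampling independently for each edge would need 8 × 3 = 24 random draws, while the joint version needs only 2 × 3 = 6, because each chunk reuses its negative nodes.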

asaluja commented on August 22, 2024

@classicsong @zheng-da thanks for the quick response! Yes, I agree that joint negative sampling is more efficient, so ideally doing joint negative sampling with type constraints would be best. There are probably other ways to do it - batching relations together and applying a special sampler for every relation type (only one sampler per batch) is one way to do it.

I imagine it will take some time for this to be added to the repo - meanwhile on my end, do you think the two-stage procedure suggested above (sampling positive edges first, then based on sampled relation types sample negative edges) is a good way or is there something easier? I spent some time familiarizing myself with your codebase and it seemed this was the easiest way to do it.
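For concreteness, the two-stage idea might look roughly like the sketch below. It is plain Python rather than code against the DGL-KE samplers, and `two_stage_sample` and the precomputed `tail_pool` mapping (relation id to allowed tail entities) are assumed names for this example.

```python
import random
from collections import defaultdict

def two_stage_sample(triples, tail_pool, batch_size, neg_sample_size, rng):
    """Stage 1: sample positive triples. Stage 2: per relation, draw negatives.

    tail_pool maps a relation id to the list of entity ids allowed as its
    tail (an assumed, precomputed type constraint).
    """
    # Stage 1: uniform sample of positive edges without replacement.
    batch = rng.sample(triples, batch_size)

    # Group positives by relation so each group can share one negative set.
    by_rel = defaultdict(list)
    for h, r, t in batch:
        by_rel[r].append((h, r, t))

    # Stage 2: one constrained draw of negative tails per relation present.
    neg_tails = {r: [rng.choice(tail_pool[r]) for _ in range(neg_sample_size)]
                 for r in by_rel}
    return dict(by_rel), neg_tails

# Hypothetical toy data: relation 0 has tails 10-12, relation 1 has tails 20-21.
triples = [(0, 0, 10), (1, 0, 11), (2, 1, 20), (3, 1, 21), (4, 0, 12)]
tail_pool = {0: [10, 11, 12], 1: [20, 21]}
by_rel, neg_tails = two_stage_sample(triples, tail_pool, batch_size=4,
                                     neg_sample_size=2, rng=random.Random(0))
```

Grouping by relation keeps the negatives shared within each group, so the cost grows with the number of distinct relations in the batch rather than with the number of edges.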

Thanks again for the great work.

zheng-da commented on August 22, 2024

@asaluja I agree that the two-stage procedure will work and it's something I have in mind as well. The main thing we need to take care of is how to combine this with joint negative sampling. We might need to control the number of relations in a batch so that joint negative sampling can be effective. Our experience is that if we reduce the number of relations in a batch, the performance of the trained embeddings drops. I think we need some experiments to balance computation efficiency and training speed. It'll be great if you can contribute this functionality. Please let us know if you have any questions about the current code base.

vardaan123 commented on August 22, 2024

Hi @zheng-da @asaluja, I have the same use case, i.e., sampling negative samples with constraints on the type of the head/tail entity. As you suggested, I set the seed edges to be the edges that belong to a particular edge type/relation. However, the EvalSampler (or dgl.contrib.sampling.EdgeSampler) corrupts the edges by randomly sampling a node for the head or tail position from the set of all entities (which includes both heads and tails). I want the corrupted head to be drawn only from the heads that appear in the seed edges (and similarly for tail corruption). Any suggestions on how this can be achieved? Thanks in advance.
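To pin down the behaviour I am after, here is a small plain-Python sketch. It does not use the DGL EdgeSampler API, and `build_pools` and `corrupt` are hypothetical helper names. Head candidates come only from heads seen in the seed edges, and tail candidates only from tails seen there.

```python
import random

def build_pools(seed_edges):
    """Collect the distinct heads and tails that occur in the seed edges."""
    heads = sorted({h for h, _, _ in seed_edges})
    tails = sorted({t for _, _, t in seed_edges})
    return heads, tails

def corrupt(edge, heads, tails, rng, corrupt_head):
    """Corrupt one side of an edge, drawing only from the matching pool."""
    h, r, t = edge
    if corrupt_head:
        return (rng.choice(heads), r, t)
    return (h, r, rng.choice(tails))

# Hypothetical seed edges, all of relation type 5.
seed_edges = [(0, 5, 10), (1, 5, 11), (2, 5, 10)]
heads, tails = build_pools(seed_edges)
rng = random.Random(0)
neg_h = corrupt(seed_edges[0], heads, tails, rng, corrupt_head=True)
neg_t = corrupt(seed_edges[0], heads, tails, rng, corrupt_head=False)
```

The sketch may draw the original head or tail again; a real sampler would probably need to reject or filter those cases.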

YijianLiu commented on August 22, 2024

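Hello @zheng-da @asaluja, I have the same use case, namely sampling negative samples with type constraints on the head/tail entity. As you suggested, I set the seed edges to the edges that belong to a particular edge type/relation. However, the EvalSampler (or dgl.contrib.sampling.EdgeSampler) corrupts edges by randomly sampling a node for the head or tail position from the set of all entities (including both heads and tails). I want the head to be one of the possible heads in the seed edges (and likewise for tail corruption). Any suggestions on how to achieve this? Thanks in advance.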

Hello, I think you have studied the code in detail, so I would like to ask you a question. When sampling, I see that pos_g has 1024 edges and neg_g also has 1024 edges, so each triplet appears to be corrupted once rather than k times as described in the paper. Is that right?
