wenzhengzhang / entqa
PyTorch implementation of the EntQA paper
License: MIT License
Hello, thank you for your very interesting work. I noticed that the final probability of the detected entities tends to decrease when I increase the number of candidates k, which sometimes makes the result less accurate at larger k because the probability falls below the threshold. Is this normal behavior? Could it be caused by the re-ranking function?
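One plausible explanation for the effect described above: if the final score is normalized over the candidate set (an assumption here, not confirmed from the repo code), then adding candidates redistributes probability mass, so the same top candidate receives a smaller probability at larger k even though its raw score is unchanged. A toy illustration under that softmax assumption:

```python
import math

# Toy illustration (assumption: detection scores are normalized with a
# softmax over the candidate set). The top candidate's raw score stays
# fixed, yet its normalized probability shrinks as candidates are added.
def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

top_k3 = softmax([5.0, 1.0, 1.0])[0]            # k = 3 candidates
top_k5 = softmax([5.0, 1.0, 1.0, 1.0, 1.0])[0]  # k = 5, two more low scorers
# top_k5 < top_k3: the same entity can now fall below a fixed threshold.
```

If this is what happens, a threshold tuned for one k will not transfer to another, which would match the behavior reported in this issue.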
Hi, thanks for your amazing work on EntQA! I am currently adapting it to a custom dataset, and I found the following issues in the data preprocessing script that may raise errors or cause incorrect behavior.
Lines 238 to 242 in 7b3cec5
This char2token function forgets the blank character between tokens when building the character-to-token mapping. To fix this, the increment should be char2token_list += [i] * (len(tok.replace("##", "")) + 1).
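To make the reported fix concrete, here is a minimal sketch of a character-to-token mapping with the suggested correction applied. This is an illustrative reconstruction, not the repo's actual function: it assumes WordPiece-style tokens (subwords prefixed with "##") and one blank character following each token.

```python
# Illustrative sketch of the corrected mapping (hypothetical helper, not
# the repository's exact code). For each token i, we emit i once per
# surface character, plus one extra entry (+1) for the blank character
# after the token -- omitting the +1 makes the mapping drift out of
# alignment with the original text, which is the bug reported above.
def char2token(tokens):
    char2token_list = []
    for i, tok in enumerate(tokens):
        char2token_list += [i] * (len(tok.replace("##", "")) + 1)
    return char2token_list
```

For example, `char2token(["hi", "there"])` maps three positions ("hi" plus its trailing blank) to token 0 and six positions to token 1.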
Lines 268 to 269 in 7b3cec5
This function takes two odd parameters, max_ent_length and pad_to_max_ent_length. I suspect they should be max_length=args.max_length and padding='max_length'.
Hope this helps future developers.
Hi, Wenzheng,
Thanks for your great work!
According to Table 1 and Table 3 in your paper, the test F1 and val F1 should be 85.8 and 87.5, respectively. But I got lower results as follows using the trained reader and reader inputs provided in this repo:
Test results:
{"pred_total": 4760, "gold_total": 4485, "strong_correct_num": 3902}
test recall 0.8700| test precision 0.8197 | test F1 0.8441 |
Val results:
{"pred_total": 5110, "gold_total": 4791, "strong_correct_num": 4323}
val recall 0.9023 | val precision 0.846 | val F1 0.8732 |
May I know whether the above results are reasonable, and how I can reproduce the results in your paper? Thanks!
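For reference, the precision/recall/F1 figures quoted above follow directly from the reported counts, assuming the usual strong-match metric (correct matches divided by predicted and gold totals). A minimal sketch:

```python
# Strong-match precision/recall/F1 from the counts reported in this issue.
# Note 2PR/(P+R) simplifies to 2*correct/(pred_total + gold_total).
def prf1(pred_total, gold_total, correct):
    precision = correct / pred_total
    recall = correct / gold_total
    f1 = 2 * correct / (pred_total + gold_total)
    return precision, recall, f1

# Test-set counts from above: pred_total=4760, gold_total=4485, correct=3902
p, r, f1 = prf1(4760, 4485, 3902)  # matches 0.8197 / 0.8700 / 0.8441
```

So the reported 84.41 test F1 is internally consistent with the counts; the open question is only why it falls short of the paper's 85.8.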
Hello!
I want to know how to generate the three files "aida-yago2-dataset-train.tsv", "aida-yago2-dataset-val.tsv", and "aida-yago2-dataset-test.tsv". Following the website you recommended, I can only generate a single file, "AIDA-YAGO2-dataset.tsv".
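One way to produce the three files is to split the combined TSV by the split tag embedded in each document header. In the AIDA-CoNLL release, validation documents carry a "testa" suffix and test documents a "testb" suffix in their `-DOCSTART-` ids; everything else is training data. A hedged sketch (the output filenames are the ones asked about above; whether they must match the repo's expected format exactly is not confirmed here):

```python
# Split AIDA-YAGO2-dataset.tsv into train/val/test files by the
# "testa"/"testb" tags in the -DOCSTART- headers of the AIDA-CoNLL format.
def split_aida(path):
    splits = {"train": [], "val": [], "test": []}
    current = "train"
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.startswith("-DOCSTART-"):
                # The header id tells us which split this document is in,
                # e.g. "-DOCSTART- (947testa ...)" or "(1163testb ...)".
                if "testa" in line:
                    current = "val"
                elif "testb" in line:
                    current = "test"
                else:
                    current = "train"
            splits[current].append(line)
    for name, lines in splits.items():
        with open(f"aida-yago2-dataset-{name}.tsv", "w", encoding="utf-8") as f:
            f.writelines(lines)
```

This keeps every document intact and only routes it to one of the three output files based on its header.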
Dear author, I am trying to run the GERBIL evaluation.
In the gerbil-SpotWrapNifWS4Test/pom.xml, I found this:
<!-- Let's use a local repository for the local libraries of this project -->
<repository>
    <id>local repository</id>
    <url>file://${project.basedir}/repository</url>
</repository>
And when I tried to configure the experiment, an error occurred because of missing artifacts in this local repository.
The error is as follows:
org.eclipse.aether.resolution.ArtifactResolutionException: The following artifacts could not be resolved: org.aksw:gerbil.nif.transfer:jar:1.1.0-SNAPSHOT, org.restlet:org.restlet:jar:2.2.1, org.restlet:org.restlet.ext.servlet:jar:2.2.1: Could not find artifact org.aksw:gerbil.nif.transfer:jar:1.1.0-SNAPSHOT in local repository (file:/data/luxy/EntQA/gerbil-SpotWrapNifWS4Test/repository)
How can I get the artifacts in the local repository?
How long did you wait for the GERBIL evaluation of EntQA on AIDA-testb?
I have been waiting for 9 hours without a single sentence being transferred to the annotator.
How can I solve this?
I configured everything following the instructions myself, but because I cannot access sites blocked by the firewall, errors keep occurring at various steps. Could you publish a tutorial video on Bilibili or another short-video platform to help people less familiar with this area run your code?
Hi,
I encounter an error while running the run_retriever.py script with the downloaded precomputed candidate embeddings.
This is what the program returns:
all_cands_embeds = np.load(args.cands_embeds_path)
File "/home/m/anaconda3/envs/entqa/lib/python3.8/site-packages/numpy/lib/npyio.py", line 440, in load
return format.read_array(fid, allow_pickle=allow_pickle,
File "/home/m/anaconda3/envs/entqa/lib/python3.8/site-packages/numpy/lib/format.py", line 787, in read_array
array.shape = shape
ValueError: cannot reshape array of size 2336706528 into shape (5903531,1024)
Would you be so kind as to help me understand what I am doing wrong? :)
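A likely cause of the error above is a truncated download: the .npy header promises a (5903531, 1024) array, i.e. 5,903,531 × 1,024 = 6,045,215,744 elements, but only 2,336,706,528 elements were actually read from disk. A hedged diagnostic sketch (a hypothetical helper, not part of the repo) that compares the size the header promises against the bytes actually present:

```python
import os
import numpy as np

# Read only the .npy header, then compare the file size the header implies
# (header bytes + shape * itemsize) with the actual size on disk. If the
# actual size is smaller, the download is incomplete and should be redone.
def check_npy_file(path):
    with open(path, "rb") as f:
        version = np.lib.format.read_magic(f)
        if version == (1, 0):
            shape, _, dtype = np.lib.format.read_array_header_1_0(f)
        else:
            shape, _, dtype = np.lib.format.read_array_header_2_0(f)
        data_start = f.tell()
    expected = data_start + int(np.prod(shape)) * dtype.itemsize
    actual = os.path.getsize(path)
    return expected, actual
```

If `actual < expected` for your cands_embeds file, re-downloading it (or resuming the download) should resolve the reshape error.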
Hello, really nice work!
I have a few questions/corrections/suggestions for reproducing the GERBIL results.