
[BUG] Cannot import PDF files (paperlib, 6 comments, closed)

brant-ruan commented on June 8, 2024

[BUG] Cannot import PDF files


Comments (6)

GeoffreyChen777 commented on June 8, 2024

Hi, I cannot reproduce this issue. Can you provide the error notification?

I can import this paper into my library, but the metadata is wrong. The reason is that the paper carries a wrong DOI: https://doi.org/10.1145/nnnnnnn.nnnnnnn


brant-ruan commented on June 8, 2024

> Hi, I cannot reproduce this issue. Can you provide the error notification?
>
> I can import this paper to my lib, but the metadata is wrong. The reason is that this paper used a wrong DOI https://doi.org/10.1145/nnnnnnn.nnnnnnn

Thanks for pointing out the issue.

Using advanced search, I found another paper that contains https://doi.org/10.1145/nnnnnnn.nnnnnnn. This does not seem to be common, but whenever a published paper has not replaced this DOI, paperlib will treat it as the same paper, namely "TestRank: Bringing Order into Unlabeled Test Instances for Deep Learning Tasks" (NeurIPS).

In this special case, I cannot modify the DOI and rerun scraping, because paperlib will still read the DOI from the PDF and fetch metadata with it.


GeoffreyChen777 commented on June 8, 2024

For now, please manually edit the metadata of papers with such DOIs.

Tomorrow I have a conference deadline. After that, I will investigate and fix this issue ASAP.


brant-ruan commented on June 8, 2024

Thanks. Wishing you all the best for your paper's acceptance at the conference :-)


GeoffreyChen777 commented on June 8, 2024

@brant-ruan Hi, this issue has been fixed now.

I implemented an invalid-DOI check in the metadata server. However, we can currently only get the title and author list of this paper. I found that it is a very recent publication, and no database has a record of it yet.
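For readers wondering what an invalid-DOI check could look like, here is a minimal sketch. It is not the actual paperlib or metadata server code; the function names and the placeholder rule (a suffix made only of `n` or `x` characters, as in `10.1145/nnnnnnn.nnnnnnn`) are assumptions for illustration.

```python
import re

# Hypothetical helpers, not paperlib's real implementation.
# Loose DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")
# Assumed placeholder rule: the suffix is only n/x characters and dots.
PLACEHOLDER_SUFFIX = re.compile(r"^[nxNX]+(\.[nxNX]+)*$")


def normalize_doi(raw: str) -> str:
    """Strip whitespace and a leading resolver URL or 'doi:' prefix."""
    doi = raw.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi


def is_usable_doi(raw: str) -> bool:
    """Return False for malformed DOIs and for template placeholders."""
    doi = normalize_doi(raw)
    if not DOI_SHAPE.match(doi):
        return False
    suffix = doi.split("/", 1)[1]
    return PLACEHOLDER_SUFFIX.match(suffix) is None


if __name__ == "__main__":
    for candidate in (
        "https://doi.org/10.1145/nnnnnnn.nnnnnnn",  # placeholder from this issue
        "10.1000/182",                              # well-formed example DOI
        "not a doi",
    ):
        print(candidate, "->", is_usable_doi(candidate))
```

When the check fails, a scraper could skip the DOI lookup and fall back to title and author matching, which is consistent with only the title and author list being available for this paper.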

For conference papers, it is common to wait six months to a year before those databases record them. I usually collect the recently accepted papers in my own research field and insert them into the metadata server database manually, but I cannot do that for all research fields.

I'm thinking that it might be a good idea to create a GitHub repo that stores lists of publications and have the metadata server connect to it. Letting users submit a list of papers via a pull request should be acceptable.
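A rough sketch of how such a repo-backed list might be consumed is below. Everything here is hypothetical: the JSON layout, the field names, and the sample entry are made up for illustration, and no such repo or format is defined in this thread. Matching is done on a normalized title because recent papers often have no usable DOI.

```python
import json
import re

# Hypothetical file content that a contributor might submit in a pull request.
# The entry is invented sample data, not a real publication record.
SAMPLE_LIST = json.loads("""
[
  {
    "title": "An Example Paper: On Placeholder DOIs",
    "authors": ["A. Author", "B. Author"],
    "venue": "ExampleConf",
    "year": 2024
  }
]
""")


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def build_index(entries):
    """Map normalized title -> entry for constant-time lookup."""
    return {normalize_title(entry["title"]): entry for entry in entries}


def lookup(index, title):
    """Return the matching entry, or None if the list has no record."""
    return index.get(normalize_title(title))


index = build_index(SAMPLE_LIST)

if __name__ == "__main__":
    print(lookup(index, "an example paper - on placeholder DOIs"))
    print(lookup(index, "Some Unlisted Paper"))
```

Exact matching on normalized titles is the simplest choice; a real server would likely need fuzzy matching and some review of submitted entries.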

Best wishes.


brant-ruan commented on June 8, 2024

> @brant-ruan Hi, this issue has been fixed now.
>
> I implemented an invalid doi checking process for the metadata server. However, we can only get the title and author list of this paper currently. I found that this is a very recent publication. No database records this paper until now.
>
> For conference papers, it's common that we need to wait at least half to one year before those databases record them. I usually collect the recently accepted papers in my own research field and insert them into the metadata server database manually. But I cannot do that for all research fields.
>
> I'm thinking, maybe creating a GitHub repo to store some lists of publications and letting the metadata server connect to this repo is a good idea. Let users submit a list of papers and create a pull request should be acceptable.
>
> Best wishes.

Agreed.

There is a common situation (at least for me): when I search for a paper with a search engine, I get two download sources: 1) the publication database, and 2) the author's academic home page or the institution's page. The second one sometimes provides a pre-publication version (or something like that) that is never updated and lacks a valid DOI. As the contents from both sources are usually identical, and the second source often becomes available earlier than the database, I download from it.

The GitHub repo idea is great. I would be very glad to contribute entries for the information security field.
