Comments (11)

jgm commented on September 23, 2024

Since the author of that test case, @benrbray, thinks it's okay to keep the 10/, and it's not in the official test suite, I think it's unproblematic to change this.

from citeproc.

benrbray commented on September 23, 2024

I took a look at the original PR #88 and based on this discussion, it looks like I was attempting to preserve some existing behavior of pandoc. But the actual presentation of the DOI in the bibliography has no bearing on the functionality of PR #88, so I see no reason not to change the label if it helps with copy/paste.

Quoted from a discussion in the original PR:

I believe this is how pandoc currently works. I had to squint at linkifyVariables for a while earlier to realize it (note fixShortDOI x is passed to the tolink function, which determines the link anchor/target). I also just confirmed this behavior with the release copy of pandoc.

As for whether this is desirable: I'd say yes, since every shortDOI 10/abcde corresponds to the real DOI abcde, right? If we don't do it this way, and we have a <text variable="doi" prefix="https://doi.org/"/>, the link will render incorrectly as https://doi.org/10/abcde when given a shortDOI 10/abcde. Instead, it should be https://doi.org/abcde.

So, that explains fixShortDOI for creating the URLs, but I don't see any reason that the 10/ should be stripped from the actual link anchor.

The claim that the link "will render incorrectly" seems to imply that two years ago, links of the form https://doi.org/10/abcde were invalid. It's worth noting that today they do seem to be valid, and resolve to the same location as https://doi.org/abcde. The shortDOI website still recommends the latter:

Your request was processed. The shortcut for 10.48550/arXiv.2309.08509 is the handle:
10/ktqn
The shortcut HTTP URI is:
https://doi.org/ktqn
This shortcut will return the same results as https://doi.org/10.48550/arXiv.2309.08509,
and doi:10/ktqn can be used in place of doi:10.48550/arXiv.2309.08509.
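
The two forms in that message can be sketched as follows. This is only an illustration of the shortDOI service's output quoted above, not code from citeproc: the helper names `shortcutURI` and `doiScheme` are hypothetical, and plain `String` is used instead of `Text` to keep the sketch dependency-free.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- | Hypothetical helper: the shortcut HTTP URI for a shortDOI handle.
-- The "10/" pseudo-prefix is dropped, as in the service's message.
-- Input without that prefix is passed through unchanged.
shortcutURI :: String -> String
shortcutURI handle =
  "https://doi.org/" ++ fromMaybe handle (stripPrefix "10/" handle)

-- | Hypothetical helper: the doi: form, which keeps the handle as-is,
-- including the "10/" pseudo-prefix.
doiScheme :: String -> String
doiScheme handle = "doi:" ++ handle
```

For the handle `10/ktqn`, `shortcutURI` gives `https://doi.org/ktqn` and `doiScheme` gives `doi:10/ktqn`, matching the two forms the service reports.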

Relevant Code

identifierToURL and fixShortDOI

citeproc/src/Citeproc/Types.hs

Lines 1483 to 1499 in 6969ce2

identifierToURL :: Identifier -> Text
identifierToURL ident =
    case ident of
      IdentDOI t   -> tolink "https://doi.org/" (fixShortDOI t)
      IdentPMCID t -> tolink "https://www.ncbi.nlm.nih.gov/pmc/articles/" t
      IdentPMID t  -> tolink "https://www.ncbi.nlm.nih.gov/pubmed/" t
      IdentURL t   -> tolink "https://" t
    where
        tolink pref x = if T.null x || ("://" `T.isInfixOf` x)
                           then x
                           else pref <> x

-- see https://shortdoi.org
fixShortDOI :: Text -> Text
fixShortDOI x = if "10/" `T.isPrefixOf` x
                   then T.drop 3 x
                   else x

handleIdent

citeproc/src/Citeproc/Eval.hs

Lines 1654 to 1664 in 6969ce2

handleIdent :: CiteprocOutput b => (Text -> Text) -> (Text -> Identifier) -> Eval b (Output b)
handleIdent f identConstr = do
  mbv <- askVariable v
  deleteSubstitutedVariables [v]
  case f <$> (valToText =<< mbv) of
    Nothing -> return NullOutput
    Just t  -> do
      -- create link and remember that we've done so far
      modify (\st -> st { stateUsedIdentifier = True })
      let url = identifierToURL (identConstr t)
      return $ Linked url [Literal $ fromText t]

usage of fixShortDOI to modify short DOI link text

"DOI" -> handleIdent fixShortDOI IdentDOI
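
To make the difference between link target and link text concrete, here is a minimal, self-contained sketch. It is an assumption-laden simplification, not citeproc's actual code: `Identifier` is reduced to a DOI-only `String` wrapper, and `linkedDOI` is a hypothetical stand-in for `handleIdent` that simply returns the pair (link target, link text). Its first argument plays the role of `handleIdent`'s `f`.

```haskell
import Data.List (isInfixOf, stripPrefix)
import Data.Maybe (fromMaybe)

-- Simplified stand-in for citeproc's Identifier type (assumption: DOI only).
newtype Identifier = IdentDOI String

-- Same logic as fixShortDOI above, on String instead of Text.
fixShortDOI :: String -> String
fixShortDOI x = fromMaybe x (stripPrefix "10/" x)

-- Same logic as the IdentDOI case of identifierToURL above.
identifierToURL :: Identifier -> String
identifierToURL (IdentDOI t) = tolink "https://doi.org/" (fixShortDOI t)
  where
    tolink pref x
      | null x || "://" `isInfixOf` x = x
      | otherwise = pref ++ x

-- | Hypothetical stand-in for handleIdent: applies f to the raw value,
-- then returns (link target, link text).
linkedDOI :: (String -> String) -> String -> (String, String)
linkedDOI f raw = (identifierToURL (IdentDOI t), t)
  where
    t = f raw
```

With `f = fixShortDOI`, `linkedDOI fixShortDOI "10/gg7vfk"` yields `("https://doi.org/gg7vfk", "gg7vfk")`, so the 10/ is lost from the visible text. With `f = id`, `linkedDOI id "10/gg7vfk"` yields `("https://doi.org/gg7vfk", "10/gg7vfk")`: the target is unchanged because `identifierToURL` already applies `fixShortDOI`, while the link text keeps the 10/.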

jgm commented on September 23, 2024

Short citeproc test case:

>>===== MODE =====>>
citation
<<===== MODE =====<<

>>===== RESULT =====>>
https://doi.org/10/gg7vfk.
<<===== RESULT =====<<


>>===== CSL =====>>
<?xml version="1.0" encoding="utf-8"?>
<style xmlns="http://purl.org/net/xbiblio/csl" class="note" version="1.0" demote-non-dropping-particle="display-and-sort" page-range-format="chicago">
  <macro name="access-note">
    <group delimiter=", ">
      <choose>
        <if type="legal_case" match="none">
          <choose>
            <if variable="DOI">
              <text variable="DOI" prefix="https://doi.org/"/>
            </if>
            <else>
              <text variable="URL"/>
            </else>
          </choose>
        </if>
      </choose>
    </group>
  </macro>
  <macro name="access">
    <group delimiter=". ">
      <choose>
        <if variable="issued" match="none">
          <group delimiter=" ">
            <text term="accessed" text-case="capitalize-first"/>
            <date variable="accessed" form="text"/>
          </group>
        </if>
      </choose>
      <choose>
        <if type="legal_case" match="none">
          <choose>
            <if variable="DOI">
              <text variable="DOI" prefix="https://doi.org/"/>
            </if>
            <else>
              <text variable="URL"/>
            </else>
          </choose>
        </if>
      </choose>
    </group>
  </macro>
  <citation et-al-min="4" et-al-use-first="1" disambiguate-add-names="true">
    <layout suffix="." delimiter="; ">
      <text macro="access-note"/>
    </layout>
  </citation>
</style>
<<===== CSL =====<<


>>===== CITATION-ITEMS =====>>
[ [ {"id":"baiMolAICal21", "type":"normal"} ] ]
<<===== CITATION-ITEMS =====<<



>>===== INPUT =====>>
[
  {
    "id": "baiMolAICal21",
    "abstract": "",
    "accessed": {
      "date-parts": [
        [
          2022,
          6,
          10
        ]
      ]
    },
    "author": [
      {
        "family": "Bai",
        "given": "Qifeng"
      },
      {
        "family": "Tan",
        "given": "Shuoyan"
      },
      {
        "family": "Xu",
        "given": "Tingyang"
      },
      {
        "family": "Liu",
        "given": "Huanxiang"
      },
      {
        "family": "Huang",
        "given": "Junzhou"
      },
      {
        "family": "Yao",
        "given": "Xiaojun"
      }
    ],
    "citation-key": "baiMolAICal21",
    "container-title": "Briefings in Bioinformatics",
    "DOI": "10/gg7vfk",
    "ISSN": "1467-5463, 1477-4054",
    "issue": "3",
    "issued": {
      "date-parts": [
        [
          2021,
          5,
          20
        ]
      ]
    },
    "language": "en",
    "note": "MolAICal",
    "page": "bbaa161",
    "source": "DOI.org (Crossref)",
    "title": "MolAICal: a soft tool for 3D drug design of protein targets by artificial intelligence and classical algorithm",
    "title-short": "MolAICal",
    "type": "article-journal",
    "URL": "https://academic.oup.com/bib/article/doi/10.1093/bib/bbaa161/5890512",
    "volume": "22"
  }
]
<<===== INPUT =====<<


>>===== VERSION =====>>
1.0
<<===== VERSION =====<<

jgm commented on September 23, 2024

Oh, it looks like this was an intentional change in commit f87f3fd.
I believe this was PR #88.
@benrbray can you comment on the reason for omitting the 10/?

jgm commented on September 23, 2024

I noticed that in the official CSL test suite, test/extra/link_plainurls.txt, they have

  - shortDOIs should be handled correctly, meaning that every shortDOI of
      the form 10/abcde should be converted to the DOI abcde

So I think removing the 10/ prefix is actually expected (if not by the spec, by the official test suite). Given that, I think maybe we should keep things as they are...

davidoskky commented on September 23, 2024

Hello,
while I understand this may be part of the actual standard, it is not strictly implemented on doi.org.

This is the description shown next to the search bar on the doi.org website:

DOIs include a prefix (prefixes always start with 10.) and a suffix, separated by a forward slash (/). Prefacing the DOI with doi.org/ will turn it into an actionable link, for example, [https://doi.org/10.47366/sabia.v5n1a3](http://doi.org/10.47366/sabia.v5n1a3). Clicking that link will ‘resolve’ it, i.e. redirect to the latest information about the object it identifies, even if the object changes or moves.

Searching for 10/gg7vfk through that search bar does resolve to the article, while searching for gg7vfk leads to an error page.
This means I'd rather not publish articles whose references show a short-form DOI without the 10/ prefix, specifically because the shortDOI is somewhat new and people are not really used to it.
I'm afraid people may print the article, type the DOI into doi.org by hand, and be confused when it does not resolve to an article.

jgm commented on September 23, 2024

Let's check in with CSL devs on this. @bdarcus @fbennett

bdarcus commented on September 23, 2024

Not really sure myself. cc @adam3smith, @bwiernik, @denismaier

bwiernik commented on September 23, 2024

The shortDOI with or without 10/ will resolve, but the actual shortDOI alias must include 10/. I suggest reverting that commit and always rendering with the 10/ prefix.

adam3smith commented on September 23, 2024

I'd generally point out that shortDOIs are not, in fact, DOIs, as per the quoted section from the DOI manual:

DOIs include a prefix (prefixes always start with 10.)

shortDOIs do not have a prefix that starts with 10. I'd recommend avoiding them in publication: they have a number of downsides, such as adding more redirection (which can break tooling) and making regex-based detection much harder.

That said, when they're used, they should definitely be used including the 10/ pseudo prefix: they are completely un-interpretable without it. I'm not sure what the note in the test suite refers to (and I can't find it in the test-suite repo, but that's probably just because I'm doing something wrong), but in its plain reading, I'm pretty certain it's incorrect. Both DOIs and shortDOIs always include the prefix, so "the DOI abcde" does not make any sense as a phrase.
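
The prefix distinction described above can be sketched as a small classifier. This is only an illustration under stated assumptions, not part of citeproc or of any DOI specification: the type `DOIKind` and the function `classifyDOI` are hypothetical names, and the checks are deliberately loose (a leading `10.` plus a non-empty suffix after the first slash for a DOI, a leading `10/` plus at least one more character for a shortDOI).

```haskell
import Data.List (isPrefixOf)

-- | Hypothetical classification of an identifier string.
data DOIKind = FullDOI | ShortDOI | Unrecognized
  deriving (Eq, Show)

-- | Classify by prefix: "10/" marks a shortDOI, "10." marks a DOI.
-- Anything else, including a bare shortDOI suffix, is unrecognized.
classifyDOI :: String -> DOIKind
classifyDOI s
  | "10/" `isPrefixOf` s && length s > 3 = ShortDOI
  | "10." `isPrefixOf` s && hasSuffix = FullDOI
  | otherwise = Unrecognized
  where
    -- A DOI needs a non-empty suffix after the first slash.
    hasSuffix = case break (== '/') s of
      (_, '/' : rest) -> not (null rest)
      _ -> False
```

Under these assumptions, `classifyDOI "10/ktqn"` is `ShortDOI` and `classifyDOI "10.48550/arXiv.2309.08509"` is `FullDOI`, while the bare string `ktqn` comes back as `Unrecognized`, which is the sense in which a shortDOI without 10/ cannot be interpreted.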

jgm commented on September 23, 2024

Oh, sorry, I was being stupid. This test is from test/extra, which means it's not from the official repository, but rather one I added at some point.
