Comments (11)
You could probably also use the ScaifeDL CTS API's GetCapabilities request:
https://scaife-cts.perseus.org/api/cts?request=GetCapabilities
That gives you the author/work/edition/translation metadata for every URN.
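
For what it's worth, here is a minimal Python sketch of turning such an inventory into a URN-to-name lookup. The element names it looks for (`groupname`, `title`, `label`) and the hand-written `SAMPLE` document are assumptions for illustration, not something confirmed in this thread, so check them against the real response before relying on this.

```python
import xml.etree.ElementTree as ET

# Child elements assumed to carry a human-readable name. This list is a guess
# at the inventory vocabulary, not something confirmed by the thread.
NAME_TAGS = {"groupname", "title", "label"}


def local_name(tag):
    """Drop any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def urn_names(inventory_xml):
    """Map each urn attribute in the inventory to its first name-like child."""
    names = {}
    for elem in ET.fromstring(inventory_xml).iter():
        urn = elem.get("urn")
        if urn is None:
            continue
        for child in elem:
            if local_name(child.tag) in NAME_TAGS and child.text:
                names[urn] = child.text.strip()
                break
    return names


# Hypothetical, hand-written sample; the real response is far larger and its
# namespace and nesting may differ.
SAMPLE = """
<TextInventory xmlns="http://chs.harvard.edu/xmlns/cts">
  <textgroup urn="urn:cts:greekLit:tlg0012">
    <groupname>Homer</groupname>
    <work urn="urn:cts:greekLit:tlg0012.tlg002">
      <title>Odyssey</title>
    </work>
  </textgroup>
</TextInventory>
"""

print(urn_names(SAMPLE))
```

Matching on local names rather than a fixed namespace keeps the sketch from depending on exactly which namespace the server declares.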
from lexica.
I think you want to use something like
http://scaife-cts.perseus.org/api/cts?request=GetLabel&urn=urn:cts:greekLit:tlg0020.tlg001.perseus-grc2 without the passage for that particular call.
A few points to add.
- The URNs identified in the LSJ may be incorrect. They were checked against the Perseus collections as they existed at the time, and the check was only whether the link itself was valid. So if the data was bad (as was often the case in the "ibid" citations, where the wrong antecedent is picked up), we may not have identified that as a problem. If a URN was included for a work not yet in Perseus, the problem would have been harder to spot. The quality will be better where an unambiguous reference was given, e.g. Plu. Brut. 7, but the data is very tricky in this regard, as you know.
- The current Scaife collections do not have all of the texts in Perseus. There are many texts in Scaife not found in Perseus and many works in Perseus not yet moved into Scaife. So there are likely going to be cases where LSJ includes a URN that Scaife does not recognize. (For the most part, the recent additions to Scaife not found in Perseus are post-classical — so they are not generally part of the LSJ canon.)
- As works move into Scaife from Perseus, the URNs change: the top-level identifier should stay consistent, but the edition extensions may change. In your example, Scaife features tlg0020.tlg001.perseus-grc2 while Perseus (www) had tlg0020.tlg001.perseus-grc1.
- The last release of the catalog is several years out of date from the backend data. I do not think you'll see atom feeds for anything added subsequently. We have some tools in development that will better address this hidden data issue.
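
To make the passage and edition points concrete, here is a rough Python sketch that splits a CTS URN into its parts, so a passage can be dropped before a GetLabel call and two edition-level URNs can be compared at the work level. The helper names (`parse_urn`, `without_passage`, `work_level`) are made up for this example, and the parsing is deliberately simple.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    namespace: str
    textgroup: str
    work: Optional[str]
    version: Optional[str]   # edition or translation identifier
    passage: Optional[str]


def parse_urn(urn):
    """Split 'urn:cts:NAMESPACE:WORK_COMPONENT[:PASSAGE]' into its parts."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[:2] != ["urn", "cts"]:
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace, work_component = parts[2], parts[3]
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    # Pad so that missing work/version pieces come back as None.
    pieces = work_component.split(".") + [None, None]
    return CtsUrn(namespace, pieces[0], pieces[1], pieces[2], passage)


def without_passage(urn):
    """Keep textgroup.work.version but drop the passage reference."""
    u = parse_urn(urn)
    component = ".".join(p for p in (u.textgroup, u.work, u.version) if p)
    return f"urn:cts:{u.namespace}:{component}"


def work_level(urn):
    """Keep only textgroup.work, dropping edition and passage."""
    u = parse_urn(urn)
    component = ".".join(p for p in (u.textgroup, u.work) if p)
    return f"urn:cts:{u.namespace}:{component}"


old = "urn:cts:greekLit:tlg0020.tlg001.perseus-grc1:195"
new = "urn:cts:greekLit:tlg0020.tlg001.perseus-grc2"
print(without_passage(old))
print(work_level(old) == work_level(new))
```

Comparing at the work level sidesteps the perseus-grc1 versus perseus-grc2 difference, at the cost of no longer saying which edition a citation was checked against.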
from lexica.
@TinaRussell
tlg4083 is not in the Scaife Viewer, so I wouldn't expect it to work. It's also not identified in the catalog, although I see an issue that indirectly refers to this.
I see it is the Eustathius Commentary on the Iliad.
I also see this on an old survey of IDs for which no results were returned — which would make sense.
from lexica.
Hi @TinaRussell, Peter Heslin has incorporated the URNs in his Diogenes application, whose code you can download at https://github.com/pjheslin/diogenes . To accommodate this use in Diogenes, I've done fairly extensive work on the references in LSJ and Lewis & Short (hunting down and repairing cases where Il. 2.349, 458 becomes Homer-Iliad-2-349, Homer-Iliad-458, or the like). Maybe his code will be helpful? He allows people to type in authors and select works by title, and nobody is confronted with URNs directly, but perhaps you can make use of his code to go in the other direction.
from lexica.
@TinaRussell
Hi, I know you've been in touch with James Tauber on related issues but I didn't want to leave this unanswered.
I don't know of any converters or other tools for this—we don't host any at Perseus.
The original abbreviations should still be in the data, but we don't have a mapping tool for these. The abbreviations in LSJ are fraught with irregularities, though, so this can be a challenge. An early project of mine was cleaning up these links and correcting invalid references, and oftentimes the data itself was either incorrectly entered or inconsistently presented.
I am not aware of a single master list of all of these URNs — particularly the base URNs (such as urn:cts:greekLit:tlg0033.tlg001) but the underlying data is cataloged such as here:
There may be tools or scripts others have created to better address this and James would be the best place to start with that.
FYI, Giuseppe Celano has a Unicode version of the data:
https://github.com/gcelano/LSJ_GreekUnicode
from lexica.
Yeah, for my project I tried to make something that would expand the abbreviations, and I was able to come up with a one-to-one mapping for the author abbreviations, but for abbreviations of works, some are unique, some vary in meaning depending on the author given, and some I think you’re just supposed to figure out from context. It’s a headache. But, since every reference/citation in the LSJ has a URN attached, I realized I ought to take advantage of that, as it means somebody before me had to figure out what each citation means (man, what a Herculean task).
Thank you for pointing out the Perseus Catalog! I suppose the makeshift solution would be to plug each URN into the catalog’s URL scheme, scrape information from the resulting page, and cache the information somewhere. But, there’s gotta be something more elegant/aboveboard than that.
I’ve asked James about how to use the URNs, but haven’t heard back from him on it, yet.
My project is here, by the way: https://github.com/TinaRussell/hermeneus
from lexica.
I may have found the answer: http://sites.tufts.edu/perseuscatalog/?page_id=93 “…to specifically request the ATOM feed of the data, you append /atom to the URIs.” So by using the canonical URL plus /atom, I should be able to get something more machine-readable.
from lexica.
Thank you for that! BTW, I’ve tried making other requests using that URL format, following the specification here https://github.com/cite-architecture/cts_spec/blob/master/md/specification.md and it doesn’t seem to work. For example, http://scaife-cts.perseus.org/api/cts?request=GetLabel&urn=urn:cts:greekLit:tlg0020.tlg001.perseus-grc1:195 gets me an “UnknownCollection” error. Is there something I’m doing wrong, or is the functionality simply unfinished? Thanks!
from lexica.
I would guess that it's probably just unfinished, but @jtauber would be a better person to answer that.
from lexica.
So, I managed to pull together a list of all the unique URNs cited in the LSJ. If you’re curious, it’s here: https://pastebin.com/aBDUBU07 They’re shortened to the work part of the work component (e.g. “urn:cts:greekLit:tlg0020.tlg001”), given what you said @lcerrato and because I figured Liddell and Scott weren’t terribly concerned with differing digital editions. Then I tried using the API to get the title for each one, and I found that about half of the URNs in that form work, and about half return an error. E.g. the first one, to the Odyssey, works: http://scaife-cts.perseus.org/api/cts?request=GetLabel&urn=urn:cts:greekLit:tlg0012.tlg002 but, the second one returns an error: http://scaife-cts.perseus.org/api/cts?request=GetLabel&urn=urn:cts:greekLit:tlg4083.tlg001 Again, is this unfinished functionality? Are URNs shortened like that supposed to work? Or, is this a better question for @jtauber?
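
In case it helps anyone reproduce the list, here is a rough Python sketch of one way to pull unique work-level URNs out of a blob of text. It is not the script I actually used; the regular expression and the sample string are assumptions for illustration, and the pattern only matches URNs that have both a textgroup and a work part.

```python
import re

# Matches 'urn:cts:NAMESPACE:TEXTGROUP.WORK' and stops before any edition
# suffix or passage reference. The character classes are an assumption.
WORK_URN = re.compile(r"urn:cts:[A-Za-z0-9]+:[A-Za-z0-9]+\.[A-Za-z0-9]+")


def unique_work_urns(text):
    """Return work-level URNs in order of first appearance, without repeats."""
    return list(dict.fromkeys(WORK_URN.findall(text)))


# Made-up sample text, not copied from the LSJ files.
sample = (
    "see urn:cts:greekLit:tlg0012.tlg002:1.1 and "
    "urn:cts:greekLit:tlg4083.tlg001.perseus-grc1:2, "
    "cf. urn:cts:greekLit:tlg0012.tlg002.perseus-grc2:5.3"
)
for urn in unique_work_urns(sample):
    print(urn)
```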
from lexica.
@helmadik Thanks! I ended up writing a script to take the base URN of every work cited in the LSJ, try to see if it gets a result via the CTS API, and if so, record the URN and the work’s title in text form, as key-value pairs in a hash table, as seen here: https://github.com/TinaRussell/hermeneus/blob/fca545966fc358c7d3e574bc7c7443e8fc28fa05/hrm-abbr.el#L389 The program uses the resulting hash table (instead of calling the API directly) to figure out which work has what title. It only works for about half the works cited, though (for the others, the abbreviated title shown in the LSJ stays as it is), so it’s quite possible that Peter has figured out a better way.
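
The linked script is Emacs Lisp; the same idea can be sketched in Python roughly as follows. This is only an illustration, not a port: `build_title_table` and `fake_fetch` are hypothetical names, and the function that actually fetches a title is passed in as a parameter because the shape of the GetLabel response isn't spelled out in this thread.

```python
from urllib.parse import urlencode

# Endpoint copied from the requests quoted earlier in this thread.
CTS_ENDPOINT = "http://scaife-cts.perseus.org/api/cts"


def label_url(urn):
    """Build a GetLabel request URL, leaving the colons in the URN readable."""
    query = urlencode({"request": "GetLabel", "urn": urn}, safe=":")
    return f"{CTS_ENDPOINT}?{query}"


def build_title_table(urns, fetch_title):
    """Try every URN once; return (titles by URN, URNs that gave no title).

    fetch_title is any callable taking a URN and returning a title string,
    or returning None / raising on failure.
    """
    titles, misses = {}, []
    for urn in urns:
        try:
            title = fetch_title(urn)
        except (OSError, LookupError, ValueError):
            # Network errors and unknown-URN lookups are treated as misses.
            title = None
        if title:
            titles[urn] = title
        else:
            misses.append(urn)
    return titles, misses


def fake_fetch(urn):
    """Stand-in for a real HTTP call, so the sketch runs offline."""
    return {"urn:cts:greekLit:tlg0012.tlg002": "Odyssey"}[urn]


titles, misses = build_title_table(
    ["urn:cts:greekLit:tlg0012.tlg002", "urn:cts:greekLit:tlg4083.tlg001"],
    fake_fetch,
)
print(titles)
print(misses)
```

Keeping the misses in a separate list makes it easy to see which abbreviated titles will be left as they are.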
from lexica.