Comments (14)
I think this could be covered by a source element containing two pieces of information:
- media type
- file name
{
"metadata": {
"source": {
"type": "application/epub+zip",
"filename": "title.epub"
}
}
}
Once we know the media type and filename, I'm not convinced that we would need to also provide a value indicating that metadata were generated.
from webpub-manifest.
Another useful thing we can do with the source.type information is to inject Readium CSS only when the type is application/epub+zip. This would allow streaming an EPUB on the web while still supporting the CSS overrides and pagination.
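A minimal sketch of that check, assuming the proposed metadata.source.type property is present in the manifest. The ManifestWithSource shape and the shouldInjectReadiumCSS name are hypothetical and not part of any Readium API:

```typescript
// Hypothetical shape of the proposed `source` object (not defined in the RWPM schema).
interface ProposedSource {
  type?: string;     // media type of the original asset
  filename?: string; // original filename, if known
}

interface ManifestWithSource {
  metadata: { title: string; source?: ProposedSource };
}

const EPUB_MEDIA_TYPE = "application/epub+zip";

// Returns true only when the manifest declares that it was generated from an EPUB.
function shouldInjectReadiumCSS(manifest: ManifestWithSource): boolean {
  return manifest.metadata.source?.type === EPUB_MEDIA_TYPE;
}
```

A manifest without a source object is treated as "do not inject", which keeps the behaviour conservative for native web publications.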
> Once we know the media type and filename, I'm not convinced that we would need to also provide a value indicating that metadata were generated.
That's right, the only useful thing I could think of is to know that the title was generated. But if we can check that the title is the filename, then it's actionable.
from webpub-manifest.
I don't see how a filename can give useful info to the application. Here you're proposing the filename because some source formats will not contain a title, so the title will be generated from the filename. Is it because the only mandatory metadata in RWPM is the title?
Also, knowing the source media-type seems sufficient to know which metadata have been generated (e.g. the title in case the media type is cbz).
from webpub-manifest.
In the case of a CBZ or a PDF, the filename might be the only info available.
If you know the filename, you can compare it to the title generated by the streamer and do something based on that comparison.
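As a rough sketch of such a comparison: the titleLooksGenerated helper below is hypothetical, and the choice to also compare against the filename without its extension is an assumption about how a streamer might build the title.

```typescript
// Hypothetical helper: guesses whether the title was derived from the filename.
function titleLooksGenerated(title: string, filename: string | undefined): boolean {
  if (!filename) return false;
  // Strip a trailing extension such as ".pdf" or ".cbz", if there is one.
  const stem = filename.replace(/\.[^./\\]+$/, "");
  const normalize = (s: string): string => s.trim().toLowerCase();
  const t = normalize(title);
  return t === normalize(filename) || t === normalize(stem);
}
```

An app could use the result to, for example, prompt the user for a proper title or hide the title in its UI.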
from webpub-manifest.
I see ... but the title is still mandatory, so it means the RWPM will contain:
{
"metadata": {
"@type": "http://schema.org/Book",
"conformsTo": "https://readium.org/webpub-manifest/profiles/pdf",
"title": "name.pdf",
"source": {
"type": "application/pdf",
"filename": "name.pdf"
}
}
}
The client side (the "viewer") just wants to display a title. Who cares about the original filename at this point?
from webpub-manifest.
> Also, knowing the source media-type seems sufficient to know which metadata have been generated (e.g. the title in case the media type is cbz).
In both CBZ and PDF, sometimes we have a real title (for CBZ, it is the single root folder inside the archive, if there's one), so we can't be sure the filename was used. Also, file extensions are not mandatory, so checking for .pdf doesn't work either.
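To illustrate why the media type alone does not reveal whether the title was generated, here is a sketch of the CBZ rule described above. The function names and the generated flag are hypothetical; the input is assumed to be the list of entry paths in the archive, using "/" as separator.

```typescript
// Returns the single root folder name if every entry lives under the same folder.
function singleRootFolder(entryPaths: string[]): string | undefined {
  let root: string | undefined;
  for (const path of entryPaths) {
    const slash = path.indexOf("/");
    if (slash <= 0) return undefined; // an entry sits at the archive root
    const first = path.slice(0, slash);
    if (root !== undefined && root !== first) return undefined; // several root folders
    root = first;
  }
  return root;
}

// Uses the root folder as title when available, otherwise falls back to a name
// such as the filename. `generated` records which of the two cases applied.
function cbzTitle(entryPaths: string[], fallbackName: string): { title: string; generated: boolean } {
  const root = singleRootFolder(entryPaths);
  return root !== undefined
    ? { title: root, generated: false }
    : { title: fallbackName, generated: true };
}
```

Two CBZ files with the same media type can therefore end up on different branches, which is the case the filename comparison is meant to detect.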
from webpub-manifest.
What about overloading dc:source though? Isn't that problematic?
https://www.w3.org/publishing/epub3/epub-packages.html
https://www.dublincore.org/specifications/dublin-core/dcmi-terms/elements11/source/
<dc:source id="src-id">urn:isbn:9780375704024</dc:source>
<meta refines="#src-id" property="identifier-type" scheme="onix:codelist5">15</meta>
<meta refines="#src-id" property="source-of">pagination</meta>
from webpub-manifest.
According to the parsing specification, dc:source should be preserved "as is" when parsing (under additionalProperties); no unprefixed source property is produced:
https://github.com/readium/architecture/blob/master/streamer/parser/metadata.md
from webpub-manifest.
Ah, here is an older related issue about dc:source: #14
from webpub-manifest.
Note that the R2 TypeScript implementation currently preserves dc:source as source instead of placing it in additionalProperties / otherMetadata. This is not the correct behaviour, because source is not defined in the RWPM's JSON schema ( https://github.com/readium/webpub-manifest/blob/master/schema/metadata.schema.json ), the JSON-LD context ( https://readium.org/webpub-manifest/context.jsonld ), or the EPUB parsing doc ( https://github.com/readium/architecture/blob/master/streamer/parser/metadata.md ).
https://github.com/IDPF/epub3-samples/blob/master/30/childrens-literature/EPUB/package.opf#L29
https://idpf.github.io/epub3-samples/30/samples.html
=>
from webpub-manifest.
> currently preserves dc:source as source instead of placing it in additionalProperties / otherMetadata
Just to clear up any ambiguity: on mobile, otherMetadata is an implementation detail of the in-memory model, used to store the additional properties. In the generated RWPM, any additional metadata would be directly under metadata. For example, this OPF:
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:a11y="http://www.idpf.org/epub/vocab/package/a11y/#">
<dc:title>Alice's Adventures in Wonderland</dc:title>
<dc:rights>Public Domain</dc:rights>
<meta property="a11y:certifiedBy">EDRLab</meta>
</metadata>
produces the RWPM (after resolving the full URI from the XML namespaces of other metadata):
{
"metadata": {
"title": "Alice's Adventures in Wonderland",
"http://purl.org/dc/terms/rights": "Public Domain",
"http://www.idpf.org/epub/vocab/package/a11y/#certifiedBy": "EDRLab"
}
}
And with the in-memory model:
publication.metadata.title
publication.metadata["http://purl.org/dc/terms/rights"] // (internally uses `otherMetadata`)
Note that we have a special case with the dc: prefix, which is actually aliased to dcterms:.
// The dc URI is expanded as dcterms
// See https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
// > While these distinctions are significant for creators of RDF applications, most
// > users can safely treat the fifteen parallel properties as equivalent. The most
// > useful properties and classes of DCMI Metadata Terms have now been published as
// > ISO 15836-2:2019 [ISO 15836-2:2019]. While the /elements/1.1/ namespace will be
// > supported indefinitely, DCMI gently encourages use of the /terms/ namespace.
I'm not sure any of this is documented in the EPUB parsing guide, as metadata extensions were not really supported at the time.
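The expansion and the dc: aliasing described above could look roughly like this. It is only a sketch: expandProperty is a hypothetical name, and the namespaces argument is assumed to hold the prefix-to-URI declarations read from the OPF metadata element.

```typescript
const DC_ELEMENTS = "http://purl.org/dc/elements/1.1/";
const DC_TERMS = "http://purl.org/dc/terms/";

// Expands a prefixed property such as "a11y:certifiedBy" to a full URI.
// Returns undefined when there is no prefix or the prefix is not declared.
function expandProperty(prefixed: string, namespaces: Record<string, string>): string | undefined {
  const colon = prefixed.indexOf(":");
  if (colon <= 0) return undefined;
  const prefix = prefixed.slice(0, colon);
  const local = prefixed.slice(colon + 1);
  const declared: string | undefined = namespaces[prefix];
  if (declared === undefined) return undefined;
  // Special case: the /elements/1.1/ namespace is aliased to /terms/.
  const uri = declared === DC_ELEMENTS ? DC_TERMS : declared;
  return uri + local;
}
```

With the namespaces of the OPF example, dc:rights would expand to http://purl.org/dc/terms/rights, matching the key shown in the RWPM above.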
from webpub-manifest.
How about sourceFile to circumvent the dc:source issue?
{
"metadata": {
"sourceFile": {
"type": "application/epub+zip",
"name": "title.epub"
}
}
}
from webpub-manifest.
...just thinking aloud regarding the term sourceFile: if the publication "asset" (e.g. EPUB zip archive) is acquired via HTTP with Content-Disposition: attachment; filename="book.epub" and header Content-Type = application/epub+zip, then sourceFile makes sense. But what about other HTTP fetch types where the notion of "file" is not so clear (e.g. an HTTP GET request on the URL https://domain.com/books/1)?
That being said, the name field of the sourceFile object seems appropriate, as this is clearly about "filename".
Alternatively, to avoid using terms that have other meanings / uses (such as "source", "origin", "resource", etc.), what about
{
"metadata": {
"originalAsset": {
"type": "application/epub+zip",
"filename": "title.epub"
}
}
}
from webpub-manifest.
Good point, then I would still use name, since it is still useful outside the context of a file.
For example, take fetching a CBZ from https://comics.com/watchmen without Content-Disposition. The parser could use the last path component of the URI to generate the title, and we would have:
{
"metadata": {
"originalAsset": {
"type": "application/vnd.comicbook+zip",
"name": "watchmen"
}
}
}
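A sketch of how the name could be derived in both situations discussed in this thread. The assetName helper is hypothetical; it only handles the plain filename parameter of Content-Disposition (not the extended filename* form) and does no percent-decoding.

```typescript
// Hypothetical helper: picks a name for the original asset.
function assetName(url: string, contentDisposition?: string): string | undefined {
  // Prefer an explicit filename parameter when the server sent one.
  const match = contentDisposition?.match(/filename="?([^";]+)"?/i);
  if (match) return match[1];
  // Otherwise use the last non-empty path component of the URL,
  // after dropping the query, the fragment, the scheme and the host.
  const path = url
    .replace(/[?#].*$/, "")
    .replace(/^[a-z][a-z0-9+.-]*:\/\/[^/]*/i, "");
  const segments = path.split("/").filter((s) => s.length > 0);
  return segments.length > 0 ? segments[segments.length - 1] : undefined;
}
```

For the URL https://comics.com/watchmen without a Content-Disposition header this yields watchmen, as in the manifest above.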
from webpub-manifest.