Description
As a user, I would like to be able to view the photos, digital or analogue, on a timeline.
Variation(s)
Sorting images per given metadata field (subject, location, format, etc.)
Proposed Solutions
During the workshop, Team C asked to be able to arrange and display images on a timeline by year or decade, especially for the SGV_12 Ernst Brunner collection, which is fully digitised. For this collection, about 47,800 images were digitised and, according to Salsah, ca. 38,850 of those records contain a date (in the hasDate property).
However, it is not possible to get a chronologically ordered list of all the images without querying the database. A downloaded resource does contain, in its header (in the appropriate XMP metadata fields), the date when it was digitised, but not the date when the photograph was taken (if known).
Exporting all metadata from Salsah
Team B is extracting all metadata and data related to the PIA project (SGV_05, SGV_10 and SGV_12) from Salsah, in collaboration with DaSCH, which manages the virtual research environment. Lukas Rosenthaler created a Python script, which we have slightly modified; with the following command, we can extract all images in TIFF along with their associated metadata in XML:
python3 salsah2xml.py -P sgv -s 4102 --start 0 --nrows -1 --filter sgv:in_collection={ID} --download --write-metadata --restype sgv:image {username} {password} https://www.salsah.org
Although we have been able to extract all the images from SGV_12, we are still in the process of extracting the metadata on its own, because the batch process sometimes encounters invalid characters (such as backspaces) that cause the script to break and fail to write the XML.
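One possible workaround for the invalid characters is to strip every character that XML 1.0 does not allow before a value is written. The following is only a minimal sketch: the function name strip_invalid_xml_chars is hypothetical, and where exactly it would be called inside salsah2xml.py is an assumption, since that depends on how the script builds its output.

```python
import re

# XML 1.0 only allows: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# This pattern matches every character OUTSIDE those ranges (e.g. backspace, \x08).
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with all characters that are illegal in XML 1.0 removed.

    Hypothetical helper: it would be applied to each metadata value
    before the value is handed to the XML writer.
    """
    return _INVALID_XML_CHARS.sub("", text)
```

Tabs, line feeds and carriage returns are kept because XML 1.0 permits them; other control characters are silently dropped, so it may be worth logging which records were affected.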
Extracting date-based metadata
Once we have all the associated metadata in XML, we can extract the file name (equivalent to its SGV signature) and the date to a CSV file with the xml2 utility (very useful for extracting non-repeatable XML tags, attributes and values), rather than waiting for the dedicated script that will export all metadata to a PIA database.
- Homebrew Formula:
brew install xml2
- Command:
xml2 < input.xml | 2csv resource @id @label date > output.csv
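Where xml2 is not available, the same extraction could be sketched in Python with the standard library. This assumes, as the command above suggests, that each record is a resource element with id and label attributes and a direct date child element, and that the export uses no XML namespace; the real Salsah export may differ. The function name resources_to_rows is hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def resources_to_rows(xml_text):
    """Return (id, label, date) tuples, one per <resource> element.

    Assumed structure: <resource id="..." label="..."><date>...</date></resource>
    A missing attribute or date element yields an empty string.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for res in root.iter("resource"):
        date_el = res.find("date")  # direct child only
        date = (date_el.text or "").strip() if date_el is not None else ""
        rows.append((res.get("id", ""), res.get("label", ""), date))
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text without a header line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()
```

Records without a date are kept with an empty third column, so they can be counted or filtered later.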
Amending XMP metadata fields
For adding or modifying metadata in the header of a given image, we can use ExifTool. The XMP metadata fields that we could potentially change are:
- xmp:MetadataDate; or
- xmp:ModifyDate (slightly better than xmp:MetadataDate because, according to the XMP specification, xmp:MetadataDate should be the same as or more recent than xmp:ModifyDate)
The entries must comply with the W3C date and time format (a profile of ISO 8601). For instance, we can enter a year (YYYY), a year and month (YYYY-MM), a complete date (YYYY-MM-DD), or something more precise with hours, minutes and seconds. What we cannot enter is a range such as a decade, or a rough guess like "19XX", "between 1920 and 1930" or "1985?", which is common practice in bibliographic and archival records.
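To see in advance which extracted dates could be embedded as they are, the values can be checked against the W3C date and time pattern. The following is a minimal sketch, not a full validator: the function name is_w3c_datetime is hypothetical, the time zone designator is treated as optional (an assumption that follows XMP usage rather than the stricter W3C note), and the number of days per month is not checked.

```python
import re

# YYYY[-MM[-DD[Thh:mm[:ss[.s]][TZD]]]] where TZD is "Z" or +hh:mm / -hh:mm.
# Assumption: the time zone designator is optional.
_W3CDTF = re.compile(
    r"\d{4}"                                   # year
    r"(?:-(?:0[1-9]|1[0-2])"                   # month 01-12
    r"(?:-(?:0[1-9]|[12]\d|3[01])"             # day 01-31 (month length not checked)
    r"(?:T(?:[01]\d|2[0-3]):[0-5]\d"           # hours and minutes
    r"(?::[0-5]\d(?:\.\d+)?)?"                 # optional seconds and fraction
    r"(?:Z|[+-](?:[01]\d|2[0-3]):[0-5]\d)?"    # optional time zone designator
    r")?)?)?"
)


def is_w3c_datetime(value: str) -> bool:
    """Return True if value looks like a W3C date/time of any allowed precision."""
    return _W3CDTF.fullmatch(value) is not None
```

Values that fail the check, such as "19XX" or "1985?", are the ones for which a decision on the embedded value is still needed.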
- Homebrew Formula:
brew install exiftool
- Command to display the technical metadata (File, XMP, ICC):
exiftool -s -G {name}.{format}
- Command to modify a metadata value (e.g. xmp:ModifyDate):
exiftool -xmp:ModifyDate={newdate} {name}.{format}
Creating a routine to embed dates in all image headers
To automate the process, a batch routine will need to be developed that matches the metadata extracted from the Salsah export to each image and embeds it in the image header via one or more ExifTool commands.
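Such a routine could look roughly like the sketch below. It rests on several assumptions that are not confirmed here: the CSV has three columns (id, label, date) and no header line, each image is stored as {label}.tif in a single directory, the date is already in a form ExifTool accepts, and xmp:ModifyDate is the field finally chosen. The function names build_exiftool_command and embed_dates are hypothetical.

```python
import csv
import subprocess
from pathlib import Path


def build_exiftool_command(image_path, date):
    """Return the ExifTool argument list that writes date to xmp:ModifyDate."""
    return ["exiftool", f"-xmp:ModifyDate={date}", str(image_path)]


def embed_dates(csv_path, image_dir, dry_run=True):
    """Build (and optionally run) one ExifTool command per dated image.

    Rows without a date, and rows whose image file is missing, are skipped.
    With dry_run=True nothing is executed; the commands are only returned.
    """
    commands = []
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.reader(fh):
            if len(row) < 3:
                continue
            _resource_id, label, date = (cell.strip() for cell in row[:3])
            if not date:
                continue
            image = Path(image_dir) / f"{label}.tif"  # assumed naming scheme
            if not image.exists():
                continue
            cmd = build_exiftool_command(image, date)
            commands.append(cmd)
            if not dry_run:
                # Requires exiftool on the PATH; raises if ExifTool reports an error.
                subprocess.run(cmd, check=True)
    return commands
```

Running it first with dry_run=True lets us review the generated commands before any file is touched. By default ExifTool keeps a copy of each modified file with an _original suffix, which gives a way back if a batch goes wrong.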
Limitations
- The whole process above needs to be run where the images are stored.
- Some dates are not very precise, and we will need to decide which value to put in the file.
- Not all software can sort images by XMP metadata; we will need a dedicated programme such as Adobe Lightroom (https://www.organizingphotos.net/date-taken-date-created-date-modified-photo-time-stamps/). It is possible to create new tags by modifying the ExifTool configuration, but this could be risky because a third-party program might not recognise the added values and so could not sort by them.
Additional Background