Comments (13)

JoshOBrien commented on May 28, 2024

@insilentio Thanks for the report. That sounds reasonable, but I won't be able to have a closer look for at least a day. Will get to it as soon as possible, though.

from exiftoolr.

JoshOBrien commented on May 28, 2024

@insilentio Hi Daniel. Would you mind installing the fixed version I just pushed up to GitHub, to confirm that it now works for you?

Thanks,

Josh

insilentio commented on May 28, 2024

@JoshOBrien
Hi Josh. I have installed the new version from GitHub. Unfortunately, due to restrictions at my workplace regarding the installation of Rtools and the proxy settings, I was not able to test it in the Windows environment where the problem originally occurred (I hope you can push the new version to CRAN soon :-)).
However, I have tried to replicate the situation on my Mac. The problem with the 2 arguments is definitely solved there.

Thanks a lot for the quick action,
Daniel

JoshOBrien commented on May 28, 2024

@insilentio

Sure, I'll go ahead and push that to CRAN. (ETA, three hours later, it's already been accepted by CRAN and the new version of the source package is up on the home repository at https://cran.wu.ac.at/ ! Will typically take a day or two for the Windows binary package to be compiled and make its way out to all the mirror repositories.)

If you happen to have (or can easily construct) an image file that uses two non-default charsets in its file name and its tags, and which you can share with me for testing purposes, I'd really appreciate that. (I do understand that you may not be able to do so.)

Thanks,

Josh

insilentio commented on May 28, 2024

@JoshOBrien

I have attached an example, although I am not sure how it will behave in your environment; I guess it depends heavily on the code page of your OS. You'll have to rename the file to QS_Höngg.jpg, as GitHub replaces the original name with a random one.
Anyway, in my case the result without and with charset arguments (on CLI):
exiftool QS_Höngg.jpg -filename -city
File Name : QS_H÷ngg.jpg
City : Z├╝rich

The strange characters are misinterpretations of the German umlauts ö and ü.
When I use the -charset option, I get the desired result. But at least in my case, I have to use different code pages for the tags and the file name; it does not work otherwise.

exiftool QS_Höngg.jpg -filename -city -charset filename=cp1250 -charset exiftool=cp850
File Name : QS_Höngg.jpg
City : Zürich
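
The garbled characters in the first run are consistent with bytes being decoded under the wrong code page. The sketch below, using only base R's iconv(), reproduces them under assumptions that are not confirmed in the thread: that the City tag was emitted as UTF-8, that the file name was emitted in a single-byte Windows encoding, and that the console displayed both as cp850. It also assumes the local iconv build recognises the name "CP850".

```r
## Assumption: the console decodes every byte it receives as cp850.
u_utf8_bytes <- as.raw(c(0xC3, 0xBC))  # "ü" encoded as UTF-8
o_ansi_byte  <- as.raw(0xF6)           # "ö" in cp1250, cp1252 and Latin-1

## Reinterpret the same bytes as cp850 and convert to UTF-8 for display.
## iconv() accepts a list of raw vectors as input.
u_shown <- iconv(list(u_utf8_bytes), from = "CP850", to = "UTF-8")
o_shown <- iconv(list(o_ansi_byte),  from = "CP850", to = "UTF-8")

## Unicode code points of what the console would show
utf8ToInt(u_shown)  # 9500 9565: U+251C and U+255D, two box-drawing characters
utf8ToInt(o_shown)  # 247: U+00F7, the division sign
```

These are the same characters that appear in "Z├╝rich" and "QS_H÷ngg.jpg" above, which may explain why the tags and the file name needed different -charset settings.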

Hope that helps.
Best,
Daniel
QS_Höngg

JoshOBrien commented on May 28, 2024

@insilentio Thanks! I can report that this at least now runs without an error on my Windows 10 OS. You must be right about my OS' code page not being the right one to fully test this on, though, as the characters with umlauts are still not properly rendered in the value returned by exif_read(). Once this reaches CRAN, please do let me know whether this does or does not work correctly.

Josh

insilentio commented on May 28, 2024

@JoshOBrien
Well, the good news is that the change is working as intended! Thanks again.

However, as I've now found out, that doesn't help in my specific case. The main problem seems to be that you are using JSON output in exif_read() (obviously, for data frame conversion), and JSON output is always converted to UTF-8 (see the exiftool documentation). So I end up with my strange characters again, no matter which code pages I pass as arguments. (Without your change, I couldn't even read the files, though.) I tried to get useful output with iconv(), with no luck so far. I guess I will go with exif_call() now, use the -csv argument, and try to parse the output into a data frame.

Anyway, when I was working with your code (exif_read()), I found that the -q argument is set in any case, before you even check for the parameter:
args <- c("-n", "-j", "-q", "-b", args)
if (quiet) {
    args <- c(args, "-q")
}

Not sure if that is really intended.
Best,
Daniel

JoshOBrien commented on May 28, 2024

@insilentio

Very interesting and good to know about that conversion to UTF8. If you end up with somewhat robust code for parsing the results of exif_call(), I'd be interested in having a look, and potentially incorporating that in the package.

I can't remember now why I included a -q flag in the set of always-supplied flags, but I'm pretty sure it was intentional. Bolstering my recollection is this comment in the source code of exif_read():

## an extra -q further silences warnings
if (quiet) {
    args <- c(args, "-q")
}
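
For context, the ExifTool documentation describes -q as cumulative: one -q suppresses normal informational messages, and a second -q suppresses warnings as well. A minimal sketch of the argument assembly quoted in this thread, wrapped in a hypothetical helper build_exif_args() (not a function in exiftoolr), shows that quiet = TRUE therefore passes -q twice:

```r
## Hypothetical helper mirroring the two snippets quoted in this thread;
## it is not part of the exiftoolr API.
build_exif_args <- function(args = character(0), quiet = TRUE) {
    ## one -q is always supplied
    args <- c("-n", "-j", "-q", "-b", args)
    ## an extra -q further silences warnings
    if (quiet) {
        args <- c(args, "-q")
    }
    args
}

sum(build_exif_args(quiet = TRUE) == "-q")   # 2
sum(build_exif_args(quiet = FALSE) == "-q")  # 1
```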

Take care and best of luck,

Josh

insilentio commented on May 28, 2024

I am currently working with the following code which seems to work; don't know about the robustness, though:

library(exiftoolr)
library(readr)
## taglist and image_files are assumed to be defined elsewhere
arglist <- c("-charset", "exiftool=cp1250", "-charset", "filename=cp1252", "-csv", "-n", "-b", "-T")
exifinfo <- exif_call(c(arglist, taglist), image_files, intern = TRUE)
exifinfo <- read_csv(paste(exifinfo, collapse = "\n"))

In the args, -csv and -T are crucial; with them it works properly in my case for a list of several hundred images. I use readr's read_csv; read.csv would probably work, too. The trick is to collapse the CSV input with a line break (\n).

If that is overall more robust, I am not so sure. With the csv, you get the risk of different separators according to different locales (e.g. ";" instead of ",").

Best,
Daniel

JoshOBrien commented on May 28, 2024

@insilentio

The trickiest part seems to be properly processing tag values that contain commas, double quotes, new lines, or leading or trailing spaces.

The ExifTool FAQ here includes a couple of recipes for doing that from the command line, but I haven't been able to get the Windows one to work properly. If I do figure that out, I'd definitely consider adding processing via csv output as an alternative to the tool's current processing via json output.
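
As a rough illustration of the quoting issue, and not the approach taken in the package, the sketch below feeds hand-written CSV lines (made-up sample data standing in for exiftool -csv output) to base R's read.csv(). It assumes the values follow the usual CSV convention of wrapping a field in double quotes and doubling any embedded double quote.

```r
## Made-up sample lines; real input would come from exif_call(..., intern = TRUE).
csv_lines <- c(
  'SourceFile,City,Caption',
  'a.jpg,Bern,"Lake, bridge and ""old town"""',
  'b.jpg,Basel," padded "'
)

## Collapse to a single string and parse it; character columns stay as
## strings, and strip.white is FALSE by default, so padding is preserved.
res <- read.csv(text = paste(csv_lines, collapse = "\n"),
                stringsAsFactors = FALSE)

res$Caption[1]  # Lake, bridge and "old town"
res$Caption[2]  # the value keeps its leading and trailing space
```

Values containing embedded newlines are not covered by this sketch and would need separate testing.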

Cheers,

Josh

JoshOBrien commented on May 28, 2024

@insilentio

OK, I did figure out how to do this, and have implemented an initial version in this repository's "csv-read" branch. I will eventually add it as an option to exif_read().

When I do so, do you mind if I include the image you sent me in the package, to demonstrate the added functionality?

Thanks for your help,

Josh

insilentio commented on May 28, 2024

@JoshOBrien
The image is under a CC BY-SA 4.0 license, so it should be fine.
Thanks for your efforts,
Daniel

JoshOBrien commented on May 28, 2024

@insilentio FYI, I've now added the option to process Exif metadata via a csv (rather than a JSON) intermediate, and used (a compressed version of) the image you shared in the example demonstrating that option's use. The new option is available from exiftoolr_0.1.5, and can be used as shown below. Thanks once again for helping me to get this working.

library(exiftoolr)
## Use pipeline="csv" for images needing explicit specification
## and proper handling of non-default character sets
img_file <- system.file(package = "exiftoolr", "images", "QS_Hongg.jpg")
args <- c("-charset", "exiftool=cp1250")
res <- exif_read(img_file, args = args, pipeline = "csv")
res[["City"]]  ## "Zurich", with an umlaut over the "u"
