jpegturbo.jl's Introduction

JpegTurbo

JpegTurbo.jl is a Julia wrapper of the C library libjpeg-turbo that provides IO support for the JPEG image format. This package also backs the JPEG IO part of ImageIO and FileIO.

For benchmark results against other image IO backends, see the "benchmark results against other backends" issue.

Usage

There are two ways to use this package:

  • (convenient) via the FileIO: save/load
  • (powerful) via the JpegTurbo.jl interfaces: jpeg_encode/jpeg_decode

FileIO interface: save/load

FileIO is an IO frontend with various IO backends; ImageIO is the default IO backend provided by the JuliaImages ecosystem. When JpegTurbo (and/or ImageIO) is available in DEPOT_PATH, FileIO uses JpegTurbo to load and save JPEG images:

using FileIO
img = rand(64, 64)
save("test.jpg", img)
load("test.jpg")

Note that you do not necessarily need to install them in every project environment. For instance, installing into the default environment with (@v1.8) pkg> add JpegTurbo or (@v1.8) pkg> add ImageIO should be enough for your local setup.

JpegTurbo interface: jpeg_encode/jpeg_decode

jpeg_encode compresses a 2D colorant matrix into a JPEG image.

jpeg_encode(filename::AbstractString, img; kwargs...) -> Int
jpeg_encode(io::IO, img; kwargs...) -> Int
jpeg_encode(img; kwargs...) -> Vector{UInt8}

jpeg_decode decompresses a JPEG image into a 2D colorant matrix.

jpeg_decode([T,] filename::AbstractString; kwargs...) -> Matrix{T}
jpeg_decode([T,] io::IO; kwargs...) -> Matrix{T}
jpeg_decode([T,] data::Vector{UInt8}; kwargs...) -> Matrix{T}
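
As an illustration of the `io::IO` methods listed above, here is a minimal round-trip sketch through an `IOBuffer`. It only uses the signatures shown in this section; the image size is arbitrary, and no keyword arguments are passed.

```julia
using JpegTurbo

img = rand(32, 48)             # same kind of input as in the FileIO example
io = IOBuffer()
nbytes = jpeg_encode(io, img)  # returns the number of bytes written to `io`
seekstart(io)                  # rewind before decoding from the same buffer
decoded = jpeg_decode(io)
size(decoded)                  # (32, 48)
```

Any other `IO` object, such as an open file or a socket, should work the same way.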

Advanced: in-memory encode/decode

For some applications, it can be faster to encode and decode entirely in memory, without reading from or writing to disk:

using JpegTurbo
img = rand(64, 64)
bytes = jpeg_encode(img) # Vector{UInt8}
img_saveload = jpeg_decode(bytes) # size: 64x64

Advanced: preview optimization

One can request a single-component output or a downsampled output so that less computation is needed during decompression. This is particularly useful for speeding up image previews.

using BenchmarkTools, TestImages, JpegTurbo
filename = testimage("earth", download_only=true)
# full decompression
@btime jpeg_decode(filename); # 224.760 ms (7 allocations: 51.54 MiB)
# only decompress luminance component
@btime jpeg_decode(Gray, filename); # 91.157 ms (6 allocations: 17.18 MiB)
# process only a few pixels
@btime jpeg_decode(filename; scale_ratio=0.25); # 77.254 ms (8 allocations: 3.23 MiB)
# process only a few pixels for luminance component
@btime jpeg_decode(Gray, filename; scale_ratio=0.25); # 63.119 ms (6 allocations: 1.08 MiB)

preferred_size is a mutually exclusive alternative to scale_ratio:

# smallest `scale_ratio` whose output size is greater than or equal to (512, 512)
jpeg_decode(filename; preferred_size=(512, 512)) # size: (751, 750)
# largest `scale_ratio` whose output size is less than or equal to (512, 512)
jpeg_decode(filename; preferred_size=(<=, (512, 512))) # size: (376, 375)
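
To make the selection rule concrete, the sketch below reproduces the two sizes above in plain Julia. It assumes the decoder can only scale by M/8 for M in 1:16 and rounds the scaled size up, which is how libjpeg-turbo documents its scaling; whether JpegTurbo.jl picks the ratio exactly this way is an assumption. `scaled_size` and `pick_scale` are hypothetical helpers, not part of the package.

```julia
# Hypothetical helpers, not part of JpegTurbo.jl.
# Assumption: the decoder scales by M/8 for M in 1:16 and rounds sizes up.
scaled_size(sz::NTuple{2,Int}, M::Int) = cld.(sz .* M, 8)

function pick_scale(sz::NTuple{2,Int}, target::NTuple{2,Int}, op=(>=))
    # keep every M whose scaled size satisfies `op` in both dimensions
    fits = [M for M in 1:16 if all(op.(scaled_size(sz, M), target))]
    isempty(fits) && return nothing
    # `>=` wants the smallest ratio that still fits, `<=` the largest
    M = op === (<=) ? maximum(fits) : minimum(fits)
    return M // 8, scaled_size(sz, M)
end

pick_scale((3002, 3000), (512, 512))          # (1//4, (751, 750))
pick_scale((3002, 3000), (512, 512), (<=))    # (1//8, (376, 375))
```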

Acknowledgements

The purpose of this project is to replace ImageMagick.jl with ImageIO. Steven G. Johnson wrote an early draft of JpegTurbo.jl, and this package steals the name from him :). Clang.jl is used to generate the low-level ccall wrappers. Yupei Qi, the current maintainer of Clang.jl, generously helped me debug the C-related code; this package would not work at all without his help. My earlier project, Sixel.jl, was also developed under his generous guidance.

jpegturbo.jl's People

Contributors

gnimuc, jaakkor2, jkrumbiegel, johnnychen94, mikumikunisiteageru, timholy

jpegturbo.jl's Issues

Non-allocating image loading

In my quest for ever-faster image data pipelines for training neural nets, I've been playing around with the source code to figure out how to reduce allocations when decoding images. I'm writing here to see if my assumptions about how one could go about this are correct and to clear up some questions.

It seems there are two allocations made:

  • in jpeg_decode, a matrix out is created as the Julia representation of the image data which is returned to the caller
  • in _jpeg_decode!, a UInt8-vector buf is created which JpegTurbo writes to

Copying some code from jpeg_decode, I've managed to make a method that takes in an out of the correct size and type and uses that instead of allocating it. There are some segfaults when transposing is not handled correctly or the size and type of the buffer aren't correct, but I assume these can be fixed. In any case, removing this allocation cuts memory usage in half. I assume that something similar could be done for the buf allocation; or for CTs that are based on UInt8s anyway (like RGB{N0f8}), maybe even a view will do.

As for the API to use buffered data loading, I was thinking it may be safest to have a Buffer struct that holds an out and a buf. Since one often wants to reuse this buffer to load images of differing sizes, these buffers could be grown to the largest encountered image size, and images smaller than the current out buffer could be returned as views. It could be used something like this:

buffer = JpegTurbo.Buffer(RGB{N0f8}, initialsize)
img = JpegTurbo.jpeg_decode!(buffer, file|data)
parent(img) === buffer.out

Does this approach make sense? Is there a simpler way? Am I missing something when it comes to the transposing?
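
For reference, the grow-and-return-a-view part of this idea can be sketched without touching the decoder at all. The names `GrowableBuffer` and `reserve!` are hypothetical and not part of JpegTurbo.jl, and `UInt8` stands in for a color type such as `RGB{N0f8}` so that the snippet needs no packages.

```julia
# Hypothetical sketch; not part of JpegTurbo.jl.
mutable struct GrowableBuffer{T}
    out::Matrix{T}
end
GrowableBuffer(::Type{T}, sz::NTuple{2,Int}) where {T} =
    GrowableBuffer{T}(Matrix{T}(undef, sz))

# Return a view of size `sz` into `buf.out`, growing the storage if needed.
function reserve!(buf::GrowableBuffer{T}, sz::NTuple{2,Int}) where {T}
    if any(sz .> size(buf.out))
        # grow to the largest size seen so far in each dimension
        buf.out = Matrix{T}(undef, max.(sz, size(buf.out)))
    end
    return view(buf.out, 1:sz[1], 1:sz[2])
end

buf = GrowableBuffer(UInt8, (4, 4))
small = reserve!(buf, (2, 3))   # fits, so no new allocation
big = reserve!(buf, (6, 2))     # grows `out` to (6, 4)
parent(big) === buf.out         # true
```

One caveat of this design: growing replaces `buf.out`, so views handed out earlier (`small` here) keep pointing at the old array.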

Error message handling

Would there be a way to get the error messages emitted by the C library at the Julia level?
For instance, I would like to be able to control the logging flow of messages like:

Corrupt JPEG data: 283 extraneous bytes before marker 0xd9
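
One possible workaround, sketched below, is to redirect stderr around the call and collect whatever was printed. This assumes the C library writes such messages to stderr, which is what libjpeg's default error manager does; `capture_stderr` is a hypothetical helper, and the `println` call is only a stand-in for a decode call that triggers the warning. Julia's `redirect_stderr` works at the file-descriptor level, so it also affects output written from C.

```julia
# Hypothetical helper: run `f` with stderr redirected to a temporary file and
# return both the result of `f` and the text that was written to stderr.
function capture_stderr(f)
    mktemp() do path, io
        result = redirect_stderr(f, io)
        flush(io)
        return result, read(path, String)
    end
end

result, messages = capture_stderr() do
    # stand-in for a decode call that makes the C library print a warning
    println(stderr, "Corrupt JPEG data: 283 extraneous bytes before marker 0xd9")
    :decoded
end

occursin("Corrupt JPEG data", messages)   # true
```

The captured text can then be passed to `@warn`, written to a log file, or dropped. Note that the redirection is process-wide, so output from other tasks running at the same time is captured too.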

Reading image sizes lazily

I just read that scale_ratio can be passed when decoding, which sounds awesome! Is it also somehow possible (maybe even with an external package) to read the size (h, w) of an image without loading it first?

My use case is a high-throughput scenario: loading images for a deep learning pipeline. Often many of these images are saved at a much larger, but variable, resolution and are scaled and cropped to the same size (e.g. (256, 256)) before being batched and run through the model. It would be super useful to know the image size before decoding the image, so that I can select the smallest possible scale_ratio such that we still have (h > 256, w > 256). I still need to benchmark how this compares in load speed to just loading each image, downscaling it, and saving it once. Even if it only offers a small load speed improvement, avoiding the extra resizing step could also help reduce quality degradation.
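
For what it is worth, the size is stored in the JPEG frame header, so it can be read by walking the marker segments without decoding any pixel data. The sketch below does this in plain Julia; `jpeg_size` is a hypothetical helper, not a JpegTurbo.jl function, and the byte vector at the end is a hand-made header fragment used only for demonstration.

```julia
# Hypothetical helper (not part of JpegTurbo.jl): read (height, width) from the
# JPEG frame header without decoding any pixel data.
function jpeg_size(io::IO)
    (read(io, UInt8) == 0xff && read(io, UInt8) == 0xd8) || error("not a JPEG stream")
    while !eof(io)
        read(io, UInt8) == 0xff || error("expected a marker")
        marker = read(io, UInt8)
        while marker == 0xff                 # skip fill bytes
            marker = read(io, UInt8)
        end
        # standalone markers (TEM, RSTn, SOI, EOI) carry no length field
        (marker == 0x01 || 0xd0 <= marker <= 0xd9) && continue
        len = Int(ntoh(read(io, UInt16)))    # big-endian, includes these 2 bytes
        # SOF0..SOF15, excluding DHT (0xc4), JPG (0xc8) and DAC (0xcc)
        if 0xc0 <= marker <= 0xcf && marker ∉ (0xc4, 0xc8, 0xcc)
            read(io, UInt8)                  # sample precision
            h = Int(ntoh(read(io, UInt16)))
            w = Int(ntoh(read(io, UInt16)))
            return (h, w)
        end
        skip(io, len - 2)                    # jump over the segment payload
    end
    error("no frame header found")
end
jpeg_size(filename::AbstractString) = open(jpeg_size, filename)
jpeg_size(data::Vector{UInt8}) = jpeg_size(IOBuffer(data))

# SOI, a 2-byte APP0 payload, then SOF0 declaring a 256x640 image
header = UInt8[0xff, 0xd8, 0xff, 0xe0, 0x00, 0x04, 0x00, 0x00,
               0xff, 0xc0, 0x00, 0x0b, 0x08, 0x01, 0x00, 0x02, 0x80, 0x01, 0x11, 0x00]
jpeg_size(header)   # (256, 640)
```

With the size known up front, a suitable scale_ratio can be chosen before calling jpeg_decode.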

faster `jpeg_encode` for filename and IO

The current pipeline consists of two stages:

  1. allocate a C-side buffer in memory and store the encoded bytes in it, and
  2. unsafe_wrap the buffer on the Julia side and use Julia's IO operations to write it to the output file

function jpeg_encode(filename::AbstractString, img; kwargs...)
    open(filename, "w") do io
        jpeg_encode(io, img; kwargs...)
    end
end
jpeg_encode(io::IO, img; kwargs...) = write(io, jpeg_encode(img; kwargs...))

If we can fuse the two stages, we can get a faster version.

For jpeg_encode(filename, img) specifically, it's also possible to wrap the C stdio Libc.FILE version.

benchmark results against other backends

JPEG backends comparison

Even though JpegTurbo.jl provides more advanced and efficient in-memory features, the benchmark only tests the filename version because none of the other backends support the in-memory one.

using JpegTurbo
using BenchmarkTools
using TestImages

img = testimage("cameraman");
filename = "tmp.jpg"

jpeg_encode(filename, img);
data = jpeg_encode(img);
@assert read(filename) == data;

@btime jpeg_encode($img); # 855.914 μs (7 allocations: 306.49 KiB)
@btime jpeg_encode(filename, $img); # 1.064 ms (20 allocations: 307.19 KiB)

@assert jpeg_decode(filename) == jpeg_decode(data)
@btime jpeg_decode($data); # 795.819 μs (18 allocations: 514.66 KiB)
@btime jpeg_decode(filename); # 836.992 μs (45 allocations: 630.62 KiB)

Generally speaking, for the filename version, JpegTurbo.jl and OpenCV (python) are the fastest versions since they are both backed by libjpeg-turbo.

v0.1.0
Julia versioninfo:
Julia Version 1.8.0-DEV.1434
Commit 4abf26eec8 (2022-01-30 20:04 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.0 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 8

JpegTurbo.jl versioninfo:
JpegTurbo.jl version: 0.1.0
libjpeg version: 62
libjpeg-turbo version: 2.1.0
bit mode: 8
SIMD: enabled

OpenCV version: 4.5.5
OpenCV libjpeg-turbo version: 2.1.2-62

Scikit-image version: 0.19.1

moonsurface Gray{N0f8} (256, 256)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 0.44 0.33 22.66 39.0516
ImageMagick.jl 1.19 1.47 22.24 39.0533
QuartzImageIO.jl 1.19 0.69 25.01 39.5761
OpenCV (Python) 0.63 1.26 29.47 42.3855
Scikit-image 1.02 2.06 10.83 34.0346

cameraman Gray{N0f8} (512, 512)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 1.20 0.83 50.19 47.1206
ImageMagick.jl 3.33 4.57 49.30 47.1297
QuartzImageIO.jl 2.32 1.23 50.82 47.0878
OpenCV (Python) 1.39 3.33 65.63 49.2061
Scikit-image 1.86 5.07 27.55 41.9095

pirate Gray{N0f8} (512, 512)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 1.31 1.05 79.84 40.9099
ImageMagick.jl 3.81 5.03 78.51 40.9127
QuartzImageIO.jl 2.94 1.49 81.77 41.3253
OpenCV (Python) 1.56 3.80 104.68 43.5602
Scikit-image 2.16 6.04 42.09 35.6561

house Gray{N0f8} (512, 512)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 1.09 0.69 35.70 50.0640
ImageMagick.jl 3.07 4.59 35.16 50.0741
QuartzImageIO.jl 1.98 1.10 36.61 49.6511
OpenCV (Python) 0.87 2.60 46.67 51.8188
Scikit-image 1.37 4.76 20.61 45.5563

rand Gray{Float64} (512, 512)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 2.27 1.79 215.91 38.3134
ImageMagick.jl 4.31 6.15 189.28 38.3145
QuartzImageIO.jl 4.48 2.43 218.95 39.1134
OpenCV (Python) 2.73 4.83 257.61 42.3367
Scikit-image 3.35 9.21 142.45 28.5258

rand Gray{Float64} (4096, 4096)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 334.87 220.86 13795.17 38.3061
ImageMagick.jl 403.20 603.74 12104.39 38.3053
QuartzImageIO.jl 329.46 273.10 13851.35 39.1145
OpenCV (Python) 158.54 235.52 16464.69 42.3227
Scikit-image 209.01 507.61 9091.49 28.5394

fabio RGB{N0f8} (512, 512)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 1.59 3.96 55.91 42.7003
ImageMagick.jl 6.14 6.02 72.76 45.5593
QuartzImageIO.jl 4.50 6.73 55.38 42.0154
OpenCV (Python) 2.65 4.90 72.57 44.0539
Scikit-image 4.48 13.57 31.68 37.8229

barbara RGB{N0f8} (576, 720)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 2.48 6.85 140.21 36.1151
ImageMagick.jl 10.56 10.19 179.70 38.1910
QuartzImageIO.jl 7.65 10.58 139.84 36.0860
OpenCV (Python) 5.11 8.69 185.88 37.2003
Scikit-image 6.87 20.66 74.79 32.7438

mandril RGB{N0f8} (512, 512)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 2.04 4.68 149.28 27.7261
ImageMagick.jl 8.95 7.54 241.40 32.2466
QuartzImageIO.jl 6.15 7.41 150.35 27.7677
OpenCV (Python) 5.59 7.90 190.93 28.2401
Scikit-image 5.08 16.98 76.89 25.6526

coffee RGB{N0f8} (400, 600)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 1.53 3.83 78.10 36.1796
ImageMagick.jl 6.23 5.68 100.48 38.2192
QuartzImageIO.jl 4.47 5.95 78.63 36.1604
OpenCV (Python) 2.99 5.75 102.26 37.4588
Scikit-image 5.01 13.31 40.71 32.2566

lighthouse RGB{N0f8} (512, 768)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 2.57 6.26 125.39 38.6723
ImageMagick.jl 9.51 9.00 147.12 39.6910
QuartzImageIO.jl 7.43 10.17 125.60 38.8406
OpenCV (Python) 4.46 8.26 165.09 40.4860
Scikit-image 7.82 21.04 63.88 33.8235

earth_apollo RGB{N0f8} (3002, 3000)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 69.68 162.96 1779.18 39.5714
ImageMagick.jl 218.13 221.63 2463.45 42.1515
QuartzImageIO.jl 169.27 340.41 1734.75 39.5452
OpenCV (Python) 89.01 142.87 2428.32 40.6026
Scikit-image 144.32 349.93 906.01 37.6173

rand RGB{Float64} (512, 512)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 5.31 5.36 248.71 12.7247
ImageMagick.jl 10.62 9.05 446.51 31.8190
QuartzImageIO.jl 8.03 8.06 249.97 12.7733
OpenCV (Python) 4.58 7.48 300.47 12.7396
Scikit-image 6.77 20.91 154.07 12.6310

rand RGB{Float64} (4096, 4096)

Backend encode time(ms) decode time(ms) encoded size(KB) PSNR(dB)
JpegTurbo.jl 863.61 465.04 15877.02 12.7284
ImageMagick.jl 888.20 872.52 28547.04 31.8186
QuartzImageIO.jl 613.77 711.31 15949.90 12.8192
OpenCV (Python) 279.60 376.26 19189.29 12.7814
Scikit-image 412.91 966.76 9828.86 12.6804

Unable to automatically install 'JpegTurbo'

When we tried to add JpegTurbo in Julia v1.7.3, we got the following error message. What could be the problem?

  Downloaded artifact: JpegTurbo
ERROR: Unable to automatically install 'JpegTurbo' from '/home/zhaoli9/.julia/packages/JpegTurbo_jll/x2k7x/Artifacts.toml'
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] ensure_artifact_installed(name::String, meta::Dict{String, Any}, artifacts_toml::String; platform::Base.BinaryPlatforms.Platform, verbose::Bool, quiet_download::Bool, io::Base.TTY)
    @ Pkg.Artifacts ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/Artifacts.jl:441
  [3] download_artifacts(env::Pkg.Types.EnvCache; platform::Base.BinaryPlatforms.Platform, julia_version::VersionNumber, verbose::Bool, io::Base.TTY)
    @ Pkg.Operations ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/Operations.jl:617
  [4] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}, new_git::Set{Base.UUID}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform)
    @ Pkg.Operations ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/Operations.jl:1182
  [5] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, kwargs::Base.Pairs{Symbol, Base.TTY, Tuple{Symbol}, NamedTuple{(:io,), Tuple{Base.TTY}}})
    @ Pkg.API ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/API.jl:268
  [6] add(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.TTY, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Pkg.API ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/API.jl:149
  [7] add(pkgs::Vector{Pkg.Types.PackageSpec})
    @ Pkg.API ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/API.jl:144
  [8] do_cmd!(command::Pkg.REPLMode.Command, repl::REPL.LineEditREPL)
    @ Pkg.REPLMode ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/REPLMode/REPLMode.jl:407
  [9] do_cmd(repl::REPL.LineEditREPL, input::String; do_rethrow::Bool)
    @ Pkg.REPLMode ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/REPLMode/REPLMode.jl:385
 [10] do_cmd
    @ ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/REPLMode/REPLMode.jl:376 [inlined]
 [11] (::Pkg.REPLMode.var"#24#27"{REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::IOBuffer, ok::Bool)
    @ Pkg.REPLMode ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/REPLMode/REPLMode.jl:549
 [12] #invokelatest#2
    @ ./essentials.jl:716 [inlined]
 [13] invokelatest
    @ ./essentials.jl:714 [inlined]
 [14] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/REPL/src/LineEdit.jl:2493
 [15] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/REPL/src/REPL.jl:1232
 [16] (::REPL.var"#49#54"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ./task.jl:429
