juliaio / JpegTurbo.jl
Julia interface to libjpeg-turbo
License: MIT License
I'm not even sure if this is doable; see the "Error handling" section in libjpeg.txt and the "JPEG DECOMPRESSION SAMPLE INTERFACE" part in example.c. It's unclear to me how to create a `jmp_buf` from the Julia side.
The current pipeline consists of two stages:
```julia
function jpeg_encode(filename::AbstractString, img; kwargs...)
    open(filename, "w") do io
        jpeg_encode(io, img; kwargs...)
    end
end

jpeg_encode(io::IO, img; kwargs...) = write(io, jpeg_encode(img; kwargs...))
```
If we could fuse the two stages, we would get a faster version. For `jpeg_encode(filename, img)` specifically, it's also possible to wrap the C stdio (`Libc.FILE`) version.
Just read that `scale_ratio` can be passed when decoding, which sounds awesome! Is it also somehow possible (maybe even with an external package) to read the size `(h, w)` of an image without loading it first?

My use case is a high-throughput scenario: loading images for a deep learning pipeline. Often many of these images are saved at a much larger, but variable, resolution and are scaled and cropped to the same size (e.g. `(256, 256)`) before being batched and run through the model. It would be super useful to know the image size before decoding, so that I can select the smallest possible `scale_ratio` such that we still have `h > 256` and `w > 256`. I still need to benchmark how this compares in load speed to just loading each image once, downscaling, and saving it. Even if it offers only a modest speedup, avoiding the extra resizing preprocessing step could also help reduce quality degradation.
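For reference, the image size can be read from the JPEG header alone, without decoding any pixel data: the SOF (start-of-frame) marker segment stores height and width as big-endian 16-bit integers. The following is a minimal sketch of such a header scan; `jpeg_size` is a hypothetical helper, not part of JpegTurbo.jl's API, and it doesn't handle every quirk of the marker stream (e.g. fill bytes).

```julia
# Hypothetical helper: scan the JPEG marker stream for the SOF segment and
# return (h, w) without decoding pixel data. Per ITU-T T.81, the SOF payload
# is: precision (1 byte), height (2 bytes, big-endian), width (2 bytes).
function jpeg_size(data::Vector{UInt8})
    @assert length(data) >= 4 && data[1] == 0xff && data[2] == 0xd8 "not a JPEG (missing SOI)"
    i = 3  # first marker after the SOI marker (0xFF 0xD8)
    while i + 3 <= length(data)
        data[i] == 0xff || error("invalid marker at offset $i")
        marker = data[i+1]
        seglen = (Int(data[i+2]) << 8) | Int(data[i+3])
        # SOF0..SOF15, excluding DHT (0xC4), JPG (0xC8), DAC (0xCC)
        if 0xc0 <= marker <= 0xcf && marker ∉ (0xc4, 0xc8, 0xcc)
            h = (Int(data[i+5]) << 8) | Int(data[i+6])
            w = (Int(data[i+7]) << 8) | Int(data[i+8])
            return (h, w)
        end
        i += 2 + seglen  # skip the two marker bytes plus the segment payload
    end
    error("no SOF marker found before end of data")
end
```

With `(h, w)` in hand, picking the smallest admissible `scale_ratio` becomes a cheap arithmetic step before the actual decode.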
In my quest for ever-faster image data pipelines for training neural nets, I've been playing around with the source code to figure out how to reduce allocations when decoding images. I'm writing here to see if my assumptions about how one could go about this are correct and to clear up some questions.
It seems there are two allocations made:

- in `jpeg_decode`, a matrix `out` is created as the Julia representation of the image data, which is returned to the caller
- in `_jpeg_decode!`, a `UInt8` vector `buf` is created which JpegTurbo writes to

Copying some code from `jpeg_decode`, I've managed to make a method that takes in an `out` of the correct size and type and uses that instead of allocating it. There are some segfaults when transposing is not handled correctly or the size and type of the buffer aren't correct, but I assume these can be fixed. In any case, removing this allocation cuts memory usage in half. I assume that something similar could be done for the `buf` allocation; or for `CT`s that are based on `UInt8` anyway (like `RGB{N0f8}`), maybe even a view will do.
As for the API for buffered data loading, I was thinking it may be safest to have a `Buffer` struct that holds an `out` and a `buf`. Since one often wants to reuse this buffer to load images of differing sizes, these buffers could be grown to the largest encountered image size, and images smaller than the current `out` buffer returned as views. This could be used something like:
```julia
buffer = JpegTurbo.Buffer(RGB{N0f8}, initialsize)
img = JpegTurbo.jpeg_decode!(buffer, file|data)
parent(img) === buffer.out
```
Does this approach make sense? Is there a simpler way? Am I missing something when it comes to the transposing?
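The grow-and-view mechanics described above could look like the following sketch. Everything here is hypothetical (`Buffer`, `ensure!`, and the `out`/`buf` fields are the names discussed in this issue, not an existing JpegTurbo.jl API), and the decode step itself is omitted:

```julia
# Hypothetical reusable decode buffer: `out` holds the decoded image,
# `buf` is the raw byte buffer the C library writes scanlines into.
mutable struct Buffer{CT}
    out::Matrix{CT}
    buf::Vector{UInt8}
end

Buffer(::Type{CT}, sz::Dims{2}) where {CT} =
    Buffer{CT}(Matrix{CT}(undef, sz), Vector{UInt8}(undef, sizeof(CT) * prod(sz)))

# Grow the backing storage to cover `sz` if needed, then return a view of
# exactly `sz` — smaller images come back as views into the larger buffer.
function ensure!(b::Buffer{CT}, sz::Dims{2}) where {CT}
    if any(sz .> size(b.out))
        newsz = max.(sz, size(b.out))
        b.out = Matrix{CT}(undef, newsz)
        resize!(b.buf, sizeof(CT) * prod(newsz))
    end
    return view(b.out, 1:sz[1], 1:sz[2])
end
```

A `jpeg_decode!(buffer, data)` method would then call `ensure!` with the size read from the header and decode into the returned view, so `parent(img) === buffer.out` holds as proposed.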
This is needed to support JPEG compression for TiffImages.
See section "Abbreviated datastreams and multiple images" in https://raw.githubusercontent.com/libjpeg-turbo/libjpeg-turbo/main/libjpeg.txt
cc: @tlnagy
Use BenchmarkTools and PkgBenchmark.
See the "Progressive JPEG support" section in https://raw.githubusercontent.com/libjpeg-turbo/libjpeg-turbo/main/libjpeg.txt
Would there be a way to get the error messages thrown by C at the Julia level? For instance, I would like to be able to control the logging of messages like:
Corrupt JPEG data: 283 extraneous bytes before marker 0xd9
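One possible route, sketched here as an assumption rather than an existing JpegTurbo.jl feature: libjpeg routes all warnings (including the "Corrupt JPEG data" message above) through the `emit_message` slot of its `jpeg_error_mgr`, and `@cfunction` lets Julia provide that callback, so the message could be forwarded to Julia's logging system instead of stderr.

```julia
# Hedged sketch: a Julia-side replacement for libjpeg's emit_message callback.
# The C signature is (j_common_ptr cinfo, int msg_level); actually wiring this
# into the error manager (and formatting the message via format_message) is
# omitted and would require access to JpegTurbo's internal cinfo struct.
function julia_emit_message(cinfo::Ptr{Cvoid}, msg_level::Cint)
    # msg_level < 0 indicates a warning in libjpeg's convention; one could
    # forward the formatted text to @warn / @debug here.
    return nothing
end

# Create a C-callable function pointer suitable for storing in jpeg_error_mgr.
const emit_cb = @cfunction(julia_emit_message, Cvoid, (Ptr{Cvoid}, Cint))
```

This only controls warnings; fatal errors go through `error_exit`, which is the `jmp_buf` problem discussed elsewhere in this repo.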
Even though JpegTurbo.jl provides more advanced and efficient in-memory features, the benchmark only tests the filename version, because none of the other backends support anything else.
```julia
using JpegTurbo
using BenchmarkTools
using TestImages

img = testimage("cameraman");
filename = "tmp.jpg"
jpeg_encode(filename, img);
data = jpeg_encode(img);
@assert read(filename) == data

@btime jpeg_encode($img);            # 855.914 μs (7 allocations: 306.49 KiB)
@btime jpeg_encode(filename, $img);  # 1.064 ms (20 allocations: 307.19 KiB)

@assert jpeg_decode(filename) == jpeg_decode(data)
@btime jpeg_decode($data);    # 795.819 μs (18 allocations: 514.66 KiB)
@btime jpeg_decode(filename); # 836.992 μs (45 allocations: 630.62 KiB)
```
Generally speaking, for the filename version, JpegTurbo.jl and OpenCV (Python) are the fastest, since both are backed by libjpeg-turbo.
Julia versioninfo:
Julia Version 1.8.0-DEV.1434
Commit 4abf26eec8 (2022-01-30 20:04 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin18.7.0)
CPU: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.0 (ORCJIT, skylake)
Environment:
JULIA_NUM_THREADS = 8
JpegTurbo.jl versioninfo:
JpegTurbo.jl version: 0.1.0
libjpeg version: 62
libjpeg-turbo version: 2.1.0
bit mode: 8
SIMD: enabled
OpenCV version: 4.5.5
OpenCV libjpeg-turbo version: 2.1.2-62
Scikit-image version: 0.19.1
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 0.44 | 0.33 | 22.66 | 39.0516 |
ImageMagick.jl | 1.19 | 1.47 | 22.24 | 39.0533 |
QuartzImageIO.jl | 1.19 | 0.69 | 25.01 | 39.5761 |
OpenCV (Python) | 0.63 | 1.26 | 29.47 | 42.3855 |
Scikit-image | 1.02 | 2.06 | 10.83 | 34.0346 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 1.20 | 0.83 | 50.19 | 47.1206 |
ImageMagick.jl | 3.33 | 4.57 | 49.30 | 47.1297 |
QuartzImageIO.jl | 2.32 | 1.23 | 50.82 | 47.0878 |
OpenCV (Python) | 1.39 | 3.33 | 65.63 | 49.2061 |
Scikit-image | 1.86 | 5.07 | 27.55 | 41.9095 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 1.31 | 1.05 | 79.84 | 40.9099 |
ImageMagick.jl | 3.81 | 5.03 | 78.51 | 40.9127 |
QuartzImageIO.jl | 2.94 | 1.49 | 81.77 | 41.3253 |
OpenCV (Python) | 1.56 | 3.80 | 104.68 | 43.5602 |
Scikit-image | 2.16 | 6.04 | 42.09 | 35.6561 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 1.09 | 0.69 | 35.70 | 50.0640 |
ImageMagick.jl | 3.07 | 4.59 | 35.16 | 50.0741 |
QuartzImageIO.jl | 1.98 | 1.10 | 36.61 | 49.6511 |
OpenCV (Python) | 0.87 | 2.60 | 46.67 | 51.8188 |
Scikit-image | 1.37 | 4.76 | 20.61 | 45.5563 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 2.27 | 1.79 | 215.91 | 38.3134 |
ImageMagick.jl | 4.31 | 6.15 | 189.28 | 38.3145 |
QuartzImageIO.jl | 4.48 | 2.43 | 218.95 | 39.1134 |
OpenCV (Python) | 2.73 | 4.83 | 257.61 | 42.3367 |
Scikit-image | 3.35 | 9.21 | 142.45 | 28.5258 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 334.87 | 220.86 | 13795.17 | 38.3061 |
ImageMagick.jl | 403.20 | 603.74 | 12104.39 | 38.3053 |
QuartzImageIO.jl | 329.46 | 273.10 | 13851.35 | 39.1145 |
OpenCV (Python) | 158.54 | 235.52 | 16464.69 | 42.3227 |
Scikit-image | 209.01 | 507.61 | 9091.49 | 28.5394 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 1.59 | 3.96 | 55.91 | 42.7003 |
ImageMagick.jl | 6.14 | 6.02 | 72.76 | 45.5593 |
QuartzImageIO.jl | 4.50 | 6.73 | 55.38 | 42.0154 |
OpenCV (Python) | 2.65 | 4.90 | 72.57 | 44.0539 |
Scikit-image | 4.48 | 13.57 | 31.68 | 37.8229 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 2.48 | 6.85 | 140.21 | 36.1151 |
ImageMagick.jl | 10.56 | 10.19 | 179.70 | 38.1910 |
QuartzImageIO.jl | 7.65 | 10.58 | 139.84 | 36.0860 |
OpenCV (Python) | 5.11 | 8.69 | 185.88 | 37.2003 |
Scikit-image | 6.87 | 20.66 | 74.79 | 32.7438 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 2.04 | 4.68 | 149.28 | 27.7261 |
ImageMagick.jl | 8.95 | 7.54 | 241.40 | 32.2466 |
QuartzImageIO.jl | 6.15 | 7.41 | 150.35 | 27.7677 |
OpenCV (Python) | 5.59 | 7.90 | 190.93 | 28.2401 |
Scikit-image | 5.08 | 16.98 | 76.89 | 25.6526 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 1.53 | 3.83 | 78.10 | 36.1796 |
ImageMagick.jl | 6.23 | 5.68 | 100.48 | 38.2192 |
QuartzImageIO.jl | 4.47 | 5.95 | 78.63 | 36.1604 |
OpenCV (Python) | 2.99 | 5.75 | 102.26 | 37.4588 |
Scikit-image | 5.01 | 13.31 | 40.71 | 32.2566 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 2.57 | 6.26 | 125.39 | 38.6723 |
ImageMagick.jl | 9.51 | 9.00 | 147.12 | 39.6910 |
QuartzImageIO.jl | 7.43 | 10.17 | 125.60 | 38.8406 |
OpenCV (Python) | 4.46 | 8.26 | 165.09 | 40.4860 |
Scikit-image | 7.82 | 21.04 | 63.88 | 33.8235 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 69.68 | 162.96 | 1779.18 | 39.5714 |
ImageMagick.jl | 218.13 | 221.63 | 2463.45 | 42.1515 |
QuartzImageIO.jl | 169.27 | 340.41 | 1734.75 | 39.5452 |
OpenCV (Python) | 89.01 | 142.87 | 2428.32 | 40.6026 |
Scikit-image | 144.32 | 349.93 | 906.01 | 37.6173 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 5.31 | 5.36 | 248.71 | 12.7247 |
ImageMagick.jl | 10.62 | 9.05 | 446.51 | 31.8190 |
QuartzImageIO.jl | 8.03 | 8.06 | 249.97 | 12.7733 |
OpenCV (Python) | 4.58 | 7.48 | 300.47 | 12.7396 |
Scikit-image | 6.77 | 20.91 | 154.07 | 12.6310 |
Backend | encode time(ms) | decode time(ms) | encoded size(KB) | PSNR(dB) |
---|---|---|---|---|
JpegTurbo.jl | 863.61 | 465.04 | 15877.02 | 12.7284 |
ImageMagick.jl | 888.20 | 872.52 | 28547.04 | 31.8186 |
QuartzImageIO.jl | 613.77 | 711.31 | 15949.90 | 12.8192 |
OpenCV (Python) | 279.60 | 376.26 | 19189.29 | 12.7814 |
Scikit-image | 412.91 | 966.76 | 9828.86 | 12.6804 |
See the "Partial image decompression" section in https://raw.githubusercontent.com/libjpeg-turbo/libjpeg-turbo/main/example.txt
The following two methods are missing in #3:

- `jpeg_decode(io::IO)`
- `jpeg_decode(bytes::Vector{UInt8})`
When we tried to add JpegTurbo on Julia v1.7.3, we got the following error message. What could be the problem?
Downloaded artifact: JpegTurbo
ERROR: Unable to automatically install 'JpegTurbo' from '/home/zhaoli9/.julia/packages/JpegTurbo_jll/x2k7x/Artifacts.toml'
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] ensure_artifact_installed(name::String, meta::Dict{String, Any}, artifacts_toml::String; platform::Base.BinaryPlatforms.Platform, verbose::Bool, quiet_download::Bool, io::Base.TTY)
@ Pkg.Artifacts ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/Artifacts.jl:441
[3] download_artifacts(env::Pkg.Types.EnvCache; platform::Base.BinaryPlatforms.Platform, julia_version::VersionNumber, verbose::Bool, io::Base.TTY)
@ Pkg.Operations ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/Operations.jl:617
[4] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}, new_git::Set{Base.UUID}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform)
@ Pkg.Operations ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/Operations.jl:1182
[5] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, kwargs::Base.Pairs{Symbol, Base.TTY, Tuple{Symbol}, NamedTuple{(:io,), Tuple{Base.TTY}}})
@ Pkg.API ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/API.jl:268
[6] add(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.TTY, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Pkg.API ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/API.jl:149
[7] add(pkgs::Vector{Pkg.Types.PackageSpec})
@ Pkg.API ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/API.jl:144
[8] do_cmd!(command::Pkg.REPLMode.Command, repl::REPL.LineEditREPL)
@ Pkg.REPLMode ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/REPLMode/REPLMode.jl:407
[9] do_cmd(repl::REPL.LineEditREPL, input::String; do_rethrow::Bool)
@ Pkg.REPLMode ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/REPLMode/REPLMode.jl:385
[10] do_cmd
@ ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/REPLMode/REPLMode.jl:376 [inlined]
[11] (::Pkg.REPLMode.var"#24#27"{REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::IOBuffer, ok::Bool)
@ Pkg.REPLMode ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/Pkg/src/REPLMode/REPLMode.jl:549
[12] #invokelatest#2
@ ./essentials.jl:716 [inlined]
[13] invokelatest
@ ./essentials.jl:714 [inlined]
[14] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
@ REPL.LineEdit ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/REPL/src/LineEdit.jl:2493
[15] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
@ REPL ~/works/public_code/julia-1.7.3/share/julia/stdlib/v1.7/REPL/src/REPL.jl:1232
[16] (::REPL.var"#49#54"{REPL.LineEditREPL, REPL.REPLBackendRef})()
@ REPL ./task.jl:429
This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers. Please see this post on Discourse for instructions and more details. If you'd like for me to do this for you, comment `TagBot fix` on this issue. I'll open a PR within a few hours; please be patient!