= libimageflow + imageflow-server

Imageflow will bring world-class image quality and performance to all languages through a C-compatible API (libimageflow) and a separate RESTful turnkey HTTP server and command-line tool. Linux, Mac, and Windows are supported. The Imageflow Kickstarter ended successfully!

[ ](https://travis-ci.org/imazen/imageflow/builds)

How can I help?

Send us 'challenging' images and tasks.

Verifiably valid images that tend to fail with other tools,
Tasks that are complex and unexpected
Sets of (input/task/expected output) triples that meet the above criteria, or fail with other tools.

Explore the new JSON API and give us feedback.

I frequently request public comment on potentially controversial API changes. Please discuss; additional perspectives are very helpful for understanding the full scope of tradeoffs and effects a change may have.

We're currently discussing the JSON API.

Help us set up post-CI test suites and benchmark runners.

We have compare.rb to compare results with ImageWorsener and ImageMagick
We have scope.sh (same folder) to analyze Imageflow output with ResampleScope.
We have integration tests, with checksummed and expected image results stored on S3.
We use DSSIM from @pornel for checking expected/actual similarity when it's expected to differ by more than rounding errors.
We use off-by-one checking for everything else, to avoid floating-point differences becoming PITA.
Off-by-one checking mean we have to store the 'expected' somewhere, and can't rely exclusively on hashes. We currently put them on S3 manually and pull them down automatically. See imageflow_core/tests/visuals.rs

The above is not nearly enough. Now that we have a JSON API, we can store and run integration tests on a larger scale.

It would be ideal to have a set of scripts capable of updating, uploading, and launching linux docker containers in the cloud (almost any cloud, although I prefer DO and AWS), running tests, and having them upload their results to S3 and shut themselves down when they're done. AWS Lambda's 5-minute limit is not enough, unfortuantely. We use AppVeyor and Travis for our core tests, but I expect our suite to hit 1-2 hours for exericising all edge cases for all the images involved (also, basic fuzz testing is slow). This is particularly valuable for benchmarks, where running a psuedo-baremetal machine for long periods is cost-prohibitive - but lanuching a docker container on a maximum-size AWS instance gets us pretty close to baremetal performance, and might even work for logging and evaluating performance results/regressions over time.

Do you have a physical Windows box?

Virtual machines aren't great for benchmarks. Imageflow (flow-proto1, for now) could benefit from independent benchmarking on physical machines. Or, if you know how to script borrowing a consistent-performing Windows box in the cloud, and setting it up/tearing it down, that would be ideal. We have build scripts that work on AppVeyor, but that's not very useful for benchmarking.

Algorithm implementation work

API Work

Refactorings

How to build

We're assuming you've cloned already.

     git clone [email protected]:imazen/imageflow.git
     cd imageflow

All build scripts support VALGRIND=True to enable valgrind instrumentation of automated tests.

Docker (linux/macOS)

docker pull imazen/build_if_gcc54
cd ci/docker
./test.sh build_if_gcc54

Linux

We need quite a few packages in order to build all dependencies. You probably have most of these already.

You'll need both Python 3 and Python 2.7. Ruby is optional, but useful for extras.

apt-get for Ubuntu Trusty

sudo apt-get install --no-install-recommends \
  apt-utils sudo build-essential wget git nasm dh-autoreconf pkg-config curl \
  libpng-dev libssl-dev ca-certificates \
  libcurl4-openssl-dev libelf-dev libdw-dev python2.7-minimal \
  python3-minimal python3-pip python3-setuptools valgrind

apt-get Ubuntu Xenial

sudo apt-get install --no-install-recommends \
    apt-utils sudo build-essential wget git nasm dh-autoreconf pkg-config curl \
    libpng-dev libssl-dev ca-certificates \
    rubygems-integration ruby libcurl4-openssl-dev libelf-dev libdw-dev python2.7-minimal \
    python3-minimal python3-pip python3-setuptools valgrind

If you don't have Xenial or Trusty, adapt the above to work with your distro.

After running apt-get (or your package manager), you'll need conan, cmake, dssim, and Rust Nightly 2016-09-01.

curl https://sh.rustup.rs -sSf | sh -s -- -y --default-toolchain nightly-2016-09-01
sudo pip3 install conan
./ci/nixtools/install_cmake.sh
./ci/nixtools/install_dssim.sh
./build.sh

OS X

You'll need a bit less on OS X, although this may not be comprehensive:

brew update || brew update
brew install cmake || true
brew install --force openssl || true
brew link openssl --force || true
brew install conan nasm
./ci/nixtools/install_dssim.sh
./build.sh

Windows

Don't try to open anything in any IDE until you've run conan install, as cmake won't be complete.

You'll need Git, NASM, curl, Rust, OpenSSL, Conan, CMake, and Chocolatey.

See ci/wintools for installation scripts for the above tools.

Run win_verify_tools.bat to check on your tool status.
Run win_enter_env.bat to start a sub-shell with VS tools loaded and a proper PATH
Run win_build_c.bat to compile the C components
Run win_build_rust.bat to compile everything except the web server.
Run win_build_rust_server.bat to compile the HTTP server.

Windows: build/Imageflow.sln will be created during 'win_build_c.bat', but is only set up for Release mode compilation by default. Switch configuration to Release to get a build. You'll need to run conan install directly if you want to change architecture, since the solutions need to be regeneterated.

cd build
conan install -u --file ../conanfile.py --scope build_tests=True --build missing  -s build_type=Release -s arch=x86_64
cd ..
conan build

libimageflow is still in the prototype phase. It is neither API-stable nor secure.

The Problem - Why we need imageflow

Image processing is a ubiquitous requirement. All popular CMSes, many CDNs, and most asset pipelines implement at least image cropping, scaling, and recoding. The need for mobile-friendly websites (and consequently responsive images) makes manual asset creation methods time-prohibitive. Batch asset generation is error-prone, highly latent (affecting UX), and severely restricts web development agility.

Existing implementations lack tests and are either (a) incorrect, and cause visual artifacts or (b) so slow that they've created industry cargo-cult assumptions about "architectural needs"; I.e, always use a queue and workers, because we can gzip large files on the fly but not jpeg encode them (which makes no sense from big O standpoint). This creates artificial infrastructure needs for many small/medium websites, and makes it expensive to offer image processing as part of a CDN or optimization layer. We can eliminate this problem, and make the web faster for all users.

Image resampling is difficult to do correctly, and hard to do efficiently. Few attempts have been made at both. Our algorithm can resample a 16MP image in 84ms using just one core. On a 16-core server, we can resample 15 such images in 262ms. Modern performance on huge matrices is all about cache-friendliness and memory latency. Compare this to 2+ seconds for FreeImage to do the same operation on 1 image with inferior accuracy. ImageMagick must be compiled in (much slower) HDRI to prevent artifacts, and even with OpenMP enabled, using all cores, is still more than an order of magnitude slower (two orders of magnitude without perf tuning).

In addition, it rarely took me more than 45 minutes to discover a vulnerability in the imaging libraries I worked with. Nearly all imaging libraries were designed as offline toolkits for processing trusted image data, accumulating years of features and attack surface area before being moved to the server. Image codecs have an even worse security record than image processing libraries, yet released toolkit binaries often include outdated and vulnerable versions.

@jcupitt, author of the excellent libvips has this advice for using any imaging library:

I would say the solution is layered security.

Only enable the load libraries you really need. For example, libvips will open microscope slide images, which most websites will not require.

Keep all the image load libraries patched and updated daily.
Keep the image handling part of a site in a sandbox: a separate process, or even a separate machine, running as a low-privilege user.
Kill and reset the image handling system regularly, perhaps every few images.

This accurate advice should be applied to any use of ImageMagick, GraphicsMagick, LibGD, FreeImage, or OpenCV.

Also, make sure that whichever library you choose has good test coverage and automatic Valgrind and Coverity scanning set up. Also, read the Coverity and valgrind reports.

Unfortunately, in-process or priviledged exeuction is the default in every CMS or image server whose code I've reviewed.

Given the unlikelyhood of software developers learning correct sandboxing in masse (which isn't even possible to do securely on windows), it seems imperative that we create an imaging library that is safe for in-process use.

The proposed solution: Create a test-covered library that is safe for use with malicious data, and says NO to any of the following

Operations that do not have predictable resource (RAM/CPU) consumption.
Operations that cannot be performed in under 100ms on a 16MP jpeg, on a single i7 core.
Operations that undermine security in any way.
Dependencies that have a questionable security track-record. LibTiff, etc.
Optimizations that cause incorrect results (such as failing to perform color-correction, or scaling in the sRGB space instead of linear). (Or using 8-bit instead of 14-bit per channel when working in linear - this causes egregious truncation/color banding).
Abstractions that prevent major optimizations (over 30%). Some of the most common (enforced codec agnosticism) can prevent ~3000% reductions in cost.

Simplifying assumptions

32-bit sRGB is our 'common memory format'. To interoperate with other libraries (like Cairo, if users want to do text/vector/svg), we must support endian-specific layout. (BGRA on little-endian, ARGB on big-endian). Endian-agnostic layout may also be required by some libraries; this needs to be confirmed or disproven.
We use 128-bit floating point (BGRA, linear, premultiplied) for operations that blend pixels. (Linear RGB in 32-bits causes severe truncation).
The uncompressed 32-bit image can fit in RAM. If it can't, we don't do it. This is for web output use, not scientific or mapping applications. Also; at least 1 matrix transposition is required for downsampling an image, and this essentially requires it all to be in memory. No paging to disk, ever!
We support jpeg, gif, and png natively. All other codecs are plugins. We only write sRGB output.

Integration options

The components

libjpeg-turbo or mozjpeg
libpng
giflib
LittleCMS
ImageResizer - FastScaling for optimized, single-pass rendering.
ImageResizer (From which we will port most of the domain logic, if not the image decoding/encoding portions)
OpenCV or CCV for separate plugin to address face-aware auto-cropping.

All of the "hard" problems have been solved individually; we have proven performant implementations to all the expensive parts of image processing.

We also have room for more optimizations - by integrating with the codecs at the block and scan-line level, we can greatly reduce RAM and resource needs when downsampling large images. Libvips has proven that this approach can be incredibly fast.

A generic graph-based representation of an image processing workflow enables advanced optimizations and potentially lets us pick the fastest or best backend depending upon image format/resolution and desired workflow. Given how easily most operations compose, this could easily make the average workflow 3-8x faster, particularly when we can compose decoding and scaling for certain codecs.

API needs.

We should separate our high-level API needs from our low-level primitive needs.

At a high level, users will want (or end up creating) both declarative (result-descriptive) and imperative (ordered operation) APIs. People reason about images in a lot of different ways, and if the tool doesn't match their existing mental pattern, they'll create one that does.

A descriptive API is the most frequently used, and we drafted RIAPI to standardize the basics.

Among the many shiny advanced features that I've published over the years, a couple have stood out as particularly useful and popular with end-users.

Whitespace cropping - Apply an energy filter (factoring in all 4 channels!) and then crop off most of the non-energy bounds below a threshold. This saves tremendous time for all e-commerce users.
Face-aware cropping - Any user profile photo will need to be cropped to multiple aspect ratios, in order to meet native app and constrained space needs. Face detection can be extremely fast (particularly if your scaling algorithm is fast), and this permits the server to make smart choices about where to center the crop (or if padding is required!).

The former (whitespace cropping) doesn't require any dependencies. The latter, face rectangle detection may or may not be easily extracted from OpenCV/ccv; this might involve a dependency. The data set is also several megabytes, so it justifies a separate assembly anyway.

How does one learn image processing?

There are not many great textbooks on the subject. Here are some from my personal bookshelf. Between them (and Wikipedia) I was able to put together about 60% of the knowledge I needed; the rest I found by reading the source code to many popular image processing libraries.

I would start by reading Principles of Digital Image Processing: Core Algorithms front-to-back, then Digital Image Warping. Wikipedia is also good, although the relevant pages are not linked or categorized together - use specific search terms, like "bilinear interpolation" and "Lab color space".

The Graphics Gems series is great for optimization inspiration:

Also, I made some notes regarding issues to be aware of when creating an imaging library.

I'm not aware of any implementations of (say, resampling) that are completely correct. Very recent editions of ImageMagick are very close, though. Most offer a wide selection of 'filters', but fail to scale/truncate the input or output offsets appropriately, and the resulting error is usually greater than the difference between the filters.

Source code to read

I have found the source code for OpenCV, LibGD, FreeImage, Libvips, Pixman, Cairo, ImageMagick, stb_image, Skia, and FrameWave is very useful for understanding real-world implementations and considerations. Most textbooks assume an infinite plane, ignore off-by-one errors, floating-point limitations, color space accuracy, and operational symmetry within a bounded region. I cannot recommend any textbook as an accurate reference, only as a conceptual starting point.

Also, keep in mind that computer vision is very different from image creation. In computer vision, resampling accuracy matters very little, for example. But in image creation, you are serving images to photographers, people with far keener visual perception than the average developer. The images produced will be rendered side-by-side with other CSS and images, and the least significant bit of inaccuracy is quite visible. You are competing with Lightroom; with offline tools that produce visually perfect results. End-user software will be discarded if photographers feel it is corrupting their work.

sadiqmmm / imageflow Goto Github PK

imageflow's Introduction