
Comments (20)

0xC0000054 commented on June 12, 2024

> If you make MaxThreads available through the FileType's PropertyCollection / PropertyNames, I can add a benchmark to PdnBench and test out performance that way. (edit: as in, I can benchmark it at various values of MaxThreads)

I changed the AvifFileTypePlugin class to public and added a constructor overload that allows the caller to set the max encoder thread count.
If the maxEncoderThreads parameter is null, the plugin will use its default number of threads.

from pdn-avif.

0xC0000054 commented on June 12, 2024

aom does use quite a bit of memory when encoding, and it is also slow for larger images.
Another encoder option is rav1e; it is faster but could produce lower quality images than aom.
That would also require me to rewrite my encoding code to use the rav1e API.

rickbrew commented on June 12, 2024

Do we know why aom uses so much memory? 10-20x is just ... mind boggling

rickbrew commented on June 12, 2024

And don't I remember you saying aom doesn't check if an allocation returns null? It just dies or stomps on null pointers if the process runs out of memory?

rickbrew commented on June 12, 2024

So my take on this is that using up gigabytes of memory isn't okay. And if aom isn't handling errors (e.g. malloc returns null), that's definitely not acceptable and is a potential security problem. I'd prefer using rav1e in that case even if it produces lower quality images, as long as they're working on improving the quality (even at the expense of performance).

I am least worried about the performance of encoding, although it was definitely very slow when I was trying it out on my Ryzen 3950X.

0xC0000054 commented on June 12, 2024

> And don't I remember you saying aom doesn't check if an allocation returns null?

No, that is libavif, the library that reads and writes the AVIF file format.
I wrote my own C# code to read and write the AVIF file format just to avoid that bug.

> Do we know why aom uses so much memory? 10-20x is just ... mind boggling

I do not, but v2.0.0 should use less memory than the previous 1.x versions.
aom is the AV1 reference encoder built during the development of the AV1 standard, so it may be that they were focusing on quality / correctness instead of CPU and memory usage.

rickbrew commented on June 12, 2024

I see, that makes sense. Do we know how much less memory 2.0 should use? And an ETA for its completion?

0xC0000054 commented on June 12, 2024

> Do we know how much less memory 2.0 should use? And an ETA for its completion?

I am currently using 2.0.0.

0xC0000054 commented on June 12, 2024

@rickbrew

I just tested this.

It appears that aom uses up to 2.8 GB when encoding that file with a single thread.
This is still a lot of memory for the size of the image.

The API only allows you to set the maximum number of encoding threads.
When I set it to use up to 8 threads the memory usage jumps to ~3.2 GB, so almost 4 GB on a 32 thread CPU certainly sounds possible.

I also tested encoding a few other image sizes with a single thread.

A 3820x2160 (4K) image used ~2 GB of memory.
A 1920x1080 image used ~700 MB of memory.
A 1280x720 image used ~420 MB of memory.

It is definitely looking like aom over-commits memory for images above 1920x1080.
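To put the figures above on a common scale, here is a rough sketch in Python (not the plugin's C#) that converts each reported peak into bytes of memory per pixel. The function name is made up for illustration, 1 MB is assumed to be 1,000,000 bytes, and the inputs are the approximate single-thread numbers quoted above, so treat the output as ballpark only.

```python
def bytes_per_pixel(width: int, height: int, peak_mb: float) -> float:
    """Approximate encoder memory per pixel, assuming 1 MB = 1_000_000 bytes."""
    return peak_mb * 1_000_000 / (width * height)


# (width, height, approximate peak memory in MB), copied from the figures above
measurements = [
    (3820, 2160, 2000),
    (1920, 1080, 700),
    (1280, 720, 420),
]

for width, height, peak_mb in measurements:
    ratio = bytes_per_pixel(width, height, peak_mb)
    print(f"{width}x{height}: ~{ratio:.0f} bytes per pixel")
```

An uncompressed 8-bit RGBA pixel takes 4 bytes, which gives a sense of how large these per-pixel figures are.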

Regarding rav1e, I will have to wait for them to release version 0.4 due to needing support for encoding YUV 4:0:0 (monochrome) images.

rickbrew commented on June 12, 2024

Any chance this commit will improve the memory usage? https://aomedia.googlesource.com/aom/+/91579255d37c61ec3d470f0eaba9074e5cf93d2c

Edit: the commit is dated after the release of 2.0, afaict

I'll test PDN in x86 mode to try and force out-of-memory and validate that we get some kind of error when aom hits out-of-memory instead of some spurious invalid pointer stomping that will result in who-knows-what

0xC0000054 commented on June 12, 2024

> Any chance this commit will improve the memory usage?

It might.
Hopefully the AOM team will start issuing regular releases, but I can always periodically check the libavif repository to see what AOM commit it is bundling.

> I'll test PDN in x86 mode to try and force out-of-memory and validate that we get some kind of error when aom hits out-of-memory instead of some spurious invalid pointer stomping that will result in who-knows-what

Sounds good.
I know the out-of-memory reporting works when initializing the encoder; I had to optimize my color space conversion code to avoid that on x64.

0xC0000054 commented on June 12, 2024

@rickbrew

I added a maxThreads field to the encoder options:

maxThreads = Environment.ProcessorCount

Try adjusting that value and see what performance / memory usage trade-offs you can find.

I am wondering if there may be a limit on how many cores/threads can do useful work before it becomes a waste of memory and/or CPU time.
Trying to use up to 32 threads for one image seems like it could waste a lot of CPU time waiting for various threads to synchronize, but I could be wrong.
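For anyone experimenting with the value, a minimal sketch of how such a limit might be resolved is shown below, written in Python rather than the plugin's C#. The function name and parameters are hypothetical, and clamping the request to the processor count is an assumption of this sketch, not something the plugin is confirmed to do.

```python
import os
from typing import Optional


def resolve_thread_count(max_threads: Optional[int],
                         cpu_count: Optional[int] = None) -> int:
    """Pick the encoder thread limit.

    None means "use the default", taken here to be the logical processor
    count. Clamping to [1, processor count] is an assumption of this sketch.
    """
    available = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    if max_threads is None:
        return available
    return max(1, min(max_threads, available))


# Try a few limits on a hypothetical 32-thread machine.
for requested in (None, 1, 8, 64):
    print(requested, "->", resolve_thread_count(requested, cpu_count=32))
```

Passing the processor count explicitly makes it easy to benchmark several limits without changing machines.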

rickbrew commented on June 12, 2024

If you make MaxThreads available through the FileType's PropertyCollection / PropertyNames, I can add a benchmark to PdnBench and test out performance that way. (edit: as in, I can benchmark it at various values of MaxThreads)

Performance was worse with only 1 thread. With all 32 threads, CPU usage reached about 50%, but the encoder used an extra 1 GB of memory.

I even took a trace with Windows Performance Recorder and opened it up in Windows Performance Analyzer. Here's what it looks like: (there are several saves that happened at various Speed settings)

[image: Windows Performance Analyzer trace of CPU usage across the saves]

The parabolic CPU usage is certainly interesting :) It also doesn't look like synchronization, or at least wasteful spinning, is an issue.

rickbrew commented on June 12, 2024

Actually, adding maxThreads to the PropertyCollection may not be a good idea. You can't remove properties from the ControlInfo that is created in OnCreateConfigUI() ... maybe I should add that. (this means it would show up in the UI, which we don't want)

0xC0000054 commented on June 12, 2024

Another option would be to split a large image into smaller chunks / tiles; this would reduce memory usage at the expense of file size.

The HEIF format that AVIF is based on supports the concept of an image grid, where a larger image can be reconstructed from a number of identically sized tiles.
The tiles do not need to be square; one of Microsoft's AVIF samples uses a 5x4 grid of 1280x720 pixel tiles (the final image dimensions are 6400x2880 pixels).
The HEIC image in your first post also uses this method; it consists of an 8x6 grid of 504x504 pixel tiles.
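The relationship between the grid and the final image is plain multiplication. A tiny Python sketch (the function name is made up for illustration) using the two grids mentioned above:

```python
from typing import Tuple


def grid_output_size(columns: int, rows: int,
                     tile_width: int, tile_height: int) -> Tuple[int, int]:
    """Size of the image reconstructed from a grid of identically sized tiles."""
    return columns * tile_width, rows * tile_height


print(grid_output_size(5, 4, 1280, 720))  # (6400, 2880)
print(grid_output_size(8, 6, 504, 504))   # (4032, 3024)
```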

The main issue that I have with implementing this is that I cannot think of an algorithm to determine the optimal tile size for an image.

rickbrew commented on June 12, 2024

Do you suspect there's an easy way to determine the optimal tile size? Or would it be really hard to determine? (for instance, 256x tiles would be good, 300x would be bad, 480x would be better... it wouldn't be monotonically increasing, in other words?)

I'm brainstorming here, but maybe the default speed mode would just do a good-faith effort at choosing a reasonable tile size. The primary goal would be to achieve reasonable performance (at least PNG-esque speed). "Fast" certainly doesn't imply "optimal".

Does tiling also reduce quality? I'm just guessing-by-analogy based on normal lossless compression principles, whereby splitting a large stream into chunks and compressing them in isolation would not be able to take advantage of cross-tile repetition.
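The lossless analogy can be demonstrated with zlib from the Python standard library. This is only an illustration of the general principle with DEFLATE; it says nothing about how AV1 behaves. The data is contrived (one random block repeated 16 times) so that all the redundancy lies across chunk boundaries, and the helper names are made up for the sketch.

```python
import random
import zlib

# One 4 KB block of pseudo-random bytes, repeated 16 times. Within a block
# there is nothing to compress; all redundancy is between blocks.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(4096))
data = block * 16


def compressed_size_whole(payload: bytes) -> int:
    """Compress the whole stream at once."""
    return len(zlib.compress(payload))


def compressed_size_chunked(payload: bytes, chunk_size: int) -> int:
    """Compress each chunk in isolation and add up the sizes."""
    return sum(
        len(zlib.compress(payload[i:i + chunk_size]))
        for i in range(0, len(payload), chunk_size)
    )


print("whole:  ", compressed_size_whole(data))
print("chunked:", compressed_size_chunked(data, 4096))
```

The chunked total comes out much larger, because each isolated chunk cannot refer back to the identical data in earlier chunks.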

0xC0000054 commented on June 12, 2024

> Do you suspect there's an easy way to determine the optimal tile size?

I am currently thinking that I will use a loop that exits when the tile width/height is <= 512 or when the tile count along that dimension reaches 10 (100 tiles in the worst case).
Images that have a width/height <= 512 will not use tiles.

The format allows up to 256 tiles in both the horizontal and vertical directions, but I decided to cap it because I think having an image with more than 100 tiles is excessive.
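One way such a loop could look is sketched below in Python (not the plugin's C#). All names are hypothetical. The sketch assumes each dimension must divide evenly so that every tile has the same size, and it falls back to a single tile when no count from 1 to 10 fits; both are assumptions of this illustration, not a description of the final implementation.

```python
from typing import Optional, Tuple

MAX_TILES_PER_AXIS = 10  # cap described above: at most 10 x 10 = 100 tiles


def split_axis(length: int, max_tile: int) -> Optional[Tuple[int, int]]:
    """Return (tile_count, tile_length) for one axis, or None if nothing fits.

    Assumption: the axis must divide evenly so all tiles are identical.
    """
    for count in range(1, MAX_TILES_PER_AXIS + 1):
        if length % count == 0 and length // count <= max_tile:
            return count, length // count
    return None


def choose_grid(width: int, height: int,
                max_tile: int) -> Tuple[int, int, int, int]:
    """Return (columns, rows, tile_width, tile_height).

    Falls back to a single tile covering the whole image when either axis
    cannot be split (an assumption of this sketch).
    """
    columns = split_axis(width, max_tile)
    rows = split_axis(height, max_tile)
    if columns is None or rows is None:
        return 1, 1, width, height
    return columns[0], rows[0], columns[1], rows[1]


print(choose_grid(4032, 3024, 512))  # (8, 6, 504, 504)
print(choose_grid(500, 400, 512))    # (1, 1, 500, 400)
```

For a 4032x3024 image and a 512 pixel limit, this picks the same 8x6 grid of 504x504 tiles as the HEIC sample mentioned earlier in the thread.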

> Does tiling also reduce quality?

Possibly; the only existing tiled AVIF image that I have uses a separate image for each tile.

> I'm brainstorming here, but maybe the default speed mode would just do a good-faith effort at choosing a reasonable tile size. The primary goal would be to achieve reasonable performance (at least PNG-esque speed). "Fast" certainly doesn't imply "optimal".

I think the medium speed is tolerable for a 512x512 image, I will have to finish my encoder changes to see if that scales when encoding multiple tiles.
256x256 is even faster, but it requires many more tiles for the typical digital camera sized image.

0xC0000054 commented on June 12, 2024

> I'm just guessing-by-analogy based on normal lossless compression principles, whereby splitting a large stream into chunks and compressing them in isolation would not be able to take advantage of cross-tile repetition.

It looks like this may be the case: the increase in file size cannot be explained by only metadata / headers (which are a fixed size).
Using the image from your first post, here are the results:

Quality: 85
YUV 4:2:2

Image dimensions: 4032x3024
Tile1: 512 pixel max tile size, 8x6 grid of 504x504 pixel tiles
Tile2: 1920 pixel max tile size, 3x2 grid of 1344x1512 pixel tiles

Fast:
  no tiles: ~2.5 GB of memory used when saving, ~2.05 MB on disk
  Tile1: ~340 MB of memory used when saving, ~2.06 MB on disk, ~17 KB increase
  Tile2: ~775 MB of memory used when saving, ~2.05 MB on disk, ~5.2 KB increase
Medium:
  no tiles: ~3 GB of memory used when saving, ~1.91 MB on disk
  Tile1: ~375 MB of memory used when saving, ~1.98 MB on disk, ~76 KB increase
  Tile2: ~950 MB of memory used when saving, ~1.95 MB on disk, ~42 KB increase
Slow:
  no tiles: ~3 GB of memory used when saving, ~1.85 MB on disk
  Tile1: ~430 MB of memory used when saving, ~1.97 MB on disk, ~115 KB increase
  Tile2: ~950 MB of memory used when saving, ~1.93 MB on disk, ~81 KB increase

I am not sure if the memory savings are worth the decrease in compression efficiency.
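To make the trade-off easier to judge, here is a small Python sketch that turns the numbers above into percentages. The helper names are made up for illustration, the inputs are the approximate values listed above, and 1 GB = 1000 MB and 1 MB = 1000 KB are assumed, so the results are rough.

```python
def memory_saving_pct(baseline_mb: float, tiled_mb: float) -> float:
    """Percentage of peak memory saved relative to the no-tiles run."""
    return (baseline_mb - tiled_mb) / baseline_mb * 100


def size_increase_pct(baseline_size_mb: float, increase_kb: float) -> float:
    """File size increase as a percentage of the no-tiles file size."""
    return increase_kb / (baseline_size_mb * 1000) * 100


# preset -> (no-tiles memory MB, no-tiles size MB,
#            {layout: (tiled memory MB, size increase KB)})
results = {
    "Fast":   (2500, 2.05, {"Tile1": (340, 17),  "Tile2": (775, 5.2)}),
    "Medium": (3000, 1.91, {"Tile1": (375, 76),  "Tile2": (950, 42)}),
    "Slow":   (3000, 1.85, {"Tile1": (430, 115), "Tile2": (950, 81)}),
}

for preset, (base_mem, base_size, layouts) in results.items():
    for layout, (tiled_mem, increase_kb) in layouts.items():
        saved = memory_saving_pct(base_mem, tiled_mem)
        grown = size_increase_pct(base_size, increase_kb)
        print(f"{preset:6} {layout}: {saved:.1f}% less memory, "
              f"{grown:.2f}% larger file")
```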

rickbrew commented on June 12, 2024

> I am not sure if the memory savings are worth the decrease in compression efficiency.
>
> Fast:
> no tiles: ~2.5 GB of memory used when saving, ~2.05 MB on disk
> Tile1: ~340 MB of memory used when saving, ~2.06 MB on disk, ~17 KB increase

An 86.4% decrease in memory usage for less than 1% increase in disk size sounds VERY worth it to me.

Maybe Fast, Medium, and Slow would use progressively larger tiles.

And you can consider adding an Extreme or Very Slow mode that doesn't use tiles. "Very Slow" has precedent -- Handbrake uses this, for instance. Fastest, Faster, Fast, Medium, Slow, Very Slow, iirc.

0xC0000054 commented on June 12, 2024

> And you can consider adding an Extreme or Very Slow mode that doesn't use tiles. "Very Slow" has precedent -- Handbrake uses this, for instance. Fastest, Faster, Fast, Medium, Slow, Very Slow, iirc.

I will add a Very Slow compression speed.

> Maybe Fast, Medium, and Slow would use progressively larger tiles.

That certainly simplifies picking a maximum tile size.
I am currently thinking that I will use the following settings:

Speed      Max Tile Size
Fast       512
Medium     1280
Slow       1920
Very Slow  N/A

If a valid tile size is not found, the image will be encoded as a single tile.
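As a rough illustration of how those settings could be represented, here is a Python sketch (the plugin itself is C#). The dictionary and function names are hypothetical, and None stands in for "N/A", meaning the preset never tiles.

```python
from typing import Optional

# Maximum tile size in pixels per compression speed; None means no tiling.
MAX_TILE_SIZE = {
    "Fast": 512,
    "Medium": 1280,
    "Slow": 1920,
    "Very Slow": None,
}


def max_tile_size_for(speed: str) -> Optional[int]:
    """Look up the tile size limit for a speed preset."""
    return MAX_TILE_SIZE[speed]


def should_tile(speed: str, width: int, height: int) -> bool:
    """True when the preset tiles and the image exceeds the limit on either axis."""
    limit = max_tile_size_for(speed)
    return limit is not None and (width > limit or height > limit)


print(should_tile("Fast", 4032, 3024))       # True
print(should_tile("Very Slow", 4032, 3024))  # False
```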
