Comments (25)

StefanKarpinski commented on May 22, 2024

Thanks for summarizing all of that here.

from gadfly.jl.

pygy commented on May 22, 2024

:-)

pao commented on May 22, 2024

Thanks; sorry for junking up your pull request!

pygy commented on May 22, 2024

No problem, I'm actually interested in this discussion.

Color perception is quirky. While I understand @dcjones's wish to use an algorithm, it may be hard to beat cherry-picked schemes.

This Programmers.StackExchange discussion is somewhat relevant, see these:

There's also this Stack Overflow discussion, which leads to this:

timholy commented on May 22, 2024

If the task is to come up with a set of colors that are as easy to distinguish as possible (and of course this is far from the only interesting task when it comes to choosing colors), I don't think any of those examples come close to the algorithmically-generated one:

http://www.mathworks.com/matlabcentral/fx_files/29702/3/distinguishable_colors.png

(The one on the left is the relevant one.)

timholy commented on May 22, 2024

@dcjones, re fixing hue and lightness. Certainly that's a very nice touch in some cases! It adds a real sense of balance and elegance to the figure.

But if the task is simply to have as many different markers or lines as possible, then permitting variations in hue and lightness can also help increase the number of choices. If you compare your scatterplot in JuliaLang/julia#2278 against that bar stripe in my png file, to my eyes the typical minimum pairwise separation among >20 bars is approximately equivalent to the typical minimum pairwise separation among your 8 colors.

pao commented on May 22, 2024

@timholy I haven't looked at the code you put up on FEX, but have you looked into (or implemented) an algorithm that also works when communicating with colorblind peers?

> Certainly that's a very nice touch in some cases! It adds a real sense of balance and elegance to the figure.

But it does sacrifice contrast, as only color contrast remains. This contributes to making the points harder to distinguish.

timholy commented on May 22, 2024

> but have you looked into (or implemented) an algorithm that also works when communicating with colorblind peers?

Nope, but if you have a suitable metric, it's trivial:

function colordiff_trichromatic(a,b)
  # Calculate the distance between two colors a and b, assuming typical trichromatic vision
end
function colordiff_rgblind(a,b)
  # Calculate the distance between two colors a and b, for red-green colorblind subjects
end

palette = distinguishable_colors(n, colordiff_rgblind)

For colordiff_trichromatic I just used the Euclidean distance in LAB colorspace. I don't know the right thing to do for the various forms of colorblindness (and don't have time to look into it now), but I suspect this isn't very hard to get at least approximately correct. This is the nice thing about the approach: all it needs to generate colors is the ability to measure the "perceptual distance" between them.
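As a rough illustration of "Euclidean distance in LAB colorspace", here is a minimal sketch using the standard sRGB to CIELAB conversion with a D65 white point. The names `rgb_to_lab` and `colordiff_lab` are hypothetical stand-ins (not the Matlab or Color.jl code), and colors are assumed to be `(r, g, b)` tuples with components in 0..1.

```julia
# sRGB (components in 0..1) -> CIELAB, D65 white point. No external packages.

# Undo the sRGB gamma curve to get linear light.
srgb_to_linear(c) = c <= 0.04045 ? c / 12.92 : ((c + 0.055) / 1.055)^2.4

# Nonlinearity used by CIELAB.
lab_f(t) = t > (6 / 29)^3 ? cbrt(t) : t / (3 * (6 / 29)^2) + 4 / 29

function rgb_to_lab(rgb)
    r, g, b = srgb_to_linear.(rgb)
    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    # Normalize by the D65 reference white before applying the nonlinearity.
    fx, fy, fz = lab_f(x / 0.95047), lab_f(y / 1.0), lab_f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))
end

# Hypothetical stand-in for colordiff_trichromatic: Euclidean distance in LAB.
function colordiff_lab(a, b)
    la, lb = rgb_to_lab(a), rgb_to_lab(b)
    return sqrt(sum((la .- lb) .^ 2))
end
```

With this sketch, black and white end up roughly 100 units apart, since L runs from 0 to 100 and both are nearly neutral in a and b.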

StefanKarpinski commented on May 22, 2024

It seems that if we have a metric for each common color perception regime (including normal color vision), then taking the min over all of them gives a new metric. Optimizing with respect to that metric would yield colors that are distinct for everyone.
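The "min over all of them" idea can be sketched as a small combinator. Everything named here is hypothetical, and the "red-green deficient" metric is a deliberately crude toy (it just averages the red and green channels), not a real vision model.

```julia
# Build a metric that reports the smallest distance any of the given viewers perceives.
worst_case_metric(metrics...) = (a, b) -> minimum(m(a, b) for m in metrics)

# Toy stand-ins (assumptions): colors are (r, g, b) tuples.
dist(u, v) = sqrt(sum((u .- v) .^ 2))

# "Typical" vision: plain Euclidean distance over all three channels.
colordiff_typical(a, b) = dist(a, b)

# Crude "red-green deficient" viewer: red and green collapse into one channel.
collapse_rg(c) = ((c[1] + c[2]) / 2, c[3])
colordiff_rg_toy(a, b) = dist(collapse_rg(a), collapse_rg(b))

colordiff_everyone = worst_case_metric(colordiff_typical, colordiff_rg_toy)
```

Under these toy metrics, pure red and pure green are far apart for the typical viewer but identical for the red-green toy viewer, so the combined metric rates them as indistinguishable and an optimizer would avoid pairing them.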

timholy commented on May 22, 2024

I was just thinking the same thing...

pao commented on May 22, 2024

Cool! That sounds like a valuable addition.

pygy commented on May 22, 2024

TL;DR: If you want to cover all types of color blindness, you'll end up with the grey scale. If you limit yourself to the most common cases (99% of the population), you should use the blue-yellow axis (a plane that intersects blue, yellow, white and black).

Short summary of the physiology of color perception.

The retina has two kinds of light-sensitive cells: rods and cones.

Rods are only present in the periphery of the retina and have a low sensory threshold. They mostly provide the grey scale night vision. Since there are none in the center of the retina (the fovea), you're unable to read or distinguish details in low light conditions. They are not involved in color perception.

The cones are mostly present in the fovea, with a lower density elsewhere. There are three kinds of cones, with different absorption spectra. The spectra of the first two kinds of cones overlap a lot.

Rhodopsin absorption spectra

The brain doesn't get the direct output of these sensors. It gets two contrasts: R - G and B - (R+G), which explains why the absorption peaks don't correspond to #f00, #0f0 and #00f in RGB space.

Trichromacy (three kinds of cones) is the norm. There are three kinds of dichromacy, with partial or total deficit in one kind of cone, and monochromacy, a.k.a. achromatopsia, where two types of cones are missing (extremely rare; there's an even rarer condition where there are only rods).

Each of these deficits leads to a perception corresponding to a projection of the RGB space onto a different 2D plane, or, in the case of a partial deficit, a distortion of the space. These projections are of course different according to the kind of deficit.

Because the ~red and ~green light receptors ( rhodopsins ) are coded by genes located on the X chromosome, color blindness of these types, which is recessive, occurs mostly in men (since women have two copies, they have a much higher chance of getting at least one working copy).

The most common case is a partial deficit in the ~green cone (prevalence: ♂6%, ♀0.4%), then come partial ~red, total ~green and total ~red deficits (each ♂1%, ♀0.01%). Total and partial ~blue deficits are rarer (respectively <1% and ~0.1%, in both men and women).

The red vs green axis is thus disturbed in ~4.7% of the population, while the blue vs yellow axis is disturbed in less than 1% of the population.
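The ~4.7% figure can be reproduced from the prevalences quoted above. This is a worked check only; it assumes equal numbers of men and women, which the comment does not state.

```julia
# Prevalence figures (percent) as quoted above.
# Order: partial ~green, partial ~red, total ~green, total ~red.
male   = [6.0, 1.0, 1.0, 1.0]
female = [0.4, 0.01, 0.01, 0.01]

red_green_male   = sum(male)      # about 9% of men
red_green_female = sum(female)    # about 0.43% of women

# Assumption: the population is half men, half women.
red_green_population = (red_green_male + red_green_female) / 2

println(round(red_green_population, digits = 1))   # prints 4.7
```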

dcjones commented on May 22, 2024

@timholy I do agree that a wider span of lightness (and maybe chroma) values needs to be available for these qualitative scales. "Harmoniousness" and "distinguishability" aren't entirely compatible, but hopefully we can find a solution that takes both into account.

@pygy Great summary! Obviously greyscale isn't a great default (though definitely an option we should provide). Taking into account difficulty in red/green discrimination in default scales would be great, though.

I'm looking at a few papers on "Daltonization", which is adjusting colors to be perceivable to those with color vision deficiencies. This one for example:

Huang, Jia-Bin, et al. "Information preserving color transformation for protanopia and deuteranopia." Signal Processing Letters, IEEE 14.10 (2007): 711-714.

I need to read more, but that makes me tempted to try generating scales using simple rules (alternating fixed lightness and hue steps maybe), then implement a daltonize function that takes a vector of colors and adjusts them, which seems like a straightforward optimization problem if the right measure of distinguishability is used.

Also potentially useful, this paper discusses reproducing colorbrewer scales programmatically.

Wijffelaars, Martijn, et al. "Generating color palettes using intuitive parameters." Computer Graphics Forum. Vol. 27. No. 3. Blackwell Publishing Ltd, 2008.

Once we figure this out, I should probably move everything to JuliaLang/Color.jl so everyone can benefit.

dcjones commented on May 22, 2024

Well this has been a fun rabbit-hole. I certainly learned a lot more about color.

In this paper, a method of projecting colors into a reduced gamut to simulate color blindness is defined.

Brettel, H., Viénot, F., & Mollon, J. D. (1997). Computerized simulation of color appearance for dichromats. JOSA A, 14(10), 2647–2655.

I implemented this in Compose. Here's how my original plot looks to someone with deuteranopia (the most common form, in which the green or medium-wavelength cone is lacking).
Colortest1

This probably overstates the effect, since most color blind people aren't totally deuteranopic. I extended Brettel's algorithm to simulate degrees of color blindness. Here's 80% loss of the green cone function.
Colortest3

Regardless, it's clear my original color picking strategy is really really bad for anyone with color blindness.

Projecting into this reduced gamut and measuring color difference is the basis of a good objective function. A pretty good measure of color difference is just Euclidean distance in LAB space. But it turns out a lot of work has gone into defining a more accurate metric (one that more closely represents how different two colors look to humans), culminating in the CIEDE2000 formula, which is absolutely bonkers, but seems to work well.

Here's an attempt at generating a random color palette that optimizes minimum pairwise CIEDE2000 color distance using a stupidly simple stochastic hill climbing approach.
Colortest3
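A "stupidly simple" stochastic hill climb over a min-pairwise-distance objective might look like the sketch below. It is not the code used for the image: plain Euclidean distance between `(r, g, b)` tuples stands in for CIEDE2000, and `hill_climb`, `min_pairwise`, the step size and the iteration count are all hypothetical choices. It needs at least two colors.

```julia
using Random

# Stand-in metric (assumption): Euclidean distance between (r, g, b) tuples.
colordiff(a, b) = sqrt(sum((a .- b) .^ 2))

# Objective: the smallest distance between any two palette entries (needs n >= 2).
function min_pairwise(palette, diff)
    n = length(palette)
    return minimum(diff(palette[i], palette[j]) for i in 1:n for j in (i + 1):n)
end

# Perturb one random color at a time; keep the change only if the
# objective does not get worse, otherwise revert it.
function hill_climb(n; iterations = 2000, step = 0.1, rng = MersenneTwister(1))
    palette = [(rand(rng), rand(rng), rand(rng)) for _ in 1:n]
    best = min_pairwise(palette, colordiff)
    for _ in 1:iterations
        i = rand(rng, 1:n)
        old = palette[i]
        # Random nudge of each channel, clamped to the valid 0..1 range.
        palette[i] = ntuple(k -> clamp(old[k] + step * (2 * rand(rng) - 1), 0.0, 1.0), 3)
        candidate = min_pairwise(palette, colordiff)
        if candidate >= best
            best = candidate
        else
            palette[i] = old
        end
    end
    return palette, best
end

palette, best = hill_climb(6)
```

The seeded generator keeps this sketch reproducible from run to run; with an unseeded generator each call would return a different palette.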

With that working I can optimize over distance after simulating color blindness, which gives me a palette like this.
Colortes43

Which to a deuteranopic person looks like:
Colortest5

Not perfect, but a huge improvement over the first simulated color blindness image! The coolest part is that we can generate many colorblind safe palettes, these are just examples. We can also tweak the objective function to get different results.

Colorbrewer, on the other hand, defines a total of three qualitative color blind safe palettes, none of which will give you more than four colors, so this is a way more powerful approach. (Take that geography department!)

By the way, if someone who is actually deuteranopic can confirm that these colors are easier to read, that would be helpful.

pao commented on May 22, 2024

Wow, that's impressive. I should be able to run this by someone on Monday. And thanks for taking this so seriously!

timholy commented on May 22, 2024

Very nice work, Daniel. This will really be a huge contribution.

Two other points:

  1. In practice, a color-choice scheme that depends on any measure of randomness might prove to be frustrating; each time you generate a figure for publication, the colors will turn out different. In my Matlab version, I generated a discrete list of candidates (on a 30x30x30 grid, I believe), and selected the n+1 color as the point with largest distance from its nearest neighbor among the first n colors (via exhaustive search). Even simpler than any kind of hill-climbing, and not a performance drag I've ever noticed by comparison to pushing pixels to the screen.
  2. Are you including the background color in your optimization? The points might have to be distinguishable from it as well as from each other. You could optionally "seed" the optimization with a list of fixed colors that you want to avoid, e.g., if the user will draw on a gray background and will annotate a few points with black arrows.
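The two points above (deterministic greedy selection from a candidate grid, seeded with colors to avoid) can be sketched together as follows. This is not the Matlab code: `pick_distinguishable` and `candidate_grid` are hypothetical names, the grid is taken in RGB, and plain Euclidean distance stands in for a perceptual metric. The seed list must contain at least one color.

```julia
# Stand-in metric (assumption): Euclidean distance between (r, g, b) tuples.
colordiff(a, b) = sqrt(sum((a .- b) .^ 2))

# Candidate colors on a regular grid in RGB (30 steps per channel by default).
function candidate_grid(steps)
    levels = range(0.0, stop = 1.0, length = steps)
    return [(r, g, b) for r in levels for g in levels for b in levels]
end

# Greedy selection: each new color is the candidate whose nearest already-chosen
# (or seed) color is farthest away. `seed` holds colors to stay away from,
# such as the background.
function pick_distinguishable(n, seed; steps = 30, diff = colordiff)
    candidates = candidate_grid(steps)
    # Distance from every candidate to its nearest seed color.
    nearest = [minimum(diff(c, s) for s in seed) for c in candidates]
    chosen = eltype(candidates)[]
    for _ in 1:n
        k = argmax(nearest)
        push!(chosen, candidates[k])
        # Fold the newly chosen color into the nearest-neighbor distances.
        for (i, c) in enumerate(candidates)
            nearest[i] = min(nearest[i], diff(c, candidates[k]))
        end
    end
    return chosen
end

palette = pick_distinguishable(5, [(1.0, 1.0, 1.0)])   # stay away from a white background
```

Because nothing here is random, the same arguments always give the same palette. Passing a different `diff` function is where a perceptual or colorblind-aware metric would plug in.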

StefanKarpinski commented on May 22, 2024

That is very impressive. My main takeaway is that this is a completely incomprehensible way to present this data no matter how good your color scheme is :-)

pygy commented on May 22, 2024

Nudging the overlapping dots would make it more readable (ideally in a predictable order).

StefanKarpinski commented on May 22, 2024

Here's how we've been presenting the benchmarks in presentations (created with Numbers):

perf

It seems pretty readable as a bar graph and grouping the numbers by language seems reasonable since it ends up showing each language as a little performance profile shape. I would love to use a pretty Gadfly-generated SVG similar to this instead!

dcjones commented on May 22, 2024

@timholy Yeah, different colors each time you draw a plot is no good. Fixing the seed is an option, but your deterministic approach seems more appealing. I will experiment with this later. I'm also not taking into account the background color, so sometimes colors are not sufficiently distinguishable from the grey of the background.

@StefanKarpinski Added the missing bits to Gadfly. Here's a first pass.
BenchmarkBar

This will look better once I rearrange the order of the benchmarks and languages, which is not something I can do right now. Also, "distinguishability" and "garishness" seem to go hand in hand for color palettes. That obviously needs some work.

pao commented on May 22, 2024

> With that working I can optimize over distance after simulating color blindness, which gives me a palette like this.

n=1, but my deuteranope colleague found the optimized palette easier to read, so it looks like that is a promising approach.

dcjones commented on May 22, 2024

Cool. Actually the Brettel paper evaluated their transformation with a total of two subjects, so we aren't doing much worse. :)

dcjones commented on May 22, 2024

I'm using @timholy's brute-force algorithm now, which is much better than my hill climbing thing. Restricting to a relatively small set of possible hue/chroma/lightness values also gives a little control over how aesthetically offensive the palette is. So I've arrived at something that I think is a reasonably good trade off between accessibility and aesthetics.

As for the tangential issue (or was color palettes the tangent?) of making a plot of Julia benchmark results: does this seem okay, @StefanKarpinski?

Benchmarks3

# Note: this code depends on very recent commits to DataFrames and Gadfly.
using Gadfly, DataFrames

benchmarks = DataFrame(readcsv("benchmarks.csv"),
                       ["Language", "Benchmark", "Time"])

# Normalize to C
benchmarks = merge(benchmarks, subset(benchmarks, :(Language .== "c")),
                   "Benchmark", "outer")
within!(benchmarks, :(Time ./= Time_1))

# Reorder factors
benchmarks["Language"] = PooledDataArray(benchmarks["Language"])
benchmarks["Language"] = reorder!(benchmarks["Language"], benchmarks["Time"])
benchmarks["Benchmark"] = PooledDataArray(benchmarks["Benchmark"])
benchmarks["Benchmark"] = reorder!(benchmarks["Benchmark"], benchmarks["Time"])

# Capitalize
capitalize(str) = string(uppercase(str[1]) , str[2:end])
benchmarks["Language"] = levels!(benchmarks["Language"],
                                 Dict(benchmarks["Language"],
                                      [capitalize(l)
                                       for l in benchmarks["Language"]]))

benchmarks = subset(benchmarks, :(Language .!= "C"))

p = plot(benchmarks,
         {:x => "Language", :y => "Time", :color => "Benchmark"},
         Geom.bar, Stat.identity, Scale.y_log10,
         Guide.YLabel("Time (Relative to C++)"))

draw(SVG("benchmarks.svg", 900px, 400px), p)

timholy commented on May 22, 2024

That is just lovely. Really nice work, Daniel.

dcjones commented on May 22, 2024

All of this lives in JuliaLang/Color.jl now. Closing.
