Comments (25)
Thanks for summarizing all of that here.
from gadfly.jl.
:-)
from gadfly.jl.
Thanks; sorry for junking up your pull request!
from gadfly.jl.
No problem, I'm actually interested in this discussion.
Color perception is quirky. While I understand @dcjones's wish to use an algorithm, it may be hard to beat cherry-picked schemes.
This Programmers.StackExchange discussion is somewhat relevant, see these:
- http://www.csc.ncsu.edu/faculty/healey/download/cstr.96.pdf
- http://www.poste.it/azienda/sapi/Interactive%20Genetic%20Algorithm%20for%20colors.pdf
There's also this Stack Overflow discussion, which leads to this:
- http://afriggeri.github.com/RYB/ rationale: http://threekings.tk/mirror/ryb_TR.pdf
- http://tools.medialab.sciences-po.fr/iwanthue/index.php
from gadfly.jl.
If the task is to come up with a set of colors that are as easy to distinguish as possible (and of course this is far from the only interesting task when it comes to choosing colors), I don't think any of those examples come close to the algorithmically-generated one:
http://www.mathworks.com/matlabcentral/fx_files/29702/3/distinguishable_colors.png
(The one on the left is the relevant one.)
from gadfly.jl.
@dcjones, re fixing hue and lightness. Certainly that's a very nice touch in some cases! It adds a real sense of balance and elegance to the figure.
But if the task is simply to have as many different markers or lines as possible, then permitting variations in hue and lightness can also help increase the number of choices. If you compare your scatterplot in JuliaLang/julia#2278 against that bar stripe in my png file, to my eyes the typical minimum pairwise separation among >20 bars is approximately equivalent to the typical minimum pairwise separation among your 8 colors.
from gadfly.jl.
@timholy I haven't looked at the code you put up on FEX, but have you looked into (or implemented) an algorithm that also works when communicating with colorblind peers?
> Certainly that's a very nice touch in some cases! It adds a real sense of balance and elegance to the figure.
But it does sacrifice contrast, as only color contrast remains. This contributes to making the points harder to distinguish.
from gadfly.jl.
> but have you looked into (or implemented) an algorithm that also works when communicating with colorblind peers?
Nope, but if you have a suitable metric, it's trivial:
function colordiff_trichromatic(a,b)
    # Calculate the distance between two colors a and b, assuming typical trichromatic vision
end
function colordiff_rgblind(a,b)
    # Calculate the distance between two colors a and b, for red-green colorblind subjects
end
palette = distinguishable_colors(n, colordiff_rgblind)
For colordiff_trichromatic I just used the Euclidean distance in LAB colorspace. I don't know the right thing to do for the various forms of colorblindness (and don't have time to look into it now), but I suspect this isn't very hard to get at least approximately correct. This is the nice thing about the approach: all it needs to generate colors is the ability to measure the "perceptual distance" between them.
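For anyone wanting to try this, here is a rough sketch of what "Euclidean distance in LAB" could look like in plain Julia. It assumes colors are sRGB tuples with components in [0, 1] and a D65 white point; `rgb_to_lab`, `srgb_to_linear`, `lab_f` and `WHITE_D65` are names made up for this example, not the FEX or Color.jl code.

```julia
# Assumption: colors are (r, g, b) sRGB tuples in [0, 1]; white point is D65.
const WHITE_D65 = (0.95047, 1.0, 1.08883)

# Undo the sRGB gamma curve to get linear light.
srgb_to_linear(u) = u <= 0.04045 ? u / 12.92 : ((u + 0.055) / 1.055)^2.4

# Nonlinearity from the CIE L*a*b* definition.
lab_f(t) = t > (6 / 29)^3 ? cbrt(t) : t / (3 * (6 / 29)^2) + 4 / 29

# Convert an sRGB tuple to an (L, a, b) tuple via CIE XYZ.
function rgb_to_lab(rgb)
    r, g, b = srgb_to_linear.(rgb)
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    fx, fy, fz = lab_f.((x, y, z) ./ WHITE_D65)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))
end

# Euclidean distance between two colors in LAB space.
colordiff_trichromatic(a, b) = sqrt(sum(abs2, rgb_to_lab(a) .- rgb_to_lab(b)))
```

Any other function with the same `(a, b) -> distance` shape could be swapped in without touching the palette generator.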
from gadfly.jl.
Seems like if we have a metric for each common color perception regime (including normal color vision), then taking the min over all of them would give us a new metric, and optimizing with respect to it would give colors that are distinct for everyone.
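A minimal sketch of the "min over all of them" idea, assuming each metric is a two-argument function. The two metrics below are toy stand-ins invented for illustration (plain RGB distance, and a crude "red and green collapse to their average" observer), not real perceptual models; `combine_metrics` and `collapse_rg` are hypothetical names.

```julia
# Build a new distance function that returns the smallest value
# reported by any of the given per-observer distance functions.
combine_metrics(metrics...) = (a, b) -> minimum(m(a, b) for m in metrics)

# Toy stand-ins, for illustration only. Colors are (r, g, b) tuples.
colordiff_trichromatic(a, b) = sqrt(sum(abs2, a .- b))
collapse_rg(c) = ((c[1] + c[2]) / 2, (c[1] + c[2]) / 2, c[3])
colordiff_rgblind(a, b) = colordiff_trichromatic(collapse_rg(a), collapse_rg(b))

colordiff_everyone = combine_metrics(colordiff_trichromatic, colordiff_rgblind)

# Pure red and pure green are far apart for the first observer but
# identical for the toy red-green observer, so the combined distance is 0.
colordiff_everyone((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

A pair of colors only scores well under the combined function if every observer can tell them apart, which is exactly the property wanted here.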
from gadfly.jl.
I was just thinking the same thing...
from gadfly.jl.
Cool! That sounds like a valuable addition.
from gadfly.jl.
TL;DR: If you want to cover all types of color blindness, you'll end up with greyscale. If you limit yourself to the most common cases (99% of the population), you should use the blue-yellow axis (a plane that intersects blue, yellow, white and black).
Short summary of the physiology of color perception.
The retina has two kinds of light-sensitive cells: rods and cones.
Rods are only present in the periphery of the retina and have a low sensory threshold. They mostly provide the grey scale night vision. Since there are none in the center of the retina (the fovea), you're unable to read or distinguish details in low light conditions. They are not involved in color perception.
The cones are mostly present in the fovea, with a lower density elsewhere. There are three kinds of cones, with different absorption spectra. The spectra of the first two kinds of cones overlap a lot.
The brain doesn't get the direct output of these sensors. It gets two contrasts: R - G and B - (R+G), which explains why the absorption peaks don't correspond to #f00, #0f0 and #00f in RGB space.
Trichromacy (three kinds of cones) is the norm. There are three kinds of dichromacy, each with a partial or total deficit in one kind of cone, and monochromacy, a.k.a. achromatopsia, where two types of cones are missing (extremely rare; there's an even rarer condition where there are only rods).
Each of these deficits leads to a perception corresponding to a projection of the RGB space onto a different 2D plane or, in the case of a partial deficit, a distortion of the space. These projections are of course different according to the kind of deficit.
Because the ~red and ~green light receptors (opsins) are coded by genes located on the X chromosome, color blindness of these types, which is recessive, occurs mostly in men (since women have two copies, they have a much higher chance of getting at least one working copy).
The most common case is a partial deficit in the ~green cone (prevalence: ♂6%, ♀0.4%), then come partial ~red, total ~green and total ~red deficits (each ♂1%, ♀0.01%). Total and partial ~blue deficits are rarer (respectively <1% and ~0.1%, in both men and women).
The red vs green axis is thus disturbed in ~4.7% of the population, while the blue vs yellow axis is disturbed in less than 1% of the population.
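As a quick sanity check on the ~4.7% figure, here is the arithmetic spelled out using only the prevalences quoted above. The one assumption added is that men and women each make up half of the population.

```julia
# Red-green deficit prevalences in percent, as quoted above:
# partial ~green, partial ~red, total ~green, total ~red.
men   = 6 + 1 + 1 + 1
women = 0.4 + 0.01 + 0.01 + 0.01

# Assumption: equal numbers of men and women.
overall = (men + women) / 2
```

That works out to roughly 4.7% overall, matching the number above.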
from gadfly.jl.
@timholy I do agree that a wider span of lightness (and maybe chroma) values needs to be available for these qualitative scales. "Harmoniousness" and "distinguishability" aren't entirely compatible, but hopefully we can find a solution that takes both into account.
@pygy Great summary! Obviously greyscale isn't a great default (though definitely an option we should provide). Taking into account difficulty in red/green discrimination in default scales would be great, though.
I'm looking at a few papers on "Daltonization", which is adjusting colors to be perceivable to those with color vision deficiencies. This one for example:
Huang, Jia-Bin, et al. "Information preserving color transformation for protanopia and deuteranopia." Signal Processing Letters, IEEE 14.10 (2007): 711-714.
I need to read more, but that makes me tempted to try generating scales using simple rules (alternating fixed lightness and hue steps maybe), then implement a daltonize function that takes a vector of colors and adjusts them, which seems like a straightforward optimization problem if the right measure of distinguishability is used.
Also potentially useful, this paper discusses reproducing colorbrewer scales programmatically.
Wijffelaars, Martijn, et al. "Generating color palettes using intuitive parameters." Computer Graphics Forum. Vol. 27. No. 3. Blackwell Publishing Ltd, 2008.
Once we figure this out, I should probably move everything to JuliaLang/Color.jl so everyone can benefit.
from gadfly.jl.
Well this has been a fun rabbit-hole. I certainly learned a lot more about color.
In this paper, a method of projecting colors into a reduced gamut to simulate color blindness is defined.
Brettel, H., Viénot, F., & Mollon, J. D. (1997). Computerized simulation of color appearance for dichromats. JOSA A, 14(10), 2647–2655.
I implemented this in compose. Here's what my original plot looks like to someone with deuteranopia (the most common form, in which the green, or medium-wavelength, cone is lacking).
This probably overstates the effect, since most color blind people aren't totally deuteranopic. I extended Brettel's algorithm to simulate degrees of color blindness. Here's 80% loss of the green cone function.
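One simple way to get "degrees" of color blindness is to blend each color with its fully simulated version. This is only a sketch of that idea and not necessarily the extension described above; `simulate_partial` is a hypothetical name, and `toy_deuteranopia` is a made-up placeholder standing in for a real Brettel-style projection.

```julia
# Blend a color with its fully dichromatic simulation.
# severity = 0 returns the original color, severity = 1 the full simulation.
function simulate_partial(simulate, color, severity)
    0 <= severity <= 1 || throw(ArgumentError("severity must be in [0, 1]"))
    return (1 - severity) .* color .+ severity .* simulate(color)
end

# Placeholder only: a real implementation would project in LMS space.
toy_deuteranopia(c) = ((c[1] + c[2]) / 2, (c[1] + c[2]) / 2, c[3])

# 80% loss applied to pure red.
simulate_partial(toy_deuteranopia, (1.0, 0.0, 0.0), 0.8)
```

Whether the blend should happen in RGB, linear light, or cone space is a design choice this sketch doesn't settle.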
Regardless, it's clear my original color picking strategy is really really bad for anyone with color blindness.
Projecting into this reduced gamut and measuring color difference is the basis of a good objective function. A pretty good measure of color difference is just Euclidean distance in LAB space. But it turns out a lot of work has gone into defining a more accurate metric (one that more closely represents how different two colors look to humans), culminating in the CIEDE2000 formula, which is absolutely bonkers, but seems to work well.
Here's an attempt at generating a random color palette that optimizes minimum pairwise CIEDE2000 color distance using a stupidly simple stochastic hill climbing approach.
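For readers who want the gist of that approach, here is a rough sketch of stochastic hill climbing on the minimum pairwise distance. It is not the code used for the palette above: it swaps in plain Euclidean distance on (r, g, b) tuples instead of CIEDE2000, and `hill_climb`, `min_pairwise`, the step size and the fixed seed are all choices made up for the example.

```julia
using Random

dist(a, b) = sqrt(sum(abs2, a .- b))   # stand-in for CIEDE2000

# Smallest distance between any two colors in the palette.
function min_pairwise(palette)
    n = length(palette)
    return minimum(dist(palette[i], palette[j]) for i in 1:n for j in i+1:n)
end

function hill_climb(n; iterations = 2000, step = 0.1, rng = MersenneTwister(1))
    # Start from n random colors in the unit cube.
    palette = [Tuple(rand(rng, 3)) for _ in 1:n]
    best = min_pairwise(palette)
    for _ in 1:iterations
        candidate = copy(palette)
        i = rand(rng, 1:n)
        # Perturb one color and clamp it back into the unit cube.
        jitter = step .* (2 .* Tuple(rand(rng, 3)) .- 1)
        candidate[i] = clamp.(palette[i] .+ jitter, 0.0, 1.0)
        score = min_pairwise(candidate)
        if score > best   # keep only strict improvements
            palette, best = candidate, score
        end
    end
    return palette, best
end

palette, best = hill_climb(5)
```

The seeded generator makes repeated runs return the same palette, which sidesteps the "different colors every time" problem at the cost of an arbitrary seed.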
With that working I can optimize over distance after simulating color blindness, which gives me a palette like this.
Which to a deuteranopic person looks like:
Not perfect, but a huge improvement over the first simulated color blindness image! The coolest part is that we can generate many colorblind-safe palettes; these are just examples. We can also tweak the objective function to get different results.
Colorbrewer, on the other hand, defines a total of three qualitative colorblind-safe palettes, none of which will give you more than four colors, so this is a way more powerful approach. (Take that, geography department!)
By the way, if someone who is actually deuteranopic can confirm that these colors are easier to read, that would be helpful.
from gadfly.jl.
Wow, that's impressive. I should be able to run this by someone on Monday. And thanks for taking this so seriously!
from gadfly.jl.
Very nice work, Daniel. This will really be a huge contribution.
Two other points:
- In practice, a color-choice scheme that depends on any measure of randomness might prove to be frustrating: each time you generate a figure for publication, the colors will turn out different. In my Matlab version, I generated a discrete list of candidates (on a 30x30x30 grid, I believe), and selected the n+1 color as the point with the largest distance from its nearest neighbor among the first n colors (via exhaustive search). Even simpler than any kind of hill-climbing, and not a performance drag I've ever noticed by comparison to pushing pixels to the screen.
- Are you including the background color in your optimization? The points might have to be distinguishable from it as well as from each other. You could optionally "seed" the optimization with a list of fixed colors that you want to avoid, e.g., if the user will draw on a gray background and will annotate a few points with black arrows.
from gadfly.jl.
That is very impressive. My main takeaway is that this is a completely incomprehensible way to present this data no matter how good your color scheme is :-)
from gadfly.jl.
Nudging the overlapping dots would make it more readable (ideally in a predictable order).
from gadfly.jl.
Here's how we've been presenting the benchmarks in presentations (created with Numbers):
It seems pretty readable as a bar graph and grouping the numbers by language seems reasonable since it ends up showing each language as a little performance profile shape. I would love to use a pretty Gadfly-generated SVG similar to this instead!
from gadfly.jl.
@timholy Yeah, different colors each time you draw a plot is no good. Fixing the seed is an option, but your deterministic approach seems more appealing. I will experiment with this later. I'm also not taking into account the background color, so sometimes colors are not sufficiently distinguishable from the grey of the background.
@StefanKarpinski Added the missing bits to Gadfly. Here's a first pass.
This will look better once I rearrange the order of the benchmarks and languages, which is not something I can do right now. Also, "distinguishability" and "garishness" seem to go hand in hand for color palettes. That obviously needs some work.
from gadfly.jl.
> With that working I can optimize over distance after simulating color blindness, which gives me a palette like this.
n=1, but my deuteranope colleague found the optimized palette easier to read, so it looks like that is a promising approach.
from gadfly.jl.
Cool. Actually the Brettel paper evaluated their transformation with a total of two subjects, so we aren't doing much worse. :)
from gadfly.jl.
I'm using @timholy's brute-force algorithm now, which is much better than my hill climbing thing. Restricting to a relatively small set of possible hue/chroma/lightness values also gives a little control over how aesthetically offensive the palette is. So I've arrived at something that I think is a reasonably good trade-off between accessibility and aesthetics.
And for the tangential issue (or were color palettes the tangent?) of making a plot of Julia benchmark results: does this seem okay, @StefanKarpinski?
# Note: this code depends on very recent commits to DataFrames and Gadfly.
using Gadfly, DataFrames

benchmarks = DataFrame(readcsv("benchmarks.csv"),
                       ["Language", "Benchmark", "Time"])

# Normalize to C
benchmarks = merge(benchmarks, subset(benchmarks, :(Language .== "c")),
                   "Benchmark", "outer")
within!(benchmarks, :(Time ./= Time_1))

# Reorder factors
benchmarks["Language"] = PooledDataArray(benchmarks["Language"])
benchmarks["Language"] = reorder!(benchmarks["Language"], benchmarks["Time"])
benchmarks["Benchmark"] = PooledDataArray(benchmarks["Benchmark"])
benchmarks["Benchmark"] = reorder!(benchmarks["Benchmark"], benchmarks["Time"])

# Capitalize
capitalize(str) = string(uppercase(str[1]), str[2:end])
benchmarks["Language"] = levels!(benchmarks["Language"],
                                 Dict(benchmarks["Language"],
                                      [capitalize(l)
                                       for l in benchmarks["Language"]]))

benchmarks = subset(benchmarks, :(Language .!= "C"))

p = plot(benchmarks,
         {:x => "Language", :y => "Time", :color => "Benchmark"},
         Geom.bar, Stat.identity, Scale.y_log10,
         Guide.YLabel("Time (Relative to C++)"))
draw(SVG("benchmarks.svg", 900px, 400px), p)
from gadfly.jl.
That is just lovely. Really nice work, Daniel.
from gadfly.jl.
All of this lives in JuliaLang/Color.jl now. Closing.
from gadfly.jl.