juliastats / distance.jl
Julia module for Distance evaluation
License: MIT License
(I'd do this myself but I don't have JuliaStats permissions)
This issue is being filed by a script, but if you reply, I will see it.
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.2) and the nightly build of the unstable version (0.3).
The results of this script are used to generate a package listing enhanced with testing results.
The status of this package, Distance, on...
'No tests, but package loads.' can be due to there being no tests (you should write some if you can!) but can also be due to PackageEvaluator not being able to find your tests. Consider adding a test/runtests.jl file.
'Package doesn't load.' is the worst-case scenario. Sometimes this arises because your package doesn't have BinDeps support, or needs something that can't be installed with BinDeps. If this is the case for your package, please file an issue and an exception can be made so your package will not be tested.
This automatically filed issue is a one-off message. Starting soon, issues will only be filed when the testing status of your package changes in a negative direction (gets worse). If you'd like to opt-out of these status-change messages, reply to this message.
It's become a bit of a convention to use the plural for package names, freeing up the singular Distance for a type name. In this case, it's not really needed since the mathematical term for a distance is "metric", but it might be good to follow the convention when the ecosystem starts switching over to Julia 0.4.
It would be nice to add support for Jaccard distance and the Rogers-Tanimoto dissimilarity distance.
For some distance measures, pairwise distances between columns containing NaNs are being returned as zeros. For example:
x=randn(2,2)
x[1,1]=NaN
ans1=pairwise(Euclidean(), x)
ans2=sqrt(pairwise(SqEuclidean(), x))
You'll notice ans2 is correct, but ans1 is not...
I'm using a recent version of Julia, where At_mul_B is deprecated.
mahalanobis(x, y, Q)
x and y are vectors
Q must be covariance ...
julia> Q=cov(x,y)
28.409813261664944
julia> mahalanobis(x, y, Q)
ERROR: no method mahalanobis(Array{Float64,1}, Array{Float64,1}, Float64)
julia>
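The error above is consistent with cov(x, y) of two vectors returning a single Float64, while the call apparently needs a matrix for Q. Here is a minimal sketch of the quadratic form sqrt((x - y)' * Q * (x - y)), using only the LinearAlgebra standard library. The name mahalanobis_sketch is hypothetical and this is not the package's implementation. In the textbook definition Q is the inverse of a d×d covariance matrix, so the sketch assumes a matrix argument.

```julia
using LinearAlgebra

# Hypothetical helper, not the package's code.
# Q is assumed to be a symmetric positive semi-definite d×d matrix.
function mahalanobis_sketch(x::AbstractVector, y::AbstractVector, Q::AbstractMatrix)
    z = x - y
    return sqrt(dot(z, Q * z))
end

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
Q = Matrix{Float64}(I, 3, 3)   # identity: reduces to the Euclidean distance

d = mahalanobis_sketch(x, y, Q)
```

With the identity matrix the result equals the ordinary Euclidean distance between x and y, which makes for an easy sanity check.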
Hi all, I want to use the JSDistance function to calculate the divergence of 10 probability distributions, but I found that the default JSDistance function can only compare two distributions.
Could I add a JSmetric function for this package?
The square root of the Jensen–Shannon divergence is a metric. In bioinformatics, we often use it to test isoform abundance changes among multiple experiments in RNA-Seq data.
Here is the Jensen-Shannon divergence of m discrete probability distributions (p^1, ..., p^m):
JSmetric(p^1, \ldots, p^m) = H\left(\frac{p^1 + \cdots + p^m}{m}\right) - \frac{\sum_{j=1}^{m} H(p^j)}{m}
where H() is the entropy.
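The formula above can be sketched in a few lines of plain Julia. All function names here are hypothetical (they are not part of the package), the entropy uses the natural logarithm, and each input is assumed to be a probability vector of the same length.

```julia
# Shannon entropy with natural log; zero-probability entries contribute nothing.
function entropy_sketch(p)
    h = 0.0
    for q in p
        q > 0 && (h -= q * log(q))
    end
    return h
end

# Jensen-Shannon divergence of m distributions with equal weights:
# entropy of the average distribution minus the average of the entropies.
function js_divergence_sketch(ps::AbstractVector{<:AbstractVector})
    m = length(ps)
    mixture = sum(ps) / m
    return entropy_sketch(mixture) - sum(entropy_sketch, ps) / m
end

# Square root of the divergence; max() guards against tiny negative rounding errors.
js_metric_sketch(ps) = sqrt(max(js_divergence_sketch(ps), 0.0))

ps = [[1.0, 0.0], [0.0, 1.0]]
d = js_divergence_sketch(ps)
```

For two distributions with disjoint support the divergence is log(2), and for identical distributions it is zero.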
Currently the result type is always Float64 other than for the hamming metric and weighted (semi)metrics. It would make sense in many cases for the result type to be of the same type as the inputs, for example evaluate(dist::Euclidean,a::AbstractArray{Float32,1},b::AbstractArray{Float32,1}) could return a Float32. However, when a and b are vectors of Ints it would make sense to return a Float64.
What is the planned/intended behavior?
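One possible rule matching the behavior described above is to promote the two element types and then take the corresponding floating-point type. This is only a sketch of that idea: result_type_sketch and euclidean_sketch are hypothetical names, not the package's API.

```julia
# Hypothetical helper: pick a floating-point result type from two element types.
# Float32 inputs stay Float32; integer inputs become Float64.
result_type_sketch(::Type{Ta}, ::Type{Tb}) where {Ta<:Real, Tb<:Real} =
    float(promote_type(Ta, Tb))

function euclidean_sketch(a::AbstractVector{Ta}, b::AbstractVector{Tb}) where {Ta<:Real, Tb<:Real}
    length(a) == length(b) || throw(DimensionMismatch("a and b must have the same length"))
    T = result_type_sketch(Ta, Tb)
    s = zero(T)
    for i in eachindex(a, b)
        s += abs2(T(a[i]) - T(b[i]))   # accumulate in the result type
    end
    return sqrt(s)
end

d32 = euclidean_sketch(Float32[0, 0], Float32[3, 4])
dint = euclidean_sketch([0, 0], [3, 4])
```

Here d32 is a Float32 and dint is a Float64, both with value 5.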
It seems that to_fptype (used in results_type) is gone. I don't know how to solve this; just removing the to_fptype() call seems to work for me.
I'm a beginner and I would love a couple of examples, maybe even per function. They don't need to be crazy extensive, but just so I could copy-paste and have it work right away. Here's what I'd love to see (and maybe you'll see why I need them), just as an example:
using Distance
n = 5
m = 8
x = rand(n,m)
y = rand(n,m)
r = colwise(Euclidean,x,y)
Disclaimer: this doesn't work for me, I get an
ERROR: no method colwise(Type{Euclidean}, Array{Float64,2}, Array{Float64,2})
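The error message shows that the type Euclidean was passed rather than an instance; other snippets on this page call these functions with an instance such as Euclidean(), so colwise(Euclidean(), x, y) is the likely fix. For readers who just want to see what a column-wise Euclidean distance computes, here is a package-free sketch. The name colwise_euclidean_sketch is hypothetical and this is not how the package implements it.

```julia
# Hypothetical stand-in: Euclidean distance between matching columns of x and y.
function colwise_euclidean_sketch(x::AbstractMatrix, y::AbstractMatrix)
    size(x) == size(y) || throw(DimensionMismatch("x and y must have the same size"))
    return [sqrt(sum(abs2, view(x, :, j) .- view(y, :, j))) for j in axes(x, 2)]
end

n, m = 5, 8
x = rand(n, m)
y = rand(n, m)
r = colwise_euclidean_sketch(x, y)   # vector of m distances, one per column
```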
PackageEvaluator.jl is a script that runs nightly.
It attempts to load all Julia packages and run their tests (if available) on both the stable
version of Julia (0.3) and the nightly build of the unstable version (0.4).
The results of this script are used to generate a package listing
enhanced with testing results. This service also benefits package developers by notifying them if
their package breaks for some reason (caused by e.g. changes in Julia, changes in dependencies,
or broken binary dependencies.)
Currently PackageEvaluator attempts to find your test scripts using a heuristic, preferring the standardized test/runtests.jl whenever present. Using test/runtests.jl allows people to test your package using simply Pkg.test("Distance"), with any testing-only dependencies being installed by looking at test/REQUIRE.
Your package doesn't appear to have a test/runtests.jl file. PackageEvaluator is going to move away from auto-detecting tests and will instead only test packages with a test/runtests.jl file. This change will take place in about a month.
You can:
If you'd like help or more information, please just reply to this issue.
To ensure that the functions work with them.
Happy New Year ;)
Julia Version 0.3.3 (2014-11-23 20:19 UTC), official http://julialang.org/ release, x86_64-w64-mingw32
julia> using Distance
WARNING: The Distance package is deprecated. Please use a new package Distances instead.
Please remove "Distance" from the list of packages.
Paul
Is it possible to add support for the span semi-norm, which is defined as
spanNorm(x,y) = max(x-y) - min(x-y)
This is used frequently in value iteration for solving dynamic programming equations.
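A minimal sketch of the definition above in current Julia, where the element-wise maximum and minimum of a vector are spelled maximum and minimum (here obtained together via extrema). The function name span_seminorm_sketch is hypothetical.

```julia
# Span semi-norm of x - y: largest difference minus smallest difference.
function span_seminorm_sketch(x::AbstractVector, y::AbstractVector)
    length(x) == length(y) || throw(DimensionMismatch("x and y must have the same length"))
    lo, hi = extrema(x .- y)
    return hi - lo
end

s = span_seminorm_sketch([1.0, 5.0, 2.0], [0.0, 0.0, 0.0])
shifted = span_seminorm_sketch([1.0, 2.0], [4.0, 5.0])
```

The second call returns zero even though the vectors differ, because they differ only by a constant shift; that is what makes this a semi-norm rather than a norm.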
In Julia Studio 0.4.4 on Mac OS X 10.9, I am unable to load the Distance package. When I run:
julia> using Distance
The Julia Studio terminal does not respond; it never returns to the prompt. I have tried reinstalling the package as well as removing the ~/.julia folder, but the problem persists.
julia> Pkg.status()
Required packages:
- DataFrames 0.4.3
- Distance 0.2.6
- IJulia 0.1.11 55b60c47 (dirty)
- RDatasets 0.1.1
Additional packages:
- BinDeps 0.2.12
- Blocks 0.0.4
- DataArrays 0.0.3
- GZip 0.2.12
- Homebrew 0.0.6
- JSON 0.3.5
- Nettle 0.1.3
- NumericExtensions 0.3.6
- REPLCompletions 0.0.1
- SortingAlgorithms 0.0.1
- StatsBase 0.3.8
- URIParser 0.0.2
- ZMQ 0.1.11
[enhancement] It would be freaking awesome if you could add a bwdist function. A function that will take an instance of a distance type and an n-dimensional array of Bool. It will then return an array with the same size as the input array where each cell has the distance to the closest true containing cell.
In addition: it might be relevant in this respect to return memory-cheap types dependent on the distance metric. Since the returned distances in bwdist will be non-negative whole numbers for many of the distance metrics (e.g. SqEuclidean, Cityblock, Chebyshev, Hamming, etc), it might be cool to work in Uint8, 16, 32, 64, or 128 (dependent on the largest possible distance) just to save space without any loss of accuracy. I guess the easiest is to just add "cheap" versions of those types (e.g. CheapSqEuclidean) to the set of existing distance types.
Hope you find this easy and fun!
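To make the request concrete, here is a brute-force sketch of the described behavior: for every cell, take the smallest distance to any cell containing true. It is quadratic in the number of cells, so it only illustrates the semantics, not an efficient distance transform. The names bwdist_sketch, cityblock and sqeuclidean are hypothetical, and the distance is passed as a plain function of two index tuples rather than as a package distance type.

```julia
# Brute-force sketch: distance from each cell to the nearest `true` cell.
# `dist` is any function taking two index tuples.
function bwdist_sketch(dist, mask::AbstractArray{Bool})
    targets = [Tuple(I) for I in CartesianIndices(mask) if mask[I]]
    isempty(targets) && throw(ArgumentError("mask contains no true cell"))
    # The comprehension over CartesianIndices keeps the shape of `mask`.
    return [minimum(dist(Tuple(I), t) for t in targets) for I in CartesianIndices(mask)]
end

# Integer-valued distances on index tuples.
cityblock(a, b) = sum(abs.(a .- b))
sqeuclidean(a, b) = sum(abs2.(a .- b))

mask = Bool[0 0 1; 0 0 0]
d_city = bwdist_sketch(cityblock, mask)
d_sq = bwdist_sketch(sqeuclidean, mask)
```

Both results are integer arrays of the same size as mask, which is where the idea of a compact unsigned element type would apply.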
$ /usr/local/src/julia/julia/julia -e 'versioninfo()'
Julia Version 0.3.0-prerelease+1692
Commit 736251d* (2014-02-23 06:21 UTC)
Platform Info:
System: Linux (i686-redhat-linux)
CPU: Genuine Intel(R) CPU T2250 @ 1.73GHz
WORD_SIZE: 32
BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY)
LAPACK: libopenblas
LIBM: libopenlibm
$ /usr/local/src/julia/julia/julia runtests.jl
test/test_dists.jl ...
Warning: New definition
sum(DenseArray{T<:Number,N},Union(Int32,(Int32...,),Array{Int32,1})) at /home/rick/.julia/NumericExtensions/src/reducedim.jl:241
is ambiguous with:
sum(BitArray{N},Any) at bitarray.jl:1570.
To fix, define
sum(BitArray{N},Union(Int32,(Int32...,),Array{Int32,1}))
before the new definition.
$
Is there any reason that the package doesn't have implementations for 1D distances? For example:
evaluate{T <: Real}(dist::SqEuclidean, a::T, b::T) = (a-b)*(a-b)
$ ./julia ~/.julia/Distance/runtests.jl
test/test_dists.jl ...
ERROR: test error during all_approx(colwise(SqMahalanobis(Q),X,Y),#2575#r1,1.0e-13)
no method all_approx(ContiguousView{Float64,2,Array{Float64,1}}, Array{Float64,1}, Float64)
in anonymous at test.jl:53
in do_test at test.jl:37
in anonymous at /home/rick/.julia/Distance/test/test_dists.jl:120
in anonymous at no file:6
in include_from_node1 at loading.jl:120
while loading /home/rick/.julia/Distance/test/test_dists.jl, in expression starting on line 152
while loading /home/rick/.julia/Distance/runtests.jl, in expression starting on line 3
$ ./julia -E 'Pkg.installed("Distance")'
v"0.3.0"
$ ./julia -e 'versioninfo()'
Julia Version 0.3.0-prerelease+1599
Commit 6477ca2* (2014-02-17 00:38 UTC)
Platform Info:
System: Linux (i686-redhat-linux)
CPU: Genuine Intel(R) CPU T2250 @ 1.73GHz
WORD_SIZE: 32
BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY)
LAPACK: libopenblas
LIBM: libopenlibm
$
I am trying to release a package that has Distance.jl and PyCall.jl as its only dependencies. I get the following errors when I use Pkg.publish(). Any thoughts on what I should do?
DataFrames v0.5.0 – no valid versions exist for package DataArrays
Distance v0.3.0 – no valid versions exist for package NumericExtensions
Distance v0.3.1 – no valid versions exist for package NumericExtensions
RDatasets v0.1.0 – no valid versions exist for package DataArrays
The following code fails with an InexactError:
pairwise(Euclidean(), [1 2;
3 4])
This could be changed by forcing certain metrics, like Euclidean, to specify their result_type as a floating point:
result_type(::Euclidean, T1::Type, T2::Type) = Float64
Does that seem alright to you?
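A plausible reading of the failure, stated here as an assumption, is that the result array is allocated with the integer element type of the input, so storing a non-integer square root fails. The sketch below reproduces that mechanism with Base Julia only and shows that allocating with a floating-point element type avoids it; it does not call the package.

```julia
# Storing a non-integer square root into an integer array raises InexactError.
r = zeros(Int, 2, 2)
threw = try
    r[1, 2] = sqrt(8.0)
    false
catch err
    err isa InexactError
end

# Allocating the result with a floating-point element type avoids the problem.
x = [1 2; 3 4]
T = float(eltype(x))               # Float64 for Int input
r_ok = zeros(T, 2, 2)
r_ok[1, 2] = sqrt(sum(abs2, x[:, 1] .- x[:, 2]))   # distance between columns 1 and 2
```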