rdp's People

Contributors

fhirschmann, reinout, zippy1981

rdp's Issues

limiting number of points

Dear all,

Is there a way to limit the number of simplified points in addition to the maximum distance (epsilon)?
Or, to be more precise, to pick only the N most relevant points for a given dmax?

Thanks in advance!

Cheers,
Ivica
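One way to approach this, sketched under assumptions: instead of recursing, always split the segment with the largest deviation first, and stop once either N points are kept or no deviation exceeds epsilon. The names `rdp_top_n` and `_farthest` are hypothetical helpers, not part of the rdp package, and the sketch handles 2D points only.

```python
import heapq

import numpy as np


def _farthest(points, i, j):
    """Return (distance, index) of the point strictly between i and j that
    lies farthest from the line through points[i] and points[j]."""
    if j - i < 2:
        return 0.0, -1  # no interior points
    start, end = points[i], points[j]
    rel = points[i + 1:j] - start
    vec = end - start
    norm = np.linalg.norm(vec)
    if norm == 0.0:
        dists = np.linalg.norm(rel, axis=1)
    else:
        # 2D cross product magnitude divided by the base length
        dists = np.abs(vec[0] * rel[:, 1] - vec[1] * rel[:, 0]) / norm
    k = int(np.argmax(dists))
    return float(dists[k]), i + 1 + k


def rdp_top_n(points, epsilon, max_points):
    """Return sorted indices of at most max_points points (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    if n <= 2:
        return list(range(n))
    keep = {0, n - 1}
    d, k = _farthest(points, 0, n - 1)
    heap = [(-d, 0, n - 1, k)]  # max-heap via negated distance
    while heap and len(keep) < max_points:
        neg_d, i, j, k = heapq.heappop(heap)
        if -neg_d <= epsilon:
            break  # every remaining segment is already within epsilon
        keep.add(k)
        for a, b in ((i, k), (k, j)):
            d, m = _farthest(points, a, b)
            if m >= 0:
                heapq.heappush(heap, (-d, a, b, m))
    return sorted(keep)
```

When `max_points` is large enough not to matter, this should keep the same points as plain RDP with the same epsilon, since only the order of the splits changes.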

Makes no difference in reducing the number of points

I have a dataframe and I want to reduce the number of points for every feature.

This is my dataframe:

    recency  frequency      money
0  8.353111   1.625226  20.943134
1  2.934699   4.013015  26.170988
2  5.703040   4.013015  31.328091
3  4.958268   4.529335  42.014511
4  4.291614   5.502551  31.964992

The shape is (793, 3) before rdp.

After using RDP
[ 8.35311112, 1.62522615, 20.94313392],
[ 2.93469946, 4.01301524, 26.17098815],
[ 5.70303988, 4.01301524, 31.32809134],
...,

The shape is still (793, 3) after rdp; the output is unchanged.

keeping track of the index element output from rdp

Hello and thank you for your package.
I am using it for signal processing, and each data point has a timestamp. Is it possible to recover these timestamps, or the indices of the data points that are kept, from the output of the rdp algorithm?

Many thanks
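If your installed version of the package supports a `return_mask` argument (worth checking in its documentation), that is probably the direct route. Independent of that, here is a minimal self-contained sketch that returns the kept indices instead of the kept points, so they can be used to look up timestamps. `rdp_indices` and `perpendicular_distances` are hypothetical names, and the sketch assumes 2D points.

```python
import numpy as np


def perpendicular_distances(points, start, end):
    """Distances from each 2D point to the line through start and end."""
    vec = end - start
    norm = np.linalg.norm(vec)
    if norm == 0.0:
        return np.linalg.norm(points - start, axis=1)
    rel = points - start
    return np.abs(vec[0] * rel[:, 1] - vec[1] * rel[:, 0]) / norm


def rdp_indices(points, epsilon, lo=0, hi=None):
    """Return the sorted indices of the points RDP keeps (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    if hi is None:
        hi = len(points) - 1
    if hi - lo < 2:
        return list(range(lo, hi + 1))  # nothing between lo and hi to drop
    dists = perpendicular_distances(points[lo + 1:hi], points[lo], points[hi])
    k = int(np.argmax(dists))
    if dists[k] <= epsilon:
        return [lo, hi]
    split = lo + 1 + k
    left = rdp_indices(points, epsilon, lo, split)
    right = rdp_indices(points, epsilon, split, hi)
    return left[:-1] + right  # the split index appears in both halves


# Made-up example data: time on the x axis, signal value on the y axis.
timestamps = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
signal = np.array([0.0, 0.05, 0.0, 2.0, 0.0])
idx = rdp_indices(np.column_stack((timestamps, signal)), epsilon=0.1)
kept_timestamps = timestamps[idx]
```

Note that time and signal usually have different units, so the distance (and therefore epsilon) depends on how the two axes are scaled.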

simplify_coords function causes unexpected exit

The simplify_coords function, which uses the RDP algorithm, causes my script to abruptly (and silently) exit under both 0.3.9 and 0.4.4 (the two versions I tested).

It was passed a list of approximately 212k LineStrings (i.e. [x, y, z] elements) and exited to the command prompt seconds later.

ccount = len(coordlist)
simplified = simplify_coords(coordlist, 1)
print ccount, len(simplified)

I was running my script under Python 2.7.10 on Windows 10.

unoptimized for loop

This code works MUCH faster. It does not use a for loop; instead, it uses numpy vectorization:

import numpy as np


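def line_dists(points, start, end):
    """Distance from every row of ``points`` to the line through ``start`` and ``end``."""
    # Degenerate case: the line collapses to a single point.
    if np.all(start == end):
        return np.linalg.norm(points - start, axis=1)

    vec = end - start
    # For 2D input, |cross| is the parallelogram area; dividing by the
    # base length gives the perpendicular distance.
    cross = np.cross(vec, start - points)
    return np.divide(abs(cross), np.linalg.norm(vec))


def rdp(M, epsilon=0):
    M = np.array(M)
    start, end = M[0], M[-1]
    # All distances are computed in one vectorized call instead of a loop.
    dists = line_dists(M, start, end)

    index = np.argmax(dists)
    dmax = dists[index]

    if dmax > epsilon:
        # Split at the farthest point and simplify both halves.
        result1 = rdp(M[:index + 1], epsilon)
        result2 = rdp(M[index:], epsilon)

        # The split point ends result1 and starts result2; keep it once.
        result = np.vstack((result1[:-1], result2))
    else:
        # Everything lies within epsilon: keep only the endpoints.
        result = np.array([start, end])

    return result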

epsilon is int type

Python: 3.9
RDP: 0.8

Minor issue: in the rdp function signature, epsilon defaults to 0, which makes Pylance infer its type as int.

Bizarre np.linalg.norm error

Hi fhirschmann,

Thanks for writing this library, it's immensely useful for me. I'm trying to use it in some neural network applications where I'm simplifying the data I am analyzing - mainly vector images.

Occasionally I get a weird error thrown from the np.linalg.norm calls in the rdp code: "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()". It works about 95% of the time, though. I wonder if you have encountered this issue before. If not, I'll try to do more digging to see what is going on.

Cheers.
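For reference, this is how that message generally arises in NumPy: an element-wise comparison yields an array, and using that array directly as a condition is ambiguous. Which code path triggers it in the report above is not known; the snippet below only reproduces the message with made-up values.

```python
import numpy as np

start = np.array([1.0, 2.0])
end = np.array([1.0, 3.0])

try:
    # start == end is array([ True, False]), not a single bool
    if start == end:
        pass
except ValueError as exc:
    message = str(exc)

# Reducing the comparison to one bool avoids the ambiguity.
same_point = bool(np.all(start == end))
```

Checking the shapes of the inputs in the failing cases (for example, a nested array where a single point is expected) may help narrow it down.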

Wrong calculation of pldist

The current pldist function returns the distance between a point and the infinite line through two other points (i.e. the distance to the orthogonal projection), which is incorrect. It should return the distance between a point and a line segment.

Here is a fix coming from this answer https://stackoverflow.com/a/54442561:

import numpy as np


def pldist(point, start, end):
    """
    Calculates the distance from ``point`` to the line segment given
    by the points ``start`` and ``end``.

    :param point: a point
    :type point: numpy array
    :param start: first endpoint of the segment
    :type start: numpy array
    :param end: second endpoint of the segment
    :type end: numpy array
    """
    # degenerate segment: both endpoints coincide
    if np.all(start == end):
        return np.linalg.norm(point - start)

    # normalized tangent vector
    d = np.divide(end - start, np.linalg.norm(end - start))

    # signed parallel distance components
    s = np.dot(start - point, d)
    t = np.dot(point - end, d)

    # clamped parallel distance
    h = np.max([s, t, 0])

    # perpendicular distance component, as before
    c = np.cross(point - start, d)

    # use hypot for Pythagoras to improve accuracy
    return np.hypot(h, c)
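To make the difference concrete, here is a small sketch comparing the two distances for a point that lies beyond the end of the segment. The helper names `dist_to_line` and `dist_to_segment` are hypothetical; the sketch assumes 2D points and distinct endpoints.

```python
import numpy as np


def dist_to_line(p, a, b):
    """Perpendicular distance from p to the infinite line through a and b."""
    ab, ap = b - a, p - a
    return abs(ab[0] * ap[1] - ab[1] * ap[0]) / np.linalg.norm(ab)


def dist_to_segment(p, a, b):
    """Distance from p to the closest point on the segment from a to b."""
    ab, ap = b - a, p - a
    # Position of the projection along the segment, clamped to [0, 1].
    t = np.clip(np.dot(ap, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))


a = np.array([0.0, 0.0])
b = np.array([1.0, 0.0])
p = np.array([2.0, 1.0])  # projects onto the line beyond b

line_d = dist_to_line(p, a, b)        # 1.0
segment_d = dist_to_segment(p, a, b)  # sqrt(2), measured to endpoint b
```

The two agree whenever the projection of the point falls between the endpoints, and differ only outside that range.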

Recursion bug

Although I love recursion and Ramer-Douglas-Peucker is naturally defined recursively, recursion is quite inefficient in Python and often leads to errors like RecursionError: maximum recursion depth exceeded in comparison. In my case I used a GPS track from a GPS sports watch with about 12072 points. Python's default recursion limit is 1000. It can be raised with sys.setrecursionlimit, but that only shifts the problem a bit further away and should not be relied on.
Luckily, RDP can also be defined iteratively as shown here.
Since your RDP implementation seems to be the only one on PyPI it would be nice to reformulate it as an iterative algorithm.
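A minimal sketch of what an iterative formulation could look like, using an explicit stack of index ranges in place of the call stack. This is not the package's code; `rdp_iterative` is a hypothetical name, and the sketch assumes 2D points and returns a boolean mask.

```python
import numpy as np


def rdp_iterative(points, epsilon):
    """Return a boolean mask of the points RDP keeps, without recursion."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    keep = np.ones(n, dtype=bool)
    stack = [(0, n - 1)]  # index ranges still to be examined
    while stack:
        lo, hi = stack.pop()
        if hi - lo < 2:
            continue  # no interior points in this range
        start, end = points[lo], points[hi]
        rel = points[lo + 1:hi] - start
        vec = end - start
        norm = np.linalg.norm(vec)
        if norm == 0.0:
            dists = np.linalg.norm(rel, axis=1)
        else:
            dists = np.abs(vec[0] * rel[:, 1] - vec[1] * rel[:, 0]) / norm
        k = int(np.argmax(dists))
        if dists[k] > epsilon:
            split = lo + 1 + k
            stack.append((lo, split))
            stack.append((split, hi))
        else:
            keep[lo + 1:hi] = False  # drop everything between the endpoints
    return keep
```

The stack lives on the heap as an ordinary list, so the input length no longer interacts with the interpreter's recursion limit. The simplified track would then be `points[mask]`.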
