pyxDamerauLevenshtein

LICENSE

This software is licensed under the BSD 3-Clause License. Please refer to the separate LICENSE file for the exact text of the license. You are obligated to give attribution if you use this code.

ABOUT

pyxDamerauLevenshtein implements the Damerau-Levenshtein (DL) edit distance algorithm for Python in Cython for high performance. Courtesy Wikipedia:

In information theory and computer science, the Damerau-Levenshtein distance (named after Frederick J. Damerau and Vladimir I. Levenshtein) is a "distance" (string metric) between two strings, i.e., finite sequence of symbols, given by counting the minimum number of operations needed to transform one string into the other, where an operation is defined as an insertion, deletion, or substitution of a single character, or a transposition of two adjacent characters.

This implementation is based on Michael Homer's pure Python implementation, which implements the optimal string alignment distance algorithm. It runs in O(N*M) time using O(M) space. It supports unicode characters.
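
To make the recurrence concrete, here is a minimal pure-Python sketch of the optimal string alignment distance. It is not the library's Cython code; the name `osa_distance` is an assumption made for illustration. It keeps only three rows of the dynamic programming table, which is one way to stay within O(M) space.

```python
def osa_distance(seq1, seq2):
    """Optimal string alignment distance between two sequences (illustrative sketch)."""
    n, m = len(seq1), len(seq2)
    two_ago = None              # row i-2, needed for transpositions
    prev = list(range(m + 1))   # row 0: build seq2[:j] from nothing with j insertions
    for i in range(1, n + 1):
        cur = [i] + [0] * m     # column 0: delete the first i items of seq1
        for j in range(1, m + 1):
            cost = 0 if seq1[i - 1] == seq2[j - 1] else 1
            cur[j] = min(
                prev[j] + 1,         # deletion
                cur[j - 1] + 1,      # insertion
                prev[j - 1] + cost,  # match or substitution
            )
            # Transposition of two adjacent items.
            if (i > 1 and j > 1
                    and seq1[i - 1] == seq2[j - 2]
                    and seq1[i - 2] == seq2[j - 1]):
                cur[j] = min(cur[j], two_ago[j - 2] + 1)
        two_ago, prev = prev, cur
    return prev[m]


print(osa_distance('smtih', 'smith'))  # 1: a single transposition
```

Because the function only indexes and compares items, it also accepts lists and tuples.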

REQUIREMENTS

This code requires Python 3.7+ and a C compiler such as GCC. Although the code was written in Cython, Cython is not required for installation.

INSTALL

pyxDamerauLevenshtein is available on PyPI at https://pypi.org/project/pyxDamerauLevenshtein/.

Install using pip:

pip install pyxDamerauLevenshtein

Install from source:

python setup.py install

or

pip install .

USING THIS CODE

The following methods are available:

  • Edit distance (damerau_levenshtein_distance)

    • Compute the raw distance between two strings (i.e., the minimum number of operations necessary to transform one string into the other).
    • Additionally, the distance between lists and tuples can also be computed.
  • Normalized edit distance (normalized_damerau_levenshtein_distance)

    • Compute the ratio of the edit distance to the length of the longer sequence, i.e., max(len(string1), len(string2)). 0.0 means that the sequences are identical, while 1.0 means that they have nothing in common. Note that this scale runs in the opposite direction to difflib.SequenceMatcher.ratio(), where 1.0 means identical.
  • Edit distance against a sequence of sequences (damerau_levenshtein_distance_seqs)

    • Compute the raw distances between a sequence and each sequence within another sequence (e.g., list, tuple).
  • Normalized edit distance against a sequence of sequences (normalized_damerau_levenshtein_distance_seqs)

    • Compute the normalized distances between a sequence and each sequence within another sequence (e.g., list, tuple).

Basic use:

from pyxdameraulevenshtein import damerau_levenshtein_distance, normalized_damerau_levenshtein_distance
damerau_levenshtein_distance('smtih', 'smith')  # expected result: 1
normalized_damerau_levenshtein_distance('smtih', 'smith')  # expected result: 0.2
damerau_levenshtein_distance([1, 2, 3, 4, 5, 6], [7, 8, 9, 7, 10, 11, 4])  # expected result: 7

from pyxdameraulevenshtein import damerau_levenshtein_distance_seqs, normalized_damerau_levenshtein_distance_seqs
array = ['test1', 'test12', 'test123']
damerau_levenshtein_distance_seqs('test', array)  # expected result: [1, 2, 3]
normalized_damerau_levenshtein_distance_seqs('test', array)  # expected result: [0.2, 0.33333334, 0.42857143]
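
The normalized results above can be checked by hand. The sketch below assumes the normalization is simply the raw distance divided by the length of the longer sequence. The helper name `normalize` and the handling of two empty sequences are assumptions, not the library's code.

```python
def normalize(distance, seq1, seq2):
    """Divide a raw edit distance by the length of the longer sequence."""
    longest = max(len(seq1), len(seq2))
    # Assumption: two empty sequences count as identical (0.0).
    return distance / longest if longest else 0.0


# Raw distances from 'test' to each candidate, taken from the example above.
candidates = ['test1', 'test12', 'test123']
raw = [1, 2, 3]
normalized = [normalize(d, 'test', c) for d, c in zip(raw, candidates)]
print(normalized)  # roughly 1/5, 2/6 and 3/7
```

Subtracting each value from 1.0 gives a similarity score oriented like difflib's ratio, although the two are computed by different algorithms and will generally not be equal.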

DIFFERENCES

Other Python DL implementations:

pyxDamerauLevenshtein differs from other Python implementations in that it is both fast via Cython and supports unicode. Michael Homer's implementation is fast for Python, but it is two orders of magnitude slower than this Cython implementation. jellyfish provides C implementations for a variety of string comparison metrics and is sometimes faster than pyxDamerauLevenshtein.

Python's built-in difflib.SequenceMatcher.ratio() performs about an order of magnitude faster than Michael Homer's implementation but is still one order of magnitude slower than this DL implementation. difflib, however, uses a different algorithm (difflib uses the Ratcliff/Obershelp algorithm).

Performance differences (on an Intel i7-2600 running at 3.4 GHz):

>>> import timeit
>>> #this implementation:
... timeit.timeit("damerau_levenshtein_distance('e0zdvfb840174ut74j2v7gabx1 5bs', 'qpk5vei 4tzo0bglx8rl7e 2h4uei7')", 'from pyxdameraulevenshtein import damerau_levenshtein_distance', number=500000)
7.417556047439575
>>> #Michael Homer's native Python implementation:
... timeit.timeit("dameraulevenshtein('e0zdvfb840174ut74j2v7gabx1 5bs', 'qpk5vei 4tzo0bglx8rl7e 2h4uei7')", 'from dameraulevenshtein import dameraulevenshtein', number=500000)
667.0276439189911
>>> #difflib
... timeit.timeit("difflib.SequenceMatcher(None, 'e0zdvfb840174ut74j2v7gabx1 5bs', 'qpk5vei 4tzo0bglx8rl7e 2h4uei7').ratio()", 'import difflib', number=500000)
135.41051697731018

pyxdameraulevenshtein's People

Contributors

andlen, gfairchild, gsnedders, internaut, joleaf, maxbachmann, mittagessen, nijel, onyb, simobasso, svenski

pyxdameraulevenshtein's Issues

Distance between "abc", "abcde" is 2 but between "abcde", "abc" is 1

I appreciate that this implementation is a great help with array calculations!
I checked the distances for some simple strings.
I am afraid I am getting strange distance results.

damerau_levenshtein_distance('abcde', 'abc') = 1
damerau_levenshtein_distance('abc', 'abcde') = 2
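
An edit distance should give the same value in both directions, so a report like this can be turned into a small regression check. The sketch below is an illustration only: `find_asymmetries` and the deliberately broken `lopsided` stand-in are hypothetical names, not part of the library.

```python
def find_asymmetries(distance, pairs):
    """Return (a, b, d_ab, d_ba) for every pair where the two directions disagree."""
    failures = []
    for a, b in pairs:
        forward, backward = distance(a, b), distance(b, a)
        if forward != backward:
            failures.append((a, b, forward, backward))
    return failures


# Hypothetical stand-in that undercounts when the first argument is longer.
def lopsided(a, b):
    return max(len(b) - len(a), 0)


print(find_asymmetries(lopsided, [('abc', 'abcde'), ('1', ''), ('ab', 'cd')]))
```

Passing the library's damerau_levenshtein_distance as `distance` should return an empty list on a version without this bug.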

Failure with numpy 1.18.4

We're experiencing a bug with pyxDamerauLevenshtein 1.5.3 (any version above, and including, 1.5.2, really), when using it with numpy 1.18.4 (or any version between 1.18.4 and 1.20.0).

Here's how to reproduce (using docker):

docker run --rm -ti python:3.7.6 bash
pip install numpy==1.18.4
pip install pyxdameraulevenshtein==1.5.3
python -c 'from pyxdameraulevenshtein import damerau_levenshtein_distance'

This will produce the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "__init__.pxd", line 242, in init pyxdameraulevenshtein
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

This was working fine until at least last Friday, but started failing without any change on our part.

I'm not sure if anything can be done (seems pretty hard to debug), but if someone has the same issue, upgrading numpy to 1.20.0+ or downgrading pyxDamerauLevenshtein to 1.5.1 works.
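
For anyone who wants to guard against this combination in their own tooling, here is a minimal sketch that encodes only what this report states: pyxDamerauLevenshtein 1.5.2 or later together with numpy from 1.18.4 up to, but not including, 1.20.0. The function names are hypothetical, and versions outside the reported range are simply not covered.

```python
def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints (no pre-release suffixes)."""
    return tuple(int(part) for part in text.split(".")[:3])


def matches_reported_failure(numpy_version, package_version):
    """True if the version pair falls inside the range described in this report."""
    numpy_v = parse_version(numpy_version)
    package_v = parse_version(package_version)
    return package_v >= (1, 5, 2) and (1, 18, 4) <= numpy_v < (1, 20, 0)


print(matches_reported_failure("1.18.4", "1.5.3"))  # True: the reproduced case
print(matches_reported_failure("1.20.0", "1.5.3"))  # False: numpy upgraded
print(matches_reported_failure("1.18.4", "1.5.1"))  # False: package downgraded
```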

Great -> Grreat != Meat -> Meet

This is a feature request.

It would be useful for me to distinguish between the distance of 1 caused by an extra letter repeated vs. a completely different letter.

In this example:
Meat -> Meet # Distance 1 is okay
Great -> Grreat # Repeating just the letter 'r', so I would like the distance to be less than 1 (0.5?)
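
One way to picture this request is a weighted edit distance in which inserting or deleting a character that merely repeats its left neighbour costs less than a full edit. The sketch below is a hypothetical illustration, not an existing library feature: `repeat_aware_distance` and `repeat_cost` are made-up names, and it is built on plain Levenshtein without transpositions.

```python
def repeat_aware_distance(a, b, repeat_cost=0.5):
    """Levenshtein distance where repeating the previous character is cheaper."""
    n, m = len(a), len(b)

    def ins_cost(j):
        # Inserting b[j-1] is cheap if it repeats the character before it in b.
        return repeat_cost if j >= 2 and b[j - 1] == b[j - 2] else 1.0

    def del_cost(i):
        # Deleting a[i-1] is cheap if it repeats the character before it in a.
        return repeat_cost if i >= 2 and a[i - 1] == a[i - 2] else 1.0

    # d[i][j] is the cost of turning a[:i] into b[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + del_cost(i)
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_cost(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if a[i - 1] == b[j - 1] else 1.0
            d[i][j] = min(
                d[i - 1][j] + del_cost(i),
                d[i][j - 1] + ins_cost(j),
                d[i - 1][j - 1] + sub,
            )
    return d[n][m]


print(repeat_aware_distance('Meat', 'Meet'))     # 1.0
print(repeat_aware_distance('Great', 'Grreat'))  # 0.5
```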

reproducing benchmark results

I tried to reproduce the benchmarks in your readme. However, my results running the same benchmark differ greatly from the results you achieved. Note that I scaled the benchmarks down from 500000 to 5000 iterations, since this is enough to get a good idea of the performance difference without spending all day running the benchmark.
On Python 3.9 I get:

>>> import timeit
>>> timeit.timeit("damerau_levenshtein_distance('e0zdvfb840174ut74j2v7gabx1 5bs', 'qpk5vei 4tzo0bglx8rl7e 2h4uei7')", 'from pyxdameraulevenshtein import damerau_levenshtein_distance', number=5000)
0.30914585100072145
>>> timeit.timeit("dameraulevenshtein('e0zdvfb840174ut74j2v7gabx1 5bs', 'qpk5vei 4tzo0bglx8rl7e 2h4uei7')", 'from dameraulevenshtein import dameraulevenshtein', number=5000)
2.0448212559995227
>>> timeit.timeit("difflib.SequenceMatcher(None, 'e0zdvfb840174ut74j2v7gabx1 5bs', 'qpk5vei 4tzo0bglx8rl7e 2h4uei7').ratio()", 'import difflib', number=5000)
0.29983263299982355

and on Python 2.7:

>>> import timeit
>>> timeit.timeit("damerau_levenshtein_distance('e0zdvfb840174ut74j2v7gabx1 5bs', 'qpk5vei 4tzo0bglx8rl7e 2h4uei7')", 'from pyxdameraulevenshtein import damerau_levenshtein_distance', number=5000)
0.4308760166168213
>>> timeit.timeit("dameraulevenshtein('e0zdvfb840174ut74j2v7gabx1 5bs', 'qpk5vei 4tzo0bglx8rl7e 2h4uei7')", 'from dameraulevenshtein import dameraulevenshtein', number=5000)
1.8721919059753418
>>> timeit.timeit("difflib.SequenceMatcher(None, 'e0zdvfb840174ut74j2v7gabx1 5bs', 'qpk5vei 4tzo0bglx8rl7e 2h4uei7').ratio()", 'import difflib', number=5000)
0.3515639305114746

So the performance ratios in your benchmarks appear to be off by about one order of magnitude, both for pyxdameraulevenshtein vs. Michael Homer's implementation and for pyxdameraulevenshtein vs. difflib. My best guess is that the Python version the benchmarks were created on (they have existed since the library targeted Python 2.4+) was significantly slower than more recent versions of Python. It would probably make sense to redo these benchmarks on a more recent version of Python.
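
A small harness makes it easier to rerun this comparison on whatever interpreter is at hand. The sketch below is one possible setup, not the script used for either set of numbers: the `benchmark` function and its defaults are assumptions, and pyxdameraulevenshtein is timed only if it is installed.

```python
import difflib
import timeit

A = 'e0zdvfb840174ut74j2v7gabx1 5bs'
B = 'qpk5vei 4tzo0bglx8rl7e 2h4uei7'


def benchmark(number=2000, repeat=3):
    """Return the best total time in seconds for each available implementation."""
    candidates = {
        'difflib': lambda: difflib.SequenceMatcher(None, A, B).ratio(),
    }
    try:
        from pyxdameraulevenshtein import damerau_levenshtein_distance
    except ImportError:
        pass  # library not installed: time difflib only
    else:
        candidates['pyxdameraulevenshtein'] = (
            lambda: damerau_levenshtein_distance(A, B)
        )
    # min() over several repeats reduces noise from other processes.
    return {
        name: min(timeit.repeat(func, number=number, repeat=repeat))
        for name, func in candidates.items()
    }


for name, seconds in benchmark().items():
    print(f'{name}: {seconds:.4f} s')
```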

Substring Levenshtein

Is it possible to add a method that, given a substring a, a string b, and a maximum normalized edit distance, returns the indexes of all occurrences of a in b where the normalized Levenshtein distance is less than or equal to the given maximum?
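
The request can be sketched with the classic approximate-matching variant of the edit distance table, in which a match is allowed to start anywhere in the text. This is a hypothetical illustration, not a library function: `fuzzy_find` is a made-up name, it uses plain Levenshtein (no transpositions), it normalizes by the length of the substring, and it reports the index just past the end of each match rather than the start.

```python
def fuzzy_find(needle, haystack, max_normalized):
    """Return (end_index, distance) for approximate occurrences of needle."""
    m = len(needle)
    if m == 0:
        return []
    # prev[i]: best cost of aligning needle[:i] with text ending at the previous position.
    prev = list(range(m + 1))
    hits = []
    for end, ch in enumerate(haystack, start=1):
        cur = [0]  # an occurrence may start at any position for free
        for i in range(1, m + 1):
            cost = 0 if needle[i - 1] == ch else 1
            cur.append(min(
                prev[i] + 1,         # extra character in the text
                cur[i - 1] + 1,      # needle character missing from the text
                prev[i - 1] + cost,  # match or substitution
            ))
        if cur[m] / m <= max_normalized:
            hits.append((end, cur[m]))
        prev = cur
    return hits


print(fuzzy_find('abc', 'xxabcxx', 0.0))  # [(5, 0)]
```

With a non-zero threshold, neighbouring end positions around one real occurrence often qualify as well, so callers would usually keep only the best hit in each cluster.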

Error Installing through Pip on Python3.7 on Linux Ubuntu 18.04

I'm unable to install this on my local Linux Ubuntu 18.04 running Python3.7.

My current install process / what I have tried:

Try installing some possible missing dependencies.

Create a fresh new virtual environment for Python3.7

sudo apt-get install -yqq --no-install-recommends apt-utils gcc build-essential

Try upgrading these packages
python3.7 -m pip install --upgrade pip setuptools wheel

Try installing the package
python3.7 -m pip install --no-cache-dir pyxDamerauLevenshtein

Note: These commands worked on a Docker image using Python3.7:slim-buster as the base image. I have also tried this on a few other machines running 18.04.

Current Output:

Defaulting to user installation because normal site-packages is not writeable
Collecting pyxDamerauLevenshtein
  Downloading pyxDamerauLevenshtein-1.6.1.tar.gz (55 kB)
     |████████████████████████████████| 55 kB 2.9 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3.7 /home/erb13020/.local/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmp_901baqu
       cwd: /tmp/pip-install-wnfgf6t1/pyxdameraulevenshtein_c6df6562c7e04fd6a22642aacd88bc26
  Complete output (10 lines):
  Traceback (most recent call last):
    File "/home/erb13020/.local/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 280, in <module>
      main()
    File "/home/erb13020/.local/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 263, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/home/erb13020/.local/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 108, in get_requires_for_build_wheel
      backend = _build_backend()
    File "/home/erb13020/.local/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 99, in _build_backend
      obj = getattr(obj, path_part)
  AttributeError: module 'setuptools.build_meta' has no attribute '__legacy__'
  ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3.7 /home/erb13020/.local/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmp_901baqu Check the logs for full command output.

Any hints?

Distance is one too small if first string is longer

I looked through the existing bugs and the documentation and didn't see anything pointing this out. I was also unable to reproduce this with your reference python implementation here

$ python3 -m pip show pyxdameraulevenshtein
Name: pyxDamerauLevenshtein
Version: 1.6
Summary: pyxDamerauLevenshtein implements the Damerau-Levenshtein (DL) edit distance algorithm for Python in Cython for high performance.
Home-page: https://github.com/gfairchild/pyxDamerauLevenshtein
Author: Geoffrey Fairchild
Author-email: [email protected]
License: BSD 3-Clause License
Location: /usr/local/lib/python3.7/site-packages
Requires: numpy
Required-by: 
>>> from pyxdameraulevenshtein import damerau_levenshtein_distance
>>> damerau_levenshtein_distance('1', '')
0
>>> damerau_levenshtein_distance('', '1')
1

distance_withNPArray: Expected unicode, got numpy.unicode_

Found a problem running damerau_levenshtein_distance_withNPArray with an array of Unicode strings under Python 2.7:

keys = numpy.array(store.keys(), dtype="U")
res = damerau_levenshtein_distance_withNPArray("Children’s songs", keys)

Exception TypeError: 'Expected unicode, got numpy.unicode_' in 'pyxdameraulevenshtein.damerau_levenshtein_distance' ignored
Exception TypeError: 'Expected unicode, got numpy.unicode_' in 'pyxdameraulevenshtein.damerau_levenshtein_distance' ignored
[...]

Environment: Python 2.7 (under Mac OS X), NumPy 1.9.2

How to make edits to the source code

Thanks, it is really cool!

I was trying to make an edit, but I cannot get any effect when I do:

python setup.py install

I tried deleting all the source files in my virtualenv, including a .so file, but it doesn't make a difference.

Could you tell me what I can do to make a change to the .pyx and see the result in a built module?

Clarify that pyxDamerauLevenshtein calculates the restricted Damerau-Levenshtein distance

There are two different distances called Damerau-Levenshtein (see the Wikipedia page, for example), one is the restricted distance and the other is the original one, the unrestricted distance. pyxDamerauLevenshtein calculates the restricted distance (so the distance between "CA" and "ABC" is 3, not 2). It would be good for this to be made clear in the documentation. (Even nicer would be to have a Cython implementation of the unrestricted distance as well!)
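
For comparison, here is a minimal pure-Python sketch of the unrestricted distance, following the textbook dynamic programming formulation with a sentinel row and column. It is an illustration only, not code from the library, and the name `unrestricted_dl` is an assumption.

```python
def unrestricted_dl(a, b):
    """Unrestricted Damerau-Levenshtein distance (illustrative sketch)."""
    n, m = len(a), len(b)
    inf = n + m
    # d is shifted by one so that row 0 and column 0 hold a sentinel value.
    d = [[inf] * (m + 2) for _ in range(n + 2)]
    for i in range(n + 1):
        d[i + 1][1] = i
    for j in range(m + 1):
        d[1][j + 1] = j
    last_row = {}  # last row (1-based) in which each item of a was seen
    for i in range(1, n + 1):
        last_col = 0  # last column in this row where a[i-1] matched b
        for j in range(1, m + 1):
            k = last_row.get(b[j - 1], 0)
            l = last_col
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            else:
                cost = 1
            d[i + 1][j + 1] = min(
                d[i][j] + cost,   # match or substitution
                d[i + 1][j] + 1,  # insertion
                d[i][j + 1] + 1,  # deletion
                # transposition, with edits allowed between the swapped items
                d[k][l] + (i - k - 1) + 1 + (j - l - 1),
            )
        last_row[a[i - 1]] = i
    return d[n + 1][m + 1]


print(unrestricted_dl('CA', 'ABC'))  # 2, versus 3 for the restricted distance
```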

Thanks!

PEP561 type stubs for mypy

I've written the needed pyi files to type the library. It looks like it would take some pretty big changes to the package structure to make it work with side by side stubs. Would you prefer I keep trying to get it to work side by side? I can also make a stub-only package and give it to you to upload separately? Or would you prefer something else?

Calculated distance seems to be incorrect in some situations

It's definitely possible that I am doing something totally silly here, but I think that this algorithm is returning an incorrect value for a pair of inputs that I tried.

In [1]: from pyxdameraulevenshtein import damerau_levenshtein_distance

In [2]: damerau_levenshtein_distance("49482", "48924")
Out[2]: 4

I believe that the Damerau-Levenshtein distance between these two sequences is actually 3, not 4. The words can be transformed as follows:

After 0 operations: 49482
After 1 operations: 49842 (transpose 8 and 4)
After 2 operations: 49824 (transpose 2 and 4)
After 3 operations: 48924 (transpose 9 and 8)
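
The three steps above can be replayed mechanically. The sketch below applies the adjacent swaps in order and tracks which original characters are moved more than once; `apply_adjacent_swaps` is a hypothetical helper written for this illustration. Moving the same character twice is exactly what an optimal string alignment (restricted) distance does not allow, which may explain the larger value reported by the library.

```python
from collections import Counter


def apply_adjacent_swaps(text, positions):
    """Swap text[p] and text[p+1] for each p; return all steps and reused characters."""
    chars = list(text)
    ids = list(range(len(text)))  # identify each character by its original index
    moved = Counter()
    steps = [text]
    for p in positions:
        chars[p], chars[p + 1] = chars[p + 1], chars[p]
        ids[p], ids[p + 1] = ids[p + 1], ids[p]
        moved[ids[p]] += 1
        moved[ids[p + 1]] += 1
        steps.append(''.join(chars))
    reused = sorted(i for i, count in moved.items() if count > 1)
    return steps, reused


steps, reused = apply_adjacent_swaps('49482', [2, 3, 1])
print(steps)   # ['49482', '49842', '49824', '48924']
print(reused)  # [2, 3]: the original '4' at index 2 and '8' at index 3 move twice
```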

'ValueError: numpy.ufunc has the wrong size, try recompiling' when install pyxDamerauLevenshtein

How to reproduce:

  1. docker run -it --entrypoint=/bin/bash python:3.6 to have a clean environment;
  2. pip install numpy pyxdameraulevenshtein
  3. call python
  4. >>> import pyxdameraulevenshtein

Result:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "__init__.pxd", line 885, in init pyxdameraulevenshtein
ValueError: numpy.ufunc has the wrong size, try recompiling. Expected 192, got 216

pyxDamerauLevenshtein==1.5

ImportError: cannot import name 'damerau_levenshtein_distance_ndarray'

In the CHANGES.md I read:

This is a breaking change if you currently rely on either of the *_ndarray methods.

`damerau_levenshtein_distance_ndarray` refactored to `damerau_levenshtein_distance_seqs`

Simply claiming that you broke my code is not helpful. I noticed.

Please document better how you broke the code that relies on it.

Also, please provide a migration plan.

Thanks in advance.
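
Until callers are updated, a thin compatibility wrapper is one possible migration path. The sketch below is an assumption-laden illustration: it takes the new function as a parameter, assumes the old and new functions share the (sequence, collection) argument order, and converts array-like input to a plain list before delegating. `make_legacy_wrapper` is a hypothetical name.

```python
def make_legacy_wrapper(seqs_func):
    """Wrap a *_seqs function so it can be called like the old *_ndarray one."""
    def legacy(seq, array):
        # NumPy arrays expose tolist(); other iterables are copied into a list.
        candidates = array.tolist() if hasattr(array, 'tolist') else list(array)
        return seqs_func(seq, candidates)
    return legacy


# Stand-in for the real function so that the sketch runs without the library.
def fake_distance_seqs(seq, seqs):
    return [abs(len(seq) - len(other)) for other in seqs]


damerau_levenshtein_distance_ndarray = make_legacy_wrapper(fake_distance_seqs)
print(damerau_levenshtein_distance_ndarray('test', ('test1', 'test12', 'test123')))
```

In real code, `fake_distance_seqs` would be replaced by damerau_levenshtein_distance_seqs imported from pyxdameraulevenshtein. Note that the wrapper returns whatever the new function returns, so code that relied on a NumPy array result would still need to convert it.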

Cannot install module on Python 3.7

Hi. I am trying to install this module, but running into an error relating to a numpy import issue?

Python version: 3.7.3 (installed from source)
Numpy: Installed by using pip install
OS: CentOS 7.6

This is the error I got from console when running the pip install for this module:
Original error was: /tmp/pip-build-env-lfix5eze/overlay/lib/python3.7/site-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so: failed to map segment from shared object: Operation not permitted

Unable to install as a setup.py dependency

In my project, when I try to install pyxDamerauLevenshtein as a runtime dependency in my setuptools setup.py file, I get the following error:

Processing dependencies for atarashi==0.0.9
Searching for pyxDamerauLevenshtein>=1.5
Reading https://pypi.org/simple/pyxDamerauLevenshtein/
Downloading https://files.pythonhosted.org/packages/09/d8/77d02800d687ff8e12c8ec7b4ed917249fca27a1bccc6d24f0ac507a794c/pyxDamerauLevenshtein-1.5.tar.gz#sha256=00b836cdb7cff24abe20c2fdc6cdda4ac24413a82cf318c983980ce4aaec7e16
Best match: pyxDamerauLevenshtein 1.5
Processing pyxDamerauLevenshtein-1.5.tar.gz
Writing /tmp/easy_install-r1p8hre8/pyxDamerauLevenshtein-1.5/setup.cfg
Running pyxDamerauLevenshtein-1.5/setup.py -q bdist_egg --dist-dir /tmp/easy_install-r1p8hre8/pyxDamerauLevenshtein-1.5/egg-dist-tmp-twm03pcs

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 154, in save_modules
    yield saved
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 195, in setup_context
    yield
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 250, in run_setup
    _execfile(setup_script, ns)
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 45, in _execfile
    exec(code, globals, locals)
  File "/tmp/easy_install-r1p8hre8/pyxDamerauLevenshtein-1.5/setup.py", line 97, in <module>
    "atarashi/build_deps.py",
  File "/usr/local/lib/python3.7/site-packages/setuptools/__init__.py", line 140, in setup
    return distutils.core.setup(**attrs)
  File "/usr/local/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/local/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/usr/local/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/bdist_egg.py", line 163, in run
    self.run_command("egg_info")
  File "/usr/local/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/local/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/egg_info.py", line 296, in run
    self.find_sources()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/egg_info.py", line 303, in find_sources
    mm.run()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/egg_info.py", line 534, in run
    self.add_defaults()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/egg_info.py", line 570, in add_defaults
    sdist.add_defaults(self)
  File "/usr/local/lib/python3.7/distutils/command/sdist.py", line 228, in add_defaults
    self._add_defaults_ext()
  File "/usr/local/lib/python3.7/distutils/command/sdist.py", line 311, in _add_defaults_ext
    build_ext = self.get_finalized_command('build_ext')
  File "/usr/local/lib/python3.7/distutils/cmd.py", line 299, in get_finalized_command
    cmd_obj.ensure_finalized()
  File "/usr/local/lib/python3.7/distutils/cmd.py", line 107, in ensure_finalized
    self.finalize_options()
  File "/tmp/easy_install-r1p8hre8/pyxDamerauLevenshtein-1.5/setup.py", line 33, in finalize_options
    # Used for the long_description.  It's nice, because now 1) we have a top level
AttributeError: 'dict' object has no attribute '__NUMPY_SETUP__'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 174, in <module>
    'build_py': BuildAtarashi,
  File "/usr/local/lib/python3.7/site-packages/setuptools/__init__.py", line 140, in setup
    return distutils.core.setup(**attrs)
  File "/usr/local/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/local/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/usr/local/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/install.py", line 67, in run
    self.do_egg_install()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/install.py", line 117, in do_egg_install
    cmd.run()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/easy_install.py", line 415, in run
    self.easy_install(spec, not self.no_deps)
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/easy_install.py", line 657, in easy_install
    return self.install_item(None, spec, tmpdir, deps, True)
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/easy_install.py", line 704, in install_item
    self.process_distribution(spec, dist, deps)
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/easy_install.py", line 749, in process_distribution
    [requirement], self.local_index, self.easy_install
  File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 777, in resolve
    replace_conflicting=replace_conflicting
  File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 1060, in best_match
    return self.obtain(req, installer)
  File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 1072, in obtain
    return installer(requirement)
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/easy_install.py", line 676, in easy_install
    return self.install_item(spec, dist.location, tmpdir, deps)
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/easy_install.py", line 702, in install_item
    dists = self.install_eggs(spec, download, tmpdir)
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/easy_install.py", line 887, in install_eggs
    return self.build_and_install(setup_script, setup_base)
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/easy_install.py", line 1155, in build_and_install
    self.run_setup(setup_script, setup_base, args)
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/easy_install.py", line 1141, in run_setup
    run_setup(setup_script, args)
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 253, in run_setup
    raise
  File "/usr/local/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 195, in setup_context
    yield
  File "/usr/local/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 166, in save_modules
    saved_exc.resume()
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 141, in resume
    six.reraise(type, exc, self._tb)
  File "/usr/local/lib/python3.7/site-packages/setuptools/_vendor/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 154, in save_modules
    yield saved
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 195, in setup_context
    yield
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 250, in run_setup
    _execfile(setup_script, ns)
  File "/usr/local/lib/python3.7/site-packages/setuptools/sandbox.py", line 45, in _execfile
    exec(code, globals, locals)
  File "/tmp/easy_install-r1p8hre8/pyxDamerauLevenshtein-1.5/setup.py", line 97, in <module>
    "atarashi/build_deps.py",
  File "/usr/local/lib/python3.7/site-packages/setuptools/__init__.py", line 140, in setup
    return distutils.core.setup(**attrs)
  File "/usr/local/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/local/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/usr/local/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/bdist_egg.py", line 163, in run
    self.run_command("egg_info")
  File "/usr/local/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/local/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/egg_info.py", line 296, in run
    self.find_sources()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/egg_info.py", line 303, in find_sources
    mm.run()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/egg_info.py", line 534, in run
    self.add_defaults()
  File "/usr/local/lib/python3.7/site-packages/setuptools/command/egg_info.py", line 570, in add_defaults
    sdist.add_defaults(self)
  File "/usr/local/lib/python3.7/distutils/command/sdist.py", line 228, in add_defaults
    self._add_defaults_ext()
  File "/usr/local/lib/python3.7/distutils/command/sdist.py", line 311, in _add_defaults_ext

    build_ext = self.get_finalized_command('build_ext')
  File "/usr/local/lib/python3.7/distutils/cmd.py", line 299, in get_finalized_command
    cmd_obj.ensure_finalized()
  File "/usr/local/lib/python3.7/distutils/cmd.py", line 107, in ensure_finalized
    self.finalize_options()
  File "/tmp/easy_install-r1p8hre8/pyxDamerauLevenshtein-1.5/setup.py", line 33, in finalize_options
    # Used for the long_description.  It's nice, because now 1) we have a top level
AttributeError: 'dict' object has no attribute '__NUMPY_SETUP__'

But if I install it as a separate call to pip, it installs successfully:

# pip3 install pyxdameraulevenshtein>=1.5
pyxdameraulevenshtein 1.5 requires numpy, which is not installed.

# python3 setup.py install
...
Searching for pyxDamerauLevenshtein==1.5
Best match: pyxDamerauLevenshtein 1.5
Adding pyxDamerauLevenshtein 1.5 to easy-install.pth file
...

setup.py I am using: setup.py

Errors using Python3.3 / string<->bytes

Hi,

I just ran into a type error, when using python3.3.
TypeError: expected bytes, str found

I had to fix this by calling the function with encoded strings:

print(normalized_damerau_levenshtein_distance('smtih'.encode(), 'smith'.encode()))

Maybe a simple parameter check would be enough?

Thanks

numpy.dtype error

I am using the Python 2.7 that ships with ArcGIS 10.5 (I cannot update the Python version). I installed pyxDamerauLevenshtein (1.5.3) and numpy 1.16.2 in a virtual environment, as ArcGIS uses numpy 1.9.3 for its core modules.
The script runs under the virtual environment, but I get this error:

File "__init__.pxd", line 206, in init pyxdameraulevenshtein
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 56 from C header, got 52 from PyObject
