
python-duckling's Introduction

duckling

Python wrapper for wit.ai's Duckling Clojure library

Introduction

This library is inspired by Edward Stone's Python wrapper for natty.

It provides low-level access to Duckling's parse() function as well as a high-level wrapper for easy access.

Requirements

python-duckling requires an installed JVM (either a plain JRE or a JDK) to run, since Duckling itself is implemented in Clojure, which runs on the JVM.

Examples

High-level (DucklingWrapper)
    from duckling import DucklingWrapper

    d = DucklingWrapper()
    print(d.parse_time(u'Let\'s meet at 11:45am'))
    # [{u'dim': u'time', u'end': 21, u'start': 11, u'value': {u'value': u'2016-10-14T11:45:00.000-07:00', u'others': [u'2016-10-14T11:45:00.000-07:00', u'2016-10-15T11:45:00.000-07:00', u'2016-10-16T11:45:00.000-07:00']}, u'text': u'at 11:45am'}]
    print(d.parse_temperature(u'Let\'s change the temperatur from thirty two celsius to 65 degrees'))
    # [{u'dim': u'temperature', u'end': 65, u'start': 55, u'value': {u'unit': u'degree', u'value': 65.0}, u'text': u'65 degrees'}, {u'dim': u'temperature', u'end': 51, u'start': 33, u'value': {u'unit': u'celsius', u'value': 32.0}, u'text': u'thirty two celsius'}]
Low-level (Duckling)
    from duckling import Duckling

    d = Duckling()
    d.load() # always load the model first
    print(d.parse('tomorrow'))
    # [{u'body': u'tomorrow', u'dim': u'time', u'end': 8, u'value': {u'values': [{u'grain': u'day', u'type': u'value', u'value': u'2016-10-10T00:00:00.000-07:00'}], u'grain': u'day', u'type': u'value', u'value': u'2016-10-10T00:00:00.000-07:00'}, u'start': 0}]

Other examples can be found in the test directory.
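
The time values in the output above are ISO 8601-style strings, and the parse_datetime option described under Functions refers to datetime.strptime(). As a rough sketch of what such a conversion can look like, assuming Python 3.7 or later (where %z accepts an offset written with a colon): the format string and the to_datetime helper are assumptions for illustration and are not taken from the library.

```python
from datetime import datetime, timedelta

# Assumed format for values such as '2016-10-14T11:45:00.000-07:00'.
DUCKLING_TIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%f%z'


def to_datetime(value):
    """Convert a Duckling-style timestamp string to an aware datetime."""
    return datetime.strptime(value, DUCKLING_TIME_FORMAT)


parsed = to_datetime('2016-10-14T11:45:00.000-07:00')
print(parsed.hour, parsed.minute, parsed.utcoffset() == timedelta(hours=-7))
```

Keeping the offset in the resulting datetime makes it easier to compare results against a reference time later on.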

Functions

High-level (DucklingWrapper)
DucklingWrapper(jvm_started=False, parse_datetime=False, language=Language.ENGLISH, minimum_heap_size='128m', maximum_heap_size='2048m'):

    """Simplified Python wrapper for Duckling by wit.ai.

    Attributes:
        jvm_started: Optional attribute to specify if the JVM has already been
            started (with all Java dependencies loaded).
        parse_datetime: Optional attribute to specify if datetime string should
            be parsed with datetime.strptime(). Default is False.
        language: Optional attribute to specify language to be used with
            Duckling. Default is Language.ENGLISH.
        minimum_heap_size: Optional attribute to set initial and minimum heap
            size. Default is 128m.
        maximum_heap_size: Optional attribute to set maximum heap size. Default
            is 2048m.
    """

duckling_wrapper.parse(self, input_str, reference_time=''):
        """Parses input with Duckling for all dims.

        Args:
            input_str: An input string, e.g. 'You owe me twenty bucks, please
                call me today'.
            reference_time: Optional reference time for Duckling.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_time(self, input_str, reference_time=''):
        """Parses input with Duckling for occurrences of times.

        Args:
            input_str: An input string, e.g. 'Let's meet at 11:45am'.
            reference_time: Optional reference time for Duckling.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_timezone(self, input_str):
        """Parses input with Duckling for occurrences of timezones.

        Args:
            input_str: An input string, e.g. 'My timezone is pdt'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_temperature(self, input_str):
        """Parses input with Duckling for occurrences of temperatures.

        Args:
            input_str: An input string, e.g. 'Let's change the temperature from
                thirty two celsius to 65 degrees'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_number(self, input_str):
        """Parses input with Duckling for occurrences of numbers.

        Args:
            input_str: An input string, e.g. 'I'm 25 years old'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_ordinal(self, input_str):
        """Parses input with Duckling for occurrences of ordinals.

        Args:
            input_str: An input string, e.g. 'I'm first, you're 2nd'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_distance(self, input_str):
        """Parses input with Duckling for occurrences of distances.

        Args:
            input_str: An input string, e.g. 'I commute 5 miles everyday'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_volume(self, input_str):
        """Parses input with Duckling for occurrences of volumes.

        Args:
            input_str: An input string, e.g. '1 gallon is 3785ml'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_money(self, input_str):
        """Parses input with Duckling for occurrences of amounts of money.

        Args:
            input_str: An input string, e.g. 'You owe me 10 dollars'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_duration(self, input_str):
        """Parses input with Duckling for occurrences of durations.

        Args:
            input_str: An input string, e.g. 'I ran for 2 hours today'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_email(self, input_str):
        """Parses input with Duckling for occurrences of emails.

        Args:
            input_str: An input string, e.g. 'Shoot me an email at
                contact@frank-blechschmidt.com'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_url(self, input_str):
        """Parses input with Duckling for occurrences of urls.

        Args:
            input_str: An input string, e.g. 'http://frank-blechschmidt.com is
                under construction, but you can check my github
                github.com/FraBle'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """

duckling_wrapper.parse_phone_number(self, input_str):
        """Parses input with Duckling for occurrences of phone numbers.

        Args:
            input_str: An input string, e.g. '424-242-4242 is obviously a fake
                number'.

        Returns:
            A preprocessed list of results (dicts) from Duckling output.
        """
Low-level (Duckling)
Duckling(jvm_started=False, parse_datetime=False, minimum_heap_size='128m', maximum_heap_size='2048m'):

    """Python wrapper for Duckling by wit.ai.

    Attributes:
        jvm_started: Optional attribute to specify if the JVM has already been
            started (with all Java dependencies loaded).
        parse_datetime: Optional attribute to specify if datetime string 
            should be parsed with datetime.strptime(). Default is False.
        minimum_heap_size: Optional attribute to set initial and minimum heap
            size. Default is 128m.
        maximum_heap_size: Optional attribute to set maximum heap size. Default
            is 2048m.
    """

duckling.load(self, languages=[]):
        """Loads the Duckling corpus.

        Languages can be specified, defaults to all.

        Args:
            languages: Optional parameter to specify languages,
                e.g. [Duckling.ENGLISH, Duckling.FRENCH] or supported ISO 639-1 Codes (e.g. ["en", "fr"])
        """

duckling.parse(self, input_str, language=Language.ENGLISH, dim_filter=None, reference_time=''):
        """Parses datetime information out of string input.

        It invokes the Duckling.parse() function in Clojure.
        A language can be specified, default is English.

        Args:
            input_str: The input as string that has to be parsed.
            language: Optional parameter to specify language,
                e.g. Duckling.ENGLISH.
            dim_filter: Optional parameter to specify a single filter or
                list of filters for dimensions in Duckling.
            reference_time: Optional reference time for Duckling.

        Returns:
            A list of dicts with the result from the Duckling.parse() call.

        Raises:
            RuntimeError: Raised when the Duckling model has not been loaded
                via load().
        """

Future Work

  • Support new Haskell version of Duckling (probably in new repo)

Credit

  • wit.ai for their awesome work and tools for the NLP community
  • Edward Stone for the inspiration to write a Python wrapper for a library from a different programming language

License

  • Apache License 2.0 (check the LICENSE file)

python-duckling's People

Contributors

frable, laurentvalette, oziee, tmbo

python-duckling's Issues

problem when parsing

Hi,

When I run print(d.parse_time(u'Let\'s meet at 11:45am')), I get the error:
'<' not supported between instances of 'slice' and 'int'

Add grain to _parse_basic_info()

Grain is quite important for many datetime use cases. Could it be added to the DucklingWrapper method _parse_basic_info() when the results contain a grain key-value pair? The DucklingWrapper abstraction is convenient to use, but omitting the grain makes it somewhat useless, for my use case at least.

I could make a pull request to do so if you prefer.

parse_time issue

print(d.parse_time(u'list top 20 customers in last 10 days'))

Output :

[{'dim': 'time', 'text': 'last 10 days', 'start': 25, 'end': 37, 'value': {'value': {'to': '2018-05-02T00:00:00.000+05:30', 'from': '2018-04-22T00:00:00.000+05:30'}, 'others': []}}, {'dim': 'time', 'text': '20', 'start': 9, 'end': 11, 'value': {'value': '2018-05-02T20:00:00.000+05:30', 'grain': 'hour', 'others': [{'grain': 'hour', 'value': '2018-05-02T20:00:00.000+05:30'}, {'grain': 'hour', 'value': '2018-05-03T20:00:00.000+05:30'}, {'grain': 'hour', 'value': '2018-05-04T20:00:00.000+05:30'}]}}]

However, if I change the string to the following, the output is fine:

print(d.parse_time(u'list top customers in last 10 days'))

Output:

[{'dim': 'time', 'text': 'last 10 days', 'start': 8, 'end': 20, 'value': {'value': {'to': '2018-05-02T00:00:00.000+05:30', 'from': '2018-04-22T00:00:00.000+05:30'}, 'others': []}}]

Why does the same query 'list top 20 customers by sales in last 10 days' seem to work fine on duckling.wit.ai?
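
Until the difference from the hosted demo is understood, one client-side workaround is to discard time results whose matched text is nothing but digits, such as the '20' above. This is only a sketch: the helper name drop_bare_number_times is made up for illustration, and the rule would also drop legitimate matches like a bare year.

```python
def drop_bare_number_times(results):
    """Remove 'time' results whose matched text consists only of digits."""
    return [
        r for r in results
        if not (r.get('dim') == 'time' and r.get('text', '').strip().isdigit())
    ]


# Shortened copy of the output shown in this issue.
sample = [
    {'dim': 'time', 'text': 'last 10 days', 'start': 25, 'end': 37},
    {'dim': 'time', 'text': '20', 'start': 9, 'end': 11},
]
print(drop_bare_number_times(sample))
```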

jpype._jexception.RuntimeExceptionPyRaisable

Hi, I am trying to run the Python wrapper, but I'm getting:

raise _RUNTIMEEXCEPTION.PYEXC("Class %s not found" % name)
jpype._jexception.RuntimeExceptionPyRaisable: java.lang.RuntimeException: Class clojure.java.api.Clojure not found

I'm running a simple example:

from duckling import Duckling
d = Duckling()
d.load()
print(d.parse('tomorrow'))

I'm not sure what I'm doing wrong; it may be my installation. I have installed
Duckling 1.8
JPype1 0.6.3
and I'm running it in an Anaconda env.

Dimension object unwieldy

Hey @FraBle great work here 👍 😄! just tried to use the duckling.dim.Dim object to have all the allowed values for dimensions, but I need to do it like this

from duckling.dim import Dim
from inspect import getmembers
allowed_values = [m[1] for m in getmembers(Dim) if not m[0].startswith("__") and not m[0].endswith("__")]

Is there a more convenient way to get these values? If not, I'd also be happy to make a PR to turn it into a dict or something that is more easily accessible for people.
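
A somewhat shorter variant of the snippet above reads the class namespace directly with vars(). In this sketch the Dim class is a stand-in, and its attribute names and values are assumptions rather than the library's actual definitions. Note that vars() only sees attributes defined on the class itself, not inherited ones.

```python
class Dim(object):
    # Stand-in for duckling.dim.Dim; names and values are assumed.
    TIME = 'time'
    NUMBER = 'number'
    TEMPERATURE = 'temperature'


def public_values(cls):
    """Return the sorted values of all public, non-callable class attributes."""
    return sorted(
        value for name, value in vars(cls).items()
        if not name.startswith('_') and not callable(value)
    )


print(public_values(Dim))  # ['number', 'temperature', 'time']
```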

Fatal Error on OS X

from duckling import Duckling
d = Duckling()
d.load()
d.parse('on September 31')

Error:

Mar. 20, 2020 5:22:34 P.M. clojure.tools.logging$eval277$fn__281 invoke
SEVERE: Error while resolving {:timezone nil, :pred #object[clojure.lang.AFunction$1 0x3412a3fd "clojure.lang.AFunction$1@3412a3fd"], :index 10, :dim :time, :rule {:name "on <date>", :pattern (#object[duckling.engine$pattern_fn$fn__4027 0x1bbf3a6e "duckling.engine$pattern_fn$fn__4027@1bbf3a6e"] #object[duckling.engine$pattern_fn$fn__4033 0x76e4a652 "duckling.engine$pattern_fn$fn__4033@76e4a652"]), :production #object[duckling.time.prod$eval26107$fn__26108 0x7a81a89a "duckling.time.prod$eval26107$fn__26108@7a81a89a"]}, :log-prob -6.3550387135069855, :pos 0, :end 15, :direction nil, :text "on September 31"}
org.joda.time.IllegalFieldValueException: Value 0 for minuteOfHour is not supported: Illegal instant due to time zone offset transition (daylight savings time 'gap'): 1906-09-01T00:00:52.000 (America/Edmonton)

Does not work on Windows

I ran the following code:

import duckling
d = duckling.Duckling()

This code works fine on Ubuntu but does not work on Windows. On Windows I get a "Python.exe has stopped working" window.

How might we speed up parsing?

First, I love what you've done to make Duckling usable from Python.

Compared to parsedatetime, it's quite slow. What might we do to speed it up? How can we be helpful? Or is this a by-product of calling the Clojure code from Python?

parse_time reference_time timezone reversal issue

There appears to be an inversion of the timezone direction (e.g. +07:00 instead of -07:00) when setting a reference time string as reference_time to parse_time.

For example:

d.parse_time('tomorrow at 5pm', reference_time='2017-04-11T08:26:07.470413-07:00')

Returns:

[{u'dim': u'time', u'end': 15, u'start': 0, u'value': {u'value': u'2017-04-12T17:00:00.000+07:00', u'others': [u'2017-04-12T17:00:00.000+07:00']}, u'text': u'tomorrow at 5pm'}]

I think this should return 'value': '2017-04-12T17:00:00.000-07:00'

Thank you for this otherwise great wrapper!
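
A small check like the following can make the sign flip visible in a test. It is a sketch that reads the trailing UTC offset from a timestamp string with a regular expression; the helper name utc_offset_minutes is invented for illustration and is not part of the library.

```python
import re

# Trailing offset such as -07:00, +0530 or -8:00.
_OFFSET = re.compile(r'([+-])(\d{1,2}):?(\d{2})$')


def utc_offset_minutes(timestamp):
    """Return the trailing UTC offset of a timestamp string in minutes."""
    match = _OFFSET.search(timestamp)
    if match is None:
        raise ValueError('no UTC offset found in %r' % timestamp)
    sign, hours, minutes = match.groups()
    total = int(hours) * 60 + int(minutes)
    return -total if sign == '-' else total


reference = '2017-04-11T08:26:07.470413-07:00'
returned = '2017-04-12T17:00:00.000+07:00'
print(utc_offset_minutes(reference), utc_offset_minutes(returned))  # -420 420
```

Comparing the two numbers shows the mismatch reported here: the reference time carries -420 minutes, the result +420.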

Language Support

Hi!

First of all, thanks a lot for building this wrapper and making Duckling more accessible to Python developers. I tried the German module with the following input: "Ihre Tochter ist am 31.12.2009 geboren worden". It does not detect 31.12.2009 as a date. Then I tested the demo on https://duckling.wit.ai/ and there, when selecting the German module, it works fine. What could be the issue? Please find the code below:

from duckling import *

d = DucklingWrapper("de")

print(d.parse_time(u'Ihre Tochter ist am 31.12.2009 geboren worden'))

Output:
[{'dim': 'time', 'text': '2009', 'start': 26, 'end': 30, 'value': {'value': '2009-01-01T00:00:00.000+01:00', 'grain': 'year', 'others': []}}, {'dim': 'time', 'text': '31', 'start': 20, 'end': 22, 'value': {'value': '2031-01-01T00:00:00.000+01:00', 'grain': 'year', 'others': [{'grain': 'year', 'value': '2031-01-01T00:00:00.000+01:00'}]}}, {'dim': 'time', 'text': '12', 'start': 23, 'end': 25, 'value': {'value': '2019-08-09T12:00:00.000+02:00', 'grain': 'hour', 'others': [{'grain': 'hour', 'value': '2019-08-09T12:00:00.000+02:00'}, {'grain': 'hour', 'value': '2019-08-10T00:00:00.000+02:00'}, {'grain': 'hour', 'value': '2019-08-10T12:00:00.000+02:00'}]}}]

I suspect there is something wrong with loading the German module, since it does not seem to understand German input in general.

Thanks a lot in advance for the help!
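
One thing worth checking against the constructor signature documented in the README: the first positional parameter is jvm_started, not language. The sketch below uses a stand-in function with the same parameter order (the 'en' default is a placeholder for Language.ENGLISH) to show where a positional "de" ends up. Whether this explains the output above is not confirmed.

```python
import inspect


def wrapper_init(jvm_started=False, parse_datetime=False, language='en',
                 minimum_heap_size='128m', maximum_heap_size='2048m'):
    """Stand-in with the same parameter order as the documented constructor."""


signature = inspect.signature(wrapper_init)
positional = signature.bind('de')        # like DucklingWrapper("de")
keyword = signature.bind(language='de')  # like DucklingWrapper(language=...)

print(dict(positional.arguments))  # {'jvm_started': 'de'}
print(dict(keyword.arguments))     # {'language': 'de'}
```

Passing the language by keyword, for example language=Language.GERMAN, avoids this ambiguity.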

currency recognition problem

Hi,

I am having a problem with currency parsing:

image

Both the cases, with "Euro" and with "€", should in principle work, right? A general call to parse() gives back 20000 as a number, not as currency.

What am I doing wrong?

Thanks for your help,

Andrea.

Duckling library not working in flask

from duckling import DucklingWrapper
d = DucklingWrapper()
def getReply(request):
    d.parse_time("today is holiday")

The code just stops here. It's Flask code.

jpype dependency not listed

I have problems installing duckling. It seems that jpype is not available. Could you please add this information to the README and also give me a hint as to which version of jpype works best?

Duckling compatibility with Python 3

Hi

Not sure why, but a day or two ago duckling.py stopped working on my Python 3 system due to types.StringType on line 140. I've been using Python 3 for a while, but only ran into this problem now.

Would it make sense to change line 140 to if isinstance(dim_filter, string_types): to use the string type from six instead of StringType from types?
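
The proposed check could look roughly like this. The sketch falls back to a plain (str,) tuple when six is not installed, and the normalize_dim_filter helper is a hypothetical name used only to show the isinstance test in context; it is not the library's code.

```python
try:
    from six import string_types  # the name proposed in this issue
except ImportError:
    string_types = (str,)  # fallback for Python 3 without six


def normalize_dim_filter(dim_filter):
    """Return dim_filter as a list, accepting None, one string or a sequence."""
    if dim_filter is None:
        return []
    if isinstance(dim_filter, string_types):
        return [dim_filter]
    return list(dim_filter)


print(normalize_dim_filter('time'))              # ['time']
print(normalize_dim_filter(['time', 'number']))  # ['time', 'number']
```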

Regards
Bernardt

Duckling 0.4.24 messages in the log

Since updating to the new 0.4.24 via rasa_nlu, which uses the DucklingWrapper, I get the messages below.
These messages weren't there in 0.4.16.

I think this is probably a Duckling problem, as I found this issue logged there:
facebookarchive/duckling_old#210

"not found" "Sri, 13. velj"
"not found" "san silvestro"
"not found" "notte di san silvestro"
"not found" "??? ?????? ???? ???????"
"not found" "????? ?19"
"not found" "?????? ????"
"not found" "?1 ????"
"not found" "?????? ???? 2015"
"not found" "?15 ?????"
"not found" "?15 ?????"
"not found" "??? ???? ?????"
"not found" "??? ???, ?18 ???????"
"not found" "???, ?????? ???? ???????"
"not found" "??? ?????"
"not found" "??? ????? ?????? ?? ???????"
"not found" "@ 3pm"
"not found" "?????? ???? ???????"
"not found" "????? ???? ???????"
"not found" "3:15 ???????"
"not found" "3:20 ???????"
"not found" "3:20 ??????"
"not found" "????? ???? ???? ???????"
"not found" "????? ???? ????"
"not found" "???? ???? ???????"
"not found" "???? ???? ???? ???? ???? ?????? ???????"
"not found" "???? ????? ??????"

python-sutime seems to interfere with python-duckling

Hi, as soon as I added your other lib:
https://github.com/FraBle/python-sutime

I got:
Traceback (most recent call last):
  File "step_1_gen_input.py", line 137, in <module>
    main(sys.argv[1:])
  File "step_1_gen_input.py", line 127, in main
    jmaps = get_dataset(fpaths, args.n)
  File "step_1_gen_input.py", line 43, in get_dataset
    d = duckling.DucklingWrapper()
  File "/usr/local/lib/python3.5/site-packages/duckling/wrapper.py", line 23, in __init__
    jvm_started=jvm_started, parse_datetime=parse_datetime)
  File "/usr/local/lib/python3.5/site-packages/duckling/duckling.py", line 45, in __init__
    self.clojure = jpype.JClass('clojure.java.api.Clojure')
  File "/usr/local/lib/python3.5/site-packages/jpype/_jclass.py", line 55, in JClass
    raise _RUNTIMEEXCEPTION.PYEXC("Class %s not found" % name)
jpype._jexception.RuntimeExceptionPyRaisable: java.lang.RuntimeException: Class clojure.java.api.Clojure not found

Removing it makes the issue go away.

basic numbers in German Language not working

python 3.6
duckling 1.18

German
text = "dritte Nachricht"
English
text = "third message"

from duckling import DucklingWrapper
from duckling import language as lang
nlp = DucklingWrapper()
nlp.language = lang.Language.GERMAN

nlp.parse("dritte Nachricht")

returns []

English
When I check the equivalent text in English, it works fine.

from duckling import DucklingWrapper
from duckling import language as lang
nlp = DucklingWrapper()
nlp.language = lang.Language.ENGLISH
nlp.parse("third message")

returns

[{'dim': 'ordinal',
  'text': 'third',
  'start': 0,
  'end': 5,
  'value': {'value': 3}},
 {'dim': 'time',
  'text': 'third',
  'start': 0,
  'end': 5,
  'value': {'value': '2020-02-03T00:00:00.000+05:30',
   'grain': 'day',
   'others': [{'grain': 'day', 'value': '2020-02-03T00:00:00.000+05:30'},
    {'grain': 'day', 'value': '2020-03-03T00:00:00.000+05:30'},
    {'grain': 'day', 'value': '2020-04-03T00:00:00.000+05:30'}]}}]

Long duration throws error

Thanks for this very useful library. =] And apologies in advance if this is simply a shortcoming in my (limited) knowledge of Python, and not a problem with the library.

I am currently using python-duckling with the newest checkout of wit.ai's duckling, which is duckling-0.4.23.jar. I added weeks to python-duckling in the most straightforward way:

@@ -179,6 +179,7 @@ class Duckling(object):
             u'minute': self._parse_int,
             u'hour': self._parse_int,
             u'day': self._parse_int,
+            u'week': self._parse_int,
             u'month': self._parse_int,

After doing this, I ran on some transcribed conversations. Because the transcriptions contain inaccuracies, I encountered the substring "7208 weeks". This causes the following exception to be thrown:

>>> import duckling
>>> import pandas
>>> d = duckling.Duckling()
>>> d.load()
>>> print(d.parse("7208 weeks"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\dblackburn\AppData\Local\Continuum\Anaconda3\lib\site-packages\duckling-1.3.2-py3.6.egg\duckling\duckling.py", line 135, in parse
    return self._parse_result(duckling_result)
  File "C:\Users\dblackburn\AppData\Local\Continuum\Anaconda3\lib\site-packages\duckling-1.3.2-py3.6.egg\duckling\duckling.py", line 166, in _parse_result
    field.getValue(), entry[u'dim'])
  File "C:\Users\dblackburn\AppData\Local\Continuum\Anaconda3\lib\site-packages\duckling-1.3.2-py3.6.egg\duckling\duckling.py", line 199, in _parse_dict
    result[key] = _functions_with_dim[key](field.getValue(), dim)
  File "C:\Users\dblackburn\AppData\Local\Continuum\Anaconda3\lib\site-packages\duckling-1.3.2-py3.6.egg\duckling\duckling.py", line 199, in _parse_dict
    result[key] = _functions_with_dim[key](field.getValue(), dim)
  File "C:\Users\dblackburn\AppData\Local\Continuum\Anaconda3\lib\site-packages\duckling-1.3.2-py3.6.egg\duckling\duckling.py", line 233, in _parse_value
    return _dims[dim](java_value)
  File "C:\Users\dblackburn\AppData\Local\Continuum\Anaconda3\lib\site-packages\duckling-1.3.2-py3.6.egg\duckling\duckling.py", line 211, in _parse_float
    return float(java_number.toString())
AttributeError: 'str' object has no attribute 'toString'

This runs fine with "1000 weeks". The problem occurs due to integer overflow.

For now, I've worked around the problem with the following changes to _parse_float, but would appreciate any thoughts on whether this is a problem on my end, a problem with the library, or whether asking for 7208 weeks is simply an unreasonable input. =P

@@ -207,8 +208,11 @@ class Duckling(object):
         return result

     def _parse_float(self, java_number):
-        return float(java_number.toString())
-
+        if(hasattr(java_number, "toString")):
+            return float(java_number.toString())
+        else:
+            return "<<Absurdly large number>>"
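
An alternative to returning a placeholder string is to convert whatever arrives, with or without a toString() method. This is a sketch only: it assumes the plain str that reaches _parse_float holds numeric text, which the traceback above does not confirm, and FakeJavaNumber is a hypothetical stand-in for a JPype-wrapped number.

```python
def to_float(value):
    """Convert a numeric value to float whether or not it has toString()."""
    if hasattr(value, 'toString'):
        return float(value.toString())
    return float(value)


class FakeJavaNumber(object):
    # Hypothetical stand-in for a JPype-wrapped java.lang.Number.
    def __init__(self, text):
        self._text = text

    def toString(self):
        return self._text


print(to_float(FakeJavaNumber('1000')), to_float('7208'))  # 1000.0 7208.0
```

If the string is not numeric, float() raises ValueError, which is easier to notice downstream than a sentinel string in a numeric field.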

Invalid Timezone Exception In Tests and Prototype

Running your tests I get the following error:

(env) cmuell89 [~/git/python-duckling] $ python -m pytest duckling/
=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.5.2+, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
rootdir: /home/cmuell89/git/python-duckling, inifile: 
plugins: cov-2.4.0
collected 65 items 

duckling/test/test_duckling.py ..........................
duckling/test/test_duckling_wrapper.py ....FF...........................FF....

==================================================================================================== FAILURES =====================================================================================================
________________________________________________________________________________ test_parse_time_with_reference_time_and_timezone _________________________________________________________________________________

duckling_wrapper = <duckling.wrapper.DucklingWrapper object at 0x7fbff39ed198>

    def test_parse_time_with_reference_time_and_timezone(duckling_wrapper):
        result = duckling_wrapper.parse_time(
>           u'Let\'s meet tomorrow at 12pm', reference_time=u'1990-12-30 15:30:00-8:00')

duckling/test/test_duckling_wrapper.py:63: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
duckling/wrapper.py:263: in parse_time
    reference_time=reference_time)
duckling/wrapper.py:56: in _parse
    reference_time=reference_time)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <duckling.duckling.Duckling object at 0x7fbff39ed7b8>, input_str = "Let's meet tomorrow at 12pm", language = 'en$core', dim_filter = 'time', reference_time = '1990-12-30 15:30:00-8:00'

    def parse(self, input_str, language=Language.ENGLISH, dim_filter=None, reference_time=''):
        """Parses datetime information out of string input.
    
            It invokes the Duckling.parse() function in Clojure.
            A language can be specified, default is English.
    
            Args:
                input_str: The input as string that has to be parsed.
                language: Optional parameter to specify language,
                    e.g. Duckling.ENGLISH.
                dim_filter: Optional parameter to specify list of filters for
                    dimensions in Duckling.
                reference_time: Optional reference time for Duckling.
    
            Returns:
                A list of dicts with the result from the Duckling.parse() call.
    
            Raises:
                RuntimeError: An error occurres when Duckling model is not loaded
                    via load().
            """
        if self._is_loaded is False:
            raise RuntimeError(
                'Please load the model first by calling load()')
        duckling_parse = self.clojure.var("duckling.core", "parse")
        duckling_time = self.clojure.var("duckling.time.obj", "t")
        clojure_hashmap = self.clojure.var("clojure.core", "hash-map")
    
        filter_str = '[]'
        if dim_filter:
            filter_str = '[:{filter}]'.format(filter=dim_filter)
    
        if reference_time:
            duckling_result = duckling_parse.invoke(
                language,
                input_str,
                self.clojure.read(filter_str),
                clojure_hashmap.invoke(
                    self.clojure.read(':reference-time'),
                    duckling_time.invoke(
>                       *self._parse_reference_time(reference_time))
                )
            )
E           jpype._jexception.clojure.lang.ExceptionInfoPyRaisable: clojure.lang.ExceptionInfo: Invalid timezone {:tz 8.0}

duckling/duckling.py:127: clojure.lang.ExceptionInfoPyRaisable

I also get the same error attempting to pass your test reference time to my prototyping file for the parse method of the lower level duckling class.

Problem in Django multi threaded environment

Regarding the fix for multi-threaded environments, the check
if threading.activeCount() > 1
in the __init__ method covers the case where a second thread creates another object of the same class.
But what if I have only one object that is shared by all threads? It fails in this case, because one thread creates the object and all the others only use it.
So I removed this check. I am not sure of the consequences; please advise.

incorrect year

For the phrase "list all movies released from 23 may to 2 aug", Duckling gives the date range "2018-5-23 to 2018-8-2". The year is being parsed incorrectly.

Duckling in threaded Django environment

Hi FraBle

I’ve applied a fix within my local Duckling (duckling.py, around line 108) to allow parsing in a multi-threaded environment. The parse call is coming from a different thread than the one that created the instance and started the JVM.

The code I added just above 'language = …’ around line 108 is:

if threading.activeCount() > 1:
    if jpype.isThreadAttachedToJVM() is not 1:
        jpype.attachThreadToJVM()

It is a snippet from __init__ and seems to solve my problem. However, I don’t really know jpype. Please have a look at the code and let me know if you want me to create a pull request for it.

Kind regards
Bernardt

Installation error with Docker

When trying to install the latest version of Duckling with the Docker python:3-stretch image with JDK8 installed I get the following error:

Searching for duckling
Reading https://pypi.python.org/simple/duckling/
No local packages or working download links found for duckling
error: Could not find suitable distribution for Requirement.parse('duckling')

Not sure what could be the root cause of this error.

parse_time throws KeyError exception - sample included

Traceback (most recent call last):
  File "wtf.py", line 64, in <module>
    d.parse_time(b)
  File "/usr/local/lib/python3.5/site-packages/duckling/wrapper.py", line 224, in parse_time
    return self._parse(input_str, dim=Dim.TIME)
  File "/usr/local/lib/python3.5/site-packages/duckling/wrapper.py", line 41, in _parse
    input_str, self.language, dim_filter=dim)
  File "/usr/local/lib/python3.5/site-packages/duckling/duckling.py", line 106, in parse
    return self._parse_result(duckling_result)
  File "/usr/local/lib/python3.5/site-packages/duckling/duckling.py", line 124, in _parse_result
    field.getValue(), entry[u'dim'])
  File "/usr/local/lib/python3.5/site-packages/duckling/duckling.py", line 156, in _parse_dict
    result[key] = _functionskey
KeyError: 'from'

For this python snippet:

import duckling
a="""
I was reading through your and others' recommendations and saw(for the first
time) your dinner invitation. Sorry for the delay. I need to take a rain
check. Life is really hectic right now. As you will note from the time, I am
here early just to clean out my e-mails--what has technology done to us. I am
in CA once a week. Yesterday, my 5 year old called me at the office at  7am
and asked me why I was never there when he woke up. So much for my cushy
in-house job. Anyway, a night away is too hard to find right now. What about
lunch? Thurs or Fri of next week?



    "Welsh, H. Ronald" <[email protected]>
    05/16/2001 06:26 PM

         To: "'[email protected]'" <[email protected]>
         cc:
         Subject: RE: Trial Lawyer


Richard,

I know two top flight trial lawyers in New York:

Greg Joseph
Fried, Frank, Harris, Shriver, & Jacobson
One New York Plaza
New York, New York 10004
212-859-8584
[email protected]


Norm Kleinberg
Hughes Hubbard & Reed LLP
One Battery Park Plaza
NY  10004
212-837-6680
[email protected]


In addition, VE has a lateral partner whom I have not met:

James Serota
VE
NY
917-206-8004
[email protected]

Jamie's bio says that his area of practice is antitrust and business
litigation,particularly concentrated on industries
that are evolving from regulation to competition, including electric
utilities.  It says he chaired the Fuel and Energy Industry Committee of the
ABA Antitrust Section from 94-97.

When are you in town, so we can have dinner?

Ron
"""

d = duckling.DucklingWrapper()
d.parse_time(a)

Giving all the possible dimensions and values

Calling d1.parse('i need to earn 100k dollars') returns every dimension that matches:

[{'dim': 'number', 'text': '100k', 'start': 12, 'end': 16, 'value': {'value': 100000.0}}, {'dim': 'time', 'text': '100', 'start': 12, 'end': 15, 'value': {'value': '0100-01-01T00:00:00.000+05:53:28', 'grain': 'year', 'others': []}}, {'dim': 'amount-of-money', 'text': '100k dollars', 'start': 12, 'end': 24, 'value': {'value': 100000.0, 'unit': '$'}}, {'dim': 'distance', 'text': '100k', 'start': 12, 'end': 16, 'value': {'value': 100.0, 'unit': 'kilometre'}}, {'dim': 'volume', 'text': '100k', 'start': 12, 'end': 16, 'value': {'value': 100000.0, 'unit': None, 'latent': True}}, {'dim': 'temperature', 'text': '100k', 'start': 12, 'end': 16, 'value': {'value': 100000.0, 'unit': None}}]

but I only need the amount-of-money entry:

{'dim': 'amount-of-money', 'text': '100k dollars', 'start': 12, 'end': 24, 'value': {'value': 100000.0, 'unit': '$'}},
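
One way to get just that entry, independent of the library, is to filter the returned list on its 'dim' field. The sketch below works on an abridged copy of the output quoted above; `filter_by_dim` is a hypothetical helper name, not part of python-duckling.

```python
def filter_by_dim(results, dim):
    """Return only the entries whose 'dim' field equals the requested dimension."""
    return [entry for entry in results if entry.get('dim') == dim]


# Abridged copy of the parse output quoted in this issue.
results = [
    {'dim': 'number', 'text': '100k', 'start': 12, 'end': 16,
     'value': {'value': 100000.0}},
    {'dim': 'amount-of-money', 'text': '100k dollars', 'start': 12, 'end': 24,
     'value': {'value': 100000.0, 'unit': '$'}},
    {'dim': 'distance', 'text': '100k', 'start': 12, 'end': 16,
     'value': {'value': 100.0, 'unit': 'kilometre'}},
]

money_only = filter_by_dim(results, 'amount-of-money')
print(money_only)
```

The README also shows dimension-specific wrapper methods such as parse_time and parse_temperature; if the wrapper offers an equivalent for money, that would likely be the more direct route.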

Money Parsing Error

Hi Guys,

For me, "1 million dollar" and "0.1 million dollar" return the same response. I raised an issue in the Duckling repo, and the dev team confirmed that the problem is in the Python library. Could you please update the library?

Version Info

Name: duckling
Version: 1.7.2

Problems after update to JPype1 0.6.3

Hi

After updating to latest JPype1 (0.6.3) I had a problem with some of my unit tests that included Duckling and I had to make a modification in duckling.py _start_jvm(...)

The modification was on line 67, changing
if jpype.isJVMStarted() is not 1:
to
if jpype.isJVMStarted() is False:

This makes sense since jpype.isJVMStarted() seems to return True/False.

This one change fixed my problem. However, since jpype.isThreadAttachedToJVM() seems to also return True/False and both methods are used in various places in the code, there might be more places that this needs to be fixed.

Kind regards
Bernardt
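
To illustrate why the old check misfires once the function returns a bool, here is a small sketch that needs no JVM. The two helper functions are hypothetical stand-ins for the condition in _start_jvm; they only model the comparison, not JPype itself.

```python
ONE = 1  # the integer the old check compared against


def old_check_says_start(is_started):
    """Mirror of `is not 1`: an identity test against the integer 1."""
    return is_started is not ONE


def truthy_check_says_start(is_started):
    """Truthiness test: treats 1 and True alike, and 0 and False alike."""
    return not is_started


# Compare both checks for the return values an old or new JPype might give.
for reported in (1, True, 0, False):
    print(repr(reported),
          old_check_says_start(reported),
          truthy_check_says_start(reported))
```

With a return value of True, the identity test still says "start the JVM", because True and 1 are equal but not the same object. A plain `if not jpype.isJVMStarted():` avoids depending on the exact return type.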

Timezone mismatch with reference_time

The timezone set with reference_time is not reflected in the result.
Here is an example:

>>> import duckling
>>> import pprint
>>> d = duckling.DucklingWrapper()
>>> pprint.pprint(d.parse_time('today at 2 pm', reference_time='2017-05-28T00:00:00+05:30'))
[{u'dim': u'time',
  u'end': 13,
  u'start': 0,
  u'text': u'today at 2 pm',
  u'value': {u'others': [u'2017-05-28T14:00:00.000+05:00'],
             u'value': u'2017-05-28T14:00:00.000+05:00'}}]

Although the reference time uses the +05:30 offset, the result comes back with +05:00.
I am using version 1.7.1.
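
For anyone who wants to check this mechanically, the offsets of the two timestamps above can be compared with the standard library alone. This is only a sketch of the comparison (it assumes Python 3.7+ for datetime.fromisoformat); `utc_offset` is a hypothetical helper name.

```python
from datetime import datetime


def utc_offset(iso_timestamp):
    """Return the UTC offset carried by an ISO 8601 timestamp string."""
    return datetime.fromisoformat(iso_timestamp).utcoffset()


reference = '2017-05-28T00:00:00+05:30'      # value passed as reference_time
returned = '2017-05-28T14:00:00.000+05:00'   # value reported in the result

mismatch = utc_offset(reference) - utc_offset(returned)
print(utc_offset(reference), utc_offset(returned), mismatch)
```

A non-zero difference means the half-hour part of the offset was lost somewhere between the input and the output.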

Incompatibility with recent versions of JPype

I tried running duckling after installing a more recent version of JPype and started getting the following error when I tried using the parse_time function:

/usr/local/lib/python3.6/dist-packages/jpype/_jstring.py in __getitem__(self, i)
     46
     47     def __getitem__(self, i):
---> 48         if i < 0:
     49             i += len(self)
     50         if i < 0:

TypeError: '<' not supported between instances of 'slice' and 'int'

When I revert back to JPype 0.7.4 (the one I used before), it works, but I get a deprecation warning:

""/usr/local/lib/python3.6/dist-packages/jpype/_core.py:209: UserWarning:
Deprecated: convertStrings was not specified when starting the JVM. The default
behavior in JPype will be False starting in JPype 0.8. The recommended setting
for new code is convertStrings=False. The legacy value of True was assumed for
this session. If you are a user of an application that reported this warning,
please file a ticket with the developer.
""")"

The warning asks users to file a ticket with the developer. Since pip install for duckling currently pulls in JPype 1.0 rather than an older version, many others will hit the same error. Can you please take a look at this?
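
The TypeError itself can be reproduced without JPype. The class below is a hypothetical stand-in that copies only the comparison shown in the traceback; it is not JPype's actual string class. Slicing hands a slice object to the indexing method, and that object cannot be compared with 0.

```python
class IndexOnlyString:
    """Stand-in for a string type whose __getitem__ assumes an integer index."""

    def __init__(self, text):
        self._text = text

    def __len__(self):
        return len(self._text)

    def __getitem__(self, i):
        # Same comparison as in the traceback; fails when i is a slice object.
        if i < 0:
            i += len(self)
        return self._text[i]


s = IndexOnlyString('2017-05-28')
print(s[0])  # integer indexing works

try:
    s[0:4]   # slicing passes slice(0, 4) to __getitem__
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

If the Java string is what gets sliced inside the library, converting it with str() before slicing is one possible adjustment, though whether that is the right fix in duckling.py is not established here.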

clojure issue

Hi,
When I run
d = Duckling()

I get the following error.

File "C:\Users\hi\Anaconda3\lib\site-packages\duckling\duckling.py", line 53, in __init__
self.clojure = jpype.JClass('clojure.java.api.Clojure')

File "C:\Users\hi\Anaconda3\lib\site-packages\jpype\_jclass.py", line 73, in JClass
raise _RUNTIMEEXCEPTION.PYEXC("Class %s not found" % name)

java.lang.RuntimeExceptionPyRaisable: java.lang.RuntimeException: Class clojure.java.api.Clojure not found

How do I fix this?

duckling 0.2.23 downgrade

I still get messages in the log, but only two lines :) rather than lots of ?????? lines:

"not found" "san silvestro"
"not found" "notte di san silvestro"

I'm using the en files, so I'm not sure why it's logging Italian entries about New Year's Eve etc. :)

Seg fault on ubuntu 14.04 (3.13 kernel) not on 16.04 (4.4.0 kernel)

I'm seeing something that appears to be a kernel interaction with python-duckling - here are the steps to illustrate/reproduce:

I spun up a default (Digital Ocean defined) Ubuntu 14.04.5 x64 image (3.13.0-121 kernel) on a 4 GB RAM, two-core Digital Ocean "droplet" and installed Duckling with the following:

apt-get update ; apt-get install -y --no-install-recommends build-essential git-core default-jre python-pip python-dev python-setuptools ; pip install duckling

On ubuntu 14.04 (3.13 kernel)

Creating a Duckling instance ends in a segmentation fault:

# python
Python 2.7.6 (default, Oct 26 2016, 20:30:19) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import duckling
>>> d= duckling.DucklingWrapper()
Segmentation fault (core dumped)
#

On ubuntu 16.04 (4.4.0 kernel)

Repeating the same on Digital Ocean Ubuntu 16.04.2 x64 creates an instance that works:

# python
Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import duckling
>>> d= duckling.DucklingWrapper()
>>> print(d.parse_time(u'Let\'s meet at 11:45am'))
[{u'dim': u'time', u'end': 21, u'start': 11, u'value': {u'grain': u'minute', u'value': u'2017-07-26T11:45:00.000Z', u'others': [{u'grain': u'minute', u'value': u'2017-07-26T11:45:00.000Z'}, {u'grain': u'minute', u'value': u'2017-07-27T11:45:00.000Z'}, {u'grain': u'minute', u'value': u'2017-07-28T11:45:00.000Z'}]}, u'text': u'at 11:45am'}]
>>> 

(d= duckling.DucklingWrapper() took about 30 secs to return)

On ubuntu 14.04 force loaded to have 4.4.0 kernel

Retrying on 14.04 but updating kernel (4.4.0-87) also works:
( kernel update cmd from https://wiki.ubuntu.com/Kernel/LTSEnablementStack#Server-1 )

# apt-get install --install-recommends linux-generic-lts-xenial
[...bunch of output...]
# reboot
[... re-login...]
# python
Python 2.7.6 (default, Oct 26 2016, 20:30:19) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import duckling
>>> d= duckling.DucklingWrapper()
"not found" "san silvestro"
"not found" "notte di san silvestro"
>>> print(d.parse_time(u'Let\'s meet at 11:45am'))
[{u'dim': u'time', u'end': 21, u'start': 11, u'value': {u'grain': u'minute', u'value': u'2017-07-26T11:45:00.000Z', u'others': [{u'grain': u'minute', u'value': u'2017-07-26T11:45:00.000Z'}, {u'grain': u'minute', u'value': u'2017-07-27T11:45:00.000Z'}, {u'grain': u'minute', u'value': u'2017-07-28T11:45:00.000Z'}]}, u'text': u'at 11:45am'}]
>>>

(d= duckling.DucklingWrapper() took about 30 secs to return)
(while the results seem correct, the object instantiation seems to be triggering issue facebookarchive/duckling_old#210)

Ubuntu 14.04 matters for Docker Cloud

I'm specifically interested in Ubuntu 14.04 with the 3.13 kernel because Docker containers deployed under the Docker Cloud/Cluster automated environment (when you see the "[whale] Deploy to Cloud" button in GitHub projects) only run on Ubuntu 14.04/3.13-kernel hosts. So I'm unable to use python-duckling in the Docker Cloud auto-deployment system today.

Does not recognise dd/mm/yyyy format

Surprisingly, it does not recognise dates in dd/mm/yyyy, mm/dd/yyyy or yyyy/mm/dd form!
It also misreads dd-mm-yyyy as mm-dd-yyyy.

In one case, u"The patient paid ten dollars for a kidney operation in sixties.",
it interprets "ten" as 2010!
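
Part of the dd-mm versus mm-dd problem is that many dates are valid under both conventions, so a parser has to pick one. The sketch below uses only the standard library to list the possible readings of a dash-separated date; `possible_readings` is a hypothetical helper and says nothing about how Duckling decides.

```python
from datetime import datetime


def possible_readings(date_text):
    """Return every calendar date that a dash-separated numeric date could mean."""
    readings = {}
    for label, fmt in (('dd-mm-yyyy', '%d-%m-%Y'), ('mm-dd-yyyy', '%m-%d-%Y')):
        try:
            readings[label] = datetime.strptime(date_text, fmt).date().isoformat()
        except ValueError:
            pass  # the text is not a valid date under this convention
    return readings


print(possible_readings('05-06-2017'))  # valid under both conventions
print(possible_readings('25-06-2017'))  # only valid as dd-mm-yyyy
```

When the input convention is known in advance, rewriting dates into an unambiguous form before parsing may sidestep the misreading.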

Extremely slow library

I ran the code below and it took 30 seconds to extract the details. Could you advise how I can speed up execution?

import duckling
d = duckling.DucklingWrapper()
print(d.parse_time('connect today'))
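
Another report in this list measured about 30 seconds for the DucklingWrapper() constructor alone, so the cost may sit in start-up rather than in each parse call. If so, building the wrapper once and reusing it should help. Below is a minimal sketch of that pattern; `LazyParser` is a hypothetical helper, and a fake factory stands in for the real constructor so the code runs without a JVM.

```python
class LazyParser:
    """Build an expensive object once, on first use, and reuse it afterwards."""

    def __init__(self, factory):
        self._factory = factory
        self._instance = None

    def get(self):
        if self._instance is None:
            self._instance = self._factory()
        return self._instance


# Stand-in factory; in real use this could be something like
# `lambda: duckling.DucklingWrapper()` (assumption, not tested here).
calls = []


def fake_factory():
    calls.append(1)
    return object()


parser = LazyParser(fake_factory)
first = parser.get()
second = parser.get()
print(first is second, len(calls))
```

In a long-running service, the one-off start-up cost is then paid once per process instead of once per request.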

Dependency of Jre not mentioned

I installed the package, but when I tried to create a DucklingWrapper() object, I got the following error:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-2-a17e6c39d0f0> in <module>()
----> 1 d = DucklingWrapper()

/mnt/800GB/nimoy/venv/lib/python3.5/site-packages/duckling/wrapper.py in __init__(self, jvm_started, parse_datetime, language, minimum_heap_size, maximum_heap_size)
     33             parse_datetime=parse_datetime,
     34             minimum_heap_size=minimum_heap_size,
---> 35             maximum_heap_size=maximum_heap_size)
     36         self.duckling.load()
     37         self._dims = {

/mnt/800GB/nimoy/venv/lib/python3.5/site-packages/duckling/duckling.py in __init__(self, jvm_started, parse_datetime, minimum_heap_size, maximum_heap_size)
     42         if not jvm_started:
     43             self._classpath = self._create_classpath()
---> 44             self._start_jvm(minimum_heap_size, maximum_heap_size)
     45 
     46         try:

/mnt/800GB/nimoy/venv/lib/python3.5/site-packages/duckling/duckling.py in _start_jvm(self, minimum_heap_size, maximum_heap_size)
     67         if not jpype.isJVMStarted():
     68             jpype.startJVM(
---> 69                 jpype.getDefaultJVMPath(),
     70                 *jvm_options
     71             )

/mnt/800GB/nimoy/venv/lib/python3.5/site-packages/jpype/_core.py in get_default_jvm_path()
    119         finder = LinuxJVMFinder()
    120 
--> 121     return finder.get_jvm_path()
    122 
    123 # Naming compatibility

/mnt/800GB/nimoy/venv/lib/python3.5/site-packages/jpype/_jvmfinder.py in get_jvm_path(self)
    129         for method in self._methods:
    130             try:
--> 131                 jvm = method()
    132 
    133                 # If found check the architecture

/mnt/800GB/nimoy/venv/lib/python3.5/site-packages/jpype/_jvmfinder.py in _get_from_known_locations(self)
    182         :return: The path to the JVM library, or None
    183         """
--> 184         for home in self.find_possible_homes(self._locations):
    185             jvm = self.find_libjvm(home)
    186             if jvm is not None:

/mnt/800GB/nimoy/venv/lib/python3.5/site-packages/jpype/_jvmfinder.py in find_possible_homes(self, parents)
     95 
     96         for parent in parents:
---> 97             for childname in sorted(os.listdir(parent)):
     98                 # Compute the real path
     99                 path = os.path.realpath(os.path.join(parent, childname))

FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/jvm'

I installed java-jre and the problem was solved. Please add java-jre as a dependency.
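
A rough pre-flight check can make this failure easier to read than the traceback above. The sketch below only looks at JAVA_HOME and at whether a java executable is on the PATH. JPype does its own, more thorough lookup (the traceback shows it scanning /usr/lib/jvm), so treat this as a heuristic. Both function names are hypothetical.

```python
import os
import shutil


def java_hints(environ=None, which=shutil.which):
    """Collect simple signs that a Java runtime is installed."""
    environ = os.environ if environ is None else environ
    return {
        'JAVA_HOME': environ.get('JAVA_HOME'),
        'java_on_path': which('java'),
    }


def looks_installed(hints):
    """True if at least one hint points at a Java installation."""
    return any(hints.values())


hints = java_hints()
print(hints)
if not looks_installed(hints):
    print('No Java installation detected; install a JRE or JDK first.')
```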

KeyError: 'product' when duckling handles the keyword 'meat'

I hit the following problem when I use duckling to handle time in RASA NLU. It only happens when the text contains 'meat', as in 'gilled meat'; it is fine with 'Meat' or 'gilledmeat'.


KeyError Traceback (most recent call last)
in ()
1 # Test Block for RASA NLU Performance
2 Sentence="gilled meat"
----> 3 interpreter.parse(Sentence.lower())

/opt/app/anaconda3/lib/python3.6/site-packages/rasa_nlu/model.py in parse(self, text, time)
293
294 for component in self.pipeline:
--> 295 component.process(message, **self.context)
296
297 output = self.default_output_attributes()

/opt/app/anaconda3/lib/python3.6/site-packages/rasa_nlu/extractors/duckling_extractor.py in process(self, message, **kwargs)
128 "".format(message.time, ref_time, e))
129
--> 130 matches = self.duckling.parse(message.text, reference_time=ref_time)
131 relevant_matches = [match
132 for match in matches

/opt/app/anaconda3/lib/python3.6/site-packages/duckling/wrapper.py in parse(self, input_str, reference_time)
236 ]
237 """
--> 238 return self._parse(input_str, reference_time=reference_time)
239
240 def parse_time(self, input_str, reference_time=''):

/opt/app/anaconda3/lib/python3.6/site-packages/duckling/wrapper.py in _parse(self, input_str, dim, reference_time)
54 duckling_result = self.duckling.parse(
55 input_str, self.language, dim_filter=dim,
---> 56 reference_time=reference_time)
57 for entry in duckling_result:
58 if entry[u'dim'] in self._dims:

/opt/app/anaconda3/lib/python3.6/site-packages/duckling/duckling.py in parse(self, input_str, language, dim_filter, reference_time)
159 language, input_str, self.clojure.read(filter_str))
160
--> 161 return self._parse_result(duckling_result)
162
163 def _parse_reference_time(self, reference_time):

/opt/app/anaconda3/lib/python3.6/site-packages/duckling/duckling.py in _parse_result(self, duckling_result)
185 if key == u'value':
186 entry[key] = self._parse_dict(
--> 187 field.getValue(), entry[u'dim'])
188 else:
189 entry[key] = _functions[key](field.getValue())

/opt/app/anaconda3/lib/python3.6/site-packages/duckling/duckling.py in _parse_dict(self, java_dict, dim)
220 result[key] = _functions_with_dim[key](field.getValue(), dim)
221 else:
--> 222 result[key] = _functions[key](field.getValue())
223 return result
224

KeyError: 'product'

BTW, here are the versions of the dependency packages:

  1. Duckling: 1.7.3
  2. Spacy: 2.0.3
  3. Sklearn: 0.19.0
  4. rasa_nlu: 0.10.6

The RASA configuration is as follows:
{
"pipeline": ["nlp_spacy", "tokenizer_spacy", "intent_entity_featurizer_regex", "intent_featurizer_spacy", "ner_crf","ner_spacy", "ner_synonyms", "intent_classifier_sklearn","ner_duckling"],
"path" : "//models",
"data" : "/
/JSON_File.json",
"fine_tune_spacy_ner": true
}

Many thanks!
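
The traceback suggests that result fields are converted through a lookup table keyed by field name, and that a field the table does not know ('product' here) raises the KeyError. One tolerant pattern is to fall back to a default converter for unknown keys. The sketch below illustrates that idea only. The converter table, the field values and the function name are made up for the example and are not the library's own.

```python
def convert_fields(fields, converters, fallback=str):
    """Convert each field, using `fallback` for keys without a registered converter."""
    result = {}
    for key, value in fields.items():
        converter = converters.get(key, fallback)  # no KeyError for unknown keys
        result[key] = converter(value)
    return result


# Hypothetical converter table; 'product' is deliberately missing.
converters = {'value': float, 'unit': str}

# Hypothetical field values, for illustration only.
fields = {'value': '2', 'unit': 'pound', 'product': 'meat'}

converted = convert_fields(fields, converters)
print(converted)
```

Whether unknown fields should be passed through, dropped or logged is a design choice for the maintainers; the fallback simply keeps one unexpected key from failing the whole parse.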
