
Comments (6)

bear avatar bear commented on July 19, 2024

Both should always try to parse dates using the locale the instance was
initialized with. You can construct a pdt instance with any locale name
it supports.

The results differ because nlp looks for keywords and markers within a
sentence or paragraph and then calls parse for each one.

On Sun, Nov 16, 2014 at 5:49 AM, Rob1080 [email protected] wrote:

When parsing dates in a form that parsedatetime doesn't expect for the
locale, e.g. YYYY/MM/dd with the en_us locale, nlp and parse return
different results.

For example:
`
import datetime
import parsedatetime

# Assumed setup: the report does not show how p was created; a default
# Calendar is used here.
p = parsedatetime.Calendar()
source_time = datetime.datetime(2014, 9, 24, 12).timetuple()

parse_result = p.parse("2015/10/25", source_time)
# parse_result returns 0
# (time.struct_time(tm_year=2014, tm_mon=11, tm_mday=16, tm_hour=12,
#  tm_min=41, tm_sec=14, tm_wday=6, tm_yday=320, tm_isdst=0), 0)

nlp_result = p.nlp("2015/10/25", source_time)
# nlp_result returns 2 dates, one set to 2014-9-24 20:15 and another with
# the date set to 2014/10/25
# ((datetime.datetime(2014, 9, 24, 20, 15), 2, 0, 4, '2015'),
#  (datetime.datetime(2014, 10, 25, 12, 0), 1, 5, 10, '10/25'))
`

I'd expect both to return no result. Alternatively, could we be smarter in
how parse and nlp process dates in the form yyyy/mm/dd?
The use case is that some users (regardless of locale) may choose to use
either yyyy/mm/dd or their standard locale syntax. Provided the year is 4
characters long, we could parse the date as yyyy/mm/dd. Are there
more gotchas that I'm not aware of?

Thanks
Rob


Reply to this email directly or view it on GitHub
#70.

Bear

[email protected] (email)
[email protected] (xmpp, email)
[email protected] (xmpp, email)
http://code-bear.com/bearlog (weblog)

PGP Fingerprint = 9996 719F 973D B11B E111 D770 9331 E822 40B3 CD29

from parsedatetime.

Rob1080 avatar Rob1080 commented on July 19, 2024

Thanks Mike.
Regarding the locales: is there a way to support both yyyy/mm/dd and another date order format, provided the year is 4 characters long? I understand that there would be confusion between yy/mm/dd and dd/mm/yy, but with yyyy/mm/dd it should be clear. I suppose I could pre-parse the text looking for the yyyy/mm/dd format before parsing.

Regarding NLP parsing; I understand that it looks for keywords and markers, but it seems strange that 2014/05/05 would be interpreted as a time and date, given that there are no spaces.
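
A minimal sketch of the pre-parse idea above, using only the standard library. The helper name `find_ymd_dates` and its return shape are hypothetical and not part of parsedatetime. It only picks out four-digit-year yyyy/mm/dd tokens so they can be handled before the rest of the text is passed to parse or nlp.

```python
import datetime
import re

# Four-digit year, then month and day, separated by slashes.
# The lookarounds keep the pattern from matching inside longer digit runs.
YMD_PATTERN = re.compile(r"(?<![\d/])(\d{4})/(\d{1,2})/(\d{1,2})(?![\d/])")


def find_ymd_dates(text):
    """Return (date, start, end) for every valid yyyy/mm/dd token in text."""
    found = []
    for match in YMD_PATTERN.finditer(text):
        year, month, day = (int(part) for part in match.groups())
        try:
            value = datetime.date(year, month, day)
        except ValueError:
            # Out-of-range month or day, e.g. 2015/13/40: leave it alone.
            continue
        found.append((value, match.start(), match.end()))
    return found
```

The start and end offsets make it possible to cut the matched token out of the text before handing the remainder to the locale-aware parser.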


bear avatar bear commented on July 19, 2024

NLP - to be honest I haven't dug into the code to find out exactly where it
gets confused. Enabling debug would generate a lot of output for that
purpose if you're curious.

As far as always allowing yyyy/mm/dd: I originally didn't try to figure
out the order because there was one country that used YYYY/DD/MM, but that
appears to no longer be the case. I would now say the code should look for
a ##/##/#### or ####/##/## and, if found, determine the order by which side
the YYYY is on.
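
The ##/##/#### versus ####/##/## check described above could look roughly like the sketch below. This is not parsedatetime's code: `parse_slash_date` and its `day_first` flag are hypothetical names, with `day_first` standing in for the locale's day/month order, which still matters when the year comes last.

```python
import datetime
import re

YEAR_FIRST = re.compile(r"^(\d{4})/(\d{1,2})/(\d{1,2})$")
YEAR_LAST = re.compile(r"^(\d{1,2})/(\d{1,2})/(\d{4})$")


def parse_slash_date(token, day_first=False):
    """Parse ####/##/## or ##/##/####, picking the order by the year's side.

    day_first is only consulted when the year comes last.
    Returns a datetime.date, or None if the token fits neither shape.
    """
    match = YEAR_FIRST.match(token)
    if match:
        # Year on the left: treat the rest as month, then day.
        year, month, day = map(int, match.groups())
    else:
        match = YEAR_LAST.match(token)
        if not match:
            return None
        # Year on the right: day/month order is left to the caller.
        first, second, year = map(int, match.groups())
        day, month = (first, second) if day_first else (second, first)
    try:
        return datetime.date(year, month, day)
    except ValueError:
        return None
```

For example, `parse_slash_date("2015/10/25")` and `parse_slash_date("25/10/2015", day_first=True)` both give October 25, 2015, while a two-digit-year token such as `"15/10/25"` is rejected because the year's side cannot be identified.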


Rob1080 avatar Rob1080 commented on July 19, 2024

Thanks Mike, didn't know there was a country that did YYYY/DD/MM!


bear avatar bear commented on July 19, 2024

Yeah, when I was building parsedatetime I spent a good hour looking through the documentation and output of the IBM ICU library to get test locales for the code that builds the regexes. The good thing about globalization is that some countries have adjusted to a more European style.


bear avatar bear commented on July 19, 2024

marking this as closed and tagging it as a question for others to find

