parsedatetime's Introduction

parsedatetime

Parse human-readable date/time strings.

Parsedatetime now targets Python 3 and is currently tested with Python 3.9

Use https://github.com/bear/parsedatetime/releases/tag/v2.6 if you need Python 2.7 compatibility.

Installing

You can install parsedatetime using

pip install parsedatetime

Development environment

Development is done using a pipenv virtual environment

make env

Note: black is still listed as a beta library, and as such, must be installed with the --pre flag

Running Tests

From the source directory

make test

To run tests on several Python versions that are installed in the pipenv virtual environment

$ make tox
[... tox creates a virtualenv for every python version and runs tests inside of each]
py39: commands succeeded

The tests depend on PyICU, which is installed from the pyicu-binary package to avoid the source build step. PyICU depends on icu4c, which on macOS requires Homebrew

brew install icu4c

Using parsedatetime

Detailed examples can be found in the examples directory.

as a time tuple

import parsedatetime
    
cal = parsedatetime.Calendar()
cal.parse("tomorrow")

as a Python datetime object

from datetime import datetime

time_struct, parse_status = cal.parse("tomorrow")
datetime(*time_struct[:6])
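
The second value returned by parse() is a status flag. A minimal sketch of checking it before converting, assuming the flag convention used in the examples on this page (0 = nothing parsed, 1 = date, 2 = time, 3 = date and time); `to_datetime_or_none` is a hypothetical helper name, not part of parsedatetime:

```python
from datetime import datetime


def to_datetime_or_none(time_struct, parse_status):
    """Convert a parse() result to a datetime, or None if nothing was parsed."""
    if parse_status == 0:  # assumed: 0 means no date or time was recognised
        return None
    # The first six fields are year, month, day, hour, minute, second.
    return datetime(*time_struct[:6])


# A hand-written tuple shaped like the (time_struct, parse_status) pair
# that cal.parse() returns.
print(to_datetime_or_none((2013, 3, 12, 20, 54, 34, 6, 69, 1), 1))
```

With the library installed, the call would look like `to_datetime_or_none(*cal.parse("tomorrow"))`.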

with timezone support using pytz

import parsedatetime
from pytz import timezone

cal = parsedatetime.Calendar()
datetime_obj, _ = cal.parseDT(datetimeString="tomorrow", tzinfo=timezone("US/Pacific"))

Documentation

The generated documentation is included by default in the docs directory and can also be viewed online at https://bear.im/code/parsedatetime/docs/index.html

The documentation is generated with

make docs

Notes

The Calendar class has a member property named ptc which is created during the class init method to be an instance of parsedatetime_consts.CalendarConstants().

History

The code in parsedatetime has been implemented over the years in many different languages (C, Clipper, Delphi) as part of different custom/proprietary systems I've worked on. Sadly the previous code is not "open" in any sense of that word.

When I went to work for Open Source Applications Foundation and realized that the Chandler project could benefit from my experience with parsing of date/time text I decided to start from scratch and implement the code using Python and make it truly open.

After working on the initial concept and creating something that could be shown to the Chandler folks, the code has now evolved to its current state with the help of the Chandler folks, most especially Darshana.

parsedatetime's People

Contributors

ankostis, axsapronov, bear, dansteeves68, dependabot[bot], dirkjankrijnders, fake-name, geoffreyfloyd, idpaterson, kyb3r, lborgav, livibetter, medecau, mvhconsult, oxan, paulrenvoise, philiptzou, psav, rbu, rec, rl-0x0, rmecham, rob1080, sashaacker, sbonnick, sbraz, shubhras01, thedrow, tonyg, zed

parsedatetime's Issues

All dates in text

Hi, is it possible to recursively look for all dates in a text, for example extract one, eliminate this chunk of text, parse again for the next one, and so on? Any tips on how I should go for this using parsedatetime?
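
One possible approach, sketched under assumptions: Calendar.nlp() appears to report every phrase it recognises as tuples of the form (datetime, flag, start, end, matched_text), so no manual "extract, cut, parse again" loop may be needed. The helper below is hypothetical (`split_matches` is not a parsedatetime function) and works on tuples of that assumed shape, here a hand-written sample rather than live library output:

```python
from datetime import datetime


def split_matches(text, matches):
    """Return (found_phrases, leftover_text) for nlp-style match tuples."""
    found = []
    leftover = []
    cursor = 0
    # Each match is assumed to be (datetime, flag, start, end, matched_text).
    # nlp() may return None when nothing matches, hence the `or ()`.
    for dt, _flag, start, end, _matched in sorted(matches or (), key=lambda m: m[2]):
        found.append((text[start:end], dt))
        leftover.append(text[cursor:start])
        cursor = end
    leftover.append(text[cursor:])
    return found, "".join(leftover).strip()


# Hand-written sample in the shape nlp() returns; with the library installed,
# matches = parsedatetime.Calendar().nlp(text) would take its place.
text = "Sunday is a time of rest"
matches = ((datetime(2014, 11, 2, 17, 52, 48), 1, 0, 6, "sunday"),)
found, rest = split_matches(text, matches)
print(found)
print(rest)
```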

No check to determine if a date exists within the text passed to parse() function

Example of operation:

Python 2.7.3 (default, Apr 20 2012, 22:39:59)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.

import parsedatetime as pdt
c = pdt.Constants()
c.BirthdayEpoc = 80
p = pdt.Calendar(c)
print p.parse("Mary had a little lamb.")
((2014, 3, 1, 20, 53, 56, 6, 69, 1), 1)
print p.parse("foo bar")
(time.struct_time(tm_year=2013, tm_mon=3, tm_mday=10, tm_hour=20, tm_min=54, tm_sec=6, tm_wday=6, tm_yday=69, tm_isdst=1), 0)
print p.parse("March 12th, 2013")
((2013, 3, 12, 20, 54, 34, 6, 69, 1), 1)

This is a resubmit with more information, and a better understanding of the library's operation. Previous issue was #48.

Default Locale Settings Don’t Recognize “P.M.”

The default settings for the parser correctly detect “PM” and “pm”:

>>> import parsedatetime as pdt
>>> c = pdt.Calendar()
>>> c.parse('6 PM')
((2014, 5, 29, 18, 0, 0, 3, 149, 0), 2)
>>> c.parse('6 pm')
((2014, 5, 29, 18, 0, 0, 3, 149, 0), 2)

but not “P.M.” and “p.m.”:

>>> c.parse('6 P.M.')
((2014, 5, 29, 6, 0, 0, 3, 149, 0), 2)
>>> c.parse('6 p.m.')
((2014, 5, 29, 6, 0, 0, 3, 149, 0), 2)

I can persuade the parser to recognize them:

>>> co = pdt.Constants('en')
>>> co.pm.append('p.m.')
>>> c = pdt.Calendar(co)
>>> c.parse('6 p.m.')
((2014, 5, 29, 18, 0, 0, 3, 149, 0), 2)
>>> c.parse('6 P.M.')
((2014, 5, 29, 18, 0, 0, 3, 149, 0), 2)
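
An alternative that leaves the Constants untouched is to normalise the input before parsing. This is only a sketch using the standard library; `normalize_meridian` is a hypothetical helper, and it assumes that rewriting "p.m." to "pm" is acceptable for the text being parsed:

```python
import re

# Matches "a.m." / "p.m." in any case, as long as no letter precedes it.
_DOTTED_MERIDIAN = re.compile(r"(?<![A-Za-z])([AaPp])\.([Mm])\.")


def normalize_meridian(text):
    """Rewrite dotted meridians such as 'p.m.' to 'pm', preserving case."""
    return _DOTTED_MERIDIAN.sub(r"\1\2", text)


print(normalize_meridian("6 P.M."))
print(normalize_meridian("6 p.m."))
```

The normalised string can then be passed to c.parse() as usual.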

Unexpected output with cal.nlp('Sunday is a time of rest')

Using version 1.4
Wondering why I would have gotten a datetime back at all ?

CODE:
import parsedatetime
cal = parsedatetime.Calendar()
nlp = cal.nlp('Sunday is a time of rest')
print nlp

OUTPUT:
((datetime.datetime(2014, 11, 2, 17, 52, 48), 1, 0, 6, 'sunday'),)

Julian triggers July

I am having an issue where word stems within words seem to give me false positives.

import parsedatetime.parsedatetime as pdt
c = pdt.Calendar()
c.parse('Julian Lee (the saxophonist')

(time.struct_time(tm_year=2013, tm_mon=7, tm_mday=7, tm_hour=17, tm_min=53, tm_sec=4, tm_wday=6, tm_yday=188, tm_isdst=-1), 1)

There is clearly no date information in that text, but the library returns this. Do you have any suggestions on how to get around this? I am thinking I'll have to modify the internal regexes to also look for spaces to separate keywords such as "jul".
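
Until the internal regexes are changed, one workaround is a pre-check outside the library. The sketch below is an assumption-laden illustration, not parsedatetime's own logic: `has_standalone_month` is a hypothetical guard that only accepts English month names and common abbreviations when they stand as whole words.

```python
import re

# Month names and abbreviations, each required to be a whole word (\b ... \b).
_MONTH_TOKEN = re.compile(
    r"\b(?:jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?"
    r"|aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)\b",
    re.IGNORECASE,
)


def has_standalone_month(text):
    """True if the text contains a month name that is not part of a longer word."""
    return _MONTH_TOKEN.search(text) is not None


print(has_standalone_month("Julian Lee (the saxophonist"))
print(has_standalone_month("July 7"))
```

A caller could skip c.parse() when this returns False and no other date cue is expected. Words such as "may" remain ambiguous, so this narrows false positives rather than removing them.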

Thanks,

Richard

Hours/Minutes parsing error

Hi,

The code is working fine for my use, but I noticed some differences in hours/minutes between the input and the result.

result = p.parse("Tomorrow 09h12")
print(result[0])
>>> time.struct_time(tm_year=2014, tm_mon=1, tm_mday=13, tm_hour=18, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=13, tm_isdst=-1)

and

result = p.parse("06/01/2014 09h12")
print(result[0])
>>> time.struct_time(tm_year=2014, tm_mon=6, tm_mday=1, tm_hour=10, tm_min=48, tm_sec=41, tm_wday=6, tm_yday=152, tm_isdst=-1)

In the two examples I kept the same "09h12" and I've got different results... What am I missing ?

Thanks in advance

In `'February 24th 12PM PST'` `12PM` is being interpreted as the year '12

It works for 1PM, but 12PM and 11AM are taken as years instead of times.

>>> import parsedatetime.parsedatetime as pdt
>>> cal = pdt.Calendar()
>>> cal.parse('February 24th 12PM PST')
((2012, 2, 24, 15, 8, 38, 1, 50, 0), 1)
>>> cal.parse('February 24th 1PM PST')
((2013, 2, 24, 13, 0, 0, 1, 50, 0), 3)
>>> cal.parse('February 24th 11AM PST')
((2011, 2, 24, 15, 8, 58, 1, 50, 0), 1)
>>> 

Afternoon?

"Afternoon" defaults to 12pm, which of course isn't noon when using the parse or nlp functions.
"Tomorrow afternoon" defaults to 12pm the next day when using the parse function.
"Tomorrow afternoon" defaults to 9am the next day when using the nlp function.

I expected these to default to 3pm the current day, and 3pm the next day respectively. (Or a comparable time that is distinctly after noon.) Morning, noon, and evening all work as expected.

inconsistent application of sourceTime in Calendar.parseDT

Using the parseDT method, some relative strings use the provided source time for comparison, others use the actual current time:

In [3]: naiveDtm = datetime.datetime(2015, 6, 9, 1, 17, 0)
In [4]: cal.parseDT('1 hour ago', sourceTime=naiveDtm)
In [5]: cal = parsedatetime.Calendar()
In [6]: cal.parseDT('1 hour ago', sourceTime=naiveDtm)
Out[6]: (datetime.datetime(2015, 6, 9, 0, 17), 2)

In [7]: cal.parseDT('1 hour from now', sourceTime=naiveDtm)
Out[7]: (datetime.datetime(2015, 6, 9, 21, 6, 34), 2)

In [8]: cal.parseDT('now', sourceTime=naiveDtm)
Out[8]: (datetime.datetime(2015, 6, 9, 20, 6, 49), 2)

parsedatetime is considering "now" to be the current time even when a sourceTime is provided. "ago" is using the sourceTime as a reference.

Wrong Parse for specific date formats

Really appreciate if you could please help me add the code to parse the following date formats as well:

cal.parse('4 February 2014 18.33 GMT')
cal.parse('2014-02-03T14:01:04-08:00')

Thanks a lot

cal.parse('last friday') is broken

>>> datetime.datetime.today()
datetime.datetime(2012, 10, 13, 7, 11, 54, 461928)
>>> cal.parse('last friday')
(time.struct_time(tm_year=2012, tm_mon=10, tm_mday=19, tm_hour=7, tm_min=11, tm_sec=58, tm_wday=4, tm_yday=293,     tm_isdst=-1), 1)
>>> ## last friday shouldn't give me a week from last friday!
...
>>> pdt.__version__
'1.0.0'
>>>
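
For comparison, the expected result can be computed with the standard library alone. This sketch assumes "last friday" means the most recent Friday strictly before today; `previous_weekday` is a hypothetical helper and does not reflect how parsedatetime resolves the phrase:

```python
from datetime import date, timedelta

FRIDAY = 4  # date.weekday() numbering: Monday is 0


def previous_weekday(today, weekday):
    """Most recent `weekday` strictly before `today`."""
    # If today is already that weekday, step back a full week.
    days_back = (today.weekday() - weekday) % 7 or 7
    return today - timedelta(days=days_back)


print(previous_weekday(date(2012, 10, 13), FRIDAY))
```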

nlp() doesn't recognize some "next ..." expressions

cl.nlp('next week') # returns None
cl.nlp('next day') # returns None
cl.nlp('next hour') # returns None
cl.nlp('next hour after ...') # returns empty tuple
cl.nlp('next Monday') # passes, returns correct date 

At the same time, parse() works correctly on these examples.

TypeError parsing 'August 24, 2012'

Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import parsedatetime as pdt
>>> cal = pdt.Calendar()
>>> cal.parse('August 24th, 2012')
((2012, 8, 24, 17, 13, 58, 4, 60, 0), 1)
>>> cal.parse('August 24, 2012')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.6/dist-packages/parsedatetime/__init__.py", line 1514, in parse
    totalTime = self._evalString(parseStr, totalTime)
  File "/usr/local/lib/python2.6/dist-packages/parsedatetime/__init__.py", line 1014, in _evalString
    log.debug('attempt to parse as rfc822 - %s' % sourceTime)
TypeError: not all arguments converted during string formatting
>>>

New PyPI release?

@bear it has been a long while since version 1.4. Would you like to release a new version on PyPI? Or you can grant me the privileges to update the package (my username is Philip_Tzou on PyPI).

composed negative relative timedelta

Hi,
Thanks for this cool library!

Tested what is the date three days one hour and 20 min ago, but..

In [1]: from parsedatetime import Calendar
In [2]: c = Calendar()
In [3]: c.parse("")
Out[3]: 
(time.struct_time(tm_year=2015, tm_mon=1, tm_mday=24, tm_hour=22, tm_min=42, tm_sec=33, tm_wday=5, tm_yday=24, tm_isdst=0),
 0)
In [4]: c.parse("3 days 1 hour 20 min ago")
Out[4]: 
(time.struct_time(tm_year=2015, tm_mon=1, tm_mday=22, tm_hour=0, tm_min=3, tm_sec=6, tm_wday=3, tm_yday=22, tm_isdst=-1),
 3)
In [5]: c.parse("3 days ago 1 hour ago 20 min ago")
Out[5]: 
(time.struct_time(tm_year=2015, tm_mon=1, tm_mday=21, tm_hour=21, tm_min=23, tm_sec=16, tm_wday=2, tm_yday=21, tm_isdst=-1),
 1)

Is it the expected behaviour?

Thanks!

pdt.Calendar() doesn't work on iPython

Hello,

I'd like to use parsedatetime in a project of mine, but after installed through pip (sudo pip install parsedatetime) I just can't use it in iPython's console.

Here's the output:

In [1]: import parsedatetime as pdt

In [2]: cal = pdt.Calendar()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/home/lsmagalhaes/<ipython-input-2-4614cc4e0aa2> in <module>()
----> 1 cal = pdt.Calendar()

AttributeError: 'module' object has no attribute 'Calendar'

Is there anything I am doing wrong?

Here's the help output:

In [3]: pdt?
Type:       module
Base Class: <type 'module'>
String Form:<module 'parsedatetime' from '/usr/local/lib/python2.7/dist-packages/parsedatetime/__init__.pyc'>
Namespace:  Interactive
File:       /usr/local/lib/python2.7/dist-packages/parsedatetime/__init__.py
Docstring:
parsedatetime.py contains the C{Calendar} class where the C{parse()}
method can be found.

parsedatetime_consts.py contains the C{Constants} class that builds the
various regex values using locale information if available.

When I type "pdt.C?" or even "pdt.Calendar?", then nothing is found.

Supporting multiple shortweekday abbreviations

Would it make sense to support multiple formats for shortdays
e.g. Tuesday: tue|tues Thursday: thu|thr|thurs|thur

I was thinking that in pdt_locales.py
self.shortWeekdays = [ 'mon', 'tues', 'wed', 'th', 'fri', 'sat', 'sun', ]

could change to
self.shortWeekdays = [ 'mon', 'tue|tues', 'wed', 'thu|thur|thurs|thr', 'fri', 'sat', 'sun', ]

and then I could update the parser to support the new syntax.
Is there a better way to do this?
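
One alternative to packing aliases into 'tue|tues' strings is to keep them as separate entries and derive a lookup table. This is a standalone sketch, not how pdt_locales.py stores its data; `WEEKDAY_ALIASES` and `weekday_index` are hypothetical names:

```python
# One tuple of accepted abbreviations per weekday, Monday first.
WEEKDAY_ALIASES = [
    ("mon",),
    ("tue", "tues"),
    ("wed",),
    ("thu", "thur", "thurs", "thr"),
    ("fri",),
    ("sat",),
    ("sun",),
]

# Flatten to {alias: weekday index}, with Monday as 0.
ALIAS_TO_INDEX = {
    alias: index
    for index, aliases in enumerate(WEEKDAY_ALIASES)
    for alias in aliases
}


def weekday_index(token):
    """Return 0-6 for a known abbreviation, or None if it is not recognised."""
    return ALIAS_TO_INDEX.get(token.lower().rstrip("."))


print(weekday_index("Thurs"))
print(weekday_index("tue"))
```

A regex alternation for the parser could be built from the same table with "|".join(ALIAS_TO_INDEX), which keeps the locale data free of regex syntax.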

Calendar.nlp() fails on "MM/DD" input when sourceTime provided

Here is some example code to simulate the issue I am experiencing. I am using localized sourceTimes and the calendar's nlp functionality to do the parse. The issue seems to happen when I use a sourceTime and try to parse a date that falls on the same day as that sourceTime.

import datetime
import parsedatetime
import pytz

cal = parsedatetime.Calendar()

# Construct a localized sourceTime.
tz_name = u'America/Los_Angeles'
timezone = pytz.timezone(tz_name)
src_time = timezone.localize(datetime.datetime(2014, 8, 11, 0, 0, 0))

# PASSES.
# Result =
# ((datetime.datetime(2015, 8, 10, 0, 0), 1, 0, 4, u'8/10'),)
cal.nlp(u"8/10", sourceTime=src_time)

# FAILS. Same day as sourceTime
# Result = 
# ((datetime.datetime(2014, 8, 15, 12, 54, 44), 0, 0, 4, u'8/11'),)
cal.nlp(u"8/11", sourceTime=src_time)

# PASSES.
# Result =
# ((datetime.datetime(2014, 8, 12, 0, 0), 1, 0, 4, u'8/12'),)
cal.nlp(u"8/12", sourceTime=src_time)

Parse as according to level of granularity

Hi, quick question: I basically want to parse date ranges given as from and until, e.g. from = 'jan 2013' and until = 'dec 2013'. In that case I would like to get from resolved to Jan 1, 2013 00:00 and until to Dec 31, 2013 23:59. Likewise, "today" should be parsed as 00:00 of today if parsing for from (backwards inclusive), and 23:59 of today if parsing for to (forward inclusive). Any way to achieve that with parsedatetime?

I dug into the code; what I'd really need is to know what level of detail the string had (is it a day? a week? a month? a year?) when it comes back from parse(...). Ideas?
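
Once the granularity is known by some other means, the inclusive bounds themselves are easy to compute with the standard library. The sketch below does not recover the granularity from parse(), which is the open question here; `month_bounds` and `day_bounds` are hypothetical helpers:

```python
import calendar
from datetime import date, datetime, time


def month_bounds(year, month):
    """Return (first minute, last minute) of the given month."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    return datetime(year, month, 1, 0, 0), datetime(year, month, last_day, 23, 59)


def day_bounds(day):
    """Return (00:00, 23:59) of the given date."""
    return datetime.combine(day, time(0, 0)), datetime.combine(day, time(23, 59))


start, _ = month_bounds(2013, 1)   # "from" = 'jan 2013'
_, end = month_bounds(2013, 12)    # "until" = 'dec 2013'
print(start, end)
print(day_bounds(date(2013, 6, 26)))
```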

ValueError: invalid literal for int() with base 10

It's me again, calling parse() with more crazy inputs.

This time it is 659127,660214.

Traceback (most recent call last):
  File "parsedatetime/tests/TestRegression.py", line 18, in testUnboundQuantity
    self.cal.parse("659127,660214")
  File "/home/ilo/Devel/Tattoo/parsedatetime/parsedatetime/__init__.py", line 1597, in parse
    totalTime = self._evalString(parseStr, totalTime)
  File "/home/ilo/Devel/Tattoo/parsedatetime/parsedatetime/__init__.py", line 1135, in _evalString
    hr, mn, sec = _extract_time(m)
  File "/home/ilo/Devel/Tattoo/parsedatetime/parsedatetime/__init__.py", line 114, in _extract_time
    seconds = int(seconds)
ValueError: invalid literal for int() with base 10: '27,660214'

Add Makefile for tests, environment, etc.

create a Makefile with the following targets:

test - run tests
init - setup virtualenv
doc - generate docs
build - run setup.py for wheel and source dists
???

Hour:25AM/PM not being parsed.

Hello,

>>> cal.parse('March 14th 8am')
((2014, 3, 14, 8, 0, 0, 2, 71, 1), 3)
>>> cal.parse('March 14th 8:00am')
((2014, 3, 8, 0, 0, 0, 2, 71, 1), 3)
>>> cal.parse('March 14th 8:15am')
((2014, 3, 8, 15, 0, 0, 2, 71, 1), 3)
>>> cal.parse('March 14th 8:24am')
((2014, 3, 8, 0, 0, 0, 2, 71, 1), 3)

But changing that to anything greater than 8:24am prevents the string from being parsed.

>>> cal.parse('March 14th 8:30am')
(time.struct_time(tm_year=2014, tm_mon=3, tm_mday=12, tm_hour=18, tm_min=10, tm_sec=44, tm_wday=2, tm_yday=71, tm_isdst=1), 0)

Inconsistent results with '3 months before 12/31'

Sometimes it returns datetime.date(2014, 9, 30) and sometimes datetime.date(2014, 12, 31) -- although results are consistent within a script.

Here's my script:

import parsedatetime as pdt
from datetime import date
from time import mktime


def natural_date(human_readable):
    human_readable = human_readable.lower()

    # Flag to cause parsedatetime to never go forward
    # http://stackoverflow.com/a/25098991/1093087
    ptc = pdt.Constants()
    ptc.YearParseStyle = 0
    cal = pdt.Calendar(ptc)

    result, parsed_as = cal.parse(human_readable)

    if not parsed_as:
        raise ValueError("Unable to parse %s" % (human_readable))

    return date.fromtimestamp(mktime(result))

#
# Produce bug
#
human_readable = '3 months before 12/31'

results = {}

for n in range(10):
    result = natural_date(human_readable)
    results[result] = (result in results) and results[result] + 1 or 0

print(pdt.__version__)
print(results)

Here's what I see:

klenwell@ubuntu1204:~$ .venv-3.4.1/bin/python sandbox/nldp_bug.py 
1.4
{datetime.date(2014, 9, 30): 9}
klenwell@ubuntu1204:~$ .venv-3.4.1/bin/python sandbox/nldp_bug.py 
1.4
{datetime.date(2014, 9, 30): 9}
klenwell@ubuntu1204:~$ .venv-3.4.1/bin/python sandbox/nldp_bug.py 
1.4
{datetime.date(2014, 9, 30): 9}      # <---- Compare this
klenwell@ubuntu1204:~$ .venv-3.4.1/bin/python sandbox/nldp_bug.py 
1.4
{datetime.date(2014, 12, 31): 9}     # <---- And this

Incorrect date output for format (1 month 1 day ago) (1 month 2 days ago)

parsedatetime is showing incorrect output for few date formats.
I'm using Python 3.4.1 and parsedatetime 1.4

Current date: 29 January 2015

cal.parse('1 month 1 day ago')
(time.struct_time(tm_year=2015, tm_mon=1, tm_mday=30, tm_hour=15, tm_min=36, tm_sec=41, tm_wday=4, tm_yday=30, tm_isdst=-1), 3)

cal.parse('1 month 2 days ago')
(time.struct_time(tm_year=2015, tm_mon=1, tm_mday=31, tm_hour=15, tm_min=47, tm_sec=10, tm_wday=5, tm_yday=31, tm_isdst=-1), 3)

Incorrect dates

I've been noticing some issues when using strings like "Third Monday in January 2013" as well as using offsets with strings like this "Fourth Thursday in November 2013 +1 day"

Here's some examples:

String -- parsedatetime return -- expected return
Last Monday of January 2013 -- 2013-01-19 -- 2013-01-28
First Monday of January 2013 -- 2013-01-26 -- 2013-01-07
First Monday of September 2013 -- 2013-09-26 - 2013-09-02
First Monday of September 2013 +6 days -- 2013-09-30 - 2013-09-08
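
The expected column can be reproduced independently with the standard library, which may help when writing regression tests. `nth_weekday` is a hypothetical helper for this sketch and is unrelated to parsedatetime's internals:

```python
import calendar
from datetime import date, timedelta


def nth_weekday(year, month, weekday, n):
    """Date of the n-th `weekday` in a month; n=1 is the first, n=-1 the last."""
    weeks = calendar.Calendar(firstweekday=calendar.MONDAY).monthdayscalendar(year, month)
    # Days outside the month are reported as 0 and must be skipped.
    days = [week[weekday] for week in weeks if week[weekday] != 0]
    return date(year, month, days[n - 1] if n > 0 else days[n])


print(nth_weekday(2013, 1, calendar.MONDAY, -1))                     # last Monday of January 2013
print(nth_weekday(2013, 1, calendar.MONDAY, 1))                      # first Monday of January 2013
print(nth_weekday(2013, 9, calendar.MONDAY, 1) + timedelta(days=6))  # first Monday of September 2013 +6 days
```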

make a new release to really fix backwards compatibility

It seems that an incompatible change of parsedatetime breaks MoinMoin wiki's fullsearch action on debian jessie.

There was already issue #11 and it was closed as fixed (and I can confirm that with repo code it is fixed), but you still need to get the stuff released to really fix it for release users (== most people).

Also, please update the changelog and point out that 1.3 fixes that backwards compatibility issue.

Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2

>>> from parsedatetime.parsedatetime import Calendar
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named parsedatetime

>>> import parsedatetime.parsedatetime as pdt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named parsedatetime

>>> from parsedatetime import Calendar

>>> import parsedatetime
>>> parsedatetime.__version__
'1.2'

NLP false positives

I was testing NLP and I came across a small issue.
If the test case is something like "I spent $300" then it picks up 300 as a time:
result = self.cal.nlp("I spent $300", datetime.datetime(2013, 8, 1, 21, 25, 0).timetuple())
self.assertTrue(result[0][4] != "300")  # expect this to pass, $300 shouldn't be picked up as a time.

Is there a way to whitelist certain prefixes e.g. ['-'] , which would prevent $300 being picked up as a time and thus prevent some false positives coming through?

I suppose I could strip out the words I know aren't dates, before parsing through nlp.
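
A post-filter on the nlp() result is another option. The sketch below assumes each match is a tuple of (datetime, flag, start, end, matched_text); `drop_currency_matches` is a hypothetical helper, and the sample tuple is hand-written for illustration, not real library output:

```python
from datetime import datetime


def drop_currency_matches(text, matches, prefixes="$"):
    """Discard matches whose text is immediately preceded by one of `prefixes`."""
    kept = []
    for match in matches or ():  # nlp() may return None when nothing matches
        start = match[2]
        if start > 0 and text[start - 1] in prefixes:
            continue  # e.g. the "300" in "$300"
        kept.append(match)
    return tuple(kept)


text = "I spent $300"
# Illustrative match for the "300" at positions 9-12 of the text.
sample = ((datetime(2013, 8, 1, 3, 0), 2, 9, 12, "300"),)
print(drop_currency_matches(text, sample))
```

Other prefixes can be passed as a string, for example prefixes="$-".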

sourceTime in v1.1.2

From the docs, it seems that the sourceTime kwarg gives the user the ability to set the time from which datetimeString will be calculated. I'm looking to calculate relative dates from a historical perspective.

my goal is to get something like this:
for everyday in days_between_date1_and_date2:
database.get_stuff(value>today,value<tomorrow)

So, for the dates between date1 and date2, I need to calculate the value of 'today', 'yesterday', 'tomorrow', etc.

>>> import parsedatetime as pdt; 
>>> c = pdt.Constants();p = pdt.Calendar(c)
>>> _from = p.parse('yesterday')[0] #get the struct_time obj
>>> _today,parsed = p.parse('today',sourceTime=_from) #no errors are raised, see Lines 1311-1317
>>> parsed  #0 means we were unable to parse 
0

Do I misunderstand the purpose of the sourceTime kwarg, am I giving it bad data as sourceTime, or is there something else I'm missing?
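
For the loop described above, the per-day "today" and "tomorrow" boundaries can also be produced without any parsing. A minimal standard-library sketch; `daily_windows` is a hypothetical helper and the dates are placeholders:

```python
from datetime import date, datetime, time, timedelta


def daily_windows(first, last):
    """Yield (day_start, next_day_start) datetimes for each day in [first, last]."""
    day = first
    while day <= last:
        start = datetime.combine(day, time.min)  # midnight at the start of the day
        yield start, start + timedelta(days=1)
        day += timedelta(days=1)


for today, tomorrow in daily_windows(date(2013, 8, 1), date(2013, 8, 3)):
    print(today, tomorrow)
```

Each `today.timetuple()` could then be tried as the sourceTime for relative phrases; whether parse() honours it is the question raised in this report.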

Thanks.

Short weekday abbreviations

Looking at pdt_locales.py, the short weekday names all seem correct except Thursday (it's abbreviated as th while all other are 3 letter abbreviations [except tues]). Is the correct abbreviation th or thu?
self.shortWeekdays = [ 'mon', 'tues', 'wed', 'th', 'fri', 'sat', 'sun', ]

Also, is it possible to support additional weekday abbreviations
e.g. th, thu, thurs = Thursday
tue, tues = Tuesday

OverflowError: date value out of range

The OverflowError is thrown if the parse method is called with "7eda37ab-b1b0-3e0e-8786-248508232d93" as an argument.

Code to replicate: pdt.Calendar().parse("7eda37ab-b1b0-3e0e-8786-248508232d93").

Traceback by the current master revision (commit 9f8af907dbc4dbf3eb2a743a611871f361b83f62):

Traceback (most recent call last):
  File "parsedatetime/tests/TestRegression.py", line 18, in testUnboundQuantity
    self.cal.parse("7eda37ab-b1b0-3e0e-8786-248508232d93")
  File "/home/ilo/Devel/Tattoo/parsedatetime/parsedatetime/__init__.py", line 1597, in parse
    totalTime = self._evalString(parseStr, totalTime)
  File "/home/ilo/Devel/Tattoo/parsedatetime/parsedatetime/__init__.py", line 1233, in _evalString
    sourceTime     = self._buildTime(sourceTime, quantity, modifier, units)
  File "/home/ilo/Devel/Tattoo/parsedatetime/parsedatetime/__init__.py", line 321, in _buildTime
    target        = start + datetime.timedelta(days=qty)
OverflowError: date value out of range

parse returns default time of 0900 with dates like 'next friday' despite passed struct_time

I’m passing a date like: next wednesday
parse returns the proper date, but a time of 9:00 am
passing a struct_time with a time set to 00:01 returns the same 9:00 am time
For other date formats, however, the passed struct_time has the intended effect.
Here is my testing.

Here is the behavior without the struct_time
p.parse('next wednesday')
(time.struct_time(tm_year=2015, tm_mon=2, tm_mday=18, tm_hour=9, tm_min=0, tm_sec=0, tm_wday=2, tm_yday=49, tm_isdst=-1), 1)

Now I’ll add the time.struct_time
t_s = time.struct_time(time.localtime()[:3] + (0,0,0,0,0,-1))
t_s
time.struct_time(tm_year=2015, tm_mon=2, tm_mday=12, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=0, tm_isdst=-1)
But, the parsed date still returns as 9am
p.parse('next wednesday',t_s)
(time.struct_time(tm_year=2015, tm_mon=2, tm_mday=21, tm_hour=9, tm_min=0, tm_sec=0, tm_wday=5, tm_yday=52, tm_isdst=-1), 1)

Other date formats will work
p.parse('July 1, 2005',t_s)
((2005, 7, 1, 0, 0, 0, 0, 0, -1), 1)
p.parse('July 1',t_s)
((2015, 7, 1, 0, 0, 0, 0, 0, -1), 1)

But when I want to get the effect with a more vague date, the passed struct_time doesn’t seem to have the effect.
p.parse('today',t_s)
(time.struct_time(tm_year=2015, tm_mon=2, tm_mday=12, tm_hour=9, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=43, tm_isdst=-1), 1)
p.parse('today+ 5 days',t_s)
(time.struct_time(tm_year=2015, tm_mon=2, tm_mday=17, tm_hour=9, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=48, tm_isdst=-1), 1)

So it appears that passing the struct_time works for all but the dates in the form
next wed
tomorrow
etc.

"{month} {day}" not parsed if it equals sourceTime

Python 2.7.3 (default, Jan  2 2013, 13:56:14)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import parsedatetime as pdt
>>> pdt.__version__
'1.2'
>>> import datetime
>>> cal = pdt.Calendar()
>>> # without sourceTime specified, parse() works correctly, even if the string happens to be today
... cal.parse('Apr 30')
((2014, 4, 30, 10, 23, 31, 2, 120, 1), 1)
>>> now = datetime.datetime.now()
>>> now.strftime('%b %d')
'Apr 30'
>>> cal.parse(now.strftime('%b %d'))
((2014, 4, 30, 10, 20, 18, 2, 120, 1), 1)
>>> # does not parse if sourceTime specified and is the same as the date being parsed
... cal.parse('Apr 30', datetime.datetime(2014, 4, 30))
(time.struct_time(tm_year=2014, tm_mon=4, tm_mday=30, tm_hour=10, tm_min=24, tm_sec=20, tm_wday=2, tm_yday=120, tm_isdst=1), 0)
>>> cal.parse(now.strftime('%b %d'), now)
(time.struct_time(tm_year=2014, tm_mon=4, tm_mday=30, tm_hour=10, tm_min=20, tm_sec=20, tm_wday=2, tm_yday=120, tm_isdst=1), 0)
>>> # but works fine if sourceTime is another date
... cal.parse('Apr 30', datetime.datetime(2014, 4, 29))
((2014, 4, 30, 0, 0, 0, 1, 119, -1), 1)
>>> cal.parse('Apr 30', datetime.datetime(2014, 5, 1))
((2015, 4, 30, 0, 0, 0, 3, 121, -1), 1)
>>> # and 'today' works fine
... cal.parse('today', datetime.datetime(2014, 4, 30))
(time.struct_time(tm_year=2014, tm_mon=4, tm_mday=30, tm_hour=9, tm_min=0, tm_sec=0, tm_wday=2, tm_yday=120, tm_isdst=-1), 1)

Unexpected struct_time flag with Calendar.parse on HTML <a href> string

I was expecting a 0 instead of a 2 for the struct_time flag.

import parsedatetime
cal = parsedatetime.Calendar()
print cal.parse('<a data-navid="792" href="/departments/departments-n-z/probate/estates-trusts/transfer-by-affidavit-50-000-and-under">')

OUTPUT:
((2014, 12, 2, 0, 0, 0, 1, 336, -1), 2)

Please tag version 1.3

Hey,

could you add a tag to commit 9a416d7, and preferably also upload it to PyPI? I hadn't noticed that 1.3 was out already until I perused the commits.

Thanks!
Patrice

testOffsetAfterNoon fails (FreeBSD, Python 2.7.3)

Revision 692168f

python run_tests.py

...

FAIL: testOffsetAfterNoon (parsedatetime.tests.TestSimpleOffsets.test)

Traceback (most recent call last):
File "/tank/emaste/src/parsedatetime/parsedatetime/tests/TestSimpleOffsets.py", line 94, in testOffsetAfterNoon
self.assertTrue(_compareResults(self.cal.parse('5 hours after 12pm', start), (target, 2)))
AssertionError: False is not true


Ran 46 tests in 0.710s

FAILED (failures=1)

nlp and parse return different results for date string.

When parsing dates in a form that parsedatetime doesn't expect for the locale, e.g. YYYY/MM/dd with the en_us locale, nlp and parse return different results.

For example

parse_result = p.parse("2015/10/25", datetime.datetime(2014,9,24,12).timetuple())
#parse_result returns 0 
#(time.struct_time(tm_year=2014, tm_mon=11, tm_mday=16, tm_hour=12, tm_min=41, tm_sec=14, tm_wday=6, tm_yday=320, tm_isdst=0), 0)

nlp_result = p.nlp("2015/10/25", datetime.datetime(2014,9,24,12).timetuple())
#nlp_result returns 2 dates, one set to 2014-9-24 20:15 and another with date set to 2014/10/25
#((datetime.datetime(2014, 9, 24, 20, 15), 2, 0, 4, '2015'), (datetime.datetime(2014, 10, 25, 12, 0), 1, 5, 10, '10/25'))

I'd expect both to return no result. Alternatively, could we be smarter in how parse and nlp work with regard to processing dates in the form yyyy/mm/dd ?
The use case is that some users (regardless of locale) may choose to use yyyy/mm/dd or their standard locale syntax. Provided the year is 4 characters long, we could parse the date in format yyyy/mm/dd. Are there more gotchas that I'm not aware of?

Thanks
Rob

parse returns tuple with a return code of 3

Great module! Haven't found anything similar yet ...

Seeing the issue below with v1.1.2:

import parsedatetime.parsedatetime as pdt
c = pdt.Calendar()
c.parse("tomorrow 0800")
((2013, 11, 4, 8, 0, 0, 0, 308, -1), 3)

Thanks.

Enable travis

This project doesn't have CI.
I'd be willing to contribute a .travis.yml script if the maintainers will enable travis.

Support parsing iso8601 dates

YYYY-MM-DD or YYYY-MM-DDThh:mm:ssTZD formats

>>> import parsedatetime as pdt
>>> cal = pdt.Calendar()
>>> cal.parse('1999-12-24')
(time.struct_time(tm_year=2013, tm_mon=4, tm_mday=4, tm_hour=14, tm_min=2, tm_sec=49, tm_wday=3, tm_yday=94, tm_isdst=1), 0)
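
As a stopgap, ISO 8601 strings can be tried with the standard library before falling back to cal.parse(). A minimal sketch assuming Python 3.7 or later, where datetime.fromisoformat() is available; `parse_iso8601` is a hypothetical helper:

```python
from datetime import datetime


def parse_iso8601(text):
    """Return a datetime for an ISO 8601 string, or None if it is not one."""
    try:
        return datetime.fromisoformat(text.strip())
    except ValueError:
        return None


print(parse_iso8601("1999-12-24"))
print(parse_iso8601("2014-02-03T14:01:04-08:00"))
print(parse_iso8601("tomorrow"))
```

Before Python 3.11, fromisoformat() accepts a narrower set of inputs (for example, it rejects a trailing "Z"), so this covers only the common YYYY-MM-DD and YYYY-MM-DDThh:mm:ss+hh:mm shapes.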

Please tag release for distributions

Please tag releases so that this module can be packaged in Linux distributions easily.
Would like to include it as one of the optional additions for gcalcli.

If you're struggling with version numbers, just use dates, like 20130626.

nlp does not include "now" in the phrase "1 year from now"

I use nlp to extract date phrases from user-provided text, leaving just the remaining text. In this use case, it is fairly apparent when nlp does not extend the phrase to include all terms. When processed through nlp, the phrase one year from now does not include the word now while one year from today works as expected.

> python -c "import parsedatetime;print parsedatetime.Calendar().nlp('1 year from now')"
((datetime.datetime(2016, 5, 29, 22, 22, 39), 1, 0, 11, '1 year from'),)
> python -c "import parsedatetime;print parsedatetime.Calendar().nlp('1 year from today')"
((datetime.datetime(2016, 5, 29, 9, 0), 1, 0, 17, '1 year from today'),)

Very minor detail, just wanted to make note so that I can fix it when I get a chance. The dates seem to be correct, differentiating in an exact by-the-second time from "now" versus a date with a morning time from "today"

testOffsetAfterNoon fails after install

From test run after install

======================================================================
FAIL: testOffsetAfterNoon (parsedatetime.tests.TestSimpleOffsets.test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dan/Dropbox/Repositories/parsedatetime/parsedatetime/tests/TestSimpleOffsets.py", line 94, in testOffsetAfterNoon
    self.assertTrue(_compareResults(self.cal.parse('5 hours after 12pm',     start), (target, 2)))
AssertionError: False is not true

And then in iPython

In [1]: import parsedatetime as pdt

In [2]: cal = pdt.Calendar()

In [3]: cal.parse("last friday")
parse (top of loop): [last friday][]
parse (bottom) [last friday][last][][friday]
weekday False, dateStd False, dateStr False, time False, timeStr False, meridian False
dayStr False, modifier False, modifier2 True, units False, qunits False
parse (top of loop): [friday][]
parse (bottom) [][friday][][]
weekday True, dateStd False, dateStr False, time False, timeStr False, meridian False
dayStr False, modifier False, modifier2 False, units False, qunits False
_evalString(friday, None)
attempt to parse as rfc822 - None
wd 4, wkdy 4, offset 2, style 1
parse (top of loop): [last friday][last]
parse (bottom) [][last][][]
weekday False, dateStd False, dateStr False, time False, timeStr False, meridian False
dayStr False, modifier False, modifier2 False, units False, qunits False
_evalString(last, time.struct_time(tm_year=2012, tm_mon=11, tm_mday=30, tm_hour=12, tm_min=6, tm_sec=27, tm_wday=4, tm_yday=335, tm_isdst=-1))
Out[3]: 
(time.struct_time(tm_year=2012, tm_mon=11, tm_mday=30, tm_hour=12, tm_min=6, tm_sec=27, tm_wday=4, tm_yday=335, tm_isdst=-1),
 1)

testMonths fails

$ python3 run_tests.py parsedatetime
..................F..............................
======================================================================
FAIL: testMonths (parsedatetime.tests.TestUnits.test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./parsedatetime/tests/TestUnits.py", line 107, in testMonths
    self.assertTrue(_compareResults(self.cal.parse('1 month',  start), (target, 1)))
AssertionError: False is not true

----------------------------------------------------------------------
Ran 49 tests in 0.284s

FAILED (failures=1)
