rezemika / humanized_opening_hours
A parser for the opening_hours fields from OpenStreetMap
License: GNU Affero General Public License v3.0
Hey there!
HOH is currently in v1.0.0 beta 1, so it is almost production-ready. However, I would greatly appreciate some feedback on the API, to make the module as developer-friendly as possible. So, if you have any remark, complaint or question about the methods or the usage of HOH, please feel free to post it in this issue.
Thank you in advance!
In English we write "on {weekday}", but in Portuguese we have to write "na segunda" (on Monday) and "no sábado" (on Saturday). Weekend days are masculine nouns.
Basically this:
na segunda-feira
na terça-feira
na quarta-feira
na quinta-feira
na sexta-feira
no sábado
no domingo
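The list above can be sketched as a small lookup table. This is only an illustration of the data a Portuguese locale would need, not how HOH stores its translations; the names WEEKDAYS_PT and on_weekday_pt are made up for this example.

```python
# Portuguese weekday names with the contraction ("na" or "no") each one needs.
# Index 0 is Monday, matching datetime.date.weekday().
WEEKDAYS_PT = [
    ("na", "segunda-feira"),
    ("na", "terça-feira"),
    ("na", "quarta-feira"),
    ("na", "quinta-feira"),
    ("na", "sexta-feira"),
    ("no", "sábado"),
    ("no", "domingo"),
]


def on_weekday_pt(weekday):
    """Return the Portuguese equivalent of "on {weekday}" (hypothetical helper)."""
    preposition, name = WEEKDAYS_PT[weekday]
    return f"{preposition} {name}"


print(on_weekday_pt(0))  # na segunda-feira
print(on_weekday_pt(5))  # no sábado
```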
As stated in #17, when running the basic example found at the beginning of the README:
>>> import humanized_opening_hours as hoh
>>> field = "Mo-Fr 06:00-21:00; Sa,Su 08:00-12:00"
>>> oh = hoh.OHParser(field, locale="en")
>>> oh.is_open()
True
>>> oh.next_change()
datetime.datetime(2017, 12, 24, 12, 0)
>>> print('\n'.join(oh.description()))
"""
From Monday to Friday: 6:00 AM – 9:00 PM.
From Saturday to Sunday: 8:00 AM – 12:00 PM.
"""
I get a completely different result:
>>> import humanized_opening_hours as hoh
>>> field = "Mo-Fr 06:00-21:00; Sa,Su 08:00-12:00"
>>> oh = hoh.OHParser(field, locale="en")
>>> print('\n'.join(oh.description()))
From Sunday to Thursday: 6:00 AM – 9:00 PM.
From Friday to Saturday: 8:00 AM – 12:00 PM.
I suspect this might be related to some settings, possibly Python's internal localization, as I live in Italy.
Let me know if I can help you, for example with any details on localization settings of my Python installation (which by the way is Python 3.6.5 on Ubuntu 18.04.1 in WSL running on Windows 10).
I'm getting the error "Dynamic Earley doesn't support weights on terminals" when I try to compile humanized_opening_hours.
This should come from lark-parser/lark#383
The parser fails on opening hours that contain nonstandard weekday abbreviations or 'AM' / 'PM' markers.
For example, "Mon-Sat 11:30AM-10PM" fails on both counts (see OSM node 30899821). Moreover, the lack of :00 seems to confuse it as well.
Python 3.6.5 :: Anaconda, Inc.
import humanized_opening_hours as hoh
hoh.OHParser("Mon-Sat 11:30AM-10PM")
Error:
Traceback (most recent call last):
File "/anaconda3/lib/python3.6/site-packages/humanized_opening_hours/main.py", line 326, in __init__
self.field, optimize
File "/anaconda3/lib/python3.6/site-packages/humanized_opening_hours/field_parser.py", line 296, in get_tree_and_rules
tree = PARSER.parse(field)
File "/anaconda3/lib/python3.6/site-packages/lark/lark.py", line 223, in parse
return self.parser.parse(text)
File "/anaconda3/lib/python3.6/site-packages/lark/parser_frontends.py", line 118, in parse
return self.parser.parse(text)
File "/anaconda3/lib/python3.6/site-packages/lark/parsers/xearley.py", line 130, in parse
column = scan(i, column)
File "/anaconda3/lib/python3.6/site-packages/lark/parsers/xearley.py", line 119, in scan
raise UnexpectedCharacters(stream, i, text_line, text_column, {item.expect for item in to_scan}, set(to_scan))
lark.exceptions.UnexpectedCharacters: No terminal defined for 'n' at line 1 col 3
Mon-Sat 11:30AM-10PM
^
Expecting: {Terminal('MINUS'), Terminal('CLOSED'), Terminal('COMMA'), Terminal('__IGNORE_0'), Terminal('OPEN')}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-36-384ae46503c2>", line 1, in <module>
hoh.OHParser(opening_hours)
File "/anaconda3/lib/python3.6/site-packages/humanized_opening_hours/main.py", line 332, in __init__
col=e.column
humanized_opening_hours.exceptions.ParseError: The field could not be parsed, it may be invalid. Error happened on column 3.
For the string to be parsed correctly, it must be changed to 'Mo-Sa 11:30-22:00'.
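The manual rewrite described above could be automated with a small pre-processing step. The following is a minimal sketch, not part of HOH: normalize_field and the mapping are hypothetical, and it only handles English three-letter day names and 12-hour times.

```python
import re

# Hypothetical mapping from three-letter day names to the OSM two-letter form.
DAY_ABBREVIATIONS = {
    "mon": "Mo", "tue": "Tu", "wed": "We", "thu": "Th",
    "fri": "Fr", "sat": "Sa", "sun": "Su",
}

DAY_PATTERN = re.compile(r"\b(mon|tue|wed|thu|fri|sat|sun)\b", re.IGNORECASE)
# Matches "11:30AM", "10PM", "9 pm", and so on.
TIME_PATTERN = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", re.IGNORECASE)


def _to_24h(match):
    """Convert one 12-hour time match to HH:MM in 24-hour notation."""
    hour = int(match.group(1)) % 12  # 12AM becomes 0, 12PM becomes 12 below
    minute = match.group(2) or "00"  # add the missing ":00"
    if match.group(3).lower() == "pm":
        hour += 12
    return f"{hour:02d}:{minute}"


def normalize_field(field):
    """Rewrite day names and 12-hour times into the form the parser expects."""
    field = DAY_PATTERN.sub(lambda m: DAY_ABBREVIATIONS[m.group(1).lower()], field)
    return TIME_PATTERN.sub(_to_24h, field)


print(normalize_field("Mon-Sat 11:30AM-10PM"))  # Mo-Sa 11:30-22:00
```

Fields that are already valid pass through unchanged, so such a step could run before every parse.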
Great job on the package overall. I know that this is a hard problem!
The following concerns the new-parsing branch.
The commit 754060b rewrites a large part of the code, but breaks the recursion of the next_change() method. If someone has any idea of how to fix that (or to do better!), please don't hesitate to propose a pull request!
The goal is to get the true next change for a field like Mo-Fr 00:00-24:00. Currently, for a test from a Monday, it returns 24:00 on Monday, whereas it should return 24:00 on Friday.
Here is the concerned method:
humanized_opening_hours/humanized_opening_hours/main.py
Lines 348 to 377 in 754060b
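One possible way to get the "true" next change is to merge periods that touch each other before looking for the next boundary. The sketch below is not the method from commit 754060b. It assumes that opening periods are available as sorted (start, end) datetime pairs, and merge_adjacent and next_change are hypothetical standalone helpers.

```python
import datetime


def merge_adjacent(periods):
    """Merge sorted (start, end) periods where one ends exactly when the next begins."""
    merged = []
    for start, end in periods:
        if merged and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def next_change(moment, periods):
    """Return the next opening or closing time after `moment`, or None."""
    for start, end in merge_adjacent(periods):
        if moment < start:
            return start
        if moment < end:
            return end
    return None


# "Mo-Fr 00:00-24:00" for the week starting on Monday 1 January 2018.
monday = datetime.datetime(2018, 1, 1)
one_day = datetime.timedelta(days=1)
week = [(monday + i * one_day, monday + (i + 1) * one_day) for i in range(5)]

# Asked on Monday at 10:00, the next change is Friday at 24:00,
# which datetime represents as Saturday at 00:00.
print(next_change(monday + datetime.timedelta(hours=10), week))  # 2018-01-06 00:00:00
```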
The opening_hours of this place are: Mo-Fr 11:30-15:00, We-Mo 18:00-23:00
The plain text description goes like this:
>>> oh = hoh.OHParser(field, locale="en")
>>> print(oh.plaintext_week_description(year=2018, weeknumber=1, first_weekday=0))
Monday: 6:00 PM – 11:00 PM
Tuesday: 11:30 AM – 3:00 PM
Wednesday: 6:00 PM – 11:00 PM
Thursday: 6:00 PM – 11:00 PM
Friday: 6:00 PM – 11:00 PM
Saturday: 6:00 PM – 11:00 PM
Sunday: 6:00 PM – 11:00 PM
whereas the expected output is:
Monday: 11:30 AM – 3:00 PM and 6:00 PM – 11:00 PM
Tuesday: 11:30 AM – 3:00 PM
Wednesday: 11:30 AM – 3:00 PM and 6:00 PM – 11:00 PM
Thursday: 11:30 AM – 3:00 PM and 6:00 PM – 11:00 PM
Friday: 11:30 AM – 3:00 PM and 6:00 PM – 11:00 PM
Saturday: 6:00 PM – 11:00 PM
Sunday: 6:00 PM – 11:00 PM
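The expected output corresponds to treating the two comma-separated rules as additive and letting the We-Mo range wrap around the end of the week. A minimal sketch of that logic follows, assuming rules are already split into a weekday range and a timespan; expand_range and week_schedule are hypothetical names, not HOH functions.

```python
WEEKDAYS = ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]


def expand_range(first, last):
    """Expand a weekday range such as We-Mo, wrapping around the end of the week."""
    start, end = WEEKDAYS.index(first), WEEKDAYS.index(last)
    length = (end - start) % 7 + 1
    return [WEEKDAYS[(start + offset) % 7] for offset in range(length)]


def week_schedule(rules):
    """Combine additive rules into a mapping of weekday to list of timespans."""
    schedule = {day: [] for day in WEEKDAYS}
    for (first, last), timespan in rules:
        for day in expand_range(first, last):
            schedule[day].append(timespan)
    return schedule


# Mo-Fr 11:30-15:00, We-Mo 18:00-23:00
rules = [(("Mo", "Fr"), "11:30-15:00"), (("We", "Mo"), "18:00-23:00")]
for day, timespans in week_schedule(rules).items():
    print(day, " and ".join(timespans))
```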
I wanted to use a fork (temporarily), and found that the default setup.py fails on my system (Python 3.5, Ubuntu 16.04) as it uses sys.path[0] in an attempt to find the development directory. I assume this is because the main dev environment involves prepending to $PYTHONPATH. However, this makes the package non-portable: sys.path[0] is set to '' by default, so setup.py raises a FileNotFoundError when attempting to change to that directory.
To refer to the directory containing setup.py, use __file__:
os.chdir(os.path.dirname(os.path.realpath(__file__)))
The opening hours of this place are:
Oct-Mar 07:30-19:30; Apr-Sep 07:00-21:00
This means that on 30 November at 14:30 it should be open.
However:
>>> oh = hoh.OHParser('Oct-Mar 07:30-19:30; Apr-Sep 07:00-21:00')
>>> oh.is_open()
False
Now if I add the year as your grammar allows it, then it works as expected:
>>> oh = hoh.OHParser('2018 Oct - 2019 Mar 07:30-19:30; Apr-Sep 07:00-21:00')
>>> oh.is_open()
True
The problem appears when the MonthDayRange spans two different years and the year is not given explicitly in the monthday_date_month.
Do you have a solution to this problem?
Thanks!
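A range without a year can be handled by treating a start month that comes after the end month as wrapping over the new year. The following is a minimal sketch of that check, independent of HOH's internal classes; month_in_range is a hypothetical helper.

```python
def month_in_range(month, first, last):
    """Tell whether `month` (1-12) lies in the range first..last, inclusive.

    A range whose first month comes after its last month (such as
    October to March) is treated as wrapping over the new year.
    """
    if first <= last:
        return first <= month <= last
    return month >= first or month <= last


print(month_in_range(11, 10, 3))  # True: November is within Oct-Mar
print(month_in_range(11, 4, 9))   # False: November is outside Apr-Sep
```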
OHParser raises a ParseError on day ranges after a comma:
hoh.OHParser('We,Fr-Su 10:00-17:00')
ParseError: The field could not be parsed, it may be invalid. Error happened on column 5 when parsing '-'.
This error doesn't happen if you have the range first: hoh.OHParser('We-Fr,Su 10:00-17:00')
I've opened a pull request at #6, but since I'm new to EBNF it probably contains some issues.
This tool is really great for parsing (or sanitizing) strings in the OSM opening_hours format and turning them into a readable description. 💯
hoh.OHParser(" Mo-Fr 08:30-16:30", locale="fr").description()
# ['Du lundi au vendredi : 08:30 – 16:30.']
I was wondering whether it is possible to use this tool (or another tool?) to go the other way around, with something like this? 👇
hoh.parse_description("Du lundi au vendredi : 08:30 – 16:30", locale="fr")
# Mo-Fr 08:30-16:30
FYI, my current workaround for this is string replacement 🙃
import re
oh_description = "Du lundi au vendredi : 08:30 – 16:30"
oh_description = re.sub("du", "", oh_description, flags=re.IGNORECASE)
oh_description = re.sub("lundi", "Mo", oh_description, flags=re.IGNORECASE)
...
Thanks anyway
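The string-replacement workaround can be made a little more robust with a lookup table and a single pattern. This is only a sketch for the one French sentence shape shown above ("Du X au Y : HH:MM – HH:MM"); parse_french_range is a hypothetical function, not an HOH API.

```python
import re

# French day names mapped to OSM abbreviations (only this locale is covered).
FRENCH_DAYS = {
    "lundi": "Mo", "mardi": "Tu", "mercredi": "We", "jeudi": "Th",
    "vendredi": "Fr", "samedi": "Sa", "dimanche": "Su",
}

DAYS = "|".join(FRENCH_DAYS)
RANGE_PATTERN = re.compile(
    rf"du\s+({DAYS})\s+au\s+({DAYS})\s*:\s*"
    r"(\d{2}:\d{2})\s*[–-]\s*(\d{2}:\d{2})",
    re.IGNORECASE,
)


def parse_french_range(description):
    """Turn "Du lundi au vendredi : 08:30 – 16:30" into "Mo-Fr 08:30-16:30".

    Returns None when the description does not have this exact shape.
    """
    match = RANGE_PATTERN.search(description)
    if match is None:
        return None
    first, last, opening, closing = match.groups()
    return f"{FRENCH_DAYS[first.lower()]}-{FRENCH_DAYS[last.lower()]} {opening}-{closing}"


print(parse_french_range("Du lundi au vendredi : 08:30 – 16:30"))  # Mo-Fr 08:30-16:30
```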
It happens with Su off;Tu-Fr off;Sa 15:00-20:00;Mo 15:00-20:00
....
return _current_or_next_timespan(new_dt, i=i+1)
File "/home/ramin/workspace/map/POI/humanized_opening_hours/main.py", line 522, in _current_or_next_timespan
return _current_or_next_timespan(new_dt, i=i+1)
File "/home/ramin/workspace/map/POI/humanized_opening_hours/main.py", line 519, in _current_or_next_timespan
new_dt.date()+datetime.timedelta(i),
OverflowError: date value out of range
It seems that Tu-Fr off is selected here:
It is fixed by changing this line:
to:
if matching_rules and matching_rules[0].status != 'closed':
When using the intro code:
import humanized_opening_hours as hoh
field = "Mo-Fr 06:00-21:00; Sa,Su 08:00-12:00"
oh = hoh.OHParser(field, locale="en")
I get
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __init__() got an unexpected keyword argument 'locale'
Hi, you wrote in your readme:
you can save some time by passing the parser to the constructor, instead to recreate it each time. To do
this, get the Lark parser with the humanized_opening_hours.field_parser.get_parser() function, and pass
it to the OHParser constructor via the parser argument.
Perhaps a more user-friendly approach is to save the Lark instance, either as a property of the OHParser instance, like:
if not self._parser:
    self._parser = field_parser.get_parser()
return self._parser
or as a global variable in the field_parser module.
Just a suggestion :)
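The module-level variant of this suggestion can be sketched with functools.lru_cache, so that the parser is built once on first use and then shared. Everything below is a stand-in: build_parser, get_shared_parser and the Parser class are hypothetical and do not use Lark, so the example runs on its own.

```python
import functools


def build_parser():
    """Stand-in for an expensive parser construction (hypothetical)."""
    build_parser.calls += 1
    return object()


build_parser.calls = 0


@functools.lru_cache(maxsize=None)
def get_shared_parser():
    """Build the parser on first use, then return the same instance."""
    return build_parser()


class Parser:
    """Hypothetical stand-in for a class that needs the shared parser."""

    def __init__(self, field):
        self.field = field
        self.parser = get_shared_parser()


first = Parser("Mo-Fr 06:00-21:00")
second = Parser("Sa,Su 08:00-12:00")
print(first.parser is second.parser)  # True
print(build_parser.calls)  # 1
```

With this approach callers would not need to pass a parser around themselves, while an explicit argument could still be kept for users who want a custom instance.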
Hello !
Thanks for creating this package :)
I was slightly surprised that oh.description() returned a list, but oh.plaintext_week_description() returned a string using \n.
Is there a historical reason or issue for that? Would you be OK if I did a PR to add oh.week_description(), which would return a list? (And why not oh.plaintext_description(), which would return a string using \n, and you could even go further by letting the caller choose the delimiter.)
Anyway, I know I can also simply do oh.plaintext_week_description().split("\n"), so I'm using it like this for now :)
Sorry for the bother
Currently, only English and French are supported for rendering. It would be great if HOH supported other languages. So, if there's any language that you would like to see HOH work with, please don't hesitate to create a pull request! :)
To add a new translation:
- add the locale to AVAILABLE_LOCALES in rendering.py;
- go to the humanized_opening_hours folder;
- run mkdir -p locales/<LOCALE>/LC_MESSAGES (replace <LOCALE> with the name of the locale);
- run xgettext -o locales/<LOCALE>/LC_MESSAGES/hoh.pot temporal_objects.py rendering.py;
- run msgfmt -o locales/<LOCALE>/LC_MESSAGES/hoh.mo locales/<LOCALE>/LC_MESSAGES/hoh.pot.
Thank you!
There are a few inaccuracies in the README and the docstrings. All of this should be harmonized and corrected.
Also, there is a small error in the README.
>>> import humanized_opening_hours
>>> field = "Mo-Fr 06:00-21:00; Sa,Su 07:00-21:00"
>>> hoh = HumanizedOpeningHours(field) # Should be "humanized_opening_hours.HumanizedOpeningHours(field)"
[...]
For now, rendering a description requires all of this.
import humanized_opening_hours
field = "Mo-Fr 06:00-21:00; Sa,Su 07:00-21:00"
hoh = humanized_opening_hours.HumanizedOpeningHours(field)
hohr = humanized_opening_hours.HOHRenderer(hoh)
print(hohr.description())
A method should be added to HumanizedOpeningHours to get a ready-to-use hohr object, so that something like this becomes possible.
import humanized_opening_hours
field = "Mo-Fr 06:00-21:00; Sa,Su 07:00-21:00"
hoh = humanized_opening_hours.HumanizedOpeningHours(field)
print(hoh.render().description())
The OSM entry for the Louvre museum in Paris has the following opening_hours:
"Mo,Th,Sa,Su 09:00-18:00; We,Fr 09:00-21:45; Tu off; Jan 1,May 1,Dec 25: off"
If you try to parse it with the hoh parser, you get the following error:
>>> hoh.OHParser("Mo,Th,Sa,Su 09:00-18:00; We,Fr 09:00-21:45; Tu off; Jan 1,May 1,Dec 25: off")
ParseError: The field could not be parsed, it may be invalid. Error happened on column 71 when parsing '5:off'.
The problem is the colon just before "off": your grammar does not allow it, to avoid confusion with "DIGITAL_MOMENT", so the hoh parser fails.
The specification allows this colon as a separator_for_readability, though it is an optional token.
Could you adapt your grammar to allow this separator?
Thanks :)
According to https://wiki.openstreetmap.org/wiki/Key:opening_hours, the following syntax is allowed:
Su 10:00+ (Sunday from 10:00 to an unknown or unspecified closing time)
However, hoh returns an error:
>>> import humanized_opening_hours as hoh
>>> oh = hoh.OHParser("Su 10:00+")
Traceback (most recent call last):
File "/home/klomp/.local/lib/python3.7/site-packages/humanized_opening_hours/main.py", line 91, in __init__
self._tree = field_parser.parse_field(self.sanitized_field)
File "/home/klomp/.local/lib/python3.7/site-packages/humanized_opening_hours/field_parser.py", line 267, in parse_field
tree = PARSER.parse(field)
File "/home/klomp/.local/lib/python3.7/site-packages/lark/lark.py", line 292, in parse
return self.parser.parse(text)
File "/home/klomp/.local/lib/python3.7/site-packages/lark/parser_frontends.py", line 79, in parse
return self.parser.parse(token_stream, *[sps] if sps is not NotImplemented else [])
File "/home/klomp/.local/lib/python3.7/site-packages/lark/parsers/lalr_parser.py", line 36, in parse
return self.parser.parse(*args)
File "/home/klomp/.local/lib/python3.7/site-packages/lark/parsers/lalr_parser.py", line 81, in parse
for token in stream:
File "/home/klomp/.local/lib/python3.7/site-packages/lark/lexer.py", line 354, in lex
for x in l.lex(stream, self.root_lexer.newline_types, self.root_lexer.ignore_types):
File "/home/klomp/.local/lib/python3.7/site-packages/lark/lexer.py", line 183, in lex
raise UnexpectedCharacters(stream, line_ctr.char_pos, line_ctr.line, line_ctr.column, allowed=allowed, state=self.state)
lark.exceptions.UnexpectedCharacters: No terminal defined for '+' at line 1 col 9
Su 10:00+
^
Expecting: {'COMMA', 'MINUS', '__ANON_3'}
A recent commit (ce5d7ae) removed two "useless" methods.
I'm not sure but it seems that at least one of these methods was not useless:
>>> oh = hoh.OHParser("24/7")
>>> ohr = oh.render()
>>> ohr.plaintext_week_description()
AttributeError Traceback (most recent call last)
<ipython-input-37-912ee1dfa78> in <module>() --> 1 ohr.plaintext_week_description()
~/.local/share/virtualenvs/idunn-8FJC_dD6/lib/python3.6/site-packages/humanized_opening_hours/main.py in plaintext_week_description(self, obj)
718 output = ''
719 for day in obj:
--> 720 d = self.periods_of_day(day)
721 description = d.description if d.description else _("closed")
722 output += _("{name}: {periods}").format(
~/.local/share/virtualenvs/idunn-8FJC_dD6/lib/python3.6/site-packages/humanized_opening_hours/main.py in periods_of_day(self, day)
694 )
695 rendered_periods = self._join_list(rendered_periods)
--> 696 name = self.get_locale_day(day.weekday())
697 return RenderableDay(name=name, description=rendered_periods, dt=d.date)
698
AttributeError: 'OHRenderer' object has no attribute 'get_locale_day'
It looks like the get_locale_day() function should not have been removed...?
When using the example from the OSM documentation, which has times spanning over midnight, I am getting an exception:
>>> import humanized_opening_hours as hoh
>>> hoh.OHParser("Su-Tu 11:00-01:00, We-Th 11:00-03:00, Fr 11:00-06:00, Sa 11:00-07:00")
Traceback (most recent call last):
File "/tmp/osm/lib/python3.7/site-packages/humanized_opening_hours/main.py", line 91, in __init__
self._tree = field_parser.parse_field(self.sanitized_field)
File "/tmp/osm/lib/python3.7/site-packages/humanized_opening_hours/field_parser.py", line 267, in parse_field
tree = PARSER.parse(field)
File "/tmp/osm/lib/python3.7/site-packages/lark/lark.py", line 292, in parse
return self.parser.parse(text)
File "/tmp/osm/lib/python3.7/site-packages/lark/parser_frontends.py", line 79, in parse
return self.parser.parse(token_stream, *[sps] if sps is not NotImplemented else [])
File "/tmp/osm/lib/python3.7/site-packages/lark/parsers/lalr_parser.py", line 36, in parse
return self.parser.parse(*args)
File "/tmp/osm/lib/python3.7/site-packages/lark/parsers/lalr_parser.py", line 92, in parse
reduce(arg)
File "/tmp/osm/lib/python3.7/site-packages/lark/parsers/lalr_parser.py", line 73, in reduce
value = self.callbacks[rule](s)
File "/tmp/osm/lib/python3.7/site-packages/humanized_opening_hours/field_parser.py", line 174, in field_part
"The field contains a period which spans "
humanized_opening_hours.exceptions.SpanOverMidnight: The field contains a period which spans over midnight, which not yet supported.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/tmp/osm/lib/python3.7/site-packages/humanized_opening_hours/main.py", line 92, in __init__
except lark.lexer.UnexpectedInput as e:
AttributeError: module 'lark.lexer' has no attribute 'UnexpectedInput'
It seems like spanning over midnight is already detected, but currently not handled.
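One common way to represent such a period is to let the closing time fall on the following day whenever it is not later than the opening time. The sketch below illustrates that idea only; timespan_bounds is a hypothetical helper and says nothing about how HOH would integrate it.

```python
import datetime


def timespan_bounds(day, opening, closing):
    """Return (start, end) datetimes for a timespan beginning on `day`.

    If the closing time is not later than the opening time, the period
    is assumed to end on the following day.
    """
    start = datetime.datetime.combine(day, opening)
    end = datetime.datetime.combine(day, closing)
    if closing <= opening:
        end += datetime.timedelta(days=1)
    return start, end


# "Su 11:00-01:00" on Sunday 7 January 2018.
sunday = datetime.date(2018, 1, 7)
start, end = timespan_bounds(sunday, datetime.time(11, 0), datetime.time(1, 0))
print(start, end)  # 2018-01-07 11:00:00 2018-01-08 01:00:00
```

A check such as is_open() would then also have to look at the previous day's periods, since a Sunday period can still be running early on Monday.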