@author: Matt Story <[email protected]>
@license: BSD 3-Clause (see LICENSE)
The Sonnets Xapian Demo was built for the talk "Getting Started with Python and Xapian -- HackNY" at the HackNY Spring Student Hackathon, where I participated as a tech ambassador (although the talk was not given due to room difficulties).
The sonnets were taken by hand from here and then parsed. This content is in the public domain in the USA.
index-sonnets.py takes a list of files to index on the command line:
$ python index-sonnets.py sonnets/1 sonnets/2 # etc ...
It then indexes them under the author 'William Shakespeare' into the database ./xdb/sonnets.db. To index all the sonnets:
$ find shakespeare/ -type f | xargs python index-sonnets.py
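For readers who want a feel for what this kind of indexing involves, here is a minimal sketch using the Xapian Python bindings. It is not the code in index-sonnets.py. The helper names (describe_sonnet, index_files), the "A" author prefix, the "Q" id prefix and the use of value slot 0 for the line count are all assumptions made for illustration.

```python
import os


def describe_sonnet(path, text, author="William Shakespeare"):
    """Collect the fields a sonnet document might carry (hypothetical layout)."""
    lines = [line for line in text.splitlines() if line.strip()]
    return {
        "id": path,
        "author": author,
        "lines": len(lines),          # number of non-blank lines
        "body": "\n".join(lines),
    }


def index_files(paths, db_path="./xdb/sonnets.db"):
    """Index each file into a Xapian database (assumed schema, not the demo's)."""
    import xapian  # imported lazily so describe_sonnet works without the bindings

    parent = os.path.dirname(db_path)
    if parent:
        os.makedirs(parent, exist_ok=True)

    db = xapian.WritableDatabase(db_path, xapian.DB_CREATE_OR_OPEN)
    indexer = xapian.TermGenerator()
    indexer.set_stemmer(xapian.Stem("en"))

    for path in paths:
        with open(path, encoding="utf-8") as handle:
            fields = describe_sonnet(path, handle.read())

        doc = xapian.Document()
        doc.set_data(fields["body"])
        indexer.set_document(doc)
        indexer.index_text(fields["body"])             # free-text terms
        indexer.index_text(fields["author"], 1, "A")   # author terms, assumed prefix "A"
        # Assumed: line count stored in value slot 0 for later filtering.
        doc.add_value(0, xapian.sortable_serialise(fields["lines"]))

        id_term = "Q" + path                           # assumed unique-id term
        doc.add_boolean_term(id_term)
        db.replace_document(id_term, doc)              # re-indexing a file overwrites it

    db.commit()
```

Calling index_files(sys.argv[1:]) from a script would give roughly the command-line behaviour described above. Using replace_document with an id term is one way to make repeated runs idempotent.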
query-sonnets.py takes a query string as its first argument, an optional author query string as its second argument, and an optional number of lines in the sonnet as its third argument (admittedly a terrible interface, but it's a demo ...). Some things you can try:
$ python query-sonnets.py 'shall'
$ python query-sonnets.py 'shall' '' 15
$ python query-sonnets.py '' '' 16
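The positional interface above could be handled along the following lines. This is a sketch under stated assumptions, not the contents of query-sonnets.py: parse_args and search are hypothetical names, an empty string is treated as "argument not given", the line count is assumed to be an exact match stored in value slot 0, and author terms are assumed to be indexed under the prefix "A".

```python
def parse_args(argv):
    """Map up to three positional arguments to (query, author, lines)."""
    given = list(argv[:3])
    query, author, lines = given + [""] * (3 - len(given))
    # Empty strings mean "unset", which is how '' '' 16 skips the first two.
    return (query or None, author or None, int(lines) if lines else None)


def search(query, author, lines, db_path="./xdb/sonnets.db", limit=10):
    """Run the combined query and return the stored text of each match."""
    import xapian  # imported lazily so parse_args works without the bindings

    db = xapian.Database(db_path)
    parser = xapian.QueryParser()
    parser.set_stemmer(xapian.Stem("en"))
    parser.set_stemming_strategy(xapian.QueryParser.STEM_SOME)

    parts = []
    if query:
        parts.append(parser.parse_query(query))
    if author:
        # Third argument is the default term prefix; "A" is an assumption.
        parts.append(parser.parse_query(author, xapian.QueryParser.FLAG_DEFAULT, "A"))
    if lines is not None:
        exact = xapian.sortable_serialise(lines)
        # Assumed: exact match on the line count held in value slot 0.
        parts.append(xapian.Query(xapian.Query.OP_VALUE_RANGE, 0, exact, exact))

    # A query on the empty term matches every document.
    combined = xapian.Query(xapian.Query.OP_AND, parts) if parts else xapian.Query("")

    enquire = xapian.Enquire(db)
    enquire.set_query(combined)
    return [match.document.get_data().decode("utf-8")
            for match in enquire.get_mset(0, limit)]
```

With this sketch, search(*parse_args(["shall", "", "15"])) would look for 'shall' only in sonnets whose stored line count is 15.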