pyfinder's Introduction

PyFinder

Yet another search engine built in Python.

This project contains a really simple search engine capable of digesting any list of objects of any kind. It was made as a prototype for another project I'm working on in collaboration with @simo86. Since I liked where it was going, we extracted it from the main project to improve it and make it standalone.

Why?

There are lots and lots of search engines in Python. This is another one, made expressly for prototyping and as a placeholder. It allows simple queries on objects of any type, as long as you search on string attributes, and works directly on an iterable of such objects (coming from any source, such as a list or even a generator function).

For this reason it's probably not well suited to production use, but for testing purposes it should do its job.

Also, since it does not rely on any kind of stored index, either in a file or in a database, it can work everywhere your Python application can run.
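To make the "no stored index" idea concrete, here is a minimal sketch of a linear scan over any iterable, including a generator. It is not PyFinder's implementation: `naive_search`, `item_source` and the plain substring match are assumptions made up for illustration.

```python
from itertools import islice


class Item:
    def __init__(self, value):
        self.value = value


def naive_search(query, items, attributes, limit=10):
    """Hypothetical helper: return up to `limit` items whose named
    string attributes contain the query (case-insensitive)."""
    needle = query.lower()

    def matches(item):
        # Missing attributes are treated as empty strings.
        return any(
            needle in str(getattr(item, name, "")).lower()
            for name in attributes
        )

    # islice stops consuming the source once `limit` matches are found,
    # so nothing is indexed or stored beyond the returned results.
    return list(islice((item for item in items if matches(item)), limit))


def item_source():
    # A generator function standing in for any data source.
    for name in ("john doe", "jane doe", "johnny cash"):
        yield Item(name)


found = naive_search("john", item_source(), ["value"])
print([item.value for item in found])  # ['john doe', 'johnny cash']
```

Because the scan touches every object on every query, cost grows linearly with the size of the collection, which is the trade-off behind the "not for production" note above.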

Basic usage

from search import SearchEngine

# Have some objects ready to search!
class Item:
    def __init__(self, v):
        self.value = v

names = ['john doe', ]  # some data
collection = [Item(name) for name in names]

engine = SearchEngine(['value'], limit=10)

result = engine.search('john', collection)
# or directly engine('john', collection)

Documentation

Built documentation can be found on Read the Docs.

To build the documentation locally, go inside the docs folder and run make html to build it with Sphinx.

For development purposes

sphinx-autobuild ./source _build_html

can be used to hot-reload the documentation, which will be served at 127.0.0.1:8000.

Testing

Tests are built with pytest, so to quickly run tests:

pytest

More options can be found in the pytest documentation.

License

Released under the MIT License.

pyfinder's People

Contributors

dinghino

pyfinder's Issues

add filter for stopwords

For blobs such as long strings, stopwords should be removed from the string (before or after splitting it into chunks).
For testing, this could be done by removing all words shorter than x characters, but it would work better with a dictionary of stopwords to filter out.
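The two approaches mentioned in this issue could be sketched as follows. All names here (`STOPWORDS`, `split_words`, `drop_short_words`, `drop_stopwords`) are hypothetical and not part of PyFinder, and the stopword set is a tiny illustrative sample rather than a real dictionary.

```python
import re

# Illustrative stopword set only; a usable one would be much larger.
STOPWORDS = frozenset({"a", "an", "and", "in", "of", "the", "to"})


def split_words(text):
    """Lowercase the text and split it on runs of non-word characters."""
    return [word for word in re.split(r"\W+", text.lower()) if word]


def drop_short_words(words, min_length=3):
    """Length heuristic: drop every word shorter than `min_length`."""
    return [word for word in words if len(word) >= min_length]


def drop_stopwords(words, stopwords=STOPWORDS):
    """Dictionary approach: drop only the words listed as stopwords."""
    return [word for word in words if word not in stopwords]


words = split_words("The quick brown fox jumps over an ox in the den")

print(drop_short_words(words))
# ['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'den']

print(drop_stopwords(words))
# ['quick', 'brown', 'fox', 'jumps', 'over', 'ox', 'den']
```

The length heuristic keeps "the" and loses the meaningful word "ox", which is the kind of error a stopword dictionary avoids.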

stemming

For each segment (partial query or attribute value), find the root word (e.g. searching -> search).
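As a rough idea of what this could look like, the sketch below strips a few common English suffixes. It is a deliberately naive stand-in for a real stemming algorithm such as Porter's; `naive_stem`, `SUFFIXES` and the `min_stem` threshold are assumptions for illustration, not part of PyFinder.

```python
# Longer suffixes come first so "es" is tried before "s".
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word, min_stem=3):
    """Strip the first matching suffix, provided that at least
    `min_stem` characters remain; otherwise return the word as is."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


for word in ("searching", "searched", "searches", "search", "sing"):
    print(word, "->", naive_stem(word))
# searching -> search
# searched -> search
# searches -> search
# search -> search
# sing -> sing
```

Applying the same function to both the query segments and the attribute values would let "searching" match "searched". A rule set this small makes mistakes on many words, so it only shows where stemming would slot in.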

split at & remove punctuation

Remove all punctuation and split there. This should already be in place with the internal splitter function, which as of now calls re.split(r'\W+'), but I haven't actually tested it.
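A quick way to check that behaviour is to run the same pattern on a few sample strings. The snippet below uses only the standard `re` module; `split_text` is a hypothetical wrapper, not the project's internal splitter.

```python
import re

raw = re.split(r"\W+", "Hello, world! It's me.")
print(raw)
# ['Hello', 'world', 'It', 's', 'me', '']


def split_text(text):
    """Split on runs of non-word characters and drop the empty strings
    that re.split leaves at leading or trailing punctuation."""
    return [chunk for chunk in re.split(r"\W+", text) if chunk]


print(split_text("Hello, world! It's me."))
# ['Hello', 'world', 'It', 's', 'me']

print(split_text("snake_case-name"))
# ['snake_case', 'name']
```

Three things stand out: trailing punctuation produces an empty string that needs filtering, apostrophes split contractions ("It's" becomes "It" and "s"), and underscores are kept because `\W` treats `_` as a word character.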
