Giter Club home page Giter Club logo

querycorrection's Introduction

QueryCorrection

self complemented SpellCorrection based pinyin similairity, edit distance ,基于用户词表,拼音相似度与编辑距离的查询纠错。

关于查询纠错

Image text

项目介绍

对于搜索中的query纠错功能,纠错过程主要分为以下3个过程:
1, Query纠错判断。对于常见错误,例如常见的拼写错误,使用事先挖掘好的错误query字典,当query在此字典中时纠错。如果用户输入的query查询无结果或结果较少于一定阈值时,尝试纠错,可以根据不同领域的策略和容忍度,配置最少结果数阈值。
2,不同策略独立纠错。Query有多种纠错策略,包括拼音纠错和编辑距离纠错,模糊音形近字二次纠错等其他纠错策略等。同音策略是用户输入的错误query和候选纠错query有相同的拼音。编辑距离策略就是错误query和候选query之间编辑距离小于一定阈值,并配合其他条件进行过滤。
3,候选词结果选择。因为每个策略比较独立,不同策略会给出不同的候选词,因此对于候选词的选取,每个策略有所不同。不同策略之间,不同策略内部需要使用不同的评估方式,来选择最优结果。

querycorrection's People

Contributors

liuhuanyong avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.