borgmanities's People

Contributors

abathologist, shonfeder

borgmanities's Issues

Refactor code

  • Eliminate "Random Selections". Belongs rather in PaperTitleGen.IO.
  • PaperTitleGen.IO is written in a primitive style, reflecting the current state of my knowledge. Substantial gains in clarity can be made once I push to the next level, so
  • Attain next level.

Move all parsing to Haskell

Reasons for this change:

  1. I want to get familiar with parsing in Haskell.
  2. In order to really refine the output, we need to have more advanced parsing of the text. Immediate issues that call for this include:
    • Stripping out all links (requires parsing out relatively complex string patterns).
    • Rejecting sequences that contain tweet-speak ("RT", "u r", etc).
    • Rejecting phrases with terms in the black list.
  3. Thus, we'll need to do more parsing on either the Python side or the Haskell side. Given (1) and the fact that I don't enjoy working in Python, the point is made.

Todo

  • Modify the Python code to return unfiltered unicode text of tweets, one per line.
  • Parse this text in Haskell.
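
A minimal sketch of what the Haskell side of this filtering could look like, using only base. It treats links as whitespace-delimited tokens rather than parsing them properly, and the names (isLink, tweetSpeak, cleanTweet, cleanTweets) and the tweet-speak list are assumptions for illustration, not existing project code. The blacklist is expected in lower case.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf)
import Data.Maybe (mapMaybe)

-- | True for whitespace-delimited tokens that look like links.
isLink :: String -> Bool
isLink w = any (`isPrefixOf` map toLower w) ["http://", "https://", "www."]

-- | Drop link tokens from a tweet.
stripLinks :: String -> String
stripLinks = unwords . filter (not . isLink) . words

-- | Hypothetical tweet-speak markers (lower case); extend as needed.
tweetSpeak :: [String]
tweetSpeak = ["rt", "u", "r", "ur", "lol"]

-- | True if any token of the text is a tweet-speak marker.
hasTweetSpeak :: String -> Bool
hasTweetSpeak = any ((`elem` tweetSpeak) . map toLower) . words

-- | Clean one tweet; Nothing means the tweet is rejected.
cleanTweet :: [String] -> String -> Maybe String
cleanTweet blacklist tweet
  | hasTweetSpeak cleaned = Nothing
  | any ((`elem` blacklist) . map toLower) (words cleaned) = Nothing
  | otherwise = Just cleaned
  where
    cleaned = stripLinks tweet

-- | Clean the Python output: one tweet per line in, surviving tweets out.
cleanTweets :: [String] -> String -> [String]
cleanTweets blacklist = mapMaybe (cleanTweet blacklist) . lines
```

Tokens with attached punctuation (such as "RT:") slip through this version; catching those is where a real parser would earn its keep.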

Build tweet from TitleParts data.

In PaperTitleGen.Gen.hs: Select, manicure, and assemble data from fields of TitleParts input.
The main function for this should be generateTitle (rename if desired).

I'm putting you as principal on this, @KitLiterate, but not to be burdensome! Mainly just to explore the utility of the issue tracker as a project-planning tool.

Change or add what you will.

  • Only take definitions before “with” and other prepositions?
  • Only take definitions up to “having” and coordinating particles (“or”, “and”) too?
  • Remove phrases in parens that appear in definitions.
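
The three ideas above could be tried out with something like the following sketch. The cut-word list and the names cutWords, dropParens, and trimDefinition are assumptions for illustration; which words should actually end a definition is the open question in the list.

```haskell
import Data.Char (toLower)

-- | Hypothetical words at which a definition is cut off.
cutWords :: [String]
cutWords = ["with", "having", "or", "and"]

-- | Remove parenthesised phrases, including nested ones.
dropParens :: String -> String
dropParens = go (0 :: Int)
  where
    go _ [] = []
    go n ('(' : cs) = go (n + 1) cs
    go n (')' : cs) = go (max 0 (n - 1)) cs
    go n (c : cs)
      | n > 0 = go n cs          -- inside parentheses: skip
      | otherwise = c : go n cs  -- outside: keep

-- | Keep only the part of a definition before the first cut word.
trimDefinition :: String -> String
trimDefinition =
  unwords . takeWhile ((`notElem` cutWords) . map toLower) . words . dropParens
```

For example, trimDefinition "a feudal estate (historical) with land and tenants" yields "a feudal estate".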

Plural nouns are not registered as nouns.

On a recent test I got back "manorial based jobs xlyqfwri rt dsw worldwide the fight" as a complement. But if we could catch plurals, the complement would have been "manorial based jobs", which would have worked nicely.

  • Come up with idea for fix.
  • Fix.
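
One possible idea for the fix, sketched under the assumption that the noun check can be passed in as a function: generate naive singular candidates for a word and accept it as a noun if any candidate passes the check. The suffix rules are rough English heuristics (they miss irregular plurals and produce some non-words), and singularCandidates and isNounOrPlural are hypothetical names.

```haskell
import Data.List (isSuffixOf, nub)

-- | Candidate singular forms for a word, most specific rule first.
singularCandidates :: String -> [String]
singularCandidates w = nub (concat [ies, es, s])
  where
    n = length w
    -- "bodies" -> "body"
    ies = [take (n - 3) w ++ "y" | "ies" `isSuffixOf` w, n > 3]
    -- "boxes" -> "box", "churches" -> "church"
    es = [take (n - 2) w | any (`isSuffixOf` w) ["ses", "xes", "zes", "ches", "shes"]]
    -- "jobs" -> "job", but leave "glass" alone
    s = [take (n - 1) w | "s" `isSuffixOf` w, not ("ss" `isSuffixOf` w), n > 1]

-- | A word counts as a noun if it, or one of its singular candidates,
-- passes the supplied noun check (e.g. a dictionary lookup).
isNounOrPlural :: (String -> Bool) -> String -> Bool
isNounOrPlural isNoun w = any isNoun (w : singularCandidates w)
```

With this, singularCandidates "jobs" gives ["job"], so a lookup that only knows "job" would still register "jobs" as a noun.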

Reconfigure to access WordsApi via Mashape

WordsApi suddenly quit working, I think. I don't know if this is because I exceeded the daily allowance, or because of an issue on their end. Either way, they are terminating their "sandbox" services come February 20th and will only serve their API via Mashape (https://www.mashape.com/wordsapi/wordsapi), so I'll have to make this switch eventually. The fact that I only think WordsApi is the problem indicates a serious shortcoming in the design of Web/WordsApi.hs: I have no logging or error reporting there.

  • Add logging to Web/WordsApi.hs
  • Set WordsApi.hs to just use System.Process to interface with curl instead of Network.Curl.
  • Implement the Mashape token and query protocol via curl commands in WordsApi.hs.
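
A rough sketch of the curl-via-System.Process approach, with failures reported to stderr instead of vanishing. The endpoint in baseUrl and the X-Mashape-Key header are assumptions that should be checked against the Mashape listing; curlArgs and queryWord are hypothetical names. The word is not URL-encoded here.

```haskell
import System.Exit (ExitCode (..))
import System.IO (hPutStrLn, stderr)
import System.Process (readProcessWithExitCode)

-- | Assumed Mashape endpoint for WordsApi; verify before use.
baseUrl :: String
baseUrl = "https://wordsapiv1.p.mashape.com/words/"

-- | Arguments handed to curl. The token is assumed to travel in the
-- X-Mashape-Key header. --fail makes HTTP errors show up as exit codes.
curlArgs :: String -> String -> [String]
curlArgs token word =
  [ "--silent", "--show-error", "--fail"
  , "--header", "X-Mashape-Key: " ++ token
  , "--header", "Accept: application/json"
  , baseUrl ++ word
  ]

-- | Run curl and return the response body, or a logged error message.
queryWord :: String -> String -> IO (Either String String)
queryWord token word = do
  (code, out, err) <- readProcessWithExitCode "curl" (curlArgs token word) ""
  case code of
    ExitSuccess -> pure (Right out)
    ExitFailure n -> do
      let msg = "curl exited with " ++ show n ++ " for " ++ show word ++ ": " ++ err
      hPutStrLn stderr msg
      pure (Left msg)
```

Returning Either keeps the failure reason available to the caller, so an exceeded quota and a network problem no longer look the same.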

Implement black list censorship

  • Compile blacklist.
  • Include crap words amongst the vile ones (like "http"): words that are nouns but suck for our purposes.
  • Implement censor of selected nouns based on blacklist
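
The censor step itself can be small. In this sketch the blacklist is a hard-coded placeholder (the real one would presumably be compiled into a resource file), and isBlacklisted and censor are hypothetical names.

```haskell
import Data.Char (toLower)

-- | Placeholder blacklist in lower case; the compiled list would replace it.
blacklist :: [String]
blacklist = ["http", "https", "rt", "amp"]

-- | Case-insensitive membership test against a blacklist.
isBlacklisted :: [String] -> String -> Bool
isBlacklisted bl w = map toLower w `elem` bl

-- | Drop every selected noun that appears on the blacklist.
censor :: [String] -> [String] -> [String]
censor bl = filter (not . isBlacklisted bl)
```

For example, censor blacklist ["manor", "HTTP", "gift"] keeps ["manor", "gift"].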

Add proper logging to PaperTitleGen.IO.hs

For the purposes of refining the algorithm and monitoring its efficiency (namely in terms of how many times it has to restart due to failure to find fitting candidate data), logging seems essential. Also, it's just a good thing to learn to implement.
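
As a starting point, here is a sketch of a retry loop that logs every restart and reports how many attempts were needed. It writes plain lines to a Handle rather than using a logging library, and Level, formatLog, logTo, and retryLogged are hypothetical names.

```haskell
import System.IO (Handle, hPutStrLn, stderr)

data Level = Info | Warn
  deriving (Show, Eq)

-- | Render a log line, e.g. "[Warn] message".
formatLog :: Level -> String -> String
formatLog lvl msg = "[" ++ show lvl ++ "] " ++ msg

-- | Write one log line to the given handle.
logTo :: Handle -> Level -> String -> IO ()
logTo h lvl = hPutStrLn h . formatLog lvl

-- | Run an attempt up to `limit` times, logging each failure.
-- On success, returns the result together with the attempt number.
retryLogged :: Handle -> Int -> IO (Maybe a) -> IO (Maybe (a, Int))
retryLogged h limit attempt = go 1
  where
    go n
      | n > limit = do
          logTo h Warn ("giving up after " ++ show limit ++ " attempts")
          pure Nothing
      | otherwise = do
          r <- attempt
          case r of
            Just x -> do
              logTo h Info ("succeeded on attempt " ++ show n)
              pure (Just (x, n))
            Nothing -> do
              logTo h Warn ("attempt " ++ show n ++ " found no fitting candidate; restarting")
              go (n + 1)
```

Passing stderr as the handle keeps log output separate from generated titles; a file handle would work the same way.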

Tune Algorithm

  • Don’t let single letters count as nouns (‘r’ and ‘s’ count as nouns because they name the letter, I guess).
  • Eliminate all numbers
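
Both rules fit in one predicate. This sketch reads "eliminate all numbers" as rejecting any token that contains a digit, which is an assumption; acceptableNoun is a hypothetical name.

```haskell
import Data.Char (isDigit)

-- | Reject single-letter tokens and anything containing a digit.
acceptableNoun :: String -> Bool
acceptableNoun w = length w > 1 && not (any isDigit w)
```

For example, filter acceptableNoun ["r", "s", "manor", "42", "b2b"] leaves ["manor"].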

Track seedNouns to eliminate redundancy.

Each successful seedNoun should be recorded in a list of seedNouns, and these should not be repeated (at least not until we have extended the range of inputs available to the bot).

  • Enter seedNouns of successful title generation into src/resources/seedNouns.txt.
  • Check the selection of new seedNouns against seedNouns.txt.
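
A minimal sketch of the two steps, assuming one seedNoun per line in the file. The names readSeeds, isFreshSeed, and recordSeed are hypothetical. The file contents are forced before appending, since lazy readFile would otherwise keep the file open.

```haskell
import Control.Exception (IOException, try)
import Control.Monad (when)

-- | Location of the record, as named in the issue.
seedFile :: FilePath
seedFile = "src/resources/seedNouns.txt"

-- | Read recorded seedNouns; a missing or unreadable file counts as empty.
readSeeds :: FilePath -> IO [String]
readSeeds path = do
  r <- try (readFile path) :: IO (Either IOException String)
  case r of
    Left _ -> pure []
    -- Force the whole file so the handle is closed before any append.
    Right txt -> length txt `seq` pure (lines txt)

-- | A seedNoun is fresh if it has not been recorded yet.
isFreshSeed :: [String] -> String -> Bool
isFreshSeed used noun = noun `notElem` used

-- | Record a seedNoun after a successful title.
-- Returns False (and writes nothing) if it was already recorded.
recordSeed :: FilePath -> String -> IO Bool
recordSeed path noun = do
  used <- readSeeds path
  let fresh = isFreshSeed used noun
  when fresh (appendFile path (noun ++ "\n"))
  pure fresh
```

Selection of a new seedNoun would call isFreshSeed against readSeeds seedFile before committing to a candidate, and recordSeed seedFile once a title has been generated.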

Direct database queries to a local wordnet Prolog-powered database

BotticelliBot has been exceeding its free monthly quota. This month, it did so in a mere 15 days. That's too pricey for a novelty project. WordNet has a Prolog version of its 3.0 database, and I love Prolog; plus, it would be better if it were running independently anyhow. Thus...

  • Implement Prolog front-end program for performing queries.
    • Create modular interface for wn database files.
    • JIT loading of database files (i.e., database files won't load unless you actually execute a goal that requires one of their predicates).
    • Implement flexible, general purpose query system for getting word information.
    • Format query results as JSON.
    • Implement executable PrologScript interface to call program from CLI.
  • Link Haskell to Prolog program.
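
For the Haskell side of the link, one option is to shell out to SWI-Prolog and read the JSON it prints. The sketch below assumes SWI-Prolog's swipl executable is on the PATH; the script path wnScript and the predicate word_info_json/1 are purely hypothetical stand-ins for whatever the Prolog front end ends up exposing.

```haskell
import System.Process (readProcess)

-- | Hypothetical path to the Prolog front-end script.
wnScript :: FilePath
wnScript = "prolog/wn_query.pl"

-- | Arguments for swipl: quiet mode, run one goal, then halt.
-- Options must come before the script file.
swiplArgs :: FilePath -> String -> [String]
swiplArgs script goal = ["-q", "-g", goal, "-t", "halt", script]

-- | Quote a word as a Prolog atom, escaping quotes and backslashes.
quoteAtom :: String -> String
quoteAtom w = "'" ++ concatMap esc w ++ "'"
  where
    esc '\'' = "\\'"
    esc '\\' = "\\\\"
    esc c = [c]

-- | Build the goal text; word_info_json/1 is an assumed predicate
-- that prints the query result as JSON on stdout.
queryGoal :: String -> String
queryGoal w = "word_info_json(" ++ quoteAtom w ++ ")"

-- | Run the query and return whatever the Prolog program printed.
queryWordNet :: String -> IO String
queryWordNet w = readProcess "swipl" (swiplArgs wnScript (queryGoal w)) ""
```

Note that readProcess throws an exception on a non-zero exit code; readProcessWithExitCode is the alternative if failures should be logged rather than raised.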
