Hawk

Hawk aims to provide transparent and explainable functionality for obtaining properties of words and determining whether triples of words are discriminative.

What constitutes a discriminative triple?

A triple consists of a pivot, a comparison, and a feature. The triple is discriminative if the feature distinguishes the pivot from the comparison, that is, if it applies to the pivot but not to the comparison. Here's an example:

- Pivot: Paris
- Comparison: Barcelona
- Feature: French

This is a discriminative triple: French applies to Paris because Paris is a French city, but not to Barcelona, which is not. Note that as soon as the feature and the comparison are related, the triple is not discriminative. Therefore, a triple where the feature applies to the comparison but not to the pivot is still not discriminative.
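The definition above can be written as a small predicate. This is only a minimal sketch, not Hawk's implementation: it assumes each word's properties are already available as a plain set of strings, and the function name and the example sets are hypothetical.

```python
def is_discriminative(pivot_props, comparison_props, feature):
    """Return True if the feature applies to the pivot but not to the comparison."""
    return feature in pivot_props and feature not in comparison_props


# Hypothetical property sets, for illustration only.
paris = {"French", "city", "capital"}
barcelona = {"Spanish", "city"}

print(is_discriminative(paris, barcelona, "French"))   # True
print(is_discriminative(paris, barcelona, "city"))     # False: applies to both
print(is_discriminative(paris, barcelona, "Spanish"))  # False: applies only to the comparison
```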

How can I use it?

The best way to use Hawk is through the API.

Running a Local Server Instance

  1. Download the release archive (hawk-0.1.0.tar.gz) from https://github.com/ab-10/Hawk/tree/v0.1.0

  2. Extract the archive: tar -xf hawk-0.1.0.tar.gz

  3. Run the Jetty server: java -jar hawk-0.1.0/hawk-0.1.0-jar-with-dependencies.jar hawk-0.1.0/indexes

  4. The Hawk API can now be accessed on localhost:8080. See API Usage for more information.

API Usage

Each endpoint has a graphical front end, which can be accessed by pointing your browser to the endpoint address without /api. For example, after starting the server on your machine, open localhost:8080/properties for the properties front end, and send requests to localhost:8080/properties/api for the programmatic API.

Obtaining Properties

Returns a list of properties with their roles in parentheses, organized by source. For example (shortened):

{"WKP_Graph":["Hawk(definiendum)","an(O)","unincorporated(B-differentia-quality)","community(B-supertype)","in(B-differentia-quality)"],
"WKT":["hawk(definiendum)","diurnal(has_diff_qual)","of the family Accipitridae(has_origin_loc)","predatory_bird(has_supertype)"]}

The endpoint is available at HOST_NAME/properties/api and requires 3 parameters:

  1. properties: which properties to return. Accepted values:
       • p: the pivot's properties
       • c: the comparison's properties
       • p-c: the pivot's properties minus the comparison's
       • c-p: the comparison's properties minus the pivot's
       • intersection: the common properties of the pivot and the comparison

  2. pivot: string value of the pivot

  3. comparison: string value of the comparison
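As a rough illustration of how a client might talk to this endpoint, the sketch below builds a request URL and splits the returned value(role) strings. It rests on assumptions: the README does not state the HTTP method or how parameters are passed, so passing them as a URL query string is a guess, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

ACCEPTED_PROPERTIES = {"p", "c", "p-c", "c-p", "intersection"}


def properties_url(host, properties, pivot, comparison):
    """Build a URL for the properties endpoint.

    Assumption: parameters are sent as a query string (not confirmed by the README).
    """
    if properties not in ACCEPTED_PROPERTIES:
        raise ValueError(f"unsupported properties value: {properties!r}")
    query = urlencode({"properties": properties, "pivot": pivot, "comparison": comparison})
    return f"http://{host}/properties/api?{query}"


def split_role(entry):
    """Split 'value(role)' into (value, role) at the last opening parenthesis."""
    value, sep, role = entry.rpartition("(")
    if not sep or not role.endswith(")"):
        return entry, None  # no role annotation found
    return value, role[:-1]


def parse_properties(body):
    """Turn the JSON response into {source: [(value, role), ...]}."""
    return {
        source: [split_role(entry) for entry in entries]
        for source, entries in json.loads(body).items()
    }


print(properties_url("localhost:8080", "p-c", "hawk", "eagle"))
# http://localhost:8080/properties/api?properties=p-c&pivot=hawk&comparison=eagle

print(split_role("of the family Accipitridae(has_origin_loc)"))
# ('of the family Accipitridae', 'has_origin_loc')
```

The URL can then be fetched with any HTTP client, for instance urllib.request.urlopen, and the response body passed to parse_properties.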

Determining Discriminativity

Returns one two-element list per source, [DECISION, JUSTIFICATION], where DECISION is "true" or "false" and JUSTIFICATION is a natural language justification for the decision. For example:

{"WKP_Graph":["false","Because hawk and eagle don't contain bird as a property"],"WKT":["false","Because hawk and eagle don't contain bird as a property"],"WN":["true","Because hawk contains bird in property of has_supertype and has_supertype role and eagle doesn't contain bird as one of its properties."]}

The endpoint is available at HOST_NAME/roleBasedVote/api and requires 3 parameters: pivot, comparison, and feature. It returns whether the pivot and the feature have a common property that the comparison doesn't have. Both the role and the value of the property are taken into account in this comparison.
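On the client side, the response can be converted into booleans with their justifications. The following is a minimal sketch based only on the response format shown above; parse_votes is a hypothetical helper, not part of Hawk.

```python
import json


def parse_votes(body):
    """Map each source to (decision as bool, justification)."""
    votes = {}
    for source, (decision, justification) in json.loads(body).items():
        votes[source] = (decision == "true", justification)
    return votes


# Shortened version of the example response above.
body = """{"WKT":["false","Because hawk and eagle don't contain bird as a property"],
"WN":["true","Because hawk contains bird in property of has_supertype and has_supertype role and eagle doesn't contain bird as one of its properties."]}"""

for source, (decision, justification) in parse_votes(body).items():
    print(source, decision, justification)
```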

How can I contribute?

First of all, thanks for showing interest in Hawk! These are the recommended steps for contributing:

  1. Play around with Hawk. Take a look at the usage instructions above and try building something (think hackathon-level project). If you produce something decent, we'd be happy to list it as an example (thus accepting your first contribution); however, the point here is to get you somewhat familiar with the project.
  2. After having the time of your life completing step 1, you now have a dozen feature ideas (or at least one) that would make Hawk even more amazing. So what are you waiting for? Fork the project, implement your ideas and I'll see you at the PR!

P.S. if you need any help during any of this, see the support section!

Support

For help, shoot an email to Armins(dot)Bagrats at gmail, and I'll (hopefully) respond.

Contributors

ab-10, dependabot[bot], vintrae

hawk's Issues

Improve field naming

Currently all document fields in Lucene indexes are named property, irrespective of the nature of the content.

Indexes constructed from definition graphs should have fields organized and named in the following parallel schemes:

  • rawGloss: the entire definition, constructed by concatenating the properties of the synset's and its supertype's glosses.
  • chunkedGloss: like rawGloss, but spaces within each property are replaced with underscores, so that a space always separates different properties.
  • Fields named after the semantic role labels that are present in the definition graphs themselves.

The purpose is to facilitate further experimentation and improve property extraction, as improved field naming allows for more selective queries.
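The difference between the two gloss fields can be illustrated with a short sketch. It is written in Python for brevity rather than against the Lucene API, the function names are hypothetical, and it assumes the caller passes the synset's properties followed by its supertype's properties as one list.

```python
def raw_gloss(properties):
    """Concatenate all properties into one definition string."""
    return " ".join(properties)


def chunked_gloss(properties):
    """Like raw_gloss, but each property becomes a single underscore-joined token."""
    return " ".join(prop.replace(" ", "_") for prop in properties)


properties = ["diurnal", "of the family Accipitridae", "predatory bird"]

print(raw_gloss(properties))
# diurnal of the family Accipitridae predatory bird
print(chunked_gloss(properties))
# diurnal of_the_family_Accipitridae predatory_bird
```

With chunkedGloss, splitting on spaces recovers the original properties, which is not possible with rawGloss.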

Refactor roleBasedVote handler

Currently roleBasedVote is implemented in both demo (under RoleBasedVoteHandler) and analysis (under DictionaryClassifiers). There should be no voting logic in demo, so RoleBasedVoteHandler should be modified to use the functionality from analysis. In turn, analysis should be modified so that it can provide explanations for the role-based vote (similar to what demo currently does).

Improve API Error Messages

Currently for both API endpoints (/properties/api and /roleBasedVote/api) a request that does not have all of the required parameters returns {Invalid request}.

There are two problems with this:

  1. The response is not informative enough. The response should specify which parameters were missing.
  2. For the properties handler, a missing comparison should be permitted if properties is set to p, and a missing pivot should be permitted if properties is set to c.
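The requested behaviour could look roughly like the sketch below. It is a Python illustration of the validation rules only, not a patch for the Java handlers; the function name and the idea of returning a list of missing parameter names are assumptions.

```python
def missing_parameters(endpoint, params):
    """Return the names of required parameters that are absent or empty."""
    if endpoint == "/roleBasedVote/api":
        required = ["pivot", "comparison", "feature"]
    elif endpoint == "/properties/api":
        required = ["properties", "pivot", "comparison"]
        mode = params.get("properties")
        if mode == "p":
            required.remove("comparison")  # only the pivot's properties are requested
        elif mode == "c":
            required.remove("pivot")  # only the comparison's properties are requested
    else:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    return [name for name in required if not params.get(name)]


print(missing_parameters("/properties/api", {"properties": "p", "pivot": "hawk"}))
# []
print(missing_parameters("/properties/api", {"properties": "p-c", "pivot": "hawk"}))
# ['comparison']
print(missing_parameters("/roleBasedVote/api", {"pivot": "hawk"}))
# ['comparison', 'feature']
```

An error response could then name the missing parameters instead of returning only {Invalid request}.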

Document the API

Currently Readme.md describes the basic steps for setting up the API, but it would benefit from a more detailed description of the installation process and from documentation of the API's usage format.

Include BlindVote in API

Blind Vote: all voting methods in the analysis package that use discriminativeQuery are referred to as blind vote, because they are blind to the definition roles.

The task is to provide them through the API, similar to how roleBasedVote is currently provided. However, the methods should be called directly rather than duplicating the functionality in the handler (as is currently done for roleBasedVote).

Document the package structure

Describe the responsibilities of each package (i.e. prep, indexation, analysis, demo, examples).
Describing the 3 core packages (prep, indexation, and analysis) separately from the remaining ones might improve clarity.

Split existing JAR

Currently the Maven build generates one large jar file; running it starts a local API instance.
Instead, the build should generate 3 jar files into the out folder:

  • indexation.jar, which creates Lucene indexes from out/externalData into out/indexes

  • analysis.jar, which accepts triples of the form pivot,comparison,feature separated by \n and outputs the same list of triples with 0 (not discriminative) or 1 (discriminative) appended. E.g., java -jar analysis.jar --model_name < apple,orange,red is expected to write apple,orange,red,1 to standard output

  • api.jar, which keeps the current functionality of the jar, i.e. runs a local instance of the API.
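The input and output format proposed for analysis.jar can be sketched as follows. This is an illustration of the line format only, written in Python; annotate_triples is a hypothetical name, and the classify callback stands in for whichever model the jar would use.

```python
def annotate_triples(text, classify):
    """Append 1 (discriminative) or 0 (not discriminative) to each input triple.

    text holds lines of the form pivot,comparison,feature separated by newlines.
    classify is a stand-in callback taking (pivot, comparison, feature).
    """
    annotated = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue  # skip blank lines such as a trailing newline
        pivot, comparison, feature = line.split(",")
        label = 1 if classify(pivot, comparison, feature) else 0
        annotated.append(f"{pivot},{comparison},{feature},{label}")
    return "\n".join(annotated)


# Toy classifier used only to show the format.
def toy_classifier(pivot, comparison, feature):
    return (pivot, feature) == ("apple", "red")


print(annotate_triples("apple,orange,red\napple,orange,fruit\n", toy_classifier))
# apple,orange,red,1
# apple,orange,fruit,0
```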

Create home page

Problem

Currently the API server doesn't have anything at root. It would be nice if the root could link to the other pages of the API.

Suggested Approach

Create a new class in the demo package that extends AbstractHandler. This class should handle requests to root and return the index page. The index page can be a static .html file (which should be placed in the resources folder).

Then modify DemoServer to route calls to root ("/") to the class you created (similar to how it is done for the other handlers).

Integrate the best performing model and roleBasedEval into the same project

The best performing model (F1-wise) is the one on commit 8533aa6, but the model on the roleBasedEval branch offers comparison methods that have future potential.

I would like to combine the analysis and indexation functionality of both models, so that the user can specify what type of analysis they want to do (the methods from both models should live in the same project but remain separate methods).

Automate Concept Net Pre-Processing

Requires issue #20 to be complete

The project should include the scripts required for pre-processing Concept Net data.
The result of the pre-processing should be a file that a Java program can easily read for further analysis, in the format:
<edge name> <term_1> <term_2>
where the elements of each triple are space-separated and lines are separated by \n (UNIX line endings).
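A pre-processing script could write and read this format along the lines of the sketch below. It assumes that edge names and terms contain no spaces themselves (otherwise the space-separated format would be ambiguous); the helper names and the sample edges are illustrative only.

```python
def format_edge(edge_name, term_1, term_2):
    """Format one triple as '<edge name> <term_1> <term_2>'."""
    for part in (edge_name, term_1, term_2):
        if " " in part or "\n" in part:
            raise ValueError(f"element must not contain whitespace: {part!r}")
    return f"{edge_name} {term_1} {term_2}"


def parse_edges(text):
    """Parse newline-separated triples back into (edge_name, term_1, term_2) tuples."""
    edges = []
    for line in text.split("\n"):
        if not line:
            continue  # skip the empty string after a trailing newline
        edge_name, term_1, term_2 = line.split(" ")
        edges.append((edge_name, term_1, term_2))
    return edges


# Illustrative edges, not taken from real Concept Net output.
lines = [format_edge("IsA", "hawk", "bird"), format_edge("CapableOf", "hawk", "fly")]
text = "\n".join(lines) + "\n"

print(parse_edges(text))
# [('IsA', 'hawk', 'bird'), ('CapableOf', 'hawk', 'fly')]
```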

Thanks for contributing.
