zeeraktalat / multitask-abuse
Multitask Learning for abuse and hate speech detection
License: Mozilla Public License 2.0
As discussed, since we don't know much about how informative the tasks are for each other, we weight all auxiliary tasks equally, specifically at 1/len(aux_tasks).
That will be the easy thing to try first.
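A minimal sketch of the equal weighting, assuming the weights scale each auxiliary loss before it is added to the main-task loss (the thread does not say how the weights are applied). The names equal_aux_weights and combined_loss and the task names are hypothetical, not taken from the repository.

```python
def equal_aux_weights(aux_tasks):
    """Give every auxiliary task the same weight, 1 / len(aux_tasks)."""
    if not aux_tasks:
        return {}
    weight = 1 / len(aux_tasks)
    return {task: weight for task in aux_tasks}


def combined_loss(main_loss, aux_losses, aux_weights):
    """Main-task loss plus the weighted sum of auxiliary losses.

    Assumption: the weights multiply the losses. They could just as well
    be used as sampling probabilities for the auxiliary tasks.
    """
    return main_loss + sum(aux_weights[t] * loss for t, loss in aux_losses.items())


# Hypothetical auxiliary task names, for illustration only.
weights = equal_aux_weights(["aux_a", "aux_b", "aux_c", "aux_d"])
total = combined_loss(1.0, {"aux_a": 2.0, "aux_b": 4.0}, {"aux_a": 0.5, "aux_b": 0.5})
```

With four auxiliary tasks each weight is 0.25, so the auxiliary tasks together contribute at most as much as one unweighted task.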
What I forgot to mention are possible ways to quantify task relatedness. These could include measuring the PMI (pointwise mutual information) of labels or vocabulary. It could be interesting to look into this.
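One possible reading of the vocabulary variant, as a rough sketch: compute the PMI between each token and the task whose corpus it appears in, PMI(w, t) = log2(p(w, t) / (p(w) p(t))). Tokens with PMI near 0 are used about equally across tasks, while strongly positive values mark task-specific vocabulary. The function name token_task_pmi and the toy corpora are assumptions for illustration; how to aggregate the scores into a single relatedness number is left open.

```python
import math
from collections import Counter


def token_task_pmi(corpora):
    """PMI between tokens and tasks.

    corpora: dict mapping a task name to a list of tokenised documents.
    Returns a dict mapping (token, task) to PMI in bits.
    """
    joint = Counter()         # count of (token, task) pairs
    token_counts = Counter()  # count of each token over all tasks
    task_counts = Counter()   # number of tokens in each task's corpus
    for task, docs in corpora.items():
        for doc in docs:
            for tok in doc:
                joint[(tok, task)] += 1
                token_counts[tok] += 1
                task_counts[task] += 1

    total = sum(task_counts.values())
    pmi = {}
    for (tok, task), n in joint.items():
        p_joint = n / total
        p_tok = token_counts[tok] / total
        p_task = task_counts[task] / total
        pmi[(tok, task)] = math.log2(p_joint / (p_tok * p_task))
    return pmi


# Toy corpora: "x" is shared by both tasks, "y" and "z" are task-specific.
scores = token_task_pmi({"a": [["x", "y"]], "b": [["x", "z"]]})
```

The same counting scheme would work for labels instead of tokens, provided the label inventories of the tasks can be aligned.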
See our CWI'18 code for reference: https://github.com/bjerva/cwi18
In this project, tasks are different languages.
In src/model.py you'll see that self.inputs is a dictionary mapping a task ID to an input layer. self.outputs should really follow that convention tbh, but for some reason it's a list of layers here that are accessed with integers; don't know why we did it that way. Then in forward(), you see that the input task ID is provided as an argument.
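To make the convention concrete, here is a framework-free toy sketch in which both the input and the output layers are dictionaries keyed by task ID, and forward() takes the task ID as an argument. This is not the cwi18 code: the Scale class is a stand-in for a trainable layer, and ToyMultiTaskModel and its shared attribute are hypothetical names.

```python
class Scale:
    """Stand-in for a trainable layer: multiplies every feature by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, features):
        return [self.factor * x for x in features]


class ToyMultiTaskModel:
    def __init__(self, task_ids):
        # One input layer and one output layer per task, both keyed by task ID.
        self.inputs = {t: Scale(i + 1) for i, t in enumerate(task_ids)}
        self.shared = Scale(10)  # parameters shared by every task
        self.outputs = {t: Scale(1) for t in task_ids}

    def forward(self, features, task_id):
        # The task ID selects the task-specific layers around the shared part.
        hidden = self.inputs[task_id](features)
        hidden = self.shared(hidden)
        return self.outputs[task_id](hidden)


model = ToyMultiTaskModel(["en", "de"])  # tasks are languages, as in cwi18
en_out = model.forward([1, 2], "en")
de_out = model.forward([1, 2], "de")
```

Keying both dictionaries by task ID means the caller never has to remember which integer index belongs to which task.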
In src/run.py, during training, a task ID is drawn at random, then a batch of data for that task. That's not best practice I'd say, better to have a predefined schedule, but in any case you wanna know which task you're training on, so that you can tell that to model.forward().
Make sense?
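A small sketch of the two ways of picking the task per training step: drawing it at random, and following a predefined round-robin schedule. Round-robin is just one example of a predefined schedule, and the function names, the model_step callback and the batch iterators are all hypothetical.

```python
import itertools
import random


def random_schedule(task_ids, n_steps, seed=0):
    """Draw a task ID at random for every training step."""
    rng = random.Random(seed)
    return [rng.choice(task_ids) for _ in range(n_steps)]


def round_robin_schedule(task_ids, n_steps):
    """Predefined schedule: cycle through the tasks in a fixed order."""
    return list(itertools.islice(itertools.cycle(task_ids), n_steps))


def train(model_step, batches, schedule):
    """Run one step per schedule entry.

    batches: dict mapping task ID to an iterator over that task's batches.
    model_step: callable (batch, task_id) -> loss. The task ID is passed
    along so the model knows which task-specific layers to use.
    """
    history = []
    for task_id in schedule:
        batch = next(batches[task_id])
        history.append((task_id, model_step(batch, task_id)))
    return history


schedule = round_robin_schedule(["en", "de", "es"], 5)
batches = {t: itertools.cycle([[1, 2, 3]]) for t in ["en", "de", "es"]}
history = train(lambda batch, task_id: float(len(batch)), batches, schedule)
```

A fixed schedule makes runs reproducible and guarantees that every task is visited equally often, which a random draw only does on average.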
A bunch of questions have come up today from the train function: