I'm a postdoc at Trinity College Dublin in the School of Computer Science and Statistics. My research domain is Natural Language Processing.
About my professional activities, you can browse:
- My professional page.
- My publications on my page, on Google Scholar, on ORCID, on Semantic Scholar, on the ACL Anthology, or on HAL.
- A fair number of repositories, usually experiments linked to one of my papers.
About my non-professional activities:
- I contribute to DataScience StackExchange.
- I like TiddlyWiki and I wish I could spend more time doing stuff with it.
- I also have a few tools and some other stuff.
A not very well known but challenging subdomain of NLP.
- https://github.com/erwanm/CLGTextTools: Perl library containing functions to analyze text documents and especially extract text features.
- https://github.com/erwanm/clg-authorship-analytics: set of scripts and libraries to perform author-identification related tasks (Perl).
- https://github.com/erwanm/clg-authorship-experiments: a set of experiments with detailed documentation for clg-authorship-analytics (Perl + R).
- https://github.com/alfredomg/ADAPT-MWE17: participation in the VMWE17 Shared Task.
- https://github.com/erwanm/adapt-vmwe18: participation in the VMWE18 Shared Task.
See also related Shiny visualizations at https://brainmend.adaptcentre.ie/
- https://github.com/erwanm/tdc-tools: tools for representing and manipulating data in the Tabular Document-Concept (TDC) format. Used in my other LBD repos (Python).
- https://github.com/erwanm/medline-discoveries: a method for "mining impactful discoveries from the biomedical literature" (Python, R)
- https://github.com/erwanm/lbd-contrast: an experimental approach for LBD.
- https://github.com/erwanm/knowledgediscovery: modified fork to extract and apply LBD methods.
- https://github.com/erwanm/PowerGraph: a dependency of the above.
- https://github.com/erwanm/kd-data-tools: an ad-hoc concept disambiguation system for KD output (Medline and PMC).
- https://github.com/erwanm/elephant-wrapper: wrapper for the Elephant tokenizer, together with several experiments (LREC18 paper).
- https://github.com/erwanm/erw-ml-utils: ML-related scripts, especially for use with Weka.
- https://github.com/erwanm/TreeTaggerWrapper: a convenient wrapper for the venerable TreeTagger POS tagger.
- https://github.com/erwanm/quest: an abandoned fork of Quest (for MT Quality Estimation).
- https://github.com/erwanm/tw-aggregator: a system to automatically aggregate TiddlyWiki content from a collection of public wikis
- https://github.com/erwanm/TW-WhoAmIGame: a simple game meant to be customized with your own questions and answers.
- https://github.com/erwanm/tw-doc: in-house basic documentation generator which adds information extracted from code files to an existing TiddlyWiki file.
- https://github.com/erwanm/TiddlyWiki5: fork.
- https://github.com/erwanm/Projectify: fork; I have not started working on it.
- https://github.com/erwanm/TW5-TimeTodo: another fork that I haven't worked on.
- https://github.com/erwanm/encfs-util: scripts for linking EncFS (directory encryption) with pass (a command-line password manager) (Bash).
- https://github.com/erwanm/erw-setup: my config and a few scripts (Bash)
- https://github.com/erwanm/erw-tsv-commons: scripts to perform manipulations on TSV files (Perl)
- https://github.com/erwanm/erw-bash-commons: various useful Bash functions, including my "project management" system (Bash).
- https://github.com/erwanm/hugo-chalk: a modified template for Hugo (a framework for building a static website).
- https://github.com/erwanm/indie-coding: various pieces of code (currently about collecting open-license images) (Bash)
- https://github.com/erwanm/Poker-StatsSystem: old attempt at automatic statistics from poker hands, unfinished (Perl).
- https://github.com/erwanm/code-snippets: as the name suggests