meetup-topics's Introduction

a Text Mining and Search Project

Overview   |   References   |   Code   |   Presentation   |   Report   |   About us  

☍   Overview

This project implements a text classification pipeline for Meetup event descriptions. On the Meetup platform, organizers must manually tag every event so that the platform's recommendation system can suggest it to users based on their interests. In this context, a system that suggests labels to organizers based on how they described their event would be a useful tool. Text classification is a well-known task in the literature and involves a series of operations, from text preprocessing through to text representation. Our goal was to find the combination of preprocessing, text representation, and classifier that maximizes classification performance according to the chosen evaluation metrics.
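As a rough illustration of this kind of search, the sketch below compares two preprocessing variants while holding the representation and the classifier fixed. It is not the project's code: the toy documents, the stopword list, and all function names are invented for the example, and a bag-of-words multinomial naive Bayes model stands in for whichever representation and classifier the project actually selected.

```python
import math
import re
from collections import Counter, defaultdict

# Toy data, invented for illustration only (not taken from the Meetup dataset).
TRAIN = [
    ("Join us for a Python coding workshop on machine learning", "tech"),
    ("Hands-on coding night: build a web app with JavaScript", "tech"),
    ("Sunday morning hike and trail running in the hills", "outdoors"),
    ("Weekend camping trip and a hike to the lake", "outdoors"),
]
TEST = [
    ("A coding workshop for Python beginners", "tech"),
    ("Easy hike around the lake", "outdoors"),
]

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "and", "the", "for", "in", "on", "to", "us", "with"}


def tokenize(text, remove_stopwords=False):
    """Lowercase the text and split it into alphabetic tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


def train_naive_bayes(docs, **prep):
    """Collect per-label bag-of-words counts for multinomial naive Bayes."""
    word_counts = defaultdict(Counter)  # label -> token counts
    label_counts = Counter()            # label -> number of documents
    for text, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text, **prep))
    vocab = {t for counts in word_counts.values() for t in counts}
    return word_counts, label_counts, vocab


def predict(model, text, **prep):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    word_counts, label_counts, vocab = model
    n_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        total = sum(word_counts[label].values())
        score = math.log(label_counts[label] / n_docs)
        for token in tokenize(text, **prep):
            if token in vocab:  # skip words never seen during training
                smoothed = (word_counts[label][token] + 1) / (total + len(vocab))
                score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs, **prep):
    """Fraction of documents whose predicted label matches the true label."""
    hits = sum(predict(model, text, **prep) == label for text, label in docs)
    return hits / len(docs)


# Evaluate each preprocessing variant with the same representation and classifier.
results = {}
for remove_stopwords in (False, True):
    prep = {"remove_stopwords": remove_stopwords}
    model = train_naive_bayes(TRAIN, **prep)
    results[remove_stopwords] = accuracy(model, TEST, **prep)

best = max(results, key=results.get)
print(results, "best remove_stopwords =", best)
```

On a real dataset the same loop would range over more preprocessing options, several text representations, and several classifiers, with the score computed on held-out data rather than on a handful of toy sentences.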

☍   References

  • D. M. Blei, A. Y. Ng, and M. I. Jordan (2003). "Latent Dirichlet Allocation", Journal of Machine Learning Research, 3, 993-1022.
  • T. Mikolov, K. Chen, G. Corrado, and J. Dean (2013). "Efficient Estimation of Word Representations in Vector Space", ICLR 2013, 1-12.
  • Q. Le and T. Mikolov (2014). "Distributed Representations of Sentences and Documents", Proceedings of the 31st International Conference on Machine Learning, PMLR, 32(2), 1188-1196.
  • D. Xue and F. Li (2015). "Research of Text Categorization Model based on Random Forests", IEEE International Conference on Computational Intelligence & Communication Technology, 173-176.

☍   Code

All the code produced for the project is in the src folder and is described in the src README.

☍   Presentation

Slides are available here in PDF and PPTX formats.

☍   Report

Full report here.

☍   About us

⊜   Dario Bertazioli

  • Current Studies: Data Science Master Student at Università degli Studi di Milano-Bicocca;
  • Past Studies: Bachelor's degree in Physics at Università degli Studi di Milano.

⊜   Fabrizio D'Intinosante

  • Current Studies: Data Science Master Student at Università degli Studi di Milano-Bicocca;
  • Past Studies: Bachelor's degree in Economics and Statistics for Organizations at Università degli Studi di Torino.

⊜   Massimiliano Perletti

  • Current Studies: Data Science Master Student at Università degli Studi di Milano-Bicocca;
  • Past Studies: Bachelor's degree in Materials Engineering and Nanotechnology at Politecnico di Milano.
