This project forked from techlauncher-its-personal/personal-virtual-assistant-for-news-filtering

This project is developed by a group of students in the TechLauncher program by the Australian National University, based on the Stanford Open Virtual Assistant Platform's virtual assistant, Almond.

Personal Virtual Assistant for News Filtering

This project aims to use machine learning and natural language processing to create a personal assistant that filters news according to each user's personal preferences.

Client's Vision

The ISS (Institutional Shareholder Services) filters through 500k news articles daily to find controversial activities of public companies in the areas of environment, human/labour rights, and corruption. They then provide that information to their clients, large institutional investors around the globe. Currently, the ISS employs analysts equipped with different search and ML (Machine Learning) / NLP (Natural Language Processing) technologies to pre-filter news articles daily. In the future, the ISS would like to provide a text-based personal virtual assistant to their analysts and clients to help them more easily find articles of interest while navigating the news. The assistant would be able to learn about the preferences of each individual user by observing interaction patterns and asking clarifying questions.

Client

Our client is Marcel Neuhausler, from the ISS (Institutional Shareholder Services).

Client Expectations

  • In-depth analysis of the Stanford Open Virtual Assistant project.
  • Standalone deployment of an open source simple news filtering virtual assistant (Almond).
  • Collection of training data for two different news topics.

Project Impact

This project will provide a thorough examination of whether Almond, the open source virtual assistant from the Stanford Open Virtual Assistant project, can serve as a base for a text-based personal assistant that filters news articles. The finished product could also reduce the workload of the ISS analysts and could become a revenue stream if offered directly to ISS clients as a service.

Milestones, Scheduling, Deliverables

Milestones

  1. Set up a landing page for the project with links to the repository, project documents, and planning board.
  2. Create a presentation slide introducing the Stanford Open Virtual Assistant.
  3. Deploy a standalone version of Almond-Cloud.
  4. Implement a news filtering service under Almond-Devices/Skills.
  5. Implement a training service under Almond-Devices/Skills.
  6. Create a Jupyter Notebook for data preparation and NLP training.
  7. Create a collection of labeled training data for two different news topics.

Scheduling

We will hold three two-week sprints on the following schedule:

  1. Sprint 1: Week 3 to Week 4, focused on learning about Almond as well as data preparation.
  2. Sprint 2: Week 5 to Week 6, focused on standalone deployment and service implementation.
  3. Sprint 3: Teaching Break to Week 8, focused on finishing documentation, training, and UI/UX design.

Deliverables

  1. A presentation slide containing an introduction to the Stanford Open Virtual Assistant project.
  2. A standalone deployment of Almond-Cloud, with a news filtering and training service implemented under Almond-Devices/Skills.
  3. A Jupyter Notebook for data preparation and NLP training.
  4. A collection of labeled training data for two different news topics.
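
As a rough illustration of the labeled training data deliverable, the sketch below counts relevant and irrelevant labels per topic. The record shape (`topic`, `title`, `relevant`), the function name `countLabels`, and the sample rows are all assumptions made for this example; they are placeholders and do not come from the project's actual data set.

```javascript
// Assumed record shape: { topic, title, relevant }.
// This is a hypothetical format, not the project's actual schema.
function countLabels(records) {
  const counts = {};
  for (const { topic, relevant } of records) {
    if (!counts[topic]) {
      counts[topic] = { relevant: 0, irrelevant: 0 };
    }
    counts[topic][relevant ? 'relevant' : 'irrelevant'] += 1;
  }
  return counts;
}

// Placeholder rows for illustration only.
const sampleRecords = [
  { topic: 'sports', title: 'Placeholder sports headline', relevant: true },
  { topic: 'sports', title: 'Placeholder unrelated headline', relevant: false },
  { topic: 'tech', title: 'Placeholder tech headline', relevant: true },
];

console.log(countLabels(sampleRecords));
```

A per-topic count like this is one simple way to check that both topics have enough relevant and irrelevant examples before training.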

Constraints

  1. Our team works remotely. It is almost impossible for us to hold in-person meetings or events.
  2. The client is based in America, so their time zone differs greatly from those of the team members (Indonesia, China, Australia). As a result, the window for effective communication is very limited.
  3. The team has limited experience with data labelling and model training, as well as with the Almond system, and so needs a longer initial learning and setup period.

Risks

  1. The project raises privacy concerns, because we need to collect and analyze user information such as news article preferences. The team must take great care to obtain permission before using this data.
  2. Remote teams are more likely to be inefficient than co-located teams. In addition, this project requires understanding and applying advanced ML and NLP techniques. The team will need considerable time to complete it, and the time required for some difficult technical tasks may be hard to estimate.
  3. This project may run for two semesters, but team membership may change in the second semester. If work is not handed over properly between outgoing and incoming members, progress could be seriously affected.

Resources (Open Source)

Other Resources, Services and Repositories (Our Work)

Website Link

Standalone Almond Deployment

How to test:

  1. Log in as an anonymous user (Username: anonymous, Password: testtest).

  2. Go to the My Almond page.

  3. Select Enabled Skills and check if Simple News Filter is active or not.

    a. If it is not active, select Configure New Skill.

    b. In the next page, find and select Simple News Filter.

    c. Return to the My Almond page.

  4. To get the top 5 news articles for either the sports or the tech topic, enter one of the following commands:

        @org.itspersonal.newsfilter.news_article(topic = enum sports);

     OR

        @org.itspersonal.newsfilter.news_article(topic = enum tech);

  5. To add new training data for a specific topic, enter one of the following commands:

        @org.itspersonal.newsfilter.training_news_article(topic = enum sports);

     OR

        @org.itspersonal.newsfilter.training_news_article(topic = enum tech);

  6. When an article is returned, input "yes".

  7. Wait for the system to show the label options. Select "Yes" if the article is relevant or "No" if the article is irrelevant to the topic.
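
The four commands in the steps above follow one pattern: the skill identifier, a function name, and a topic. A minimal sketch that generates them is shown below. The helper name `buildCommand` and the constant names are hypothetical and are not part of Almond or this repository; only the resulting command strings are taken from the steps above.

```javascript
// Hypothetical helper; only the command strings themselves come from the test steps.
const SKILL_ID = 'org.itspersonal.newsfilter';
const FUNCTIONS = ['news_article', 'training_news_article'];
const TOPICS = ['sports', 'tech'];

function buildCommand(functionName, topic) {
  if (!FUNCTIONS.includes(functionName)) {
    throw new Error(`Unknown function: ${functionName}`);
  }
  if (!TOPICS.includes(topic)) {
    throw new Error(`Unknown topic: ${topic}`);
  }
  return `@${SKILL_ID}.${functionName}(topic = enum ${topic});`;
}

// Print every command that can be entered during testing.
for (const functionName of FUNCTIONS) {
  for (const topic of TOPICS) {
    console.log(buildCommand(functionName, topic));
  }
}
```

Rejecting unknown function names and topics up front keeps a typo from being sent to the assistant as a malformed command.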

Tooling

| Task | Tool |
| --- | --- |
| Repository | GitHub |
| Communication | Zoom, Slack, Outlook |
| Documentation | Google Drive |
| UI/UX Design | Adobe Illustrator, Adobe Photoshop |
| Development Environment | Visual Studio Code, Notepad++ |
| Deployment | Docker |
| Data Science / ML Environment | Jupyter |
| Planning | Trello |

Statement of Work

The statement of work is provided as a PDF.

Tutorial Time and Tutor

Tutorial Time: Friday, 8-10 AM

Tutor: Manik Mahajan

Project Members

| Member | Role |
| --- | --- |
| Anggrio Wildanhadi Sutopo | Team Leader, Spokesperson, Git Master |
| Junjie Zou | Project Manager, Deputy Spokesperson |
| Zhihao Ye | ML/NLP Developer |
| Mingjie Shi | ML/NLP Developer |
| Yanan Wu | UI/UX Developer |

Weekly Meeting Schedule

| Day | Time (AEDT) | Detail |
| --- | --- | --- |
| Monday | 4-6 PM | Team Meeting |
| Thursday | 12-1 PM | Client Meeting |
| Thursday | 2-4 PM | Team Meeting |
