BI TECH CP303 - Data Mining

  • Instructor: Erin Shellman, [email protected]
  • Teaching Assistant: Ryan Brush, [email protected]
  • Course Location: Puget Sound Plaza, room 407
  • Course Time: Mondays 6:00 - 9:00 PM
  • Dates: April 6, 2015 through June 15, 2015

Welcome to data mining! This is an applied course meant to teach you practical tools for data mining and knowledge discovery. The course is composed of four units: regression, classification, unsupervised learning, and time series analysis. The goal is to provide experience in a breadth of applications and to prepare you for the job of an analyst, data scientist, or any role that calls for data mining. If you already have experience with R or data mining, the projects include additional readings and challenges to stretch your skills.

Course Grade

Grading is based on classroom participation, completion of homework and projects, and attendance. Students must attend at least 80% of the lectures to receive a passing grade.

Assignments

There are a total of four projects, one for each topic area. For each project, you will receive a business problem and a corresponding data set. You're free to use any methods you like, so long as you support your choices. You will write a brief report of your analyses and exchange feedback with your classmates.

Critiques

When you turn in your project reports, you will receive the reports of three of your classmates. During the following week, read their reports and provide thoughts and feedback. Please write at least a paragraph discussing the parts of each analysis you liked and disliked. While you're reading, try to put yourself into the mind of the business stakeholder and ask whether your requests were adequately met. Are you confident in the conclusions drawn? Were the figures and supporting evidence compelling? Remember to maintain a tone of mutual respect; see the section on policies and values for more information.

Due Dates

| Assignment | Date |
|---|---|
| Project 1 | May 4 |
| Project 1 Critiques | May 11 |
| Project 2 | May 18 |
| Project 2 Critiques | June 1 |
| Project 3 | June 8 |
| Project 3 Critiques | June 15 |

Textbook

There is no required textbook for this course. Everything you need to succeed is available in the course repository.

Policies and values

A large component of this course is providing critical feedback on the analyses of your peers. It is imperative that all students are thoughtful when providing written feedback and participating in class. This means using respectful language in discussions and writings, but also being respectful of class time by arriving prepared and engaged.

Everyone is required to do original work for all projects. You're free to openly discuss the projects and your approaches, just like you would in a professional setting, but reports should be your own.

Topics

The lecture notes are available in the course repository.

| Week | Date | Topic | Dataset |
|---|---|---|---|
| 1 | April 6 | Introduction to data mining and programming with R | Capital Bikeshare: usage_2012.tsv, stations.tsv, daily_weather.tsv |
| 2 | April 13 | Linear regression | Capital Bikeshare |
| 3 | April 20 | Linear regression extensions | Capital Bikeshare |
| 4 | April 27 | Logistic regression | Twitter user data: bot_or_not.tsv |
| 5 | May 4 | Trees | Twitter user data |
| 6 | May 18 | Association rules | Amazon product recommendations: people_who_bought.txt, product_catalog.tsv |
| 7 | June 1 | Clustering | Amazon |
| 9 | June 8 | Sharing your work | None! |
| 10 | June 15 | Guest panel | None! |

Software Installation

We'll be using the statistical programming language R for this course. In addition, I highly recommend that you use RStudio, a powerful integrated development environment (IDE) for R. If you plan to use your own laptop in class, please install R and RStudio before the first day of class. The computers in the classroom will have everything you need installed.

  1. Download and install R
  2. Download and install RStudio

Office Hours

Office hours will be held on Sundays from 2:00 to 4:00 PM at Caffe Zingaro in Lower Queen Anne.
