- Graduate Business, Leavey School of Business
- Course MSIS 2641: Big Data Modeling & Analytics
- Big-Data-MapReduce Course @ Santa Clara University
- Class Meeting dates: 04/03/2018 - 06/08/2018
- Class hours: Class Number 68044: TTh 5:45PM - 7:00PM PST
- Class hours: Class Number 68041: TTh 7:35PM - 8:50PM PST
- Class room: Lucas Hall 310
- Office: 321 T, Lucas Hall
- 1. A Very Brief Introduction to MapReduce by Diana MacLean
- 2. Introduction to MapReduce by Mahmoud Parsian
- 3. Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer
- 4. Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, Jeffrey D. Ullman
- Midterm Exam: TBD, 5:45PM - 7:00PM PST
- Midterm Exam: TBD, 7:35PM - 8:50PM PST
- Class hours: TTh 5:45PM - 7:00PM PST
- Final Exam Date: Tuesday, June 12, 2018
- Final Exam Time: 5:45PM - 7:00PM PST
- Class hours: TTh 7:35PM - 8:50PM PST
- Final Exam Date: Thursday, June 14, 2018
- Final Exam Time: 5:45PM - 7:00PM PST
The main focus of this class is to cover the following concepts:
- Concepts of Big Data
- Distributed File Systems
- Distributed Computing
- Distributed and Parallel Algorithms
- MapReduce Paradigm
- MapReduce Algorithms
- Scale-out Architectures (using Hadoop, Spark, PySpark)
- Apache Spark: http://spark.apache.org/
- Use Spark, PySpark, Hadoop, and Java to teach MapReduce and distributed computing
- How to query NoSQL data with SQL
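
As a small illustration of the MapReduce paradigm listed above, the following is a minimal single-machine sketch of word count in plain Python. It is not course material and does not use Hadoop or Spark; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only to mirror the three conceptual stages. A real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key (done by the framework in practice)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big ideas", "data at scale"]
    print(word_count(sample))
```

The same three-stage shape (emit key-value pairs, group by key, aggregate per key) carries over to Hadoop and Spark, where the grouping step is handled by the framework rather than by user code.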
Data Algorithms: Recipes for Scaling up with Hadoop and Spark