Getting Started with Data Science - Introduction

Congratulations on making it this far! Now that you have mastered the fundamentals of programming with Python, descriptive statistics, and data visualization, we're going to start digging into the process of "doing data science".

Lesson Priorities

Work through the lessons in the topic sequentially, unless you are short on time. If you are short on time, be sure to finish the 1st priority items. Keep in mind that 1st priority topics are designated as such because they are essential for your progression through the course. With regard to your growth as a data scientist, 2nd priority topics may be just as important as 1st priority ones. For example, PEP8 is 2nd priority, but it is critical that you develop your ability to write PEP8-adherent code throughout the course.

1st Priority

  • The Data Science Process
  • Problems Data Science Can Solve
  • Setting up a Professional Data Science Environment - Introduction
  • Setting up a Professional Data Science Environment - MacOS Installation
  • Setting up a Professional Data Science Environment - Windows Installation
  • Setting up a Professional Data Science Environment - Configuring Git and Anaconda

Be sure to have your data science environment set up correctly within the first few days of the program. You may be able to get by without doing so for a little while, but eventually an incorrect setup will grow into significant headaches. If you are unable to complete the lessons above that apply to your operating system, reach out to your instructor for help troubleshooting.

2nd Priority

  • Data Privacy and Data Ethics
  • PEP8

Although these are 2nd priority, ethical treatment of data and comfort with PEP8 guidelines are essential skills.
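To give a rough sense of what "PEP8-adherent" means in practice, here is a minimal sketch that contrasts two versions of the same function. Both function names are hypothetical and exist only for illustration; the PEP8 lesson itself covers the guidelines in full.

```python
# Harder to read: CapWords function name, single-letter argument,
# no spaces around operators, and the body crammed onto the def line.
def CalcAvg(l):return sum(l)/len(l)


# Closer to PEP8: snake_case name, descriptive argument, spaces around
# operators, a docstring, and the body on its own indented line.
def calculate_mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


print(calculate_mean([1, 2, 3]))
```

Both versions compute the same result; the difference is only in how easy the code is for another person to read.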

Appendix

Once you have finished reading about data science fundamentals and set up your tools, move on to the Appendix. There you will find a lab and a lesson that walk you through using the terminal to launch a Jupyter Notebook on a local server. You will be able to complete the Jupyter content in Canvas via Illumidesk. However, when you move on to projects, you will need to be able to work locally. If you don't get to the Appendix material by the end of the day, make a note to review it before the beginning of a project week.

Data Science Fundamentals

In the first half of this section, we will introduce a lot of new ideas about what we mean by "data science". What is the process? What kinds of problems can data science solve?

We will also go over some key professional concerns of data scientists, including following code best practices and being ethical in our use of data.

Professional Data Science Environment Setup

So far, all of your lessons have been completed in a cloud environment that "just works". You open a lesson and are immediately able to run through your own copy of the code without worrying about where the code came from, how it is stored, whether you have the appropriate software downloaded, etc.

This is very convenient for educational purposes, but it is not very representative of a real-world data science environment. So, in the second half of this section, we show you how to set up all of the tools so that your computer has a professional data science environment!

The tools we cover in this section include:

  • Python
  • Jupyter Notebook
  • Anaconda
  • Git
  • GitHub

You have actually already been using all of these tools "under the hood", but these lessons will walk through what they are all used for and how to install and use them on your computer.
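Once you have worked through the installation lessons, a quick check like the sketch below can tell you which of these tools your computer can find. The command names (`python`, `jupyter`, `conda`, `git`) are assumptions about a typical installation and may differ on your machine, and `find_tools` is a hypothetical helper written for this example. GitHub is a website rather than a program, so it is not part of the check.

```python
import shutil

# Assumed command-line names for each tool; adjust if your setup differs.
TOOLS = {
    "Python": "python",
    "Jupyter Notebook": "jupyter",
    "Anaconda": "conda",
    "Git": "git",
}


def find_tools(tools=TOOLS):
    """Map each tool name to the path of its executable, or None if not found."""
    return {name: shutil.which(command) for name, command in tools.items()}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path if path else 'not found on PATH'}")
```

A "not found" result does not necessarily mean the tool is missing; it may be installed somewhere your terminal does not search. If that happens, revisit the installation lesson for your operating system or ask your instructor.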

Summary

Remember, it's okay to feel a little uncomfortable. We are going to throw a lot of new concepts at you, and some of them won't fully make sense until much further down the line. You'll continue to practice them day after day, until they become second nature!

