Hello there 👨‍🔬

I'm a student from Czechia who is passionate about math, statistics, and data science. This repo holds some of my work and shows how I approach data analysis and coding in general. If you wish to learn more about me, feel free to visit my website.

Data Analyst Associate certification 🐕‍

  • Tool: RMarkdown
  • Packages: readr, dplyr, forcats, skimr, ggplot2, glue, stringr, tidytext
  • Output: Written analysis

Pet Box Subscription analysis is a descriptive analysis of a pet store, done for my Data Analyst Associate certification. It aims to identify pet owners who are likely to buy products every month (food, toys, medical supplies...). The data is read with readr and wrangled with dplyr. As most characteristics are factors, I relied heavily on forcats to simplify my work. The data is summarized with skimr and visualized with ggplot2. When working with text, I used glue for string interpolation and stringr for text manipulation. For more advanced graphs, I used tidytext's functions for reordering values within facets.

My final submission consisted of a written report for Data Scientists at DataCamp, who reviewed my proposal and confirmed that the analysis meets current industry standards. You can view it in my DataCamp workspace.

Professional Data Analyst certification 💸

  • Tool: RMarkdown
  • Packages: dplyr, tidyr, ggplot2, patchwork, gtsummary
  • Output: Oral presentation with PowerPoint slides

I earned my second Data Analyst certificate with an analysis of a made-up insurance company. The analysis mainly aims to identify which customers buy insurance and what their characteristics are. Coding and data interpretation are done in R Markdown. The data is wrangled and transformed using dplyr and tidyr. Data visualization is put together using ggplot2 and patchwork. The final tables are formatted with gtsummary.

The analysis was presented orally to Data Scientists from DataCamp, who reviewed my presentation and verbal communication. My video presentation is not available; however, the PowerPoint presentation can be downloaded from my GitHub repo.

Data Scientist Associate certification 🧘🏽‍♀️

  • Tool: DataCamp Notebook
  • Packages: readr, dplyr, glue, ggplot2, tidymodels
  • Output: Written submission

To receive the Data Scientist Associate certification, I created a report that first reads (readr) and wrangles (dplyr, glue) data about a made-up fitness center. After the given domain restrictions are validated and applied, the data is explored using ggplot2. To predict the number of people in a fitness class, I used various packages from the tidymodels family.

The first model uses ridge regression from the glmnet package. Alpha was chosen using 10-fold cross-validation. The second model uses a random forest to predict the number of customers. Its hyperparameters were tuned with tune(), again using 10-fold cross-validation. The final submission can be seen in my DataCamp workspace.
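As a rough illustration of how 10-fold cross-validation can pick a ridge penalty, here is a minimal sketch. It is not the code from the report, which uses tidymodels and glmnet in R; it is written in plain Python, assumes a single predictor to keep the closed-form fit short, and every function name in it (`ridge_fit`, `k_fold_indices`, `cv_rmse`, `pick_penalty`) is hypothetical. The data in the demo is synthetic.

```python
import random


def ridge_fit(xs, ys, penalty):
    """Fit y = a + b*x with an L2 penalty on the slope b (intercept unpenalized)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + penalty)  # larger penalty shrinks the slope toward 0
    return y_mean - slope * x_mean, slope


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds (assumes n >= k)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_rmse(xs, ys, penalty, k=10, seed=0):
    """Average held-out RMSE over k folds for one penalty value."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = ridge_fit([xs[i] for i in train], [ys[i] for i in train], penalty)
        mse = sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out) / len(held_out)
        scores.append(mse ** 0.5)
    return sum(scores) / k


def pick_penalty(xs, ys, grid, k=10, seed=0):
    """Return the penalty from the grid with the lowest cross-validated RMSE."""
    return min(grid, key=lambda p: cv_rmse(xs, ys, p, k, seed))


if __name__ == "__main__":
    # Synthetic data for the demo only: a noisy straight line.
    rng = random.Random(1)
    xs = [rng.uniform(0, 10) for _ in range(100)]
    ys = [3.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
    print(pick_penalty(xs, ys, [0.0, 0.1, 1.0, 10.0, 100.0]))
```

The same folds are reused for every candidate penalty (fixed `seed`), so the candidates are compared on identical splits. A random forest would be tuned the same way, with a grid over its own hyperparameters in place of the penalty.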

Michal Lauer's Projects

distr6

R6 object-oriented interface for probability distributions.

dt

R interface to the jQuery plug-in DataTables.

dtreprex

Repository that contains a reprex for a DT issue.

msmtprohlizec

This repository is for a Shiny application developed for the Ministry of Education, Youth and Sports of the Czech Republic.

r-template

Private general project template for data analysis.

vse-timetable-now

Chrome browser extension that updates the InSIS VŠE timetable based on the current date.
