Name: Mauricio Fernando Bautista Lopez
Type: User
Company: Kellogg Company
Bio: I'm a data engineer. I have worked with Scrum and use tools like Azure and AWS. I really love working with data, and my favorite languages are Python and R.
Location: State of Mexico, Mexico
Blog: [email protected]
Mauricio Fernando Bautista Lopez's Projects
First Platzi Master technical challenge: a Flask application that evaluates an input
Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University
A full pipeline developed for movie analytics with AWS, Apache Airflow, and PySpark
Scripts from the computational thinking course on Platzi
Repository used for the Hadoop course on Platzi
Repo for the Python course, using Python 3
Introductory Spark course by Platzi 💚
Dagster crash course https://dagster.io/blog/dagster-crash-course-oct-2022
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Roadmap to becoming a data engineer in 2021
Data Engineering Bootcamp 2021
# data_engeniering_python

## Description

This is a data engineering pipeline prototype that extracts data from news sites, then transforms and aggregates the different sources, and finally loads the data into a database.

## Data Sources

The project consumes different news sites; at the moment it scrapes:

- https://elpais.com
- http://www.eluniversal.com.mx

## Development

The parameters needed for configuration are in the file `config.yaml`, which contains:

* **news_sites:**
  * sitename:
    * url:
    * queries:
      * homepage_article_links:
      * article_body:
      * article_title:

### Requirements and Installation

Directory and file structure:

```
LH4_AMPPS_DASH/
|---extract/
|   |---common.py
|   |---config.yaml
|   |---main.py
|   |---news_page_objects.py
|---transform/
|   |---main.py
|---load/
|   |---article.py
|   |---base.py
|   |---main.py
|---.gitignore
|---README.md
|---newspaper.db
|---pipeline.py
```

It requires Python 3.6 or higher, so check your Python version first. [requirements.txt](requirements.txt) lists all the Python libraries the pipeline depends on; install them with:

`pip install -r requirements.txt`

To start scraping the sites, execute the [pipeline.py](pipeline.py) file:

`python pipeline.py`

This will run the ETL process and write the output to the specified output location.
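The extract → transform → load flow the README describes can be sketched in a few lines. This is a minimal illustration only: the function bodies, the in-memory stand-ins for scraping, and the `load` return value are hypothetical, not the project's actual code, which scrapes the sites with the queries in `config.yaml` and writes to `newspaper.db`.

```python
# Hypothetical sketch of the extract -> transform -> load stages.
def extract(sites):
    # The real pipeline scrapes each site's homepage for article links;
    # here we fabricate one placeholder article per site.
    return [{"site": s, "title": f"article from {s}", "body": "..."} for s in sites]

def transform(articles):
    # Clean and aggregate the different sources into a uniform shape.
    return [{**a, "title": a["title"].strip().title()} for a in articles]

def load(articles, db_path="newspaper.db"):
    # The real pipeline inserts rows into a SQLite database;
    # this stand-in just reports how many rows would be written.
    return len(articles)

sites = ["elpais.com", "eluniversal.com.mx"]
rows = load(transform(extract(sites)))
print(rows)  # → 2
```

Keeping each stage a pure function of its input, as `pipeline.py` chains them, makes each step easy to test in isolation.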
The Athena adapter plugin for dbt (https://getdbt.com)
Project from the course "Django: creando 3 webs"
Build a Docker image with GitHub actions
The idea of this repository is to understand the fundamentals of Git
Solution to the Platzi Master future-value logic challenge
Example of how to create an ETL process orchestrated with Apache Airflow
Python MLP for face recognition. This is a personal project to understand how the multilayer perceptron works, applied to the computer vision field
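A forward pass of a multilayer perceptron, the model this project studies, can be sketched in pure Python. The layer sizes, random weights, and the 4-value input standing in for pixels are arbitrary illustrative choices, not the project's actual configuration:

```python
import math
import random

random.seed(0)

def relu(x):
    # Elementwise ReLU activation.
    return [max(0.0, v) for v in x]

def dense(x, W, b):
    # Fully connected layer: one dot product per output unit, plus bias.
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

# Hypothetical tiny MLP: 4 input "pixels" -> 3 hidden units -> 2 classes.
x = [0.2, 0.8, 0.5, 0.1]
W1 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
b1 = [0.0] * 3
W2 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
b2 = [0.0] * 2

hidden = relu(dense(x, W1, b1))
logits = dense(hidden, W2, b2)

# Softmax turns the logits into class probabilities.
exps = [math.exp(v) for v in logits]
total = sum(exps)
probs = [v / total for v in exps]
print(len(probs))  # → 2
```

For real face recognition the input would be a flattened image and the layers far wider, but the per-layer computation is exactly this dot-product-plus-activation step.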
This is from the FastAPI course
Airflow fundamentals
Git and GitHub course with Freddy Vega - Platzi - 2019