Dogukan Ulu's Projects
Produce Kafka messages, consume them, and load them into Cassandra and MongoDB.
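A minimal sketch of the producer side of such a pipeline, assuming a JSON message schema and the `kafka-python` package; the topic name, field names, and broker address are hypothetical placeholders, not taken from the repo:

```python
import json
from datetime import datetime, timezone

def build_message(record: dict) -> bytes:
    """Serialize a record as JSON, stamped with an ingestion time (hypothetical schema)."""
    enriched = {**record, "ingested_at": datetime.now(timezone.utc).isoformat()}
    return json.dumps(enriched).encode("utf-8")

def produce(records, topic="events", bootstrap="localhost:9092"):
    """Send each record to Kafka. Requires a running broker and kafka-python."""
    from kafka import KafkaProducer  # imported here so build_message stays broker-free
    producer = KafkaProducer(bootstrap_servers=bootstrap)
    for record in records:
        producer.send(topic, build_message(record))
    producer.flush()
```

A consumer on the other end would deserialize the same JSON before inserting into Cassandra or MongoDB, so keeping serialization in one small function makes the two sides easy to keep in sync.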
Create a Kafka topic, stream data through a producer, and consume it on the console using Amazon MSK.
An AWS Data Engineering End-to-End Project (Glue, Lambda, Kinesis, Redshift, QuickSight, Athena, EC2, S3)
Get crypto data from an API and stream it to Kafka with Airflow. Write the data to MySQL and visualize it with Metabase.
Write a CSV file to Postgres, read the table back, and modify it. Write more tables to Postgres with Airflow.
Write a CSV file to Amazon Kinesis Data Streams.
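One way to sketch this, assuming rows are sent as JSON via the boto3 `put_records` API (which accepts at most 500 records per call); the stream name and partition field here are illustrative, not from the repo:

```python
import csv
import io
import json

def to_kinesis_records(csv_text: str, partition_field: str) -> list:
    """Turn CSV rows into PutRecords entries (Data + PartitionKey)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"Data": json.dumps(row).encode("utf-8"), "PartitionKey": row[partition_field]}
        for row in reader
    ]

def chunk(records: list, size: int = 500) -> list:
    """Split records into batches; PutRecords allows at most 500 per call."""
    return [records[i:i + size] for i in range(0, len(records), size)]

def send(stream_name: str, csv_text: str, partition_field: str) -> None:
    """Push all batches to Kinesis. Requires AWS credentials."""
    import boto3  # imported here so the pure helpers above need no AWS setup
    client = boto3.client("kinesis")
    for batch in chunk(to_kinesis_records(csv_text, partition_field)):
        client.put_records(StreamName=stream_name, Records=batch)
```

Separating the row-to-record conversion from the API call keeps the batching logic testable without touching AWS.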
Generate data from an existing dataset into a file, or produce the dataset's rows as Kafka messages in a streaming manner.
This repo contains datasets used in trainings.
Apache Airflow running on Docker.
My personal repo
A Glue ETL job or EMR Spark job that reads data from the Glue Data Catalog, modifies it, and writes it back to S3 and the Data Catalog.
Capstone project for the IBM Data Science Professional Certificate.
Get data from an API with a scheduled Airflow script, send it to Kafka, consume it with Spark, then write it to Cassandra.
ML algorithms for various Kaggle competitions.
Regularly fetch Parquet files from a public GCS bucket and write them to a BigQuery table.
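A minimal sketch of such a load, assuming the `google-cloud-bigquery` client and a wildcard URI over the bucket prefix; the bucket, prefix, and table names are placeholders, not the project's actual identifiers:

```python
def source_uri(bucket: str, prefix: str) -> str:
    """Build a wildcard GCS URI matching the Parquet files under a prefix."""
    return f"gs://{bucket}/{prefix.rstrip('/')}/*.parquet"

def load_to_bigquery(bucket: str, prefix: str, table_id: str) -> None:
    """Load all matching Parquet files into a BigQuery table. Requires GCP credentials."""
    from google.cloud import bigquery  # imported here so source_uri needs no GCP setup
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.PARQUET,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,  # append on each run
    )
    job = client.load_table_from_uri(source_uri(bucket, prefix), table_id,
                                     job_config=job_config)
    job.result()  # block until the load job finishes
```

Scheduling this function (e.g. from an orchestrator) gives the "regularly fetch" behavior; Parquet carries its own schema, so no explicit schema definition is needed for the load.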
Create sample Prefect flows, deploy them as Docker containers, and store them on GitHub.
Upload remote data to Amazon S3, then read it and load it into Amazon RDS MySQL.
Automatically send a dataframe to S3, trigger a Lambda that modifies it, and upload the result to RDS.
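The Lambda side of that trigger can be sketched as below, assuming an S3 put-event trigger and pandas available in the function's runtime; the transformation and the RDS write are placeholders, not the repo's actual logic:

```python
from urllib.parse import unquote_plus

def object_keys(event: dict) -> list:
    """Extract (bucket, key) pairs from an S3 event; keys arrive URL-encoded."""
    return [
        (r["s3"]["bucket"]["name"], unquote_plus(r["s3"]["object"]["key"]))
        for r in event.get("Records", [])
    ]

def handler(event, context):
    """Lambda entry point: read each new CSV from S3 and modify it."""
    import boto3           # available in the Lambda runtime
    import pandas as pd    # assumed via a layer or container image
    s3 = boto3.client("s3")
    for bucket, key in object_keys(event):
        obj = s3.get_object(Bucket=bucket, Key=key)
        df = pd.read_csv(obj["Body"])
        df["processed"] = True  # placeholder transformation
        # write df to RDS here, e.g. df.to_sql(...) over a SQLAlchemy connection
```

Keeping the event parsing in its own function makes it easy to unit-test the handler's plumbing without AWS access.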
Automate sending remote data to AWS services such as Kinesis and S3.
Get streaming data from an S3 bucket via an SQS queue, load it into Snowflake with Snowpipe, and modify it with a Snowflake task.
Create streaming data, send it to Kafka, transform it with PySpark, and load it into Elasticsearch and MinIO.
Get data via the Twitter API, orchestrate it with Airflow, and store it in an S3 bucket.