
hi! 👋

🤓 facts about me:

  • passionate about technologies that change people's lives
  • coding since I was 15 years old
  • 11+ years of programming, 6+ with Python, 6+ in data
  • my main interests are in open-source, self-service data platforms, MLOps, DDD, and TDD

💼 I have many years of experience building data platforms and dev tools in modern tech organizations. From small startups (< 50 people) to 10k+ corporations, I know how to operate at different growth stages.

📚 I studied at the Federal University of São Paulo (UNIFESP), where I earned a bachelor's degree in Science and Technology and another in Computer Science. I also recently finished a Master of Science in intelligent systems. Over the last 4 years my research focused on automated anomaly detection for data quality, exploring novel AutoML and Metrics Repository architectures. UNIFESP is one of the most prestigious universities in Brazil and is fully funded by the Brazilian government. While I was there, it was ranked in the top 5 in all of LATAM by Times Higher Education.

If you're passionate about data quality like me, you'll definitely like my publication, which can be read here.

tech I've worked with:

  • languages: Python (main language), Shell, SQL
  • dev/data ops:
      • Git, GitHub Actions, Drone CI/CD, CircleCI
      • Docker, Kubernetes, Helm
      • Datadog
  • data oss:
      • dbt, Apache Spark (PySpark), Databricks, Airflow, Airbyte
      • PostgreSQL, Cassandra, MySQL, MongoDB, DynamoDB, Redis
      • Kafka, NATS.io
  • cloud:
      • AWS: S3, EMR, ECR, Athena, RDS, Redshift, Glue, Lambda, SNS, SQS, EC2
      • GCP: Composer, Cloud Storage, BigQuery, Datastore, Cloud Run, Compute Engine, Kubernetes Engine, Artifact Registry
  • OS: Linux, macOS
  • 🐍 libs I ❤️:
      • aiohttp, flask, fastapi, pydantic, click, typer, sqlalchemy, scrapy, beautifulsoup, desert, marshmallow, pydeequ, awswrangler
      • Test and Quality: pytest, mypy, flake8, isort, black
      • Data Science: scikit-learn, keras, tensorflow, prophet, neuralprophet, merlion, jupyter, pandas, numpy, matplotlib, seaborn, streamlit

my open source work 🤘

I'm the creator of the following PyPI packages:

  • biar: batteries-included async requests tool for Python.
  • thoth: Python tool for profiling-based anomaly monitoring on ETL data pipelines, leveraging ML and Apache Spark.

I'm also the co-creator of butterfree, a tool for feature engineering and feature stores. We created this tool when I was in the first MLOps squad at @quintoandar. It's used for most ML data pipelines there and has 260+ stars on GitHub.

other contributions

I've also made minor contributions to the following awesome open-source libraries:

  • aws-sdk-pandas: easy data integration with AWS services.
  • merlion: a time series forecasting library for Python.

my projects

I have a bunch of data engineering test cases that landed me Senior positions at competitive tech companies. So before asking me to do a take-home assignment, please check these instead:

  • strider-challenge: a simple typer and sqlmodel application developed with DDD and TDD.
  • pyspark-pipeline: an implementation of a PySpark data aggregation pipeline with automated tests.
  • legiti-challenge: a project solution for building and running feature store pipelines.
  • meli-challenge: a solution for the character interactions problem using graphs and Spark.

Here's an archive of old college projects (don't judge me 😅):

  • ntsa: repository for code, reports, and projects from the Nonlinear Time Series Analysis class of the Computer Science master's program at the Federal University of São Paulo (UNIFESP).
  • neural-networs: repository for the projects of the 2019 Neural Networks class at the National Institute for Space Research (INPE).
  • software-testing: repository for the projects of the 2020 Software Testing class at the Federal University of São Paulo (UNIFESP).

let's connect!

LinkedIn

@rafaelleinio on Discord
