
pyspark's Introduction

Hits GitHub stars Maintenance MIT license



Experience

🔥 Product-focused Data Scientist 🔥

  • Ad Tech (current)
  • Gaming
  • Non-Profit

I have worked at companies ranging from small to large-scale, so I enjoy getting my hands dirty and solving complex data-driven problems. My passion lies in enabling product growth through data and statistical theory.

  • End-to-end design of company-wide product success measurement through continuous hypothesis validation on user behavior
  • Production-level software and data tooling development, including data pipelines, object-oriented design and refactoring, integration testing, and operational maintenance
  • Experience handling petabyte-scale data with tools such as Spark
  • Familiarity with ad-tech domains: real-time bidding, incrementality testing, brand lift, A/B experimentation, etc.

Skills

  • Programming - Python, R, SQL (Snowflake, MySQL, PostgreSQL), Shell, HTML, CSS
  • Cloud Service - AWS, Docker, Kubernetes, Jenkins, Spark
  • Visualization - Tableau, Dash (Python), Power BI, Excel
  • Web - Heap, Google Analytics
  • Version / Collaboration - Git, Wiki, JIRA, Confluence

Featured Repo

Main

| Repo | Description | Link |
| --- | --- | --- |
| R Projects | R portfolio in .R or .Rmd files, organized by business topic (e.g. ML, DL, text mining, time series) | Link |
| Python Projects | Python portfolio, mostly in Jupyter Notebooks, organized by ML/DL framework (e.g. PyTorch, TensorFlow, Fast.ai) | Link |
| Medium Blog | Tech blog on Medium covering data and ML | Link |

Data Science (Tools & Knowledge)

| Repo | Description | Link |
| --- | --- | --- |
| AWS SageMaker in Production | End-to-end curated examples that show how to solve business problems using Amazon SageMaker and its ML/DL algorithms. Mostly in Jupyter Notebooks for easy accessibility | Link |
| PySpark | PySpark functions and utilities with real-world data examples. Can be used to build a complete ETL process for data modeling | Link |
| Recommendation System | Production-level implementations of recommender systems in PyTorch. Clone the repo and run 'main.py' to start training | Link |
| Natural Language Processing (NLP) Examples | Full implementation examples of several natural language processing methods in Python, ordered by the author's own sense of complexity | Link |

Kaggle

| Repo | Description | Link |
| --- | --- | --- |
| Bengali.AI Handwritten Grapheme Classification | Given an image of a handwritten Bengali grapheme, classify its three constituent elements | Link |


pyspark's People

Contributors

hyunjoonbok


pyspark's Issues

Host exception when reading a Hive table from PySpark: the list of tables is shown, but data cannot be read from the Hive table

            answer = self.gateway_client.send_command(command)
    -> 1321 return_value = get_return_value(
       1322     answer, self.gateway_client, self.target_id, self.name)
       1323

    /ui/jupyterhub_data/anaconda3/lib/python3.8/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
        194 # Hide where the exception came from that shows a non-Pythonic
        195 # JVM exception message.
    --> 196 raise converted from None
        197 else:
        198     raise

IllegalArgumentException: java.net.UnknownHostException: bdacdh-ns
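One possible reading of this error, offered as an assumption and not a confirmed diagnosis: `bdacdh-ns` looks like an HDFS high-availability nameservice ID, not a real hostname. Listing tables only needs the Hive metastore, while reading data has to resolve the table location on HDFS, which fails if the client has no definition for that nameservice. The sketch below builds the standard HDFS HA client properties as `spark.hadoop.*` options. The NameNode IDs (`nn1`, `nn2`), hostnames, and port are hypothetical placeholders and should be taken from the cluster's own `hdfs-site.xml`; the helper names `ha_nameservice_conf` and `build_session` are illustrative only.

```python
from typing import Dict


def ha_nameservice_conf(nameservice: str, namenodes: Dict[str, str]) -> Dict[str, str]:
    """Build Spark options that describe an HDFS HA nameservice to the client.

    `namenodes` maps a NameNode id (e.g. "nn1") to its "host:port" RPC address.
    """
    if not namenodes:
        raise ValueError("at least one NameNode address is required")
    # Spark copies every spark.hadoop.* option into the Hadoop Configuration.
    prefix = "spark.hadoop."
    conf = {
        prefix + "dfs.nameservices": nameservice,
        prefix + f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        prefix + f"dfs.client.failover.proxy.provider.{nameservice}":
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    }
    for nn_id, address in namenodes.items():
        conf[prefix + f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = address
    return conf


def build_session(conf: Dict[str, str]):
    """Create a Hive-enabled SparkSession with the given options applied."""
    # Imported lazily so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.enableHiveSupport()
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Hypothetical NameNode hosts and port; replace with values from hdfs-site.xml.
conf = ha_nameservice_conf(
    "bdacdh-ns",
    {"nn1": "namenode1.example.com:8020", "nn2": "namenode2.example.com:8020"},
)
```

Calling `build_session(conf)` before any other session exists would apply these options. An alternative that avoids hard-coding addresses is to point `HADOOP_CONF_DIR` at a directory containing the cluster's `hdfs-site.xml` and `core-site.xml` before starting the notebook kernel.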
