The credit card fraud data is a very popular dataset on Kaggle. It contains about 285,000 records (284,807 transactions) with 31 variables. Of these, only three have an interpretable meaning: the time, the amount and the class. The other 28 variables are the top 28 principal components obtained by applying PCA to the original dataset. Because the original data is confidential, no further information about the underlying features is available. However, this PCA preprocessing makes the dataset easier to use for prediction.
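To make the column layout described above concrete, here is a minimal sketch that separates the three interpretable variables from the 28 anonymized components. The column names `Time`, `V1` to `V28`, `Amount` and `Class` are assumptions about the CSV header and are not stated in the text; the helper names `split_columns` and `read_header` are hypothetical. The sketch uses a tiny in-memory header instead of the real file so that it runs without a download.

```python
import csv
import io

# Assumed column layout of the Kaggle CSV (not stated in the text above):
# "Time", "V1".."V28", "Amount", "Class".
INTERPRETABLE = ("Time", "Amount", "Class")


def split_columns(header):
    """Separate the interpretable columns from the anonymized PCA components."""
    named = [c for c in header if c in INTERPRETABLE]
    components = [c for c in header if c not in INTERPRETABLE]
    return named, components


def read_header(handle):
    """Return the first row of a CSV file-like object as a list of column names."""
    return next(csv.reader(handle))


# Tiny in-memory stand-in for the real file, so the sketch runs as-is.
sample_header = ",".join(
    ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount", "Class"]
)
sample = io.StringIO(sample_header + "\n")

named, components = split_columns(read_header(sample))
print(len(named) + len(components), "columns in total")
print("interpretable:", named)
print("PCA components:", len(components))
```

With the real data, the same two helpers could be applied to an open file handle for the downloaded CSV instead of the in-memory sample.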
nji3 / work-with-kaggle-data
Using several Kaggle datasets to do data analysis and data science projects.