amazon-sagemaker-healthcare-fraud-detection's Introduction

Introduction

An IDC study states that 40% of enterprises will be working in 2019 to include AI/ML as part of their transformative strategy. Today, AI/ML is beyond the hype cycle, and there are use cases that provide real business value.

In this workshop, we will work on a healthcare insurance fraud identification use case. We will apply machine learning to identify anomalous claims that require further investigation. The technique used in the workshop is broadly applicable to related problems such as fraud, waste, and abuse.
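
The notebook's file name refers to PCA-based anomaly detection. As a rough illustration of that idea, and not the notebook's own code (which appears to rely on the SageMaker PCA algorithm), the sketch below scores rows by PCA reconstruction error using plain NumPy. The function name `pca_anomaly_scores` and the synthetic "claims" data are hypothetical.

```python
import numpy as np


def pca_anomaly_scores(X, n_components):
    """Score each row by its PCA reconstruction error (higher = more anomalous)."""
    X = np.asarray(X, dtype=np.float64)
    # Standardize each feature; guard against zero-variance columns.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - mean) / std
    # Principal directions are the right singular vectors of the standardized data.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    components = vt[:n_components]
    # Project onto the top components and map back to feature space.
    Z_hat = Z @ components.T @ components
    return np.sum((Z - Z_hat) ** 2, axis=1)


# Synthetic stand-in for claims features: two strongly correlated columns
# plus one appended row that breaks the correlation.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
claims = np.column_stack([base, 2.0 * base + rng.normal(scale=0.05, size=200)])
claims = np.vstack([claims, [3.0, -6.0]])  # hypothetical anomalous claim

scores = pca_anomaly_scores(claims, n_components=1)
flagged = int(np.argmax(scores))  # index of the claim to investigate first
```

Rows that follow the dominant pattern in the data reconstruct well from a few components and get low scores. Rows that deviate from it reconstruct poorly, which is what makes them candidates for further investigation.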

Launch an Amazon SageMaker Jupyter Notebook

Prerequisites and assumptions

  1. To run this Jupyter Notebook, you need a personal laptop and an AWS account that provides access to AWS services.

Steps

  1. Sign in to the AWS Console.
  2. Click Services, search for Amazon SageMaker, and click Amazon SageMaker in the dropdown.
  3. After you land on the Amazon SageMaker console, click Notebook Instances.
  4. Click Create Notebook.
  5. Give the notebook a name you can remember and fill out the configuration details as suggested in the workshop screenshots.
  6. Select an IAM role from the dropdown if one already exists.
  7. Create a new role if one doesn't exist.
  8. Choose to clone the public Git repo that we will use in today's workshop to download the data dictionary and the Jupyter notebook.
  9. Provide the URL of the Git repo.
  10. Click Create Notebook Instance.
  11. In the Amazon SageMaker console, under Notebook Instances, wait for your notebook instance to start. Watch the status change from Pending to In Service.
  12. Remember the name of your notebook instance and click Open Jupyter for your notebook.
  13. Validate that the data and notebook were cloned from the Git repo.
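
The console steps above could also be scripted. The following is a minimal sketch, assuming `sm_client` is a `boto3.client("sagemaker")` object and that the role ARN and Git repo URL are supplied by the caller. The helper name `launch_notebook` and the default instance type are assumptions, not values taken from the workshop.

```python
def launch_notebook(sm_client, name, role_arn, repo_url, instance_type="ml.t3.medium"):
    """Create a notebook instance that clones repo_url, then wait until it is in service.

    sm_client is expected to be boto3.client("sagemaker"). The default
    instance type is an assumption, not a value from the workshop.
    """
    sm_client.create_notebook_instance(
        NotebookInstanceName=name,
        InstanceType=instance_type,
        RoleArn=role_arn,
        DefaultCodeRepository=repo_url,  # public Git repo cloned when the instance starts
    )
    # Block until the status moves from Pending to InService.
    waiter = sm_client.get_waiter("notebook_instance_in_service")
    waiter.wait(NotebookInstanceName=name)
    description = sm_client.describe_notebook_instance(NotebookInstanceName=name)
    return description["NotebookInstanceStatus"]
```

A call might look like `launch_notebook(boto3.client("sagemaker"), "my-notebook", role_arn, repo_url)`, where `role_arn` and `repo_url` stand for the IAM role and Git repo chosen in steps 6 through 9.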

Finish your Lab in Jupyter Notebook

  1. Click healthcare-fraud-identification-using-PCA-anomaly-detection.ipynb and start working. From here on, all instructions are in the Jupyter Notebook. After you have completed all the steps in the notebook, come back and finish the remaining steps below.

Finish

  1. Congratulations!
  2. Please make sure to delete all resources as mentioned in the section below.

Cleanup Resources

  1. Go to the Amazon SageMaker console to shut down your notebook instance: select your instance from the list, then select Stop from the Actions drop-down menu.
  2. After your notebook instance is completely stopped, select Delete from the Actions drop-down menu to delete it.
  3. Navigate to the Amazon S3 console.
  4. Find the Amazon S3 bucket created for training and click it to list the objects in the bucket.
  5. Navigate to model-tar.gz and delete it using the Actions menu.
  6. Navigate to the training data file healthcare_fraud_identification_feature_store and delete it using the Actions menu.
  7. After all the objects in the bucket are deleted, delete the bucket using the Actions menu.
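
For reference, the same cleanup could be scripted. This is a minimal sketch, assuming `sm_client` and `s3_client` are `boto3.client("sagemaker")` and `boto3.client("s3")`, that the bucket is not versioned, and that everything in it was created by this workshop. The helper name `cleanup_workshop` is hypothetical.

```python
def cleanup_workshop(sm_client, s3_client, notebook_name, bucket_name):
    """Stop and delete the notebook instance, then empty and delete the bucket.

    Assumes an unversioned bucket whose contents all belong to the workshop.
    Returns the list of deleted object keys.
    """
    # A notebook instance must be stopped before it can be deleted.
    sm_client.stop_notebook_instance(NotebookInstanceName=notebook_name)
    sm_client.get_waiter("notebook_instance_stopped").wait(
        NotebookInstanceName=notebook_name
    )
    sm_client.delete_notebook_instance(NotebookInstanceName=notebook_name)

    # A bucket must be empty before it can be deleted.
    deleted = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
            deleted.extend(item["Key"] for item in keys)
    s3_client.delete_bucket(Bucket=bucket_name)
    return deleted
```

Because this empties the whole bucket, it is only appropriate when the bucket holds nothing but the workshop's model artifact and training data.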

amazon-sagemaker-healthcare-fraud-detection's People

Contributors

awsvik, jamesiri

amazon-sagemaker-healthcare-fraud-detection's Issues

SageMaker PCA

Getting an error while trying to convert data to a binary stream after executing the following:

matrx_train = X_stndrd_train.as_matrix().astype('float32')
buf_train = io.BytesIO()
smac.write_numpy_to_dense_tensor(buf_train, matrx_train)
buf_train.seek(0)

ERROR
AttributeError Traceback (most recent call last)
in
1 # Convert data to binary stream.
----> 2 matrx_train = X_stndrd_train.as_matrix().astype('float32')
3 buf_train = io.BytesIO()
4 smac.write_numpy_to_dense_tensor(buf_train, matrx_train)
5 buf_train.seek(0)

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/generic.py in __getattr__(self, name)
5272 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5273 return self[name]
-> 5274 return object.__getattribute__(self, name)
5275
5276 def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'as_matrix'
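
A likely cause of the error above is that `DataFrame.as_matrix()` was deprecated in pandas 0.23 and removed in pandas 1.0, so a notebook environment with a newer pandas no longer has it. A minimal sketch of a workaround follows, assuming `X_stndrd_train` is an all-numeric DataFrame. The small frame built here is a hypothetical stand-in, and the SageMaker call is left as a comment because it is unchanged.

```python
import io

import pandas as pd

# Hypothetical stand-in for the notebook's standardized training frame.
X_stndrd_train = pd.DataFrame({"f1": [0.1, -1.2, 0.7], "f2": [1.5, 0.3, -0.9]})

# to_numpy() (pandas 0.24+) returns the 2-D array that as_matrix() used to.
# On older pandas versions, X_stndrd_train.values is an equivalent fallback.
matrx_train = X_stndrd_train.to_numpy().astype("float32")

buf_train = io.BytesIO()
# In the notebook, the remaining lines stay as they were:
#   smac.write_numpy_to_dense_tensor(buf_train, matrx_train)
#   buf_train.seek(0)
```

Any other `as_matrix()` calls in the notebook would presumably need the same replacement.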
