Using the unsupervised machine learning algorithm K-means for RFM

Context

The Dataset has information about 100k orders from 2016 to 2018 made at multiple marketplaces in Brazil. Its features allow viewing an order from multiple dimensions: from order status, price, payment and freight performance to customer location, product attributes and finally reviews written by customers.

This Dataset was generously provided by Olist, the largest department store in Brazilian marketplaces. Olist connects small businesses from all over Brazil to sales channels without hassle and with a single contract. Those merchants are able to sell their products through the Olist Store and ship them directly to the customers using Olist logistics partners.

After a customer purchases a product from the Olist Store, a seller is notified to fulfill that order. Once the customer receives the product, or the estimated delivery date is due, the customer gets a satisfaction survey by email where they can rate the purchase experience and write down some comments.

The module k_means.py contains the scripts that bring order and structure to the selected data.

The RFM method analyses customer value. The abbreviation stands for the attributes used in segmentation: recency, frequency, and monetary value. Recency measures how recently a customer last purchased, frequency measures how often purchases are made, and monetary value measures spend per customer.
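
As a rough illustration of the three attributes, the sketch below derives recency, frequency and monetary value from a handful of order records. The record layout, the `compute_rfm` helper and the snapshot date are assumptions made for this example and are not taken from k_means.py.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records: (customer_id, purchase_date, amount_in_BRL).
orders = [
    ("c1", date(2018, 1, 10), 120.0),
    ("c1", date(2018, 6, 1), 80.0),
    ("c2", date(2017, 3, 15), 45.5),
    ("c3", date(2018, 8, 20), 300.0),
]


def compute_rfm(orders, snapshot_date):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in orders:
        # Keep only the most recent purchase date per customer.
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1      # number of orders
        monetary[customer_id] += amount  # total spend
    return {
        cid: ((snapshot_date - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


# The snapshot date (the "today" used to measure recency) is an assumption.
rfm = compute_rfm(orders, snapshot_date=date(2018, 9, 1))
for customer_id, (recency, freq, spend) in sorted(rfm.items()):
    print(customer_id, recency, freq, spend)
```

Each customer ends up with one row of three numbers, which is the shape of input a clustering step such as K-means would expect.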

In marketing terms, client segmentation splits business clients into groups that have common attributes based on behavioural, demographic, psychographic or geographic data. Customer segmentation enables companies to target specific groups, allowing effective allocation of resources, appropriate pricing, service, and product customisation, strategizing and innovation.

The Elbow Method is used to validate the number of clusters. It estimates the optimal value of K from the cost function: as K increases, the average distortion decreases. The “elbow” is the point where the decline in distortion levels off; in other words, if the plot looks like an arm, the elbow is where the forearm begins.
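
The idea can be sketched without any third-party dependency. The example below is a plain Lloyd's k-means written for illustration only, run on synthetic points rather than the Olist data; the `kmeans` function, the toy blobs and the five restarts per K are all assumptions of this sketch, and k_means.py may well rely on a library implementation instead.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, average distortion)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Average squared distance from each point to its closest centroid.
    distortion = sum(min(math.dist(p, c) ** 2 for c in centroids) for p in points) / len(points)
    return centroids, distortion


# Synthetic data: three well-separated blobs, so the elbow should sit near K = 3.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)) for cx, cy in blob_centers for _ in range(30)]

distortions = {}
for k in range(1, 7):
    # Keep the best of a few random restarts to reduce the effect of a bad start.
    distortions[k] = min(kmeans(points, k, seed=s)[1] for s in range(5))

for k, value in distortions.items():
    print(k, round(value, 2))
```

Plotting `distortions` against K gives the arm-shaped curve; the elbow is read off where adding another cluster stops reducing the distortion by much.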

The frequency score table shows that the most frequent customer made 24 purchases over the two-year period, while 84151 customers made 1 purchase and 10742 made 2 purchases. The total number of customers is 96096.
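
A short calculation on the figures quoted above shows how skewed the frequency distribution is. The variable names are illustrative; the only inputs are the three counts from the frequency score table.

```python
# Counts taken from the frequency score table.
total_customers = 96096
one_purchase = 84151
two_purchases = 10742

# Customers not covered by the first two buckets made three or more purchases.
three_or_more = total_customers - one_purchase - two_purchases

share_one = one_purchase / total_customers
share_two = two_purchases / total_customers

print(f"{three_or_more} customers made 3 or more purchases")
print(f"{share_one:.1%} bought once, {share_two:.1%} bought twice")
```

With nearly nine in ten customers buying only once, frequency on its own separates customers poorly, which is one reason to combine it with recency and monetary value.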

The Recency Distribution plot shows that the maximum number of days elapsed since a customer's last purchase is 772, with an average of 287 days.

The Monetary Value graph shows that the maximum spend was 60480 BRL and the average spend per customer was 160 BRL.

The analysis produced three major clusters: High, Medium and Low Worth. For High Worth customers, marketing efforts should focus on retention; for Medium Worth customers, on retention and spend; and for Low Worth customers, on increasing purchase frequency.
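
K-means returns arbitrary integer cluster ids, so the clusters have to be named in a separate step. The sketch below shows one possible way to do that, ranking clusters by mean monetary value; this ranking rule, the `name_clusters` helper and the sample values are assumptions for illustration, since the source does not state how the three tiers were assigned.

```python
from collections import defaultdict
from statistics import mean


def name_clusters(labels, monetary, names=("Low", "Medium", "High")):
    """Map each raw cluster id to a tier name, ordered by mean monetary value."""
    by_cluster = defaultdict(list)
    for label, value in zip(labels, monetary):
        by_cluster[label].append(value)
    # Lowest mean spend first, so it lines up with the order of `names`.
    ranked = sorted(by_cluster, key=lambda label: mean(by_cluster[label]))
    return {label: names[rank] for rank, label in enumerate(ranked)}


# Hypothetical cluster ids and spend (BRL) for six customers.
labels = [2, 0, 1, 2, 0, 1]
monetary = [900.0, 40.0, 150.0, 1100.0, 60.0, 210.0]

mapping = name_clusters(labels, monetary)

# Marketing focus per tier, as described in the text above.
focus = {
    "High": "retention",
    "Medium": "retention and spend",
    "Low": "increasing purchase frequency",
}
for label, tier in sorted(mapping.items()):
    print(label, tier, "->", focus[tier])
```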

Plot compares Monetary Value and Frequency.

Plot compares Monetary Value and Recency.

Plot compares Frequency and Recency.
