
Bradford Tuckfield. 2019. Applied Unsupervised Learning with R: Uncover hidden relationships and patterns with k-means clustering, hierarchical clustering, and PCA. Packt Publishing Ltd.

has been cited by the following article:

Article

Incorporating K-means, Hierarchical Clustering and PCA in Customer Segmentation

University of Missouri, USA


Journal of City and Development. 2021, Vol. 3 No. 1, 12-30
DOI: 10.12691/jcd-3-1-3
Copyright © 2021 Science and Education Publishing

Cite this paper:
Azad Abdulhafedh. Incorporating K-means, Hierarchical Clustering and PCA in Customer Segmentation. Journal of City and Development. 2021; 3(1):12-30. doi: 10.12691/jcd-3-1-3.

Correspondence to: Azad Abdulhafedh, University of Missouri, USA. Email: asa8cd@mail.missouri.edu

Abstract

This paper addresses the use of clustering algorithms in customer segmentation to define a marketing strategy for a credit card company. Customer segmentation divides customers into groups based on common characteristics, which helps banks, businesses, and other companies improve their products or service opportunities. The analysis explores the application of K-means, hierarchical clustering, and Principal Component Analysis (PCA) to identify the customer segments of a company based on their credit card transaction history. The dataset used in the project summarizes the usage behavior of 8950 active credit card holders over the last 6 months, and our aim is to segment these customers as accurately as possible using clustering techniques. The project uses two approaches for customer segmentation. The first considers all variables in the clustering algorithms, using hierarchical clustering and K-means. The second applies dimensionality reduction through PCA to the dataset, then identifies the optimal number of clusters and repeats the clustering analysis with the updated number of clusters. Results show that PCA can be employed effectively in the clustering process as a check on K-means and hierarchical clustering.
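The two approaches described in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the data are synthetic (three Gaussian blobs standing in for the credit card dataset), the choice of k = 3 and of two principal components is arbitrary, hierarchical clustering is omitted for brevity, and the helper names (`standardize`, `kmeans`, `pca`, `rand_index`) are hypothetical. It only shows the idea of clustering on all variables, clustering again on PCA scores, and comparing the two partitions.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def nearest(X, centroids):
    """Index of the closest centroid (Euclidean distance) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_init=10, n_iter=100, seed=0):
    """Lloyd's algorithm with random restarts; keeps the lowest-inertia run."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_init):
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            labels = nearest(X, centroids)
            # Recompute centroids; keep the old one if a cluster is empty.
            new = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ])
            if np.allclose(new, centroids):
                break
            centroids = new
        labels = nearest(X, centroids)
        inertia = np.sum((X - centroids[labels]) ** 2)
        if best is None or inertia < best[0]:
            best = (inertia, labels, centroids)
    return best[1], best[2]

def pca(X, n_components):
    """Project centered data onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)
    return Xc @ Vt[:n_components].T, explained[:n_components]

def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)
    return float(np.mean(same_a[iu] == same_b[iu]))

# Synthetic stand-in data: 300 "customers", 6 usage variables (assumption).
rng = np.random.default_rng(42)
centers = rng.normal(scale=5.0, size=(3, 6))
X = standardize(np.vstack([c + rng.normal(size=(100, 6)) for c in centers]))

# Approach 1: cluster on all variables.
labels_full, _ = kmeans(X, k=3)

# Approach 2: reduce with PCA first, then cluster on the component scores.
scores, explained = pca(X, n_components=2)
labels_pca, _ = kmeans(scores, k=3)

# Use the PCA-based partition as a check on the full-variable partition.
agreement = rand_index(labels_full, labels_pca)
print("explained variance ratio:", explained)
print("Rand index between the two partitions:", agreement)
```

A Rand index close to 1 would indicate that the PCA-based clustering largely reproduces the full-variable clustering, which is the sense in which PCA can serve as a check. On real data, the number of clusters and components would need to be chosen from the data rather than fixed in advance.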

Keywords