Clústers amb variables mixtes per a la caracterització de clients

Carbonell Cabutí, Marta

Please use this identifier to cite or link to this item: https://dipositint.ub.edu/dspace/handle/2445/171904

Title:	Clústers amb variables mixtes per a la caracterització de clients
Author:	Carbonell Cabutí, Marta
Director/Tutor:	Puig De Dou, Ignasi; Rodero De Lamo, Lourdes
Keywords:	Anàlisi de conglomerats; Segmentació de mercat; Treballs de fi de grau; Cluster analysis; Market segmentation; Bachelor's theses
Issue Date:	Jun-2020
Abstract:	[cat, translated] Cluster analysis is a multivariate method whose main objective is to identify groups of objects with similar characteristics within a numerical database. Currently, however, this branch of statistics is developing methods for analysing mixed databases, so that both the numerical and the categorical descriptive variables of the objects can be used. These clustering criteria can be classified into two broad groups: hierarchical and non-hierarchical methods. This work clusters the client data of an industrial hardware wholesaler in order to sort the clients into several homogeneous groups, using two clustering methods: Ward's method and Partitioning Around Medoids (PAM). To create these groups, a similarity coefficient must be computed to obtain the distances between individuals. The Gower coefficient is used, since it handles numerical and categorical data at the same time. A sensitivity analysis of this measure is also carried out to check its robustness. [eng] Traditionally, cluster analysis has been applied only to numerical databases, with the main objective of identifying groups of objects with similar characteristics. This growing branch of statistics is now incorporating methods that extend this grouping to mixed databases. In this project we carry out a cluster analysis of the client data of an industrial hardware wholesaler. Doing so lets us recognise clients that follow specific patterns of behaviour, which allows a more accurate follow-up and lets discount campaigns be segmented according to the clients' interests.
To accomplish this objective, we use both a hierarchical agglomerative method and a non-hierarchical one: Ward's method and Partitioning Around Medoids (PAM), respectively. To create these groups, we first need to calculate a similarity coefficient that gives the distances between individuals. In this study we use the Gower coefficient, since it handles numerical and categorical data at the same time. The work gives a detailed explanation of the client characterisation process, starting with the creation of the database and ending with a description of the various groups. A sensitivity analysis of the Gower distance is also included to validate the robustness of this measure.
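As an illustration of how a Gower-type distance combines numerical and categorical variables, here is a minimal sketch in Python. It assumes the common unweighted form: each numerical variable contributes its absolute difference divided by the variable's range, each categorical variable contributes 0 on a match and 1 otherwise, and the contributions are averaged. The function name, field names and client records are hypothetical and are not taken from the thesis, whose own implementation is not shown in this record.

```python
def gower_distance(a, b, numeric_ranges):
    """Unweighted Gower distance between two records given as dicts.

    numeric_ranges maps each numerical field to its range (max - min)
    over the whole dataset; all other fields are treated as categorical.
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            span = numeric_ranges[key]
            # Range-normalised absolute difference, in [0, 1].
            total += abs(a[key] - b[key]) / span if span else 0.0
        else:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


# Hypothetical client records (illustrative only, not data from the thesis).
clients = [
    {"annual_spend": 1000.0, "orders": 10, "sector": "construction"},
    {"annual_spend": 3000.0, "orders": 30, "sector": "construction"},
    {"annual_spend": 5000.0, "orders": 20, "sector": "automotive"},
]
numeric_fields = ["annual_spend", "orders"]
ranges = {
    k: max(c[k] for c in clients) - min(c[k] for c in clients)
    for k in numeric_fields
}

# Pairwise distance matrix, the usual input to hierarchical or medoid-based clustering.
matrix = [[gower_distance(a, b, ranges) for b in clients] for a in clients]
```

A matrix of this kind is what a hierarchical method such as Ward's or a medoid-based method such as PAM would take as input; the corresponding Gower similarity is one minus the distance.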
Note:	Bachelor's theses (Treballs Finals de Grau) in Statistics UB-UPC, Facultat d'Economia i Empresa (UB) and Facultat de Matemàtiques i Estadística (UPC), academic year 2019-2020. Tutors: Ignasi Puig De Dou; Lourdes Rodero De Lamo
URI:	https://hdl.handle.net/2445/171904
Appears in Collections:	Treballs Finals de Grau (TFG) - Estadística UB-UPC

Files in This Item:

File	Description	Size	Format
TFG_MartaCarbonellCabutí.pdf		1.15 MB	Adobe PDF

This item is licensed under a Creative Commons License