Introduction to Clustering Algorithms: Gaussian Mixture Models – Hai-Xia Ma Introduction to Clustering Algorithms: Gaussian Mixture Models – Hai-Xia Ma

Introduction to Clustering Algorithms: Gaussian Mixture Models

1 minute read

Published:

Gaussian Mixed Model (GMM) refers to a linear combination of multiple Gaussian distribution functions. Gaussian Mixture Models are probabilistic models that assume all samples are derived from a mixture of an indefinite number of Gaussian distributions with unknown parameters. It belongs to the category of soft clustering methods in which each data point will be a member of each cluster in the dataset, but to varying degrees. This membership is assigned as a probability ranging from 0 to 1 of belonging to a certain cluster. Theoretically, GMM can fit any type of distribution, and is usually used to solve cases where data under the same set contains several different distributions (either the same type of distribution but with different parameters, or different types of distributions, such as normal and Bernoulli distributions).

Gaussian mixture models may be used to identify clusters inside a dataset when the number of clusters included within the dataset is known (or assumed to be known), but the locations and shapes of these clusters are unknown. Finding these clusters is the objective of GMM and since we don’t have any information instead of the number of clusters, the GMM is an unsupervised technique. To do that, we try to fit a variety of Gaussians to our dataset. By this I mean that our goal is to discover as many distinct, but related, Gaussian distributions that may be utilized to characterize our dataset. A important element for the understanding is that these Gaussian shaped clusters must not be circular shaped like for instance in the K-Means technique but can have whatever shapes a multivariate Gaussian distribution might take. That is, a circle can only alter in its diameter but a GMM model can (because of its covariance matrix) represent all ellipsoid forms as well.