

A framework approach to initialisation dependent clustering methodologies

Chambers, SJ (2015) A framework approach to initialisation dependent clustering methodologies. Doctoral thesis, Liverpool John Moores University.

158221_S Chambers Thesis Final.pdf - Published Version


Clustering algorithms are commonly used for exploratory data analysis and data mining and, used correctly, are powerful tools for gaining insights into the underlying structure of data. It is known, however, that some of these algorithms are dependent upon the parameters with which they start, giving differing results as these vary. Often there is an element of randomness in the initialisation process, greatly increasing the difficulty of selecting an appropriately initialised solution.

Effective use of these algorithms depends upon the correct choice of appropriate initialisations; however, when exploring new data it is often difficult to objectively obtain values appropriate to the problem. The use of initialisation strategies to maximise the performance of the algorithm is therefore important to ensure that the solutions identified are both consistent with the structure of the data and reproducible.

This thesis introduces a coherent strategy for dealing with initialisation, both in the choice of parameters and in randomness. A Separation Concordance (SeCo) framework is developed which uses a dual measure approach to evaluating the solutions from resampling of starting conditions. This SeCo framework also allows for the inference of an appropriate number of partitions within the data and introduces a SeCo map for visualising the solution space.
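
The dual measure idea can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not name the measures, so the code assumes within-cluster sum of squares as a stand-in separation measure and the median pairwise Rand index as a stand-in concordance measure. The function names (kmeans, within_ss, rand_index, seco_scores) are hypothetical, and plain k-means is assumed as the initialisation dependent algorithm.

```python
import random
from itertools import combinations
from statistics import median


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed, iters=50):
    """Lloyd's k-means from a seeded random start; returns cluster labels."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # random initialisation (the varying part)
    for _ in range(iters):
        # Assign each point to its nearest centre.
        labels = [min(range(k), key=lambda c: sq_dist(p, centres[c]))
                  for p in points]
        # Recompute centres; an empty cluster keeps its previous centre.
        new_centres = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centres.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centres.append(centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def within_ss(points, labels):
    """Assumed separation measure: within-cluster sum of squares (lower is tighter)."""
    total = 0.0
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        mean = tuple(sum(x) / len(members) for x in zip(*members))
        total += sum(sq_dist(p, mean) for p in members)
    return total


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same or different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def seco_scores(points, k, n_starts=10):
    """Resample starting conditions; return (seed, separation, concordance) per run.

    Concordance here is the median Rand index of a run against all other runs,
    which is an assumption made for this sketch. Requires n_starts >= 2.
    """
    runs = [kmeans(points, k, seed) for seed in range(n_starts)]
    scores = []
    for i, labels in enumerate(runs):
        others = [rand_index(labels, other)
                  for j, other in enumerate(runs) if j != i]
        scores.append((i, within_ss(points, labels), median(others)))
    return scores
```

Plotting the separation value against the concordance value for each run, and repeating this for several values of k, gives a picture in the spirit of the SeCo map described above: runs that are both well separated and highly concordant with the other runs are candidates for a repeatable solution.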

The performance of these visualisations is compared and contrasted with the existing methods in use through an exhaustive series of experiments for both algorithms tested, and is shown to be effective in the selection of a repeatable solution with high concordance to the underlying structure of the data. These results are benchmarked using a range of synthetic and real-world data sets whose composition ranges from trivial to complex.

Item Type: Thesis (Doctoral)
Subjects: Q Science > QA Mathematics
Divisions: Applied Mathematics (merged with Comp Sci 10 Aug 20)
Date Deposited: 04 Nov 2016 14:34
Last Modified: 03 Sep 2021 23:27
DOI or ID number: 10.24377/LJMU.t.00004531
Supervisors: Jarman, I
URI: https://researchonline.ljmu.ac.uk/id/eprint/4531