Bhih, A, Johnson, P and Randles, M (2015) EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set. International Journal of Engineering Research & Technology (IJERT), 4 (1). pp. 553-557. ISSN 2278-0181
Text: V4I1-IJERTV4IS010563.pdf (296kB) - Published Version. Available under License Creative Commons Attribution.
Abstract
Data mining is a long-established research topic that is making a comeback with the advent of Big Data. Clustering is an important component of data mining. As we enter the Big Data era, in which many real-world datasets consist of multi-dimensional features, clustering is gaining importance within this field. Traditional clustering algorithms often fail to detect meaningful clusters in high-dimensional data sets, and they become computationally expensive when dealing with data comprising many dimensions. In this paper, we propose a modified technique that performs well on high-dimensional data sets. The proposed method uses Principal Component Analysis (PCA) for dimension reduction before applying the standard Expectation Maximization (EM) algorithm. The performance of the proposed method is evaluated on the basis of the silhouette index and execution time.
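The pipeline outlined in the abstract (reduce dimensionality with PCA, cluster with EM, then score with the silhouette index and execution time) can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation. The synthetic two-cluster data, the projection onto a single principal component, the choice of two mixture components, the initialisation, and the function names are all assumptions made for illustration, and the silhouette index is computed here on the reduced data.

```python
import math
import random
import time


def first_principal_component(rows, iters=200):
    """Mean vector and leading principal direction, found by power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # assumes this start is not orthogonal to the answer
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def em_gmm_1d(xs, k=2, iters=50):
    """Standard EM for a 1-D Gaussian mixture; returns hard cluster labels."""
    n, lo, hi = len(xs), min(xs), max(xs)
    # Illustrative initialisation: evenly spaced means, equal weights.
    means = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    variances = [max(((hi - lo) / (2 * k)) ** 2, 1e-6)] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for x in xs:
            logp = [math.log(weights[c])
                    - 0.5 * math.log(2 * math.pi * variances[c])
                    - (x - means[c]) ** 2 / (2 * variances[c])
                    for c in range(k)]
            top = max(logp)
            p = [math.exp(value - top) for value in logp]
            total = sum(p)
            resp.append([q / total for q in p])
        # M-step: re-estimate weights, means, and variances.
        for c in range(k):
            nk = max(sum(r[c] for r in resp), 1e-12)
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nk
            variances[c] = max(
                sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs)) / nk,
                1e-6)
            weights[c] = nk / n
    return [max(range(k), key=lambda c: r[c]) for r in resp]


def silhouette_1d(xs, labels):
    """Mean silhouette index, using absolute difference as the distance."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, x in enumerate(xs):
        def mean_dist(c):
            d = [abs(x - y) for j, y in enumerate(xs)
                 if labels[j] == c and j != i]
            return sum(d) / len(d) if d else 0.0
        a = mean_dist(labels[i])
        b = min(mean_dist(c) for c in clusters if c != labels[i])
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(xs)


def pca_em_demo(seed=0, n_per_cluster=40, dims=5):
    """Run PCA then EM on synthetic data; return labels, silhouette, seconds."""
    rng = random.Random(seed)
    # Two synthetic Gaussian clusters (hypothetical data, not from the paper).
    rows = [[rng.gauss(center, 1.0) for _ in range(dims)]
            for center in (0.0, 6.0) for _ in range(n_per_cluster)]
    start = time.perf_counter()
    mean, v = first_principal_component(rows)
    scores = [sum((r[j] - mean[j]) * v[j] for j in range(dims)) for r in rows]
    labels = em_gmm_1d(scores, k=2)
    elapsed = time.perf_counter() - start
    return labels, silhouette_1d(scores, labels), elapsed


if __name__ == "__main__":
    labels, sil, elapsed = pca_em_demo()
    print(f"silhouette={sil:.3f}, time={elapsed:.4f}s")
```

Projecting onto one component keeps the sketch short. Retaining several components would require a multivariate Gaussian mixture, for which a numerical library is usually the more practical choice.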
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Clustering; dimensionality reduction; Principal Component Analysis; Expectation Maximization |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Computer Science & Mathematics; Electronics & Electrical Engineering (merged with Engineering 10 Aug 20) |
Publisher: | ESRSA Publications |
Date Deposited: | 12 Oct 2015 09:10 |
Last Modified: | 09 Mar 2022 11:34 |
URI: | https://researchonline.ljmu.ac.uk/id/eprint/2152 |