Bhih, A (2018) HIGH PERFORMANCE DECENTRALISED COMMUNITY DETECTION ALGORITHMS FOR BIG DATA FROM SMART COMMUNICATION APPLICATIONS. Doctoral thesis, Liverpool John Moores University.
|
Text
2018Amhmedphd.pdf - Published Version Download (7MB) | Preview |
|
Text
2018Amhmedphdinternal.pdf - Submitted Version Restricted to Repository staff only Download (7MB) |
Abstract
Many systems in the world can be represented as models of complex networks and subsequently be analysed fruitfully. One fundamental property of the real-world networks is that they usually exhibit inhomogeneity in which the network tends to organise according to an underlying modular structure, commonly referred to as community structure or clustering. Analysing such communities in large networks can help people better understand the structural makeup of the networks. For example, it can be used in mobile ad-hoc and sensor networks to improve the energy consumption and communication tasks. Thus, community detection in networks has become an important research area within many application fields such as computer science, physical sciences, mathematics and biology. Driven by the recent emergence of big data, clustering of real-world networks using traditional methods and algorithms is almost impossible to be processed in a single machine. The existing methods are limited by their computational requirements and most of them cannot be directly parallelised. Furthermore, in many cases the data set is very big and does not fit into the main memory of a single machine, therefore needs to be distributed among several machines. The main topic of this thesis is about network community detection within these big data networks. More specifically, in this thesis, a novel approach, namely Decentralized Iterative Community Clustering Approach (DICCA) for clustering large and undirected networks is introduced. An important property of this approach is its ability to cluster the entire network without the global knowledge of the network topology. Moreover, an extension of the DICCA called Parallel Decentralized Iterative Community Clustering approach (PDICCA) is proposed for efficiently processing data distributed across several machines. PDICCA is based on MapReduce computing platform to work efficiently in distributed and parallel fashion. In addition, the real-world networks are usually noisy and imperfect with missing and false edges. These imperfections are often difficult to eliminate and highly affect the quality and accuracy of conventional methods used to find the community structure in the network. However, in real-world networks, node attribute information is also available in addition to topology information. Considering more than one source of information for community detection could produce meaningful clusters and improve the robustness of the network. Therefore, a pre-processing approach that considers attribute information, shared neighbours and connectivity information aspects of the network for community detection is presented in this thesis as part of my research. Finally, a set of real-world mobile phone usage data obtained from Cambridge Laboratories (Device Analyzer) has been analysed as an exploratory step for viability to apply the algorithms developed in this thesis. All the proposed approaches have been evaluated and verified for feasibility using real-world large data set. The evaluation results of these experimentations prove very promising for the type of large data networks considered.
Item Type: | Thesis (Doctoral) |
---|---|
Uncontrolled Keywords: | Community analysis, community detection algorithms; decentralized clustering algorithm; networks; graph; distributed algorithms. |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science T Technology > T Technology (General) |
Divisions: | Electronics & Electrical Engineering (merged with Engineering 10 Aug 20) |
Date Deposited: | 17 Apr 2018 07:50 |
Last Modified: | 21 Nov 2022 15:36 |
DOI or ID number: | 10.24377/LJMU.t.00008399 |
Supervisors: | Johnson, P, Nguyen, TT and Randles, M |
URI: | https://researchonline.ljmu.ac.uk/id/eprint/8399 |
View Item |