Empirical Evaluation of Graph Processing Platforms for Knowledge Discovery from Big Graphs

University of Texas at Dallas, 2015 - 74 pages
This thesis presents an extensive empirical evaluation of three graph processing platforms, each representing a different parallel graph processing paradigm: Pegasus, a MapReduce-based graph mining tool; GraphX, a Spark API for graph-parallel computation; and Urika, a graph processing appliance designed for high-performance graph retrieval [1]. Each platform is benchmarked using three popular graph-mining operations (Degree Distribution, Connected Components, and PageRank) over real-world graphs. Transactional database systems like Urika are found to perform best for non-iterative operations like Degree Distribution, whereas Pregel-based GraphX performs best for operations involving iterative computation, like PageRank and Connected Components. The thesis also explores normative pattern discovery at scale, using SPARQL to implement graph-mining algorithms that execute on Urika. Finally, it discusses ways to optimize the performance of graph-mining operations on each platform.
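The contrast between non-iterative and iterative operations can be illustrated with a minimal single-machine sketch in plain Python. This is not the thesis's code and uses none of the benchmarked platforms; the function names, the edge-list input format, and the default damping factor and iteration count are assumptions made for illustration. Degree Distribution needs one pass over the edges, while PageRank repeats a full update of every vertex many times.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree value to the number of vertices with that degree.

    Edges are treated as undirected; a single pass is enough (non-iterative).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(Counter(degree.values()))


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed edges (iterative).

    The damping factor and fixed iteration count are illustrative defaults.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for u, v in edges:
        out_links[u].append(v)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by vertices without out-links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for u in nodes:
            for v in out_links[u]:
                new_rank[v] += damping * rank[u] / len(out_links[u])
        rank = new_rank
    return rank


if __name__ == "__main__":
    sample = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]
    print(degree_distribution(sample))
    print(pagerank(sample))
```

Each PageRank iteration depends on the ranks from the previous one, which is the kind of repeated whole-graph update the abstract describes as iterative computation.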
