Scalable Complex Query Processing over Large Semantic Web Data Using Cloud
- Resource Type
- Conference
- Authors
- Husain, Mohammad Farhan; McGlothlin, James; Khan, Latifur; Thuraisingham, Bhavani
- Source
- 2011 IEEE 4th International Conference on Cloud Computing (CLOUD), pp. 187-194, Jul 2011
- Subject
- Computing and Processing
Communication, Networking and Broadcast Technologies
Resource description framework
Cloud computing
Ontologies
US Department of Energy
Heuristic algorithms
Query processing
RDF
Hadoop
Cloud
Semantic Web
- Language
- ISSN
- 2159-6182
2159-6190
Cloud computing solutions continue to grow in popularity in both research and the commercial IT industry. With this popularity come ever-increasing challenges for cloud computing service providers. The Semantic Web is another domain of rapid growth in both research and industry. RDF datasets are becoming increasingly large and complex, and existing solutions do not scale adequately. In this paper, we detail a scalable Semantic Web framework built using cloud computing technologies. We define solutions for generating and executing optimal query plans. We handle not only queries with Basic Graph Patterns (BGP) but also complex queries with optional blocks, for which we have devised a novel algorithm. Our algorithm minimizes the binding of triple patterns and the joins between them: it identifies common blocks using subgraph isomorphism algorithms and builds a query plan that uses that information. We use Hadoop's MapReduce framework to process our query plan. We show that our framework is highly scalable and answers complex queries efficiently.
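To make the idea of answering a query with an optional block through map and reduce steps more concrete, here is a minimal in-memory sketch in Python. It is not the paper's algorithm and does not use Hadoop: it only illustrates how one required triple pattern and one optional triple pattern sharing a variable could be joined as a left outer join keyed on that variable. The common-block detection via subgraph isomorphism described in the abstract is not shown. The sample triples, the `?`-prefixed variable convention, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: RDF triples as (subject, predicate, object).
TRIPLES = [
    ("alice", "type", "Student"),
    ("bob", "type", "Student"),
    ("alice", "email", "alice@example.org"),
]


def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None.

    Pattern terms starting with '?' are variables; other terms are constants.
    """
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable that appears twice must bind to the same value.
            if bindings.get(term, value) != value:
                return None
            bindings[term] = value
        elif term != value:
            return None
    return bindings


def map_phase(triples, required, optional, join_var):
    """Emit (join key, (tag, bindings)) for each triple matching a pattern."""
    for triple in triples:
        for tag, pattern in (("req", required), ("opt", optional)):
            bindings = match(pattern, triple)
            if bindings is not None:
                yield bindings[join_var], (tag, bindings)


def reduce_phase(pairs):
    """Group by join key and combine bindings as a left outer join."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    results = []
    for key in sorted(groups):
        required = [b for tag, b in groups[key] if tag == "req"]
        optional = [b for tag, b in groups[key] if tag == "opt"]
        for req in required:
            if optional:
                # Extend the required bindings with each optional match.
                results.extend({**req, **opt} for opt in optional)
            else:
                # No optional match: keep the required bindings alone.
                results.append(dict(req))
    return results


def run_query(triples):
    """Roughly: SELECT ?x ?e WHERE { ?x type Student OPTIONAL { ?x email ?e } }"""
    required = ("?x", "type", "Student")
    optional = ("?x", "email", "?e")
    return reduce_phase(map_phase(triples, required, optional, "?x"))


if __name__ == "__main__":
    for row in run_query(TRIPLES):
        print(row)
```

In this sketch every matching triple is shuffled to a reducer by its join key, so each additional join in a query plan would cost another such pass. That cost is the reason a plan that reduces the number of bindings and joins, as the abstract describes, matters at scale.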