Apache AsterixDB: A Scalable, Open Source BDMS
Yingyi Bu
PhD student
University of California, Irvine


Tuesday, August 18, 2015
03:00 PM - 03:45 PM

Level:  Technical - Intermediate

Apache AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that sets it apart from other Big Data platforms in today's open source ecosystem. Its features make it well suited to applications including web data warehousing, social data storage and analysis, and other use cases related to Big Data. AsterixDB has a flexible NoSQL-style data model; a query language that supports a wide range of queries; a scalable runtime; partitioned, LSM-based data storage and indexing (including B+ tree, R-tree, and text indexes); support for external as well as native data; a rich set of built-in types, including spatial, temporal, and textual types; support for fuzzy, spatial, and temporal queries; a built-in notion of data feeds for ingestion of data; and transaction support akin to that of a NoSQL store.

Development of AsterixDB began in 2009 and led to a mid-2013 initial open source release. This talk will provide an overview of the resulting system. Time permitting, the talk will cover the system's data model, its query language, and its overall architecture. Also included will be a summary of the current status of the project and a first look at how the system performs when compared to alternative technologies, including a parallel relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data analytics platform. The talk will conclude with a mention of several initial trials that the system has undergone and the lessons learned (and future plans laid) based on those early "customer engagements."

Yingyi Bu is a PhD candidate at UC Irvine. During his PhD, he has been working on the Apache AsterixDB project, particularly its query processor and graph analytics components. He is the recipient of a Google PhD fellowship award and a Yahoo! key scientific challenge award. Prior to his PhD, he worked full-time on the Microsoft SQL Server engine team.
