MetaQL: Queries Across NoSQL, SQL, SPARQL, and Spark

Each database has its strengths and weaknesses for different data access profiles, and we should endeavor to use the right tool for the right job.

However, adding another infrastructure component greatly increases not only the management effort but also the development effort needed to integrate and maintain connections across multiple data repositories, to say nothing of keeping the data synchronized.

This talk introduces MetaQL, a common query layer across database technologies including NoSQL, SQL, SPARQL, and Spark.

Using a common query layer lessens the burden on developers, allows using the right database for the right job, and opens up data to additional analysis that was previously unavailable, providing new and unexpected value.

Topics covered include:

Database Data Model and Schema
Java/JVM Query Builder, driven by Schema
Query constructs for “Select”, “Graph”, “Path”, and “Aggregation” Queries
NoSQL & SQL Databases
SPARQL RDF Databases, including AllegroGraph
Apache Spark, including SparkSQL and the new DataFrame API
ETL, Transactions, and Data Synchronization
Seamless queries across databases

Marc Hadfield founded Vital AI to create software systems that understand and harness data. Mr. Hadfield’s career has focused on large-scale data analysis in Financial Services, Life Science, and Enterprise Publishing. He has been principally focused on the interplay of Semantic Data, Machine Learning, Graph Analytics, and Natural Language Processing as a foundation for data-driven applications. As CTO of Alitora Systems, Mr. Hadfield worked with the Gladstone Institute and the Gates Foundation to apply these techniques to drug discovery; as CTO of Inform Technologies, he applied them to content recommendation for news and video publishers; and as a consultant to Bloomberg, he applied them to communication regulatory compliance. Drawing on this experience, Mr. Hadfield designed the Vital AI software components and processes to do the heavy lifting of data-driven applications, freeing up developer and data science resources for deeper data analysis.