Monday, June 8, 2015

Running queries against both structured and unstructured data


PolyBase is a Microsoft data tool. It simplifies the management of relational and non-relational data by letting you query both.

With an enormous amount of unstructured data that grows bigger each day, alongside structured data both archived and live, a single querying technology is highly desirable. SQL Server has geared up with a new tool, PolyBase.

Suppose you want to query non-relational data. Do you modify it and bring it into SQL Server, which is relational, and then query it? Or do you buy another product to query non-relational data (such as data in Hadoop, blobs, and files)?

Well, PolyBase provides the capability to query non-relational data in situ from SQL Server using T-SQL. You need not move the data into SQL Server, although you have the option to store it there if you want to. PolyBase is supported out of the box in SQL Server 2016 CTP2, which means it will be available in SQL Server 2016.
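To make this concrete, here is a minimal T-SQL sketch of what querying Hadoop data in place might look like. It assumes the PolyBase feature is installed and Hadoop connectivity is configured, and it uses the external-object syntax documented for SQL Server 2016, which may differ in CTP2. The names HadoopCluster, CsvFormat, dbo.WebLogs, dbo.Customers and dbo.WebLogsLocal, as well as the HDFS address and path, are made up for illustration.

```sql
-- Hypothetical Hadoop cluster; replace the address with your own name node.
CREATE EXTERNAL DATA SOURCE HadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://10.10.10.10:8020'
);

-- Describes how the files are laid out (comma-delimited text here).
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- The external table is only metadata; the data stays in Hadoop.
CREATE EXTERNAL TABLE dbo.WebLogs (
    CustomerId INT,
    Url        NVARCHAR(400),
    VisitDate  DATE
)
WITH (
    LOCATION = '/data/weblogs/',   -- hypothetical HDFS folder
    DATA_SOURCE = HadoopCluster,
    FILE_FORMAT = CsvFormat
);

-- Query the non-relational data in place and join it to a
-- regular relational table (dbo.Customers is assumed to exist).
SELECT c.CustomerName, COUNT(*) AS Visits
FROM dbo.WebLogs AS w
JOIN dbo.Customers AS c
    ON c.CustomerId = w.CustomerId
GROUP BY c.CustomerName;

-- Optional: copy the external data into a local SQL Server table.
SELECT *
INTO dbo.WebLogsLocal
FROM dbo.WebLogs;
```

The last statement illustrates the option of storing the data in SQL Server; the earlier SELECT leaves it where it is.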

PolyBase was not supported out of the box in earlier versions. Of course, PolyBase can process queries whether the data is on premises or in the cloud.

Here is a rough schematic of what it is about.