Wednesday, October 7, 2015

U-SQL based on SCOPE is designed for Big Data

While SQL covered the RDBMS landscape U-SQL covers a much larger data landscape.

At the same time as the announcement of Azure Data Lake Services, a new language under development at Microsoft, the U-SQL language was also announced. For the Azure Data Lake Service and what it means to business read here.

With the advent of Big Data and the task of mining all kinds of data, RDBMS suddenly found itself at a disadvantage. Structured Query Language (SQL) could only address what is in a relational data store. U-SQL was born to address this challenge posed by Big Data defined by volume, velocity and variety.

What is U-SQL
U-SQL deep dives into Big Data to extract the most relevant information. It is a powerful language (in the words of Microsoft):
  • Process any type of data. From analyzing BotNet attack patterns from security logs to extracting features from images and videos for machine learning, the language needs to enable you to work on any data.
  • Use custom code easily to express your complex, often proprietary business algorithms. The example scenarios above may all require custom processing that is often not easily expressed in standard query languages, ranging from user defined functions, to custom input and output formats.
  • Scale efficiently to any size of data without you focusing on scale-out topologies, plumbing code, or limitations of a specific distributed infrastructure.
Compared to HIVE, a SQL-Based language U-SQL is flexible and does not have the limited capability to address the 'variety' in non-structured data requiring schema generation prior to running queries. U-SQL should prove more easy to use than Hive for complex scenarios.

U-SQL has been designed as declarative SQL based language with native extensibility through user code in C#. This approach:
  • Unifies SQL and C#
  • Unifies structured and Unstructured
  • Unifies declarative and custom code
U-SQL is based on SCOPE which is based on  existing prior languages, ANSI-SQL, T-SQL and HIVE. U-SQL should present a less steeper curve for those who are using SQL already.

U-SQL is an important development that developer need to jump on.

Monday, October 5, 2015

If you know SQL then SCOPE is easy

SCOPE is an acronym for Structured Computations Optimized for Parallel Execution, a declarative language for working with large-scale data. It is still under development at Microsoft. If you know SQL then working with SCOPE will be quite easy as SCOPE builds on SQL.
The execution environment is different from that RDBMS oriented data.
Data is still modeled as rows. Every row has typed columns and eveyr rowset has a well-defined schema. There is a SCOPe compiler that comes up with optimized execution plan and a runtime execution plan.

Look at this QCount query in SCOPE:

SELECT query, COUNT(*) AS count
FROM "search.log" USING LogExtractor
GROUP BY query
HAVING count > 1000
OUTPUT TO "qcount.result";

You probably know most and the rest you are able to guess.

In the above there is a built-in LogExtractor. You can get it step-by-step going line by line; each step output being the input of next step.

SCOPE requires a software platform for storing and analyzing massive amounts of data and Microsoft has one called 'Cosmos'. Here is graphic of SCOPE processing is carried out.

This post is based on the PDF document you will find here and the image is taken from the same PDF.

Sunday, October 4, 2015

Microsoft Azure expands storage options with Azure Data Lake

Recently announced Azure Data Lake addresses the big data  3V challenges; volume, velocity and variety. It is one more storage feature in addition to blobs and SQL Azure database. Azure Data Lake (should have been Azure Data Ocean IMHO) is really omnipotent. Just look at the key capabilities of Azure Data Lake:

Any Data
Native format, distributed data store. No need to pre-define schema information. From unstructured to structured data handling.

Any Size
Kilo bytes to Exa bytes OK. Ready for read/write.

At any scale
Scale to match your needs; high volume data handling of small writes and low latency. Can Aaddress near real-time web analytics scenarios.

HDFS Compatible
Works out-of-the box with Hadoop including services such as HD Insight

Full integration with Azure Active Directory
Supporting identity and access management over all of the data.

Azure Data Lake Store  is therefore a hyper-scale HDFS repositiory designed specifically for big data analytics in the cloud. It is order made for IoT and thorughput-intensive analytics for high volume data.

Read more here.
The graphic is from a  Microsoft Technet site
I checked out the preview portal (, I do not see it. Possible by the end of the year.

Wednesday, September 30, 2015

SQL Server Data Tools for Visual Studio 2013

There are several options for SQL Server Data Tools depending on the version of Visual Studio as described here.

For this post we are using a Toshiba Laptop running Windows 7 (64-bit) Ultimate having a Visual Studio Express 2013 for Web which is one of the requirements in the following list:
  • Visual Studio 2013 Professional,
  • Ultimate,
  • Premium,
  • Express for Web, or
  • Express for Windows Desktop
You can download the tool here:

You will need approximately 1.8GB of free space at for the full install point because it includes all possible components that might be required.

You have an option for the language to choose from:

Choose language and download (English here).

You double click the downloaded program SSDTSetup.exe to begin installation.


Accept the license terms to proceed.
The download and installation begins immediately.

It may take a while to download and install.


After installation you can find Visual Studio 2013 Data Tools as shown.


A number of items will be added to your Control Panel Programs as shown.

You can verify it in Visual Studio 2013 Express for web also as shown.


How do you create SQL Server Projects using the SSDT?
Hints for creating projects:
Launch VS 2013 Web (for example)
Create New Project choosing SQL Server Project Template
Associate a SQL Server Database for the project by directly connecting to SQL Server; choosing to run a script or via a data-tier application

In the next post we look at using SQL Server Data Tools in Visual Studio 2013 Express for Web.

Sunday, September 27, 2015

There are quite a few versions of SQL Server Data Tools. Which one should be used?

It all depends on what version of Visual Studio you have. However, note that, you can install stand alone versions of SSDT.

This table shows the versions suitable for Visual studio 2012; Visual Studio 2013; and Visual Studio 2015.

Note the following:
SSDT can be installed standalone or can be integrated into an existing Visual Studio installation. For the preview release, SSDT is supported with Visual Studio 2015 and Visual Studio 2013. If you already have the Professional or higher edition of Visual Studio 2015 or Visual Studio 2013 and did not opt out of installing SQL Server Data Tools during setup, then your machine has an existing installation of SSDT

Download links for the RTM versions:
SSDT 2015 for Visual Studio 2015:
SSDT 2013 for Visual Studio 2013:
SSDT 2012 for Visual Studio 2012:
Well how are they going to function on Windows 10?
If you do not have
Visual Studio 2012/2013/2015  Professional Edition or above you probably will get just a shell with only SSDT related items and nothing else (like VB, C#, etc.)


Thursday, September 17, 2015

Overview of what you can learn from Database and Analytics sessions at Visual Studio Live!

The field of Data, big and small,l made a great leap forward from mere file based data to relatiional database systems and remained there for a long time and it is still very much used. However, the advent of Internet and social media created new needs not addressed by RDBMS and gave rise to NoSQL databases. On the other hand the enormous amount of data brought in by media made it look for other tools bringing Big Data, now a byword. The sessions treat the enormous devveloping field in this fascinating area.

If you do manage to attend you can get some new information and brush up some old stuff. It should be interesting to track the developments.

These are the session details in the Database and Analytics area.

SQL Server of course is center stage. Topics under SQL Server 2014 covers the following:
  • SQL Server Data Tools (SSDT)
  • Reporting Services for Developers
  • Memory-Optimized Tables and Columnstore Indexes in SQL Server 2014
  • Microsoft Azure SQL Database
TH03 Implementing Data Warehouse Patterns - Attendees Choose
10/01/2015   8:00am - 9:15am
You will learn:
•Techniques for Populating Fact Tables and Dimension tables using T-SQL MERGE statements
•Special considerations for managing Type 2 Slowly Changing dimensions
•Tips for implementing many of the patterns found in the Kimball methodology
 You will also get to understand the following:
Type 2 changing dimensions
Confirmed dimensions
Dimension outriggers
Role-playing relationhsips
Many-tomany bridge table relationships
Factless Fatc tables

TH07: Power BI 2.0: Analytics in the Cloud and Excel
10/01/2015   9:30am - 10:45am
Get to speed quickly on what Power BI has done since version 1 and what's new in Version
Try to read up on my posts in blogs and you might have already learnt a lot!
TH11: Busy Develoepr's Guide to NoSQL
10/01/2015   11:00am - 12:15pm

You will learn:
•About NoSQLs
•What NoSQLs offer that the RDBMS don't
•When and how (and when not) to use them

TH15  Big Data and Hadoop with Azure HDInsight
10/01/2015   1:30pm - 2:45pm
  • Learn what Hadoop is, how it works, what to do to get it running on the Azure cloud and how to integrate it.
  • Learn how to integrate it with SQL Server BI, Excel and third party tools
TH19  Predictive Analytics and Azure Machine Learning
10/01/2015   3:00pm - 4:15pm

You will learn:
•Learn the fundamentals of predictive analytics, including which algorithms work best for specific scenarios
•Learn how to build Azure Machine Learning experiments and models
•Get exposed to the R programming language and see how to integrate R code into Azure ML experiments

W19  SQL Server Reporting Services - Attendees Choose Topics
[This presentation takes some of the more complicated topics and offers them as choices that attendees can pick for the session.]
You will learn:
•About advanced SSRS features "in action", as implemented in actual SSRS client applications
•The implementation of mass-automated report delivery of data driven subscriptions in SSRS (and also see .NET code used in SSIS scripts for customized delivery of reports)
•A general understanding of Microsoft's current and future offerings for SSRS in the Cloud

W15  Transact-SQL for Application Developers - Attendees Choose Topics
09/30/2015 3:00pm - 4:15pm
[Attendees choose from 25 possible topics - plus you'll walk away with all 25 code samples.]
You will learn:
•About the new language enhancements in SQL 2012
•Where sub-queries are necessary
•To gain some appreciation of performance optimization

W11  Real World SQL Server Data Tools
09/30/2015   1:30pm - 2:45pm
You will learn:
•Techniques for importing existing legacy SQL Server schemas with old, dead code. Basically, how do you take the kind of messy, not-well-maintained databases that I see at customer sites and bring it into SSDT without losing your sanity and giving up?
•About real enterprise application databases that often have references across database boundaries, and how to identify and variable-ize these database objects and deal with cross-database and cross-SSDT project references.
•How to handle incremental automated deployment scenarios from the command line or TFS Build and manage 'lookup' data as part of these deployment scenarios.

M01  Workshop: Big Data, Analytics and NoSQL: Everything You Wanted to Learn But Were Afraid to Ask

09/28/2015  9:00am - 6:00pm
"Who has time to learn about Big Data, Analytics and NoSQL? "
Perhaps you can learn a little bit of all the following:
Big Data

Read more here:

Friday, September 11, 2015

SQL Server 2016 CTP 2.3 is ready to download

SQL Server 2016 CTP 2.3 is available since September 2, 2015. Several features have been enhanced. Looks like integration of R Language with SQL Server 2016 is built-in"

The following are the stated enhancements in this preview:
  • Row Level Security support for in-memory OLTP tables
  • A core engine scalability improvement, dynamically partitioning thread safe memory objects by NUMA or CPU, which enables higher scalability
  • In-memory data warehouse (column store) performance optimizations
  • Many-to-many derived hierarchies in Master Data Services
  • Several performance improvements to SQL Server Analysis Services, including DAX query performance
  • Enhancements to Reporting Services, including an updated Report Builder with a modern theme and report rendering for modern browsers

These are additional enhancements besides the above:
Improvements to the SQL Server Integration Service, Query Execution, and DBCC CHECKDB.

Technical Datasheet is  here (download PDF).

Download the evaluation edition here.