Azure Machine Learning Studio: the best place for initial data analysis?

While looking at a (relatively small, 1.7 million records) big data example of New York Yellow Cab taxi trips, I am coming to the conclusion that the best place (if as we do you are using Microsoft tools) for initial analysis, including the all important first step of finding outliers/errors, is Azure Machine Learning Studio (Azure ML, as opposed to Excel, Power BI or bespoke analysis using e.g. Kendo UI).

Why Azure ML for initial analysis?

  1. It loads data quite quickly (e.g. just over a minute to import almost 2 million records from an Azure SQL database). This is currently much quicker than Power BI.
  2. It automatically produces histograms and box plots of numeric fields (see the images below, and above, where the field FareAmount has been selected). We can tell immediately from the box plot that there are several outliers (and in fact probable errors that will need to be either corrected or removed, in that FareAmount should not have negative values!).

BigDataTaxiTripsAzureMLInitialVisualization

Kendo UI is useful for producing powerful filterable/sortable grids such as the one shown below:

BigDataKendoUITaxiTripsGrid

and also useful charts, e.g. a scatter plot of Total (Fare) Amount by Trip Distance:

BigDataKendoUITaxiTripsScatterPlot

However, it takes time and effort to set up such grids and charts, a bespoke webpage or site with the Kendo UI components installed is needed, and using more than a few thousand points in the charts can be very slow.

Start in Azure ML, move to Power BI for reporting

So while the Kendo UI components or Power BI are useful in my view for doing final, interactive reporting in a dashboard to senior management, my preference at the moment is to load the data in Azure ML. I get box plots and histograms without having to do any work, and when I want to look at scatter plots and faceted histograms, I can quickly produce these from Azure ML by running Python or R scripts.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s