# M03 - Visualization, Building and Evaluating Models

13 Oct 2015

# 1. Exploratory data analysis

- explore the data with visualization
- understand the relationships in the data
- create multiple views of data
- use conditioning
- understand sources of model errors

# 2. Types of visualization

- relationships in data can be complex
- data exploration requires multiple views
- conditioned plots are ideal
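
Conditioning means splitting the data by the value of one variable and looking at each group separately. A minimal sketch of the grouping step behind a conditioned plot, assuming made-up rows and a hypothetical `condition_on` helper (neither comes from the course material):

```python
from collections import defaultdict
from statistics import mean

def condition_on(rows, key):
    """Group rows by the value of the conditioning variable `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)

# Hypothetical data, for illustration only.
rows = [
    {"season": "summer", "temperature": 25.0, "rentals": 120},
    {"season": "summer", "temperature": 30.0, "rentals": 150},
    {"season": "winter", "temperature": 2.0, "rentals": 30},
    {"season": "winter", "temperature": 5.0, "rentals": 40},
]

groups = condition_on(rows, "season")
# One summary (or one plot panel) per value of the conditioning variable.
mean_rentals = {season: mean(r["rentals"] for r in group)
                for season, group in groups.items()}
for season, value in sorted(mean_rentals.items()):
    print(season, value)
```

Each group would normally get its own panel, so the same relationship can be compared across conditions.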

## 2.1. Plots

scatter plots

- typically have two dimensions: a horizontal axis and a vertical axis
- show the scatter of the points

line plots

- the order of the data matters (for example, time)

bar plots

- used for categorical (factor) data
- categories can be ordered or unordered

histograms

- mainly used for continuous variables (for example, temperature)
- show the distribution of the values
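
A histogram is a count of values per bin. A rough sketch with equal-width bins, assuming made-up temperature readings and a hypothetical `histogram_counts` helper:

```python
def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins.

    Assumes at least two distinct values, so the bin width is non-zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value goes into the last bin instead of a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts

# Hypothetical temperature readings, for illustration only.
temperatures = [12.0, 14.5, 15.0, 15.5, 16.0, 18.0, 19.5, 21.0, 22.0, 22.0]
edges, counts = histogram_counts(temperatures, 5)

# Text rendering: one row per bin, one '#' per value.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {'#' * count}")
```

The number of bins is a free choice; too few hides the shape of the distribution, too many makes it noisy.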

box plots

- break the data up into four quartiles
- the box spans the two middle quartiles
- the line (whisker) above the box shows the extent of the uppermost quartile
- the line (whisker) below the box shows the extent of the lowermost quartile
- the line inside the box shows the median value
- dots or plus signs show the outliers
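
The numbers behind a box plot can be computed directly. A minimal sketch, assuming the common 1.5 × IQR rule for marking outliers (the notes do not say which rule is used) and a hypothetical `box_plot_summary` helper:

```python
import statistics

def box_plot_summary(values):
    """Quartiles, whisker ends and outliers for a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: points beyond 1.5 * IQR from the box count as outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical data with one obvious outlier.
summary = box_plot_summary([2, 4, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

Plotting libraries differ in how they compute quartiles and whiskers, so the exact values can vary slightly between tools.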

violin plots

q-q plots

- q-q stands for quantile-quantile
- typically used to examine the residuals of a regression model
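
A q-q plot pairs the sorted residuals with the quantiles a reference distribution would give at the same positions. A sketch against the standard normal, assuming made-up residuals, a hypothetical `normal_qq_points` helper, and plotting positions of (i + 0.5) / n, which is one of several conventions:

```python
from statistics import NormalDist

def normal_qq_points(residuals):
    """Return (theoretical quantile, observed residual) pairs."""
    n = len(residuals)
    ordered = sorted(residuals)
    standard_normal = NormalDist()
    # Assumption: plotting positions (i + 0.5) / n for i = 0 .. n-1.
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical residuals from a regression model.
residuals = [-1.2, 0.3, 0.8, -0.4, 0.1, 1.5, -0.9]
points = normal_qq_points(residuals)
for theoretical, observed in points:
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the residuals are roughly normal, the points fall close to a straight line; curvature or stray points at the ends suggest skew or heavy tails.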

# 3. Building models

## 3.1. Modeling process

- metrics (delivering value): the way to measure the performance and results of machine learning models
- feature engineering: selecting good features for building machine learning models
- model construction
- testing (different types of models, metrics, parameters)
- evaluation (with testing data sets)
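
The steps above can be sketched end to end in a few lines. This is only an illustration: the data is synthetic, the model is a one-feature least-squares line, and `train_test_split`, `fit_line` and `score` are hypothetical helpers, not Azure ML modules. Scoring is taken here to mean applying the trained model to data to produce predictions.

```python
import random

def train_test_split(rows, test_fraction, seed):
    """Shuffle the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(rows):
    """Model construction: least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def score(model, xs):
    """Scoring: apply the trained model to feature values."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]

# Synthetic data that follows y = 2x + 1 exactly (illustration only).
rows = [(float(x), 2.0 * x + 1.0) for x in range(20)]
train, test = train_test_split(rows, 0.25, seed=0)

model = fit_line(train)
predictions = score(model, [x for x, _ in test])

# Evaluation on the held-out test set with the chosen metric (RMSE).
rmse = (sum((p - y) ** 2 for p, (_, y) in zip(predictions, test)) / len(test)) ** 0.5
print(model, rmse)
```

Holding the test rows out of training is what makes the final metric an honest estimate of how the model behaves on new data.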

**Terms to know:** *scoring*

Azure ML modules:

# 4. Evaluating models

- select the metrics used to evaluate model performance:
  - regression (root mean squared error, absolute error, median error)
  - classification (AUC, sensitivity, confusion matrix)
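
The regression metrics can be computed from actual and predicted values. A minimal sketch, assuming "absolute error" means mean absolute error and "median error" means median absolute error (the notes do not define them), with made-up numbers and a hypothetical `regression_metrics` helper:

```python
import math
import statistics

def regression_metrics(actual, predicted):
    """RMSE, mean absolute error and median absolute error."""
    errors = [p - a for a, p in zip(actual, predicted)]
    abs_errors = [abs(e) for e in errors]
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / len(errors)),
        "mae": sum(abs_errors) / len(abs_errors),
        "median_abs_error": statistics.median(abs_errors),
    }

# Hypothetical values, for illustration only.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 9.0, 9.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE penalizes large errors more heavily than the other two, while the median is the least sensitive to a few bad predictions.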

- evaluate metrics
- fix errors
- improve model
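
For a binary classifier, the classification metrics named in this section can be evaluated as follows. This is a sketch under stated assumptions: labels are 0/1, the 0.5 threshold is arbitrary, the data is made up, and AUC is computed as the fraction of positive/negative pairs ranked correctly (ties count half). The function names are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """Counts of true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    return {
        "tp": sum(1 for a, p in pairs if a == 1 and p == 1),
        "fn": sum(1 for a, p in pairs if a == 1 and p == 0),
        "fp": sum(1 for a, p in pairs if a == 0 and p == 1),
        "tn": sum(1 for a, p in pairs if a == 0 and p == 0),
    }

def sensitivity(cm):
    """Share of actual positives that the model caught."""
    return cm["tp"] / (cm["tp"] + cm["fn"])

def auc(actual, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and model scores.
actual = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]

cm = confusion_matrix(actual, predicted)
print(cm, sensitivity(cm), auc(actual, scores))
```

The confusion matrix and sensitivity depend on the chosen threshold; AUC does not, which makes it useful for comparing models before a threshold is fixed.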

Azure ML modules:
