Advanced visualization concepts
Information Visualization

What We Are Going to Learn

  • Traditional analytics methods
    • Clustering
    • Regression
    • Classification
    • Dimensionality reduction
    • Recommendation systems

Clustering

http://scikit-learn.org/stable/modules/clustering.html#clustering

When to Use It

  • When you want to find similar items
  • Results depend on your distance metric
  • When you have too many items and you want to aggregate

Examples

  • Customer segmentation
  • Grouping experiment outcomes
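
To make the idea concrete, here is a minimal sketch of k-means, one of the algorithms listed on the scikit-learn clustering page, written in plain Python. The customer data, the feature choice, and the function name kmeans are assumptions made for illustration. Euclidean distance is used, which is exactly the kind of metric choice the results depend on.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters with a basic k-means loop."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
    return labels, centroids

# Hypothetical customers: (visits per month, average basket in dollars)
customers = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (11, 79)]
labels, centroids = kmeans(customers, k=2)
print(labels)     # one cluster index per customer
print(centroids)  # the "average customer" of each segment
```

The two centroids act as aggregates: six customers are summarized by two representative points. In practice a library implementation such as sklearn.cluster.KMeans would likely be preferred over this sketch.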

Regression

http://scikit-learn.org/stable/modules/linear_model.html#passive-aggressive-algorithms

When to Use It

  • Present (identify/compare) tendency
  • Predict values

Regression

http://blockbuilder.org/tmcw/3931800 by tmcw

Examples

  • Stock prices
  • Drug response
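
As a rough illustration of both uses (showing a tendency and predicting a value), the sketch below fits a straight line by ordinary least squares in plain Python. This is the simplest linear model, not the passive-aggressive algorithm linked above. The dose/response numbers and the function name fit_line are made up for the example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical drug-response data: dose (mg) vs. measured response
doses = [10, 20, 30, 40, 50]
responses = [12, 19, 33, 41, 48]

slope, intercept = fit_line(doses, responses)
predicted = slope * 60 + intercept  # extrapolate to an unseen dose of 60 mg
print(slope, intercept, predicted)
```

The slope summarizes the tendency (how much the response changes per extra mg), while the last line uses the fitted line to predict a value outside the observed data.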

What car to buy?

User: person buying a car

Task: What's the best car to buy?

Data: all cars on sale

Normal procedure

Ask friends and family

Problem

That's inferring statistics from a sample of size n = 1

Better approach

Data-based decisions

https://tucarro.com

Jeep Willys

  • Colombia bought many Jeeps after the war
  • They are a sort of mountain taxi
  • There is a trend to pimp them up

Colombian Jeep Willys. Photo by John Alexis Guerra Gómez

Classification

Classification algorithms
https://scikit-learn.org/stable/modules/ensemble.html#random-forests

When to Use It

  • Present (identify/compare) tendency
  • Present (identify/compare) groups
  • Aggregate
  • Predict values

Examples

  • Photo categorization
  • Sentiment analysis
  • Spam filtering
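
A classifier learns from labeled examples and assigns a label to new items. The sketch below uses k-nearest neighbors in plain Python because it fits in a few lines. It is a stand-in for illustration, not the random forests linked above. The spam features, the training messages, and the function name knn_predict are assumptions.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training examples."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical spam-filter features: (number of links, number of "!" characters)
train = [
    ((0, 0), "ham"), ((1, 0), "ham"), ((0, 1), "ham"),
    ((7, 9), "spam"), ((8, 6), "spam"), ((9, 8), "spam"),
]

print(knn_predict(train, (8, 7)))  # message with many links and "!"
print(knn_predict(train, (1, 1)))  # message with few of either
```

Unlike clustering, the groups here are given in advance by the labels in the training data.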

Dimensionality Reduction

When to Use It

  • Attribute filtering
  • Categorize documents (topic modeling)
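
One very simple form of attribute filtering is to drop attributes that barely vary across items, since they cannot help tell items apart. The sketch below does this with a variance threshold in plain Python. Projection methods such as PCA belong to a different family and are not shown. The car listings, the threshold value, and the function name filter_attributes are hypothetical.

```python
from statistics import pvariance

def filter_attributes(rows, names, min_variance):
    """Keep only the columns whose variance reaches min_variance."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) >= min_variance]
    kept_names = [names[i] for i in keep]
    reduced = [[row[i] for i in keep] for row in rows]
    return kept_names, reduced

# Hypothetical car listings: (year, doors, price in thousands)
names = ["year", "doors", "price"]
rows = [
    [2010, 4, 15],
    [2015, 4, 22],
    [2018, 4, 30],
    [2005, 4, 9],
]

kept_names, reduced = filter_attributes(rows, names, min_variance=0.5)
print(kept_names)  # "doors" is constant in this sample, so it is dropped
print(reduced)
```

Variance depends on the unit of each attribute, so a single threshold is only meaningful when attributes are on comparable scales.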

Recommendation Systems

When to Use It

  • Large catalog with user preference history
  • If you like "A" and "B", maybe you will like "C"

Types

  • Collaborative filtering
  • Content-based systems
  • Hybrids
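
The "if you like A and B, maybe you will like C" idea can be sketched as user-based collaborative filtering: find users with similar taste and suggest what they liked. The version below uses Jaccard similarity over sets of liked items, which is one possible choice among many. The users, the items, and the function name recommend are invented for the example.

```python
def recommend(target, likes):
    """Rank items the target user has not liked, weighted by similar users."""
    mine = likes[target]
    scores = {}
    for user, theirs in likes.items():
        union = mine | theirs
        if user == target or not union:
            continue
        # Jaccard similarity: shared likes divided by all likes of either user.
        similarity = len(mine & theirs) / len(union)
        if similarity == 0:
            continue  # ignore users with no taste in common
        for item in theirs - mine:
            scores[item] = scores.get(item, 0.0) + similarity
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical preference history: user -> set of liked items
likes = {
    "ana": {"A", "B"},
    "ben": {"A", "B", "C"},
    "cat": {"A", "B", "C"},
    "dan": {"D", "E"},
    "eve": {"B", "F"},
}

print(recommend("ana", likes))
```

A content-based system would instead compare attributes of the items themselves, and a hybrid would combine both signals.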

Examples

  • Amazon
  • Facebook
  • Google
  • Yahoo
  • Netflix Prize

How to Use the Algorithms

What We Learned

  • Traditional analytics methods
    • Clustering
    • Regression
    • Classification
    • Dimensionality reduction
    • Recommendation systems