Information Visualization

- How to recognize, create, and store networks
- Network Visualization Techniques
- Node-Link Representations
- Force Simulations
- Matrix Representations

- Working with Color
- Sequential: one hue
- Divergent: two hues
- Categorical: Multiple hues
- Continuous: multiple hues

- A combination of nodes and links
- **Nodes**: entities with properties (and an id)
- **Links**: connections between nodes
- Links can also have properties
- Can be directed or undirected
- Can have self links, or multiple links between nodes

- Nodes usually contain the attribute values
- E.g., friend attributes (name, age, gender, etc.) in a social network

- But links can also have attributes
- When the friendship was established
- What type of relationship
- How many friends they have in common
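The node and link structure above, with attributes on both, can be stored with plain dictionaries; a minimal sketch (all names and values are made up for illustration):

```python
# A tiny social network stored as plain dictionaries:
# nodes keyed by id with properties; links as a list of dicts.
nodes = {
    "alice": {"name": "Alice", "age": 34},
    "bob":   {"name": "Bob",   "age": 29},
    "carol": {"name": "Carol", "age": 41},
}
links = [
    # Link properties: when established, type, friends in common.
    {"source": "alice", "target": "bob",   "since": 2015, "type": "friend",   "common": 3},
    {"source": "bob",   "target": "carol", "since": 2019, "type": "coworker", "common": 1},
]

# Undirected degree: count the links touching each node.
degree = {n: 0 for n in nodes}
for link in links:
    degree[link["source"]] += 1
    degree[link["target"]] += 1

print(degree)  # {'alice': 1, 'bob': 2, 'carol': 1}
```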

- Can be made from tables, grouping by attributes
- Several approaches exist for deriving a network from a table
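One way to derive a network from a table is to group rows by a shared attribute and link every pair within a group; a minimal sketch in plain Python (the table and its columns are hypothetical):

```python
from itertools import combinations
from collections import defaultdict

# Hypothetical table: one row per person, with a categorical attribute.
rows = [
    {"name": "alice", "team": "red"},
    {"name": "bob",   "team": "red"},
    {"name": "carol", "team": "blue"},
    {"name": "dave",  "team": "red"},
]

# Group rows by the attribute, then link every pair within a group.
groups = defaultdict(list)
for r in rows:
    groups[r["team"]].append(r["name"])

edges = set()
for members in groups.values():
    for a, b in combinations(sorted(members), 2):
        edges.add((a, b))

print(sorted(edges))
# [('alice', 'bob'), ('alice', 'dave'), ('bob', 'dave')]
```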

**Visual encoding:**
- Link connection marks, node point marks

**Tasks:**
- Explore topology; locate paths, clusters

**Scalability:**
- Node/edge density: E < 4N

**Considerations:**
- Spatial position: no meaning directly encoded
- Proximity semantics?

**Data:** network
- Transform into same data/encoding as heatmap

**Derived data:** table from network
- Two categorical attributes: node list × 2
- One quantitative attribute: weighted edge between nodes
**Visual encoding:**
- Cell shows presence/absence of edge

**Tasks:**
- Identify clusters (topology)
- Summarize topology/distribution

**Scalability:**
- 1,000 nodes, one million edges
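The derived table (node list × node list, with edge presence as the quantitative value) is exactly an adjacency matrix; a minimal construction in plain Python, with a made-up edge list:

```python
# Derive an adjacency matrix (the heatmap-style encoding) from an edge list.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c")]

index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1  # undirected: the matrix is symmetric

# Each cell shows presence (1) or absence (0) of an edge.
for node, row in zip(nodes, matrix):
    print(node, row)
```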

**Adjacency matrix strengths:**
- Predictability, scalability, supports reordering
- Some topology tasks trainable

**Node-link diagram strengths:**
- Topology understanding, path tracing
- Intuitive, no training needed

**Empirical study:**
- Node-link best for small networks
- Matrix best for large networks...
- ...if tasks don’t involve topological structure!

- Data: networks (small number of nodes)
- Tasks: summarize connections; identify highest degree
- Considerations: usually good for origin-to-destination flows

- Data: networks
- Tasks: summarize common connections
- Considerations:
- Reduces clutter
- Requires computation time
- Works with any link-based idiom

- Data: networks (few nodes)
- Tasks: summarize common connections.
- Considerations:
- Node order matters
- Works better with highly clustered data

- Data: networks with many edges
- Task: summarize distribution of non-network attributes
- Considerations:
- Easier to understand
- Scales well
- Edges on demand work best

- Number of nodes, number of edges
- Connected components: count of separate groups of nodes
- Graph density: percent of possible links that are present
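These basic statistics can be computed directly with NetworkX (which the notes later recommend); a minimal sketch on a made-up five-node graph with two components:

```python
import networkx as nx

# Two separate groups of nodes: {a, b, c} and {d, e}.
G = nx.Graph([("a", "b"), ("b", "c"), ("d", "e")])

n = G.number_of_nodes()                          # number of nodes
e = G.number_of_edges()                          # number of edges
components = nx.number_connected_components(G)   # separate groups of nodes
density = nx.density(G)                          # fraction of possible links present

print(n, e, components, round(density, 2))  # 5 3 2 0.3
```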

- E.g., run “Average Degree” tab in Gephi
- For pure random networks: $P_k = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$
- For preferential attachment: $P_k \sim k^{-\gamma}$
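Both distributions can be evaluated directly; a minimal sketch (the function names `poisson_pk` and `powerlaw_pk` are my own, and the power law is written up to a normalization constant):

```python
from math import exp, factorial

def poisson_pk(k, mean_k):
    """Poisson degree distribution of a pure random (Erdős–Rényi) network."""
    return exp(-mean_k) * mean_k**k / factorial(k)

def powerlaw_pk(k, gamma=3.0):
    """Power-law tail from preferential attachment, up to normalization."""
    return k ** -gamma

# Fraction of nodes with degree 2 when the average degree is 2.
print(round(poisson_pk(2, 2.0), 3))  # 0.271
```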

- E.g., run “Average Path Length” in Gephi
- The path length between nodes i and j defined as $d_{ij}$
- Average path length $\langle d_{ij} \rangle$
- Network diameter $d_{\max} = \max_{i,j} d_{ij}$
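These path measures map one-to-one onto NetworkX calls; a minimal sketch on a four-node path graph:

```python
import networkx as nx

G = nx.path_graph(4)  # 0 - 1 - 2 - 3

print(nx.shortest_path_length(G, 0, 3))    # d_ij between nodes 0 and 3: 3
print(nx.average_shortest_path_length(G))  # <d_ij>
print(nx.diameter(G))                      # d_max: 3
```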

- Betweenness centrality: number of shortest paths across node
- Degree centrality (node degree), also edge centrality (not in Gephi, use NetworkX)
- Eigenvector centrality $Ax = \lambda x$
- Closeness $d_{cl} = \left[ \frac{1}{\binom{n}{2}} \sum_{i<j} d_{ij}^{-1} \right]^{-1}$
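All of these centralities (including the edge centrality missing from Gephi) are available in NetworkX; a minimal sketch on a star graph, where the hub dominates every measure:

```python
import networkx as nx

# Star graph: node 0 is the hub, connected to leaves 1..4.
G = nx.star_graph(4)

bet = nx.betweenness_centrality(G)            # shortest paths through a node
deg = nx.degree_centrality(G)                 # normalized node degree
eig = nx.eigenvector_centrality(G)            # solves Ax = λx iteratively
clo = nx.closeness_centrality(G)              # inverse mean distance
edge_bet = nx.edge_betweenness_centrality(G)  # edge centrality (not in Gephi)

print(max(bet, key=bet.get))  # 0: the hub lies on every leaf-to-leaf path
```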

- PageRank, like eigenvector centrality, can be written as an eigenvalue problem: $$PR(p_i) = \frac{1-d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)}$$ where $d$ is the damping factor, $M(p_i)$ is the set of pages linking to $p_i$, and $L(p_j)$ is the number of outgoing links of $p_j$
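The PageRank update can be iterated to a fixed point by hand; a minimal power-iteration sketch on a made-up three-page link graph (page names and damping value are illustrative):

```python
# links[p] = pages that p links to; d is the damping factor.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
d, N = 0.85, 3

# Start uniform and apply PR(p) = (1-d)/N + d * sum over in-links of PR(q)/L(q).
pr = {p: 1 / N for p in links}
for _ in range(100):
    pr = {
        p: (1 - d) / N
        + d * sum(pr[q] / len(links[q]) for q in links if p in links[q])
        for p in links
    }

print({p: round(v, 3) for p, v in sorted(pr.items())})
```

Ranks sum to 1, and page "c", with two in-links, outranks "b".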

- Watts and Strogatz measure (averaged over the neighborhoods $N_i$ of each node $i$): $$ C_1 = \left \langle \frac{\sum_{j_1,j_2\in N_i} a_{j_1j_2}}{k_i(k_i-1)/2} \right \rangle $$
- Newman (and Gephi): $$ C_2 = \frac{3 \times \textrm{triangles}}{\textrm{triples}} $$
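The two definitions give different numbers on the same graph; a minimal sketch using NetworkX, where `average_clustering` is the Watts–Strogatz $C_1$ and `transitivity` is the Newman/Gephi $C_2$:

```python
import networkx as nx

# A triangle (a, b, c) plus a pendant node d attached to c.
G = nx.Graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])

c1 = nx.average_clustering(G)  # Watts–Strogatz: mean of per-node coefficients
c2 = nx.transitivity(G)        # Newman/Gephi: 3 * triangles / triples

print(round(c1, 3), round(c2, 3))  # 0.583 0.6
```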

- First rule: do not talk about color!
- Color is confusing if treated as monolithic
- Decompose into three channels
- Ordered can show magnitude
- Luminance: how bright
- Saturation: how colorful
- Categorical can show identity
- Hue: what color
- Channels have different properties
- What they convey directly to perceptual system
- How much they can convey: how many discriminable bins can we use?

- Need luminance for edge detection
- Fine-grained detail only visible through luminance contrast
- Legible text requires luminance contrast!
- Intrinsic perceptual ordering

- Redundantly encode.
- Vary luminance.
- Change shape.

- Human perception built on relative comparisons
- Great if color is contiguous
- Surprisingly bad for absolute comparisons

- Noncontiguous small regions of color
- Fewer bins than you want
- Rule of thumb: 6-12 bins, including background and highlights

- Glyphs: composite objects
- Internal structure with multiple marks

- Alternative to color encoding
- Or coding with any single channel

- Problems:
- Perceptually unordered
- Perceptually nonlinear

- Benefits:
- Fine-grained structure visible and nameable

- Alternative:
- Large-scale structure: fewer hues
- Fine structure: multiple hues with monotonically increasing luminance (e.g., Viridis in R/Python)

Colorful, perceptually uniform, colorblind-safe, monotonically-increasing luminance
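The monotone-luminance claim can be spot-checked in code; a minimal sketch, assuming Matplotlib is available, that samples Viridis and computes CIE relative luminance from linearized sRGB:

```python
from matplotlib import colormaps

def srgb_to_linear(c):
    # Undo the sRGB gamma encoding of a single channel.
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgba):
    # CIE relative luminance (Rec. 709 primaries) of a linearized color.
    r, g, b = (srgb_to_linear(v) for v in rgba[:3])
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

viridis = colormaps["viridis"]
lums = [luminance(viridis(i / 255)) for i in range(256)]

# Does luminance rise monotonically along the colormap?
print(all(b > a for a, b in zip(lums, lums[1:])))
```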
