Define Networks
Networks are defined by two things: nodes and links.
Nodes: a collection of entities, each with its own properties, that are somehow related to each other
- E.g., people, forks in rivers, proteins, web pages, organisms, etc.
Edges/Links: connections between nodes
- Links may be directed or undirected
- Links may be binary or weighted
[Slide courtesy of Andy Reagan]
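The definitions above can be sketched in a few lines of plain Python. This is only an illustration: the names (Ann, Bob, Cat), the weights, and the helper functions `neighbors_undirected` and `out_weights` are made up for the example.

```python
# Hypothetical example data: three people and the links between them.
nodes = ["Ann", "Bob", "Cat"]

# Binary, undirected links: a pair is either connected or it is not.
undirected_links = [("Ann", "Bob"), ("Bob", "Cat")]

# Weighted, directed links: (source, target, weight).
directed_links = [("Ann", "Bob", 3.0), ("Bob", "Ann", 1.0), ("Bob", "Cat", 2.0)]


def neighbors_undirected(node, links):
    """Return every node that shares an undirected link with `node`."""
    found = set()
    for a, b in links:
        if a == node:
            found.add(b)
        elif b == node:
            found.add(a)
    return found


def out_weights(node, links):
    """Return {target: weight} for the directed links leaving `node`."""
    return {target: weight for source, target, weight in links if source == node}


print(sorted(neighbors_undirected("Bob", undirected_links)))
print(out_weights("Bob", directed_links))
```

In the undirected case the order within a pair does not matter; in the directed case (Ann, Bob) and (Bob, Ann) are separate links and can carry different weights.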
Just Some Examples
- Tournaments
- Organization charts
- Genealogy
- Diagramming (e.g., Visio)
- Biological interactions (genes, proteins)
- Computer networks
- Social networks
- Simulation and modeling
- Integrated circuit design
- River systems
- Many, many more (and some history)
Network values
Nodes
- Usually contain the values
- E.g., a friend's attributes (name, age, gender, etc.) in a social network
Links
- Links can also have attributes
- When the friendship was established
- What type of relationship it is
- How many friends the two have in common
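One possible way to hold node and link attributes, sketched with plain dictionaries. All names and values are invented, and `link_between` is a hypothetical helper.

```python
# Hypothetical social network; every name and value here is made up.
node_attributes = {
    "Ann": {"age": 34},
    "Bob": {"age": 29},
}

# Each link is keyed by an unordered pair of friends and has its own attributes.
link_attributes = {
    frozenset(["Ann", "Bob"]): {
        "since": "2015-06-01",       # when the friendship was established
        "relationship": "coworker",  # what type of relationship it is
        "friends_in_common": 12,     # how many friends the two share
    },
}


def link_between(a, b):
    """Return the attributes of the link between a and b, or None if absent."""
    return link_attributes.get(frozenset([a, b]))


print(node_attributes["Ann"]["age"])
print(link_between("Bob", "Ann")["relationship"])
```

Using a `frozenset` as the key makes the lookup independent of the order of the two friends, which suits an undirected friendship link.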
How to Store Network Data
Node and Link Files
Node and Link Files (cont.)
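A minimal sketch of the node file plus link file layout, assuming one CSV table per file. The column names (`id`, `name`, `age`, `source`, `target`, `weight`) and the rows are assumptions for illustration; the text is parsed from strings here so the example runs without any files on disk.

```python
import csv
import io

# Hypothetical file contents; the column names are assumptions.
NODES_CSV = """id,name,age
1,Ann,34
2,Bob,29
3,Cat,41
"""

LINKS_CSV = """source,target,weight
1,2,3
2,3,1
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


# Index the node rows by id so links can be resolved to node records.
nodes = {row["id"]: row for row in read_rows(NODES_CSV)}
links = read_rows(LINKS_CSV)

# Every link endpoint should refer to an id that exists in the node file.
dangling = [l for l in links if l["source"] not in nodes or l["target"] not in nodes]

for link in links:
    source = nodes[link["source"]]["name"]
    target = nodes[link["target"]]["name"]
    print(source, "->", target, "weight", link["weight"])
```

Note that `csv.DictReader` returns every value as a string, so numeric columns such as `weight` need converting before any arithmetic.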
Adjacency Matrix
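An adjacency matrix has one row and one column per node, and entry (i, j) records the link from node i to node j. A minimal sketch with nested lists follows; the node names and the helper `build_adjacency_matrix` are made up for the example.

```python
def build_adjacency_matrix(nodes, links, directed=False):
    """Return a binary adjacency matrix as a list of lists.

    matrix[i][j] is 1 if there is a link from nodes[i] to nodes[j], else 0.
    """
    index = {name: i for i, name in enumerate(nodes)}
    size = len(nodes)
    matrix = [[0] * size for _ in range(size)]
    for source, target in links:
        matrix[index[source]][index[target]] = 1
        if not directed:
            # An undirected link is stored in both directions (symmetric matrix).
            matrix[index[target]][index[source]] = 1
    return matrix


# Hypothetical example data.
nodes = ["Ann", "Bob", "Cat"]
links = [("Ann", "Bob"), ("Bob", "Cat")]

for row in build_adjacency_matrix(nodes, links):
    print(row)
```

For a weighted network the 1 would be replaced by the link weight. The matrix needs one cell per pair of nodes, so it grows with the square of the node count even when most pairs are unlinked.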
Nested: XML/JSON
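One way a nested format might look, sketched as JSON in which each node object carries its own list of outgoing links. The key names (`nodes`, `id`, `links`, `target`, `weight`) are assumptions, not a standard schema, and `flatten_links` is a hypothetical helper.

```python
import json

# Hypothetical nested structure: links live inside the node they start from.
network = {
    "nodes": [
        {"id": "Ann", "links": [{"target": "Bob", "weight": 3}]},
        {"id": "Bob", "links": [{"target": "Cat", "weight": 1}]},
        {"id": "Cat", "links": []},
    ]
}

# Serialize to JSON text and parse it back, as if writing and reading a file.
text = json.dumps(network, indent=2)
loaded = json.loads(text)


def flatten_links(net):
    """Turn the nested structure back into a flat (source, target, weight) list."""
    return [
        (node["id"], link["target"], link["weight"])
        for node in net["nodes"]
        for link in node["links"]
    ]


print(flatten_links(loaded))
```

The nested form keeps each node and its links together in a single file, at the cost of needing a flattening step like the one above before the data can be used as an edge list.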
How to Create Network Data
- Group by common attribute.
- Identify nodes, extract links.
From Flat Data
Say we have tabular data for Les Miserables with columns for "scene", "character", and "line". We want to examine the network of which characters co-occur in scenes. Take all unique characters as nodes, and link together all characters that appear in the same scene.
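- JS (D3 v5 and earlier; later versions replace d3.nest with d3.group):
d3.nest().key(function(d) { return d.scene; })
- Python (groupby is a method of a pandas DataFrame, here called df, not of the pd module):
df.groupby('scene')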
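The group-then-link step can be sketched end to end. This version uses only the Python standard library, with a dictionary of sets standing in for the pandas groupby. The rows are invented sample data (which character appears in which scene is made up), and `cooccurrence_network` is a hypothetical helper.

```python
from collections import defaultdict
from itertools import combinations

# Invented sample rows in the "scene", "character", "line" layout.
rows = [
    {"scene": 1, "character": "Valjean", "line": "(line text)"},
    {"scene": 1, "character": "Javert", "line": "(line text)"},
    {"scene": 1, "character": "Valjean", "line": "(line text)"},
    {"scene": 2, "character": "Valjean", "line": "(line text)"},
    {"scene": 2, "character": "Cosette", "line": "(line text)"},
    {"scene": 2, "character": "Marius", "line": "(line text)"},
]


def cooccurrence_network(rows):
    """Return (nodes, weights), where weights counts shared scenes per pair."""
    # Group by the common attribute: collect the set of characters per scene.
    by_scene = defaultdict(set)
    for row in rows:
        by_scene[row["scene"]].add(row["character"])

    # Nodes are all unique characters.
    nodes = sorted({row["character"] for row in rows})

    # Link every pair of characters that appear in the same scene.
    weights = defaultdict(int)
    for characters in by_scene.values():
        for a, b in combinations(sorted(characters), 2):
            weights[(a, b)] += 1
    return nodes, dict(weights)


nodes, weights = cooccurrence_network(rows)
print(nodes)
for pair, count in sorted(weights.items()):
    print(pair, count)
```

Sorting each scene's characters before pairing them means every pair is stored under one key, so (Javert, Valjean) and (Valjean, Javert) are counted as the same undirected link.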
Identify Nodes, Extract Links
Given an extract of social network data:
- Loop through all of the messages.
- Add each user to a list of all users.
- Add each "mention" of another user to an edge list.
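The loop described above can be sketched as follows. The message format (a dict with `user` and `text` keys), the `@name` mention syntax, and the helper `extract_network` are all assumptions for illustration; real platforms expose mentions in their own formats.

```python
import re

# Invented messages; the "user"/"text" layout is an assumption.
messages = [
    {"user": "ann", "text": "Lunch with @bob and @cat today"},
    {"user": "bob", "text": "@ann sounds good"},
    {"user": "cat", "text": "no mentions here"},
]

# Assumed mention syntax: "@" followed by letters, digits, or underscores.
MENTION = re.compile(r"@(\w+)")


def extract_network(messages):
    """Return (users, edges), where each edge is (author, mentioned_user)."""
    users = set()
    edges = []
    for message in messages:
        author = message["user"]
        users.add(author)
        for mentioned in MENTION.findall(message["text"]):
            users.add(mentioned)               # mentioned users are nodes too
            edges.append((author, mentioned))  # directed link: author -> mentioned
    return sorted(users), edges


users, edges = extract_network(messages)
print(users)
print(edges)
```

The resulting edge list is directed, since a mention goes from the author to the mentioned user. Repeated mentions show up as repeated edges, which can later be counted to produce link weights.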