Entry G5.5: Analysis Metrics
I am 100% adding this entry retroactively. And yes, I dated it wrong so that it would show up in the right order in the Post list. That’s the nice thing about digital diaries: you can insert things after the fact. It wasn’t until I was working on entries G12 and G13 that I realized I hadn’t posted all the analysis metrics I plan to address.
There is no guarantee that I’ll address all metrics under a single category in one post, and no guarantee that I’ll post them in the order listed. I may, however, update the list after the entries are live to reflect how they end up grouped.
I put together this list from my research into network theory, including:
- Networks by Mark Newman
- The online book Network Science by Albert-László Barabási as well as other books by him and his online material
- Graph Algorithms by Mark Needham and Amy Hodler
- Graph Databases by Ian Robinson, Jim Webber and Emil Eifrem
- Connected by Nicholas Christakis and James Fowler
- Graph People Blog
- Courses/videos/tutorials/blog posts through Neo4j
- Others whose names I’ve obviously forgotten
I’m not going to go into any detail on any of these topics in this entry. I’ll cover what each metric is and how it applies to data in the relevant entries.
If I’m feeling ambitious, I might come back and put the links to the relevant entries once they’ve been posted, like I’ve done for Global Metrics.
Global Metrics
- Entry G6: Global Counts and Entry 14: Global Counts Comparison
  - Node count
  - Isolates count and percent
  - Relationship count
- Entry G7: Global Density and Diameter and Entry 15: Global Density Comparison
  - Number of possible relationships
  - Global density
  - Diameter
- Entry G8: Components and Entry 16: Components Comparison
  - Component count
  - Component size
  - Component percent
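For a rough sense of what the global metrics above measure, here is a minimal sketch in plain Python. The toy graph, the node names and every variable name are made up for illustration; none of it comes from the notebooks that accompany the entries, and it assumes a simple undirected graph with no self-loops or parallel relationships. Diameter is left out to keep the sketch short.

```python
from collections import deque

# Hypothetical toy graph: six nodes, four undirected relationships.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]

# Build an adjacency map so neighbors can be looked up quickly.
adjacency = {n: set() for n in nodes}
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)

# Global counts.
node_count = len(nodes)
relationship_count = len(edges)
isolates = [n for n in nodes if not adjacency[n]]
isolate_percent = 100 * len(isolates) / node_count

# For a simple undirected graph, n nodes allow n * (n - 1) / 2 relationships.
possible_relationships = node_count * (node_count - 1) // 2
global_density = relationship_count / possible_relationships


def connected_components(adj):
    """Return a list of components, each a list of node ids, found by BFS."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            current = queue.popleft()
            component.append(current)
            for neighbor in adj[current] - seen:
                seen.add(neighbor)
                queue.append(neighbor)
        components.append(component)
    return components


components = connected_components(adjacency)
component_count = len(components)
component_sizes = sorted((len(c) for c in components), reverse=True)
component_percents = [100 * size / node_count for size in component_sizes]

print(node_count, relationship_count, isolates, possible_relationships)
print(round(global_density, 4), component_count, component_sizes)
```

An isolate counts as a component of size one in this sketch, which is one reasonable convention but not the only one.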
Local Metrics
- Entry 10: Local Metrics: first pass, accompanying notebooks use multiple separate graphs
  - Degree Count
  - Degree Descriptive Statistics
  - Degree Distribution Charts
  - Weighted Degree Count
  - Weighted Degree Descriptive Statistics
  - Weighted Degree Distribution Charts
- Entry 12: Degree Comparison: Unweighted degrees refined, accompanying notebooks use a single multi-graph database
  - Degree Count
  - Degree Descriptive Statistics
  - Degree Distribution Charts
- Entry 13: Weighted Degree Comparison: Weighted degrees refined, accompanying notebook uses a single multi-graph database
  - Weighted Degree Count
  - Weighted Degree Descriptive Statistics
  - Weighted Degree Distribution Charts
Density and Nearest Neighbors
- Number of nearest claim neighbors
- Number of next nearest claim neighbors
- Local density at various step levels (nearest neighbors, next nearest neighbors)
- Number of steps to nearest fraud
- Count of fraud at various step levels
- Percent of fraud at various step levels
- Distribution of fraud at various step levels
- Descriptive statistics for fraud at various step levels
Shortest Path
- Shortest path
- Shortest path descriptive statistics
- Shortest path distribution
- Shortest path to fraud
- Distribution of shortest path to fraud
- Descriptive statistics for shortest path to fraud
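One way to think about "shortest path to fraud" is as a breadth-first search that starts from every fraud node at once. The sketch below is only an illustration under assumptions: the claim ids, the `fraud` set and the function name `steps_to_nearest_fraud` are hypothetical, the graph is treated as unweighted and undirected, and the actual entries may well compute this differently (for example, with a graph database query).

```python
from collections import deque

# Hypothetical toy data: claims linked to each other, one flagged as fraud.
adjacency = {
    "c1": {"c2"},
    "c2": {"c1", "c3"},
    "c3": {"c2", "c4"},
    "c4": {"c3"},
    "c5": set(),  # an isolate with no path to any fraud
}
fraud = {"c4"}


def steps_to_nearest_fraud(adj, fraud_nodes):
    """Multi-source BFS: hop count from each reachable node to its nearest fraud node."""
    distance = {n: 0 for n in fraud_nodes}
    queue = deque(fraud_nodes)
    while queue:
        current = queue.popleft()
        for neighbor in adj[current]:
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return distance


steps = steps_to_nearest_fraud(adjacency, fraud)
for claim in sorted(adjacency):
    # Nodes missing from `steps` cannot reach any fraud node at all.
    print(claim, steps.get(claim, "unreachable"))
```

The resulting hop counts are the raw material for the distribution and descriptive statistics listed above; unreachable nodes need an explicit decision about whether to drop them or report them separately.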
Triangles
- Triangles / triads
- Distribution of triangles
- Triangles descriptive statistics
Tours / Cycles
- Tours / cycles
- Distribution of tours
- Tours descriptive statistics
Clustering Coefficient
- Clustering coefficient
- Global clustering coefficient
- Local clustering coefficient
- Distribution of clustering coefficient
- Clustering coefficient descriptive statistics
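Triangles and clustering coefficients are closely related, so a single small sketch can cover both. It assumes the common textbook definitions for a simple undirected graph: the local coefficient of a node with degree k and T triangles is 2T / (k(k - 1)), and the global coefficient (transitivity) is three times the number of triangles divided by the number of connected triples. The toy graph and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy graph: a triangle a-b-c plus a pendant node d attached to a.
adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}


def triangles_at(adj, node):
    """Count triangles through `node`: pairs of its neighbors that are also linked."""
    return sum(1 for u, v in combinations(adj[node], 2) if v in adj[u])


def local_clustering(adj, node):
    """Fraction of neighbor pairs that are connected; 0.0 when degree is below 2."""
    k = len(adj[node])
    if k < 2:
        return 0.0
    return 2 * triangles_at(adj, node) / (k * (k - 1))


def global_clustering(adj):
    """Transitivity: 3 * triangles / connected triples, or 0.0 if there are no triples."""
    # Summing per-node triangle counts sees each triangle three times.
    closed_triples = sum(triangles_at(adj, n) for n in adj)
    triples = sum(len(adj[n]) * (len(adj[n]) - 1) // 2 for n in adj)
    return closed_triples / triples if triples else 0.0


for node in sorted(adjacency):
    print(node, triangles_at(adjacency, node), round(local_clustering(adjacency, node), 3))
print("global", global_clustering(adjacency))
```

Treating low-degree nodes as having a coefficient of zero is a convention; some tools exclude them from averages instead, which changes the descriptive statistics.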
Centralities
- Degree Centrality
- Betweenness Centrality
- Closeness Centrality
- PageRank Centrality
- Eigenvector Centrality
K-cores
Not sure what I want to do with K-cores. I’ll explore it more when I get to it.
Other
- Community Detection
- Reciprocity
- Network link analysis