Entry G16: Components Comparison

1 minute read

The notebook that accompanies this entry is a cleaned up, concise version of the three notebooks I created for Entry G8, addressing just the graph components. Just like Entry G14, this is a supplement to, not a replacement for, the older entry (G8) so make sure you read the older one first.

This was the easiest of the redo entries/notebooks. There were two surprises when I put everything together. The first struck me as really strange: the Bimodal and Mixed Models have the exact same components; ids, sizes, everything.

At this point, I can only guess that the KNOWS relationships play such a small part in the shape of the Mixed Model graph that they’re negligible. This would leave the algorithm to find the same components in both the Bimodal and Mixed Models.

I ran the algorithm a couple of times with the same results. While weird, it is nice knowing that the results are reproducible.

ComparisonPermalink

The three graph models have very similar components. All three had exactly 22 components (which was the second surprise; I expected them to have different numbers of components). I’m interested to see if the number of components stays similar when we get to other datasets or whether the number of components stays the same between graph models regardless of whether it’s a unimodal, bimodal, or mixed representation.

I have run the components algorithm on other datasets with a very different number of components, so I do know that it returns a different number than only 22.

What I did expect was a giant component that consisted of the majority of the graph: all three graph models had a giant component comprising over 99% of the nodes. This means that less than 1% of nodes in all three models aren’t connected to the rest of the nodes in some way, shape, or form.

This aligns with the findings of Mark Newman in Networks, Second Edition on page 305, as previously referenced in Entry G8. Of the 27 different graphs from 4 different industries only 3 of them had a giant component consisting of less than 80% of the nodes. I expect the other example graphs I’ll be running will show the same giant component.

Up NextPermalink

Add Villains

ResourcesPermalink

Networks by Mark Newman
Entry G8

Twitter Facebook LinkedIn

Julie Fisher

Entry G16: Components Comparison

ComparisonPermalink

Up NextPermalink

ResourcesPermalink

You May Also Enjoy

Entry SM02: Clean Data

Entry SM01: Using S3 from AWS’s SageMaker

Entry NLP4: Frequencies and Comparison

Entry NLP3: Clean Data and Split into N-grams