Utilities for computing standardization values for distance measures.

author: Harrison Hartle/Tim LaRock (timothylarock at gmail dot com)

Submitted as part of the 2019 NetSI Collabathon.

netrd.utilities.standardize.mean_GNP_distance(n, prob, distance, samples=10, **kwargs)

Mean distance between \(G(n, p)\) graphs.

Generate a number of \(G(n, p)\) graphs, set by samples, with n nodes and edge probability prob, and compute the mean distance between them using the function distance, whose parameters are passed via **kwargs.

Parameters

n (int)

Number of nodes in the ER graphs to be generated.

prob (float)

Probability of edge in ER graphs to be generated.

samples (int)

Number of samples to average distance over.

distance (function)

Distance function, i.e., the dist method of a netrd.distance.<distance> object.

**kwargs (dict)

Keyword arguments to pass to the distance function.

Returns

mean (float)

The average distance between the sampled ER networks.

std (float)

The standard deviation of the distances.

dist (np.ndarray)

Array storing the actual distances.


Ideally, each sample would involve generating two \(G(n, p)\) graphs, computing the distance between them, and then discarding both. That would be computationally expensive, so for now the same sampled graphs are reused across comparisons. The diagonal of the distance matrix is excluded, i.e., the distance between a sample graph and itself is not computed.


import netrd

# Mean distance between G(100, 0.1) graphs under resistance perturbation, with p=2.
dist_obj = netrd.distance.ResistancePerturbation()
kwargs = {'p':2}
mean, std, dists = netrd.utilities.mean_GNP_distance(100, 0.1, dist_obj.dist, **kwargs)
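
The sample-reuse scheme described in the note above can be illustrated with a minimal, standard-library-only sketch. It is not the netrd implementation. Several things here are assumptions made for illustration: graphs are represented as edge sets rather than graph objects, hamming_distance is a hypothetical stand-in for a netrd distance, the seed argument is an addition for reproducibility, each unordered pair of samples is compared once, and the population standard deviation is used.

```python
import itertools
import random
import statistics


def sample_gnp_edges(n, prob, rng):
    """Return one G(n, p) sample as a frozenset of edges (u, v) with u < v."""
    return frozenset(
        pair for pair in itertools.combinations(range(n), 2) if rng.random() < prob
    )


def hamming_distance(edges_a, edges_b, weight=1.0):
    """Hypothetical stand-in distance: weighted count of edges in exactly one graph."""
    return weight * len(edges_a ^ edges_b)


def mean_gnp_distance_sketch(n, prob, distance, samples=10, seed=None, **kwargs):
    """Mean and std of pairwise distances among reused G(n, p) samples.

    Requires samples >= 2, since at least one pair is needed.
    """
    rng = random.Random(seed)
    # Generate the samples once and reuse them for every comparison.
    graphs = [sample_gnp_edges(n, prob, rng) for _ in range(samples)]
    # Visit each unordered pair once; i < j skips the diagonal (graph vs. itself).
    dists = [
        distance(graphs[i], graphs[j], **kwargs)
        for i, j in itertools.combinations(range(samples), 2)
    ]
    return statistics.mean(dists), statistics.pstdev(dists), dists


# Extra keyword arguments (here, weight) are forwarded to the distance function.
mean, std, dists = mean_gnp_distance_sketch(
    20, 0.1, hamming_distance, samples=5, seed=0, weight=1.0
)
```

With 5 samples this sketch evaluates the distance on 10 pairs instead of drawing 20 fresh graphs, which is the saving the note describes. The cost is that the pairwise distances are no longer statistically independent.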