# Distance¶

Graph distance methods to compare two networks.

## Base class¶

class netrd.distance.BaseDistance[source]

Base class for all distance algorithms.

The basic usage of a distance algorithm is as follows:

>>> dist_obj = DistanceAlgorithm()
>>> distance = dist_obj.dist(G1, G2, <some_params>)
>>> # or alternatively: distance = dist_obj.results['dist']


Here, G1 and G2 are nx.Graph objects (or subclasses such as nx.DiGraph). The results dictionary holds the distance value, as well as any other values that were computed as a side effect.

## Available distances¶

All of the following algorithms inherit from BaseDistance and have the same general usage as above.

 netrd.distance.CommunicabilityJSD Jensen-Shannon divergence between communicability sequences. netrd.distance.DegreeDivergence Compare two degree distributions. netrd.distance.DeltaCon Compare matrices related to Fast Belief Propagation. netrd.distance.DistributionalNBD Distributional Non-backtracking Spectral Distance. netrd.distance.dkSeries Compare graphs based on their $$dk$$-series. netrd.distance.DMeasure Compare two graphs by their network node dispersion. netrd.distance.Frobenius The Frobenius distance between their adjacency matrices. netrd.distance.GraphDiffusion Find the maximally dissimilar diffusion kernels between two graphs. netrd.distance.Hamming Entry-wise disagreement between adjacency matrices. netrd.distance.HammingIpsenMikhailov Combination of Hamming and Ipsen-Mikhailov distances. netrd.distance.IpsenMikhailov Compares the spectrum of the Laplacian matrices. netrd.distance.JaccardDistance Jaccard distance between edge sets. netrd.distance.LaplacianSpectral Flexible distance able to compare the spectrum of the Laplacian in many ways. netrd.distance.NonBacktrackingSpectral Compares the empirical spectral distribution of the non-backtracking matrices. netrd.distance.NetLSD Compares spectral node signature distributions. netrd.distance.NetSimile Compares node signature distributions. netrd.distance.OnionDivergence Compares various types of feature distributions. netrd.distance.PolynomialDissimilarity Compares polynomials relating to the eigenvalues of the adjacency matrices. netrd.distance.PortraitDivergence Compares graph portraits. netrd.distance.QuantumJSD Compares the spectral entropies of the density matrices. netrd.distance.ResistancePerturbation Compares the resistance matrices.

## Reference¶

class netrd.distance.BaseDistance[source]

Base class for all distance algorithms.

The basic usage of a distance algorithm is as follows:

>>> dist_obj = DistanceAlgorithm()
>>> distance = dist_obj.dist(G1, G2, <some_params>)
>>> # or alternatively: distance = dist_obj.results['dist']


Here, G1 and G2 are nx.Graph objects (or subclasses such as nx.DiGraph). The results dictionary holds the distance value, as well as any other values that were computed as a side effect.

dist(G1, G2)[source]

Compute distance between two graphs.

Values computed as side effects of the distance method can be foun in self.results.

Parameters:
G1, G2 (nx.Graph): two graphs.
Returns:
distance (float).
class netrd.distance.Hamming[source]

dist(G1, G2)[source]

The proportion of disagreeing nodes between the flattened adjacency matrices.

If $$u$$ and $$v$$ are boolean vectors, then Hamming distance is:

$\frac{c_{01} + c_{10}}{n}$

where $$c_{ij}$$ is the number of occurrences of where $$u[k] = i$$ and $$v[k] = j$$ for $$k < n$$.

The graphs must have the same number of nodes. A small modification to this code could allow weights can be applied, but only one set of weights that apply to both graphs.

The results dictionary also stores a 2-tuple of the underlying adjacency matrices in the key ‘adjacency_matrices’.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

Returns:
dist (float)

the distance between G1 and G2.

References

class netrd.distance.Frobenius[source]

The Frobenius distance between their adjacency matrices.

dist(G1, G2)[source]

Frobenius distance between two graphs.

If $$a_{ij}$$ and $$b_{ij}$$ are the two adjacency matrices we define

$d(G1, G2) = \sqrt{\sum_{i,j} |a_{ij} - b_{ij}|^2}$

The results dictionary also stores a 2-tuple of the underlying adjacency matrices in the key ‘adjacency_matrices’.

Parameters:
G1, G2 (nx.Graph)

two graphs to compare

Returns:
float

the distance between G1 and G2

Notes

The graphs must have the same number of nodes.

class netrd.distance.PortraitDivergence[source]

Compares graph portraits.

dist(G1, G2, bins=None, binedges=None)[source]

Distance measure based on the two graphs’ “portraits”.

The results dictionary also stores a 2-tuple of the underlying adjacency matrices in the key ‘adjacency_matrices’ and the portrait matrices in ‘portrait_matrices’.

Parameters:
G1, G2 (nx.Graph)

two graphs

bins (int)

width of bins in percentiles

binedges (list)

vector of bin edges (mutually exclusive from bins)

Returns:
dist (float)

the portrait divergence between two graphs.

References

[1] An information-theoretic, all-scales approach to comparing networks James P. Bagrow and Erik M. Bollt, 2018 arXiv:1804.03665

class netrd.distance.JaccardDistance[source]

Jaccard distance between edge sets.

dist(G1, G2)[source]

Compute the Jaccard index between two graphs.

The Jaccard index between two sets

$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$

provides a measure of similarity between sets. Here, we use the edge sets of two graphs. The index, a measure of similarity, is converted to a distance

$d_J(A, B) = 1 - J(A, B)$

for consistency with other graph distances.

Parameters:
G1, G2 (nx.Graph)

two graphs to be compared.

Returns:
dist (float)

the distance between G1 and G2.

class netrd.distance.IpsenMikhailov[source]

Compares the spectrum of the Laplacian matrices.

dist(G1, G2, hwhm=0.08)[source]

Compare the spectrum ot the associated Laplacian matrices.

The results dictionary also stores a 2-tuple of the underlying adjacency matrices in the key ‘adjacency_matrices’.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

hwhm (float)

half with at half maximum of the lorentzian kernel.

Returns:
dist (float)

the distance between G1 and G2.

Notes

Requires undirected networks.

References

class netrd.distance.HammingIpsenMikhailov[source]

Combination of Hamming and Ipsen-Mikhailov distances.

dist(G1, G2, combination_factor=1)[source]

Graph distance combining local and global distances.

The local metric H is the Hamming distance, corresponding to the difference for the edges in both networks.

The global (spectral) metric IM is the Ipsen-Mikailov distance, corresponding to the square-root of the squared difference of the Laplacian spectrum for each network.

The Hamming-Ipsen-Mikhailov (HIM) distance is an Euclidean metric on the space created by the Cartesian product of the metric space associated with H and IM. The trade-off between global and local information is governed by a combination factor: when this is one, local and global information are balanced; when it is zero, it reduces to the (local) Hamming distance; and as it approaches infinity it becomes the (global) Ipsen-Mikhailov distance.

The results dictionary also stores a 2-tuple of the underlying adjacency matrices in the key ‘adjacency_matrices’, the Hamming distance in ‘hamming_dist’, the Ipsen-Mikhailov distance in ‘ipsen_mikhailov_dist’, and the Lorentzian half-width at half-maximum in ‘hwhm’. If the networks being compared are directed, the augmented adjacency matrices are calculated and stored in ‘augmented_adjacency_matrices’.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

combination_factor (float)

positive factor in front of the IM metric.

Returns:
dist (float)

the distance between G1 and G2.

Notes

Requires networks with the same number of nodes. The networks can be directed and weighted (with weights in the range $$[0,1]$$). Both (H and IM) are also saved in the results dictionary.

References

class netrd.distance.ResistancePerturbation[source]

Compares the resistance matrices.

dist(G1, G2, p=2)[source]

The p-norm of the difference between two graph resistance matrices.

The resistance perturbation distance changes if either graph is relabeled (it is not invariant under graph isomorphism), so node labels should be consistent between the two graphs being compared. The distance is not normalized.

The resistance matrix of a graph $$G$$ is calculated as $$R = \text{diag}(L_i) 1^T + 1 \text{diag}(L_i)^T - 2L_i$$, where $$L_i$$ is the Moore-Penrose pseudoinverse of the Laplacian of $$G$$.

The resistance perturbation distance between $$G_1$$ and $$G_2$$ is calculated as the $$p$$-norm of the difference in their resitance matrices,

$d_{r(p)} = | R^{(1)} - R^{(2)} | = ( \sum_{i,j \in V} | R^{(1)}_{i,j} - R^{(2)}_{i,j} |^p )^{1/p},$

where $$R^{(1)}$$ and $$R^{(2)}$$ are the resistance matrices of $$G_1$$ and $$G_2$$ respectively. When $$p = \infty$$, we have

$d_{r(\infty)} = \max_{i,j \in V} |R^{(1)}_{i,j} - R^{(2)}_{i,j}|.$

This method assumes that the input graphs are undirected; if directed graphs are used, it will coerce them to undirected graphs and emit a RuntimeWarning.

The results dictionary also stores a 2-tuple of the underlying resistance matrices in the key ‘resistance_matrices’.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

p (float or str, optional)

$$p$$-norm to take of the difference between the resistance matrices. Specify np.inf to take $$\infty$$-norm.

Returns:
dist (float)

the distance between G1 and G2.

References

class netrd.distance.NetSimile[source]

Compares node signature distributions.

dist(G1, G2)[source]

A scalable approach to network similarity.

A network similarity measure based on node signature distributions.

The results dictionary includes the underlying feature matrices in ‘feature_matrices’ and the underlying signature vectors in ‘signature_vectors’.

Parameters:
G1, G2 (nx.Graph)

two undirected networkx graphs to be compared.

Returns:
dist (float)

the distance between G1 and G2.

References

[1]

Michele Berlingerio, Danai Koutra, Tina Eliassi-Rad, Christos Faloutsos: NetSimile: A Scalable Approach to Size-Independent Network Similarity. CoRR abs/1209.2684 (2012)

class netrd.distance.NetLSD[source]

Compares spectral node signature distributions.

dist(G1, G2, normalization=None, timescales=None)[source]

NetLSD: Hearing the Shape of a Graph.

A network similarity measure based on spectral node signature distributions.

The results dictionary includes the underlying signature vectors in ‘signatures’.

Parameters:
G1, G2 (nx.Graph)

two undirected networkx graphs to be compared.

normalization (str)

type of normalization of the heat kernel vectors. either ‘complete’, ‘empty’ or ‘none’

timescales (np.ndarray)

timescales for the comparison. None yields default.

Returns:
dist (float)

the distance between G1 and G2.

References

[1]

A. Tsitsulin, D. Mottin, P. Karras, A. Bronstein & E. Müller. NetLSD: Hearing the Shape of a Graph. KDD 2018

class netrd.distance.LaplacianSpectral[source]

Flexible distance able to compare the spectrum of the Laplacian in many ways.

dist(G1, G2, normed=True, kernel='normal', hwhm=0.011775, measure='jensen-shannon', k=None, which='LM')[source]

Graph distances using different measure between the Laplacian spectra of the two graphs

The spectra of both Laplacian matrices (normalized or not) is computed. Then, the discrete spectra are convolved with a kernel to produce continuous ones. Finally, these distribution are compared using a metric.

The results dictionary also stores a 2-tuple of the underlying adjacency matrices in the key ‘adjacency_matrices’, the Laplacian matrices in ‘laplacian_matrices’, the eigenvalues of the Laplacians in ‘eigenvalues’. If the networks being compared are directed, the augmented adjacency matrices are calculated and stored in ‘augmented_adjacency_matrices’.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

normed (bool)

If True, uses the normalized laplacian matrix, otherwise the raw laplacian matrix is used.

kernel (str)

kernel to obtain a continuous spectrum. Choices available are ‘normal’, ‘lorentzian’, or None. If None is chosen, the discrete spectrum is used instead, and the measure is simply the euclidean distance between the vector of eigenvalues for each graph.

hwhm (float)

half-width at half-maximum for the kernel. The default value is chosen such that the standard deviation for the normal distribution is $$0.01$$, as in reference [1]. This option is relevant only if kernel is not None.

measure (str)

metric between the two continuous spectra. Choices available are ‘jensen-shannon’ or ‘euclidean’. This option is relevant only if kernel is not None.

k (int)

number of eigenvalues kept for the (discrete) spectrum, also used to create the continuous spectrum. If None, all the eigenvalues are used. k must be smaller (strictly) than the size of both graphs.

which (str)

if k is not None, this option specifies the eigenvalues that are kept. See the choices offered by scipy.sparse.linalg.eigsh. The largest eigenvalues in magnitude are kept by default.

Returns:
dist (float)

the distance between G1 and G2.

Notes

The methods are usually applied to undirected (unweighted) networks. We however relax this assumption using the same method proposed for the Hamming-Ipsen-Mikhailov. See [2].

References

class netrd.distance.PolynomialDissimilarity[source]

Compares polynomials relating to the eigenvalues of the adjacency matrices.

dist(G1, G2, k=5, alpha=1)[source]

Compares the polynomials of the eigenvalue decomposition of two adjacency matrices.

Note that the $$ij$$-th element of where $$A^k$$ corresponds to the number of paths of length $$k$$ between nodes $$i$$ and $$j$$.

The results dictionary also stores a 2-tuple of the underlying adjacency matrices in the key ‘adjacency_matrices’.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

k (float)

maximum degree of the polynomial

alpha (float)

weighting factor

Returns:
dist (float)

Polynomial Dissimilarity between G1, G2

References

[1]

Donnat, Claire, and Susan Holmes. “Tracking network dynamics: A survey of distances and similarity metrics.” arXiv preprint arXiv:1801.07351 (2018).

class netrd.distance.DegreeDivergence[source]

Compare two degree distributions.

dist(G1, G2)[source]

Jenson-Shannon divergence between degree distributions.

Assumes undirected networks.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

Returns:
dist (float)

the distance between G1 and G2.

class netrd.distance.OnionDivergence[source]

Compares various types of feature distributions.

dist(G1, G2, dist='lccm')[source]

Jensen-Shannon divergence between the feature distributions fixed by dist.

Assumes simple graphs.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

dist (str)

type of distribution divergence to output. Choices are ‘cm’, ‘ccm’, ‘lccm_node’ and ‘lccm’. The type stand for the associated random graph ensemble. ‘cm’ compares only the degree distribution. ‘ccm’ compares the networks according to the edges degree-degree distribution. ‘lccm_node’ compares the distribution of nodes according to their onion centrality (degree, coreness, and layer within core). Finally, ‘lccm’ compares the networks according to the edges joint degree, coreness and layer distribution for both endpoints.

Returns:
dist (float)

the distance between G1 and G2.

References

class netrd.distance.DeltaCon[source]

Compare matrices related to Fast Belief Propagation.

dist(G1, G2, exact=True, g=None)[source]

DeltaCon is based on the Matsusita between matrices created from fast belief propagation (FBP) on graphs G1 and G2.

Because the FBP algorithm requires a costly matrix inversion, there is a faster, roughly linear, algorithm that gives approximate results.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

exact (bool)

if True, use the slower but exact algorithm (DeltaCon_0)

g (int)

the number of groups to use in the efficient algorithm. If exact is set to False but g is not set, the efficient algorithm will still behave like the exact algorithm, since each node is put in its own group.

Returns:
dist (float)

the distance between G1 and G2.

References

[1]

Koutra, Danai, Joshua T. Vogelstein, and Christos Faloutsos. 2013. “Deltacon: A Principled Massive-Graph Similarity Function.” In Proceedings of the 2013 SIAM International Conference on Data Mining, 162–70. Society for Industrial and Applied Mathematics. https://doi.org/10.1137/1.9781611972832.18.

class netrd.distance.QuantumJSD[source]

Compares the spectral entropies of the density matrices.

dist(G1, G2, beta=0.1, q=None)[source]

Square root of the quantum $$q$$-Jensen-Shannon divergence between two graphs.

The generalized Jensen-Shannon divergence compares two graphs b √(H0 - 0.5 * (H1 + H2))y the spectral entropies of their quantum-statistical-mechanical density matrices. It can be written as

$\mathcal{J}_q(\mathbf{\rho} || \mathbf{\sigma}) = S_q\left( \frac{\mathbf{\rho} + \mathbf{\sigma}}{2} \right) - \frac{1}{2} [S_q(\mathbf{\rho}) + S_q(\mathbf{\sigma})],$

where $$\mathbf{\rho}$$ and $$\mathbf{\sigma}$$ are density matrices and $$q$$ is the order parameter.

The density matrix

$\mathbf{\rho} = \frac{e^{-\beta\mathbf{L}}}{Z},$

where

$Z = \sum_{i=1}^{N}e^{-\beta\lambda_i(\mathbf{L})}$

and $$\lambda_i(\mathbf{L})$$ is the $$ith eigenvalue of the Laplacian matrix :math:L$$, represents an imaginary diffusion process over the network with time parameter $$\beta > 0$$.

For these density matrices and the mixture matrix, we calculate the Rényi entropy of order $$q$$

$S_q = \frac{1}{1-q} \log_2 \sum_{i=1}^{N}\lambda_i(\mathbf{\rho})^q,$

or, if $$q=1$$, the Von Neumann entropy

$S_1 = - \sum_{i=1}^{N}\lambda_i(\mathbf{\rho})\log_2\lambda_i(\mathbf{\rho}).$

Note that this implementation is not exact because the matrix exponentiation is performed using the Padé approximation and because of imprecision in the calculation of the eigenvalues of the density matrix.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared

beta (float)

time parameter for diffusion propagator

q (float)

order parameter for Rényi entropy. If None or 1, use the Von Neumann entropy (i.e., Shannon entropy) instead.

Returns:
dist (float)

the distance between G1 and G2.

References

[1]

De Domenico, Manlio, and Jacob Biamonte. 2016. “Spectral Entropies as Information-Theoretic Tools for Complex Network Comparison.” Physical Review X 6 (4). https://doi.org/10.1103/PhysRevX.6.041062.

class netrd.distance.CommunicabilityJSD[source]

Jensen-Shannon divergence between communicability sequences.

dist(G1, G2)[source]

Compares the communicability matrix of two graphs.

This distance is based on the communicability matrix, $$C$$, of a graph consisting of elements $$c_{ij}$$ which are values corresponding to the numbers of shortest paths of length $$k$$ between nodes $$i$$ and $$j$$.

The commmunicability matrix is symmetric, which means the communicability sequence is formed by flattening the upper triangular of $$C$$, which is then normalized to create the communicability sequence, $$P$$.

The communicability sequence entropy distance between two graphs, G1 and G2, is the Jensen-Shannon divergence between these communicability sequence distributions, $$P1$$ and $$P2$$ of the two graphs.

Parameters:
G1, G2 (nx.Graph)

two graphs

Returns:
dist (float)

between zero and one, this is the communicability sequence distance bewtween G1 and G2.

Notes

This function uses the networkx approximation of the communicability of a graph, nx.communicability_exp, which requires G1 and G2 to be simple undirected networks. In addition to the final distance scalar, self.results stores the two vectors $$P1$$ and $$P2$$, their mixed vector, $$P0$$, and their associated entropies.

References

[1]

Estrada, E., & Hatano, N. (2008). Communicability in complex networks. Physical Review E, 77(3), 036111. https://journals.aps.org/pre/abstract/10.1103/PhysRevE.77.036111

[2]

Chen, D., Shi, D. D., Qin, M., Xu, S. M., & Pan, G. J. (2018). Complex network comparison based on communicability sequence entropy. Physical Review E, 98(1), 012319.

class netrd.distance.DistributionalNBD[source]

Distributional Non-backtracking Spectral Distance.

Computes the distance between two graphs using the empirical spectral density of the non-backtracking operator.

See: “Graph Comparison via the Non-backtracking Spectrum” A. Mellor & A. Grusovin arXiv:1812.05457 / 10.1103/PhysRevE.99.052309

VECTOR_DISTANCES = {'chebyshev': <function chebyshev>, 'euclidean': <function euclidean>}
dist(G1, G2, sparse=False, shave=True, keep_evals=True, k=None, vector_distance='euclidean', **kwargs)[source]

Distributional Non-backtracking Spectral Distance.

Parameters:
G1, G2 (nx.Graph)

The two graphs to compare.

sparse (bool)

If sparse, matrices and eigenvalues found using sparse methods. If sparse, parameter ‘k’ should also be specified. Default: False

k (int)

The number of largest eigenvalues to be calculated for the spectral density.

vector_distance (str)

The distance measure used to compare two empirical distributions. Currently available are ‘euclidean’ and ‘chebyshev’, implemented using SciPy. Default: ‘euclidean’

keep_evals (bool)

If True, stores the eigenvalues of the reduced non-backtracking matrix in self.results[‘evals’] Default: False

Returns:
float

The distance between G1 and G2

class netrd.distance.dkSeries[source]

Compare graphs based on their $$dk$$-series.

dist(G1, G2, d=2)[source]

Compute the distance between two graphs by using the Jensen-Shannon divergence between the $$dk$$-series of the graphs.

The $$dk$$-series of a graph is the collection of distributions of size $$d$$ subgraphs, where nodes are labelled by degrees. For simplicity, we currently consider only the $$1k$$-series, i.e., the degree distribution, or the $$2k$$-series, i.e., the distribution of edges between nodes of degree $$(k_i, k_j)$$. The distance between these $$dk$$-series is calculated using the Jensen-Shannon divergence.

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared

d (int)

the size of the subgraph to consider

Returns:
dist (float)

the distance between G1 and G2.

References

[1]

Orsini, Chiara, Marija M. Dankulov, Pol Colomer-de-Simón, Almerima Jamakovic, Priya Mahadevan, Amin Vahdat, Kevin E. Bassler, et al. 2015. “Quantifying Randomness in Real Networks.” Nature Communications 6 (1). https://doi.org/10.1038/ncomms9627.

class netrd.distance.DMeasure[source]

Compare two graphs by their network node dispersion.

dist(G1, G2, w1=0.45, w2=0.45, w3=0.1, niter=50)[source]

The D-Measure is a comparison of structural dissimilarities between graphs.

The key concept is the network node dispersion

$NND(G) = \frac{\mathcal{J}(\mathbf{P}_1,\ldots,\mathbf{P}_N)}{\log(d+1)},$

where $$\mathcal{J}$$ is the Jenson-Shannon divergence between $$N$$ node-distance distributions

$\mathbf{P}_i = \{p_i(j)\},$

and $$p_i(j)$$ is the fraction of nodes at distance $$i$$ from node $$j$$.

The D-measure itself is a weighted sum of three components: the square root of the Jensen-Shannon divergence between the average node-distance probabilities of the two graphs

$\mu_j = \frac{1}{N}\sum_{i=1}^N p_i(j),$

the second term is the absolute value of the differences in the square roots of the network node dispersions of the two graphs, and the third term is the sum of the square roots of the Jensen-Shannon divergences between the probability distributions of the alpha centralities of two graph and of their complements.

Parameters:
G1 (nx.Graph):

the first graph to be compared.

G2 (nx.Graph):

the second graph to be compared.

w1 (float):

weight of the first term in the calculation; with w2 and w3, must sum to 1.0.

w2 (float):

weight of the second term in the calculation; with w1 and w3, must sum to 1.0.

w3 (float):

weight of the third term in the calculation; with w1 d w2, must sum to 1.0.

niter (int):

the alpha centralities are calculated using power iteration, with this many iterations

Returns:
dist (float):

between 0 and 1, the D-measure distance between G1 and G2

Notes

The default values for w1, w2, and w3 are from the original paper.

References

[1]

Schieber, T. A. et al. Quantification of network structural dissimilarities. Nat. Commun. 8, 13928 (2017). https://www.nature.com/articles/ncomms13928

class netrd.distance.NonBacktrackingSpectral[source]

Compares the empirical spectral distribution of the non-backtracking matrices.

The eigenvalues are stored in the results dictionary.

dist(G1, G2, topk='automatic', ignore_negative_evals=True, batch=100, tol=1e-05)[source]

Non-Backtracking Distance between two graphs.

Parameters:
G1, G2 (nx.Graph)

The graphs to compare.

topk (int or ‘automatic’)

The number of eigenvalues to compute. If ‘automatic’ (default), use only the eigenvalues that are larger than the square root of the largest eigenvalue. Note this may yield different number of eigenvalues for each graph.

ignore_negative_evals (bool):

The original publication considers all eigenvalues for inclusion. If True, instead drop eigenvalues with negative complex parts.

batch (int)

If topk is ‘automatic’, this is the number of eigenvalues to compute each time until the condition is met. Default $$100$$.

tol (float)

Numerical tolerance when computing eigenvalues.

Returns:
float

The distance between G1 and G2

class netrd.distance.GraphDiffusion[source]

Find the maximally dissimilar diffusion kernels between two graphs.

dist(G1, G2, thresh=1e-08, resolution=1000)[source]

The graph diffusion distance between two graphs, $$G$$ and $$G'$$, is a distance measure based on the notion of flow within each graph. As such, this measure uses the unnormalized Laplacian matrices of both graphs, $$\mathcal{L}$$ and $$\mathcal{L}'$$, and uses them to construct time-varying Laplacian exponential diffusion kernels, $$e^{-t\mathcal{L}}$$ and $$e^{-t\mathcal{L}'}$$, by effectively simulating a diffusion process for $$t$$ timesteps, creating a column vector of node-level activity at each timestep. The distance $$d_\texttt{GDD}(G, G')$$ is defined as the Frobenius norm between the two diffusion kernels at the timestep $$t^{*}$$ where the two kernels are maximally different. That is, we compute the Frobenius norms and their differences for each timestep, and return the maximum difference.

$D_{GDD}(G,G') = \sqrt{||e^{-t^{*}\mathcal{L}}-e^{-t^{*}\mathcal{L}'}||}$

The results dictionary also stores a 2-tuple of the underlying adjacency matrices in adjacency_matrices, the Laplacian matrices in laplacian_matrices, and the output of the optimization process (peak_diffusion_time and peak_deviation).

Adapted from the authors’ MATLAB code, available at: https://rb.gy/txbfrh

Parameters:
G1, G2 (nx.Graph)

two networkx graphs to be compared.

thresh (float)

minimum value above which the eigenvalues will be considered.

resolution (int)

number of $$t$$ values to span through.

Returns:
dist (float)

the distance between G1 and G2.

References

[1]

Hammond, D. K., Gur, Y., & Johnson, C. R. (2013, December). Graph diffusion distance: A difference measure for weighted graphs based on the graph Laplacian exponential kernel. In Global Conference on Signal and Information Processing, 2013 IEEE (pp 419-422). IEEE. https://doi.org/10.1109/GlobalSIP.2013.6736904