Omnibus Embedding for Multiple Graphs
This demo shows how to simultaneously embed two graphs, sampled from different stochastic block models (SBMs), using omnibus embedding. We also compare the results with those of adjacency spectral embedding and show why it is useful to embed the graphs simultaneously.
[1]:
import graspologic
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
Simulate two different graphs using stochastic block models (SBM)
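We sample two 2-block SBMs (undirected, no self-loops) with 100 vertices, each block containing 50 vertices ($n = [50, 50]$), and the following block probabilities:
$$P_1 = \begin{bmatrix} 0.3 & 0.1 \\ 0.1 & 0.7 \end{bmatrix}, \quad P_2 = \begin{bmatrix} 0.3 & 0.1 \\ 0.1 & 0.3 \end{bmatrix}$$
The only difference between the two is the within-block probability of the second block. We sample $G_1 \sim \text{SBM}(n, P_1)$ and $G_2 \sim \text{SBM}(n, P_2)$.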
[2]:
from graspologic.simulations import sbm
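m = 50      # number of vertices in each block
n = [m, m]  # two blocks, 2*m = 100 vertices in total
# Block probability matrices; only the second diagonal entry differs
P1 = [[.3, .1],
      [.1, .7]]
P2 = [[.3, .1],
      [.1, .3]]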
np.random.seed(8)
G1 = sbm(n, P1)
G2 = sbm(n, P2)
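To make the sampling step concrete, here is a minimal NumPy-only sketch of what drawing an undirected, loop-free SBM amounts to, followed by a check that the empirical block densities land near the block probabilities. The helpers `sample_sbm` and `block_densities` are hypothetical names introduced for illustration; they are not part of graspologic and are not meant to reproduce the exact graphs that `sbm` returns.

```python
import numpy as np

def sample_sbm(n, P, rng):
    """Illustrative sketch: sample an undirected SBM adjacency matrix without self-loops."""
    labels = np.repeat(np.arange(len(n)), n)       # block label of every vertex
    probs = np.asarray(P)[labels][:, labels]       # edge probability for every vertex pair
    upper = np.triu(rng.random(probs.shape) < probs, k=1)  # sample pairs i < j only
    return (upper + upper.T).astype(float), labels  # symmetrize; diagonal stays 0

def block_densities(A, labels):
    """Empirical edge density within and between blocks."""
    k = labels.max() + 1
    D = np.zeros((k, k))
    for a in range(k):
        for b in range(k):
            sub = A[np.ix_(labels == a, labels == b)]
            if a == b:
                size = sub.shape[0]
                D[a, b] = sub.sum() / (size * (size - 1))  # exclude the diagonal
            else:
                D[a, b] = sub.mean()
    return D

rng = np.random.default_rng(8)  # arbitrary seed, chosen for this sketch only
A1, labels = sample_sbm([50, 50], [[0.3, 0.1], [0.1, 0.7]], rng)
A2, _ = sample_sbm([50, 50], [[0.3, 0.1], [0.1, 0.3]], rng)

D1 = block_densities(A1, labels)
D2 = block_densities(A2, labels)
print(np.round(D1, 2))
print(np.round(D2, 2))
```

The two density matrices should be close to the block probability matrices, with the clearest gap in the lower-right entry.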
Visualize the graphs using heatmap
We visualize the sampled graphs using the heatmap function, which plots the adjacency matrix with colors representing edge weights. In this case, the graphs are binary, so every value is either 0 or 1.
There is clear block structure in both graphs, and we see that the second (lower-right) block of $G_1$ has more edges than that of $G_2$.
[3]:
from graspologic.plot import heatmap
heatmap(G1, figsize=(7, 7), title='Visualization of Graph 1')
_ = heatmap(G2, figsize=(7, 7), title='Visualization of Graph 2')
Embed the two graphs using omnibus embedding
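The purpose of embedding graphs is to obtain a Euclidean representation, sometimes called latent positions, of the adjacency matrices. Again, we assume that the probability matrix of a graph is given by $P = XX^\top$, and we want to estimate the latent positions $X$.
We use all of the default parameters. Under the hood, the select_dimension algorithm automatically chooses the embedding dimension for us. In this example, we get the following estimate,
$$\hat{Z} = \begin{bmatrix} \hat{X}_1 \\ \hat{X}_2 \end{bmatrix}$$
where the first block, $\hat{X}_1$, contains the estimated latent positions of the first graph, and the second block, $\hat{X}_2$, contains those of the second graph. In the code, the two blocks are stacked along the first axis of the returned array.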
[4]:
from graspologic.embed import OmnibusEmbed
embedder = OmnibusEmbed()
Zhat = embedder.fit_transform([G1, G2])
print(Zhat.shape)
(2, 100, 2)
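For intuition about what a joint embedding of two graphs involves, the following is a minimal NumPy sketch of the omnibus construction for two graphs: the adjacency matrices sit on the diagonal of a larger matrix, their average fills the off-diagonal blocks, and the top singular vectors of that matrix are scaled by the square roots of the singular values. The function names are hypothetical and the embedding dimension `d` is fixed by hand; this is not graspologic's implementation, which also handles dimension selection and other options omitted here.

```python
import numpy as np

def omnibus_matrix(A1, A2):
    """Build the 2n x 2n omnibus matrix for two n x n adjacency matrices."""
    avg = (A1 + A2) / 2
    return np.block([[A1, avg],
                     [avg, A2]])

def omnibus_embed_sketch(A1, A2, d):
    """Sketch: spectral embedding of the omnibus matrix into d dimensions."""
    n = A1.shape[0]
    M = omnibus_matrix(A1, A2)
    U, S, _ = np.linalg.svd(M)         # singular values are returned in descending order
    Z = U[:, :d] * np.sqrt(S[:d])      # scale the top-d singular vectors
    return Z.reshape(2, n, d)          # first n rows -> graph 1, last n rows -> graph 2

# Small synthetic graphs so the sketch runs on its own (seed chosen arbitrarily)
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)

def sample(P):
    probs = np.asarray(P)[labels][:, labels]
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper + upper.T).astype(float)

A1 = sample([[0.3, 0.1], [0.1, 0.7]])
A2 = sample([[0.3, 0.1], [0.1, 0.3]])

M = omnibus_matrix(A1, A2)
Zhat_sketch = omnibus_embed_sketch(A1, A2, d=2)
print(M.shape, Zhat_sketch.shape)
```

Because both graphs are embedded through one shared matrix, the two sets of latent positions live in the same space and can be compared directly without a separate alignment step.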
Visualize the latent positions
Since both graphs have clear block structure, we should see two “clusters” when we visualize the latent positions. For vertices in the first block, the estimates from the two graphs should lie close together, since that block has the same probabilities in both graphs. For vertices in the second block, the estimates should lie further apart, since that block's probabilities differ between the graphs.
[5]:
Xhat1 = Zhat[0]
Xhat2 = Zhat[1]
# Plot the points
fig, ax = plt.subplots(figsize=(10, 10))
ax.scatter(Xhat1[:m, 0], Xhat1[:m, 1], marker='o', c='red', label = 'Graph 1, Block 1')
ax.scatter(Xhat1[m:, 0], Xhat1[m:, 1], marker='o', c='blue', label = 'Graph 1, Block 2')
ax.scatter(Xhat2[:m, 0], Xhat2[:m, 1], marker='s', c='red', label = 'Graph 2, Block 1')
ax.scatter(Xhat2[m:, 0], Xhat2[m:, 1], marker='s', c='blue', label= 'Graph 2, Block 2')
ax.legend()
# Plot lines between matched pairs of points
for i in range(2*m):
    ax.plot([Xhat1[i, 0], Xhat2[i, 0]], [Xhat1[i, 1], Xhat2[i, 1]], 'black', alpha = 0.15)
_ = ax.set_title('Latent Positions from Omnibus Embedding', fontsize=20)
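The lines between matched pairs can also be summarized numerically. The sketch below averages, per block, the Euclidean distance between each vertex's position in the two embeddings. `matched_pair_distances` is a hypothetical helper, and the toy arrays are synthetic stand-ins with made-up coordinates, not output from the embedding above.

```python
import numpy as np

def matched_pair_distances(X1, X2, m):
    """Mean distance between matched vertices, for the first m rows and for the rest."""
    dists = np.linalg.norm(X1 - X2, axis=1)   # one distance per matched vertex pair
    return dists[:m].mean(), dists[m:].mean()

# Synthetic stand-in data (assumption): block 1 coincides across the two embeddings
# up to noise, while block 2 is shifted in the second embedding.
rng = np.random.default_rng(0)
m = 50
base = np.vstack([np.tile([0.5, 0.2], (m, 1)),
                  np.tile([0.5, -0.6], (m, 1))])
toy1 = base + rng.normal(scale=0.02, size=base.shape)
toy2 = base.copy()
toy2[m:] = [0.4, -0.3]                        # hypothetical shift of block 2
toy2 += rng.normal(scale=0.02, size=base.shape)

block1_dist, block2_dist = matched_pair_distances(toy1, toy2, m)
print(block1_dist, block2_dist)
```

With the notebook's variables, the equivalent call would be `matched_pair_distances(Xhat1, Xhat2, m)`; a larger value for the second block would reflect the longer connecting lines in the plot.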