How do we do node embeddings? ([source](http://snap.stanford.edu/proj/embeddings-www/index.html#materials))
Intuition: Find embedding of nodes so that “similar” nodes in the graph have embeddings that are close together.
1. Define an encoder (i.e., a mapping from nodes to embeddings)
- Shallow embedding (simplest encoding approach): encoder is just an embedding-lookup. Ex: [node2vec](/tag/node2vec), DeepWalk, LINE
2. Define a node similarity function, e.g., nodes are similar if:
- they are connected?
- they share neighbours?
- they have structurally similar roles?
3. Optimize the parameters of the encoder so that similarity in the embedding space (e.g., dot product) approximates similarity in the original network
- Adjacency-based Similarity
- "Multihop" similarity (measure overlap between node neighborhoods)
These two methods are expensive: naively they require considering O(|V|²) node pairs.
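As a concrete illustration of the adjacency-based objective, here is a minimal NumPy sketch that minimizes the squared error between dot-product similarities and adjacency-matrix entries, i.e. Σ_{u,v} (z_u·z_v − A_{u,v})². The toy graph, dimensions, and hyperparameters are illustrative, not from the tutorial.

```python
import numpy as np

# Toy graph: 4 nodes, symmetric unweighted adjacency matrix A (illustrative).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
Z = rng.normal(scale=0.1, size=(4, 2))  # one d=2 embedding per node

lr = 0.05
for _ in range(500):
    # Loss: sum over all node pairs of (z_u . z_v - A[u, v])^2
    diff = Z @ Z.T - A      # residual for every node pair
    grad = 4 * diff @ Z     # gradient of the squared loss w.r.t. Z
    Z -= lr * grad          # plain gradient descent step

loss = np.sum((Z @ Z.T - A) ** 2)
```

Note that the loop touches every entry of the dense |V|×|V| residual matrix on each step, which is exactly why this approach does not scale.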
-> **Random-walk Embeddings**: estimate the probability of visiting node v on a random walk starting from node u (under some random-walk strategy), then optimize the embeddings to encode these random-walk statistics. Advantages: expressivity (incorporates both local and higher-order neighbourhood information) and efficiency (no need to consider all node pairs when training).
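The random-walk recipe above can be sketched end-to-end: run fixed-length walks, collect node pairs that co-occur within a small window, then fit the embedding lookup with a skip-gram-style logistic loss and one negative sample per pair. All names, the toy graph, and hyperparameters below are illustrative assumptions, not the tutorial's reference code.

```python
import numpy as np

# Toy graph as an adjacency list: two triangles joined by the edge (2, 3).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}

rng = np.random.default_rng(0)
n, d, walk_len, window = len(graph), 4, 10, 2

def random_walk(start):
    """Fixed-length unbiased random walk from `start`."""
    walk = [start]
    for _ in range(walk_len - 1):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

# Collect (u, v) pairs that co-occur within `window` steps on some walk.
pairs = []
for start in graph:
    for _ in range(20):                # 20 walks per start node
        walk = random_walk(start)
        for i, u in enumerate(walk):
            for v in walk[i + 1: i + 1 + window]:
                pairs.append((u, v))

Z = rng.normal(scale=0.1, size=(n, d))  # shallow encoder: an embedding lookup
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

lr = 0.025
for u, v in pairs * 5:                 # a few SGD passes over the pairs
    neg = rng.integers(n)              # one random negative sample
    for tgt, label in ((v, 1.0), (neg, 0.0)):
        g = sigmoid(Z[u] @ Z[tgt]) - label   # logistic-loss gradient on score
        Z[u], Z[tgt] = Z[u] - lr * g * Z[tgt], Z[tgt] - lr * g * Z[u]
```

After training, nodes inside the same triangle should have larger dot-product similarity than nodes on opposite sides of the bridge.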
Which random walk strategy?
- fixed-length random walks starting from each node: **DeepWalk** (Perozzi et al., 2014)
- "biased random walks" that can trade off between local and global views of the network: **Node2Vec** (Grover & Leskovec, 2016) (micro-view vs. macro-view of the neighbourhood)
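The node2vec bias can be sketched as a second-order walk: the next step is weighted by where it lands relative to the *previous* node, controlled by a return parameter p and an in-out parameter q. The toy graph and parameter values are illustrative.

```python
import random

# Toy graph as an adjacency list (undirected): two triangles joined at (2, 3).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}

def node2vec_walk(start, length, p=1.0, q=1.0, rng=random):
    """Second-order biased random walk in the style of node2vec.

    p: return parameter -- large p discourages stepping back to the previous node.
    q: in-out parameter -- q > 1 keeps the walk local (BFS-like, micro view),
                           q < 1 pushes it outward (DFS-like, macro view).
    """
    walk = [start, rng.choice(graph[start])]
    while len(walk) < length:
        prev, cur = walk[-2], walk[-1]
        neighbours = graph[cur]
        weights = []
        for nxt in neighbours:
            if nxt == prev:              # step back to where we came from
                weights.append(1.0 / p)
            elif nxt in graph[prev]:     # stays at distance 1 from prev: local
                weights.append(1.0)
            else:                        # moves to distance 2 from prev: outward
                weights.append(1.0 / q)
        walk.append(rng.choices(neighbours, weights=weights)[0])
    return walk
```

For example, `node2vec_walk(0, 10, q=4.0)` tends to stay inside the starting triangle, while `q=0.25` crosses the bridge more readily.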
No single method wins in all cases.