Graph embedding, which represents local and global neighbourhood information as numerical vectors, is a crucial part of the mathematical modeling of a wide range of real-world systems. Among embedding algorithms, random walk-based algorithms have proven very successful. These algorithms collect information by generating numerous random walks with a predefined number of steps. Generating the random walks is the most demanding part of the embedding process, and the computational demand increases with the size of the network. Moreover, when all nodes are treated on the same footing, the abundance of low-degree nodes in real-world networks creates an imbalanced-data problem. In this work, a computationally less intensive, node-connectivity-aware uniform sampling method is proposed. In the proposed method, the number of random walks started from a node is proportional to the degree of that node. The advantages of the proposed algorithm become more pronounced when it is applied to large graphs. A comparative study using two networks, namely CORA and CiteSeer, is presented. Compared with using a fixed number of walks per node, the proposed method requires approximately 50% less computational effort to reach the same accuracy in node classification and link prediction.
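The degree-proportional sampling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `degree_proportional_walks`, the parameters `walk_length` and `walks_per_degree`, and the exact rule (walks per node = `walks_per_degree` × degree, with no rounding or minimum) are assumptions made for illustration. The graph is assumed to be an adjacency dictionary, and each step picks a neighbour uniformly at random.

```python
import random


def degree_proportional_walks(adj, walk_length=10, walks_per_degree=1, seed=0):
    """Generate uniform random walks; each node starts a number of walks
    proportional to its degree (hypothetical rule: walks_per_degree * degree)."""
    rng = random.Random(seed)
    walks = []
    for node, neighbours in adj.items():
        # Assumption: the walk budget of a node scales linearly with its degree.
        n_walks = walks_per_degree * len(neighbours)
        for _ in range(n_walks):
            walk = [node]
            while len(walk) < walk_length:
                current_neighbours = adj[walk[-1]]
                if not current_neighbours:
                    break  # dead end: stop this walk early
                # Uniform sampling: every neighbour is equally likely.
                walk.append(rng.choice(current_neighbours))
            walks.append(walk)
    return walks


# Toy undirected graph: node 0 is a hub, nodes 1-3 are leaves.
toy_graph = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
walks = degree_proportional_walks(toy_graph, walk_length=5)
```

In this toy graph the hub starts three walks and each leaf starts one, so the total number of walks equals the sum of the degrees rather than a fixed count multiplied by the number of nodes.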
Graph representation learning, node embedding, feature learning, random walk
MOHAMMED, SARMAD N. and GÜNDÜÇ, SEMRA
"Degree-based random walk approach for graph embedding,"
Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 30: No. 5, Article 13.
Available at: https://journals.tubitak.gov.tr/elektrik/vol30/iss5/13