A Hybrid Model of Graph Attention Networks and Random Forests for Link Prediction in Co-Authorship Networks

Ika Arfiani; Herman Yuliansyah

doi:10.59395/ijadis.v6i2.1382

Authors

Ika Arfiani, Department of Informatics, Universitas Ahmad Dahlan, Indonesia
Herman Yuliansyah, Department of Informatics, Universitas Ahmad Dahlan, Indonesia


Keywords:

Co-authorship Prediction, Complex Networks, Deep Learning, Ensemble Learning, Link Prediction

Abstract

Co-authorship prediction is important in academic network analysis because it helps reveal patterns of scientific collaboration and supports collaboration recommendation systems. Topology-based approaches, such as connectivity metrics and node distance, have been widely used to model new relationships in networks. However, these approaches often overlook relevant author attributes, such as reputation and productivity. This study develops a co-authorship prediction model that combines a Graph Attention Network (GAT) with a Random Forest. The GAT extracts topological features from the co-authorship graph, while the Random Forest leverages additional attributes, such as h-index and number of publications, to improve prediction accuracy. Experiments were conducted on a co-authorship dataset comprising over 10,000 authors and 50,000 publications. The results show that GAT alone achieved 85% accuracy and Random Forest alone reached 80%. Combining the two yielded 90% accuracy and a higher F1-score, indicating a better balance between precision and recall. The combined model was also more accurate in predicting collaborations involving highly productive authors. These findings suggest that a hybrid approach can capture the dynamics of academic collaboration more comprehensively and may serve as a foundation for more effective collaboration prediction systems.
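
The feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the input features, the random (untrained) weights, the Hadamard pairing of embeddings, and every function name are assumptions made for illustration. It shows one single-head graph attention pass producing node embeddings, which are then joined with author attributes (h-index, publication count) into the kind of feature vector a Random Forest classifier could consume.

```python
import math
import random


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(adj, feats, W, a):
    """Forward pass of one single-head graph attention layer (no training)."""
    out_dim = len(W)
    z = {i: matvec(W, h) for i, h in feats.items()}  # linear projection
    out = {}
    for i in adj:
        nbrs = sorted(set(adj[i]) | {i})  # neighbours plus a self-loop
        # Unnormalised attention score for each (i, j): LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(ak * x for ak, x in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Softmax over the neighbourhood (shifted by the max for stability)
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of projected neighbour features
        out[i] = [
            sum(al * z[j][k] for al, j in zip(alphas, nbrs))
            for k in range(out_dim)
        ]
    return out


def pair_features(u, v, emb, attrs):
    """Hybrid feature vector for a candidate author pair (u, v)."""
    hadamard = [x * y for x, y in zip(emb[u], emb[v])]  # topological part
    return hadamard + list(attrs[u]) + list(attrs[v])   # attribute part


# Hypothetical toy co-authorship graph: author id -> co-author ids
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
# Assumed input features: a constant and the node degree
feats = {i: [1.0, float(len(n))] for i, n in adj.items()}
# Hypothetical author attributes: (h-index, number of publications)
attrs = {0: (12, 40), 1: (8, 25), 2: (20, 90), 3: (3, 5)}

rng = random.Random(0)
in_dim, out_dim = 2, 3
W = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
a = [rng.uniform(-1, 1) for _ in range(2 * out_dim)]  # untrained weights

emb = gat_layer(adj, feats, W, a)
x = pair_features(0, 3, emb, attrs)  # candidate link between authors 0 and 3
```

In a full pipeline, the attention weights would be learned, and vectors such as `x`, built for both linked and unlinked author pairs, would form the training matrix of a Random Forest classifier.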

