Abstract: Transformer-based architectures have gained popularity across various domains, including graph representation learning. However, selecting an optimal transformer configuration remains ...