Can Hierarchical Transformers Learn Facial Geometry?




Young, Paul
Ebadi, Nima
Das, Arun
Bethany, Mazal
Desai, Kevin
Najafirad, Peyman




Human faces are a core part of our identity and expression, so understanding facial geometry is key to capturing this information. Automated systems that use this information need to model facial features in an accessible form. Hierarchical, multi-level architectures can capture the different resolutions of representation involved. In this work, we propose a hierarchical transformer architecture as a means of capturing a robust representation of facial geometry. We further demonstrate the versatility of our approach by using this transformer as a backbone for three facial representation problems: face anti-spoofing, facial expression representation, and deepfake detection. Its combination of fine-grained detail with global attention representations makes this architecture a strong candidate for these problems. We conduct numerous experiments, first showing that our approach addresses common issues in facial modeling (pose, occlusions, and background variation) and captures facial symmetry, then demonstrating its effectiveness on the three supplemental tasks.



face geometry, hierarchical transformers, anti-spoofing, facial expression recognition, deepfakes


Sensors 23 (2): 929 (2023)


Computer Science
Electrical and Computer Engineering
Information Systems and Cyber Security