Information Retrieval and Semantic Inference from Natural Language Privacy Policies




Bokaei Hosseini, Mitra




Several state laws, along with app markets such as Apple's App Store and Google Play, require app developers to provide users with legal privacy notices (privacy policies) containing critical requirements that inform users about what kinds of personal information are collected, how the data is used, and with whom the data is shared. Because privacy policies consist of legal terms often written by a legal team without rigorous insight into the app source code, and because the policy and app code can change independently, privacy policies can become misaligned with the actual data practices. In addition to misinforming users, such inconsistencies between policies and data practices can have legal repercussions. The goal of this work is to capture and formalize the semantics of natural language privacy policies into a knowledge base that can enable (1) transparent software implementation and (2) a shared understanding among policy authors, app developers, and regulators. Constructing an empirically valid knowledge base (i.e., a privacy policy ontology) is challenging because it must be both scalable and consistent with the interpretations of multiple users.

This work focuses on the formal representation of privacy policy semantics by applying grounded theory, natural language patterns, and neural networks to the terminology of privacy policies. It also discusses the application of formal ontologies in frameworks for detecting privacy misalignment.




Natural Language Processing, Privacy, Requirements Engineering, Software Engineering



Computer Science