A set of semantic data flow diagrams and its security analysis based on ontologies and knowledge graphs
This work addresses the need for automated threat modeling in agile and cloud-based development, though it appears incremental, since it applies existing semantic methods to a specific domain.
The paper tackles the challenge of automating threat modeling for cloud applications. It creates a set of 180 semantic data flow diagrams based on Docker Compose configurations and uses ontologies and knowledge graphs to automatically recognize security threat patterns; the results are compared to a manual taxonomy to study automation challenges.
Threat modeling has long been treated as a manual, complicated process. However, modern agile development methodologies and cloud computing technologies call for automated threat modeling approaches. This work considers two challenges: creating a set of machine-readable data flow diagrams that represent real cloud-based applications, and using domain-specific knowledge to automatically analyze the security aspects of such applications. A set of 180 semantic diagrams (ontologies and knowledge graphs) is created from cloud configurations (Docker Compose); the set includes a manual taxonomy that describes the design and functional aspects of web-based and data processing applications, and it can be used for a range of research in the threat modeling field. This work also evaluates how ontologies and knowledge graphs can be used to automatically recognize patterns (mapped to security threats) in diagrams. A pattern represents features of a diagram in the form of a query to a knowledge base, which enables its recognition in the semantic representation of a diagram. In an experiment, four groups of patterns are created (web applications, data processing, network, and Docker-specific), and the diagrams are checked against the patterns. The automatic results obtained for the web application and data processing patterns are compared with the manual taxonomy in order to study the challenges of automatic threat modeling.
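To make the idea of a pattern as a knowledge-base query concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: a small set of triples stands in for an ontology or knowledge graph, and a hand-written matcher stands in for a real query language. The class and predicate names (`ExposedService`, `PrivilegedContainer`, `sendsDataTo`), the sample configuration, and the threat pattern itself are all hypothetical and chosen only for illustration.

```python
# Hypothetical Docker Compose-like configuration (already parsed into a dict).
compose = {
    "services": {
        "web": {"image": "nginx", "ports": ["80:80"], "depends_on": ["db"]},
        "db": {"image": "postgres", "privileged": True},
    }
}


def compose_to_triples(config):
    """Turn the configuration into (subject, predicate, object) triples.

    The vocabulary used here is an illustrative assumption, not the
    ontology described in the paper.
    """
    triples = set()
    for name, spec in config.get("services", {}).items():
        triples.add((name, "type", "Process"))
        if spec.get("ports"):
            triples.add((name, "type", "ExposedService"))
        if spec.get("privileged"):
            triples.add((name, "type", "PrivilegedContainer"))
        for target in spec.get("depends_on", []):
            triples.add((name, "sendsDataTo", target))
    return triples


def match(pattern, triples, binding=None):
    """Yield every variable binding that satisfies all triples in the pattern.

    Terms that start with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    head, rest = pattern[0], pattern[1:]
    for triple in triples:
        candidate = dict(binding)
        consistent = True
        for term, value in zip(head, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                consistent = False
                break
        if consistent:
            yield from match(rest, triples, candidate)


# Hypothetical pattern: an exposed service sends data to a privileged container.
PATTERN = [
    ("?src", "type", "ExposedService"),
    ("?src", "sendsDataTo", "?dst"),
    ("?dst", "type", "PrivilegedContainer"),
]

graph = compose_to_triples(compose)
findings = list(match(PATTERN, graph))
```

In this sketch `findings` holds one binding per place in the diagram where the pattern occurs, which is the kind of automatic result that could then be compared against a manually assigned taxonomy label. A real system would more plausibly express the pattern in an ontology query language and run it against a reasoner-backed knowledge base.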