LGJul 25, 2024Code
AsEP: Benchmarking Deep Learning Methods for Antibody-specific Epitope PredictionChunan Liu, Lilian Denzler, Yihong Chen et al.
Epitope identification is vital for antibody design yet challenging due to the inherent variability in antibodies. While many deep learning methods have been developed for general protein binding site prediction tasks, whether they work for epitope prediction remains an understudied research question. The challenge is also heightened by the lack of a consistent evaluation pipeline with sufficient dataset size and epitope diversity. We introduce a filtered antibody-antigen complex structure dataset, AsEP (Antibody-specific Epitope Prediction). AsEP is the largest of its kind and provides clustered epitope groups, allowing the community to develop and test novel epitope prediction methods and evaluate their generalisability. AsEP comes with an easy-to-use interface in Python and pre-built graph representations of each antibody-antigen complex while also supporting customizable embedding methods. Using this new dataset, we benchmark several representative general protein-binding site prediction methods and find that their performances fall short of expectations for epitope prediction. To address this, we propose a novel method, WALLE, which leverages both unstructured modeling from protein language models and structural modeling from graph neural networks. WALLE demonstrate up to 3-10X performance improvement over the baseline methods. Our empirical findings suggest that epitope prediction benefits from combining sequential features provided by language models with geometrical information from graph representations. This provides a guideline for future epitope prediction method design. In addition, we reformulate the task as bipartite link prediction, allowing convenient model performance attribution and interpretability. We open source our data and code at https://github.com/biochunan/AsEP-dataset.
CVSep 29, 2025
AttentionViG: Cross-Attention-Based Dynamic Neighbor Aggregation in Vision GNNsHakan Emre Gedik, Andrew Martin, Mustafa Munir et al.
Vision Graph Neural Networks (ViGs) have demonstrated promising performance in image recognition tasks against Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). An essential part of the ViG framework is the node-neighbor feature aggregation method. Although various graph convolution methods, such as Max-Relative, EdgeConv, GIN, and GraphSAGE, have been explored, a versatile aggregation method that effectively captures complex node-neighbor relationships without requiring architecture-specific refinements is needed. To address this gap, we propose a cross-attention-based aggregation method in which the query projections come from the node, while the key projections come from its neighbors. Additionally, we introduce a novel architecture called AttentionViG that uses the proposed cross-attention aggregation scheme to conduct non-local message passing. We evaluated the image recognition performance of AttentionViG on the ImageNet-1K benchmark, where it achieved SOTA performance. Additionally, we assessed its transferability to downstream tasks, including object detection and instance segmentation on MS COCO 2017, as well as semantic segmentation on ADE20K. Our results demonstrate that the proposed method not only achieves strong performance, but also maintains efficiency, delivering competitive accuracy with comparable FLOPs to prior vision GNN architectures.
CRApr 8, 2021
CRC: Fully General Model of Confidential Remote ComputingKubilay Ahmet Küçük, Andrew Martin
Digital services have been offered through remote systems for decades. The questions of how these systems can be built in a trustworthy manner and how their security properties can be understood are given fresh impetus by recent hardware developments, allowing a fuller, more general, exploration of the possibilities than has previously been seen in the literature. Drawing on and consolidating the disparate strains of research, technologies and methods employed throughout the adaptation of confidential computing, we present a novel, dedicated Confidential Remote Computing (CRC) model. CRC proposes a compact solution for next-generation applications to be built on strong hardware-based security primitives, control of secure software products' trusted computing base, and a way to make correct use of proofs and evidence reports generated by the attestation mechanisms. The CRC model illustrates the trade-offs between decentralisation, task size and transparency overhead. We conclude the paper with six lessons learned from our approach, and suggest two future research directions.
CROct 20, 2020
Exploring HTTPS Security Inconsistencies: A Cross-Regional PerspectiveEman Salem Alashwali, Pawel Szalachowski, Andrew Martin
If two or more identical HTTPS clients, located at different geographic locations (regions), make an HTTPS request to the same domain (e.g. example.com), on the same day, will they receive the same HTTPS security guarantees in response? Our results give evidence that this is not always the case. We conduct scans for the top 250,000 most visited domains on the Internet, from clients located at five different regions: Australia, Brazil, India, the UK, and the US. Our scans gather data from both application (URLs and HTTP headers) and transport (servers' selected TLS version, ciphersuite, and certificate) layers. Overall, we find that HTTPS inconsistencies at the application layer are higher than those at the transport layer. We also find that HTTPS security inconsistencies are strongly related to URLs and IPs diversity among regions, and to a lesser extent to the presence of redirections. Further manual inspection shows that there are several reasons behind URLs diversity among regions such as downgrading to the plain-HTTP protocol, using different subdomains, different TLDs, or different home page documents. Furthermore, we find that downgrading to plain-HTTP is related to websites' regional blocking. We also provide attack scenarios that show how an attacker can benefit from HTTPS security inconsistencies, and introduce a new attack scenario which we call the "region confusion" attack. Finally, based on our analysis and observations, we provide discussion, which include some recommendations such as the need for testing tools for domain administrators and users that help to mitigate and detect regional domains' inconsistencies, standardising regional domains format with the same-origin policy (of domains) in mind, standardising secure URL redirections, and avoid redirections whenever possible.
CRJun 29, 2019
Towards Forward Secure Internet TrafficEman Salem Alashwali, Pawel Szalachowski, Andrew Martin
Forward Secrecy (FS) is a security property in key-exchange algorithms which guarantees that a compromise in the secrecy of a long-term private-key does not compromise the secrecy of past session keys. With a growing awareness of long-term mass surveillance programs by governments and others, FS has become widely regarded as a highly desirable property. This is particularly true in the TLS protocol, which is used to secure Internet communication. In this paper, we investigate FS in pre-TLS 1.3 protocols, which do not mandate FS, but still widely used today. We conduct an empirical analysis of over 10 million TLS servers from three different datasets using a novel heuristic approach. Using a modern TLS client handshake algorithms, our results show 5.37% of top domains, 7.51% of random domains, and 26.16% of random IPs do not select FS key-exchange algorithms. Surprisingly, 39.20% of the top domains, 24.40% of the random domains, and 14.46% of the random IPs that do not select FS, do support FS. In light of this analysis, we discuss possible paths toward forward secure Internet traffic. As an improvement of the current state, we propose a new client-side mechanism that we call "Best Effort Forward Secrecy" (BEFS), and an extension of it that we call "Best Effort Forward Secrecy and Authenticated Encryption" (BESAFE), which aims to guide (force) misconfigured servers to FS using a best effort approach. Finally, within our analysis, we introduce a novel adversarial model that we call "discriminatory" adversary, which is applicable to the TLS protocol.
CRJun 15, 2019
Does "www." Mean Better Transport Layer Security?Eman Salem Alashwali, Pawel Szalachowski, Andrew Martin
Experience shows that most researchers and developers tend to treat plain-domains (those that are not prefixed with "www" sub-domains, e.g. "example.com") as synonyms for their equivalent www-domains (those that are prefixed with "www" sub-domains, e.g. "www.example.com"). In this paper, we analyse datasets of nearly two million plain-domains against their equivalent www-domains to answer the following question: Do plain-domains and their equivalent www-domains differ in TLS security configurations and certificates? If so, to what extent? Our results provide evidence of an interesting phenomenon: plain-domains and their equivalent www-domains differ in TLS security configurations and certificates in a non-trivial number of cases. Furthermore, www-domains tend to have stronger security configurations than their equivalent plain-domains. Interestingly, this phenomenon is more prevalent in the most-visited domains than in randomly-chosen domains. Further analysis of the top domains dataset shows that 53.35% of the plain-domains that show one or more weakness indicators (e.g. expired certificate) that are not shown in their equivalent www-domains perform HTTPS redirection from HTTPS plain-domains to their equivalent HTTPS www-domains. Additionally, 24.71% of these redirections contains plain-text HTTP intermediate URLs. In these cases, users see the final www-domains with strong TLS configurations and certificates, but in fact, the HTTPS request has passed through plain-domains that have less secure TLS configurations and certificates. Clearly, such a set-up introduces a weak link in the security of the overall interaction.
CLDec 23, 2017
Are words easier to learn from infant- than adult-directed speech? A quantitative corpus-based investigationAdriana Guevara-Rukoz, Alejandrina Cristia, Bogdan Ludusan et al.
We investigate whether infant-directed speech (IDS) could facilitate word form learning when compared to adult-directed speech (ADS). To study this, we examine the distribution of word forms at two levels, acoustic and phonological, using a large database of spontaneous speech in Japanese. At the acoustic level we show that, as has been documented before for phonemes, the realizations of words are more variable and less discriminable in IDS than in ADS. At the phonological level, we find an effect in the opposite direction: the IDS lexicon contains more distinctive words (such as onomatopoeias) than the ADS counterpart. Combining the acoustic and phonological metrics together in a global discriminability score reveals that the bigger separation of lexical categories in the phonological space does not compensate for the opposite effect observed at the acoustic level. As a result, IDS word forms are still globally less discriminable than ADS word forms, even though the effect is numerically small. We discuss the implication of these findings for the view that the functional role of IDS is to improve language learnability.
CRJun 9, 2017
Securing Application with Software Partitioning: A case study using SGXAhmad Atamli-Reineh, Andrew Martin
Application size and complexity are the underlying cause of numerous security vulnerabilities in code. In order to mitigate the risks arising from such vulnerabilities, various techniques have been proposed to isolate the execution of sensitive code from the rest of the application and from other software on the platform (e.g. the operating system). However, even with these partitioning techniques, it is not immediately clear exactly how they can and should be used to partition applications. What overall partitioning scheme should be followed; what granularity of the partitions should be. To some extent, this is dependent on the capabilities and performance of the partitioning technology in use. For this work, we focus on the upcoming Intel Software Guard Extensions (SGX) technology as the state-of-the-art in this field. SGX provides a trusted execution environment, called an enclave, that protects the integrity of the code and the confidentiality of the data inside it from other software, including the operating system. We present a novel framework consisting of four possible schemes under which an application can be partitioned. These schemes range from coarse-grained partitioning, in which the full application is included in a single enclave, through ultra-fine partitioning, in which each application secret is protected in an individual enclave. We explain the specific security benefits provided by each of the partitioning schemes and discuss how the performance of the application would be affected. To compare the different partitioning schemes, we have partitioned OpenSSL using four different schemes. We discuss SGX properties together with the implications of our design choices in this paper.
CRJun 9, 2016
Enabling Secure and Usable Mobile Application: Revealing the Nuts and Bolts of software TPM in todays Mobile DevicesAhmad-Atamli Reineh, Giuseppe Petracca, Janne Uusilehto et al.
The emergence of mobile applications to execute sensitive operations has brought a myriad of security threats to both enterprises and users. In order to benefit from the large potential in smartphones there is a need to manage the risks arising from threats, while maintaining an easy interface for the users. In this paper we investigate the use of Trusted Platform Model (TPM) 2.0 to develop a secure application for smartphones using Windows Phone 8.1. In particular, we suggest a framework based on remote attestation as a proxy to authenticate remote services, where the device is associated to the user and replaces the users credentials. In addition, we use the TPM 2.0 to enable secured information and data storage within the device itself. We present an implementation and performance evaluation of the suggested architecture that uses our novel attestation and authentication scheme and reveal the caveats of using software TPM in todays mobile devices.
CRDec 20, 2015
Developing a Trust Domain Taxonomy for Securely Sharing Information Among OthersNalin Asanka Gamagedara Arachchilage, Cornelius Namiluko, Andrew Martin
In any given collaboration, information needs to flow from one participant to another. While participants may be interested in sharing information with one another, it is often necessary for them to establish the impact of sharing certain kinds of information. This is because certain information could have detrimental effects when it ends up in wrong hands. For this reason, any would-be participant in a collaboration may need to establish the guarantees that the collaboration provides, in terms of protecting sensitive information, before joining the collaboration as well as evaluating the impact of sharing a given piece of information with a given set of entities. The concept of a trust domains aims at managing trust-related issues in information sharing. It is essential for enabling efficient collaborations. Therefore, this research attempts to develop a taxonomy for trust domains with measurable trust characteristics, which provides security-enhanced, distributed containers for the next generation of composite electronic services for supporting collaboration and data exchange within and across multiple organisations. Then the developed taxonomy is applied to possible scenarios (e.g. Health Care Service Scenario and ConfiChair Scenario), in which the concept of trust domains could be useful.
CRNov 14, 2015
A Trust Domains Taxonomy for Securely Sharing Information: A Preliminary InvestigationNalin Asanka Gamagedara Arachchilage, Andrew Martin
Information sharing has become a vital part in our day-to-day life due to the pervasiveness of Internet technology. In any given collaboration, information needs to flow from one participant to another. While participants may be interested in sharing information with one another, it is often necessary for them to establish the impact of sharing certain kinds of information. This is because certain information could have detrimental effects when it ends up in wrong hands. For this reason, any would-be participant in a given collaboration may need to establish the guarantees that the collaboration provides, in terms of protecting sensitive information, before joining the collaboration as well as evaluating the impact of sharing a given piece of information with a given set of entities. In order to address this issue, earlier work introduced a trust domains taxonomy that aims at managing trust-related issues in information sharing. This paper attempts to empirically investigate the proposed taxonomy through a possible scenario (e.g. the ConfiChair system). The study results determined that Role, Policy, Action, Control, Evidence and Asset elements should be incorporated into the taxonomy for securely sharing information among others. Additionally, the study results showed that the ConfiChair, a novel cloud-based conference management system, offers strong privacy and confidentiality guarantees.