Gergely Biczók

CR
4 papers
42 citations
Novelty 55%
AI Score 41

4 Papers

4.1 CR May 11
Cybercrime and Prevention: Colonel Blotto in Social Engineering

Gergely Benkő, Katalin Parti, Gergely Biczók

Cybercriminals increasingly target the human factor rather than the continuously advancing technological defense mechanisms. Consequently, institutions that allocate substantial resources to strengthening their cybersecurity infrastructure may remain vulnerable if a deceived employee voluntarily transmits sensitive information or financial assets to attackers. Therefore, alongside the implementation of technological defense mechanisms, particular emphasis must be placed on mitigating human vulnerabilities, which can be achieved through preventive awareness programs. However, such training activities can only be effective if they are organization- and context-specific. In this paper, we develop two Colonel Blotto game models to determine the optimal allocation of defensive resources across dominant social engineering attack vectors. We ground the models in Routine Activity Theory (RAT), borrowed from criminology, which describes crime as an event involving a motivated offender, a suitable target, and the absence of a capable guardian. Next, we quantify relevant factors via the VIVA (Value, Inertia, Visibility, Accessibility) framework, and operationalize the models by feeding real-world cybercrime data into them. The first model investigates optimal population-level prevention, focusing on nation-states as defenders; we present and compare use cases of three different countries. The second model focuses on the organization as a decision-maker; here, we analyze five use cases involving organizations of different characteristics. Our results demonstrate that theoretically grounded and data-driven models can provide decision support to policymakers and organizational leaders in allocating their efforts optimally to prevent social engineering attacks and improve their overall cyber resilience.
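As a rough illustration of the Colonel Blotto setting named in this abstract, the sketch below scores a defender's allocation against an attacker's across several attack vectors and searches for a worst-case-optimal split. It is not the paper's model: the function names, integer budgets, tie-splitting rule, and battlefield values are all assumptions made for illustration, and the RAT and VIVA components are not reproduced. Blotto equilibria are generally in mixed strategies, so this pure-strategy maximin search is a deliberate simplification.

```python
def blotto_payoff(defender, attacker, values):
    """Defender's payoff: value of each attack vector the defender out-resources.

    Assumption: ties split the vector's value evenly.
    """
    total = 0.0
    for d, a, v in zip(defender, attacker, values):
        if d > a:
            total += v
        elif d == a:
            total += v / 2
    return total


def allocations(budget, n_fields):
    """Yield every way to split an integer budget across n_fields vectors."""
    if n_fields == 1:
        yield (budget,)
        return
    for first in range(budget + 1):
        for rest in allocations(budget - first, n_fields - 1):
            yield (first,) + rest


def best_worst_case(def_budget, att_budget, values):
    """Pure-strategy maximin: the allocation with the best guaranteed payoff."""
    n = len(values)
    attacker_options = list(allocations(att_budget, n))

    def guaranteed(d):
        return min(blotto_payoff(d, a, values) for a in attacker_options)

    best = max(allocations(def_budget, n), key=guaranteed)
    return best, guaranteed(best)


# Hypothetical example: three attack vectors with made-up values.
best, worst = best_worst_case(4, 3, (3.0, 2.0, 1.0))
print(best, worst)
```

Exhaustive enumeration is only practical for small budgets and few vectors; it is used here to keep the example self-contained.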

CR Jun 16, 2021
Detecting message modification attacks on the CAN bus with Temporal Convolutional Networks

Irina Chiscop, András Gazdag, Joost Bosman et al.

Multiple attacks have shown that in-vehicle networks have vulnerabilities which can be exploited. Securing the Controller Area Network (CAN) for modern vehicles has become a necessary task for car manufacturers. Some attacks inject a potentially large number of fake messages into the CAN network; however, such attacks are relatively easy to detect. In more sophisticated attacks, the original messages are modified, making detection a more complex problem. In this paper, we present a novel machine-learning-based intrusion detection method for CAN networks. We focus on detecting message modification attacks, which do not change the timing patterns of communications. Our proposed temporal convolutional network-based solution can learn the normal behavior of CAN signals and differentiate them from malicious ones. The method is evaluated on multiple CAN-bus message IDs from two public datasets including different types of attacks. Performance results show that our lightweight approach compares favorably to the state-of-the-art unsupervised learning approach, achieving similar or better accuracy for a wide range of scenarios with a significantly lower false positive rate.
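The core building block of a temporal convolutional network is a causal, dilated 1D convolution: the output at time t depends only on the current and earlier samples. The sketch below shows that operation in plain Python, followed by a simple error-threshold check. The abstract does not describe the architecture or the detection rule, so the kernel, the dilation, and the thresholding step are assumptions for illustration only, not the authors' method.

```python
def causal_dilated_conv(signal, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * signal[t - k * dilation].

    Samples before the start of the signal are treated as zero
    (left padding), so no future value ever influences y[t].
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


def flag_anomalies(observed, predicted, threshold):
    """Assumed detection rule: flag samples whose prediction error is large."""
    return [abs(o - p) > threshold for o, p in zip(observed, predicted)]


# Hypothetical signal values; with dilation 2, each output sums x[t] and x[t-2].
print(causal_dilated_conv([1, 2, 3, 4], [1, 1], dilation=2))
```

Stacking such layers with growing dilation widens the receptive field without looking ahead in time, which is why the structure suits streaming signals.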

CR Aug 4, 2020
In Search of Lost Utility: Private Location Data

Szilvia Lestyán, Gergely Ács, Gergely Biczók

The unavailability of training data is a permanent source of much frustration in research, especially when it is due to privacy concerns. This is particularly true for location data since previous techniques all suffer from the inherent sparseness and high dimensionality of location trajectories which render most techniques impractical, resulting in unrealistic traces and unscalable methods. Moreover, time information of location visits is usually dropped, or its resolution is drastically reduced. In this paper we present a novel technique for privately releasing a composite generative model and whole high-dimensional location datasets with detailed time information. To generate high-fidelity synthetic data, we leverage several peculiarities of vehicular mobility such as its language-like characteristics ("you should know a location by the company it keeps") or how humans plan their trips from one point to the other. We model the generator distribution of the dataset by first constructing a variational autoencoder to generate the source and destination locations, and the corresponding timing of trajectories. Next, we compute transition probabilities between locations with a feed forward network, and build a transition graph from the output of this model, which approximates the distribution of all paths between the source and destination (at a given time). Finally, a path is sampled from this distribution with a Markov Chain Monte Carlo method. The generated synthetic dataset is highly realistic, scalable, provides good utility and, nonetheless, provably private. We evaluate our model against two state-of-the-art methods and three real-life datasets demonstrating the benefits of our approach.
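The last two stages of the pipeline described above (a transition graph over locations, then a sampled path between a source and a destination) can be pictured with a toy example. The sketch below is a simplification under stated assumptions: the transition table is hand-written rather than produced by a neural network, the sampler is a plain forward random walk rather than the Markov Chain Monte Carlo method the abstract mentions, and no privacy mechanism is included. All names are hypothetical.

```python
import random


def path_probability(path, transitions):
    """Probability of a path as the product of its step probabilities."""
    prob = 1.0
    for here, nxt in zip(path, path[1:]):
        prob *= transitions.get(here, {}).get(nxt, 0.0)
    return prob


def sample_path(source, destination, transitions, rng, max_steps=50):
    """Random walk from source; returns None if the destination is not reached."""
    path = [source]
    while path[-1] != destination and len(path) <= max_steps:
        options = transitions.get(path[-1])
        if not options:
            return None  # dead end: no outgoing transitions
        locations = list(options)
        weights = [options[loc] for loc in locations]
        path.append(rng.choices(locations, weights=weights, k=1)[0])
    return path if path[-1] == destination else None


# Hypothetical transition graph over four locations.
toy = {"A": {"B": 0.5, "C": 0.5}, "B": {"D": 1.0}, "C": {"D": 1.0}}
print(sample_path("A", "D", toy, random.Random(0)))
```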

LG Jul 13, 2020
Quality Inference in Federated Learning with Secure Aggregation

Balázs Pejó, Gergely Biczók

Federated learning algorithms are developed both for efficiency reasons and to ensure the privacy and confidentiality of personal and business data, respectively. Despite no data being shared explicitly, recent studies showed that the mechanism could still leak sensitive information. Hence, secure aggregation is utilized in many real-world scenarios to prevent attribution to specific participants. In this paper, we focus on the quality of individual training datasets and show that such quality information could be inferred and attributed to specific participants even when secure aggregation is applied. Specifically, through a series of image recognition experiments, we infer the relative quality ordering of participants. Moreover, we apply the inferred quality information to detect misbehaviours, to stabilize training performance, and to measure the individual contributions of participants.
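One way to picture how quality could be attributed despite secure aggregation: the server sees only the aggregate of each round, but if it also knows which participants took part and how much the model improved, it can credit each participant with the average improvement of the rounds they joined. The sketch below implements that idea. The scoring rule, the assumption that participation varies by round, and the improvement figures are all hypothetical; the abstract does not specify the inference procedure.

```python
def infer_quality_scores(rounds):
    """Average per-round improvement credited to each participant.

    rounds: list of (participants, improvement) pairs, where improvement
    is the change in model quality observed after that round's aggregate.
    """
    totals, counts = {}, {}
    for participants, improvement in rounds:
        for p in participants:
            totals[p] = totals.get(p, 0.0) + improvement
            counts[p] = counts.get(p, 0) + 1
    return {p: totals[p] / counts[p] for p in totals}


def quality_ordering(scores):
    """Participants sorted from highest to lowest inferred quality."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical rounds: who participated and the observed improvement.
rounds = [(("a", "b"), 0.10), (("b", "c"), -0.02), (("a", "c"), 0.04)]
print(quality_ordering(infer_quality_scores(rounds)))
```

Only a relative ordering is recovered here, which matches the abstract's stated goal of inferring the relative quality ordering of participants.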