Serguei A. Mokhov

CR
7papers
28citations
Novelty25%
AI Score18

7 Papers

CRJul 16, 2012Code
MARFCAT: Transitioning to Binary and Larger Data Sets of SATE IV

Serguei A. Mokhov, Joey Paquet, Mourad Debbabi et al.

We present a second iteration of a machine learning approach to static code analysis and fingerprinting for weaknesses related to security, software engineering, and others using the open-source MARF framework and the MARFCAT application based on it for the NIST's SATE IV static analysis tool exposition workshop's data sets that include additional test cases, including new large synthetic cases. To aid detection of weak or vulnerable code, including source or binary on different platforms the machine learning approach proved to be fast and accurate to for such tasks where other tools are either much slower or have much smaller recall of known vulnerabilities. We use signal and NLP processing techniques in our approach to accomplish the identification and classification tasks. MARFCAT's design from the beginning in 2010 made is independent of the language being analyzed, source code, bytecode, or binary. In this follow up work with explore some preliminary results in this area. We evaluated also additional algorithms that were used to process the data.

IRJan 7, 2021
Application of Knowledge Graphs to Provide Side Information for Improved Recommendation Accuracy

Yuhao Mao, Serguei A. Mokhov, Sudhir P. Mudur

Personalized recommendations are popular in these days of Internet driven activities, specifically shopping. Recommendation methods can be grouped into three major categories, content based filtering, collaborative filtering and machine learning enhanced. Information about products and preferences of different users are primarily used to infer preferences for a specific user. Inadequate information can obviously cause these methods to fail or perform poorly. The more information we provide to these methods, the more likely it is that the methods perform better. Knowledge graphs represent the current trend in recording information in the form of relations between entities, and can provide additional (side) information about products and users. Such information can be used to improve nearest neighbour search, clustering users and products, or train the neural network, when one is used. In this work, we present a new generic recommendation systems framework, that integrates knowledge graphs into the recommendation pipeline. We describe its software design and implementation, and then show through experiments, how such a framework can be specialized for a domain, say movie recommendations, and the improvements in recommendation results possible due to side information obtained from knowledge graphs representation of such information. Our framework supports different knowledge graph representation formats, and facilitates format conversion, merging and information extraction needed for training recommendation methods.

CVAug 1, 2018
Toward Multimodal Interaction in Scalable Visual Digital Evidence Visualization Using Computer Vision Techniques and ISS

Serguei A. Mokhov, Miao Song, Jashanjot Singh et al.

Visualization requirements in Forensic Lucid have to do with different levels of case knowledge abstraction, representation, aggregation, as well as the operational aspects as the final long-term goal of this proposal. It encompasses anything from the finer detailed representation of hierarchical contexts to Forensic Lucid programs, to the documented evidence and its management, its linkage to programs, to evaluation, and to the management of GIPSY software networks. This includes an ability to arbitrarily switch between those views combined with usable multimodal interaction. The purpose is to determine how the findings can be applied to Forensic Lucid and investigation case management. It is also natural to want a convenient and usable evidence visualization, its semantic linkage and the reasoning machinery for it. Thus, we propose a scalable management, visualization, and evaluation of digital evidence using the modified interactive 3D documentary system - Illimitable Space System - (ISS) to represent, semantically link, and provide a usable interface to digital investigators that is navigable via different multimodal interaction techniques using Computer Vision techniques including gestures, as well as eye-gaze and audio.

SEMay 31, 2018
Fast Context-Annotated Classification of Different Types of Web Service Descriptions

Serguei A. Mokhov, Joey Paquet, Arash Khodadadi

In the recent rapid growth of web services, IoT, and cloud computing, many web services and APIs appeared on the web. With the failure of global UDDI registries, different service repositories started to appear, trying to list and categorize various types of web services for client applications' discover and use. In order to increase the effectiveness and speed up the task of finding compatible Web Services in the brokerage when performing service composition or suggesting Web Services to the requests, high-level functionality of the service needs to be determined. Due to the lack of structured support for specifying such functionality, classification of services into a set of abstract categories is necessary. We employ a wide range of Machine Learning and Signal Processing algorithms and techniques in order to find the highest precision achievable in the scope of this article for the fast classification of three type of service descriptions: WSDL, REST, and WADL. In addition, we complement our approach by showing the importance and effect of contextual information on the classification of the service descriptions and show that it improves the accuracy in 5 different categories of services.

CRMar 31, 2016
A Java Data Security Framework (JDSF) and its Case Studies

Serguei A. Mokhov, Lee Wei Huynh, Jian Li et al.

We present the design of something we call Confidentiality, Integrity and Authentication Sub-Frameworks, which are a part of a more general Java Data Security Framework (JDSF) designed to support various aspects related to data security (confidentiality, origin authentication, integrity, and SQL randomization). The JDSF was originally designed in 2007 for use in the two use-cases, MARF and HSQLDB, to allow a plug-in-like implementation of and verification of various security aspects and their generalization. The JDSF project explores secure data storage related issues from the point of view of data security in the two projects. A variety of common security aspects and tasks were considered in order to extract a spectrum of possible parameters these aspects require for the design an extensible frameworked API and its implementation. A particular challenge being tackled is an aggregation of diverse approaches and algorithms into a common set of Java APIs to cover all or at least most common aspects, and, at the same time keeping the framework as simple as possible. As a part of the framework, we provide the mentioned sub-frameworks' APIs to allow for the common algorithm implementations of the confidentiality, integrity, and authentication aspects for MARF's and HSQLDB's database(s). At the same time we perform a detailed overview of the related work and literature on data and database security that we considered as a possible input to design the JDSF.

CRDec 2, 2013
Intensional Cyberforensics

Serguei A. Mokhov

This work focuses on the application of intensional logic to cyberforensic analysis and its benefits and difficulties are compared with the finite-state-automata approach. This work extends the use of the intensional programming paradigm to the modeling and implementation of a cyberforensics investigation process with backtracing of event reconstruction, in which evidence is modeled by multidimensional hierarchical contexts, and proofs or disproofs of claims are undertaken in an eductive manner of evaluation. This approach is a practical, context-aware improvement over the finite state automata (FSA) approach we have seen in previous work. As a base implementation language model, we use in this approach a new dialect of the Lucid programming language, called Forensic Lucid, and we focus on defining hierarchical contexts based on intensional logic for the distributed evaluation of cyberforensic expressions. We also augment the work with credibility factors surrounding digital evidence and witness accounts, which have not been previously modeled. The Forensic Lucid programming language, used for this intensional cyberforensic analysis, formally presented through its syntax and operational semantics. In large part, the language is based on its predecessor and codecessor Lucid dialects, such as GIPL, Indexical Lucid, Lucx, Objective Lucid, and JOOIP bound by the underlying intensional programming paradigm.

CVApr 30, 2012
OCT Segmentation Survey and Summary Reviews and a Novel 3D Segmentation Algorithm and a Proof of Concept Implementation

Serguei A. Mokhov, Yankui Sun

We overview the existing OCT work, especially the practical aspects of it. We create a novel algorithm for 3D OCT segmentation with the goals of speed and/or accuracy while remaining flexible in the design and implementation for future extensions and improvements. The document at this point is a running draft being iteratively "developed" as a progress report as the work and survey advance. It contains the review and summarization of select OCT works, the design and implementation of the OCTMARF experimentation application and some results.