Mujahid Sultan

6 papers

21 citations

Novelty 35%

AI Score 36

Ranked #121,503 of 205,806 authors (top 59%); #1,591 in SE (top 46%)

6 Papers

4.7 · AI · May 17

NeuSymMS: A Hybrid Neuro-Symbolic Memory System for Persistent, Self-Curating LLM Agents

Mujahid Sultan, Sri Thuraisamy, Daya Rajaratnam

We present NeuSymMS, an adaptive memory system that enables large language model (LLM) agents to learn, remember, and reason about users across sessions via a hybrid neuro-symbolic architecture. NeuSymMS couples neural fact extraction from unstructured dialogue with a CLIPS-based expert system that classifies, deduplicates, and reconciles facts under explicit lifecycle rules. The system represents knowledge as subject-relation-value triples stored in a relational database management system, supports user, agent, and agent-to-agent scoping, and implements a dual-horizon short-term/long-term memory model with access-based promotion and time-based pruning. NeuSymMS maintains continuity of memory while avoiding context-window bloat and cross-entity contamination. We argue that this architecture offers a practical path to trustworthy, auditable memory for production agentic systems and discuss its novelty relative to log retrieval, summarization, and key-value approaches.
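As a rough illustration of the dual-horizon model described in this abstract, the sketch below stores scoped subject-relation-value triples, promotes a fact to long-term memory after repeated access, and prunes idle short-term facts. It is not the NeuSymMS implementation: the abstract describes a CLIPS-based expert system over a relational database, whereas this uses a plain Python dictionary, and the names `TripleMemory`, `promote_after`, and `short_ttl` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    subject: str
    relation: str
    value: str
    created_at: float
    last_access: float
    access_count: int = 0
    tier: str = "short"  # "short" or "long"


class TripleMemory:
    """Toy dual-horizon store for subject-relation-value triples."""

    def __init__(self, promote_after=3, short_ttl=3600.0):
        self.promote_after = promote_after  # hypothetical promotion threshold
        self.short_ttl = short_ttl          # hypothetical short-term lifetime (s)
        self.facts = {}                     # (scope, subject, relation) -> Fact

    def add(self, scope, subject, relation, value, now):
        # Keying on (scope, subject, relation) deduplicates identical facts
        # and lets a newer value replace an older one within the same scope.
        key = (scope, subject, relation)
        existing = self.facts.get(key)
        if existing is not None and existing.value == value:
            return existing
        fact = Fact(subject, relation, value, created_at=now, last_access=now)
        self.facts[key] = fact
        return fact

    def get(self, scope, subject, relation, now):
        fact = self.facts.get((scope, subject, relation))
        if fact is None:
            return None
        # Access-based promotion: frequently read facts move to long-term.
        fact.access_count += 1
        fact.last_access = now
        if fact.tier == "short" and fact.access_count >= self.promote_after:
            fact.tier = "long"
        return fact

    def prune(self, now):
        # Time-based pruning: drop short-term facts that have sat idle too long.
        stale = [key for key, fact in self.facts.items()
                 if fact.tier == "short" and now - fact.last_access > self.short_ttl]
        for key in stale:
            del self.facts[key]
        return len(stale)
```

Because the scope is part of the key, a lookup under one user or agent never returns a fact recorded under another, which is one simple way to read the abstract's point about avoiding cross-entity contamination.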

LG · Oct 4, 2022

Sampling Streaming Data with Parallel Vector Quantization -- PVQ

Mujahid Sultan

Accumulation of corporate data in the cloud has attracted more enterprise applications to the cloud, creating data gravity. As a consequence, network traffic has become more cloud-centric. This increase in cloud-centric traffic poses new challenges in designing learning systems for streaming data due to class imbalance. The number of classes plays a vital role in the accuracy of the classifiers built from the data streams. In this paper, we present a vector quantization-based sampling method, which substantially reduces the class imbalance in data streams. We demonstrate its effectiveness by conducting experiments on a network traffic and anomaly dataset with commonly used ML model building methods: Multilayer Perceptron on a TensorFlow backend, Support Vector Machines, K-Nearest Neighbour, and Random Forests. We built models using parallel processing, batch processing, and randomly selected samples. We show that the accuracy of classification models improves when the data streams are pre-processed with our method. We used the out-of-the-box hyperparameters of these classifiers and auto-sklearn for hyperparameter optimization.
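To make the idea of vector quantization-based sampling concrete, here is a minimal sequential sketch, assuming a simple online codebook with a fixed distance threshold. It is not the paper's PVQ algorithm (which the abstract describes as parallel); the function name `vq_sample` and the parameters `radius` and `per_cell` are hypothetical.

```python
import math


def vq_sample(stream, radius, per_cell=1):
    """Keep at most `per_cell` points per codebook cell.

    A point joins the nearest codeword if it lies within `radius`;
    otherwise it becomes a new codeword. Dense regions of the stream
    therefore contribute no more samples than sparse ones.
    """
    codebook = []  # codeword vectors seen so far
    counts = []    # number of points kept for each codeword
    kept = []      # sampled (vector, label) pairs
    for vec, label in stream:
        best, best_d = None, float("inf")
        for i, code in enumerate(codebook):
            d = math.dist(vec, code)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > radius:
            # No codeword is close enough: open a new cell with this point.
            codebook.append(vec)
            counts.append(1)
            kept.append((vec, label))
        elif counts[best] < per_cell:
            counts[best] += 1
            kept.append((vec, label))
    return kept
```

With a stream dominated by near-identical majority-class points and a few distant minority-class points, the majority class collapses to a handful of representatives while the rare points survive, which narrows the class ratio seen by a downstream classifier.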

SE · Sep 3, 2020

Linking Stakeholders' Viewpoint Concerns and Microservices-based Architecture

Mujahid Sultan

Widespread adoption of agile project management, independent delivery with microservices, and automated deployment with DevOps has tremendously sped up systems development. The real game-changer is continuous integration (CI), continuous delivery, and continuous deployment (CD). Organizations can do multiple releases a day, shortening the test, release, and deployment cycles from weeks to minutes. The maturity of container technologies like Docker and container orchestration platforms like Kubernetes has promoted microservices architecture, especially in cloud-native development. Various tools are available for setting up CI/CD pipelines. Organizations are moving away from monolithic applications and towards microservices-based architectures. Organizations can quickly accumulate hundreds of such microservices accessible via application programming interfaces (APIs). The primary purpose of these modern methodologies is agility, speed, and reusability. While DevOps offers speed and time to market, agility and reusability may not be guaranteed unless microservices and APIs are linked to enterprise-wide stakeholders' needs. The link between business needs and microservices/APIs is neither well captured nor adequately defined. In this publication, we describe a structured method to create a logical link between API- and microservices-based agile developments and enterprise stakeholders' needs and viewpoint concerns. This method enables capturing and documenting enterprise-wide stakeholders' needs, whether these are business owners, planners (product owners, architects), designers (developers, DevOps engineers), or the partners and subscribers of an enterprise.

DB · Mar 9, 2020

Probabilistic Partitive Partitioning (PPP)

Mujahid Sultan

Clustering is an NP-hard problem. Thus, no efficient exact algorithm is known, and heuristics are applied to cluster the data. Heuristics can be very resource-intensive if not applied properly. For substantially large datasets, computational efficiencies can be achieved by reducing the input space, provided the loss of information is minimal. Clustering algorithms, in general, face two common problems: 1) they converge to different solutions under different initial conditions; and 2) the number of clusters has to be arbitrarily decided beforehand. These problems have become critical in the realm of big data. Recently, clustering algorithms have emerged which can speed up computations using parallel processing over the grid but still face the aforementioned problems. Goals: Our goals are to find methods to cluster data which: 1) guarantee convergence to the same solution irrespective of the initial conditions; 2) eliminate the need to establish the number of clusters beforehand; and 3) can be applied to cluster large datasets. Methods: We introduce a method that combines probabilistic and combinatorial clustering methods to produce repeatable and compact clusters that are not sensitive to initial conditions. This method harnesses the power of k-means (a combinatorial clustering method) to cluster/partition very large, high-dimensional datasets and uses the Gaussian Mixture Model (a probabilistic clustering method) to validate the k-means partitions. Results: We show that this method produces very compact clusters that are not sensitive to initial conditions. This method can be used to identify the most 'separable' set in a dataset, which increases the 'clusterability' of a dataset. This method also eliminates the need to specify the number of clusters in advance.

SE · Sep 24, 2015

Ordering stakeholder viewpoint concerns for holistic and incremental Enterprise Architecture: the W6H framework

Mujahid Sultan, Andriy Miranskyy

Context: Enterprise Architecture (EA) is a discipline that has evolved to structure the business and its alignment with IT systems. One of the popular enterprise architecture frameworks is the Zachman framework (ZF). This framework focuses on describing the enterprise from six viewpoint perspectives of the stakeholders. These six perspectives are based on the English language interrogatives 'what', 'where', 'who', 'when', 'why', and 'how' (thus the term W5H). Journalists and police investigators use the W5H to describe an event. However, EA is not an event, making creation and evolution of EA challenging. Moreover, the ordering of viewpoints is not defined in the existing EA frameworks, making the data capturing process difficult. Our goals are to 1) assess if W5H is sufficient to describe modern EA and 2) explore the ordering and precedence among the viewpoint concerns. Method: We achieve our goals by bringing in tools from linguistics, focusing on a full set of English language interrogatives to describe viewpoint concerns and the inter-relationships and dependencies among these. Application of these tools is validated using pedagogical EA examples. Results: 1) We show that addition of the seventh interrogative 'which' to the W5H set (we denote this extended set as W6H) yields extra and necessary information enabling creation of holistic EA. 2) We discover that a particular ordering of the interrogatives, established by linguists (based on semantic and lexical analysis of English language interrogatives), defines the starting points and the order in which viewpoints should be arranged for creating complete EA. 3) We prove that adopting W6H enables creation of EA for iterative and agile SDLCs, e.g., Scrum. Conclusions: We believe that our findings complete creation of EA using ZF by practitioners, and provide theoreticians with tools needed to improve other EA frameworks, e.g., TOGAF and DoDAF.

SE · Aug 8, 2015

Ordering Interrogative Questions for Effective Requirements Engineering: The W6H Pattern

Mujahid Sultan, Andriy Miranskyy

Requirements elicitation and requirements analysis are important practices of Requirements Engineering. Elicitation techniques, such as interviews and questionnaires, rely on formulating interrogative questions and asking these in a proper order to maximize the accuracy of the information being gathered. Information gathered during requirements elicitation then has to be interpreted, analyzed, and validated. Requirements analysis involves analyzing the problem and solution spaces. In this paper, we describe a method to formulate interrogative questions for effective requirements elicitation based on the lexical and semantic principles of the English language interrogatives, and propose a pattern to organize stakeholder viewpoint concerns for better requirements analysis. This helps the requirements engineer thoroughly describe the problem and solution spaces. Most of the previous requirements elicitation studies included six out of the seven English language interrogatives, 'what', 'where', 'when', 'who', 'why', and 'how' (denoted by W5H), and did not propose any order for the interrogatives. We show that extending the set of six interrogatives with 'which' (denoted by W6H) improves the generation and formulation of questions for requirements elicitation and facilitates better requirements analysis via arranging stakeholder views. We discuss the interdependencies among interrogatives (for the requirements engineer to consider while eliciting the requirements) and suggest an order for the set of W6H interrogatives. The proposed W6H-based reusable pattern also aids the requirements engineer in organizing viewpoint concerns of stakeholders, making this pattern an effective tool for requirements analysis.