HCNov 20, 2023
User-Like Bots for Cognitive Automation: A SurveyHabtom Kahsay Gidey, Peter Hillmann, Andreas Karcher et al.
Software bots have attracted increasing interest and popularity in both research and society. Their contributions span automation, digital twins, game characters with conscious-like behavior, and social media. However, there is still a lack of intelligent bots that can adapt to web environments' variability and dynamic nature. Unlike human users, they have difficulty understanding and exploiting the affordances across multiple virtual environments. Despite the hype, bots with human user-like cognition do not currently exist. Chatbots, for instance, lack situational awareness on the digital platforms where they operate, preventing them from enacting meaningful and autonomous intelligent behavior similar to human users. In this survey, we aim to explore the role of cognitive architectures in supporting efforts towards engineering software bots with advanced general intelligence. We discuss how cognitive architectures can contribute to creating intelligent software bots. Furthermore, we highlight key architectural recommendations for the future development of autonomous, user-like cognitive bots.
AIApr 30
A Pattern Language for Resilient Visual AgentsHabtom Kahsay Gidey, Alexander Lenz, Alois Knoll
Integrating multimodal foundation models into enterprise ecosystems presents a fundamental software architecture challenge. Architects must balance competing quality attributes: the high latency and non-determinism of vision language action (VLA) models versus the strict determinism and real-time performance required by enterprise control loops. In this study, we propose an architectural pattern language for visual agents that separates fast, deterministic reflexes from slow, probabilistic supervision. It consists of four architectural design patterns: (1) Hybrid Affordance Integration, (2) Adaptive Visual Anchoring, (3) Visual Hierarchy Synthesis, and (4) Semantic Scene Graph.
AIOct 28, 2025
Affordance Representation and Recognition for Autonomous AgentsHabtom Kahsay Gidey, Niklas Huber, Alexander Lenz et al.
The autonomy of software agents is fundamentally dependent on their ability to construct an actionable internal world model from the structured data that defines their digital environment, such as the Document Object Model (DOM) of web pages and the semantic descriptions of web services. However, constructing this world model from raw structured data presents two critical challenges: the verbosity of raw HTML makes it computationally intractable for direct use by foundation models, while the static nature of hardcoded API integrations prevents agents from adapting to evolving services. This paper introduces a pattern language for world modeling from structured data, presenting two complementary architectural patterns. The DOM Transduction Pattern addresses the challenge of web page complexity by distilling} a verbose, raw DOM into a compact, task-relevant representation or world model optimized for an agent's reasoning core. Concurrently, the Hypermedia Affordances Recognition Pattern enables the agent to dynamically enrich its world model by parsing standardized semantic descriptions to discover and integrate the capabilities of unknown web services at runtime. Together, these patterns provide a robust framework for engineering agents that can efficiently construct and maintain an accurate world model, enabling scalable, adaptive, and interoperable automation across the web and its extended resources.
AIOct 10, 2025
Fundamentals of Building Autonomous LLM AgentsVictor de Lamo Castrillo, Habtom Kahsay Gidey, Alexander Lenz et al.
This paper reviews the architecture and implementation methods of agents powered by large language models (LLMs). Motivated by the limitations of traditional LLMs in real-world tasks, the research aims to explore patterns to develop "agentic" LLMs that can automate complex tasks and bridge the performance gap with human capabilities. Key components include a perception system that converts environmental percepts into meaningful representations; a reasoning system that formulates plans, adapts to feedback, and evaluates actions through different techniques like Chain-of-Thought and Tree-of-Thought; a memory system that retains knowledge through both short-term and long-term mechanisms; and an execution system that translates internal decisions into concrete actions. This paper shows how integrating these systems leads to more capable and generalized software bots that mimic human cognitive processes for autonomous and intelligent behavior.
DCJun 13, 2024
A Document-based Knowledge Discovery with Microservices ArchitectureHabtom Kahsay Gidey, Mario Kesseler, Patrick Stangl et al.
The first step towards digitalization within organizations lies in digitization - the conversion of analog data into digitally stored data. This basic step is the prerequisite for all following activities like the digitalization of processes or the servitization of products or offerings. However, digitization itself often leads to 'data-rich' but 'knowledge-poor' material. Knowledge discovery and knowledge extraction as approaches try to increase the usefulness of digitized data. In this paper, we point out the key challenges in the context of knowledge discovery and present an approach to addressing these using a microservices architecture. Our solution led to a conceptual design focusing on keyword extraction, similarity calculation of documents, database queries in natural language, and programming language independent provision of the extracted information. In addition, the conceptual design provides referential design guidelines for integrating processes and applications for semi-automatic learning, editing, and visualization of ontologies. The concept also uses a microservices architecture to address non-functional requirements, such as scalability and resilience. The evaluation of the specified requirements is performed using a demonstrator that implements the concept. Furthermore, this modern approach is used in the German patent office in an extended version.
AIMay 26, 2023
Towards Cognitive Bots: Architectural Research ChallengesHabtom Kahsay Gidey, Peter Hillmann, Andreas Karcher et al.
Software bots operating in multiple virtual digital platforms must understand the platforms' affordances and behave like human users. Platform affordances or features differ from one application platform to another or through a life cycle, requiring such bots to be adaptable. Moreover, bots in such platforms could cooperate with humans or other software agents for work or to learn specific behavior patterns. However, present-day bots, particularly chatbots, other than language processing and prediction, are far from reaching a human user's behavior level within complex business information systems. They lack the cognitive capabilities to sense and act in such virtual environments, rendering their development a challenge to artificial general intelligence research. In this study, we problematize and investigate assumptions in conceptualizing software bot architecture by directing attention to significant architectural research challenges in developing cognitive bots endowed with complex behavior for operation on information systems. As an outlook, we propose alternate architectural assumptions to consider in future bot design and bot development frameworks.