Martin Fischer

CV
h-index16
17papers
3,202citations
Novelty34%
AI Score42

17 Papers

CLApr 18, 2023
BIM-GPT: a Prompt-Based Virtual Assistant Framework for BIM Information Retrieval

Junwen Zheng, Martin Fischer

Efficient information retrieval (IR) from building information models (BIMs) poses significant challenges due to the necessity for deep BIM knowledge or extensive engineering efforts for automation. We introduce BIM-GPT, a prompt-based virtual assistant (VA) framework integrating BIM and generative pre-trained transformer (GPT) technologies to support NL-based IR. A prompt manager and dynamic template generate prompts for GPT models, enabling interpretation of NL queries, summarization of retrieved information, and answering BIM-related questions. In tests on a BIM IR dataset, our approach achieved 83.5% and 99.5% accuracy rates for classifying NL queries with no data and 2% data incorporated in prompts, respectively. Additionally, we validated the functionality of BIM-GPT through a VA prototype for a hospital building. This research contributes to the development of effective and versatile VAs for BIM IR in the construction industry, significantly enhancing BIM accessibility and reducing engineering efforts and training data requirements for processing NL queries.

CVJun 28, 2023Code
Points for Energy Renovation (PointER): A LiDAR-Derived Point Cloud Dataset of One Million English Buildings Linked to Energy Characteristics

Sebastian Krapf, Kevin Mayer, Martin Fischer

Rapid renovation of Europe's inefficient buildings is required to reduce climate change. However, analyzing and evaluating buildings at scale is challenging because every building is unique. In current practice, the energy performance of buildings is assessed during on-site visits, which are slow, costly, and local. This paper presents a building point cloud dataset that promotes a data-driven, large-scale understanding of the 3D representation of buildings and their energy characteristics. We generate building point clouds by intersecting building footprints with geo-referenced LiDAR data and link them with attributes from UK's energy performance database via the Unique Property Reference Number (UPRN). To achieve a representative sample, we select one million buildings from a range of rural and urban regions across England, of which half a million are linked to energy characteristics. Building point clouds in new regions can be generated with the open-source code published alongside the paper. The dataset enables novel research in building energy modeling and can be easily expanded to other research fields by adding building features via the UPRN or geo-location.

HCJan 8, 2023
Digital Twin: Where do humans fit in?

Ashwin Agrawal, Robert Thiel, Pooja Jain et al.

Digital Twin (DT) technology is far from being comprehensive and mature, resulting in their piecemeal implementation in practice where some functions are automated by DTs, and others are still performed by humans. This piecemeal implementation of DTs often leaves practitioners wondering what roles (or functions) to allocate to DTs in a work system, and how might it impact humans. A lack of knowledge about the roles that humans and DTs play in a work system can result in significant costs, misallocation of resources, unrealistic expectations from DTs, and strategic misalignments. To alleviate this challenge, this paper answers the research question: When humans work with DTs, what types of roles can a DT play, and to what extent can those roles be automated? Specifically, we propose a two-dimensional conceptual framework, Levels of Digital Twin (LoDT). The framework is an integration of the types of roles a DT can play, broadly categorized under (1) Observer, (2) Analyst, (3) Decision Maker, and (4) Action Executor, and the extent of automation for each of these roles, divided into five different levels ranging from completely manual to fully automated. A particular DT can play any number of roles at varying levels. The framework can help practitioners systematically plan DT deployments, clearly communicate goals and deliverables, and lay out a strategic vision. A case study illustrates the usefulness of the framework.

HCOct 11, 2022
A new perspective on Digital Twins: Imparting intelligence and agency to entities

Ashwin Agrawal, Vishal Singh, Martin Fischer

Despite the Digital Twin (DT) concept being in the industry for a long time, it remains ambiguous, unable to differentiate itself from information models, general computing, and simulation technologies. Part of this confusion stems from previous studies overlooking the DT's bidirectional nature, that enables the shift of agency (delegating control) from humans to physical elements, something that was not possible with earlier technologies. Thus, we present DTs in a new light by viewing them as a means of imparting intelligence and agency to entities, emphasizing that DTs are not just expert-centric tools but are active systems that extend the capabilities of the entities being twinned. This new perspective on DTs can help reduce confusion and humanize the concept by starting discussions about how intelligent a DT should be, and its roles and responsibilities, as well as setting a long-term direction for DTs.

CVJun 5, 2022
Estimating building energy efficiency from street view imagery, aerial imagery, and land surface temperature data

Kevin Mayer, Lukas Haas, Tianyuan Huang et al.

Current methods to determine the energy efficiency of buildings require on-site visits of certified energy auditors which makes the process slow, costly, and geographically incomplete. To accelerate the identification of promising retrofit targets on a large scale, we propose to estimate building energy efficiency from widely available and remotely sensed data sources only, namely street view, aerial view, footprint, and satellite-borne land surface temperature (LST) data. After collecting data for almost 40,000 buildings in the United Kingdom, we combine these data sources by training multiple end-to-end deep learning models with the objective to classify buildings as energy efficient (EU rating A-D) or inefficient (EU rating E-G). After evaluating the trained models quantitatively as well as qualitatively, we extend our analysis by studying the predictive power of each data source in an ablation study. We find that the end-to-end deep learning model trained on all four data sources achieves a macro-averaged F1 score of 64.64% and outperforms the k-NN and SVM-based baseline models by 14.13 to 12.02 percentage points, respectively. Thus, this work shows the potential and complementary nature of remotely sensed data in predicting energy efficiency and opens up new opportunities for future work to integrate additional data sources.

GRJan 22
Deep Sketch-Based 3D Modeling: A Survey

Alberto Tono, Jiajun Wu, Gordon Wetzstein et al.

In the past decade, advances in artificial intelligence have revolutionized sketch-based 3D modeling, leading to a new paradigm known as Deep Sketch-Based 3D Modeling (DS-3DM). DS-3DM offers data-driven methods that address the long-standing challenges of sketch abstraction and ambiguity. DS-3DM keeps humans at the center of the creative process by enhancing the flexibility, usability, faithfulness, and adaptability of sketch-based 3D modeling interfaces. This paper contributes a comprehensive survey of the latest DS-3DM within a novel design space: MORPHEUS. Built upon the Input-Model-Output (IMO) framework, MORPHEUS categorizes Models outputting Options of 3D Representations and Parts, derived from Human inputs (varying in quantity and modality), and Evaluated across diverse User-views and Styles. Throughout MORPHEUS we highlight limitations and identify opportunities for interdisciplinary research in Computer Vision, Computer Graphics, and Human-Computer Interaction, revealing a need for controllability and information-rich outputs. These opportunities align design processes more closely with user' intent, responding to the growing importance of user-centered approaches.

CVOct 24, 2022
Vitruvio: 3D Building Meshes via Single Perspective Sketches

Alberto Tono, Heyaojing Huang, Ashwin Agrawal et al.

Today's architectural engineering and construction (AEC) software require a learning curve to generate a three-dimension building representation. This limits the ability to quickly validate the volumetric implications of an initial design idea communicated via a single sketch. Allowing designers to translate a single sketch to a 3D building will enable owners to instantly visualize 3D project information without the cognitive load required. If previous state-of-the-art (SOTA) data-driven methods for single view reconstruction (SVR) showed outstanding results in the reconstruction process from a single image or sketch, they lacked specific applications, analysis, and experiments in the AEC. Therefore, this research addresses this gap, introducing the first deep learning method focused only on buildings that aim to convert a single sketch to a 3D building mesh: Vitruvio. Vitruvio adapts Occupancy Network for SVR tasks on a specific building dataset (Manhattan 1K). This adaptation brings two main improvements. First, it accelerates the inference process by more than 26% (from 0.5s to 0.37s). Second, it increases the reconstruction accuracy (measured by the Chamfer Distance) by 18%. During this adaptation in the AEC domain, we evaluate the effect of the building orientation in the learning procedure since it constitutes an important design factor. While aligning all the buildings to a canonical pose improved the overall quantitative metrics, it did not capture fine-grain details in more complex building shapes (as shown in our qualitative analysis). Finally, Vitruvio outputs a 3D-printable building mesh with arbitrary topology and genus from a single perspective sketch, providing a step forward to allow owners and designers to communicate 3D information via a 2D, effective, intuitive, and universal communication medium: the sketch.

HCJun 29, 2023
LeanAI: A method for AEC practitioners to effectively plan AI implementations

Ashwin Agrawal, Vishal Singh, Martin Fischer

Recent developments in Artificial Intelligence (AI) provide unprecedented automation opportunities in the Architecture, Engineering, and Construction (AEC) industry. However, despite the enthusiasm regarding the use of AI, 85% of current big data projects fail. One of the main reasons for AI project failures in the AEC industry is the disconnect between those who plan or decide to use AI and those who implement it. AEC practitioners often lack a clear understanding of the capabilities and limitations of AI, leading to a failure to distinguish between what AI should solve, what it can solve, and what it will solve, treating these categories as if they are interchangeable. This lack of understanding results in the disconnect between AI planning and implementation because the planning is based on a vision of what AI should solve without considering if it can or will solve it. To address this challenge, this work introduces the LeanAI method. The method has been developed using data from several ongoing longitudinal studies analyzing AI implementations in the AEC industry, which involved 50+ hours of interview data. The LeanAI method delineates what AI should solve, what it can solve, and what it will solve, forcing practitioners to clearly articulate these components early in the planning process itself by involving the relevant stakeholders. By utilizing the method, practitioners can effectively plan AI implementations, thus increasing the likelihood of success and ultimately speeding up the adoption of AI. A case example illustrates the usefulness of the method.

CVAug 28, 2025Code
SYNBUILD-3D: A large, multi-modal, and semantically rich synthetic dataset of 3D building models at Level of Detail 4

Kevin Mayer, Alex Vesel, Xinyi Zhao et al.

3D building models are critical for applications in architecture, energy simulation, and navigation. Yet, generating accurate and semantically rich 3D buildings automatically remains a major challenge due to the lack of large-scale annotated datasets in the public domain. Inspired by the success of synthetic data in computer vision, we introduce SYNBUILD-3D, a large, diverse, and multi-modal dataset of over 6.2 million synthetic 3D residential buildings at Level of Detail (LoD) 4. In the dataset, each building is represented through three distinct modalities: a semantically enriched 3D wireframe graph at LoD 4 (Modality I), the corresponding floor plan images (Modality II), and a LiDAR-like roof point cloud (Modality III). The semantic annotations for each building wireframe are derived from the corresponding floor plan images and include information on rooms, doors, and windows. Through its tri-modal nature, future work can use SYNBUILD-3D to develop novel generative AI algorithms that automate the creation of 3D building models at LoD 4, subject to predefined floor plan layouts and roof geometries, while enforcing semantic-geometric consistency. Dataset and code samples are publicly available at https://github.com/kdmayer/SYNBUILD-3D.

ROFeb 20, 2025
A Mobile Robotic Approach to Autonomous Surface Scanning in Legal Medicine

Sarah Grube, Sarah Latus, Martin Fischer et al.

Purpose: Comprehensive legal medicine documentation includes both an internal but also an external examination of the corpse. Typically, this documentation is conducted manually during conventional autopsy. A systematic digital documentation would be desirable, especially for the external examination of wounds, which is becoming more relevant for legal medicine analysis. For this purpose, RGB surface scanning has been introduced. While a manual full surface scan using a handheld camera is timeconsuming and operator dependent, floor or ceiling mounted robotic systems require substantial space and a dedicated room. Hence, we consider whether a mobile robotic system can be used for external documentation. Methods: We develop a mobile robotic system that enables full-body RGB-D surface scanning. Our work includes a detailed configuration space analysis to identify the environmental parameters that need to be considered to successfully perform a surface scan. We validate our findings through an experimental study in the lab and demonstrate the system's application in a legal medicine environment. Results: Our configuration space analysis shows that a good trade-off between coverage and time is reached with three robot base positions, leading to a coverage of 94.96 %. Experiments validate the effectiveness of the system in accurately capturing body surface geometry with an average surface coverage of 96.90 +- 3.16 % and 92.45 +- 1.43 % for a body phantom and actual corpses, respectively. Conclusion: This work demonstrates the potential of a mobile robotic system to automate RGB-D surface scanning in legal medicine, complementing the use of post-mortem CT scans for inner documentation. Our results indicate that the proposed system can contribute to more efficient and autonomous legal medicine documentation, reducing the need for manual intervention.

SEJan 14, 2022
Digital Twin: From Concept to Practice

Ashwin Agrawal, Martin Fischer, Vishal Singh

Recent technological developments and advances in Artificial Intelligence (AI) have enabled sophisticated capabilities to be a part of Digital Twin (DT), virtually making it possible to introduce automation into all aspects of work processes. Given these possibilities that DT can offer, practitioners are facing increasingly difficult decisions regarding what capabilities to select while deploying a DT in practice. The lack of research in this field has not helped either. It has resulted in the rebranding and reuse of emerging technological capabilities like prediction, simulation, AI, and Machine Learning (ML) as necessary constituents of DT. Inappropriate selection of capabilities in a DT can result in missed opportunities, strategic misalignments, inflated expectations, and risk of it being rejected as just hype by the practitioners. To alleviate this challenge, this paper proposes the digitalization framework, designed and developed by following a Design Science Research (DSR) methodology over a period of 18 months. The framework can help practitioners select an appropriate level of sophistication in a DT by weighing the pros and cons for each level, deciding evaluation criteria for the digital twin system, and assessing the implications of the selected DT on the organizational processes and strategies, and value creation. Three real-life case studies illustrate the application and usefulness of the framework.

CRApr 15, 2021
Measuring the Impact of Blockchain and Smart Contract on Construction Supply Chain Visibility

Hesam Hamledari, Martin Fischer

This work assesses the impact of blockchain and smart contract on the visibility of construction supply chain and in the context of payments (intersection of cash and product flows). It uses comparative empirical experiments (Charrette Test Method) to draw comparisons between the visibility of state-of-practice and blockchain-enabled payment systems in a commercial construction project. Comparisons were drawn across four levels of granularity. The findings are twofold: 1) blockchain improved information completeness and information accuracy respectively by an average 216% and 261% compared with the digital state-of-practice solution. The improvements were significantly more pronounced for inquiries that had higher product, trade, and temporal granularity; 2) blockchain-enabled solution was robust in the face of increased granularity, while the conventional solution experienced 50% and 66.7% decline respectively in completeness and accuracy of information. The paper concludes with a discussion of mechanisms contributing to visibility and technology adoption based on business objectives.

CRDec 3, 2020
The Application of Blockchain-Based Crypto Assets for Integrating the Physical and Financial Supply Chains in the Construction & Engineering Industry

Hesam Hamledari, Martin Fischer

Supply chain integration remains an elusive goal for the construction and engineering industry. The high degree of fragmentation and the reliance on third-party financial institutions has pushed the physical and financial supply chains apart. The paper demonstrates how blockchain-based crypto assets (crypto currencies and crypto tokens) can address this limitation when used for conditioning the flow of funds based on the flow of products. The paper contrasts the integration between cash and product flows in supply chains that rely on fiat currencies and crypto assets for their payment settlement. Two facets of crypto asset-enabled integration, atomicity and granularity, are further introduced. The thesis is validated in the context of construction progress payments. The as-built data captured by unmanned aerial and ground vehicles was passed to an autonomous smart contract-based method that utilizes crypto-currencies and crypto tokens for payment settlement; the resulting payment datasets, written to the Ethereum blockchain, were analyzed in terms of their integration of product and cash flow. The work is concluded with a discussion of findings and their implications for the industry.

CROct 28, 2020
Construction Payment Automation Using Blockchain-Enabled Smart Contracts and Reality Capture Technologies

Hesam Hamledari, Martin Fischer

This paper presents a smart contract-based solution for autonomous administration of construction progress payments. It bridges the gap between payments (cash flow) and the progress assessments at job sites (product flow) enabled by reality capture technologies and building information modeling (BIM). The approach eliminates the reliance on the centralized and heavily intermediated mechanisms of existing payment applications. The construction progress is stored in a distributed manner using content addressable file sharing; it is broadcasted to a smart contract which automates the on-chain payment settlements and the transfer of lien rights. The method was successfully used for processing payments to 7 subcontractors in two commercial construction projects where progress monitoring was performed using a camera-equipped unmanned aerial vehicle (UAV) and an unmanned ground vehicle (UGV) equipped with a laser scanner. The results show promise for the method's potential for increasing the frequency, granularity, and transparency of payments. The paper is concluded with a discussion of implications for project management, introducing a new model of project as a singleton state machine.

CVOct 6, 2019
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera

Iro Armeni, Zhi-Yang He, JunYoung Gwak et al.

A comprehensive semantic understanding of a scene is important for many applications - but in what space should diverse semantic information (e.g., objects, scene categories, material types, texture, etc.) be grounded and what should be its structure? Aspiring to have one unified structure that hosts diverse types of semantics, we follow the Scene Graph paradigm in 3D, generating a 3D Scene Graph. Given a 3D mesh and registered panoramic images, we construct a graph that spans the entire building and includes semantics on objects (e.g., class, material, and other attributes), rooms (e.g., scene category, volume, etc.) and cameras (e.g., location, etc.), as well as the relationships among these entities. However, this process is prohibitively labor heavy if done manually. To alleviate this we devise a semi-automatic framework that employs existing detection methods and enhances them using two main constraints: I. framing of query images sampled on panoramas to maximize the performance of 2D detectors, and II. multi-view consistency enforcement across 2D detections that originate in different camera locations.

CVNov 5, 2018
Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

Spyridon Bakas, Mauricio Reyes, Andras Jakab et al.

Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.

CVAug 17, 2017
Conditional Adversarial Network for Semantic Segmentation of Brain Tumor

Mina Rezaei, Konstantin Harmuth, Willi Gierke et al.

Automated medical image analysis has a significant value in diagnosis and treatment of lesions. Brain tumors segmentation has a special importance and difficulty due to the difference in appearances and shapes of the different tumor regions in magnetic resonance images. Additionally, the data sets are heterogeneous and usually limited in size in comparison with the computer vision problems. The recently proposed adversarial training has shown promising results in generative image modeling. In this paper, we propose a novel end-to-end trainable architecture for brain tumor semantic segmentation through conditional adversarial training. We exploit conditional Generative Adversarial Network (cGAN) and train a semantic segmentation Convolution Neural Network (CNN) along with an adversarial network that discriminates segmentation maps coming from the ground truth or from the segmentation network for BraTS 2017 segmentation task[15, 4, 2, 3]. We also propose an end-to-end trainable CNN for survival day prediction based on deep learning techniques for BraTS 2017 prediction task [15, 4, 2, 3]. The experimental results demonstrate the superior ability of the proposed approach for both tasks. The proposed model achieves on validation data a DICE score, Sensitivity and Specificity respectively 0.68, 0.99 and 0.98 for the whole tumor, regarding online judgment system.