Enabling Inter-organizational Analytics in Business Networks Through Meta Machine Learning
This work addresses data sharing and analytics in business networks and aims to make such networks more effective by removing key obstacles, though the contribution appears incremental because it builds on existing methods for distributed data.
The paper tackles analytics across inter-organizational business networks, where data is distributed and sensitive. It proposes a meta machine learning method that preserves confidentiality and limits data transfer, outperforms isolated analyses, and approaches the performance of a hypothetical full-data-sharing scenario.
Successful analytics solutions that provide valuable insights often hinge on the connection of various data sources. While it is often feasible to generate larger data pools within organizations, the application of analytics within (inter-organizational) business networks is still severely constrained. As data is distributed across several legal units, potentially even across countries, the fear of disclosing sensitive information and the sheer volume of data that would need to be exchanged are key inhibitors to creating system-wide solutions that still reach superior prediction performance. In this work, we propose a meta machine learning method that addresses these obstacles to enable comprehensive analyses within a business network. We follow a design science research approach and evaluate our method with respect to feasibility and performance in an industrial use case. First, we show that it is feasible to perform network-wide analyses that preserve data confidentiality and limit data transfer volume. Second, we demonstrate that our method outperforms a conventional isolated analysis and even gets close to a (hypothetical) scenario where all data could be shared within the network. Thus, we provide a fundamental contribution to making business networks more effective, as we remove a key obstacle to tapping the huge potential of learning from data that is scattered throughout the network.
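
The abstract does not specify how the meta machine learning method works, so the following is only an illustrative sketch of one possible reading, not the authors' implementation. It assumes a stacking-style setup: two hypothetical organizations (`org_a`, `org_b`) each hold one private feature for the same cases, both know the shared outcome, each fits a local model, and only the local predictions are passed to a meta-model. All names, the synthetic data, and the choice of linear models are assumptions made for illustration.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y ~ a*x + b on a single feature."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def predict_line(model, x):
    a, b = model
    return [a * xi + b for xi in x]

def fit_meta(p1, p2, y):
    """Least squares for y ~ w1*p1 + w2*p2 + c (centered 2x2 normal equations)."""
    n = len(y)
    m1, m2, my = sum(p1) / n, sum(p2) / n, sum(y) / n
    s11 = sum((u - m1) ** 2 for u in p1)
    s22 = sum((v - m2) ** 2 for v in p2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(p1, p2))
    s1y = sum((u - m1) * (t - my) for u, t in zip(p1, y))
    s2y = sum((v - m2) * (t - my) for v, t in zip(p2, y))
    det = s11 * s22 - s12 ** 2
    w1 = (s1y * s22 - s2y * s12) / det
    w2 = (s2y * s11 - s1y * s12) / det
    return w1, w2, my - w1 * m1 - w2 * m2

def predict_meta(model, p1, p2):
    w1, w2, c = model
    return [w1 * u + w2 * v + c for u, v in zip(p1, p2)]

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Synthetic data (assumption): the outcome depends on both private features.
random.seed(0)
n_total, n_train = 300, 200
x_a = [random.gauss(0, 1) for _ in range(n_total)]  # visible to org_a only
x_b = [random.gauss(0, 1) for _ in range(n_total)]  # visible to org_b only
y = [2 * a - 3 * b + random.gauss(0, 0.1) for a, b in zip(x_a, x_b)]

train, test = slice(0, n_train), slice(n_train, n_total)

# Each organization trains on its own feature; raw features never leave it.
org_a = fit_line(x_a[train], y[train])
org_b = fit_line(x_b[train], y[train])

# Only predictions (one number per case and organization) are exchanged.
meta = fit_meta(predict_line(org_a, x_a[train]),
                predict_line(org_b, x_b[train]), y[train])

pred_a = predict_line(org_a, x_a[test])
pred_b = predict_line(org_b, x_b[test])
mse_a = mse(y[test], pred_a)                               # isolated analysis, org_a
mse_b = mse(y[test], pred_b)                               # isolated analysis, org_b
mse_meta = mse(y[test], predict_meta(meta, pred_a, pred_b))  # network-wide analysis

print(f"isolated A: {mse_a:.3f}  isolated B: {mse_b:.3f}  meta: {mse_meta:.3f}")
```

In this toy setting the meta-model sees one prediction per case from each organization instead of the underlying features, which illustrates how transfer volume can be limited while the combined model still improves on either isolated model. Whether exchanging predictions is sufficient to protect confidentiality in a real network depends on the use case and is not settled by this sketch.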