SEAug 31, 2018
The use of Charts, Pivot Tables, and Array Formulas in two Popular Spreadsheet CorporaBas Jansen, Felienne Hermans
The use of spreadsheets in industry is widespread. Companies base decisions on information coming from spreadsheets. Unfortunately, spreadsheets are error-prone and this increases the risk that companies base their decisions on inaccurate information, which can lead to incorrect decisions and loss of money. In general, spreadsheet research is aimed to reduce the error-proneness of spreadsheets. Most research is concentrated on the use of formulas. However, there are other constructions in spreadsheets, like charts, pivot tables, and array formulas, that are also used to present decision support information to the user. There is almost no research about how these constructions are used. To improve spreadsheet quality it is important to understand how spreadsheets are used and to obtain a complete understanding, the use of charts, pivot tables, and array formulas should be included in research. In this paper, we analyze two popular spreadsheet corpora: Enron and EUSES on the use of the aforementioned constructions.
SEMar 13, 2015
Enron versus EUSES: A Comparison of Two Spreadsheet CorporaBas Jansen
Spreadsheets are widely used within companies and often form the basis for business decisions. Numerous cases are known where incorrect information in spreadsheets has lead to incorrect decisions. Such cases underline the relevance of research on the professional use of spreadsheets. Recently a new dataset became available for research, containing over 15.000 business spreadsheets that were extracted from the Enron E-mail Archive. With this dataset, we 1) aim to obtain a thorough understanding of the characteristics of spreadsheets used within companies, and 2) compare the characteristics of the Enron spreadsheets with the EUSES corpus which is the existing state of the art set of spreadsheets that is frequently used in spreadsheet studies. Our analysis shows that 1) the majority of spreadsheets are not large in terms of worksheets and formulas, do not have a high degree of coupling, and their formulas are relatively simple; 2) the spreadsheets from the EUSES corpus are, with respect to the measured characteristics, quite similar to the Enron spreadsheets.