MSMar 27, 2015
ALGORITHM 937: MINRES-QLP for Singular Symmetric and Hermitian Linear Equations and Least-Squares ProblemsSou-Cheng T. Choi, Michael A. Saunders
We describe algorithm MINRES-QLP and its FORTRAN 90 implementation for solving symmetric or Hermitian linear systems or least-squares problems. If the system is singular, MINRES-QLP computes the unique minimum-length solution (also known as the pseudoinverse solution), which generally eludes MINRES. In all cases, it overcomes a potential instability in the original MINRES algorithm. A positive-definite preconditioner may be supplied. Our FORTRAN 90 implementation illustrates a design pattern that allows users to make problem data known to the solver but hidden and secure from other program units. In particular, we circumvent the need for reverse communication. While we focus here on a FORTRAN 90 implementation, we also provide and maintain MATLAB versions of MINRES and MINRES-QLP.
NANov 17, 2016
Local Adaption for Approximation and Minimization of Univariate FunctionsSou-Cheng T. Choi, Yuhan Ding, Fred J. Hickernell et al.
Most commonly used \emph{adaptive} algorithms for univariate real-valued function approximation and global minimization lack theoretical guarantees. Our new locally adaptive algorithms are guaranteed to provide answers that satisfy a user-specified absolute error tolerance for a cone, $\mathcal{C}$, of non-spiky input functions in the Sobolev space $W^{2,\infty}[a,b]$. Our algorithms automatically determine where to sample the function---sampling more densely where the second derivative is larger. The computational cost of our algorithm for approximating a univariate function $f$ on a bounded interval with $L^{\infty}$-error no greater than $\varepsilon$ is $\mathcal{O}\Bigl(\sqrt{{\left\|f"\right\|}_{\frac12}/\varepsilon}\Bigr)$ as $\varepsilon \to 0$. This is the same order as that of the best function approximation algorithm for functions in $\mathcal{C}$. The computational cost of our global minimization algorithm is of the same order and the cost can be substantially less if $f$ significantly exceeds its minimum over much of the domain. Our Guaranteed Automatic Integration Library (GAIL) contains these new algorithms. We provide numerical experiments to illustrate their superior performance.
CVFeb 5, 2018
Real-time Prediction of Intermediate-Horizon Automotive Collision RiskBlake Wulfe, Sunil Chintakindi, Sou-Cheng T. Choi et al.
Advanced collision avoidance and driver hand-off systems can benefit from the ability to accurately predict, in real time, the probability a vehicle will be involved in a collision within an intermediate horizon of 10 to 20 seconds. The rarity of collisions in real-world data poses a significant challenge to developing this capability because, as we demonstrate empirically, intermediate-horizon risk prediction depends heavily on high-dimensional driver behavioral features. As a result, a large amount of data is required to fit an effective predictive model. In this paper, we assess whether simulated data can help alleviate this issue. Focusing on highway driving, we present a three-step approach for generating data and fitting a predictive model capable of real-time prediction. First, high-risk automotive scenes are generated using importance sampling on a learned Bayesian network scene model. Second, collision risk is estimated through Monte Carlo simulation. Third, a neural network domain adaptation model is trained on real and simulated data to address discrepancies between the two domains. Experiments indicate that simulated data can mitigate issues resulting from collision rarity, thereby improving risk prediction in real-world data.
SEFeb 6, 2016
Report on the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3)Daniel S. Katz, Sou-Cheng T. Choi, Kyle E. Niemeyer et al.
This report records and discusses the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3). The report includes a description of the keynote presentation of the workshop, which served as an overview of sustainable scientific software. It also summarizes a set of lightning talks in which speakers highlighted to-the-point lessons and challenges pertaining to sustaining scientific software. The final and main contribution of the report is a summary of the discussions, future steps, and future organization for a set of self-organized working groups on topics including developing pathways to funding scientific software; constructing useful common metrics for crediting software stakeholders; identifying principles for sustainable software engineering design; reaching out to research software organizations around the world; and building communities for software sustainability. For each group, we include a point of contact and a landing page that can be used by those who want to join that group's future activities. The main challenge left by the workshop is to see if the groups will execute these activities that they have scheduled, and how the WSSSPE community can encourage this to happen.
LGOct 3, 2015
Machine Learning for Machine Data from a CATI NetworkSou-Cheng T. Choi
This is a machine learning application paper involving big data. We present high-accuracy prediction methods of rare events in semi-structured machine log files, which are produced at high velocity and high volume by NORC's computer-assisted telephone interviewing (CATI) network for conducting surveys. We judiciously apply natural language processing (NLP) techniques and data-mining strategies to train effective learning and prediction models for classifying uncommon error messages in the log---without access to source code, updated documentation or dictionaries. In particular, our simple but effective approach of features preallocation for learning from imbalanced data coupled with naive Bayes classifiers can be conceivably generalized to supervised or semi-supervised learning and prediction methods for other critical events such as cyberattack detection.
SEJul 7, 2015
Report on the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2)Daniel S. Katz, Sou-Cheng T. Choi, Nancy Wilkins-Diehr et al.
This technical report records and discusses the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2). The report includes a description of the alternative, experimental submission and review process, two workshop keynote presentations, a series of lightning talks, a discussion on sustainability, and five discussions from the topic areas of exploring sustainability; software development experiences; credit & incentives; reproducibility & reuse & sharing; and code testing & code review. For each topic, the report includes a list of tangible actions that were proposed and that would lead to potential change. The workshop recognized that reliance on scientific software is pervasive in all areas of world-leading research today. The workshop participants then proceeded to explore different perspectives on the concept of sustainability. Key enablers and barriers of sustainable scientific software were identified from their experiences. In addition, recommendations with new requirements such as software credit files and software prize frameworks were outlined for improving practices in sustainable software engineering. There was also broad consensus that formal training in software development or engineering was rare among the practitioners. Significant strides need to be made in building a sense of community via training in software and technical practices, on increasing their size and scope, and on better integrating them directly into graduate education programs. Finally, journals can define and publish policies to improve reproducibility, whereas reviewers can insist that authors provide sufficient information and access to data and software to allow them reproduce the results in the paper. Hence a list of criteria is compiled for journals to provide to reviewers so as to make it easier to review software submitted for publication as a "Software Paper."
SEApr 29, 2014
Summary of the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1)Daniel S. Katz, Sou-Cheng T. Choi, Hilmar Lapp et al.
Challenges related to development, deployment, and maintenance of reusable software for science are becoming a growing concern. Many scientists' research increasingly depends on the quality and availability of software upon which their works are built. To highlight some of these issues and share experiences, the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1) was held in November 2013 in conjunction with the SC13 Conference. The workshop featured keynote presentations and a large number (54) of solicited extended abstracts that were grouped into three themes and presented via panels. A set of collaborative notes of the presentations and discussion was taken during the workshop. Unique perspectives were captured about issues such as comprehensive documentation, development and deployment practices, software licenses and career paths for developers. Attribution systems that account for evidence of software contribution and impact were also discussed. These include mechanisms such as Digital Object Identifiers, publication of "software papers", and the use of online systems, for example source code repositories like GitHub. This paper summarizes the issues and shared experiences that were discussed, including cross-cutting issues and use cases. It joins a nascent literature seeking to understand what drives software work in science, and how it is impacted by the reward systems of science. These incentives can determine the extent to which developers are motivated to build software for the long-term, for the use of others, and whether to work collaboratively or separately. It also explores community building, leadership, and dynamics in relation to successful scientific software.