Using ML filters to help automated vulnerability repairs: when it helps and when it doesn't
This is an incremental improvement for automated program repair, potentially speeding up patch generation by integrating ML filters with testing oracles.
The paper tackles the problem of unreliable ML models in automated vulnerability repair by proposing an ML filter before traditional testing, identifying theoretical bounds on precision and recall to determine when this approach is effective.
[Context:] The acceptance of candidate patches in automated program repair has typically been based on testing oracles. Testing typically requires a costly process of building the application, while ML models can classify patches quickly, thus allowing more candidate patches to be generated in a positive feedback loop. [Problem:] If the model predictions are unreliable (as in vulnerability detection), they can hardly replace the more reliable oracles based on testing. [New Idea:] We propose to use an ML model as a preliminary filter of candidate patches, placed in front of a traditional filter based on testing. [Preliminary Results:] We identify theoretical bounds on the precision and recall of the ML algorithm that make such an operation meaningful in practice. With these bounds and the results published in the literature, we calculate how fast some state-of-the-art vulnerability detectors must be to be more effective than a traditional AVR pipeline, such as APR4Vuln, based just on testing.