MA: Building a Framework for Vulnerability-Fixing Commits

Abstract:

Security advisories disclose vulnerabilities in software systems. However, an advisory does not always reference the commit that fixes the vulnerability, which makes it difficult to determine the cause. Linking an advisory to its associated vulnerability-fixing commit (VFC) provides information that is crucial for risk assessment. Various ranking-based approaches have successfully supported the search for the VFC by ranking all candidate commits for an advisory according to their likelihood of fixing the issue. Incorporating language models as additional features for the prediction has further improved the results. Comparing these approaches is difficult because of the lack of training pipelines, the diversity of the training data used, and other factors. This thesis presents two key contributions: the vfc-dataset project, which unifies 11 existing datasets and augments their entries with additional data required for training language models; and the vfc-ranking-framework, which supports the development of ranking approaches by providing template implementations as well as benchmarks for evaluating novel and existing approaches. The framework’s capabilities are demonstrated by (re)implementing and evaluating two approaches on two large-scale benchmarks.