A few months ago, Jonathan Spring wrote an article called “Probably Don’t Rely on EPSS Yet.” From the title, you can probably guess what the subject is, and in this post, I’ll (belatedly) analyze his critiques of the Exploit Prediction Scoring System (EPSS).
If you’ve read my previous post on the EPSS, you’ll know that I described it as a “generally good tool for evaluating CVE exploitability,” and I stand by that relatively conservative endorsement. This also constitutes the bottom line of my response to Jonathan’s article: the EPSS is a good (and free) tool for doing what it claims to do.
Although I initially viewed some of Jonathan’s critiques as straw men, I’ll give him the benefit of the doubt based on his stated purpose of evaluating the “pros and cons” of the model. I’ll tackle the salient topics one at a time, responding to his exact words:
Risk management
EPSS cannot replace a vulnerability analysis or risk management process and should not be used by itself.
Agreed, but FIRST never said you should rely on EPSS alone. EPSS only gives you half of the equation for risk (likelihood). The other half (impact) cannot be provided through any broad standard like EPSS (or CVSS, for that matter) and requires a business impact analysis. The Deploy Securely Risk Assessment Model (DSRAM) combines both aspects of risk into a single output.
With that said, if I had to pick any single, free metric for quickly prioritizing CVEs, I would pick the EPSS. Since resources are always limited and vulnerabilities are numerous, having a machine-readable method for predicting exploitation is a key tool for any organization. And while you might neglect impact analysis as a result, being half-right beats flying completely blind.
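To make this concrete, here is a minimal sketch in Python of how likelihood and impact might be combined into an expected-loss ranking. The EPSS probabilities and dollar impacts below are hypothetical placeholders (not real scores or DSRAM outputs), and the weighting is the simplest possible one: likelihood times impact.

```python
# Minimal sketch: rank vulnerabilities by expected loss, i.e.
# EPSS likelihood x estimated business impact. All numbers are
# hypothetical placeholders, not actual EPSS scores.
findings = {
    # cve_id: (epss_probability, estimated_impact_usd)
    "CVE-2021-44228": (0.95, 250_000),
    "CVE-2020-0601": (0.10, 1_000_000),
    "CVE-2019-11510": (0.90, 50_000),
}

ranked = sorted(
    findings.items(),
    key=lambda item: item[1][0] * item[1][1],  # likelihood x impact
    reverse=True,
)
for cve, (likelihood, impact) in ranked:
    print(f"{cve}: expected loss ~ ${likelihood * impact:,.0f}")
```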
Applicability or lack thereof
EPSS uses machine learning to predict exploitation probabilities for each CVE ID (CVE IDs provide identifiers for vulnerabilities in IT products or protocols). This reliance on pre-existence of a CVE ID is one reason why EPSS is not useful to software suppliers, CSIRTs, and many bug bounty programs.
While I agree that it would be awesome if the EPSS applied to more than just CVEs and have advocated for the addition of a feature that allows for scoring of non-CVEs, the model doesn’t do this and never has claimed to.
From personal experience at multiple software suppliers, the majority of vulnerabilities identified during the development lifecycle are CVEs. Determining whether or not they are exploitable by analyzing the interaction between first- and third-party code is a major undertaking that can require huge amounts of time. While I fully admit the EPSS is not perfect, even a directionally accurate estimate of whether a given CVE is going to be exploited is hugely useful when dealing with hundreds or thousands of them.
Finally, if I were a bug bounty hunter, I would absolutely use the EPSS to target my efforts more effectively. For example, one could run a Shodan scan of a target network to identify all internet-facing CVEs, look up their EPSS scores, and then attempt to exploit them in descending order of priority. Many bounty programs exclude the mere report of a CVE from their rules of engagement and require proof of exploitability. The EPSS is a great way for ethical hackers to identify the most likely candidates; unethical hackers, unfortunately, can do the same.
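To illustrate, here is a short Python sketch of that lookup-and-sort step using FIRST’s public EPSS API. The endpoint and response fields reflect FIRST’s published API as I understand it, and the CVE list is a hypothetical stand-in for the output of a Shodan scan.

```python
# Sketch: prioritize a list of CVEs by their current EPSS scores.
import requests

def epss_scores(cve_ids):
    """Return {cve_id: epss_probability} from FIRST's EPSS API."""
    resp = requests.get(
        "https://api.first.org/data/v1/epss",
        params={"cve": ",".join(cve_ids)},
        timeout=30,
    )
    resp.raise_for_status()
    # Each row contains (among other fields) "cve" and "epss" (a string).
    return {row["cve"]: float(row["epss"]) for row in resp.json()["data"]}

# Hypothetical findings from a scan of a target network
findings = ["CVE-2021-44228", "CVE-2019-0708", "CVE-2017-0144"]
for cve, score in sorted(
    epss_scores(findings).items(), key=lambda kv: kv[1], reverse=True
):
    print(f"{cve}: {score:.3f}")
```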
Opacity
EPSS calls itself an “open, data-driven effort”—but it is only open in the sense that anyone can come and ask questions during the meetings to the handful of people who actually have access to the code and data for producing the scores.
This is a valid critique. I certainly think it makes sense for FIRST to eventually release the model’s code so that others can evaluate it.
With that said, it is a volunteer effort, so every marginal step toward transparency would take away from other initiatives. I think the federal government should consider giving FIRST funding to formalize this effort.
I also think that releasing anonymized data (not including the victim or anything about it) regarding real-world exploitations would be hugely helpful to the security community at large. While the EPSS data partners might be hesitant to freely give away their intellectual property, I would view such a move as an excellent marketing tool.
Very exploitable vulnerabilities
Furthermore, zero-day vulnerabilities may get a CVE ID upon publication and disclosure, but a zero day is almost always published because it is widely known to be exploited. The EPSS FAQ clarifies that vulnerabilities widely known to be exploited are out of scope for EPSS…It is useful as long as the organization is mature enough that it can distinguish and has capacity to address vulnerabilities that are “just below the obvious” threats of widely exploited vulnerabilities and the EPSS data provenance matches the organization (see below).
This point confused me substantially, as I am not sure what Jonathan is referring to here. I think his statement that “vulnerabilities widely known to be exploited are out of scope for EPSS” is mistaken. The EPSS FAQ (published after the initial version of his article, which he subsequently updated) does a good job of explaining why even vulnerabilities known to be exploited receive a score of less than 100%.
Environment details
No organization should assume that its environment matches the data used to train EPSS.
Agreed again, but I am not aware of anyone doing this. The EPSS is an “all other things equal” estimate of the likelihood of exploitation of a given CVE. It’s basically impossible for such a model to incorporate data from a specific environment when making predictions that need to be applicable to every conceivable system.
With that said, the EPSS is still far better than existing systems like CVSS, which don’t even attempt to include information about real-world exploitation in the score (assuming you pull the scores from the National Vulnerability Database). Although I admit that you can customize CVSS scores using its temporal and environmental metric groups, this is not possible to do in an automated or machine-readable manner (to my knowledge).
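For illustration, here is what that manual customization looks like for the CVSS v3.1 temporal metric group. The multipliers below are taken from my reading of the CVSS v3.1 specification, so verify them against the spec before relying on this sketch; the point is simply that someone has to hand-pick these values for every vulnerability.

```python
# Sketch: CVSS v3.1 temporal score adjustment. Multiplier values are
# drawn from the CVSS v3.1 specification (verify before use), and the
# round-up step approximates the spec's Roundup() function.
import math

E  = {"X": 1.0, "H": 1.0, "F": 0.97, "P": 0.94, "U": 0.91}  # Exploit Code Maturity
RL = {"X": 1.0, "U": 1.0, "W": 0.97, "T": 0.96, "O": 0.95}  # Remediation Level
RC = {"X": 1.0, "C": 1.0, "R": 0.96, "U": 0.92}             # Report Confidence

def temporal_score(base_score, e="X", rl="X", rc="X"):
    raw = base_score * E[e] * RL[rl] * RC[rc]
    return math.ceil(raw * 10) / 10  # round up to one decimal place

# A 9.8 base score with an unproven exploit and an official fix available
print(temporal_score(9.8, e="U", rl="O", rc="C"))  # 8.5
```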
KEV comparison
For example, there are several CVE IDs in CISA’s known exploited vulnerabilities [KEV] list with low EPSS scores, and there are plenty of CVE IDs with high EPSS scores not in that list. People seem to think that this discrepancy means one or the other is wrong. Actually, it probably does not inform rightness or wrongness about either. The discrepancy might be telling us that attackers use different methods to attack the organizations in CISA’s constituency than they use to attack AlienVault’s and Fortinet’s constituency.
While this isn’t Jonathan’s critique per se, it is important to understand another reason there might be something on CISA’s KEV list that has a low EPSS score, namely that the former explicitly disclaims trying to measure the only thing the latter claims to measure. According to CISA itself, “a vulnerability's exploitability is not considered as criteria for inclusion in the KEV catalog. Rather, the main criteria for KEV catalog inclusion, is whether the vulnerability has been exploited or is under active exploitation.”
The KEV thus provides a binary yes/no answer to the question “has this vulnerability ever been exploited?” It explicitly excludes any information about whether or not the vulnerability is likely to be exploited in the future. As a result, a vulnerability that was exploited once, but has a very simple compensating control (e.g., a firewall rule) that essentially everyone applies going forward, would correctly appear on the KEV list yet have a low EPSS score.
(Edit, 12 May 2023) Finally, the EPSS leverages continuously updated exploitation data from intrusion detection systems in the wild, which represents many more data points than just the KEV list (which, in any case, is included in the EPSS model). As of version 3, this included 12,243 unique vulnerabilities observed being exploited; upon the release of EPSS v3, the CISA KEV contained 866 CVEs.
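As a quick illustration of how the two sources diverge, the sketch below samples entries from CISA’s KEV feed and looks up their current EPSS scores. The feed URL and field names match CISA’s published JSON schema as I understand it; treat both as assumptions to verify.

```python
# Sketch: compare KEV membership (binary, backward-looking) with EPSS
# scores (probabilistic, forward-looking) for a sample of KEV entries.
import requests

KEV_URL = ("https://www.cisa.gov/sites/default/files/feeds/"
           "known_exploited_vulnerabilities.json")
EPSS_URL = "https://api.first.org/data/v1/epss"

kev = requests.get(KEV_URL, timeout=60).json()
sample = [v["cveID"] for v in kev["vulnerabilities"]][:20]

epss = requests.get(EPSS_URL, params={"cve": ",".join(sample)}, timeout=30)
for row in epss.json()["data"]:
    # Low numbers here are not contradictions; they answer a different question.
    print(f'{row["cve"]}: on KEV, current EPSS = {float(row["epss"]):.3f}')
```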
Difficulty of remediation
You also might want to know how expensive it will be to remediate the CVE ID.
Agreed again, but this is outside of the scope of any reasonable critique of the EPSS (or any vulnerability rating system, for that matter). Determining the cost of a control is of course part of a sound risk management calculus, but this is not something any vulnerability rating system can deliver.
Conclusion
My biggest question to critics of the EPSS is “well, how do you prioritize known vulnerabilities?”
Those who are fans of CVSS will point out that it scales well, and I certainly agree. But if you scale something that is deeply flawed (see this post for details), you are just making the problem worse. Others make the more honest argument that they use it because everyone else does, too. But that isn’t a good reason by itself to use something.
For those who aren’t adherents of CVSS, most of the answers I get involve something to the effect of needing to take the individual context of each vulnerability into account when making decisions. For a single vulnerability, this is a fair approach, but there is no way to scale it to the level necessary for real-world operations.
Assuming you spend one man-minute (an implausibly short amount of time) on each of 130,000 known vulnerabilities in your network (a reasonable number for a medium-sized organization), you are dedicating 130,000 minutes, which is more than a single person’s entire working year (2,000 hours/year x 60 minutes/hour = 120,000 minutes), just to evaluating them. Furthermore, non-quantitative approaches like this suffer from a range of biases and are difficult to communicate to others.
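A quick back-of-the-envelope check of that arithmetic:

```python
# Sanity check of the triage-time estimate above.
vulns = 130_000                      # known vulnerabilities in the network
minutes_per_vuln = 1                 # implausibly fast manual review
work_minutes_per_year = 2_000 * 60   # 2,000 working hours per year

print(vulns * minutes_per_vuln / work_minutes_per_year)  # ~1.08 person-years
```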
As another option, the organization that published Jonathan’s article - the Software Engineering Institute at Carnegie Mellon University - has what might be considered a competing framework in the form of the “Stakeholder-Specific Vulnerability Categorization (SSVC),” which Jonathan mentions. I’ll consider analyzing the SSVC in a future post, but I must admit that I was initially deterred by the fact that the model is described in a 68-page white paper. Because it uses a series of decision trees to drive action, it would also appear to be difficult to scale, but I’ll reserve judgement for now.
And finally, while there are some commercial vulnerability prioritization tools out there (which I have not used), they obviously cost money and probably suffer from some of the same issues (opacity, etc.) that Jonathan identifies with the EPSS. Not everyone can afford these, either.
Thus, in the end, most of us are left with the imperfect but usable EPSS.