How StackAware analyzes a vendor's security and AI governance just using its website
Just how they talk can tell you A LOT.
Here is how I start my risk analysis of vendors for StackAware clients, using just the vendor’s website:
1. Check for vulnerability disclosure program (VDP)
Search on Google
site:vendor_domain vulnerability
(if there are a lot of results, you can add “disclosure”)
No program? That’s an issue right away, because VDPs are free and show you want to collaborate with ethical hackers.
Sometimes these are hosted externally (e.g. on Bugcrowd, HackerOne, etc.) so look outside the vendor domain as well.
2. Check for security.txt file
Head to
vendor_domain/.well-known/security.txt
Nothing there? Another adverse finding. security.txt is free and is a standardized (RFC 9116) way to communicate your security policy and contacts.
3. Check terms of service
Many of these are boilerplate and contain red flags such as:
Threatening legal action or criminal prosecution against anyone scanning their website for vulnerabilities (including ethical hackers). I call these terms “irresponsible disclosure programs” because they are the exact opposite of a standard VDP. If you have one, the only people who will inform you of vulnerabilities are criminals seeking to extort you.
Forbidding linking to the site without permission. Although I’m not a lawyer, this:
Just sounds ridiculous
Doesn’t seem enforceable
Discourages search engine optimization (SEO)
4. Check privacy policy
These things are a gold mine of information.
Do they claim the right to do basically anything with your data?
Is information hosted in a jurisdiction with poor protections?
Do they make bizarre claims unlikely to be true?
Dig deep here.
5. Look for consumer health data privacy policy
Legislation like Washington State’s My Health My Data Act (MHMDA) effectively requires a separate policy specifically covering “consumer health data” (CHD). CHD is a far more expansive category of data than protected health information (PHI) as defined by the Health Insurance Portability and Accountability Act (HIPAA).
And the MHMDA applies to more than just HIPAA covered entities and business associates.
Considering the:
Broad definition of CHD;
Right of private action the MHMDA offers;
Fact it applies to anyone who “conducts business” in Washington State (which might include using a cloud provider hosted there);
Every potentially impacted company should have a health data policy in place.
6. Look for security or trust center
Having one of these is a green flag for sure, but I go deeper and look for things like:
Claiming compliance with standards in a way that doesn't make sense (e.g. non-DoD contractor claiming CMMC Level 5 certification)
Extremely boilerplate statements like “we protect your data with TLS encryption in transit.”
Using obvious buzzwords like “military grade” (I was in the military, and when it comes to cybersecurity, I wasn’t impressed).
7. Search for misclassified or exposed sensitive documents
If I had a dollar for every company that has a document labeled CONFIDENTIAL or INTERNAL USE ONLY posted publicly, I would be a moderately wealthier man.
This tells me the company is either:
Actually exposing sensitive information without authentication, or
Not taking its data classification and labeling program seriously.
Either way, it’s a red flag.
I look for these with the Google query
site:vendor_domain confidential
If that returns a lot of results, I narrow it down by adding filetype:pdf, then filetype:ppt, and then filetype:pptx to the query. Experience has shown that a disproportionate percentage of sensitive documents are in PDF or PowerPoint format.
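The query variants above can be generated programmatically. This is a minimal sketch; the function name dork_queries is mine, and the output is just strings to paste into Google, not an automated search.

```python
# Sketch: build the Google dork queries described above.
# Pure string assembly -- paste the results into Google manually.
def dork_queries(domain: str, keyword: str = "confidential") -> list[str]:
    """Return the broad query plus filetype-narrowed variants."""
    base = f"site:{domain} {keyword}"
    narrowed = [f"{base} filetype:{ft}" for ft in ("pdf", "ppt", "pptx")]
    return [base] + narrowed


# "vendor_domain" is a placeholder for the real vendor's domain:
for query in dork_queries("vendor_domain"):
    print(query)
```

The same helper works for step 1 (keyword="vulnerability") and step 9 (keyword="ai"), since all three are site-scoped keyword searches.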
8. Query internet-facing storage
Next I’ll head to Grayhat Warfare and query the company’s name to see if they have anything juicy in publicly exposed:
AWS
Azure
GCP
Digital Ocean
cloud storage containers.
9. Examine AI processing
Finally, I will:
Google
site:vendor_domain ai
This will show examples of product features leveraging AI (but also a lot of SEO-optimized garbage blog posts talking about AI in general).
If there is talk about AI features but nothing about security, retention, or training policies, that leads me to a second-level review.
Especially interesting to look at are videos talking about AI. I have reviewed vendors where executives imply they are:
Indefinitely retaining customer data.
Using sentiment analysis on customers without notice.
Training on customer data, but not mentioning it in terms and conditions.
10. Check robots.txt file
Some companies may want to prevent AI vendors from training on their content. In any case, they should have a robots.txt file that provides machine-readable instructions to bots and agents. Here is what to include if you want to block scraping.
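A minimal sketch of such a file follows. The crawler user-agent names (GPTBot, Google-Extended, CCBot, ClaudeBot) are real as of this writing but can change, so verify each one against the relevant AI vendor’s documentation. Note that robots.txt is advisory only: compliant crawlers honor it, but nothing technically enforces it.

```
# robots.txt -- block common AI training crawlers
# (user-agent names may change; check each vendor's docs)

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
```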
How are you evaluating your vendors?
Security ratings tools can give you a general idea of a company’s cyber defenses.
But some quick recon of a vendor’s website can tell you a whole lot more about how they approach security.
If you have any sort of leverage over them (e.g. are in a sales process) you can also use StackAware’s vendor data protection and AI processing guides to dig deeper.
But if you really want a full and continuously-updated picture, StackAware will do a comprehensive risk assessment of your entire AI supply chain: