The “Uncommon download” warning reported in that post presumably originates from Google’s Download Protection feature in Chrome.
I don’t know Google’s internal systems, but I previously led the team building the AI behind the Application Reputation feature at Microsoft. This was likely the inspiration for Google’s similar feature (see the third-party test results before and after Google’s release).
Unfortunately, large companies can’t generally talk much about security features like these. So, I’ll share some context that may help untangle some of the concerns raised in this post.
The issue reported
The post, written by a game developer, includes screenshots showing that Google is warning the developer that his website has an issue. It looks scary, including an apparently overstated claim that “Google has detected harmful content.”
Digging into the underlying issue, however, what Google found is much less ominous: the programs offered for download on the site are classified as “uncommon downloads,” and users downloading them may receive a warning.
One obvious question this raises is: why would Chrome warn users just because a download is uncommon? I’ll get back to this.
The poster’s concern isn’t that people downloading the program may have to click through warnings, which is annoying but manageable; rather, he’s concerned that the site’s search rankings could be affected.
And he thinks the cost to resolve the issue will be over $1200 to distribute software he’s giving away for free, though he’s not even sure if that will solve the problem! This is the basis for the claim that Google is using its monopoly to stifle the distribution of free software.
Google doesn’t provide clear instructions on how to remove the alerts or details about whether they affect his site’s search rankings, and so this must seem very frustrating. Hence the post. Let’s untangle it.
Why would Google even warn users about uncommon downloads?
These warnings originate in an attempt to prevent people from getting tricked into downloading malicious programs.
As an example, a website claiming to have a funny video on it could show an error message and tell users to download a program to see the video. Before features like Download Protection, and Application Reputation in Windows, this sort of approach used to be the most common way people’s computers were infected.
Fortunately, these features work and that problem is close to solved. More on that in a bit.
Doesn’t anti-virus already solve this problem?
Anti-virus programs don’t work well. The effectiveness of leading anti-virus products is closer to that of a coin flip than that of a security guard deterring most intruders.
A senior manager at anti-virus firm Symantec once claimed anti-virus products only catch 45% of malware. Another news story claims researchers estimate less than 70% of threats are detected. Reports like these from reputable sources are rare because they’re awful for business, but they’re in the right ballpark.
This is a very different story than you’ll find with third-party tests in the anti-virus industry. AV-Test, the leading testing service, finds that the average anti-virus product stops 99.6% to 100% of all infections. That makes for great marketing but these numbers are incredibly misleading.
Why does every product look so good in these tests? The tests use horrible methodologies. The anti-virus companies share samples of malicious files they find with one another, and with the testing firms. These are then used to test the products—after they’ve all had a chance to detect them, and well after the bad guys have moved on.
On top of this, the testers always use the same machines to run their tests, making it easy for companies to reverse-engineer them. Some respected companies have been caught overtly cheating, though there’s enough sketchiness in the whole affair that cheating really isn’t necessary.
Why do warnings on uncommon downloads work?
Bad guys can work with nearly any attack window they’re given. If it takes you a month to detect their malicious programs, they’ll use them for a month. If you give them an hour, they’ll make a bunch of programs and rotate in a new one each hour.
Finding bad stuff and blocking it is hard. Detecting everything instantly isn’t feasible.
What about identifying good programs? If we could reliably identify all the good programs out there, we could just block everything else. Unfortunately, finding all of the good programs is also infeasible.
However, it’s possible to identify the vast majority of good programs out there with high accuracy. This means the vast majority of times someone downloads an application, it’s known to be good and can run without any warnings. People learn to click through warnings if they happen all the time, anyway.
What about the small portion of downloads that remain? Well, virtually all of the malicious programs are in there, along with some good ones. The solution here is to use a strong warning that isn’t easy to click through.
Motivated people can still run the program, but it catches their attention and encourages them to think for a second. If this happens rarely enough, it works. This approach led to a significant improvement in the virus landscape on Windows.
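To make the approach concrete, here’s a toy sketch of the decision described above. This is not Google’s or Microsoft’s actual logic; the fields, thresholds, and outcomes are all invented for illustration, and real systems use machine-learned models over far more signals.

```python
# Toy sketch of reputation-gated download warnings.
# All fields and thresholds are invented for illustration.

from dataclasses import dataclass

@dataclass
class DownloadInfo:
    known_good: bool     # matched a list/model of known-good software
    known_bad: bool      # matched known malware
    download_count: int  # how often this exact file has been seen

def decide(d: DownloadInfo) -> str:
    if d.known_bad:
        return "block"            # confident detection: block outright
    if d.known_good:
        return "allow"            # the vast majority of downloads land here
    # Everything else is "uncommon": rare files get a strong,
    # hard-to-dismiss warning rather than a silent allow.
    if d.download_count < 100:    # invented threshold
        return "warn_uncommon"
    return "allow"

print(decide(DownloadInfo(False, False, 3)))      # a new indie installer -> warn_uncommon
print(decide(DownloadInfo(True, False, 10_000)))  # popular, known-good app -> allow
```

The key property is that the warning path is rare: most downloads hit the known-good branch silently, so the occasional strong warning retains its impact.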
How does a new program ever get past the warnings?
People can click through the warnings and run the applications. If many people are doing this, it’s a strong signal that the program is legitimate. Of course, bad guys can also run their own programs, so artificial intelligence algorithms are needed.
Fortunately, these systems typically figure it out quickly. Most users actually don’t see a single one of these warnings in an entire year. The algorithms are good, even if imperfect.
Even a single download, if accompanied by other strong indications that the file is legitimate, can be enough to remove the warnings. Without that corroboration, it can take more time and require more users to download the program.
If the developer only creates one program, this will almost always go away on its own over time. The biggest problem is when someone produces many infrequently-downloaded programs—or many versions of the same one. Each new program will have to go through this process of gaining reputation.
There is a straightforward way for developers of multiple small programs to get past these warnings, though it does cost some money. It doesn’t require the $1,219 claimed in the post that started this discussion; it’s more like $69 per year.
Buying a certificate enables a developer to prove it’s the same person writing each new application. The system then learns the developer is legitimate, rather than that each application is legitimate. It still takes some time, but it can mean each new program avoids the warnings.
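Here’s a simplified sketch of why signing helps. The names, scores, and threshold below are invented, and real reputation systems are far more involved, but the idea is that reputation can be keyed on the signing identity rather than on each individual file:

```python
# Toy sketch: reputation keyed by signing identity vs. by file hash.
# All names and numbers are invented for illustration.

from typing import Optional

publisher_reputation = {"Example Indie Dev": 0.9}  # earned over time
file_reputation: dict[str, float] = {}             # per-file, starts empty

def needs_warning(file_hash: str, signer: Optional[str]) -> bool:
    # A signed file inherits its publisher's reputation, so a brand-new
    # release from a trusted signer can skip the "uncommon" warning.
    score = file_reputation.get(file_hash, 0.0)
    if signer is not None:
        score = max(score, publisher_reputation.get(signer, 0.0))
    return score < 0.5  # invented threshold

print(needs_warning("abc123", None))                 # unsigned new file -> True
print(needs_warning("abc123", "Example Indie Dev"))  # signed by known dev -> False
```

An unsigned new release starts from zero and triggers the warning, while the same file signed by an established developer inherits that developer’s standing.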
But isn’t this beside the point? Monopolies! Freedom!
What I’ve discussed so far is mostly just explanations of why people get messages like this and some insight into how developers can avoid the warnings. This hasn’t addressed the charge of a monopoly stifling the distribution of free software.
What Google is doing is not abusing monopoly power and suppressing freedom.
As an Internet user from well before Google existed, I recall when the only way to find content online was to browse category listings of sites and services. Being listed in Google’s search engine is an enormous free benefit Google provides.
Google, more than any other company in the world, enables people to find information (and programs!) on nearly any topic. Niche sites like the one in question would have no hope of reaching any audience online without services like the free browser and search engine Google provides. Even with the warnings.
Google could make the warning messages clearer, provide more context on how code-signing helps, and perhaps even improve their algorithms to make it less common for legitimate developers to experience these warnings.
But that said, let’s not lose sight of how Google helps people find niche content that wouldn’t be possible to find otherwise. Far from stifling the distribution of free content, Google is enabling it on an unprecedented scale.
Note: The author of the blog post referenced at the beginning of the article uses the handle byuu. I’m using “him” and “his” to make my writing more readable, but this is standing in for an unknown gender.