Read the whole event log.
If you were behind Cloudflare and it was proxying sensitive data (the contents of HTTP POSTs, &c), they've potentially been spraying it into caches all across the Internet; it was so bad that Tavis found it by accident just looking through Google search results.
The crazy thing here is that the Project Zero people were joking last night about a disclosure that was going to keep everyone at work late today. And, this morning, Google announced the SHA-1 collision, which everyone (including the insiders who leaked that the SHA-1 collision was coming) thought was the big announcement.
Nope. A SHA-1 collision, it turns out, is the minor security news of the day.
This is approximately as bad as it ever gets. A significant number of companies probably need to compose customer notifications; it's, at this point, very difficult to rule out unauthorized disclosure of anything that traversed Cloudflare.
Considering the amount and sensitivity of the data they handle, I'm not sure a t-shirt is an appropriate top-tier reward.
There's an argument for changing secrets (user passwords, API keys, etc.) for potentially affected sites, plus of course investigating logs for any anomalous activity. It would be nice if there were a guide for affected users, maybe a supplemental blog post.
(and yet again: thank you Google for Project Zero!)
Step 2) leak cleartext from said MITM'd connections to the entire Internet
I recently noted that Cloudflare are probably the only entity to have managed to cause more damage to popular cryptography than the 2008 Debian OpenSSL bug (thanks to their "flexible" ""SSL"" """feature"""), but now I'm certain of it.
"Trust us" doesn't fly any more, this simply isn't good enough. Sorry, you lost my vote. Not even once
edit: why the revulsion? This bug would have been caught with valgrind, and by the sounds of it, using nothing more complex than feeding their httpd a random sampling of live inputs for an hour or two
It's also a bit sad that Tavis has to contact Cloudflare by Twitter. Seriously?
Edit: https://twitter.com/taviso/status/832744397800214528 is the tweet in question
Great, that makes me feel so much better! I'm sorry, don't try to put a cherry on the top when you've just leaked PII and encrypted communications.
Additionally, most vendors in the industry aren't deployed in front of quite as much traffic as CloudFlare is. It's a miracle that Project Zero managed to find the issue.
I guess this confirms a few things.
- The complete query strings are logged,
- They don't appear to be too concerned with who accesses the logs internally or have a process that limits the access, and
- They're willing to send those logs out to a random person.
The examples we're finding are so bad, I cancelled some
weekend plans to go into the office on Sunday to help
build some tools to cleanup. I've informed cloudflare
what I'm working on. I'm finding private messages from
major dating sites, full messages from a well-known
chat service, online password manager data, frames from
adult video sites, hotel bookings. We're talking full
https requests, client IP addresses, full responses,
cookies, passwords, keys, data, everything.
Cloudflare pointed out their bug bounty program, but I
noticed it has a top-tier reward of a t-shirt.
Cloudflare did finally send me a draft. It contains an
excellent postmortem, but severely downplays the risk
to customers.
Where would you even start to address this? Everything you've been serving is potentially compromised, API keys, sessions, personal information, user passwords, the works.
You've got no idea what has been leaked. Should you reset all your user passwords, cycle all of your keys, notify all your customers that their data may have been stolen?
My second thought after relief was the realization that even as a consumer I'm affected by this: my password manager has > 100 entries; what percentage of them are using CloudFlare? Should I change all my passwords?
What an epic mess. This is the problem with centralization, the system is broken.
I'm assuming I need to change my passwords on a significant number of sites. So far none of them have alerted me to a potential breach. Would love to have a head start.
This is precisely why. The only thing that surprises me about this, is that it was an accidental disclosure rather than a breach. Other than that, this was completely to be expected.
EDIT: Also, this can't be repeated enough: EVERYBODY IS AFFECTED. Change your passwords, everywhere, right now. Don't wait for vendors to notify you.
Anything could have irrevocably leaked, and you have no way of knowing for sure, so assume the worst.
But was the leaked data similarly limited to only the sites with the features enabled? Or could it have come from any request - even an entirely unrelated site?
> The examples we're finding are so bad, I cancelled some weekend plans to go into the office on Sunday to help build some tools to cleanup. I've informed cloudflare what I'm working on. I'm finding private messages from major dating sites, full messages from a well-known chat service, online password manager data, frames from adult video sites, hotel bookings. We're talking full https requests, client IP addresses, full responses, cookies, passwords, keys, data, everything.
This is huge.
I mean, seriously, this is REALLY HUGE.
The modern web requires a paranoid attitude.
Can someone explain in simpler terms what happened here and how it a) affects sites using Cloudflare and b) Users accessing sites with Cloudflare?
Add the following to your hosts file to bypass Cloudflare and access HN directly:
50.22.90.248 news.ycombinator.com
However, I really want to say I am absolutely impressed with both Project Zero AND Cloudflare on so many fronts, from clarity of communication, to collaboration, and rapid response. So many other organizations would have absolutely tanked when presented with this problem. Huge kudos for CF guys understanding the severity and aligning resources to make the fixes.
In terms of P0 and Tavis though, holy crap. Where the heck would we be without these guys? Truly inspiring !
If you wanted to pay to DDoS a site, search for "booter" and you'll get a list of sites that will take another site off the internet for money with a flood of traffic.
quezstresser.com webstresser.co topbooter.co instabooter.com booter.xyz critical-boot.com top10booters.com betabooter.com databooter.com
etc. etc. - from the first 30 results I could find 2 booter sites that weren't hosted by Cloudflare.
But hey, pay Cloudflare and your site too can be safe from DDoS attacks...
1) From the metrics I recall from when I interviewed there, and assuming the given probability is correct, that means a potential of 100k-200k pages with private data leaked every day.
2) What's the probability that a page is served to a cache engine? Not a clue. Let's assume 1/1000.
3) That puts it at around a hundred leaked pages saved into caches per day.
4) Do caches only keep the latest version of a page? I think most do, but not all. Let's ignore that aspect.
5) What's the probability that a page contains private user information like auth tokens? Maybe 1/10?
6) So, that's 10 pages saved per day into the internet search caches.
7) That's on par with their announcement: "With the help of Google, Yahoo, Bing and others, we found 770 unique URIs that had been cached and which contained leaked memory. Those 770 unique URIs covered 161 unique domains." Though we don't know for how long this was running.
8) Now, I don't want to downplay the issue, but leaking a dozen tokens per day is not that much of a disaster. Sure it's bad, but it's not remotely close to the leak of the millennium and it's certainly not an internet-scale leak.
9) For the record, CloudFlare serves over one BILLION human beings. Given the tone and the drama I expected way more data from this leak. This is a huge disappointment.
Happy Ending: You were probably not affected.
/* generated code */
if ( ++p == pe )
goto _test_eof;
With the help of Google, Yahoo, Bing and others, we found 770 unique URIs that had been cached and which contained leaked memory. Those 770 unique URIs covered 161 unique domains. The examples in the report show Uber, OkCupid, etc. It would be good to know the full list, to know what passwords might have been compromised.
https://blog.cloudflare.com/incident-report-on-memory-leak-c...
"@taviso their post-mortem indicates this would've been exploitable only 4 days prior to your initial contact. Is that info invalid?" - https://twitter.com/pmoust/status/834916647873961984
"@pmoust Yes, they worded it confusingly. It was exploitable for months, we have the cached data." - https://twitter.com/taviso/status/834918182640996353
Every time I see a dev trying to parse HTML with a custom solution or regex or anything other than a proven OSS library designed to parse HTML I recoil reflexively. Sure, maybe you don't need a parser to see if that strong tag is properly closed but the alternative is ...
Plaintext?
Will Cloudflare be explicitly notifying customers about whether data from their site could have been leaked by this bug?
OKCupid
Uber
people claiming 1Password, can't find
Lyft
Yelp
Pingdom
Digital Ocean
Montecito Bank and Trust
Sorry, I hate to just be a couch commentator. Obviously hindsight is 20/20. Still, I think there's a lesson here.
How comforting!
Unless I'm mistaken, CloudFlare's services necessarily require that they act as a MITM. Would it be possible or practical to change the DDoS protection service so that it uses an agent on the customer's end (the CF customer) that relays relevant data to CF, instead of having CF MITM all data?
As it is now, we have:
End user <-> CF MITM to inspect packet data <-> CF Customer site
where CF uses the data discovered through MITM (and other metadata such as IP) to determine if the end user is a bad actor. What if we, instead, had something like:
  End user <-> CF TCP proxy <-> CF Customer site
                  ^   |
                  |   v
    CF decision agent <-- CF metadata ingest
The CF captive portal would not work with this but they could still shut down regular ol boring TCP DDoSes.
CloudFlare has multiple SSL configurations:
> Flexible SSL: There is an encrypted connection between your website visitors and Cloudflare, but not from Cloudflare to your server.
> Full SSL: Encrypts the connection between your website visitors and Cloudflare, and from Cloudflare to your server
(I'll add Full SSL mode still involves CloudFlare terminating SSL (decrypting) before re-encrypting to communicate to your server)
If I am running in Full SSL mode, is (or was) my data vulnerable to being leaked?
The full list is available for download here (23mb) https://github.com/pirate/sites-using-cloudflare/raw/master/...
I will be updating it as I find more domains.
Hopefully people will learn something from today.
Because you're correct: if CF's info sec team is "very very good at their jobs", how did this incident happen?
Guessing a lot of credit card details are ripe for picking in the data they leaked.
Run WHOIS on them, it's almost 100% behind Cloudflare: https://www.google.com/#q=ddos+booter
I would be less concerned about the fact that Cloudflare is spraying private data all over the internet if people weren't being coerced into it by a racket.
We won't have a decentralized web anymore if this keeps going. The entire internet will sit behind a few big CDNs and spray private data through bugs and FISA court wire taps. God help us all if this happens.
Some possible queries: "CF-Int-Brand-ID", nginx-cache "Certisign Certificadora Digital",
Once you find one, you can look through the results for unusual strings/headers which you can use to find more results.
Many results have clearly been removed from Google's cache, but.. many also have not.
Found my bank's site on it. :(
So, I clicked on that - and I get a 500 error from NGINX.
My guess is that a lot of services are going to be overwhelmed by the sheer volume of password reset requests, thus preventing users from resetting their passwords.
1. Recognition on our Hall of Fame.
2. A limited edition CloudFlare bug hunter t-shirt. CloudFlare employees don't even have this shirt. It's only for you all. Wear it with pride: you're part of an exclusive group.
3. 12 months of CloudFlare's Pro or 1 month of Business service on us.
4. Monetary compensation is not currently offered under this program.
Guessing they're gonna reconsider #4 at this point.
We're definitely doomed to repeat the same mistakes over and over.
What is the optimal solution???
Cloudflare is MitM by design. Chrome and others must not tolerate it. This vulnerability is just another reason to do it asap.
[1] https://www.rust-lang.org/en-US/ [2] Self-declared Rust fanboy
I will refrain from any criticism of Cloudflare and what I think about this because they're going through hell as it is. But everyone else is fair game. The higher a level of service you centralize, the more you stand to lose.
X-Uber-token:
X-Uber-latitude:
...
That's a crapton of keys.
An experienced Ragel programmer would know that when you start setting the EOF pointer you are enabling code paths that never executed before. Like, potentially buggy ones. Eek!
Also, mono-cultures have always been a very bad idea, not just in agriculture.
This is what CloudBleed looks like, in the wild: https://gfycat.com/ElatedJoyousDanishswedishfarmdog
A random HTTP request's data and other data injected into an HTTP response from Cloudflare.
Sick.
> could you tell us why a lot of people had to re-authenticate their Google accounts on their devices all of the sudden? It may not have been related, but Google definitely did something that had us all re-authenticate.
I too had to reauthenticate and was very worried, since it was the first time I'd had to do this. I thought something bad had happened with my account; it was very suspicious.
[0]: https://scotthelme.co.uk/tls-conundrum-and-leaving-cloudflar...
Hosters like Hetzner and OVH have offered DDoS protection for about a year now (I'm guessing it's heuristic rate limiting, but they won't share details because that would make it trivial to work around, or so they say). Could someone characterize their offering and tell me if it's any good?
To those spinning a story against C programming here: it is entirely possible (trivial, even) to isolate address spaces between requests, and has been for like 25 years (CGI programming) and more. When you absolutely must use a long running, single-address space service container, OpenBSD's httpd shows how to do it right (goes to great lengths to randomize/re-initialize memory etc.). I agree, though, that using straight C isn't a good choice for the latter.
When I was evaluating CF for a small personal app, I really thought hard about using a public reverse proxy and decided that it wasn't worth it for the scale I was dealing with. No one can predict these security issues, but I sure am glad I didn't go with them!
"We were working to disclose the bug as quickly as possible, but wanted to clean up search engine caches before it became public because we felt we had a duty of care to ensure that this private information was removed from public view. We were comfortable that we had time as Google Project Zero initially gave us a 90 day disclosure window (as can still be seen in their incident tracker), however after a couple of days, they informed us that they felt that 7 days was more appropriate. Google Project Zero ended up disclosing this information after only 6 days."
Had this proxy been written in nearly any other language it wouldn't have had this vulnerability, like so many similar vulnerabilities.
Using ML or Rust or Java or whatever doesn't magically make all vulnerabilities disappear but it sure makes those that are intrinsic to C disappear. And that's not just a few.
There is just no excuse.
> About a year ago we decided that the Ragel-based parser had become too complex to maintain and we started to write a new parser, named cf-html, to replace it. This streaming parser works correctly with HTML5 and is much, much faster and easier to maintain.
I'd assume that at this point, customers would like to have a little more than a vague promise.
And now off to resetting a lot of passwords and checking where OTPs are possible.
In the wider world the word "leak" doesn't mean memory access patterns, it means deliberate sabotage.
The headline in "The Verge" is "Password and dating site messages leaked by internet giant Cloudflare". That's technically correct too, but also gives completely the wrong message.
Simpler, proactive messaging from Cloudflare might have helped here.
I looked on the lastpass blog (s/www/blog/), nothing about this. Is it just too early?
I'm trying to figure out how bad this is and, apart from the exchanges I'm using, which other sensitive sites are affected.
Not sure what to make of it - the low number of domains affected.
====================================
In our review of these third party caches, we discovered data that had been exposed from approximately 150 of Cloudflare's customers across our Free, Pro, Business, and Enterprise plans. We have reached out to these customers directly to provide them with a copy of the data that was exposed, help them understand its impact, and help them mitigate that impact.
Fortunately, your domain is not one of the domains where we have discovered exposed data in any third party caches. The bug has been patched so it is no longer leaking data. However, we continue to work with these caches to review their records and help them purge any exposed data we find. If we discover any data leaked about your domains during this search, we will reach out to you directly and provide you full details of what we have found.
I'm not lazy, it's just overwhelming trying to figure out what's actually going on with all these comments...
At least change the cookie name so the token stops working. For example, in ASP.NET - change the "forms-auth" name in the web.config file. etc etc.
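To make that ASP.NET suggestion concrete: renaming the forms-authentication cookie in web.config means the server simply ignores every auth cookie issued under the old name. This is a hypothetical fragment; the cookie name and login path are made up, though the `<forms>` element and its `name` attribute are standard ASP.NET forms-authentication configuration.

```
<system.web>
  <authentication mode="Forms">
    <!-- Renaming the cookie (old name was e.g. ".MYAPP_AUTH") invalidates
         all auth cookies issued under the old name -->
    <forms name=".MYAPP_AUTH_2017_02" loginUrl="~/Account/Login" timeout="30" />
  </authentication>
</system.web>
```

Users with stolen tokens get logged out along with everyone else, which is the point.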
and some chap did it anyways. yay, i guess.
> and even plaintext API requests from a popular password manager that were sent over https (!!).
# List your LastPass entries and pull out bare foo.tld domains (assumes the lastpass-cli "lpass" tool)
lpass ls | egrep -o '[a-z]+\.[a-z]+' | sort -u > mydomains.sorted
# Re-sort the published Cloudflare domain list so comm(1) will accept it
sort -u sorted_unique_cf.txt > cf_really_sorted
# Print only the domains present in both lists
comm -12 mydomains.sorted cf_really_sorted
It's not perfect (it only looks at the LastPass item description, not the actual URL, and only matches foo.tld-style domains), but it still found a number of domains for me.
This will put the final lid on cloudflare anyhow. Sticking with AWS.
The first principle was security: The principle that every syntactically incorrect program should be rejected by the compiler and that every syntactically correct program should give a result or an error message that was predictable and comprehensible in terms of the source language program itself. Thus no core dumps should ever be necessary. It was logically impossible for any source language program to cause the computer to run wild, either at compile time or at run time. A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to - they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law.
-- C.A.R. Hoare, 1980 Turing Award lecture, "The Emperor's Old Clothes"