CloudFlair: Bypassing Cloudflare using Internet-wide scan data

Cloudflare is a service that acts as a middleman between a website and its end users, protecting it from various attacks. Unfortunately, those websites are often poorly configured, allowing an attacker to entirely bypass Cloudflare and run DDoS attacks or exploit web-based vulnerabilities that would otherwise be blocked. This post demonstrates the weakness and introduces CloudFlair, an automated detection tool.

 

A cloud hiding the sun from plain sight. Photo from Pixabay.

Cloudflare allows websites to protect against all sorts of attacks. It can also act as a Web Application Firewall (WAF) to block the exploitation of web-based vulnerabilities such as XSS and SQL injection. It gained even more traction recently by announcing unmetered mitigation of DDoS attacks: Cloudflare is essentially stating that it will protect its customers against DDoS attacks of any scale without charging extra, no matter what pricing plan they are on (including the free one).

In the past few weeks, I found that multiple websites using Cloudflare were misconfigured, allowing an attacker to easily bypass any Cloudflare protection in place. Several of the companies behind these websites had more than 1 million users and were among the top companies of their market segment.

Some of the vulnerable companies had a public bug bounty program

 

To be clear, this article is not talking about a vulnerability in the Cloudflare service itself, but rather about a configuration mistake commonly made by website owners protecting their website with Cloudflare.

Note: A few minutes before publishing this article, someone brought to my attention that a similar piece had been written a few months ago. I am publishing it nonetheless because I believe it can help raise awareness on the topic (and I spent too much time writing it anyway).

Background: protecting a website with Cloudflare

Here’s what a typical request flow looks like for a website which is not protected by Cloudflare.

  1. The user contacts the DNS server of the website’s hosting provider, and asks for the IP of example.com
  2. The DNS server responds with the IP of the web server hosting example.com (e.g., 93.184.216.34)
  3. The user makes an HTTP request to that web server
  4. The web server responds with the web page
Typical request flow for a website not protected by Cloudflare

Since anyone can directly access the web server hosting example.com, an attacker can potentially harm the website by running a DDoS attack against it. If the infrastructure behind example.com is not large enough to absorb or block the traffic, the site can be completely knocked out.

When a company (or individual) decides to use Cloudflare to protect its website, it

  1. goes to its domain registrar, and sets the DNS servers to Cloudflare’s (e.g., kim.ns.cloudflare.com);
  2. sets up its Cloudflare account to work with the domain name (e.g., mycompany.com).

Now, when a user accesses mycompany.com, the following happens.

  1. The user contacts the DNS server kim.ns.cloudflare.com, and asks for the IP of mycompany.com
  2. The DNS server responds with the IP of an intermediary Cloudflare server (e.g., 104.16.109.208)
  3. The user makes an HTTP request to this server
  4. Cloudflare checks the legitimacy of the request (presence of malicious-looking content, source IP address, in addition to other factors), and decides whether to let the request pass through or block it
  5. If Cloudflare chooses to allow the request to pass through, it forwards it to the real web server responsible for mycompany.com (e.g., 188.226.197.73). This server is commonly called the origin server.
Request flow for a website using Cloudflare

In theory, an attacker cannot access the origin server directly and, in particular, does not know its IP address. However, this protection relies on the origin server being accessible only through Cloudflare. If an attacker can reach the origin server directly, they can simply contact it without passing through Cloudflare and bypass any protection.

Exposed origin servers

An origin server behind Cloudflare should only accept traffic coming from Cloudflare’s IP ranges. However, many origin servers gladly accept incoming traffic from any source. I believe this is partly due to the lack of emphasis on this issue in Cloudflare’s documentation. The closest thing I found in the docs lies in a page entitled Recommended First Steps for all Cloudflare users:

Step 1: Whitelist Cloudflare’s IP addresses

Once you’ve changed your name servers to Cloudflare, web traffic will be routed through Cloudflare’s network. Hooray! This means that your webserver will see a lot of traffic proxied through Cloudflare, and in order to allow all this traffic to access it, you will need to make sure that Cloudflare IPs are whitelisted and not rate-limited in any way on your server (you can ask about this at your host). We have a page with all the Cloudflare IPs.

(emphasis mine)

As you can read, this only tells system administrators to whitelist Cloudflare’s IP addresses, without explicitly instructing them to block incoming traffic from other sources. The post from Cloudflare’s blog DDoS Prevention: Protecting The Origin makes it even worse in my opinion, by implying that keeping the IP address of the origin server “secret” is enough.

Cloudflare doesn’t stop clever attackers who know your IP address from sending traffic to it directly. Just because your origin server’s IP address is no longer advertised over DNS, it’s still connected to the internet. If your IP address is not kept secret, attackers can bypass the Cloudflare network and attack your servers directly.

(emphasis mine)

To sum up – a publicly accessible origin server is safe… as long as nobody finds its IP address.

Internet-wide scan data with Censys

Unfortunately, in 2018, relying on someone not finding your IP address is a bit optimistic, to say the least. Projects like Shodan or Censys continuously scan the Internet and make their data accessible to anyone for free.

As a random example, it takes about one second to retrieve a list of all the HTTP servers on the Internet that return a page with a title containing Nicolas Cage or Rick Astley.

An example of a Censys search

Censys is very well-suited to find exposed origin servers efficiently. The following sections describe a method to detect exposed origin servers of a specific domain, using Censys.

Finding exposed origin servers of a domain with Censys

An efficient way to find publicly accessible origin servers is to use their SSL certificates. Using Censys’s certificate search feature, we can search for valid SSL certificates for a specific domain name. Censys collects those certificates from multiple sources (direct probes on port 443, and logs of the Certificate Transparency project).

For instance, the query parsed.names: reddit.com and tags.raw: trusted can be used to find valid certificates issued to reddit.com or one of its subdomains. Here’s a direct link to the associated Censys search.

Finding SSL certificates issued to reddit.com and its subdomains

As you can see, Censys found 7 individual certificates. If you click on one, you will have the option to find all IPv4 hosts that were found to present this certificate when probed on port 443.

Finding IPv4 hosts using a specific SSL certificate

In this view, we can see all IPv4 hosts using the SSL certificate whose SHA256 fingerprint is 36f7[…]815a0a.
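As a small illustration of what such a fingerprint is, the sketch below computes the SHA-256 digest of a DER-encoded certificate using only Python's standard library. The function names are made up for this example, and `fetch_fingerprint` opens a network connection without validating the certificate, so treat it as a rough aid rather than part of any tool described here.

```python
import hashlib
import ssl


def sha256_fingerprint(der_cert: bytes) -> str:
    """Return the hex SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_fingerprint(host: str, port: int = 443) -> str:
    """Fetch the certificate a host presents and return its fingerprint.

    This performs a network connection. The certificate is only hashed,
    not validated.
    """
    pem = ssl.get_server_certificate((host, port))
    return sha256_fingerprint(ssl.PEM_cert_to_DER_cert(pem))
```

Comparing the value returned for a candidate IP with the fingerprint of a certificate issued to the target domain gives a quick manual cross-check of a search result.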

This simple method to search for IPv4 hosts using SSL certificates issued to a specific domain can be used to find exposed origin servers.

  • Search for SSL certificates issued to mytarget.tld
  • Find all IPv4 hosts using one of these certificates
  • Check if these hosts seem to be origin servers of mytarget.tld
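The first two steps above can be sketched as a small pipeline. This is only an outline under stated assumptions: the two lookup functions are hypothetical callables standing in for whatever Censys client you use (no real Censys API call is shown), and only the query string format comes from the example earlier in this post.

```python
from typing import Callable, Iterable, Set


def build_certificate_query(domain: str) -> str:
    """Build a certificate query in the form shown earlier in the post."""
    return f"parsed.names: {domain} and tags.raw: trusted"


def find_candidate_hosts(
    domain: str,
    search_certificates: Callable[[str], Iterable[str]],
    hosts_for_certificate: Callable[[str], Iterable[str]],
) -> Set[str]:
    """Collect every IPv4 host presenting a certificate issued to `domain`.

    `search_certificates` (query -> certificate fingerprints) and
    `hosts_for_certificate` (fingerprint -> IPv4 addresses) are hypothetical
    stand-ins for the actual Censys lookups.
    """
    candidates: Set[str] = set()
    for fingerprint in search_certificates(build_certificate_query(domain)):
        candidates.update(hosts_for_certificate(fingerprint))
    return candidates
```

The returned set is only a list of candidates; the third step, deciding whether a host really is an origin server, still has to be done separately.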

Checking if a host is a (likely) origin

Once we have a list of potential origin servers, the next question is: how do we assess if a host is an origin server of a specific Cloudflare-protected domain? There is no silver bullet here since only a company’s sysadmins will be able to tell for sure, but some basic heuristics can help.

First, the IP of the candidate host should not fall into Cloudflare’s IP ranges; otherwise, we have just found the Cloudflare server which acts as a middleman between the end users and the origin server. Then, the HTML response of the candidate host should be similar to the response we get when accessing the website using its standard domain name (e.g., mytarget.tld). I say similar because exact equality is too strict here: parts of the same web page commonly change on every request (CSRF tokens, session identifiers, and so on).
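The first heuristic can be sketched with Python's `ipaddress` module. The range list below is an assumption: it is a partial, illustrative snapshot of Cloudflare's IPv4 ranges and may be out of date, so a real check should load the current list from Cloudflare's published IP ranges page.

```python
import ipaddress
from typing import Iterable

# Partial, illustrative snapshot of Cloudflare IPv4 ranges (an assumption:
# it may be incomplete or outdated; refresh it from Cloudflare's own list).
CLOUDFLARE_IPV4_SAMPLE = (
    "103.21.244.0/22",
    "104.16.0.0/13",
    "108.162.192.0/18",
    "141.101.64.0/18",
    "162.158.0.0/15",
    "172.64.0.0/13",
    "173.245.48.0/20",
    "198.41.128.0/17",
)


def is_cloudflare_ip(ip: str, ranges: Iterable[str] = CLOUDFLARE_IPV4_SAMPLE) -> bool:
    """Return True if `ip` falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in ranges)
```

With the example addresses used earlier in this post, `is_cloudflare_ip("104.16.109.208")` is true (a Cloudflare intermediary, to be discarded), while `is_cloudflare_ip("188.226.197.73")` is false (a candidate worth examining further).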

Once we find a set of hosts matching these criteria, we can be quite confident we have found a set of exposed origin servers. Of course, these hosts could be very similar to mytarget.tld but actually be development or staging instances. We cannot know with 100% confidence, but what can help is to browse to the candidate host’s IP directly, and see if it behaves like the production website accessible via mytarget.tld: can you register an account on it? Can you log in to an account you created from mytarget.tld, and vice versa? If the answer is yes, you can be pretty confident you found an origin server which shouldn’t be publicly exposed.

Keep in mind that accessing an origin server by entering its IP directly in your browser’s address bar will sometimes not work, as the web server running on it might be expecting a specific HTTP Host header. When this is the case, you can add a static mapping to your hosts file or query the origin server with a tool like curl or Postman, which allows you to set specific Host headers.
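The same idea can be expressed with Python's standard library: send the request to the IP address, but set the Host header to the domain name. This is a minimal sketch for plain HTTP only (HTTPS would also involve SNI and certificate validation, which are not handled here), and the helper name is made up for this example.

```python
import urllib.request


def build_direct_request(ip: str, hostname: str, path: str = "/") -> urllib.request.Request:
    """Build an HTTP request aimed at `ip` that carries `hostname` as Host."""
    request = urllib.request.Request(f"http://{ip}{path}")
    # urllib only fills in a Host header when none has been set explicitly,
    # so this value is what the web server will see.
    request.add_header("Host", hostname)
    return request


# Example usage (performs a real network request, hence commented out):
# request = build_direct_request("188.226.197.73", "mycompany.com")
# with urllib.request.urlopen(request, timeout=10) as response:
#     body = response.read()
```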

Automating the process with CloudFlair

This process can be cumbersome when done manually; to help automate it, I wrote a tool called CloudFlair. It uses the Censys API to search for SSL certificates and associated IPv4 hosts. Once it has retrieved a list of potential origin servers using the method previously described, it will call each one of them and compute the similarity of the response with the response sent by the original domain. It uses a structural similarity function designed specifically for comparing web pages (described here), since standard string similarity functions such as the Levenshtein distance are too slow to work with strings of the size of a typical web page.

Here’s a sample output.

Once you get a list of likely origin servers, you might still have to make a few manual checks to confirm the result.
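To give an intuition for what comparing pages structurally means, here is a deliberately simple stand-in. It is not the similarity function CloudFlair uses: it merely compares the sequence of opening tags of two pages with `difflib`, so that changing text, tokens, or attribute values does not affect the score.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser
from typing import List


class _TagCollector(HTMLParser):
    """Record the name of every opening tag, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html: str) -> List[str]:
    """Return the list of opening tag names found in `html`."""
    collector = _TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def structural_similarity(html_a: str, html_b: str) -> float:
    """Score two pages between 0.0 and 1.0 based on their tag sequences."""
    return SequenceMatcher(None, tag_sequence(html_a), tag_sequence(html_b)).ratio()
```

Two responses that differ only in a CSRF token or a greeting get a score of 1.0, while an unrelated page (for instance a default web server page) scores lower. Any threshold for deciding that two pages are "similar enough" is a judgment call.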

Remediation

Update: After Cloudflare’s CTO pointed it out on Twitter, the best mitigation in the future will probably be to use Cloudflare Warp. This feature is currently still in beta, but it’s probably going to be a real game-changer regarding this issue. I did not look into it in detail, but essentially it allows you to open a tunnel to Cloudflare from within a private network (not publicly routable), making sure that only Cloudflare can access your origin server (since it’s not exposed on the Internet).

If your website is vulnerable to the weakness discussed in this post, there is most likely no easy way to fix it. Once the IP of an origin server has leaked, it’s game over. The data used by Censys is versioned, and anyone can download a snapshot of this data at any point in time. What’s more, restricting the incoming traffic on a server is not enough to protect it against DDoS attacks. Dropping traffic at the software level with iptables does not prevent an attacker from sending a large number of packets to the server to consume all the available bandwidth and make it inaccessible to legitimate users.

Below are two steps I would recommend taking.

  • The first step should prevent the IP address of your origin server from appearing in future Censys scans, and ensure that application-level security features of Cloudflare cannot be bypassed (such as WAF or HTTP endpoint rate limiting).
  • The second step should (and I believe is the only way to) prevent an attacker from running a DDoS attack against you.

Step 1: Firewall incoming traffic or enable Authenticated Origin Pulls

Configure your origin server to accept incoming traffic only from Cloudflare’s IP ranges. Refer to: https://danielmiessler.com/blog/whitelisting-cloudflare-iptables/

Note: Before applying those steps, make sure you understand that they will prevent you from accessing your origin servers via SSH, RDP, FTP, or any non-HTTP based protocol. Depending on your needs, this might be fine. However, if you still need to access your origin server using one of these protocols, you will need to leave the corresponding ports open.
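As an illustration of what such a firewall policy can look like, the sketch below generates iptables commands that accept web traffic only from a given list of ranges and drop it from everywhere else. It only builds the command strings and does not run them. Restricting the rules to ports 80 and 443 is an assumption made here so that other ports (SSH, for instance) are left untouched; review the output against your own setup before applying anything.

```python
import ipaddress
from typing import Iterable, List


def cloudflare_only_rules(ranges: Iterable[str], ports: str = "80,443") -> List[str]:
    """Build iptables commands allowing `ports` only from the given ranges.

    `ranges` should be Cloudflare's current published IP ranges. The final
    rule drops web traffic from every other source.
    """
    rules = []
    for cidr in ranges:
        network = ipaddress.ip_network(cidr)  # raises ValueError on bad input
        rules.append(
            f"iptables -A INPUT -p tcp -m multiport --dports {ports} "
            f"-s {network} -j ACCEPT"
        )
    rules.append(f"iptables -A INPUT -p tcp -m multiport --dports {ports} -j DROP")
    return rules


# Example usage with two sample ranges (print only, nothing is executed):
for rule in cloudflare_only_rules(["173.245.48.0/20", "103.21.244.0/22"]):
    print(rule)
```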

Another way you can restrict incoming traffic to Cloudflare’s servers is by enabling Authenticated Origin Pulls on your account, and configuring your web server accordingly. This feature will make Cloudflare authenticate with a TLS client certificate when talking to your origin server.

Step 2: Change the IP address of your origin server

As stated above, once the IP address of your origin server has leaked, it’s game over. The best solution would be to change it if it is possible, and if DDoS attacks are a credible threat to you. Depending on your hosting provider, this might be easy or painfully hard to achieve.

Acknowledgments

I want to address a big thank you to the people below, for their multiple suggestions and for proofreading this post!

 

Thank you for reading! Feel free to leave a comment below or to tweet @christophetd for discussions and remarks.


23 thoughts on “CloudFlair: Bypassing Cloudflare using Internet-wide scan data”

  2. Yes but the title is so misleading. Bypassing cloud flare is completely irrelevant from a website that is configured poorly.

    • The goal here is precisely to raise awareness about origins being misconfigured, which is a real issue.

  4. Is it possible to prevent an origin from being indexed this way by using a non-standard http or https port?

    • That’s a good question, but I think it’s a bad idea.
      1) A port scanner could fingerprint this non-standard port and see it is handling SSL connections
      2) Why do this anyway instead of firewalling off the whole thing?

      • It may be easier to set up compared to a firewall and Censys doesn’t scan those ports, if I understand the article correctly. You can stay hidden longer.

  6. Why do you exclude parsed.names: cloudflaressl.com?

    • Because those are SSL certificates issued to the CloudFlare intermediary server between the client and the origin.

  7. Good post, thanks! You also can have the passive DNS approach and check for some previously resolved ips in the Cloudflare range.

  8. If you host your origin in AWS *or* other cloud providers you can still protect your origin even if the IP was leaked. No strict requirement to change it.

    By configuring an AWS security group (aka firewall) rule AWS will drop the traffic before it ever reaches your server. You don’t get charged for the traffic, and AWS has its own DDoS that would be free in that case. So no strict requirement to try to keep your origin IP secret.

  10. Even blacklisting all non-Cloudflare traffic doesn’t help if your origin infrastructure can’t handle the load. Also, a lot of cloud providers and shared hosters don’t even offer blacklisting.

  11. Cloudflare “strongly recommends” installing mod_cloudflare via command line or the cPanel plugin. I’m not comfortable doing this, for fear that I might mess something up. Is there any way to get help/support for this?

  12. good post , can i translate this article to my native language (chinese) and forward it to my blog? i will add original author and origin post link to the beginning of the article .. thanks !!

    • Sure, no problem!

  14. As usual you have the balance of Security vs. Usability vs. Abuse by “Authority”.
    So far as I see it, CloudFlare only managed to produce a Server in the Middle, blocking Usability & providing a Censorship tool.
    Defective and ver. 0.x software should not be released onto the web at large. Remove it, fix it, then we can negotiate.
    Since there is no metering, they cannot provide you with reports on the benefits to your website.
    Do listen to the complaints of the end users, they will be your only source of trusted feedback.

  16. i spotted some confusing statement (confusing for me).

    In the second paragraph of your remediation, you mentioned that >>> “Dropping traffic at the software level with *iptables* does not prevent an attacker from sending a large number of packets to the server to consume all the available bandwidth and make it inaccessible to legitimate users.”

    But on the Step 1 solution you have provided, IPtable seems to be part of the solution enlisted.

    >>> “Configure your origin server only to accept incoming traffic from Cloudflare’s IP ranges. Refer to:
    https://danielmiessler.com/blog/whitelisting-cloudflare-iptables/

    I was thinking since you disqualified it in the first place, you would not see it as a solution and therefore provide a better alternative.
    i currently suffered massive DDOS attack and now whitelisting cloudflare using VHost on NGinx.

    So i would like to know, can i rely on IPtables or VHost?
    Which is better for CF whitelist?

  18. You can disable cloudflare in your browser by disabling the certificate, it doesn’t block the assholes but it blocks all sites that use it on your browser end. In effect why bother visiting a page if it doesn’t exist.

    There is also “cloud firewall” addon for Firefox this should let you block cf easily!

    CF in a nutshell,

    https://en.wikipedia.org/wiki/Cloudflare
