What Cloudflare is offering might be thought of as a secure subset of the internet. Two entities within the secure subset can communicate in safety. Communication with the “outer” unsafe internet is available, but you take your chances.
What is their goal? Get everyone who matters into the secure subset! At one end that means major providers, but at the other end it means businesses and individuals who live by their internet connections and can afford to pay a bit more for a safe internet… That’s how it looks to me, at any rate.
Well, not really.
Remember that Cloudflare is not, and probably never will be, its own internet such that traffic never, or even rarely, leaves it. That's simply because no company can be the last-mile connectivity solution for everyone; at best it can be that for a select number of large businesses that can afford it. And even then, as a business you're not going to limit yourself to working only with companies that can also afford to be on the same private network, so you have to be connected to the public internet. And being connected to the public internet means you are subject to things like DDoS attacks. You can read how Cloudflare connects to last-mile networks here: https://noise.getoto.net/2021/09/16/unboxing-the-last-mile-i… , as well as how they're working to help issues be identified quickly.
What Cloudflare is providing, at its core, is a hosting service. They have servers set up around the world to host your website and applications (which include APIs). Their architecture is by its nature distributed, but at the end of the day they have servers, running a bunch of custom security- and performance-oriented software, connected via both the public internet and their own private backbone, on which they run their customers' applications and host their websites. It's not a "secure subset of the internet." The security part helps stop DDoS attacks from impacting you as a customer, and the performance part improves your website or application's performance for widely distributed users.
It is my opinion, as uninformed as it may be, that for there to be a really secure internet, each end point must have some sort of financial accountability.
I’m not sure about the “financial” part, but yes, knowing endpoints is a good piece of establishing secure connections. Being able to examine and remove bad traffic before it reaches vulnerable endpoints is another good piece. Both are provided in Cloudflare’s Zero Trust solutions: “Cloudflare Access” & “Cloudflare Gateway.” But they are not for the general internet; they are for businesses that want something better than a firewalled internal network (similar to what Zscaler offers, btw). These “Zero Trust” solutions are great, but implementing them on a worldwide scale would be akin to the Third Reich’s Kennkarte system, under which everyone had to apply for, obtain, and show “their papers” on demand to officers. A business can require that of its internal employees/users, but we expect more privacy from our government.
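The core idea of those Zero Trust products can be sketched in a few lines. This is a hypothetical illustration, not Cloudflare's actual API: every request is evaluated against identity, group membership, and device posture, with no "inside the network, therefore trusted" shortcut. All names and policy fields here are made up.

```python
# Hypothetical sketch of a Zero Trust access decision. Every single request
# is checked against identity and device state, never network location.
from dataclasses import dataclass

@dataclass
class Request:
    user: str
    group: str
    device_managed: bool   # is the device enrolled/compliant?
    mfa_passed: bool       # did the user complete MFA for this session?

def allow(req: Request, app_policy: dict) -> bool:
    # All conditions must hold for every request; there is no implicit
    # trust granted by being "on the internal network."
    return (req.group in app_policy["allowed_groups"]
            and req.device_managed
            and req.mfa_passed)

policy = {"allowed_groups": {"engineering", "it"}}
print(allow(Request("alice", "engineering", True, True), policy))   # True
print(allow(Request("bob", "engineering", False, True), policy))    # False
```

This is exactly the "show your papers on every request" model described above, which is why it works for a company's own users but would be intolerable as a requirement for the general internet.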
What stands out is the need for resilience — systems that must continue to operate even when the world goes bad: tsunamis, earthquakes, terrorist attacks, cloud outages, etc. These requirements must be met for autonomous vehicles to really take over. Similarly for any degree of IoT that is truly resilient — it won’t work if you can’t turn your lights on or open your door because there is an outage somewhere 1,000 miles away.
It is clear that Amazon with 77-odd data centers or Google with 21 will not be able to satisfy these needs.
A bunch to unpack here.
First, it’s true that there are systems which have become so important that they must be resilient to man-made and natural disasters. However, neither autonomous driving nor home control are good examples. There are no production autonomous driving systems that rely on connection to a network to function - they all are able to process inputs and determine what to do via an on-board system. Some may get additional information from a network (traffic is an example of widespread adoption), but that information is not necessary. There are no 5G edge networks capable of the latency requirements needed to, say, process an image and detect that it’s a pedestrian you don’t want to hit. Similarly, I don’t know of any internet-enabled home control systems that do not have local control overrides. If Alexa can’t turn your lights on, you can always walk over to the physical switch.
I also disagree with the characterization that Amazon, for instance, does not have a resilient architecture in place. You can read https://docs.aws.amazon.com/whitepapers/latest/aws-overview/… or https://aws.amazon.com/about-aws/global-infrastructure/ to find out that Amazon currently has 25 fully isolated “Regions,” with each region having multiple “Availability Zones,” which are themselves isolated, and you can elect to have your application run on multiple Availability Zones within a Region to achieve complete redundancy. In addition, Amazon provides regional API endpoints, which are designed to operate securely for at least 24 hours even if isolated from the rest of the internet. Amazon’s infrastructure goes far beyond anything that Cloudflare has today in terms of worldwide resiliency.
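The multi-Availability-Zone redundancy described above can be sketched with a toy routing function. This is an illustrative assumption-laden example, not AWS's actual API; the AZ names and health map are invented.

```python
# Illustrative sketch (NOT AWS's real API): an application deployed across
# multiple Availability Zones in a Region routes only to healthy AZs, so a
# single-AZ outage leaves it reachable through the others.
import random

def pick_endpoint(az_health: dict) -> str:
    healthy = [az for az, ok in az_health.items() if ok]
    if not healthy:
        # Only a failure of every AZ in the Region takes the app down.
        raise RuntimeError("entire Region unavailable")
    return random.choice(healthy)

health = {"us-east-1a": True, "us-east-1b": False, "us-east-1c": True}
# The failed AZ (us-east-1b) is never selected.
print(pick_endpoint(health))
```

The design choice is the point made above: the isolation between AZs (and between Regions) is what turns "a data center failed" into a routing decision rather than an outage.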
Thirdly, no matter who is hosting your application, if resilience is important then you need to design your application for it - you can't just have it hosted on a great network and expect everything to always be fine. If you're using Cloudflare's distributed POPs (Points of Presence), for instance, then you have to figure out where copies/backups of your core application's data live.

Just think about having your customers log on to your website. You maintain some kind of database that stores each user's information, including username and password (and potentially a secure token ID). That database will usually live on a central server, which could go down. So you need at least two servers with the database, and during normal times you have to keep them in close synchronization as new users are added, old users removed, or users change their passwords. And if you're using something like Cloudflare's edge network for performance, then you're probably keeping read-only subsets of your user database on each POP. If a POP goes down, perhaps another POP that's further away takes its place, but it will have to go back to the central server to get the DB records for users it hasn't seen before.

As a large company you're not going to store your entire user database on a relatively small POP, and even if you did you'd still have the multiple-DB synchronization requirements. With Cloudflare's edge network, you're getting faster performance most of the time, not resiliency. The recent widely reported failures of edge networks taking out huge numbers of websites and applications prove this.
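The POP behavior described above reduces to a read-through cache in front of an origin database. Here is a hedged, minimal sketch; the data, class, and method names are all made up for illustration:

```python
# Minimal sketch of an edge POP holding a read-only subset of the user DB.
# Cache hits are served locally; misses require a round trip to the origin.
ORIGIN_DB = {"alice": {"pw_hash": "x1"}, "bob": {"pw_hash": "x2"}}

class Pop:
    def __init__(self):
        self.cache = {}  # the read-only subset held at this POP

    def get_user(self, username):
        if username in self.cache:          # fast path: served at the edge
            return self.cache[username]
        record = ORIGIN_DB.get(username)    # slow path: back to the origin
        if record is not None:
            self.cache[username] = record   # keep it for subsequent requests
        return record

pop = Pop()
pop.get_user("alice")        # first request goes to the origin
assert "alice" in pop.cache  # subsequent requests are served locally
```

Note what this sketch makes obvious: if the origin is unreachable, any user the POP hasn't seen yet cannot be served at all. That is the "performance, not resiliency" distinction.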
Now, there are certainly some kinds of applications that don’t require authentication and yet need to serve users with performance and reliability. The caching nature of a POP can be helpful there, but there’s always an “origin server” which holds the primary copy of the data involved (and it still needs to be backed up/synchronized). You can’t run an edge network like a disk farm of RAID 5 devices, because that kind of constant inter-POP communication would kill performance.
So, what is the point of Cloudflare’s “backbone?” Here’s Cloudflare’s page on it: https://noise.getoto.net/2021/09/16/cloudflare-backbone-a-fa…
Essentially, it provides additional routing choices for data requested at one Cloudflare data center that must be supplied through another. The “additional” part is important because, as Cloudflare itself admits, the public internet is sometimes simply faster due to shorter fiber distances. But when capacity overloads or failures hit, Cloudflare’s own backbone is there as either a backup or the faster option. It’s important to recognize, though, that this backbone runs only between Cloudflare data centers. It doesn’t run all the way to TMF’s headquarters in Virginia, nor to your home for that matter.
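That routing choice can be sketched as picking the best currently-usable path between two data centers. This is a toy model under stated assumptions; the path names and latency numbers are invented, and real backbone routing is of course far more involved:

```python
# Toy sketch of the choice described above: between two Cloudflare data
# centers, use whichever path (private backbone or public internet) is
# currently up and faster. All figures here are made up.
def choose_path(paths: dict) -> str:
    usable = {name: p for name, p in paths.items() if p["up"]}
    if not usable:
        raise RuntimeError("no path between data centers")
    return min(usable, key=lambda name: usable[name]["latency_ms"])

paths = {
    "public_internet":  {"up": True, "latency_ms": 48},  # shorter fiber run
    "private_backbone": {"up": True, "latency_ms": 61},
}
print(choose_path(paths))                 # "public_internet" wins on latency
paths["public_internet"]["up"] = False    # congestion or failure strikes
print(choose_path(paths))                 # falls back to "private_backbone"
```

The sketch captures why the backbone is an "additional" option rather than the default: it only wins when it is faster or when the public path degrades.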