Fastly Deep Technical Dive

Over at SoftwareStackInvesting, Poffringa has a new detailed post on Fastly’s technology (… ).

It’s highly technical, but if I may, here are some easier-to-digest takeaways:

• Fastly was founded in 2011 by Artur Bergman, who was frustrated with the existing CDN solutions available to him.

Fastly’s Chief Product Architect recently said “Fastly has a long history of looking at problems from first principles and being unafraid to undertake difficult projects if we know they will benefit our customers.”

• Fastly took a software-driven approach to the networking of their CDN, based on Arista’s (ANET) SDN switches. Using switches instead of routers was a big cost saving for Fastly, and by running their own software on top, they can optimize content delivery in ways fixed hardware routers can’t.

• Fastly’s overall design is to have fewer but larger POPs than their competitors. Not just larger, but much more efficient, with SSDs and custom file system software for performance. Some super-popular content, such as “Like” or “Share” button icons, is actually served out of memory, not disk.

• Almost all aspects of Fastly’s CDN are customer-programmable, performant, and cost-efficient with features like instant cache purging, limited deployment regions, and instant activation/deactivation.

• Compute@Edge moves computations out of central servers and closer to end-users in Fastly’s distributed network. This is a serverless environment, which essentially means customers don’t have to manually manage server process instances and only pay for what they use. Fastly’s processes start up orders of magnitude faster than anyone else’s, including Amazon’s Lambda, the first popular serverless computing architecture. As with their re-thinking of CDNs, Fastly went back to first principles for their serverless architecture, and it’s unlike anyone else’s. Almost all of the others start up a container. Those containers are heavy, in that they have all kinds of functionality not necessarily designed for serverless. Fastly designed their processes specifically around a small kernel that starts up quickly and leaves it to the applications to bring in only what they need to run.

• Compute@Edge won’t be rolled out to everyone until 2021, so no revenue from it until then. Smorgasbord here: With select customers already building solutions in Beta, once it goes live revenue could start pretty high at day one.

• As an example of Compute@Edge, Shopify will be offering its customers the ability to do custom product discounts, beyond standard things like “buy 1, get 1 free.” Shopify’s customers will be able to create their own discount rules and run them within the Shopify environment in a fast, compact, secure manner on top of Fastly. Shopify reported a speed of 1,000 requests per second.

• Poffringa warns us potential investors that the market for this kind of distributed serverless compute environment is unproven. Engineers might love it, but that doesn’t mean customers will flock to it. And if they do, Poffringa expects that the big cloud vendors will eventually modify their own existing serverless offerings in an attempt to match Fastly’s speed and security. However, Fastly likely has a big head start.

My own (Smorgasbord’s) take is that given those vendors already have a big heavy-weight architecture, they’re going to have to make some tough decisions in trying to match Fastly that their existing customers won’t like. For instance, AWS’s Lambda gives their customers a full web container with all sorts of functionality. That’s one reason it takes so long to start up. If Amazon pulls functionality, existing customer workloads won’t work without significant re-writing on the customer side, which they won’t like. If AWS chooses a whole new approach that matches Fastly, then customers will be confused as to which Amazon serverless model to choose.

• Competition includes Cloudflare’s Workers product, which is already available. Workers has a 3000-5000 micro-second startup time, compared to Fastly’s 35 micro-second startup time (and compared to Amazon’s 200,000 micro-second startup time). Workers has a smaller footprint than AWS (3MB versus 35MB), but Compute@Edge’s memory footprint is much much smaller: only several KB (1MB = 1000KB).
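To put those startup figures side by side, here is a small back-of-the-envelope sketch. It uses only the microsecond numbers quoted above; the dictionary labels and the function name are just illustrative choices, not anything from the article.

```python
# Startup times as quoted in the post, in microseconds.
STARTUP_US = {
    "Fastly Compute@Edge": 35,
    "Cloudflare Workers (low)": 3_000,
    "Cloudflare Workers (high)": 5_000,
    "AWS Lambda": 200_000,
}


def slowdown_vs_fastly(startup_us):
    """Return how many times longer each platform takes to start than Fastly."""
    base = startup_us["Fastly Compute@Edge"]
    return {name: t / base for name, t in startup_us.items()}


if __name__ == "__main__":
    for name, ratio in slowdown_vs_fastly(STARTUP_US).items():
        print(f"{name}: {ratio:,.0f}x")
```

On these numbers the spread runs from roughly 86x (the low end for Workers) to roughly 5,700x (Lambda).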

Definitely worth reading the whole article, even if you have to take it in chunks and skip over some technical passages.


I found the article most helpful in explaining Fastly’s moat and differences with legacy CDN providers. It can be hard to read (I wish the editors would stop publishing in huge chunky paragraphs) but don’t let that stop you. Definitely worth the effort.

I would like to compliment Smorgasbord1 for linking us to the “Fastly Deep Technical Dive” article on the Software Stack Investing website. He warns that it is heavy reading. I would like to provide a decoder ring. I have some acronym definitions and explanations of the things Smorgasbord1 and the article refer to. If you keep this at hand, the material reads more easily.

Millisecond, microsecond: Thousandth, millionth of a second.

CDN: Content delivery network: A cluster of networking hardware optimized to find requested material on the internet and route it quickly to the requester. These things are highly optimized for both BANDWIDTH and LATENCY.

Bandwidth: How many bits of data can be delivered in a second. Like more water per second through a fat fire hose compared to a garden hose.

Latency: How quickly the first return happens after you make the request. That comprises the time it takes for the CDN to decode your request, look in its local (or remote) address book to find where on the internet the content you want is, code a request to the place that has it, and route the flow down to your PC. Optimal is high bandwidth (fat pipe) and low latency (quick turnaround).
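A simple way to see how the two interact is to model delivery time as latency plus the time to push the bits through the pipe. The sketch below is a simplified model with made-up example numbers (a 2 KB icon, a 100 Mbit/s link); real networks add protocol overhead that it ignores.

```python
def delivery_time_s(size_bytes, bandwidth_bps, latency_s):
    """Simplified model: wait for the first byte, then stream the rest.

    size_bytes    -- size of the content in bytes
    bandwidth_bps -- link bandwidth in bits per second
    latency_s     -- time until the response starts arriving, in seconds
    """
    return latency_s + (size_bytes * 8) / bandwidth_bps


if __name__ == "__main__":
    icon = 2_000          # hypothetical 2 KB "Like" button icon
    link = 100e6          # hypothetical 100 Mbit/s connection
    for latency in (0.050, 0.005):
        t = delivery_time_s(icon, link, latency)
        print(f"latency {latency * 1000:.0f} ms -> total {t * 1000:.2f} ms")
```

For small items like icons, almost all of the time is latency, which is why shaving the turnaround matters more than a fatter pipe in that case.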

POP: Point of Presence: A physical location where a company has networking equipment they control. That equipment is optimized to route requests for content to where the content provider stores it, and to optimize how the returning content returns. When they park their equipment all around the world and link it together, they can control everything within that network, and how it connects back out to the internet. A giant optimized POP connected to a fast network next door to Disney will get your princess video or website faster.

Caching content: From the root word cache. Storing content right there in your POP device, so you don’t have to go far to get to it. Some content is requested so frequently that it makes sense to store local copies, and return that when requested. The requester doesn’t have to wait for a request to go out across the internet and the reply to come back. Examples might range from a “like” button on a web page to a popular web page or chunk of video.
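For the curious, the idea can be sketched in a few lines. This is a toy least-recently-used cache, not Fastly's actual design; the class name `EdgeCache` and the `fetch_from_origin` callback are hypothetical names chosen for illustration.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy least-recently-used cache standing in for a POP's local store."""

    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity                    # max number of items kept
        self.fetch_from_origin = fetch_from_origin  # slow path: go to the source
        self._store = OrderedDict()                 # oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            # Cache hit: serve the local copy and mark it as recently used.
            self._store.move_to_end(url)
            self.hits += 1
            return self._store[url]
        # Cache miss: fetch from the origin and keep a local copy.
        self.misses += 1
        content = self.fetch_from_origin(url)
        self._store[url] = content
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)         # evict least recently used
        return content

    def purge(self, url):
        """Drop a cached item so the next request refetches it."""
        self._store.pop(url, None)
```

A popular item such as a “like” button is fetched from the origin once and then served locally until it is evicted or purged.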

IOPS: Input-output operations per second: A measure of how many data storage and retrieval transactions can be done on a data storage device. Usually refers to a disk drive or solid state drive (SSD).

HD, HDD: Hard Drive, Hard Disk Drive: Spinning circular metallic disks with magnetic surfaces. Data is recorded on them with magnetism like on a tape recorder. Data is organized in circular “stripes” from the inner stripe to the outer stripe. Like a 2-dimensional onion. The recording and retrieving happens with an electro-magnetic “head” that reads and writes the data dots on the stripe on the disk. To write, it generates electrical current to create the magnetic field that magnetizes the dots. To read it uses the head to measure the magnetic polarity of the dot and converts it to a small electrical current.

There is a small microprocessor inside the drive enclosure that controls where on the disk the data is stored. When it stores or retrieves it does this: It moves the head out to the correct stripe, and it waits until the starting spot on the disk rotates around to where the head is, then activates the head to read or write.

It takes a matter of milliseconds to store or retrieve data. The speed is limited by the rotational speed of the disk and the speed limits on the head mechanism moving in and out across the radius of the spinning disk. The unit “knows” where on the disk the data is, waits for the spinning disk to come around to where the magnetic “head” can read it from the surface. Once the head reads it, then the electronics take over and it goes back to the memory of the server that requested the data.
The hard drives can support a few hundred input-output operations in a second (IOPS).
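The “few hundred” figure can be sanity-checked with simple arithmetic: each random operation costs roughly one average seek plus half a rotation. The sketch below assumes a fast 15,000 RPM drive with a 3.5 ms average seek; those two numbers are illustrative assumptions, not figures from the article, and slower drives come out lower.

```python
def hdd_iops(rpm, avg_seek_ms):
    """Rough random-access IOPS estimate for a spinning disk.

    Each operation is modeled as one average seek plus, on average,
    half a revolution of waiting for the data to come under the head.
    """
    seconds_per_rev = 60.0 / rpm
    avg_rotational_latency_s = 0.5 * seconds_per_rev
    service_time_s = avg_seek_ms / 1000.0 + avg_rotational_latency_s
    return 1.0 / service_time_s


if __name__ == "__main__":
    # Assumed example drive: 15,000 RPM, 3.5 ms average seek.
    print(f"{hdd_iops(15_000, 3.5):.0f} IOPS")
```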

SSD: Solid state drive: A solid state drive is an electronic version of a hard drive, with special memory that operates in MICROSECONDS instead of milliseconds, but still retains data if you power it down. There is no waiting for a head to get to the correct stripe or the disk to come around to the start point of the data spots. It all happens with the speed of semiconductors. It is a thousand times faster than the electro-mechanical hard drives. More expensive too. The SSDs used by Fastly can operate a few hundred times faster than the HDDs. You can see why they are preferred and much more expensive.

Fastly studied the tradeoff between a ton of POPs around the world versus fewer POPs that contain more expensive, but better performing hardware. They chose to limit the number of POPs, and build each with high performing compute devices and large numbers of high performing solid state drives to cache as much routing information and content as they could. Caching means it is already there, not at the end of a long request out to the internet or their POP network. They challenged the assumption the competition made that many small POPs were the way to go. Their system is faster. Also they have address computation and interconnection routing knowledge to the other POPs that figure out the route in 35 microseconds, 85 to 5700 times faster than competitors.

I don’t know how long that advantage will last, but it is pretty good right now.