Introducing Confluent (CFLT)

Hello all,

First of all, I am eternally grateful to everyone on this board. Others have said this many times before, and I share the sentiment.

The company I wanted to bring up for discussion is Confluent.

Investor Presentation:
Public MF article discussing pre-IPO (where I pulled some info):…

Confluent operates as a software company with a SaaS model. The company is seven years old and they IPO’d a little more than two months ago at $45 (trading there now). Their market cap is ~$11.5 Billion. There is a lot to like from a Saul standard, which I will get into later.

The company was founded by three former LinkedIn employees (all three still with the company, two on the board) who were tasked with rebuilding LinkedIn’s data infrastructure from the ground up to be cloud native and scalable. They didn’t see an off-the-shelf solution, so they decided to build it themselves. The platform they built is Apache Kafka.

Confluent offers a service that allows businesses to “stream” data in real time. They operate a data platform and their tagline is “Data In Motion.” Confluent’s platform, from what I understand, is a paid version of Apache Kafka, the open-source project the founders created, which has been adopted by more than 80% of Fortune 500 companies. The difference is that Confluent provides enterprise-level support, and possibly other things I’m not sure about.

From the 10Q: Confluent’s platform "allows customers to connect their applications, systems, and data layers and can be deployed either as a self-managed software offering, Confluent Platform, or as a fully-managed cloud-native software-as-a-service (“SaaS”) offering, Confluent Cloud."

Someone else can probably help explain that a little better for non-techies like myself. It’s hard for me to get my head around. As Saul says, we may not understand the tech like someone “in the field” might, but we CAN look at the impressive numbers.

Confluent reported yesterday, Aug 5th.
-Total Revenue of $88mm, up 64% YoY
-Confluent Cloud revenue of $20mm, up 200% YoY
-RPO of $327mm, up 72% YoY
-617 customers w/ $100,000 or greater in ARR, up 51% YoY
-Q3 Revenue Guidance of $89-$91mm, and Fiscal 2021 revenue guide of $347-$351mm

By the Numbers (pulled from Bloomberg):
Revenue ($ millions)
Year     Mar     Jun     Sep     Dec     Total
FY2019   29.2    34.0    38.5    48.1    149.8
FY2020   50.9    53.9    61.5    70.3    236.6
FY2021   77.0    88.3

Growth (YoY, each quarter vs. the same quarter a year earlier)
Year     Mar     Jun     Sep     Dec
FY2020   74.5%   58.2%   59.7%   46.2%
FY2021   51.3%   64.0%
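
For anyone who wants to check the growth figures: they compare each quarter with the same quarter a year earlier. Here is a minimal Python sketch that recomputes them from the revenue numbers in the table. The variable and function names are my own, and results can differ by a few tenths of a point from the posted percentages, presumably because the posted revenue is rounded.

```python
# Quarterly revenue in $ millions, copied from the table above.
revenue = {
    "FY2019": [29.2, 34.0, 38.5, 48.1],
    "FY2020": [50.9, 53.9, 61.5, 70.3],
    "FY2021": [77.0, 88.3],  # only the Mar and Jun quarters are reported so far
}


def yoy_growth(current, prior):
    """Percent growth of each quarter over the same quarter one year earlier."""
    return [round((c / p - 1) * 100, 1) for c, p in zip(current, prior)]


growth_2020 = yoy_growth(revenue["FY2020"], revenue["FY2019"])
growth_2021 = yoy_growth(revenue["FY2021"], revenue["FY2020"])

print("FY2020 YoY %:", growth_2020)
print("FY2021 YoY %:", growth_2021)
```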

Confluent is starting to re-accelerate after the Q4’20 slowdown. I don’t know why they slowed a bit in 2020, but I assume it was COVID related.

Gross margin has hovered around 69% over the last year. Gross margin on subscription revenue, which represents ~90% of total revenue, was 77%.

Customer Base:
In land and expand mode: they have already captured 136 of the Fortune 500 companies.
Customers that you would be familiar with include: Citi, Expedia, Lowe’s, Bosch, BMW, Humana, Key Bank, Walmart, Domino’s, Netflix, Ticketmaster, Morgan Stanley, Goldman Sachs, Lufthansa and more. Do you see the point there? They are proving to be successful across virtually ANY industry! The network effect is driving further expansion.

Named Google’s 2020 Cloud Partner of the Year for Smart Analytics for the third straight year! (…)

Total Customers: As of Q2’21 Confluent had 2,830 customers, up from 1,390 in Q2’20, an increase of 104%!

Customers w/ >$100k in ARR: 617 in latest quarter, up 51% YoY

Customers w/ >$1MM in ARR: 70 in latest quarter, up 112% YoY! Talk about land and expand!!


There is a lot to like here. This is a founder-led hyper-growth SaaS company with a huge TAM. While I don’t completely understand the technology, I do like the impressive revenue growth, gross margin %, and customer growth, particularly in the $1MM+ category. It appears that Confluent’s moat is the fact that they are the main support for the Apache Kafka software at the enterprise level. The technology is open source and ubiquitous, and they are bringing over paid subscribers in droves. I am going to buy a ~2% starter position next week to keep it on the radar.


Apache Kafka is open source. Confluent packages it and provides support and bells and whistles around Kafka, the same as Red Hat does for Linux.

The world is moving more and more realtime and Kafka / Confluent is at the center driving this change. Tweets, Messages, Stocks, Bank transactions, Video, Audio everything is going streaming.

Competition is other streaming solutions like AWS Kinesis, Nifi etc., but Kafka is king. As an indicator, there are 17k job listings on Indeed looking for Kafka skills.

By the way, Aug 9th is when lockup expires for first 25%, so if you want to scoop up the shares, next week would be a good time.


Muji piqued my interest in Confluent recently when talking about the company’s IPO on the 7invest podcast located here:

Some parts I found interesting:

  • And I think from the Confluent S-1 and from the Apache Kafka site, 80% of Fortune 100 companies use it. 70% of Fortune 500 companies use it internally.

  • And to me, as a hyper growth investor, with a very condensed portfolio, I’m most interested in that cloud aspect of their business, because that’s where the company can scale more. And so I want to see, you know, extensive growth on the cloud platform.

  • And it can playback data from that archive, or can playback the real time stream, as it always is done. And with machine learning, you know, you can immediately see the benefit there is that you can train on the historical data that’s stored in the underlying object store, and then turn around that machine learning model, and be doing analytics over the real time streaming from there.

  • Apache Kafka is the winner confluent will be a winner, it’s just I, you know, I need to see them focus on that cloud service exactly as MongoDB did with Atlas.

  • I would like to see them go into that managed service versus managed infrastructure where it’s no longer the customer. It’s a little more turnkey, from the customer’s perspective, I don’t need to be fiddling with configuration options, I just need to say I need this cluster, it needs to have this capacity in this cloud. And then you just hook all the agents up into it. And so I think that’s the ultimate path for confluent.


IMO, CFLT has a lot of similarities with other open-source names, and I feel the open-source model makes it very difficult for them to improve operating efficiency.

E.g. in this first quarter after IPO, CFLT’s top-line growth is impressive and a good re-acceleration, but all other metrics are pointing in the wrong direction YoY:

  • Non-GAAP operating margin is lower
  • Adjusted EPS is much lower
  • Free cash flow is lower

Still no interest to open a position, for now.



The world is moving more and more realtime and Kafka / Confluent is at the center driving this change. Tweets, Messages, Stocks, Bank transactions, Video, Audio everything is going streaming.

I have not used Kafka but my company has something similar. If I understand correctly, Kafka has nothing to do with the real-time features mentioned above. Kafka is about real-time data ingestion and processing. The data here is logging data generated by users’ activities. For example, when a user likes a post on Instagram or makes a purchase on Amazon, logging data is sent to a web server and is later handled by the real-time data processing services. One common use case of such a service is building a real-time dashboard to monitor product performance.

BTW, for folks who are curious about whether Confluent is a competitor of Snowflake: they’re not. On Confluent’s website, there’s a technical tutorial about how to integrate Confluent with Snowflake -…. Customers can choose Confluent to handle the data ingestion and pipe the data into Snowflake’s data warehouse.

This field definitely has a lot of potential. I’d research the competitors and take a look at Confluent’s S-1. Thanks for sharing the company!



For anyone interested in learning about Confluent and Apache Kafka, I highly recommend subscribing to muji’s service at - he has written EXTENSIVELY about the company and the technology. It’s behind a paywall that you can get past for $60 or so, worth it imo.

(no position in CFLT)


One of the Confluent engineers got me interested in this company before the IPO so I’ve been keeping my eye on it. There’s a lot to like. Here’s a couple of extra resources I’ve run across:

MF Article by Eric Cuka discussing competitors and partners:…

Transcript of the recent conference call:…


I’ve been looking into CFLT for some time and just started a small position on Friday after looking through the earnings release.

I would explain the tech in this way. Snowflake is a traditional data warehouse in that it employs batch processing. That is you take in a bunch of data and you write it to disk, and then at some later time you run operations on that data. The use case here is that you don’t need insights from that data right away. For example, you’re a financial firm that does millions of transactions per week, and you want to know how the transaction size, volume, etc. compares week to week. It’s a huge operation. It’s best to store the data to disk and schedule a batch process for the end of that week.

Confluent is a real-time data platform in that it employs stream processing. That is, you can perform operations on the data as you receive it, without first loading it into a warehouse and waiting for a scheduled job. (Kafka itself does persist events to disk as a log, but the processing happens as they arrive.) The upside here is speed: you get insights in real time. For example, take a credit card company doing fraud detection. You don’t want to store a record of all those transactions, do a batch process over the weekend, and a week later tell the customer “hey, this transaction is kind of suspicious.” You want to do that in real time.
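
To make the batch vs. stream difference concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the transactions are made up, the "10x the card's running average" fraud rule is hypothetical, and nothing here uses Kafka or Snowflake. The batch function looks at everything after the fact, while the stream function reacts to each event as it arrives.

```python
from collections import defaultdict

# Hypothetical transactions as (card_id, amount_usd). Values are made up.
transactions = [
    ("card-1", 25.0),
    ("card-2", 40.0),
    ("card-1", 30.0),
    ("card-1", 4200.0),  # unusually large for card-1
    ("card-2", 35.0),
]


def batch_totals(stored):
    """Batch style: run once, later, over everything that was stored."""
    totals = defaultdict(float)
    for card, amount in stored:
        totals[card] += amount
    return dict(totals)


def stream_alerts(events, factor=10.0):
    """Stream style: inspect each event as it arrives and flag it right away
    if it exceeds `factor` times that card's running average so far."""
    count = defaultdict(int)
    total = defaultdict(float)
    for card, amount in events:
        if count[card] and amount > factor * (total[card] / count[card]):
            yield card, amount
        count[card] += 1
        total[card] += amount


weekly = batch_totals(transactions)               # end-of-week report
alerts = list(stream_alerts(iter(transactions)))  # flagged at arrival time

print(weekly)
print(alerts)
```

In the stream version the suspicious charge is flagged the moment it shows up, before the later transactions have even been seen.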

This is, of course, an oversimplification, and there are downsides that are a bit on the technical side, but I hope this is helpful for the non-techies out there.

Here’s a good video put out by Confluent that goes into greater detail:


HHHypergrowth / Muji is great at breaking down a company’s tech product offerings in layman’s terms. He touched on Confluent in this podcast episode from June:

No position in CFLT for me yet. I agree with Zoro (ZoroSGInvesting) that some key metrics are trending in the wrong direction. Will monitor it for another quarter or two.

  • Ra

Bert Hochfeld analyzes Confluent in this SA article:…



MF had a good introductory podcast on Confluent,…


Two things are certain

  1. World is going more event driven and real time
  2. Kafka is king in this world (beat Spark streaming, Storm, Nifi etc.)

Confluent provides Connectors, Support, Security etc.
The key question is: will Confluent’s managed Kafka service win over other managed Kafka services?

My view is Confluent will win but this answer is not that clear yet.


I read through this entire thread and a few of the links and I was curious to understand how big is this market? It sounds like a niche area. I did a search and I can’t find any info regarding their TAM.


$50B according to their S1:…

I don’t think this is a niche product at all. Apache Kafka has been adopted by more than 80% of Fortune 500 companies. So it is a matter of selling these companies that already use Kafka on the value proposition of Confluent.


As of IPO TAM of $50 billion was estimated by Confluent and its underwriters

So there are big cloud competitors with their own homegrown solutions: AWS, Azure and Google Cloud.

Then amongst Kafka - there are other large competitors such as Cloudera (merged entity that was public but taken private because of poor market performance).

So it’s a crowded field, and where Confluent can add value to a company is in managed services, where I think there will always be pressure to compress margins. IMHO, there is not enough of a special sauce for me to pursue it.

(1st post - so I hope this is not out of bounds)


Confluent (CFLT) and “data in motion”

This is my first post on TMF and on this board. I have been a lurker for 10 months or so, and have benefited hugely from the philosophy and the content of this board.

I was a software engineer for 22 years. I retired last year and am now a full-time student doing a master’s in clinical counseling. In the last few years of my career, I was heavily involved in moving existing applications to the cloud and writing cloud-native applications.

I have seen Confluent mentioned on the board a few times. It has great numbers, but I think members don’t understand what Confluent does, what ‘data in motion’ is, how important it is for companies, and how it differs from database companies like Oracle and MongoDB.

Here is my attempt to explain what they do, and how important they are to companies who are in the cloud.

Let’s take the example of a retailer like ‘Pier 1 Imports’. Their website used to be one giant ‘monolith’ application. Everything an e-commerce website needs, such as UI, Orders, Shopping-cart, Customers, Products, Inventory and all the business logic, was bundled into one application.
What’s the problem with this architecture?
Scalability: If for Black Friday I want to increase resources for UI and Products by five times to handle the traffic, I have to increase resources for the entire application five times. That means the entire cloud bill goes up five times.
Development time: Since a huge team of developers is required to manage this application, a lot of time is wasted on communication and coordination. Developers end up stepping on each other’s toes.
Time to market: Rolling out a new feature or a bug fix is hard because whenever one feature is ready, other features may still be under construction, and everything ships together.
Technological restrictions: If some part of the application could be written better in a specific language, tough luck! Since it’s one application, everything needs to be written in one language.

Cloud-native architecture, or microservices architecture, solves all these issues. It works by splitting every feature into its own service. So in the case of ‘Pier 1 Imports’, UI, Order, Shopping-cart, Products etc. will each have their own service and their own database. This solves all the above issues, since an independent service can be written in its own technology/language, have its own small developer team that releases on its own schedule, and be scaled independently.

But, do you see a new set of problems with this architecture?
With this new architecture, there are a lot of moving parts! When it was one application, it was easy for Order to talk to the Packaging, Inventory and Analytics modules after each order was placed. Now, service-to-service communication is more complex because each service runs separately.
There are two main ways services communicate with one another in the new cloud-native architecture:

API: This is when two services talk to each other directly. This is perfect for real time, but the problem is that if one service is down it can create havoc in the system. For example, if the Analytics service is down and Order wants to let Analytics know that it sold two tables, Order will get stuck waiting. That could mean the order won’t go through, and future orders might not go through either, since Order is still waiting on Analytics to respond.
Publisher/Subscriber Messaging Queue Mechanism: This is an ‘almost-real-time’ model rather than strictly real time. In this mechanism, a Publisher is an entity that has important information to release, like Order-service when an order is placed, or Inventory-service when a product’s inventory goes below a critical mark. A Subscriber is an entity that is interested in a particular event; for example, Inventory-service would be interested in every order that is placed. If a Subscriber goes down, upon restarting it knows what it read last and starts from there. This broadcasting, subscribing, mailbox, messaging, and queueing is done beautifully, easily, reliably and with extensive scalability by an open source project named Kafka.
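
Here is a toy Python sketch of the publisher/subscriber idea, in particular the part where a subscriber that was down picks up where it left off. This is my own in-memory illustration, not Kafka’s API: the `Topic` and `Subscriber` classes are hypothetical, and a real Kafka cluster stores the log and the read positions on brokers rather than inside Python objects.

```python
class Topic:
    """An append-only list of messages. Readers keep their own position."""

    def __init__(self):
        self.log = []

    def publish(self, message):
        self.log.append(message)

    def read_from(self, offset):
        return self.log[offset:]


class Subscriber:
    def __init__(self, name, topic):
        self.name = name
        self.topic = topic
        self.offset = 0  # index of the next message this subscriber will read
        self.seen = []

    def poll(self):
        """Fetch everything published since the last poll."""
        new = self.topic.read_from(self.offset)
        self.seen.extend(new)
        self.offset += len(new)
        return new


orders = Topic()
inventory = Subscriber("inventory", orders)
analytics = Subscriber("analytics", orders)

orders.publish("order-1")
inventory.poll()
analytics.poll()

# Analytics "goes down": it simply stops polling. The publisher is not
# blocked and keeps writing, and Inventory keeps reading.
orders.publish("order-2")
orders.publish("order-3")
inventory.poll()

# Analytics restarts and resumes from its saved offset.
missed = analytics.poll()
print(missed)
```

Compare this with the direct API call case: here Order never waits on Analytics, so an Analytics outage does not hold up orders.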

Let’s see this in an example: If on the Pier 1 website a customer adds 2 tables to the shopping cart, Shopping-cart would broadcast that message. Inventory service has subscribed to the message, so upon getting it, it would put 2 tables on hold in inventory. When the customer hits the buy button, Order service will broadcast a message, Inventory service will move those 2 tables from ‘hold’ to ‘sold’ and might publish its own message about inventory depletion for the tables. Packaging service and Transportation service have also subscribed to the table-sold message, so that they can run their algorithms. There would be scores of messages created and spread around for each order. This queueing mechanism becomes the spinal cord of the system, and its importance, necessity and usage keep growing by the day. I have used alternative queueing mechanisms like MSMQ (by Microsoft), RabbitMQ, JMS etc., but none comes close to Kafka.
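
The flow in the table example can be sketched in a few lines of Python. Everything here is hypothetical and simplified: the event names, the starting stock of 10 tables and the low-stock threshold are made up, and handlers are called directly in-process, whereas in a real system each service would be a separate program reading from Kafka.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # event type -> list of handler functions
inventory = {"table": {"available": 10, "hold": 0, "sold": 0}}
published = []                   # every event type, in the order published
packing_list = []


def subscribe(event_type, handler):
    subscribers[event_type].append(handler)


def publish(event_type, **data):
    published.append(event_type)
    for handler in subscribers[event_type]:
        handler(**data)


def hold_stock(product, qty):
    """Inventory service: reserve items when they land in a cart."""
    inventory[product]["available"] -= qty
    inventory[product]["hold"] += qty


def sell_stock(product, qty):
    """Inventory service: move items from 'hold' to 'sold' on purchase."""
    stock = inventory[product]
    stock["hold"] -= qty
    stock["sold"] += qty
    if stock["available"] < 9:  # hypothetical low-stock threshold
        publish("inventory_low", product=product, available=stock["available"])


def pack(product, qty):
    """Packaging service: queue the items for packing."""
    packing_list.append((product, qty))


subscribe("cart_item_added", hold_stock)
subscribe("order_placed", sell_stock)
subscribe("order_placed", pack)

publish("cart_item_added", product="table", qty=2)  # customer adds 2 tables
publish("order_placed", product="table", qty=2)     # customer hits buy

print(inventory["table"])
print(published)
```

Note that Shopping-cart and Order never call Inventory or Packaging by name. They only publish, and whoever has subscribed reacts.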

Confluent was founded by the original creators of Kafka. Confluent made Kafka a managed cloud service, added a bunch of bells and whistles to make development easier and manageable.

I hope this was helpful.

Nitin (Long CFLT 15%)



Thanks for explaining the technical side. I’m a software engineer as well. I have a different understanding from you about the usage of Kafka.

My company uses Scribe, which is very similar to Kafka IIUC. My company does not use Scribe as a general-purpose MQ. Scribe is used primarily to power our real-time data analysis and data ingestion. Async computing and the communication between microservices are done by something else.

I read this article -…, which I think makes sense. It says that RabbitMQ is more capable than Kafka in some cases, including communication between microservices. One benefit RabbitMQ has is that it supports priority in the MQ, which is very important in some cases. Based on my conversations with my friends in some top tech companies, they also appear to use RabbitMQ for async tasks.
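
For readers wondering what “priority in the MQ” means in practice, here is a small Python sketch using only the standard library. It is not RabbitMQ’s or Kafka’s API. The task names and priority numbers are made up, and “lower number = more urgent” is just this sketch’s convention. It contrasts a log-style queue, where messages come out in the order they were written, with a priority queue, where the most urgent message jumps ahead.

```python
import heapq
from collections import deque

# Hypothetical tasks as (priority, name). Lower number = more urgent here.
tasks = [
    (5, "send newsletter"),
    (1, "charge credit card"),
    (3, "resize image"),
]

# Log / FIFO style: consumers see messages in the order they were written.
fifo = deque(tasks)
fifo_order = [fifo.popleft()[1] for _ in range(len(tasks))]

# Priority-queue style: the most urgent message is delivered first.
heap = list(tasks)
heapq.heapify(heap)
priority_order = [heapq.heappop(heap)[1] for _ in range(len(tasks))]

print(fifo_order)
print(priority_order)
```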

Although I agree that Kafka is an essential infra (I’m also long CFLT), my understanding is that, Kafka is not the best to be used as a MQ. The strength of Kafka is related to real time analytical data, NOT related to handling async tasks or communication between micro-services.

I learned all of this from fellow software engineers and from online resources, and I don’t have first-hand experience with Kafka, so I may be wrong. If you have first-hand experience with Kafka, I’d appreciate you sharing what you achieved with it in your real-world projects.



OK, let’s skip the attempts at explaining what Confluent does. Today it really just comes down to this:

Confluent made Kafka a managed cloud service, added a bunch of bells and whistles to make development easier and manageable.

Right. Kafka is an Apache open-source project (meaning free to deploy yourself). However, Kafka requires a bunch of tech people to run and manage since it’s complicated. What Confluent does is host (run and manage) Kafka for you. For now as investors, we can just assume that Kafka is great and that a lot of companies, particularly big companies, want to use it. The question then is whether they’ll turn to Confluent or not.

Probably the first thing to look at is the competition. Since Kafka is open-source, Amazon has created a hosting service for it. They call it “Amazon Managed Streaming for Apache Kafka,” known as Amazon MSK. This was a serious problem for Confluent, so they changed the terms of their open-source licensing to prohibit others from offering certain parts as a service, thereby monopolizing that capability for themselves. This is similar to what MongoDB had to do when facing the same threat from Amazon.

Now, Confluent will always have an advantage of cloud provider independence that Amazon by its very nature does not. Not every company will care, or care enough. So, now it comes down to capability and pricing differences. Confluent’s big addition is their Replicator product that helps manage Kafka implementations spanning multiple data centers. There is a free MirrorMaker product available, but again, it’s harder to set up and use.

For today, I think Confluent should be analyzed on its past business performance and future expectations of that performance. Using Saul’s methods will tell you whether to invest in Confluent or not - see the first post in this thread for that. The slowdown in Q4’20 is concerning to me, as there’s no apparent explanation (Covid would not seem to be a factor given this company’s software business).

That all said, technology changes. Apache Kafka came out of a decade old project at LinkedIn, and the developers created their own company (Confluent) to profit from it. A newer project, Pulsar, came out of Yahoo, and Apache has adopted it as another open-source project. Comparing Pulsar to Kafka requires some heavy technical understanding, but suffice it to say that Pulsar is newer and has some performance advantages, especially for large users needing multi-tenancy, replication across geos, and large data storage. And, of course, the developers of Pulsar created their own company, Streamlio, but rather than go public they agreed to be acquired by Splunk, which is integrating Pulsar tech into its Splunk products and does not appear to be pushing Pulsar hosting as its own product.

Which is probably why one of the founders (and original Yahoo developer) left to form yet another company, StreamNative.

At this point it seems unlikely that Splunk will push Pulsar to the same extent that Confluent pushes Kafka (instead they appear to be integrating advanced messaging into their existing Splunk products), so Confluent may have dodged that competitive bullet.

That said, just a week ago another company, StreamNative, got almost $24M in funding towards its Pulsar-based service (… ). It’s way too early and small to matter to us investors, but it does suggest to me that Confluent will have to come up with something new/additional to survive beyond 5 years or so.


It does suggest to me that Confluent will have to come up with something new/additional to survive beyond 5 years or so.

I disagree, the point I was trying to make is “data in motion” is a field that Confluent is working in just like Crowdstrike works in Security. You don’t expect Crowdstrike to come up with something else, right?

Database companies: Oracle, MongoDB, Redis etc.
Data analytics/warehouse companies: Snowflake, Teradata, Databricks, Oracle
Data in motion companies: Confluent

“Data in motion” is a newer field with huge potential. It has become a necessity for most applications which are being designed for the cloud.

Apache Kafka came out of a decade old project at LinkedIn, and the developers created their own company (Confluent) to profit from it.

Agreed, but in addition to just profiting from it, they did the following:

  • Rearchitected it to be cloud native.
  • REST proxy (HTTP API support for publishing/subscribing to messages)
  • Schema registry (to manage different versions of messages)
  • KSQL (a SQL-like language for working with messages, similar to what databases offer)
  • Various connectors (to make it easy to publish/subscribe to messages from different systems)
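
To give a feel for why “managing different versions of messages” matters, here is a toy Python sketch of the idea behind a schema registry. It is purely my own illustration and not Confluent’s Schema Registry API: the `TinySchemaRegistry` class, the `REQUIRED` marker and the compatibility rule (a new version may only add fields that have defaults) are assumptions chosen to keep the example small.

```python
REQUIRED = object()  # marker meaning "this field has no default value"


class TinySchemaRegistry:
    """Keeps every schema version registered under a subject name."""

    def __init__(self):
        self.versions = {}  # subject -> list of schemas, oldest first

    def register(self, subject, schema):
        history = self.versions.setdefault(subject, [])
        if history and not self.can_read_old_data(history[-1], schema):
            raise ValueError("new schema cannot read messages written with the previous version")
        history.append(schema)
        return len(history)  # version number, starting at 1

    @staticmethod
    def can_read_old_data(old, new):
        # Any field the new schema adds must have a default, because
        # messages written with the old schema will not contain it.
        return all(name in old or default is not REQUIRED
                   for name, default in new.items())


registry = TinySchemaRegistry()
v1 = registry.register("order_placed", {"order_id": REQUIRED, "qty": REQUIRED})
# Adding an optional field is fine: old messages just get the default.
v2 = registry.register("order_placed", {"order_id": REQUIRED, "qty": REQUIRED, "coupon": None})

try:
    # Adding a mandatory field would break readers of old messages.
    registry.register("order_placed", {"order_id": REQUIRED, "qty": REQUIRED,
                                       "coupon": None, "warehouse": REQUIRED})
    rejected = False
except ValueError:
    rejected = True

print(v1, v2, rejected)
```

With many teams publishing and subscribing independently, a check like this is what keeps one team’s message change from silently breaking another team’s service.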

And here are numbers to prove their success (August 5, 2021):

  • Total revenue was $88.3 million, growing 64% YoY
  • Cloud (fully managed) revenue: 22% of Q2’21 revenue | 200% y/y growth
  • On-prem (self-managed) revenue: 67% of Q2’21 revenue | 46% y/y growth

Annual revenue for last 3 years:

FY18  65m
FY19  150m
FY20  237m

Total customers:  2,830
New customers Q2: 290,  104% YOY 
100K+ customers:  617,   51% YOY
1m+ customers:    70,   112% YOY

Here is one of the analyst’s comments: Those changes in NRR (went from 117% to 130%) in one quarter is pretty remarkable. I’ve actually never seen something like that before.

The slowdown in Q4’20 is concerning to me
From their last call, here is the CFO: “I can tell you that we had paused hiring last year prudently because of COVID.” and “We had some churn because of COVID et cetera.” So maybe COVID did affect them.