Snowflake and MongoDB

I still have some MDB shares, which went up last week, perhaps in anticipation of good earnings in a couple weeks.

I’ve been a big fan of the product in the past, especially Atlas (which has been achieving something like a 60% revenue growth rate) but I think that people setting up new applications will more and more turn to cloud-first solutions like Snowflake.

Yes, Atlas runs on the cloud, but if you have a data analytics use case (and who doesn’t?), then the pre-processing needed for Mongo (and other NoSQL DBs) is a major downside. Snowflake not only supports unstructured or lightly structured data, it also supports SQL with extensions for both relational and NoSQL access.

I haven’t dived in deeply yet, so am wondering if anyone here has some experiences or opinions to share. Right now I’m considering selling my remaining position in MDB to redeploy elsewhere. I have a tiny, tiny SNOW position, but not adding to it at current levels.

19 Likes

Smorgasbord1:

In reading a recent Seeking Alpha article on MDB, I found some of the comments under the article to be helpful in answering your questions and also as additional support for my MDB investment position (6%).
FWIW, my MDB investment has always been predicated on the enormous TAM, their leadership position in NoSQL, the stickiness of their product, the quality of their CEO/team, and the continuing growth and adoption among enterprises as an alternative to and replacement for SQL/Oracle, etc… I have been in MDB for 2 years at an avg cost basis of $71. With the stock now at $264, I am still bullish for considerable share growth from here.

Here are the quotes from the Seeking Alpha article you may find interesting:
-“The author is missing that MongoDB used to be an on-prem database, and that Atlas, their SaaS cloud version is growing at 60% y/y and will soon be their primary revenue driver. And following their DB-Engines rankings & usage, they are just starting to hit mainstream adoption, meaning growth will accelerate. IMO, MongoDB is the clear leader in next gen NoSQL databases. The emerging architecture that’s replacing old Oracle stacks is MongoDB + Postgres/Redis for front end transactions and something like Snowflake/Redshift in the back end for warehousing & analytics. And once customers build their software on this foundation, it’s as sticky as it gets since switching costs involve redeveloping. Unless another technology disrupts it, MongoDB will have a long, long runway of growth ahead of it.”

-“MongoDB is winning more and more deals as a General Purpose Database at huge companies and competing head to head with Oracle, IBM, and Microsoft as legacy players and the cloud native databases from AWS and GCP. Growth might be lumpy from big deals, but the runway is super long for MongoDB.”

https://seekingalpha.com/article/4390763-mongodb-reality-get…

Hope this is helpful.

Fool on!
Rockleppard
Long MDB

33 Likes

Thanks for the reply, but neither the article nor the comments are particularly helpful to me.

For instance this from the comment you highlighted:
“The emerging architecture that’s replacing old Oracle stacks is MongoDB + … Unless another technology disrupts it, MongoDB will have a long, long runway of growth ahead of it.”

That Mongo is replacing other databases shows that Mongo is itself replaceable. And my question is precisely whether Snowflake is that disruptive technology.

Yes, I know today the emphasis is on integrations with Mongo at the front-end for data input and then periodic transfer of data to Snowflake (Snowflake had a blog post on this over 4 years ago), but I can’t help but feel that Snowflake doesn’t actually need something as heavy-weight as Mongo for a front-end.

And while Mongo is a great NoSQL database, the data prep processing times needed to perform analytics (MapReduce, anyone?) are often not tolerable in today’s faster-paced world. Snowflake has a real advantage there. And Snowflake supports traditional relational data as well as it supports unstructured, NoSQL-style data. So things like ACID don’t need special handling like they do in Mongo. And while MongoDB is NoSQL (only), it wasn’t really built for the very large datasets that Hadoop was built for, and for those Snowflake is, in my limited understanding, clearly better.

Hadoop vs Snowflake: https://community.snowflake.com/s/article/Hadoop-Vs-Snowflak…

Snowflake’s ability to handle document-oriented data (Mongo’s specialty) is described in gory detail here: https://www.snowflake.com/wp-content/uploads/2015/06/Snowfla… It uses what they call “schema-on-read” technology, which differs in that the schema is defined in the SQL statement itself rather than when the data is stored. So you get the benefits of storing JSON (simply tagged) data along with the benefits of using SQL statements for access.

Here’s Snowflake’s summary: Snowflake’s architecture makes it possible to query semi-structured data and structured data together using SQL. You can join, window, compare and calculate structured and semi-structured data in a single query. This makes it possible to eliminate extra systems and steps while realizing superior performance, simplifying data pipelines and reducing the time from when data is generated to when it can be accessed and analyzed.
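To make "schema-on-read" a bit more concrete, here is a minimal Python sketch of the idea, not Snowflake's actual engine or SQL syntax. The raw JSON strings, the `devices` lookup table, and the helpers `get_path` and `avg_temp_by_site` are all made up for illustration. The point is that nothing about the documents' shape is declared when they are stored; the field paths and types are named only at query time, and the result is joined against an ordinary structured table in the same pass.

```python
import json

# Raw events stored as-is; no schema was declared when they were written.
# (Hypothetical sample data for illustration.)
raw_events = [
    '{"device": "d1", "reading": {"temp_c": 21.5}, "ts": 1}',
    '{"device": "d2", "reading": {"temp_c": 19.0, "humidity": 40}, "ts": 2}',
    '{"device": "d1", "reading": {"temp_c": 22.5}, "ts": 3}',
]

# An ordinary structured table (hypothetical): device id -> site name.
devices = {"d1": "warehouse", "d2": "office"}


def get_path(doc, path, cast=None):
    """Walk a dotted path through nested dicts; return None if any key is missing."""
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return cast(node) if cast is not None else node


def avg_temp_by_site(events, device_table):
    """The 'query': the schema (paths and types) is stated here, at read time."""
    totals = {}
    for line in events:
        doc = json.loads(line)
        site = device_table.get(get_path(doc, "device"))   # join to structured data
        temp = get_path(doc, "reading.temp_c", float)       # typed field access
        if site is None or temp is None:
            continue
        total, count = totals.get(site, (0.0, 0))
        totals[site] = (total + temp, count + 1)
    return {site: total / count for site, (total, count) in totals.items()}


result = avg_temp_by_site(raw_events, devices)
print(result)
```

Note that the second event carries an extra `humidity` field the query never mentions, and that causes no trouble: fields the query does not ask for are simply ignored.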

In layman’s terms, Snowflake gives a single place (Data Lake) to store all your data for quick access.

Here’s a 2.5-minute video to watch: https://www.snowflake.com/workloads/data-lake/?wvideo=eudbyp…

So, my question is: given that MongoDB’s future is its Atlas cloud product, why would developers not choose the cloud-first, more data-structure-agnostic, and faster-performing Snowflake database instead, and what does that mean for MongoDB’s future?

17 Likes

https://www.linkedin.com/pulse/snowflake-cloud-scale-data-wa….

Everything I am finding suggests using Snowflake as an OLTP engine would be a bad idea.

“Snowflake stores data in contiguous units of storage called micro-partitions. Micro partitions are immutable which means they follow “write once” and “read many” approach. The rows in a micro partition are stored in a columnar fashion. The smaller size of the micro partitions and columnar structure helps granular pruning of large amount of data stored in a large number of partitions. But what does it mean for data insertion? Since it uses MVCC and is immutable, any insert or update actually copies the entire partition. This feature helps in time travel and even undropping a table. However due to this feature a single row DML is not going to scale efficiently. If you have a 16MB partition with 100,000 rows, it will take the same time to update a single row as it will take to update all 100,000 rows in a partition.”
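The cost argument in that quote can be sketched with a toy model. This is only an illustration of copy-on-write over immutable partitions, under the assumption that any partition containing a changed row is rewritten in full; the class `ImmutablePartitionTable` and its `rows_rewritten` counter are invented here and say nothing about Snowflake's real internals.

```python
class ImmutablePartitionTable:
    """Toy table whose partitions are immutable tuples (hypothetical model)."""

    def __init__(self, rows, partition_size):
        self.partitions = [
            tuple(rows[i:i + partition_size])
            for i in range(0, len(rows), partition_size)
        ]
        self.rows_rewritten = 0  # how many rows were physically copied by updates

    def update(self, predicate, new_value):
        """Apply new_value to matching rows; touched partitions are rewritten whole."""
        new_partitions = []
        for part in self.partitions:
            if any(predicate(r) for r in part):
                # A partition cannot be edited in place, so build a full replacement.
                rewritten = tuple(new_value(r) if predicate(r) else r for r in part)
                self.rows_rewritten += len(rewritten)
                new_partitions.append(rewritten)
            else:
                new_partitions.append(part)  # untouched partitions are reused as-is
        self.partitions = new_partitions


rows = list(range(200_000))

# Update exactly one row in the first partition.
single = ImmutablePartitionTable(rows, partition_size=100_000)
single.update(lambda r: r == 42, lambda r: -r)

# Update every row in the first partition.
bulk = ImmutablePartitionTable(rows, partition_size=100_000)
bulk.update(lambda r: 0 < r < 100_000, lambda r: -r)

print(single.rows_rewritten, bulk.rows_rewritten)
```

In this model both updates copy the same 100,000 rows, which is the sense in which a single-row change costs as much as changing the whole partition. Batched writes amortize that cost; row-at-a-time OLTP traffic does not.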

6 Likes

That Mongo is replacing other databases shows that Mongo is itself replaceable.

If A is more capable than B, then A can replace B, but B can’t replace A unless the requirements for use are limited to what B can provide.

My perception of these two products is that they have quite different purposes. It has been true for years that companies would often have one database which supported the primary transactional work of the company and another database which was used for analysis. Sometimes this might mean both were of the same technology, but the analysis one would be very differently structured. Often, two different technologies were used.

Here, Mongo seems to me the database one would use to support the company’s business transactions and Snowflake is the one one would use for analysis, at least in those cases where the analysis cannot be adequately done on Mongo.

BTW Progress DataDirect makes connectors for both Mongo and Snowflake so that they can be queried with SQL. https://www.progress.com/datadirect-connectors

6 Likes

ritchy writes:
Everything I am finding suggests using Snowflake as an OLTP engine would be a bad idea.

tamhas writes:
Here, Mongo seems to me the database one would use to support the company’s business transactions and Snowflake is the one one would use for analysis, at least in those cases where the analysis cannot be adequately done on Mongo.

If you’re actually doing transactions (OLTP stands for Online Transaction Processing, btw), then would you really be using a NoSQL database? Aren’t relational databases typically used when read/write/delete/insert/modify transactions (the bread and butter of OLTP) are needed?

I could see using MongoDB for the front-end gathering of data from lots of IoT devices. Batch those up and send them over to Snowflake for long-term storage and data analysis. You don’t change what an IoT device reports, so there aren’t any modifications or logic needed (unlike a bank account, where you want to be sure someone doesn’t withdraw more than they have). With IoT devices you just collect the data and that data doesn’t get modified.
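As a rough sketch of that collect-then-batch pattern (assuming nothing about either product's API), the front end only needs an append-only buffer that ships a full batch downstream. The `BatchingCollector` class, its `sink` callback, and the sample readings are hypothetical; in practice the sink would be whatever bulk-load path the warehouse offers.

```python
class BatchingCollector:
    """Append-only buffer that hands full batches to a downstream sink (toy model)."""

    def __init__(self, batch_size, sink):
        self.batch_size = batch_size
        self.sink = sink        # callable that receives one tuple of readings per batch
        self.buffer = []

    def record(self, reading):
        """Readings are only ever appended, never modified after the fact."""
        self.buffer.append(reading)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Ship whatever is buffered as one immutable batch, then start a new buffer."""
        if self.buffer:
            self.sink(tuple(self.buffer))
            self.buffer = []


shipped = []  # stands in for the analytics store's bulk-load endpoint
collector = BatchingCollector(batch_size=3, sink=shipped.append)

for seq in range(7):
    collector.record({"device": "d1", "seq": seq})
collector.flush()  # ship the final partial batch

print([len(batch) for batch in shipped])
```

Seven readings with a batch size of 3 go out as batches of 3, 3, and 1. Since nothing here needs updates, locking, or validation logic, the open question is how much database the front end really needs for this kind of workload.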

But, even here, is MongoDB overkill/too expensive to be the front-end gatherer of data?

Looking at how MongoDB itself advertises (https://www.mongodb.com/use-cases), these are the Use Case categories they tout:
* Single View (real-time views of all your most important data)
* IoT (analyze and act on data from the physical world)
• Mobile (mobile app development made fast & easy)
* Personalization (relevant content presented to all your users)
* Catalog (product catalogs, asset management, and more)
* Real-Time Analytics (analytics at the speed of your data)
* Content Management (store, edit, and present all types of content)
* Mainframe Offloading (move workloads off the mainframe)
• Gaming (video games that are global, reliable, and scalable)
• Payments (modernizing your payments)

I used a * to indicate Use Cases that I think overlap with Snowflake.

BTW, thanks, I appreciate the discussion, and hope others are finding it useful.

7 Likes

If you’re actually doing transactions (OLTP stands for Online Transaction Processing, btw), then would you really be using a NoSQL database? Aren’t relational databases typically used when read/write/delete/insert/modify transactions (the bread and butter of OLTP) are needed?

If I am doing it, then probably yes since I am a dinosaur who spent the bulk of his professional life on relational databases … or before relational databases. But, my understanding is that these days it depends in part on the kind of task one is managing. If it involves a lot of unstructured or loosely structured data, then a document database might be just the thing.

As for your use cases, I think a number of the ones you have starred are not really overlaps because one would use Mongo or Snowflake to address a different part of the requirement. E.g., IoT, one might use Mongo to handle the incoming data stream and organize it, but then use Snowflake to analyze it.

1 Like

I know this is slightly off-topic. I’m very interested in SNOW but came across an interesting feature that Azure Synapse has called Azure Synapse Link. They are billing it as a way to NOT need ETL and to get real-time analytics thru Azure’s SNOW competitor. For now, it works with Azure’s CosmosDB (their MongoDB competitor). Microsoft is promising it will have connectors to other platforms. Basically, when inserting into CosmosDB, Cosmos will real-time replicate the transactions to a column-store copy (more efficient for analytical workloads), which is then used by Synapse Link to directly connect to the data for real time analytics.
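The general idea of writing once and keeping a column-oriented copy in sync can be sketched like this. To be clear, this is a toy model of the replication concept only, not how Synapse Link or CosmosDB is implemented; the `DualStore` class and its methods are invented, and it assumes every record has the same fields.

```python
class DualStore:
    """Toy store that keeps a row-oriented copy and a column-oriented copy in step."""

    def __init__(self):
        self.rows = {}      # row store: id -> full record, good for point lookups
        self.columns = {}   # column store: field -> list of values, good for scans

    def insert(self, row_id, record):
        """One write from the application; the column copy is updated alongside it."""
        self.rows[row_id] = dict(record)
        for field, value in record.items():
            self.columns.setdefault(field, []).append(value)

    def get(self, row_id):
        """Transactional-style access: fetch one whole record."""
        return self.rows[row_id]

    def column_sum(self, field):
        """Analytical-style access: scan a single column, ignoring all other fields."""
        return sum(self.columns.get(field, []))


store = DualStore()
store.insert(1, {"amount": 10, "qty": 1})
store.insert(2, {"amount": 5, "qty": 3})

print(store.get(2), store.column_sum("amount"))
```

The analytical query reads only the `amount` column and never touches the row copy, and no separate ETL job was run to get the data there.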

I’m not sure if SNOW has the same capability. If not, this could be a challenge for them going forward as the idea of not needing to transform/move the data with ETL processes could save companies a ton of time and money.

Having said that, I like what you’re thinking: if SNOW could be used for OLTP, then that could also remove the need for expensive ETL. I have seen some documentation where SNOW mentions that, but I’m not sure it’s really an option.

…one would use Mongo or Snowflake to address a different part of the requirement. E.g., IoT, one might use Mongo to handle the incoming data stream and organize it, but then use Snowflake to analyze it.

Right, that’s part of my question. Would one really choose Mongo as the front-end database? Is the cost/benefit ratio for data ingestion worth it for Mongo over other choices? What about other types of data for which there’s a known schema and so one can choose a relational database?

I’m not so much trying to justify SNOW as I am wondering how SNOW impacts MDB.

1 Like