MDB Goes Mobile

Dang Saul, great job on all the ZS notes. Since I wrote up Okta, I was about 70% through writing up a product review & technical deep dive post on ZS, but then got swamped by work and family needs. That ultimately cuts into my investment homework time… then Saul beat me to the punch by posting all his collective notes. He covers the basics of the ZIA and ZPA products well, so I may table that post and write up some takeaways after I have a chance to fully absorb those extensive notes (especially the conf call takeaways).

For now, it’s time for another technical deep dive… let’s switch to MongoDB. A database engine - a subject near and dear to my heart! If you read my ElasticON writeup (post #48019), you know software development is my job, and I enjoy it immensely. I personally work more with Elasticsearch than MongoDB, but am starting to work with it more as I have some use cases that it solves. For the sake of this post, I’ll call the company MDB and the product MongoDB.

MDB just had an acquisition, their third. Their prior one, mLab in Oct 2018, cost $68M, and allowed them to convert customers from an Atlas competitor (managed cloud-hosted MongoDB instances) into Atlas customers. While that last acquisition was a customer and team acquire, they just bought Realm for $39M to acquire product lines that help them jump start their new mobile initiatives. In this prior thread (https://discussion.fool.com/mdb-acquisition-34190957.aspx?sort=w…), Shikotus did a great post breaking down reasons why in non-technical terms, and SteppenWulf then filled in with some technical points about what it means, competition wise.

I want to riff a bit on MDB’s moves into mobile.

MDB OVERVIEW

What has allowed companies like MDB and Elastic (and non-database dev tools like Twilio) to thrive is how embedded they become as tools a software company uses to solve problems. When a developer needs data storage in their application, they pick and choose amongst the available SQL and NoSQL databases and select one that serves their use case best – and it better be one they can learn to use fast and get productive with as quickly as possible. Once that database gets embedded into use within their application and it goes into production, the software company heavily relies on the stability of it. From there, it needs to be able to scale – because as a software company gets more and more successful, their usage of (and dependence on) those databases goes up immensely. And as those companies have success with that new tool, they use it again in other projects or products. Once the tool is embedded in the software architecture, it is usually not going to be removed unless it starts being unreliable, or is not scaling up enough to keep up with the company’s needs.

MongoDB is a document database, which allows you to store and query a collection of data objects. Being a NoSQL store, it excels at being flexible (data properties can differ from object to object) and can easily scale. MongoDB couples easily with modern web & mobile application code, as each piece of data sent or queried is basically a ready-to-use object, whereas in SQL, you would use a data translation layer in between (such as an ORM) to help translate SQL rows to a usable object, and vice versa. In NoSQL, instead of relying on a “common query language” like SQL, a developer accesses the data via simple API methods for querying, inserting, updating and deleting. Document databases have a huge number of use-cases, as they can be used anywhere people have a collection of objects… such as movies in a streaming service, properties on a rental site, books in a library, user reviews or comments, store inventory, etc.
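To make the “collection of objects” idea concrete, here is a minimal sketch in Python. The Collection class is a hypothetical in-memory stand-in, not the MongoDB driver; its method names (insert_one, find) simply echo the style of a document database API. It shows two documents with different properties living in the same collection and being queried without any SQL or ORM layer.

```python
from typing import Any


class Collection:
    """Hypothetical in-memory stand-in for a document collection (not the MongoDB driver)."""

    def __init__(self) -> None:
        self._docs: list[dict[str, Any]] = []

    def insert_one(self, doc: dict[str, Any]) -> None:
        # Store a shallow copy so later changes by the caller don't leak in.
        self._docs.append(dict(doc))

    def find(self, query: dict[str, Any]) -> list[dict[str, Any]]:
        # Return every document that has all the query's fields with equal values.
        return [
            d for d in self._docs
            if all(k in d and d[k] == v for k, v in query.items())
        ]


movies = Collection()
# Documents in one collection don't need identical properties.
movies.insert_one({"title": "Alien", "year": 1979, "genres": ["horror", "sci-fi"]})
movies.insert_one({"title": "Heat", "year": 1995, "runtime_min": 170})

# The result is already a ready-to-use object; no row-to-object translation.
matches = movies.find({"year": 1995})
print(matches[0]["title"])
```

A real document database adds indexing, richer query operators and persistence on top of this, but the shape of the interaction is the same.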

MOBILE DATABASES

The primary methodology used to access SQL and NoSQL databases is the client-server model, where the web or mobile app is a client that makes requests to a database that is hosted on a remote server (either on-prem or cloud-hosted).

But this doesn’t solve every need. Yes, the world is increasingly more connected and cloud-driven, but there are still bandwidth concerns and connectivity issues. Applications are, of course, going to be MUCH faster and MUCH more performant if a subset (or all) of the data an application needs resides directly on the device. Users could perform tasks while off-line, that could be synced back to the cloud when they next come back online.

A different type of access method being used on mobile apps is having a synchronized copy of the database as a mobile database, in order to store the data locally on the phone or device itself, instead of being hosted elsewhere. A mobile database can basically be any of the SQL or NoSQL types, but in order to be usable and maintainable, it has to be light-weight so it doesn’t overpower the device, as devices tend to have less compute and memory, and have battery considerations.

In addition, the database must be geared for synchronizing data between a centralized master server and the local database on each mobile device. In a mobile database, synchronization ultimately needs to support updates in BOTH directions:

  • Master to Mobile = syncing updates from the master database to the embedded database, updating the local data on the device with any inserts or changes they should get a copy of.
  • Mobile to Master = syncing updates from the local mobile database back to the master database.
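The two directions above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any particular product syncs: each record carries a hypothetical integer version counter, and the copy with the higher version simply wins. Real sync engines also have to handle deletes and conflicting edits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    value: str
    version: int  # hypothetical edit counter; higher means newer


def sync(master: dict[str, Record], mobile: dict[str, Record]) -> None:
    """Two-way sync: for each key, the copy with the higher version wins on both sides."""
    for key in master.keys() | mobile.keys():
        m, d = master.get(key), mobile.get(key)
        if d is None or (m is not None and m.version > d.version):
            mobile[key] = m  # Master to Mobile
        elif m is None or d.version > m.version:
            master[key] = d  # Mobile to Master
        # Equal versions: both sides already agree, nothing to do.


master = {"a": Record("server edit", 2), "b": Record("old", 1)}
mobile = {"b": Record("offline edit", 2), "c": Record("new on phone", 1)}
sync(master, mobile)
```

After the call both dictionaries hold the same three records: "a" travelled down to the device, while "b" and "c" travelled up to the master.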

SQLite is the long-time standard for on-device databases. It’s an open-source, light-weight relational “database in a file” that has been around forever, having been long used in embedded devices. It has the same SQL interface as server-hosted databases. It’s slow, but is easy enough to use for on-device database needs. One huge downside for modern architectures, however – it doesn’t have any out-of-the-box synchronization capabilities. Custom add-on solutions have cropped up, like AMPLI-SYNC (sync from SQLite to/from any major SQL database) or LiteSync (sync between SQLite dbs). And proprietary competitors have long been around, like InterBase, and a bunch of long-in-the-tooth bloated crap from Oracle, Microsoft and IBM that no one is using in their modern architectures. Despite all this, SQLite reigns supreme.
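For a feel of what “database in a file” means, here is a small sketch using Python’s built-in sqlite3 module. The table and column names are made up for illustration. Note that everything here is local: there is no server to connect to, and nothing in it syncs anywhere.

```python
import os
import sqlite3
import tempfile

# The whole database lives in one ordinary file (here in a temp directory).
db_path = os.path.join(tempfile.mkdtemp(), "inventory.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE items (sku TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [("A-100", 5), ("B-200", 0)])
conn.commit()

# Same SQL interface you would use against a server-hosted database.
in_stock = conn.execute("SELECT sku FROM items WHERE qty > 0").fetchall()
conn.close()
```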

So the SQL side is covered, though mostly from a long-used light-weight database that has been piece-mealed to work with mobile. But what about NoSQL mobile solutions? As SteppenWulf mentioned, the only major player that has focused on mobile thus far has been Couchbase, who is one of the few document-database competitors to MongoDB that isn’t a cloud provider. Beyond them, a few NoSQL competitors have started to spring up, like the open-source databases Realm and UnQLite.

Speaking of cloud providers, they want you to AVOID using an on-device database, and instead have everything talking to a hosted solution… because having a local database on the device is counter to their existence of providing cloud compute and storage resources. So they are not (yet) much of a factor on this front, though AWS AppSync aims to provide some of the sync features.

Recently, MDB has been moving to fill in the gaps that Couchbase is covering, and now has their own mobile product. But first, let’s talk about the many steps MDB has taken towards the Cloud and Mobile…

CLOUD ATLAS

In 2016, MDB got the bright idea that customers don’t want to do the boring stuff in setting up their own database servers – they just want the features and they want it immediately. No more wasting time with patching, maintaining security and backups, and monitoring and auditing. The amount of time and IT resources this saves companies has to be massive.

So users can now go directly to the source, and have the database provider be the one supporting their database instances. Who knows the database engine better than the ones who wrote it? Atlas supports all the major cloud providers, and, like with web hosting, a customer can get their database hosted on a shared cluster or a private dedicated one, depending on their budget and needs.

We all know about the licensing changes that subsequently occurred. Needless to say, MDB woke up to the fact that AWS and Google were the real competition (to Atlas, and, as we’ll see, to Stitch), not open-source alternatives like Couchbase and Cassandra.

ALL EYES ON THE CLOUD

MongoDB v4 was released less than a year ago in July 2018, during their MongoDB World 2018 conference. This was their first major release under the new licensing. It shows they are making major moves towards Cloud as the future platform.

Besides the much ballyhooed multi-document ACID transaction capabilities (…as well as the new SQL connectors, the new Charts viz tool, and the new pipeline builder…), they made HUGE progress in making MongoDB more applicable for cloud hosted and mobile solutions.

v4 included:

  • MongoDB Stitch serverless platform released.
  • MongoDB Mobile released in beta for iOS/Android devices.

First let’s talk Stitch, as it’s the important linchpin that makes a Mobile version even viable. MongoDB Stitch is a serverless platform for interacting with the database. At its core, it allows direct querying from front-end code (web or mobile app), skipping the requirement of needing a back-end API, plus provides the ability to run code inline, within the database.

Database developers typically control access to the database from a server-side API or service. The traditional software stack is a web or mobile front-end application that talks to an API. That API in turn controls all access to the back-end database engines (which could be Mongo, a SQL DB, Elasticsearch, Redis, etc) and file systems… as well as any corresponding security, logging, monitoring & audit concerns around that.

Serverless platforms allow the front-end application (say, a Javascript web app, or an iOS mobile app) direct integration with the database, so no intermediary API is needed. This has the potential to greatly simplify the amount of infrastructure a SaaS or software company would normally need, as it could completely remove a layer of the architecture stack.

Your app still talks to a database server hosted SOMEWHERE (whether on-prem, or, most likely, the self-managed or managed cloud). But now the app does not need an intermediary API or backend service to access that database. And if you don’t need an intermediary server, why even host the database? If your database is also hosted in the cloud, it makes it so the company doesn’t need ANY servers – it can be a mobile or web app, hosted in the cloud, that just talks directly to the database in the cloud. Truly serverless.

The end result of using a serverless platform like Stitch is a much simpler stack. Benefit 1: The front-end application can talk directly to the database. Benefit 2: Compared to the typical stack above, you may be able to completely eliminate the need for an API layer, especially by using the other features of Stitch (functions and triggers). Benefit 3: Once you no longer need a server to host the API, why not go all the way and stop self-managing and self-hosting your database instance? Use Stitch to talk directly to the MongoDB Atlas service.

STITCHING TOGETHER A PLATFORM

But beyond simplifying the development stack, Stitch also greatly extends the programmability of the MongoDB database, by allowing you to code scripts that can run in any copy of your database - even on Atlas.

  • Stitch Functions = allows you to embed custom code within the database, which can include calling external APIs like Twilio and Slack.

  • Stitch Triggers = allows you to have data changes trigger events, such as running a function to send a Twilio text or email notice when a new customer record is added.

These combine into a potent new interface into the database, and one that greatly simplifies how the data is accessed - which in turn can all combine to completely eliminate the need for an API layer. You no longer need to maintain any of your own infrastructure under this new paradigm. The database itself can be triggering back-end processes, like notifications and emails.
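The functions-plus-triggers idea can be illustrated with a small Python sketch. To be clear, this is not the Stitch API: TriggeringCollection, on_insert and send_welcome are hypothetical names, and the outbox list stands in for a call to an external service such as an SMS or email provider. The point is only the pattern, where the data change itself kicks off the back-end work.

```python
from typing import Any, Callable

Doc = dict[str, Any]


class TriggeringCollection:
    """Hypothetical collection that runs registered functions on every insert."""

    def __init__(self) -> None:
        self.docs: list[Doc] = []
        self._on_insert: list[Callable[[Doc], None]] = []

    def on_insert(self, fn: Callable[[Doc], None]) -> None:
        # Register a function to run after each new document is stored.
        self._on_insert.append(fn)

    def insert_one(self, doc: Doc) -> None:
        self.docs.append(doc)
        for fn in self._on_insert:
            fn(doc)  # the "trigger" firing the "function"


outbox: list[str] = []  # stands in for an external SMS/email API


def send_welcome(doc: Doc) -> None:
    outbox.append(f"Welcome, {doc['name']}!")


customers = TriggeringCollection()
customers.on_insert(send_welcome)
customers.insert_one({"name": "Ada", "phone": "+1-555-0100"})
```

The application code only inserted a record; the notification happened as a side effect of the data change, with no API layer in between.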

Serverless platforms like this provide a huge amount of developer lock-in. There aren’t standards across platforms, so you don’t go switching from one to another without a lot of pain. Do you think Stitch allows you to talk to AWS DocumentDB? Nope. It completely perpetuates lock-in to MDB - either to cloud-hosted Atlas (managed instance) or a self-hosted MongoDB v4 (self-managed instance).

However, cloud providers aren’t standing still. They have AWS Amplify and Google Firebase platforms, that are similar serverless platforms that tie to their own respective cloud services. (And again, they aren’t interested in providing an embedded on-device database - they want you to consume cloud services.)

These serverless ecosystems are the platforms that the next generation of SaaS tooling is going to be built upon. MDB had some catching up to do with the big cloud providers, but with Stitch and Atlas, MDB is skating to where the puck is going to be. This sets them up to directly compete against AWS and Google and Microsoft serverless platforms. How can MDB differentiate? What they do best – helping you solve all your data needs, while being flexible and scalable. Part of that is solving niches that cloud providers can’t or won’t solve.

STITCH IS THE THREAD BINDING CLOUD TO MOBILE

MongoDB self-managed and MongoDB Atlas are great HOSTED database options, ones that must run on a server somewhere. Mobile is altogether different. A mobile database is one that lives on the device itself. It allows for a local copy of a dataset which, as we discussed above, needs synchronization with a master database.

MongoDB v4 included MongoDB Mobile, which aimed at filling this niche and finally catching up to Couchbase’s mobile offerings. It went beta when first released last summer, then went GA back in November. It allows mobile devices to install and use a local instance of MongoDB - the same database a company may already use on the server side. It works through MongoDB Stitch, which can serve as the interface for storing data either to a hosted database (Atlas or self-hosted) or to the on-device database. As of the November GA release, it also has a beta feature called Mobile Sync, which allows for synchronizing changes between a back-end MongoDB and the MongoDB Mobile database.

I found an interesting tidbit on MDB Mobile features page: “MongoDB Mobile uses SQLite as a simple key-value store behind the scenes due to its stability and prevalence on devices.” https://docs.mongodb.com/stitch/mongodb/mobile/mobile-featur…
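To picture what “SQLite as a simple key-value store” could look like, here is a rough sketch in Python. It is only an illustration of the general idea and makes no claim about how MongoDB Mobile actually lays out its storage: the kv table and the put/get helpers are hypothetical. Each document is serialized to JSON and stored under its id, so SQLite never needs to know the document’s structure.

```python
import json
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
# One generic table: SQLite only sees opaque keys and values.
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)")


def put(doc_id: str, doc: dict) -> None:
    # Serialize the whole document to JSON and upsert it under its id.
    conn.execute(
        "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
        (doc_id, json.dumps(doc)),
    )


def get(doc_id: str) -> Optional[dict]:
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (doc_id,)).fetchone()
    return json.loads(row[0]) if row else None


put("review-1", {"stars": 5, "text": "Great", "tags": ["verified"]})
put("review-1", {"stars": 4, "text": "Good"})  # replaces the earlier version
latest = get("review-1")
```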

MDB has clearly been betting big on their new Stitch and mobile capabilities over the past year, but the mobile side still seems pretty rudimentary (mobile db just went GA, and sync still in beta). They need to inject some maturity in their mobile products, so… this new acquisition really comes as no surprise. You’ll never guess what Realm does!

THERE CAN BE ONLY ONE

Realm is a leader in NoSQL mobile databases, with 2 main products: a platform for syncing data, and a mobile database that they bill as an open-source NoSQL version of SQLite. It is a NoSQL document store that is very similar to MongoDB, not to mention also being open-source and free. They label their focus as “off-line first”, which means they are ideal for having the local on-device database be the main database for the mobile app, and necessary updates are sent from and to the master database once Internet connectivity is next re-established.
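A bare-bones sketch of “off-line first”, in Python, under heavy assumptions: OfflineFirstStore is a hypothetical class, the remote dictionary stands in for the master database, and only the upload half of the sync is shown. Every write lands in the local store first, so the app keeps working with no connection, and queued changes are pushed up when connectivity returns.

```python
class OfflineFirstStore:
    """Hypothetical store: local is the source of truth, remote catches up later."""

    def __init__(self, remote: dict[str, str]) -> None:
        self.remote = remote  # stands in for the master database
        self.local: dict[str, str] = {}
        self.pending: list[tuple[str, str]] = []  # writes not yet uploaded
        self.online = False

    def write(self, key: str, value: str) -> None:
        # Always succeed locally, then upload now or queue for later.
        self.local[key] = value
        self.pending.append((key, value))
        if self.online:
            self.flush()

    def read(self, key: str):
        # Reads never touch the network.
        return self.local.get(key)

    def reconnect(self) -> None:
        self.online = True
        self.flush()

    def flush(self) -> None:
        # Replay queued writes in the order they were made.
        while self.pending:
            key, value = self.pending.pop(0)
            self.remote[key] = value


server: dict[str, str] = {}
store = OfflineFirstStore(server)
store.write("note-1", "drafted in the subway")
offline_server_view = dict(server)  # still empty at this point
store.reconnect()
```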

In addition, they have a managed cloud-hosted version called Realm Cloud that has been available since Jan 2018. Realm marketing says you can now use Realm as a “RESTless” middleware layer, aka a serverless platform. So this company is a mini MongoDB for mobile – they have their own MongoDB Mobile & Stitch and can manage and host it for you like Atlas.

Realm has 2B+ installed on-device databases, from 100k active developers across 350 customers. Published customers include Amazon, Google, Netflix, Starbucks, eBay. Not too shabby.

I think MDB was so impressed by their sync features that they bought themselves bolt-on capabilities to integrate into their own platform. And yes, it’s for the team too, as they are the dev team best suited for integrating Realm’s db & sync features directly into MongoDB Mobile and Stitch platforms.

The PR makes it sound like Realm will continue to exist, but I doubt their platform will continue to be a thing… The clear path now for MDB is to use Realm Database on the device (getting rid of SQLite), which much more closely matches MongoDB format. Then, utilize their sync platform to drive how it syncs up with the master MongoDB. MDB is sure to integrate it directly into MongoDB Mobile & the Stitch platform in order to sync to MongoDB self-hosted and Atlas databases. The final step would then be to migrate any Realm customers to MongoDB Mobile.

Bottom line - this was an excellent acquire. Stitch is cementing the cloud-friendly future for MDB and was THE big news in the v4 release. Mobile is a major part of it, and is something that cloud providers can’t provide. They took out a successful competitor in mobile and are using that to overlay their existing product lines.

Mobile databases are starting to enable a new type of mobile app. Apps can be developed that are able to download a snapshot of the user’s data to the device, so the user can always carry it around and access it, regardless of being on wifi/LTE or off-line. This would make things FAST as well, as you aren’t having to download large datasets off the web. Any changes the user makes while off-line can be synced to the ‘home base’ database as needed. Clearly some major app providers are taking advantage of this, after looking at Realm’s customer list.

PR for the acquisition said they will be releasing more details at MongoDB World 2019, coming up soon (mid June in NYC). Hopefully we will start seeing signs of where and how Realm is getting integrated into the MongoDB ecosystem that v4 got started on (Stitch, Canvas, etc).

It’s an exciting time for app development, as cloud platforms have really enabled nimbleness and flexibility while enabling massive scale. I see MDB as making the right moves at the right time. They have been busy stitching together a cloud- and mobile-focused platform, and then enhanced it greatly with this acquisition. They join with cloud providers in having a serverless platform for their customers to better tie application development into their services. That means an ecosystem… and that means lock-in. And when you are talking development lock-in, you generally mean a deep integration that isn’t going to be replaced easily.

-muji
long MDB (11%)

148 Likes

Thank you muji.
bought more this morning

Muji,
Although I am long MDB (12.2%) and I think I understand the basic benefits of NoSQL vs SQL databases, I am having a heck of a time trying to figure out why mobile is important for MDB. Please be patient with me, I’m a chemical engineer not a computer scientist. The question I am trying to answer is what does one use a mobile database for? Is this a salesman accessing a customer database? A drilling rig accessing and inputting geological data for a region? An epidemiologist inputting and extracting data on the spread of disease?

Why is a mobile database useful? Thanks.

Best,

bulwnkl

5 Likes

I can think of one.

My job is to protect a fiber run. As such I get locate tickets (call before you dig). We hold the exact location of our fiber as company confidential, and it carries a homeland security “for official use only” status.

As such we have to use our own maps to locate the fiber and we must do it one ticket at a time.

If we had a mobile database that could populate Apple Maps (Apple actually anonymizes data so that they cannot build patterns from our location searches), we could overlay our fiber locations with our ticketing system information.

By rolling this into a local database on the iPhone, the technician (me) could do his entire job from a phone, and could continue working seamlessly when out of cell service.

These things would improve efficiency and accuracy.

It would also reduce or eliminate the need for a laptop and iPad.

Cheers
Qazulight

9 Likes

I end up in courtrooms with no internet connection and my cell phone blocked. And even when I have good reception the data I want still takes too long to populate in the heat of a trial.

I solve this by having Dropbox local on my MacBook and all my files local on my iPad. It just does not work either way when any latency can cost a client terrible harm to their life.

Dropbox syncs back incredibly well. My iPad software is limited to just that device. That is becoming a market limiter for the company. I believe they are now working on enabling true networking for this product that is awesome but works only local on one iPad and you otherwise sync…like what a dinosaur…through iTunes! And then you have to manually download what was uploaded to a different iPad.

Mobile enables Dropbox-like locality and near instant sync when connected, for use through Mongo. I am sure sales people in the field and who knows who else are using the product for similar latency and sync issues.

What this enables is for this functionality to become native to Mongo.

One may argue that 5g may solve all these problems. I argue back that 5g will make these issues even more pressing and worse.

Thus, at least for me, the need for mobile database that can be local and then synced back to the cloud.

Tinker

5 Likes

Heck, as I am using this iPad program called Trial Pad, a unique piece of software available on no other platform, the one thing that stops me from exclusively using this for trial is the fact that any new documents that come in on my computer need to be manually uploaded to the iPad and then properly marked and filed. Easy if you have a high paid paralegal. But I operate without one, as I have never found the need.

I doubt the software is built on Mongo or even a NoSQL database. It operates very well under an SQL framework. But this Dropbox-like capability is incredibly important for any use that requires 100% uptime and where latency is a game breaker. It is not just one document that comes in a few seconds after you ask for it, but document after document after document. It adds up and becomes untenable, even if you have a great connection.

Obviously large corporations like Coke and Home Depot have the need like I do for a local database to operate on the device and then sync back to the larger database. And that is what this acquisition better enables for MongoDB users, thus making Mongo more likely to be chosen as the database of choice for a project.

Anyways, it has been a big limitation on the best piece of legal software I have ever used. And it only works on an iPad. The frustration is that I only have used it when I get ready to prepare for a trial instead of using it as a regular part of running a case in real time since uploading each new document is just such a pain, and there is no synchronization to a larger database like Dropbox enables.

What this is worth in terms of database? I do not know. I do know it would greatly enhance the value of this software. Software that has disrupted the ability to present at trial. I am sure the details would bore you, but believe it or not such a grand piece of software needs this Remus like capability to make it truly wonderful.

Tinker

2 Likes

bulwnkl -

Hello again… we met 2 years ago at FoolFest, where you told me all about VEEV. And for that, I thank you. (125% gain in 1.5yr when I sold.)

Why is a mobile database useful? Thanks.

Great question. A company making a choice on using a mobile database boils down to this simple question…

Does your application benefit from having a dataset local to the user?

Benefits of a dataset local (on-device) to user:

  • Responsiveness
  • High Level of Privacy
  • On-device Search
  • Minimize Bandwidth
  • Offline Access

Focus areas that can benefit from an on-device local database:

  • Store data from on-device equipment (camera, gps, sensors)
  • Assuring access to assets while offline (mentioned repeatedly upthread)
  • On-device ML (Machine Learning)
  • Edge data collector/monitor (IoT, Bluetooth)
  • Batch updates to the user, instead of real-time querying
  • Local copy of data cache
  • Local copy of user-specific data (user’s “feed”, messages, history, transactions)

Some application ideas that crisscross those benefits/areas:

  • Healthcare apps - Info is private, can keep entire historical record on-device; store data from remote/bluetooth monitoring devices, push to remote systems as needed.
  • Email/Chat/Messaging apps - locally store user messages, pull from remote systems as needed.
  • Social media apps - locally store latest user’s feed, pull from remote systems as needed.
  • Single-player Gaming apps - locally store all game and user data, pull updates/maps/levels from remote systems as needed/connected.
  • Financial/Bank/CC apps - locally store user’s account transactions, pull from remote systems as it updates.
  • Remote Field apps (delivery, service visits, construction) - keep all assets on device, avoid spotty coverage.

Why would you NOT want on-device, and instead must rely on hosted servers (cloud or on-prem)?

Focus areas that require a centralized dataset:

  • Cloud compute at scale
  • Interconnect multiple parties
  • Collective ML
  • Large pool of data
  • Real-time datasets

Some application ideas are forced into a centralized data mode:

  • Multiplayer Gaming apps - track user actions & locations across map in real-time.
  • Ride Share apps - connecting driver to requests in real-time.
  • Stock market apps - user queries & market feeds in real-time.
  • Social Media apps - create user feed from user’s pool of friends.
  • Group Chat - requires real-time interconnection.
  • Web Conferencing - real-time video stream.

If your app or service requires “always connected”, like needing real-time feeds (Stock Market), or interconnecting users (Ride Sharing) or asset streaming (Video-on-demand), that is a pretty good reason to not worry about a local instance of the database. However, the benefits of on-device data can co-exist with remote lookups within the same app, as there are always opportunities to have a local cache of a user’s dataset store on-device.

Lyft needs you connected to pair ride requests with drivers, however your ride history can be stored on-device for responsiveness of listing/searching it, or storing your favorite locations locally so they come up quickly when specifying ride destination.

And regardless, the reverse is always true. If you use a local on-device instance of a database, it ALWAYS has to be fed by a remote master database. A financial app might store a user’s transaction history on-device for listing/searching, but still needs to pull live data in on a regular basis to update the ledger from the master database. Any local changes can get synced in the reverse direction.
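Here is a small Python sketch of that co-existence: a local copy for fast listing, fed by a remote master. Everything in it is hypothetical (the LocalCache class, the fetch function, the ride-history data); it just shows that repeat reads are served from the device, while a refresh pulls the latest from the master.

```python
from typing import Callable


class LocalCache:
    """Hypothetical on-device cache that falls back to a remote lookup on a miss."""

    def __init__(self, fetch_remote: Callable[[str], list[str]]) -> None:
        self._fetch_remote = fetch_remote
        self._store: dict[str, list[str]] = {}
        self.remote_calls = 0  # counts round trips to the master

    def get(self, user_id: str) -> list[str]:
        # Serve from the device when possible; go remote only on a miss.
        if user_id not in self._store:
            self.refresh(user_id)
        return self._store[user_id]

    def refresh(self, user_id: str) -> None:
        # Pull the latest copy from the master database.
        self.remote_calls += 1
        self._store[user_id] = self._fetch_remote(user_id)


# Stand-in for the master database holding each user's ride history.
server_history = {"u1": ["Airport", "Office"]}
cache = LocalCache(lambda uid: list(server_history.get(uid, [])))

first = cache.get("u1")   # one remote call
second = cache.get("u1")  # served locally, no network needed
```

The trade-off is staleness: until refresh is called again, the device keeps showing the copy it last pulled.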

-muji
long MDB

16 Likes

Thank you for your thoughts, muji. I have a bit of a technical question for you. [Non-tech folks, you may want to skip to the last paragraph here where I’ll take it back to investment implications].

Do you think the whole “serverless” thing, aside from being a misnomer, is a bit of a hype? I don’t code anymore, so my understanding may be a bit outdated, but I feel like we’ve been up the road of “let’s reduce 3-tier architecture to 2-tier architecture” before, only to go back again and again. We started out with 2-tier client-server architecture to begin with, of course; then we wanted the flexibility to decouple storage from business logic (i.e. we did not like that very same developer lock-in that you discuss with Stitch), and so the 3-tier architecture got introduced. And then every once in a while we have some people that start whining about too many layers, and try to take the middle tier out again. We saw this with JSPs for example - you had tags in the JSP spec that let you talk straight to the database, but again, it quickly became a “bad practice” to use those. I feel like this “serverless” thing might be nothing more than the latest re-branding of the same “let’s go back to 2 tiers” pattern again. Yet I am still not convinced it’s a good idea. Perhaps the most compelling reason for sticking with 3 tiers is security - which has NEVER been a more burning issue than it is today. With 2 tiers (aka serverless), your database is essentially directly exposed to your front end, eliminating a whole layer of defense. The very capability you describe to embed triggers and functions in the database could be a real nightmare from a security perspective. Whereas with 3 tiers, a hacker compromising a browser or a phone still had to figure out a way to hack through the server API layer before they could get unfettered access to the database, with serverless database technology like Stitch, once you successfully hack the client, the database is basically yours to exploit. That sounds like a real dangerous idea to me.

From your perspective as a technologist, do you feel that something materially changed this time around, and the security and vendor lock-in concerns we’ve always had with 2-tier architecture no longer apply? Or are we just seeing another hype cycle and cool heads will eventually prevail again?

From an investment perspective, this is much more about Stitch than Realm (which is still a great acquisition). I just wonder whether Stitch is really all it’s cracked up to be. Certainly serverless seems all the rage now, but I wonder if we are one massive security breach away from that no longer being the case.

Appreciate your thoughts!

2 Likes

shikotus,
As an old DB person too, I had the same thoughts as you on data security, as well as on enforcement of role-based access control (RBAC). But I also have concerns about data constraints and data integrity. Many applications require constraints and referential integrity to produce accurate and reliable reports. For example, we would not want an invoice without the amount due or customer name field. Or a student class enrollment list with students who have no information record. I think the NoSQL movement is being driven by the big data world and ML and AI. There you have massive amounts of data and inadequate (scalable) resources to correlate and report. The ML/AI apps are much more forgiving and do not need perfectly accurate and constrained data. SQL DBs are not going away. NoSQL is just a way to handle certain data for certain applications.

Here is a video you may have already watched on SQL vs NoSQL.
https://www.youtube.com/watch?v=ZS_kXvOeQ5Y

-zane
long MongoDB

1 Like