Oh crap. Settle in … it’s time for another technical deep dive.
October happens to be Cybersecurity Awareness Month, and as the list of companies I am interested in within this space is expanding, I thought I’d bring you more than you wanted to know about cybersecurity. It gets deep enough on the technical side to explain it to the non-techies out there, then I bring it back around to what our hypergrowth companies are doing in this space and what I see in them.
But even just skimming the surface of network security, it’s a lot to cover. So I have spent a lot of my (not so) free time over the past few months compiling these thoughts about cybersecurity and explaining the multitude of terms and acronyms. I believe it helps to understand the technical history (the way things were and where they are going now), to better understand what is driving the success of our hypergrowth stories in this space.
1 - Intro
2 - Network Basics
3 - Attack!
4 - Why SECaaS?
5 - Flavors of Security
6 - A New Dawn
PART 3 (coming soon)
7 - Hypergrowth
8 - Our Companies
Cold autumn drizzle
Silent ghosts in my network
Never to exit
- muji, TMF Poet Society, October 2019
We live in a connected world. Every company in today’s world MUST have the technical skills to set up, secure, and monitor their day-to-day business operations and company secrets (aka proprietary data: payroll, contracts, intellectual property, supply chain, customer lists). This is why I like to say [repeatedly] that “EVERY company is a tech company” under this connected global economy.
When it comes to network layout, the norm for companies is to have a complex arrangement. A company typically has digital assets & connected hardware (file storage, data systems, POS systems, printers, IoT sensors, “smart” equipment, cameras) within physical locations (offices, factories, mobile fleet, store fronts) containing a workforce that utilizes computing devices (workstations, laptops, tablets, phones) to track business operations (sales, marketing, finances, HR, payroll, operations, accounting, IT, R&D). And today, the definition of “workforce” is greatly expanding, as companies connect not only their employees across all their enterprise locations, but also remote workers, contractors, vendors, and other partners. And it isn’t just physical locations that need securing any more – the workforce is increasingly going mobile.
And then there is a company’s operational infrastructure. A company could be maintaining their own email, HR, payroll, accounting and identity mgmt systems, or, increasingly, they could be using outside SaaS providers. A company may have on-site or remote data-centers that they maintain, or they may use one or more IaaS (Infrastructure-as-a-Service) cloud providers, or some hybrid mixture of the two. And if a company is itself a SaaS company, infrastructure is all the more important as it is also customer facing (hosting web servers and APIs).
And on top of all that networking complexity above, a company needs to SECURE that nest of systems, users, and the processes between them. For that, SaaS services have emerged, providing cybersecurity services to protect your company’s assets, systems and/or employees.
Cybersecurity = Protection of internet-connected systems from cyberattacks.
Security-as-a-Service (SECaaS) = A SaaS company providing some type of cybersecurity service to enterprises.
Because of all the complexity in securing all of the internal pieces above, SECaaS companies are building services that each solve some portion of that complexity. When a customer adopts one of these platforms, the provider becomes embedded into the customer’s daily operations, typically at a cost that is less than the internal resources (systems and staff) required to “do it yourself”.
It is a bit of an epiphany when you finally understand the power behind SaaS enterprise services and their inherent “stickiness”. However, I feel cybersecurity-related services have the HIGHEST LEVEL of stickiness due to the nature of security; the universality of how all enterprises need these solutions (and always will!) is driving much of the hypergrowth here. Yes, one doesn’t necessarily have to follow the technological ins-and-outs to do well (Saul and others here have done very well indeed, even when lacking in-depth tech knowledge and just following the numbers). But I believe a good understanding is critical to knowing where your holdings are in the market, how much is left in their growth, and to spot when those companies are expanding their market opportunities.
Several of our popular hyper-growth stocks are in the SECaaS space – OKTA, ZS, CRWD, ESTC – so it’s time to talk tech a bit to explain how these companies fit into the puzzle of security coverage. As you’ll see, they all complement each other, and there is room in your portfolio for all of them. (Well, at least in mine.) Other SaaS companies followed here that are not directly related to security are benefiting as well – IT services like managed databases (MongoDB Atlas & Stitch, Elastic Cloud), infrastructure monitoring (Elastic Stack, Datadog), endpoint communications (Twilio) and incident response (PagerDuty) are also thriving, in part due to this never-ending need to manage and watch your systems.
Conventional network security has always been built on the assumption that your internal network is a trusted zone, requiring a perimeter be built and maintained to keep the untrusted clients out of it. It’s called castle-and-moat security (if it has a name… up until lately, it’s just been called “network security” because it was the ONLY option). It was typically comprised of multiple per-purpose hardware devices (“appliances”) obtained, set up and maintained by your IT dept.
– Network Layers –
First, let’s talk about what a conventional company’s network layout is typically comprised of, as it helps us understand where attacks are hitting. Mary Meeker had a good slide in her 2019 report about the layers of infrastructure. https://www.bondcap.com/report/itr19/#view/152 [For the Luddites out there, I’m going to use a really basic analogy alongside these terms – your house being a trusted network (castle), the outside walls of your house the periphery (moat), and the outside world as untrusted (public internet).]
Core = A company’s data center(s), comprised of on-premise or cloud infrastructure. Typically where the centralized database storage, file repository, and the compute tasks live (aggregation, search, analytics, monitoring). [These are the rooms of your house, each with its own purpose (each system a room). The enterprise SaaS services you utilize are then like a storage pod you rent and have plopped in your driveway (instant-room), in that you have to go outside to access it.]
Edge = The devices at the edge of the company’s network that control and manage entry into the trusted network. Edge is the gateway used to corral communications from endpoints and allow them access to core. What constitutes the edge varies by industry and purpose of the network - a telecommunications network edge could be a cell tower, while a company’s network edge may be a specific office’s router and firewall. [These are the doors into your house.]
Endpoint = Hardware devices that connect into a company’s network – all the individual computers, laptops, phones, tablets, printers, IoT sensors, cameras, smart meters, POS terminals, etc. These are the devices the workforce is using, or are remote devices collecting data on their own. [These are the people that want to get into the house to do something. How do you tell friends and neighbors from thieves? SECURITY.]
– Network Devices –
There are many common basic network devices, which could be either hardware appliances or software-based. [And I will extend the Luddite analogies from above.]
Network router = Device that moves packets between 2 points in an optimum way. [If your network is a house, think of it as the passageways (hallways and doors) between rooms.]
Network gateway = Device that joins two networks together, serving as a boundary to both. A gateway is a router, but a router isn’t necessarily a gateway. [Think of it as the doorway between the house and outside.]
Proxy server = A gateway device that acts as an intermediary, to prevent direct access from untrusted networks to trusted. For instance, if a company hosts its own web server, a person’s browser is querying a proxy that takes the request and forwards it on the trusted network to the appropriate server, and takes the response and gives it back to the browser on the untrusted network (public internet). [Instead of letting a request from outside into the house, a proxy stands at the door, takes the request, goes and gets the answer from the appropriate room, and takes it back to the door to give the response back to the requester. This helps insulate the core rooms from the outside ne’er-do-wells.]
Firewall = A gateway device acting as a barrier that uses pre-set security rules to control & monitor incoming and outgoing network traffic, typically between a trusted internal network and untrusted external network (public internet). Rules are set as yay/nay (allow or disallow). [This is the locking screen door, letting air in and out (valid requests) but not the mosquitoes (invalid requests).]
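[For the techies: here is a minimal sketch of how yay/nay rules might be evaluated. The rule set, the IP ranges and the `decide` function are all made up for illustration, and the "first match wins, otherwise deny" ordering is an assumption on my part. Real firewalls differ by vendor.]

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str           # "allow" or "deny"
    source: str           # CIDR block the rule applies to
    port: Optional[int]   # destination port, or None for any port


# Hypothetical rule set, checked top to bottom; the first match wins.
RULES = [
    Rule("deny", "203.0.113.0/24", None),   # block one range on every port
    Rule("allow", "0.0.0.0/0", 443),        # anyone may reach HTTPS
    Rule("allow", "10.0.0.0/8", 22),        # SSH only from the internal range
]


def decide(src_ip: str, dst_port: int, rules=RULES) -> str:
    """Return the action of the first rule matching this connection attempt."""
    addr = ip_address(src_ip)
    for rule in rules:
        if addr in ip_network(rule.source) and rule.port in (None, dst_port):
            return rule.action
    return "deny"  # assumed default: anything not explicitly allowed is refused


print(decide("198.51.100.7", 443))  # allow
print(decide("203.0.113.9", 443))   # deny
print(decide("198.51.100.7", 22))   # deny
```

The screen door here is the final `return "deny"`: anything that no rule explicitly lets through stays outside.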
DMZ (De-Militarized Zone) or Screened Subnet = A specialized subnet (separated subdivision of a network) used to isolate external traffic to a different network space than the internal trusted one. It connects to both the trusted network (for communication with core systems like databases) as well as the untrusted network (internet users), such that only the DMZ is visible to the outside world, keeping the trusted network safely isolated. Typically holds web, mail, & FTP servers that must be accessible from the internet. [Think of DMZ like a vestibule or foyer, with a locked door (firewall) to go inside but also a door (firewall) accessible from the outside.]
Edge device = A network gateway that controls access to the trusted network, managing requests and data flows between endpoints and core. It acts as a proxy and firewall to the trusted network. [This is the butler that answers the door and knows who to let in, who to proxy requests for, and who to stop by locking the door.]
So edge devices are gateways that deal with network interconnections, and endpoints are typically remote devices that are connecting to the edge device as a client [knocking on the door and asking the butler to enter]. Cloud computing and IoT have started making the role of edge devices more important, increasing the need for more intelligence (compute) at the network edge. Edge devices can now be polling stand-alone endpoint devices (like IoT sensors) outside the trusted network and acting on them.
Edge computing = Refers to how edge devices that are gathering remote data from endpoints could be doing compute or analysis BEFORE passing the data to core, as opposed to the core doing it. There are pros and cons to this depending on the use case, but typically it leads to faster response and lower latency (edge is closer to endpoints than core) and less network traffic (edge doesn’t have to pass everything to core, it can strip down or anonymize final data). The downside is that an edge device typically doesn’t have the complete picture that the core would, so it is only capable of handling & analyzing its own data subset. [Your butler is sending or receiving messages to the outside to know when a task is done and something needs to be triggered, eg knowing a parcel was delivered outside and triggering someone to go get it.]
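[A rough sketch of the "strip it down before passing it to core" idea. The sensor names, the reading format and the `summarize_at_edge` function are hypothetical. The point is only that the edge forwards a small summary plus any out-of-range readings, rather than every raw data point.]

```python
from statistics import mean


def summarize_at_edge(readings, alert_above):
    """Reduce raw endpoint readings to a small summary before sending to core."""
    values = [r["value"] for r in readings]
    return {
        "count": len(values),
        "mean": mean(values),
        "max": max(values),
        # Only the readings over the threshold are forwarded in full.
        "alerts": [r for r in readings if r["value"] > alert_above],
    }


# Hypothetical readings collected from three IoT temperature sensors.
readings = [
    {"sensor": "temp-1", "value": 20.0},
    {"sensor": "temp-2", "value": 22.0},
    {"sensor": "temp-3", "value": 90.0},
]

summary = summarize_at_edge(readings, alert_above=80.0)
print(summary["count"], summary["mean"], summary["max"])  # 3 44.0 90.0
print(summary["alerts"])  # [{'sensor': 'temp-3', 'value': 90.0}]
```

Note the trade-off from the definition above: this edge device only sees its own three sensors, so it cannot tell whether 90.0 is unusual across the whole company.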
– Ever-Growing Perimeter –
Network security is comprised of products designed to monitor and secure network traffic moving in and out of your perimeter, to stop threats before they materialize. There are many, many companies that provide hardware and software for this. It is not an industry to just blaze into as an investor, as much of it is commoditized, and, given how much existing infrastructure and hardware is already in place, it is not an industry that will be disrupted overnight.
Enterprises have to be ever-vigilant to maintain their perimeter, with a lot of monitoring and implementing of new security approaches. Changes are always made with fingers crossed – yet inevitably, the majority of security actions with a perimeter are reactive, as most of the work is in patching holes that are discovered & containing the damage of any breaches. Cybersecurity is generally a game of “whack-a-mole”. Gartner estimated in 2017 that enterprise infosec spend is 90% prevention and 10% detection.
It’s easy to hit the limits of this perimeter strategy – companies hope to grow, and often need to expand their trusted networks beyond a simple structure (for example, interconnect multiple locations, or add acquired companies into their network, or allow access to a remote or mobile workforce). This leads to a lot of complexity in maintaining security, as the perimeter becomes much larger than a single location and what one set of devices can handle.
Enterprise networks are designed to be “outside-in” – users are moving from the outside (untrusted network) to inside (within the trusted). You need to build the perimeter to keep other things out except for those users you permit. There are several ways to allow outside users and other networks to connect to your trusted network.
VPN (Virtual Private Network) = Extends a private network across a public network (the internet), so that remote users can securely access a trusted network from outside the perimeter. Creates an encrypted tunnel between the end user and the trusted network. This is commonly used to allow remote employees access to a trusted network, in order to access enterprise applications.
WAN (Wide Area Network) = Separate networks joined together as one, regardless of distance. Useful to interconnect various physical locations together (offices, factories). Can use a VPN or set up dedicated connections with a telco/ISP. [This is like an underground tunnel between your house and another house, so you can go from house to house w/o going outside.]
Hub-and-spoke topology = WAN layout having one main hub (primary office) and the rest integrated as spokes (all the other locations) off the hub. All traffic is routed through the hub as spokes intercommunicate. Simplest to set up but has a single point of failure.
Mesh topology = WAN layout having all the networks interconnect with each other directly. More redundant than hub-and-spoke, but harder to build & maintain. Could be utilized over a hub-and-spoke (“Partial Mesh”) to have most important locations meshed and secondary ones spoked.
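[To put numbers on "harder to build & maintain": a quick sketch counting the connections each layout needs. The two function names are mine. The counts are just the standard arithmetic, where a hub-and-spoke with n sites needs n-1 links and a full mesh needs n(n-1)/2.]

```python
def hub_and_spoke_links(sites: int) -> int:
    """Every site except the hub needs one link back to the hub."""
    return max(sites - 1, 0)


def full_mesh_links(sites: int) -> int:
    """Every pair of sites gets its own direct link."""
    return sites * (sites - 1) // 2


for n in (5, 10, 50):
    print(n, hub_and_spoke_links(n), full_mesh_links(n))
# 5 4 10
# 10 9 45
# 50 49 1225
```

The mesh count grows with the square of the number of locations, which is why a partial mesh of only the most important sites is a common compromise.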
– Endless Cloud –
All of these topologies still try to maintain a perimeter around a trusted network. When you have multiple locations, security gets complicated fast, as your perimeter has to extend around the entirety of it. But how do you extend that perimeter out further when your network expands from on-premise data centers to managed infrastructure in the cloud (IaaS), or when your company starts utilizing enterprise SaaS services on a regular basis? It is difficult to maintain a tight grip on your data when you aren’t in charge of the servers it resides on or the network paths it takes. The explosion of cloud infrastructure and SaaS services (driving today’s and tomorrow’s hypergrowth) hinders a company in maintaining a meaningful perimeter.
Companies are starting to leverage cloud infrastructure due to its cost, ease of use, and scalability (it can grow with them and their needs). Costs are minimal compared to the long-term infrastructure costs of IT staff and of buying, housing, securing and maintaining servers and networking gear. Companies can be in any number of phases of cloud-integration.
On-premise = Company that has an enterprise network across one or more locations, and maintains their own servers for application and data hosting, hosted on-premise or in a data center.
Cloud-native = Company whose entire business operations are maintained on cloud infrastructure or from using cloud SaaS services.
Cloud-hybrid = Common combination of the two. Many companies have been on-premise so long (it was the only choice until IaaS services took off) that they are slow to migrate. Companies have a lot of existing infrastructure, so are likely adopting cloud initiatives to start testing the waters.
Cloud-first = Hybrid company that has come to a tipping point, where they will choose SaaS and IaaS solutions over building it themselves or maintaining internal infrastructure. They are not interested in buying any more infrastructure beyond that which they already have in place. Companies can go cloud-first at any time, as they are discovering that the cost of SaaS services for infrastructure and for biz-op services is less than that of buying on-premise software plus maintaining the IT staff and system hardware needed to run that software internally.
Now that systems are being moved out to cloud IaaS platforms, and services are being used from SaaS enterprise services, all those network connections passing data back and forth must be protected. Think about all the SaaS/IaaS services now at play in a given company’s toolkit: Microsoft 365 (Office, Excel, Outlook), Google Docs, Box, Dropbox, Slack, Zoom, Workday, Paycom, ServiceNow, Salesforce, Marketo, Zuora, Shopify, Square, Atlassian, Github, AWS, Azure, Google Cloud. As data transmits to and from those SaaS services throughout the work day, the possibility exists for malicious attachments or activity. You need to protect all traffic between the internal trusted network and SaaS services to ensure data isn’t leaking out and that malicious activity is not coming in. A company has to be sure there are no gaps in their network protection.
When using IaaS infrastructure, the cloud providers are not particularly interested in being the single line of defense, and as such are none too eager in taking the blame. Capital One was just breached by a former AWS employee, and AWS said “it wasn’t us, it was a mis-configured WAF”. The complexity of maintaining security doesn’t go away when you adopt IaaS for infrastructure. When the perimeter expands out to the cloud servers, maintaining, securing and monitoring that perimeter just became a lot more complex.
Incoming network connections (to your apps) must be protected! Outgoing network connections (your employees conducting business) must be protected! Intercommunication to and from SaaS services must be protected! ALL TRAFFIC must be protected!
When you see the adjective “smart” applied to things, as in smart home, smart clothes, smart toys, smart phone – substitute the term “hackable.” They always come together. Kevin Kelly, December 2018 https://twitter.com/kevin2kelly/status/1073103112007634944
Companies may try to buy their way out of harm’s way, yet can never expect to remain breach-free. There is an ever-increasing level of sophistication and coordination in attacks, so staff have to be up-to-date in their security knowledge in order to know what to expect – they have to address the known knowns, as Rumsfeld put it, by buying a lot of security hardware and staff, and by patching and monitoring all those systems continuously. Yet a company cannot help but have gaps in knowledge & expertise within their staff and equipment (the known unknowns), plus new attack vectors are emerging daily from unknown angles (the unknown unknowns). Per IndustryWeek, 2018 saw a massive increase in cyberattacks, so 75% of companies are increasing cybersecurity spend.
It is not a matter of IF you get attacked, but HOW, WHEN, WHERE and WHO. There are many ways for a malicious actor to attack a network, and many possible targets of their attack. Using encrypted traffic and VPNs goes a long way towards mitigating many of these, but it is a CONSTANT battle to keep up-to-date with attack vectors and patching of your computer systems and network hardware. And even then, vulnerabilities can still crop up. For instance, Heartbleed was a major vulnerability discovered in 2014 in OpenSSL, a very heavily used SSL/TLS library, that affected a large share of the internet’s servers. http://heartbleed.com IoT devices in your network are another huge risk. It’s helpful to have cameras, sensors, and devices hooked up to the network, but each is another potential breach point with its own set of exploits.
One enormous setback with the conventional network security model is that “every company is its own island”, meaning every company only sees its own network logs and breach attempts. There are plenty of ways for IT staff to distribute information about attacks and try to keep up to date on security concerns and best practices (newsgroups, blogs, industry groups) - but companies are NOT going to share their security logs and cannot compare notes with others in real-time. It makes for a lopsided battle, where attacks can be coordinated, but the response never is.
– What can you lose? –
Personally Identifiable Information (PII) = Term for any data that could potentially identify a specific individual (name, addr, SSN, DOB) or reveal private info a user may have.
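Protected Health Information (PHI), aka Personal Health Information = PII pertaining to medical data, such as your charts, diagnoses, or genetic makeup.
Payment Card Info (PCI) = PII pertaining to financial payment, typically credit card payment details (CC number, expiry date, security code). [Strictly speaking, PCI stands for Payment Card Industry, as in the PCI DSS security standard that governs this data.]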
Data Breach = Incident in which sensitive or confidential data was illegally accessed and downloaded. Typically involves theft of data with PII, PHI, PCI, or company secrets (like financial data or intellectual property).
Incident Response (IR) = An organized approach to addressing and managing the aftermath of a security breach or cyberattack, in order to handle the situation in a way that contains the attack and limits damage.
– Who is attacking? –
There are a wide variety of possible agents and intents behind a hack attempt:
- Nation states or national governments (cyberwarfare)
- Terrorists (cyberterrorism)
- Industrial spies
- Organized crime groups
- Hackers or Hacktivists
- Business competitors
- Disgruntled insiders
Black Hat hacker = An unethical hacker (malicious actor) that wants to violate systems to steal or to cause harm.
White Hat hacker = An ethical hacker attempting to discover exploits and patch vulnerabilities. Typically has advance authorization to do penetration testing.
Gray Hat hacker = A mix of the two that lives in the middle. Generally means someone who is breaking laws (hacking w/o authorization or notice to the company) but the intent is not malicious. Some companies are offering bounties for any details on how to breach their systems, so it has become lucrative work.
– How do you lose it? –
Attack vector = The path a malicious actor takes into your system, in order to plant malware, steal data, or burrow deeper into your network systems.
Exploits = Known vulnerabilities in software or hardware systems that become easy entries into your computer if left unpatched. Companies are at high risk here and must remain current on their patching of computer and network systems. IoT devices in particular are problematic in that they are more rarely patched (or, worst of all, are not patchable, so exploits can remain exposed forever!). It is also common to forget to change the default administrative password.
Zero Day Exploit = An exploit that is newly discovered and has not yet been patched. Big trouble, plus copycats may appear when it starts getting media attention. Even more trouble, however, are the exploits that have NOT been discovered!
Shadow IT = Software exposed to the internet that IT doesn’t know about, or the use of unauthorized cloud apps. Hard to patch things when IT is unaware of them being there.
Social engineering = Use of deception to fool employees into divulging information or granting system access that they should not (aka the human element). Beyond network security, the workforce itself is a security factor, where you need to provide education on being security conscious and aware of the threats (employee could click on malware, or could accidentally provide PII or credentials on the phone to a deceitful caller). And attacks aren’t solely from the outside of your network - employees can have malicious intent. Companies have to remain diligent in their security efforts, and can never let their guard down.
Advanced Persistent Threat (APT) = A prolonged and targeted attack that gains access to your trusted network, and potentially remains undetected for a long period of time. A ghost in the machine (as mentioned in my haiku), that is covering its own tracks. The focus of an APT attack is more about monitoring your network and stealing data than it is causing damage (which is likely to draw notice).
Distributed attack = A coordinated attack from multiple nodes across one or more compromised networks. This allows malicious actors to flood servers with requests.
Distributed Denial of Service (DDoS) attack = A coordinated, distributed attack against your web services or servers, in order to disrupt normal traffic – most likely to overwhelm a service by flooding it with bogus requests.
Botnet = A collection of connected devices that have been compromised, in order to be under a hacker’s control for DDoS or other distributed attacks. IoT devices with poor security have been making this style of attack easier, and more potent. This allows hackers to greatly multiply their attacks as a sort of malicious super-computer. The Mirai Botnet, set up by a few college kids, is a well-known one.
Brute Force Attack = An attack that attempts to force its way into an account by guessing as many possible combinations of credentials as possible. Think this is hard? A white hat just earned a $30k bounty from Facebook for his incredibly easy method of brute force attacking a 6-digit passcode (a common way to do 2FA to a mobile phone) in the 10 minute time limit that Instagram allows to reset your password. He proved he could hack into ANY Instagram account for ~$150 in cloud resources. (See article at bottom - it’s a good read, and frightening when you realize how easy it can be for a determined hacker to get into your accounts.)
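[Some back-of-the-envelope arithmetic on why a 6-digit code in a 10 minute window is less safe than it sounds. This is a sketch only: the per-source limit of 200 attempts is a number I made up for illustration, and `sources_needed` is a hypothetical helper, not anything taken from the bounty write-up.]

```python
import math

CODE_DIGITS = 6
WINDOW_SECONDS = 10 * 60

keyspace = 10 ** CODE_DIGITS             # 1,000,000 possible codes
rate_needed = keyspace / WINDOW_SECONDS  # attempts per second to try them all


def sources_needed(attempts_allowed_per_source: int) -> int:
    """Distinct sources (e.g. IPs) required if each one is rate limited."""
    return math.ceil(keyspace / attempts_allowed_per_source)


print(keyspace)             # 1000000
print(round(rate_needed))   # 1667
print(sources_needed(200))  # 5000 (with the made-up limit of 200 per source)
```

A million guesses is nothing for rented cloud machines, so the defense has to come from limiting attempts per account rather than per source.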
SQL Injection = Web server attack against database query services that allow for running additional embedded commands against the database. Could possibly allow for viewing or modifying the database, like injecting new account credentials or showing existing ones. Typically caused by bad software development practices.
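[A small, self-contained sketch of the "bad software development practices" in question, using Python’s built-in sqlite3 module and an in-memory database. The table, the account and both login functions are invented for the demo. The first function pastes user input straight into the SQL text, the second passes it as a parameter.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")


def login_unsafe(name, password):
    # BAD: user input becomes part of the SQL statement itself.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_safe(name, password):
    # Placeholders keep the input as data, never as SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


attack = "' OR '1'='1"  # closes the quote and adds an always-true condition
print(login_unsafe("alice", attack))  # True  (logged in without the password)
print(login_safe("alice", attack))    # False
print(login_safe("alice", "s3cret"))  # True
```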
Cross Site Scripting (XSS) = Web server attack that allows a hacker to submit or embed a custom script that other users of the web site may be exposed to. Any website with a forum or comments section has to worry about this.
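[A minimal sketch of why a comments section is risky, using only Python’s standard library. The two render functions are hypothetical stand-ins for whatever a site uses to build its pages. The difference is simply whether the user’s text is escaped before being dropped into the HTML.]

```python
from html import escape


def render_comment_unsafe(text):
    # BAD: whatever the user typed becomes live HTML for every other visitor.
    return f"<p>{text}</p>"


def render_comment_safe(text):
    # Escaping turns markup characters into harmless text entities.
    return f"<p>{escape(text)}</p>"


comment = '<script>alert("gotcha")</script>'
print(render_comment_unsafe(comment))
# <p><script>alert("gotcha")</script></p>
print(render_comment_safe(comment))
# <p>&lt;script&gt;alert(&quot;gotcha&quot;)&lt;/script&gt;</p>
```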
Phishing = A form of social engineering involving widely broadcast emails disguised as legitimate messages (ie a Paypal email that asks you to log in to your account) that attempt to lure you onto a fake website in order to capture your user credentials. [Setting up MFA/2FA on your accounts helps alleviate this threat! Use Authy from Twilio to track 2FA tokens.]
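[For the curious, a sketch of how the rotating 6-digit codes in an authenticator app are typically generated. This follows the time-based one-time password scheme from RFC 6238 as I understand it. The `totp` function is my own illustration, not any vendor’s actual code, and the secret is the RFC’s published test value.]

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, digits: int = 6, step: int = 30) -> str:
    """Derive a one-time code from a shared secret and the current time."""
    counter = unix_time // step  # changes every `step` seconds
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F      # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


rfc_secret = b"12345678901234567890"      # test secret published in RFC 6238
print(totp(rfc_secret, 59, digits=8))     # 94287082 (matches the RFC test vector)
print(totp(rfc_secret, 59))               # 287082
```

Because the code depends on the clock, a password captured by a fake login page is not enough on its own a minute later.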
Spear Phishing = A highly targeted phishing attack against a specific group or individual, instead of being widely broadcast out. “Whaling” is a spear phishing attack against a high-value target, like a CEO or politician.
Business Email Compromise (BEC) or Man-in-the-Email attack = Attack gaining access to a corporate email account, to pose as a higher up in order to entice or threaten employees into performing an action - typically to commit fraud by getting staff to pay bogus invoices or wire money.
Malware (malicious software) = Hidden software planted on your system to capture keystrokes, gather sensitive data or gain access. Common types include viruses (manipulates files), worms (self-replicating), trojan horses (masquerades as legitimate), spyware, ransomware and fileless malware.
Spyware = Malware that allows an attacker to spy on the user, such as a keyboard logger (captures what you type) or camera or mic capture.
Ransomware = Malware that encrypts your files, in order to extort you into paying a ransom to regain access. You can ask several US cities how they feel about that (see NPR article). By May, there were 22 known attacks on US public-sector so far in 2019 (see CNN article).
Fileless malware = Malware that resides entirely in memory (RAM), never writing to disk as a file, in order to evade detection.
Drive-By-Download = Malware downloaded from a compromised website, where a user inadvertently installs it onto their own system.
Malvertising = Online ads that lead to malware installation.
Cryptojacking = Malware to take over your system for its compute power, in order to build a network of systems to mine cryptocurrency on your dime (your hardware and power bill). In mid-2018, 4 of the top 10 malware strains were cryptojacking scripts, including #1 and #2.
Polymorphic malware = A type of malware that constantly changes its identifiable features in order to evade detection. Frequently changes its signature (like having random file names) to evade detection via pattern-matching.
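[A toy sketch of why signature matching struggles here. I am assuming the simplest possible "signature", a SHA-256 hash of the file contents checked against a blocklist. The payload bytes and the `flagged` function are made up, and real scanners use richer signatures than this.]

```python
import hashlib

# Hypothetical blocklist holding the hash of one known-bad sample.
KNOWN_BAD = {hashlib.sha256(b"payload-v1").hexdigest()}


def flagged(sample: bytes) -> bool:
    """Flag a sample only if its hash exactly matches a known signature."""
    return hashlib.sha256(sample).hexdigest() in KNOWN_BAD


original = b"payload-v1"
mutated = original + b"\x00"  # one extra byte, same behavior in this toy case

print(flagged(original))  # True
print(flagged(mutated))   # False
```

One changed byte gives a completely different hash, so the mutated copy sails past the blocklist.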
Wifi spoofing or “evil twin” = Creating a fake wifi network (e.g. “Starbucks-Guest-Wifi”) to fool users into connecting to it, in order to eavesdrop on their network traffic. A company demoed it on 60 Minutes back in 2016 (see article).
Man-in-the-middle attack = Intercepting network traffic by sitting between the two sides of a valid request, impersonating the destination while capturing the steps of entry. Websites typically use the HTTPS protocol now to help thwart this, as it encrypts network traffic.
Replay attack = Eavesdropping on network traffic to capture the steps of entry into systems, in order to replay them to gain entry. [Enabling MFA/2FA on your accounts is a good deterrent to this, as you then need another factor that the hacker cannot capture in order to authenticate. Also helped when websites finally all went HTTPS, so credentials are not being sent plaintext.]
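[Besides MFA and HTTPS, another common deterrent (not one I covered above) is a one-time challenge, or "nonce": the server hands out a random value, the client signs it, and the server refuses to accept the same value twice. A minimal sketch, where the shared secret and all function names are hypothetical.]

```python
import hashlib
import hmac
import secrets

SHARED_SECRET = b"hypothetical-shared-secret"
issued = set()  # challenges the server has handed out and not yet used


def issue_challenge() -> str:
    """Server side: create a fresh random challenge."""
    nonce = secrets.token_hex(16)
    issued.add(nonce)
    return nonce


def sign(nonce: str, secret: bytes = SHARED_SECRET) -> str:
    """Client side: prove knowledge of the secret for this one challenge."""
    return hmac.new(secret, nonce.encode(), hashlib.sha256).hexdigest()


def verify(nonce: str, signature: str) -> bool:
    """Server side: accept each challenge at most once."""
    if nonce not in issued:
        return False
    issued.discard(nonce)  # single use, so a captured response is worthless
    return hmac.compare_digest(sign(nonce), signature)


nonce = issue_challenge()
response = sign(nonce)
first = verify(nonce, response)     # the legitimate login
replayed = verify(nonce, response)  # an eavesdropper replaying the same bytes
print(first, replayed)  # True False
```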
Account hijacking = When an attacker uses stolen account credentials (say, by phishing or keyboard logging malware or replay attack) to conduct malicious or unauthorized activity.
Session hijacking = Compromising your account by using an existing login token taken in a man-in-the-middle attack.