Question on UPST

Hi All,

I am a shareholder of UPST. I am just trying to get a better understanding of how UPST might fare in a recession or a depression.

As we have heard Saul say many, many times over the past few years, certain companies, like CRWD, NET, DDOG and SNOW, are part of the infrastructure of so many businesses. Thus, in a recession or a depression, their customers just cannot stop using these services. That seems to be a huge selling point for many of these businesses.

When it comes to UPST, how much thought needs to go into something like this: In 2008, 2009, and 2010, as the country was in a huge financial crisis, millions of people lost their homes and their jobs and could not pay their bills. Of course, during those years, the default rate on, say, “personal loans” (which is UPST’s main product right now but may be just one of many products in the future) was much higher than in more normal years.

Is there a possible scenario where, when the next big recession or depression eventually hits, the banks that use UPST’s products to fund more loans are blindsided, take big losses in those lean years, and lose some of their faith in UPST’s products?

Or, maybe this is a better way of saying it: UPST was founded in 2012, in the aftermath of the Great Recession. When they put all the data into all their different AI algos for understanding people’s credit risk, how much are they factoring in the really, really dark times for the economy? Are they using data from just 2012 and forward, or are they using data from 50, 75, or 100 years ago that also takes into account these difficult periods?




I think of them as a usage-based service so it would be the performance of each vertical, weighted by size and averaged together, that would make a difference. E.g. the number of people looking for small loans, the number of people looking for auto loans, future verticals which they plan to enter as fast as they can. The data is probably out there to get a sense of a potential impact. I haven’t gone this deep myself.
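To make the "weighted by size and averaged together" idea concrete, here is a minimal sketch. The vertical names, size shares, and volume changes are entirely hypothetical placeholders for illustration, not UPST figures.

```python
def weighted_impact(verticals):
    """Size-weighted average of the per-vertical change in loan volume.

    `verticals` maps a vertical name to (size, change), where size is any
    consistent measure of the vertical's weight and change is the assumed
    fractional change in volume (e.g. -0.30 for a 30% drop).
    """
    total_size = sum(size for size, _ in verticals.values())
    return sum(size * change for size, change in verticals.values()) / total_size


# Hypothetical inputs only: 90% personal loans falling 30%,
# 10% auto loans falling 10%.
example = {
    "personal_loans": (0.9, -0.30),
    "auto_loans": (0.1, -0.10),
}
impact = weighted_impact(example)
print(f"Blended change in volume: {impact:.1%}")
```

With real numbers for each vertical's size and its sensitivity to a downturn, the same calculation would give a rough first estimate of the overall hit to volume.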


“Are they using data from just 2012 and forward, or are they using data from 50, 75, or 100 years ago that also take into account these difficult periods?”

I think the answers are “mostly from 2014 onward” and “no”.

First of all, the majority of the data that they (and the other loan-based AI companies) use is data relevant to the specific person applying. So they really aren’t using data from “50, 75, or 100 years ago”. It just doesn’t apply as well when trying to make AI-based decisions for another person.

In the 10-K (as well as the S-1) they share a little bit about the data they use, and you can draw some conclusions from that.

They started in 2014 with only third-party data feeding into their basic algorithms and expanded from there. Today they still use third-party data but have significantly augmented it with what they call Training Data Points (the secret sauce for training their AI algorithms).

The third-party data includes “standard credit attributes”, education, employment, and other factors including “macroeconomic signals”. This data will include some applicant-specific history, such as changes to the credit attributes over time, changes in education and employment, salary, prior defaults, etc. This is data that all lenders and algorithms have access to.

The Training Data Points are repayment events from their own data collected on the applicants/users. As of Dec 2020 they had ~10.5M data points providing over 17 billion cells of data (double the figure from 1.5 years earlier). That is over 1600 cells of data per repayment event. They are tracking a great many attributes that probably include things like the timing of repayments relative to the due date, the method of payment, whether the payment was made from a computer or a mobile device, etc. One of the Chinese companies in the same space even used data points such as how quickly you typed on your phone as part of the training data.
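As a quick sanity check on the "over 1600 cells per repayment event" figure, the arithmetic can be reproduced from the two numbers quoted above. Both inputs are approximate ("~10.5M" and "over 17 billion"), so the result is only a rough floor.

```python
# Figures as quoted in the post (both approximate).
repayment_events = 10.5e6   # ~10.5M training data points
data_cells = 17e9           # "over 17 billion" cells, so this is a floor

cells_per_event = data_cells / repayment_events
print(f"About {cells_per_event:,.0f} cells per repayment event")
```

That comes out to a little over 1,600 cells per event, consistent with the figure above.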

As the training data has grown, so has the number of modeling techniques UPST has implemented. The increase in modeling techniques is linked to the increase in their internal training data points, because this is where they can create their moat. They are looking for relationships between behaviors and the probability of default for each specific borrower, which they can then apply to the next applicant, separate from higher-level macroeconomic events. The more data points they have, the more finely they can train the algorithms for better results. And if they are the only ones with access to this data, they stand above the rest.

Success in the AI space requires mainly two things: very good algorithms and lots and lots of data. The more data you have, the more training you can do and the more third- or fourth-level variables you can analyze for meaningful correlation.

Based on this, I believe that while the third-party data will contain some historical data on the specific applicants and on economic conditions, the majority of the analysis is done on more recent data collected in their internal databases at the level of the specific applicant/user. The end result they are trying to achieve is to issue loans with the lowest default rates in ALL economic situations. If they are right in what they are doing, then the impact of the next recession will be lower on their approved loans than on the industry as a whole. But only time will tell.



  • no position in UPST at this time, but getting close