Research paper on Upstart's underwriting model

Hi all, this is my first post. I found this board in May this year and have been learning a lot from it ever since. The posts here have really helped broaden my perspective and given me a framework for investing, so I would like to thank you all for contributing your time and energy, and especially our host Saul, who is kind enough to share his extremely successful investing philosophy with us. I am writing my first post to discuss a recent academic paper I came across, which does an in-depth analysis of Upstart's underwriting model. It has some insights into how Upstart's model works that I think some of you may find interesting.

The paper is 61 pages long, but I will try to summarize the important results. I highlighted the key parts in bold, so you can just skim those if you only want the main points. For those of you who want to take a deeper look, the link to the paper is:

The authors of the paper partnered with Upstart to provide an independent analysis of Upstart's underwriting model. The analysis is conducted on a dataset provided by Upstart covering the 900,000 loans it originated from 2014 to 2021Q1. It includes borrower characteristics at the time of origination and monthly loan performance for those loans. The authors also partnered with the Consumer Financial Protection Bureau (CFPB) to develop a benchmark model that reflects the traditional underwriting practices of a bank using credit scores. Using these, the authors compare the performance of Upstart's underwriting model with that of traditional banks. I am generally wary of taking academic research at face value, since so many assumptions and simplifications usually go into these studies that the results are too superficial to apply to real life. However, this one stands out because the dataset and the models come directly from the original sources. Going through the paper, I didn't see anything fundamentally wrong with how they do their analysis, so I do trust these findings to be meaningful. Now, let me get to the actual results.

First, the authors examine whether credit score is a good predictor of default rates for Upstart's loan applicants. Within Upstart's loan sample, they find no relationship between probability of default and credit score for scores below 660. In other words, among the borrowers Upstart lends to, there is little difference in default risk between a customer with a credit score of 600 and one with a score of 660. This suggests that Upstart can pick borrowers with a credit score of 600 while keeping delinquency rates similar to those of borrowers with a credit score of 660. Now that Upstart's partner banks are starting to drop minimum score requirements, Upstart can possibly serve customers with even lower credit scores (deep subprime) without a major increase in delinquency rates.

The authors also describe Upstart's marketing channels and average contract characteristics. As we are all aware, there is some concentration risk in terms of marketing channels. For over 45% of funded loans, the referrer is Credit Karma. Less than 25% are reached directly through Upstart's own website. There are some referrals from LendingTree, Google, Facebook etc., but these are relatively minor. Note that these values are for loans offered before the end of Q1 2021, so some of this is probably not up to date. For 44% of customers, the purpose of the loan is credit card refinancing; for 34% it is debt consolidation. Other reasons like large purchases, home improvement, bills/rent etc. are minor. This shows Upstart is staying true to its objective of helping people with credit card debt.

The average contract has an APR of 22% with a 4-year maturity. The average annual income of borrowers is around $67,000. The average credit score for borrowers is 653 at origination, so Upstart's main customer base is near-prime/prime. Fewer than 25% of borrowers have credit scores below 623, and 75% of Upstart borrowers have a credit score of 683 or less. These numbers might be different now, since the data only run through Q1 of this year. With partner banks dropping minimum credit score requirements, the average credit score and annual income are probably lower now, and the average contract APR is probably higher.

The authors also show a histogram of Upstart's predicted default probabilities for its borrower pool, broken down by credit score. Even for customers with high credit scores, there is some variation in Upstart's predicted default risk. In other words, there are super-prime borrowers to whom Upstart assigns a high probability of default. This group is relatively small, but I think it contributes to the (vocal) minority saying Upstart offers high rates to high-score applicants.

The authors find that Upstart's model is good at predicting default risk, which is not surprising. Upstart's own measure of risk has a monotone relationship with default rates, even in cases where credit score doesn't. This means Upstart's model is much better than the credit score at telling whether a customer is more or less likely to default. The best part is that this performance is not limited to low-credit-score customers. If you look only at a sample of borrowers with credit scores over 700, Upstart's predictions can still find who among those super-prime consumers is more likely to default. This means Upstart offers value even to banks that ultimately want to keep credit score as a hard constraint.

They also build a model to analyze which pieces of alternative information drive Upstart's predictive power. They don't use all the data (only 38 variables), so there may be limitations to their analysis. They find that the top 15 predictive variables include level of education, type of job, loan purpose and some other variables obtained from the credit score. This underlines that credit score does have some predictive power. However, as Upstart also points out, it alone is not sufficient and therefore should not be used as a hard constraint. Depending on the level of education, Upstart's predicted default probability can change by as much as 4.2%. This is significant compared to Upstart's average predicted default probability of 22%. Device type or technology can move Upstart's estimated default probability by about 4.7%, and employment type and loan purpose can each move it by about 2.8%. This means that just these four pieces of data can potentially change Upstart's predicted risk by around 10% or more. Again, it should be highlighted that this analysis doesn't use these variables in the same way Upstart does, and the results are not presented very openly (due to intellectual property concerns), so the numbers may not be entirely accurate.
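To make the "10% or more" arithmetic concrete, here is a minimal sketch in Python. It assumes the four figures quoted above are percentage-point swings in predicted default probability and that they simply add up, which is my simplification and not something the paper claims (the real model almost certainly has interactions). The function names are hypothetical and just for illustration.

```python
# Swings in predicted default probability quoted in the post above.
# Assumption: these are percentage points, not relative changes.
SWINGS_PP = {
    "education level": 4.2,
    "device type / technology": 4.7,
    "employment type": 2.8,
    "loan purpose": 2.8,
}

# Average predicted default probability quoted in the post, in percent.
AVERAGE_PREDICTED_DEFAULT = 22.0


def largest_single_swing(swings):
    """Biggest move attributable to any one variable."""
    return max(swings.values())


def combined_swing(swings):
    """Upper bound on the total move, assuming the effects are purely additive."""
    return round(sum(swings.values()), 1)


total = combined_swing(SWINGS_PP)
share = total / AVERAGE_PREDICTED_DEFAULT

print(f"Largest single swing: {largest_single_swing(SWINGS_PP)} pp")
print(f"Combined swing if additive: {total} pp")
print(f"Combined swing relative to the 22% average: {share:.0%}")
```

Under the additive assumption the four variables together could move the prediction by up to 14.5 points, so "10% or more" is if anything a conservative way to put it. If the effects overlap, the true combined move would be smaller.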

Regarding profitability of loans, they find that low-credit-score borrowers are more profitable. Since low-credit-score borrowers are usually charged a higher APR and Upstart is good at identifying those with low default risk, this makes sense. The authors also find that Upstart's low-credit-score loans originated in 2020 generated returns 1 percentage point higher than those from 2019 and 2 percentage points higher than those from 2018, meaning either (i) macroeconomic conditions are getting more favorable, or (ii) Upstart's continuous improvements to its model are paying off.

One thing worth mentioning is their last analysis, which looks at mortgage loans. They discuss a company called Quicken Loans (since renamed Rocket Mortgage), a fintech mortgage lender that seems to be somewhat comparable to Upstart in the mortgage loan space. The authors find that mortgages originated by Rocket Mortgage generate returns about 12 basis points higher than those of three large banks (Bank of America, Chase, and Wells Fargo), highlighting that mortgage loans can also benefit from the use of alternative data. I am not sure how Rocket Mortgage's offerings differ from Upstart's, but it is worth researching to understand whether Upstart's move into the mortgage loan space is feasible and whether Upstart can offer something Rocket Mortgage doesn't. In any case, I think the analysis of the mortgage space in this paper is not a coincidence; rather, I suspect this research was conducted as part of Upstart's preliminary analysis to evaluate its entry into the mortgage space.

I hope some of you find this useful. There are a lot more small details in the paper, so I'd recommend giving it a quick look if you are interested.




Wow! Thanks so much for sharing this publication and your valuable and impressive insights!

Regarding your mortgage comments, I want to add:

I've looked at NY Fed data (see slides 6 and 7 at…) and it shows mortgage lending to lower-FICO borrowers has fallen over 80% from pre-GFC levels.

An 80% drop!!

The Great Financial Crisis has understandably made banks extremely wary of home lending to subprime scorers. Just look at the orange and blue bars on slide 6 from 2007 onward: a huge fall off the cliff. Millions of low-FICO borrowers have been left out of the market for the past 14 years!

Granted, some of them should never have been offered a mortgage in the first place, but for the true 'hidden primes' who wouldn't actually default, if UPST can replicate its success from personal loans, it can literally re-expand the mortgage origination market MASSIVELY.

So, if UPST can perform in this segment, and UPST retains first-mover status in AI/ML-driven mortgage lending, then this is something that UPST investors and banks will salivate over once everyone recognizes the implications, as mortgages generate substantial income: according to the IMB (…), banks reported a net gain of $3,361 on each loan they originated in the first quarter of 2021.
If banks could increase their home lending volume by 10, 20, or even 30 percent with the use of UPST… it would be incredible.
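For a rough sense of scale, here is a back-of-the-envelope sketch in Python. The baseline of 1,000 loans is a made-up number purely for illustration, and the sketch assumes every incremental loan earns the same $3,361 net gain quoted above, which may well not hold for loans to lower-FICO borrowers. The function name is hypothetical.

```python
# Net gain per originated loan in USD, as quoted above for Q1 2021.
NET_GAIN_PER_LOAN = 3361


def extra_gain(baseline_loans, pct_increase, gain_per_loan=NET_GAIN_PER_LOAN):
    """Additional net gain in USD from growing loan volume by pct_increase percent.

    Assumption: each extra loan earns the same net gain as an existing one.
    """
    extra_loans = baseline_loans * pct_increase // 100
    return extra_loans * gain_per_loan


# Hypothetical lender originating 1,000 mortgages per quarter.
BASELINE_LOANS = 1000

for pct in (10, 20, 30):
    gain = extra_gain(BASELINE_LOANS, pct)
    print(f"+{pct}% volume on {BASELINE_LOANS:,} loans: ${gain:,} extra net gain")
```

So for every 1,000 loans a lender originates today, a 10% lift in volume would be worth about $336,100 in extra net gain per quarter under these assumptions, and a 30% lift just over $1 million.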