The 'computer vision' myth behind Amazon's "Just Walk Out" stores

The “combination of sophisticated tools and technologies … driven entirely by computer vision” that supposedly keeps track of purchases in Amazon’s “Just Walk Out” stores turns out to be a complete farce, which may be one reason why Amazon is Just Walking Out of its not-so-sophisticated attempt to pull a fast one with a concept it hasn’t really delivered on.

The reality is that nearly 1,000 workers in India were tasked with real-time checking of at least 7 out of 10 purchases in those stores as they happened.

Here’s how Amazon’s site describes it:
"Q: How does Just Walk Out technology work?

Just Walk Out technology uses a combination of sophisticated tools and technologies to determine who took what from the store. When a consumer takes something off the shelf, it’s added to their virtual cart. When the consumer puts the item back on the shelf, it comes out of their virtual cart. After they leave the store, they’re charged for the items they left the store with."

"E-commerce giant Amazon’s “Just Walk Out” technology allowed customers to bypass traditional checkouts at its stores, and the company relied on 1,000 Indian human workers to do the job manually, according to a report in Business Insider.

As many as 1,000 workers in India were tasked with reviewing what customers picked up, and ultimately walked out with from Amazon’s “Just Walk Out”-enabled stores.

The company said that the technology was driven entirely by computer vision. However, a significant portion of “Just Walk Out” sales required manual review by the team in India. In 2022, the report said that 700 out of every 1,000 “Just Walk Out” transactions were verified by these workers."

Here’s Amazon’s version:
"The misconception that Just Walk Out technology relies on human reviewers watching shoppers live from India is misleading and inaccurate," an Amazon spokesperson said via an e-mail statement to USA TODAY. "As with many AI systems, the underlying machine learning model is continuously improved by generating synthetic data and annotating actual video data.

"Our associates validate a small portion of shopping visits by reviewing recorded video clips to ensure that our systems are performing at our high bar for accuracy, which is made possible because we continuously improve both our algorithms and use human input to correct them.”

So basically the Amazon model was the Theranos version of retail. Hype and talk, and mysterious things going on behind the curtain.

Sounds very USian, these days.

What if the only part of this that is disappointing is that they were reviewing 70% of something?

  • How much more would it require to review 100%?
  • Maybe this is just an offshoring story hidden behind a thin veneer of technology.

Retail work is generally low pay, and often low in other ways, but it provides a lot of jobs. If it were actually possible to off-shore a significant portion of those jobs it would primarily hit a very vulnerable group with little or nothing to fall back on.

Yeah, I wasn’t being positive one way or another about that idea; I think ideally Amazon would have wanted to see the percentage of somethings being reviewed shrink as their technological efforts improved…

  • If the “tech” had improved to the point where only 10% of things were being reviewed
  • And, what about the ‘five-finger discounts’ that were never being reviewed?

I’m reminded how early car-rental websites basically offered all these dropdowns and choices on the web but then really just sent an email to people who then keyed the reservation into the backend systems… but “this” doesn’t seem to have improved in enough ways to make it viable.

Which would have been perfectly understandable: all they had to do was say this was a pilot project / proof of concept / etc. in which the ‘computer vision’ systems were being manually supervised and trained to progressively replace the backend team.

But of course they thought they could get away with the sophisticated B.S.

I understand the need for helping machines learn over time, perhaps weaning off the need for human eyes in the process.

I found the Just Walk Out shopping experience to be an improvement over other stores. Initially it felt weird, like I was stealing. But the charge to my Amazon account came so quickly that it alleviated that issue.
The purchasing experience is so fast, which is great in airports.
I seriously doubt anyone had time to review the majority of my purchases before or during my checkout, but they could have been back-checking to make sure everything in my purchase was accounted for.

It seems like they could tell from inventory how many of each item were taken, then correlate that with the output from the computer vision system. With videos they could probably figure out who took something that wasn’t charged, and charge them after the fact.
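
To make that concrete, here is a rough sketch of the inventory cross-check in Python. Everything in it is an assumption for illustration: the function name `reconcile`, the item names, and the counts are hypothetical, and nothing here reflects how Amazon's systems actually work.

```python
from collections import Counter

def reconcile(shelf_before, shelf_after, charged_items):
    """Compare what physically left the shelves with what was billed.

    All three arguments map item name -> count (hypothetical data).
    Returns (uncharged, overcharged) as plain dicts.
    """
    before = Counter(shelf_before)
    after = Counter(shelf_after)
    charged = Counter(charged_items)

    # Items that left the shelves between the two inventory counts.
    taken = before - after
    # Taken but never billed (Counter subtraction drops zero/negative counts).
    uncharged = taken - charged
    # Billed but apparently still on the shelf.
    overcharged = charged - taken
    return dict(uncharged), dict(overcharged)

# Hypothetical example: 3 boxes of peas left the shelf, only 2 were billed,
# and one soda was billed although none left the shelf.
uncharged, overcharged = reconcile(
    {"peas": 10, "soda": 5},
    {"peas": 7, "soda": 5},
    {"peas": 2, "soda": 1},
)
print(uncharged)    # items to look for in the video
print(overcharged)  # charges that may need refunding
```

A check like this only says that something is off; tying the missing item to a specific shopper would still need the video review described above.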

Perhaps this was a sociology experiment to see how many people would try to deceive the system, and perhaps what they found was disappointing. I think the vast majority of people are honest, but it could have been clever thieves that found a systematic way to steal from these stores.

Enjoy,
Brian

That’s very much my assessment, i.e. screening after-the-fact to see if the system is working, but taking the loss as “a learning experience” to train the system to work better. In my darker moments I see the possibility of identifying the person who did the shoplifting, running them through facial recognition software so the next time they came into the store there would be extra resources trained on them. I doubt they would report someone to the police because they stole a $1.29 box of frozen peas.

Okay, I’ve read both articles (now) and let’s just say I find less to be skeptical about here, than you may.

If we just question the statement about 700 out of 1000 somethings being reviewed in 2022, this whole thing goes away entirely, imo. Now, why would I do that? Because I think there’s a question as to the scope over which that 70% was measured. Maybe 70% of all the transactions were reviewed in 2022, or maybe 70% of only those transactions forwarded to the India-located review center were reviewed. Maybe all the one-item sales where the customer was in the store for less than “n” moments of time were not checked and not passed to the India team, because those were presumed to be truthful cases of a customer grabbing one item with ‘purpose’ and going.
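
A quick bit of arithmetic shows why the denominator matters. The numbers below are made up purely for illustration (neither article gives a breakdown like this), and `review_rates` is a hypothetical helper:

```python
def review_rates(total_transactions, forwarded, reviewed):
    """Return the review rate against two different denominators.

    total_transactions: every sale in the stores (hypothetical figure)
    forwarded: sales passed on to the review team (hypothetical figure)
    reviewed: sales the team actually checked (hypothetical figure)
    """
    share_of_all = reviewed / total_transactions
    share_of_forwarded = reviewed / forwarded
    return share_of_all, share_of_forwarded

# Made-up scenario: 1,000 sales, 400 forwarded for review, 280 checked.
share_of_all, share_of_forwarded = review_rates(1000, 400, 280)
print(f"{share_of_all:.0%} of all transactions")        # 28%
print(f"{share_of_forwarded:.0%} of forwarded ones")    # 70%
```

In that made-up scenario a "70% reviewed" headline would be true of the forwarded transactions while fewer than a third of all sales were ever looked at.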

What’s the journalistic quality of “The Business Standard”, anyway? I don’t think I have seen that “masthead” before.

The USA Today story, on the other hand, is built upon the first story; without naming its source(s), it references the rumor/fact and goes on from there, but then pivots to talking about Amazon’s newer effort.

The replacement for “Just Walk Out” seems to be about keeping customers more comfortable (showing them their running ‘ticket’) and about enticing customers with a more interactive, advertising/suggestion-based experience. So, is that proof that the prior solution doesn’t work, or that it doesn’t generate enough sales for the cost to implement, outside of certain places like airports, where they will continue to use it?

Or simply that it’s too hard, fraught with mistakes, which makes it suitable only for “airports” (& similar, if there are such) where the typical purchase is a one or two item stop. And where the margin is enormously enormous.

Maybe someday the computing power or capability will be good enough to bring this back to traditional retail (grocery margins are notoriously thin to start with), but that day is obviously not today.