DeepSeek, Apple, Nvidia

While Nvidia is synonymous with AI hardware, Apple Silicon is making a strong case for its role in the next phase of AI development. Apple’s M2 Ultra, with its unified memory and UltraFusion technology, offers striking cost efficiency for running models like DeepSeek R1: roughly $26 per GB of memory, compared to about $312 per GB for Nvidia’s H100. Apple’s chips may reshape the economics of AI deployment.
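Those per-GB figures roughly check out as a back-of-the-envelope calculation. Here’s a quick sketch, assuming street prices of about $5,000 for a 192 GB M2 Ultra Mac Studio and about $25,000 for an 80 GB H100 (my assumed prices for illustration, not official list prices):

```python
# Rough cost-per-GB-of-memory comparison.
# The prices below are assumptions for illustration, not official figures.
m2_ultra_price, m2_ultra_mem_gb = 5_000, 192   # assumed 192 GB M2 Ultra config
h100_price, h100_mem_gb = 25_000, 80           # assumed 80 GB H100

print(f"M2 Ultra: ${m2_ultra_price / m2_ultra_mem_gb:.0f} per GB of memory")  # ~$26
print(f"H100:     ${h100_price / h100_mem_gb:.0f} per GB of memory")          # ~$312
```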

Rumors of the upcoming M4 Ultra only add fuel to the fire. With 256GB of unified memory and bandwidth nearing 1.2TB/s, Apple is quietly positioning itself as a formidable player in AI hardware.

Someone very close to me works at a major tech company, and yesterday she relayed to me a lot of online chatter about how DeepSeek runs very well on an M2 Mac, causing some upset on the non-Apple side of the industry. There seemed to be a lot of confusion yesterday. Apparently, it was developed on older Nvidia chips, which called into question whether AI has passed a “good enough” plateau in hardware technology. All those questions cast doubt on the big spending in AI and helped drop the stocks of companies heavily involved in AI, especially Nvidia (which is rebounding today).

Apple didn’t crash, though I personally think Apple’s benefit is less about Apple Silicon and more about the idea that maybe Apple was right not to spend way too much on AI. That, and I think AAPL’s decline due to AI (of the Apple kind) and the Chinese market is already baked in. I remain skeptical about the current state of AI, both the Artificial and the Apple kind, despite – or perhaps because of – my own early forays into ML.

-awlabrador


However, the massive black box approach makes it difficult — or impossible — to determine why a particular dataset is prone to hallucinations, and it’s easy to end up with an insufficient model when the dataset is too small. As the recent rollback of Apple Intelligence news summaries shows, simply taking a hyper-focused approach to model training isn’t a magic bullet.

These DeepSeek stories remind me of conversations I’ve been having with one of my sons, who is a CS major in college right now.

I told him that I think one aspect of computer programming can be thought of as a gradual evolution through increasing abstraction: first abstraction from the hardware, then from the electronic architecture, then from the programming language, and so on. Meaning:

  • In “ancient times”, programming mechanical computers (e.g. Babbage’s Analytical Engine) meant making direct changes to physical hardware.
  • Early electronic computing involved, at least in part, rewiring electronic hardware… [Edit: See The Imitation Game.]
  • But electronic programmable computing then involved PEEKing and POKEing at the CPU and at memory locations, like flipping switches on the Altair…
  • But eventually that evolved into programming in machine language or assembly language. That kind of coding, though, was tied to specific CPU architectures…
  • Which led to programming languages (e.g. COBOL, FORTRAN, BASIC, Pascal, C/C++, Java, Python, etc. – and this being an Apple board, Swift), which could be defined independently of CPU architecture and even OS, as long as native compilers, interpreters, or runtime environments were available on the various platforms.
  • Now we have AI, or machine learning (ML), which I’m told is the preferred nomenclature, so that’s what I go with. And AI/ML can be programmed largely independently of any specific language.

I’ve generalized and simplified a ton of things, so don’t bother jumping on my back and nitpicking the details. I’ll ignore you if you do.

My point is that each stage of abstraction means that those deeply involved in a given stage don’t need to worry at all about the preceding stage, but we still need people adept at those more fundamental, less abstracted stages.

For example, a programmer/developer who uses a given language doesn’t necessarily have to worry about CPU architectural details, much less about the actual electronics and layout, but we still need people to design and fabricate CPUs, hardware, etc.

If you’re not familiar with ML, it’s just linear algebra and a little calculus at its most basic level (emphasis on basic; again, don’t bother nitpicking), albeit with potentially huge matrices and data sets, so you could “easily” (if tediously) write your own neural networks and train them, in whatever programming language and on whatever computer you choose. But with ML libraries, the dirty details of neural networks and training are further abstracted and hidden, blissfully, from the developer, freeing the programmer to do a little bit of programming and focus on preparing the training data sets.
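To make that concrete, here’s a toy sketch of what I mean: a tiny neural network written from scratch with nothing but NumPy, trained on XOR. The network size, learning rate, and step count are arbitrary choices for illustration, not anything you’d use in real work.

```python
# A toy two-layer neural network trained on XOR, using only NumPy:
# just matrix multiplies (linear algebra) and the chain rule (a little calculus).
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: XOR inputs and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Randomly initialized weights for a 2-8-1 network
W1, b1 = rng.normal(size=(2, 8)), np.zeros((1, 8))
W2, b2 = rng.normal(size=(8, 1)), np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(10_000):
    # Forward pass: two matrix multiplies plus a nonlinearity
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: derivatives of the squared error via the chain rule
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient-descent updates
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(out.round(3))  # should be close to [0, 1, 1, 0]
```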

(I was at a science conference recently, seeking out fellow scientists who are using ML, all of whom turned out to be far younger than I. One of them and I shared the revelation that “my ML code is just a page long!”)
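(And for comparison, here’s roughly what that page-long, library-abstracted version can look like. This is just an illustrative example using scikit-learn and its built-in iris dataset; the point is that the network internals are hidden and the code is mostly handing data to the library.)

```python
# The library-abstracted version: the neural network internals are hidden,
# and the code is mostly loading data and calling fit/score.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```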

I’ve been telling my son that some day he may be teaching future CS students not how to write programs or algorithms but rather how to train machines. Some of those future CS students may know nothing about any specific programming language.

Which brings me back to this DeepSeek story. If it’s true that DeepSeek is as good as or better than other models running on larger and more advanced hardware, then, IMO, it can only be because it was trained better – better training data sets, better target data, or some such thing. And it’s not necessarily larger data sets any more than it is newer GPUs. Maybe it’s something like human language research, or maybe it’s more directed than just feeding DeepSeek newspaper stories from NYT archives, like sending DeepSeek to some AI university rather than giving it an old encyclopedia.

It also means that at least some of the gobs of money being spent on AI hardware – and Nvidia chips – can and probably should be diverted to AI/ML research at the software, data, and training end.

My two cents.

-awlabrador

Edit: This also reminds me of another conversation or two I had with a former roommate of my other son. That roommate was also a CS major. He noted that, as NNs got larger, they started manifesting new behaviors that weren’t apparent at smaller scales, which is basically (I assumed) how we got to LLMs like ChatGPT and its relatives. One thing that wasn’t explicitly stated in our conversations, but which I unconsciously assumed, is that the manifestations weren’t gradual but relatively sudden. If true, that again potentially indicates a plateau (or plateaus) in capability vs. hardware. Someone with more background and exposure in CS here is welcome at this point to educate me/us further.


Very interesting… I remember my early 6502 Apple ][+ days, the Red Book, etc., so I dabbled a bit in coding, but soon realized I was so far behind them young whippersnappers, who had moved on to Pascal and C, that there was no way I’d ever catch up as just a hobbyist. So I fiddled with Applesoft BASIC for a while and found a bit of fun with it, as well as chasing errors, but my full-time job put me on the road a lot, so it fell to the side. I dabbled a bit in Unix, since our electronic switches were Unix based, so chasing troubles got me into the Unix world a bit, but just enough to sort out what the developer was trying to do and run diagnostics… It is interesting to see these big changes; them youngsters have to be having a lot of fun sorting it out… Way over my head for sure… A long way from our SF Computer Faire days…

Good to see kids chasing CS degrees. I once thought of trying for it, but working full time, night school classes were as far as I got, and then the road beckoned: transfers all over the area, as well as a family to raise and nudge along… Time flies!

Good to see kids chasing CS degrees,…

To preface, this is all just my opinion:

I admit to being ambivalent about CS degrees, at least from four year colleges. I do think there are actual computer science topics that require those CS degrees.

But if you want just to program and either freelance or work for a specific company that, hopefully, knows better, you can either learn programming on your own or go to community college and learn some programming languages. You can be a strong Python or Java programmer without a 4 year degree, IMO.

But, if you want to work in some particular non-CS-specific field, I think you learn that field (e.g. some science or engineering field, or finance, whatever…) and then pick up programming on the side as needed.

I myself write programs for my job all the time, and I never took a CS class in college. Well, I took a computational physics (or was it physical computation?) class once in grad school, but I had to drop it because the TAs never could get the FORTRAN compiler working on their parallel computing machine, and I didn’t know C at the time, so I couldn’t do the homework. But I’ve picked up a bunch of languages, and I use computers, write simulations, do data acquisition and analysis, etc. all the time for my job, with physics degrees instead of CS.

What I don’t want to see is students taking CS as a major, with the dream and maybe intent of writing the next great million dollar computer game, and then working in obscurity doing server support to pay the bills (esp. college loans) and never getting out. IMO, you can do server support without a bachelor’s, if that’s what you want to do. Or even write games without a bachelor’s.

(Yes, getting hired without a bachelor’s can be hard, depending on the company.)

Despite my expressed attitude, though, my son is a CS (and math) major, and we talk about CS topics, as they’re some of the things we have in common, anyway. I’ve told him about the problems I think are interesting in CS, as a science, and maybe someday he’ll take one of them up.

Also, I’ve raised my son to be independent and not worry excessively over what I think, anyway, even though he also takes me seriously and understands me. He follows his own interests.

What’s really amusing, though – and really the reason for this whole post – is that, while still a teenager, he told his friends about my attitude about CS, and word spread without my knowing. I might have tried to stop him, had I known. And then word spread to my adult friends, including at least one who’s got a CS degree.

And we’re still friends. :grinning: I like to think that says a lot about me, my friends, and the friendships we have.

-awlabrador


Success works! With or without the paperwork, degrees, etc. I’d like to have had a better start, but there was zero educational experience in the family at the time, poor counseling, and I got distracted by too much free time, a motorcycle, etc., to where I bailed on college. Eventually I went to a trade school, DeVry in Chicago, and turned that, in the end, into entering the telecom world of WeCo. I spent four decades there: many hours of various training classes, then the field work, putting it all together. Pretty satisfying, interesting times, playing with the latest Bell Labs products… So at now 83, retired since '02, I’m way past going after a CS degree, but I like to see the case for those with the interest… It takes a special talent to come up with actual innovative products, though, just as it was on the job: only a few of the top techs had that extra spark that let them diagnose and find problems, and I found it didn’t work to listen to others; their perspectives sidetracked my own instincts and delayed, rather than hastened, the solutions…

Yep, we all make our own way through this tech world. I don’t know where it’s heading now with AI and ML, but it’s interesting to try to keep up!
