DeepSeek - Why It Matters
- mlakas1
- Jan 29
- 3 min read

(Note: I wanted to finish this tomorrow (January 30th), but I heard a rumor that o3-mini is gonna drop. This stuff moves fast.)
Last Thursday, a Chinese company named DeepSeek released its latest AI model, DeepSeek R1. This is a reasoning model in the same class as OpenAI's o1: AI that works through its answers step by step using a process called Chain of Thought (CoT).
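To make the CoT idea concrete, here's a toy sketch of the difference between a plain prompt and a chain-of-thought prompt. The function names are mine, invented for illustration; the "think step by step" nudge is a well-known prompting trick, not anything specific to DeepSeek.

```python
# Toy illustration of Chain of Thought (CoT) prompting: instead of asking
# the model to jump straight to an answer, you invite it to show its work.

def direct_prompt(question: str) -> str:
    """A plain prompt: the model answers immediately."""
    return f"Q: {question}\nA:"

def cot_prompt(question: str) -> str:
    """A CoT-style prompt: the trailing phrase nudges the model to
    write out intermediate reasoning before its final answer."""
    return f"Q: {question}\nA: Let's think step by step."

q = "A train leaves at 2pm going 60 mph. How far has it gone by 5pm?"
print(direct_prompt(q))
print()
print(cot_prompt(q))
```

Reasoning models like R1 essentially bake this behavior in: the step-by-step deliberation happens whether you ask for it or not, which is exactly what I saw in my testing below.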
DeepSeek seems to have come out of nowhere. It’s a startup founded by engineer Liang Wenfeng, who has organized his team around knowledge advancement, innovation, and creativity. In doing so, they appear to have caught the reigning AI titans flat-footed.
DeepSeek R1 is on par with the very best AI models available. Allegedly, it cost only $5.6 million in training compute. It is also important to note this was done on chips handicapped by the Biden Administration's export restrictions. Compare that to GPT-4, which, according to CEO Sam Altman, cost more than $100 million to train.
At first glance, this all sounds too good to be true. Maybe it is? But let’s look at what we know.
Putting R1 Through Its Paces…Does It Work?

I had a chance to test R1 before it caught on (and thus became “busy”). For factual information queries—such as a request for a summary of the Stargate Project—the model produced a reasonable answer. However, I noticed that its presentation sometimes lagged behind ChatGPT’s. In several instances, paragraphs were squashed together without spaces, or the formatting was inconsistent (weird fonts, etc.). This didn’t happen often, but it did happen.
Next, I tried something more challenging: I wanted to do some code work around PDF preprocessing. I wrote a detailed prompt about what I wanted and the language it should use, then let it rip. That’s where the fun started.
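For flavor, here's the kind of task I was asking about: a minimal, stdlib-only sketch of one common PDF-preprocessing step, cleaning up text after it's been extracted. This is my own illustrative code, not what R1 produced; real extraction would use a PDF library on top of this.

```python
import re

def clean_extracted_text(raw: str) -> str:
    """Tidy text pulled from a PDF: rejoin words hyphenated across
    line breaks, then collapse stray whitespace."""
    # "implemen-\ntation" -> "implementation"
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Single newlines inside a paragraph become spaces
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()

sample = "Chain of Thought is an implemen-\ntation detail worth\nunderstanding."
print(clean_extracted_text(sample))
# -> "Chain of Thought is an implementation detail worth understanding."
```

Fiddly, edge-case-ridden text munging like this is exactly where watching a model reason about trade-offs pays off.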
DeepSeek R1 appeared to converse with itself, weighing different approaches and potential downsides, reminding itself what I wanted, and even speculating on what I might be doing incorrectly. I was watching it reason in real time. It was wild. In the end, it gave me three well-explained options.
To ChatGPT’s credit, it also provided an excellent answer—though not quite as thorough—and I could modify code in the browser using the “canvas” feature. Pretty neat.
Bottom line: R1 is the real deal. It provides excellent answers. Color me impressed.
Efficiency Claims and Open Source…Sort Of

DeepSeek claims its model is 95% more efficient than other models and has priced it accordingly. But is it really that efficient? Unfortunately, there is no way to verify training costs independently, so we don’t know for sure. We can, however, infer a few things.
DeepSeek calls R1 “open source,” but it’s not truly open source in the conventional sense, because the actual code and dataset are not available. You cannot reproduce the exact model from scratch. It’s more accurate to call it “open weight,” meaning the model weights (its “brain”) can be downloaded and run locally. And people have indeed done so. The ability to run a reasoning model on high-end consumer hardware is extremely impressive—and that’s what makes this such a big deal.
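A quick back-of-the-envelope sketch shows why local running is feasible, at least for the smaller distilled variants people have been downloading: a model's memory footprint is roughly parameters times bytes per parameter, and quantization shrinks the bytes. The numbers below are rough estimates of the weights alone, not official figures.

```python
def weight_footprint_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate memory needed just to hold the weights
    (ignores activations and KV cache, which add overhead)."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal GB

# A 7B-parameter model at different precisions (rough numbers):
for bits in (16, 8, 4):
    print(f"7B @ {bits}-bit ~= {weight_footprint_gb(7, bits):.1f} GB")
# 16-bit: 14.0 GB, 8-bit: 7.0 GB, 4-bit: 3.5 GB
```

At 4-bit quantization, a 7B model fits comfortably in the memory of a decent consumer GPU, which is why "download the weights and run it yourself" is a realistic proposition here.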
Some have dismissed R1 as merely a copy of existing models, and there are now accusations of product theft flying around. Did DeepSeek leverage other models to build their system? Probably. But that’s hardly surprising—modern AI systems are often built on the discoveries of others. After all, the ‘T’ in ChatGPT stands for “Transformer,” a technique originally published in a Google research paper.
True to that tradition, DeepSeek has released a paper detailing the methods used to develop R1.
Final Thoughts
At the end of the day, R1 is very, very good. It avoids some costly training issues, runs efficiently, and opens up new possibilities in AI. What’s next? Everyone and their mother will incorporate and improve upon DeepSeek’s techniques. Models will get better—and that’s a win for AI science.