Artifex.News

Has China achieved AI breakthrough with DeepSeek?

Posted on January 28, 2025 By admin


For over two years, San Francisco-based OpenAI has dominated artificial intelligence (AI) with its generative pre-trained language models. The startup’s chatbot penned poems, wrote long-form stories, found bugs in code, and helped search the Internet (albeit with a cutoff date). Its ability to generate coherent sentences flawlessly baffled users around the world.

Far away, across the Pacific Ocean, in Beijing, China made its first attempt to counter America’s dominance in AI. In March 2023, Baidu received the government’s approval to launch its AI chatbot, Ernie bot. Ernie was touted as China’s answer to ChatGPT after the bot received over 30 million user sign-ups within a day of its launch.

But the initial euphoria around Ernie gradually ebbed as the bot fumbled and dodged questions about China’s President Xi Jinping, the Tiananmen Square crackdown and the human rights violations against the Uyghur Muslims. In response to questions on these topics, the bot replied: “Let’s talk about something else.”

Late to the AI party

As the hype around Ernie met the reality of Chinese censorship, several experts pointed out the difficulty of building large language models (LLMs) in the communist country. Google’s former CEO and chairman, Eric Schmidt, in a talk at the Harvard Kennedy School of Government in October 2023, said: “They [China] were late to the party. They didn’t get to this [LLM] AI space early enough.” Mr. Schmidt further pointed out that a lack of training data on language and China’s unfamiliarity with open-source ideas may make the Chinese fall behind in the global AI race.

As the Chinese tech giants trailed, their U.S. counterparts marched forward with advances in LLMs. Microsoft-backed OpenAI cultivated a new crop of reasoning models, its ‘o’ series, that performed better than ChatGPT. These were the first AI models to introduce inference-time scaling, which refers to improving a model’s answers by letting it spend more computation, such as a longer chain of reasoning, while it is responding.
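
One simple way to picture inference-time scaling is majority voting: sample several answers to the same question and keep the most common one. The sketch below is a toy illustration under stated assumptions, not a description of how OpenAI's models work. The `noisy_solver` function is a hypothetical stand-in for a single model answer that is right 60% of the time.

```python
import random
from collections import Counter


def noisy_solver(correct_answer, rng, p_correct=0.6):
    """Hypothetical stand-in for one sampled model answer.

    Returns the right answer with probability p_correct, otherwise one of
    two wrong answers chosen at random.
    """
    if rng.random() < p_correct:
        return correct_answer
    return correct_answer + rng.choice([1, 2])


def answer_with_votes(correct_answer, n_samples, rng):
    """Spend more inference compute: sample n_samples answers, keep the most common."""
    votes = Counter(noisy_solver(correct_answer, rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]


def accuracy(n_samples, trials=2000, seed=0):
    """Fraction of trials in which the voted answer is the correct one."""
    rng = random.Random(seed)
    hits = sum(answer_with_votes(42, n_samples, rng) == 42 for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (1, 5, 25):
        print(f"{n:>2} samples per question -> accuracy {accuracy(n):.2f}")
```

Accuracy rises with the number of samples even though the underlying solver never changes, which is the trade the term describes: more computation per answer in exchange for better answers.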

AI trader turned AI builder

While the Chinese tech giants languished, a Zhejiang-based hedge fund, High-Flyer, that used AI for trading, set up its own AI lab, DeepSeek, in April 2023. Within a year, the AI spin-off developed the DeepSeek-v2 model, which performed well on several benchmarks and provided its service at a significantly lower cost than other Chinese LLMs.

When DeepSeek-v3 was launched in December, it stunned AI companies. The Mixture-of-Experts (MoE) model was pre-trained on 14.8 trillion tokens and has 671 billion total parameters, of which 37 billion are activated for each token.

An MoE model uses different “experts”, or sub-models, that specialise in different aspects of language or tasks. Each expert is activated only when it is relevant to a particular task. This makes the model more efficient, saves resources and speeds up processing.
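
The routing idea can be sketched in a few lines. This is a generic top-k MoE layer written for illustration, not DeepSeek's actual architecture. The experts, gate vectors and `top_k` value are hypothetical; only the 37 billion and 671 billion parameter counts come from the article.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical 'expert': simply scales its input vector."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and mix their outputs."""
    # Router: one score per expert (dot product of x with that expert's gate vector).
    scores = [sum(w * v for w, v in zip(gate, x)) for gate in gate_weights]
    # Select the top_k experts; the rest are never evaluated for this input.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)
        out = [o + w * v for o, v in zip(out, y)]
    return out, chosen


# Share of parameters active per token, using the figures quoted in the article.
ACTIVE_FRACTION = 37e9 / 671e9

if __name__ == "__main__":
    experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
    gates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
    out, chosen = moe_forward([2.0, 1.0], gates, experts)
    print("experts used:", chosen, "output:", out)
    print(f"active parameters per token: {ACTIVE_FRACTION:.1%}")
```

Because only the selected experts run, the compute per token tracks the active parameters (about 5.5% of the total on the article's figures) rather than the full model size.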

Training despite American sanctions

According to the technical paper released on December 26, DeepSeek-v3 was trained for 2.78 million GPU hours using Nvidia’s H800 GPUs. By comparison, training Meta’s Llama 3.1, which used Nvidia’s H100 chips, took about 30.8 million GPU hours, roughly 11 times as many.
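
The gap can be checked with a few lines of arithmetic. The sketch assumes that 30.8 million is the total GPU-hour count for training Llama 3.1's largest model; that reading is an assumption here. The two runs also used different chips (H800 versus H100), so the ratio is only a rough guide.

```python
# Figures as quoted in the article; the Llama reading is an assumption (see above).
DEEPSEEK_V3_GPU_HOURS = 2.78e6
LLAMA_31_GPU_HOURS = 30.8e6


def compare_gpu_hours(smaller, larger):
    """Return the absolute gap and the ratio between two GPU-hour totals."""
    return {"difference": larger - smaller, "ratio": larger / smaller}


if __name__ == "__main__":
    result = compare_gpu_hours(DEEPSEEK_V3_GPU_HOURS, LLAMA_31_GPU_HOURS)
    print(f"difference: {result['difference'] / 1e6:.2f} million GPU hours")
    print(f"ratio: {result['ratio']:.1f}x")
```

On these figures DeepSeek-v3 used about 28 million fewer GPU hours, or roughly one-eleventh of the total.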

After seeing early success with DeepSeek-v3, High-Flyer built its most advanced reasoning models, DeepSeek-R1-Zero and DeepSeek-R1, which have potentially disrupted the AI industry by becoming some of the most cost-efficient models in the market.

When compared to OpenAI’s o1, DeepSeek’s R1 slashes costs by a staggering 93% per API call. This is a huge advantage for businesses and developers looking to integrate AI without breaking the bank. 

The savings don’t stop there. Unlike older models, R1 can run on high-end local computers — so, no need for costly cloud services or dealing with pesky rate limits. This gives users the freedom to run AI tasks faster and cheaper without relying on third-party infrastructure.  

Plus, R1 is designed to be memory-efficient: it needs only a fraction of the RAM typically required by an AI of its calibre. Separately, by batching (processing multiple tasks at once) and leveraging the cloud, the model further lowers costs and speeds up performance, making it accessible to a wider range of users.

A close contest

While R1 may not be quite as advanced as OpenAI’s o3, it still offers quality comparable to o1. According to LiveBench benchmark data on both models, o1 edges out R1 on overall performance, with a global average score of 75.67 against the Chinese model’s 71.38. OpenAI’s o1 continues to perform well on reasoning tasks, with a nearly nine-point lead over its competitor, making it a go-to choice for complex problem-solving, critical thinking and language-related tasks.

When it comes to coding, mathematics and data analysis, the competition is much tighter. Specifically, in data analysis, R1 proves to be a better choice for analysing large datasets.

One important area where R1 fails miserably, which is reminiscent of the Ernie Bot, is on topics that are censored in China. For instance, to any question on the Chinese President Xi Jinping, the Tiananmen Square protest, and the Uyghur Muslims, the bot tells its users: “Let’s talk about something else.”

Unlike Ernie, this time around, despite the reality of Chinese censorship, DeepSeek’s R1 has soared in popularity globally. It has already surpassed major competitors like ChatGPT, Gemini, and Claude to become the number one downloaded app in the U.S. (In India, DeepSeek is at the third spot under productivity, followed by Gmail and ChatGPT apps.) This meteoric rise in popularity highlights just how quickly the AI community is embracing R1’s promise of affordability and performance.

Smaller models rise

While OpenAI’s o1 continues to be the state-of-the-art AI model out there, it is only a matter of time before other models could take the lead in building super intelligence.

DeepSeek shows that, through its distillation process, it can effectively transfer the reasoning patterns of larger models into smaller models. This means that, instead of training smaller models from scratch using reinforcement learning (RL), which can be computationally expensive, the knowledge and reasoning abilities acquired by a larger model can be transferred to smaller models, resulting in better performance.

In its technical paper, DeepSeek compares the performance of distilled models with models trained using large-scale RL. The results indicate that the distilled models outperformed smaller models that were trained with large-scale RL without distillation. Specifically, a 32-billion-parameter base model trained with large-scale RL achieved performance on par with QwQ-32B-Preview, while the distilled version, DeepSeek-R1-Distill-Qwen-32B, performed significantly better across all benchmarks. (Qwen is an LLM family from Alibaba Cloud.)
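
The distillation idea can be pictured as a data pipeline: a large “teacher” model produces worked answers, and a smaller “student” is then fine-tuned on them. The sketch below is a toy illustration, not DeepSeek's pipeline. The teacher, the arithmetic task, the correctness filter and every name in it are hypothetical, and the fine-tuning step itself is left out.

```python
import random


def teacher_generate(question, rng, error_rate=0.2):
    """Hypothetical teacher: returns (reasoning, answer) for an addition question.

    With probability error_rate it produces a wrong answer, to mimic an
    imperfect model.
    """
    a, b = question
    answer = a + b
    if rng.random() < error_rate:
        answer += 1  # simulated mistake
    reasoning = f"Add {a} and {b} to get {answer}."
    return reasoning, answer


def build_distillation_set(questions, rng, samples_per_question=4):
    """Collect teacher outputs, keeping one verified-correct trace per question."""
    dataset = []
    for a, b in questions:
        for _ in range(samples_per_question):
            reasoning, answer = teacher_generate((a, b), rng)
            if answer == a + b:  # assumed filter: keep only answers that check out
                dataset.append({
                    "prompt": f"What is {a} + {b}?",
                    "completion": f"{reasoning} Answer: {answer}",
                })
                break
    return dataset


if __name__ == "__main__":
    rng = random.Random(0)
    rows = build_distillation_set([(1, 2), (3, 4), (10, 5)], rng)
    for row in rows:
        print(row["prompt"], "->", row["completion"])
    # A smaller student model would then be fine-tuned on these rows.
```

The student never runs RL itself; it only imitates traces the teacher has already produced, which is why this route is cheaper than training the small model with large-scale RL.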

This, in essence, would mean that inference could shift to the edge, changing the landscape of AI infrastructure companies as more efficient models could reduce reliance on centralised data centres. 

The future of AI race

While distillation is a powerful method for enabling smaller models to achieve high performance, it has limits. For instance, as distilled models will be tied to the “teacher” model, the limitations in the larger models will also be transferred to the smaller ones. Also, distilled models may not be able to replicate the full range of capabilities or nuances of the larger model. This can affect the distilled model’s performance in complex or multi-faceted tasks.

Distillation is an effective tool for transferring existing knowledge, but it may not be the path to major paradigm shifts in AI on its own. That means the need for GPUs will only increase as companies build more powerful, intelligent models.

DeepSeek’s R1 and OpenAI’s o1 are the first reasoning models that are actually working. And R1 is the first successful demo of using RL for reasoning. From here, more compute power will be needed for training, running experiments, and exploring advanced methods for creating agents. There are many ways to leverage compute to improve performance, and right now, American companies are in a better position to do this, thanks to their larger scale and access to more powerful chips.

Published – January 28, 2025 03:31 pm IST





Copyright © 2023 Artifex.News Newsportal designed by Artifex Infotech.