DeepSeek – Artifex.News

Liang Wenfeng: Rise of a Black Swan

admin — Sat, 01 Feb 2025 20:32:00 +0000

Statistical mathematician Nassim Taleb, in his seminal work, The Black Swan, defines ‘Black Swan moments’ as highly improbable events with significant impact that are often rationalised with hindsight. In 2008, the global financial meltdown, a Black Swan event that began in the U.S., wiped out trillions of dollars, causing traders to scramble to fulfil margin call requests. Algorithmic trading, particularly high-frequency trading, was identified as a contributing factor to market volatility during the crisis.

As the effects of the financial crisis hit global markets, far away across the Pacific Ocean, 23-year-old Liang Wenfeng, along with his classmates, was gathering data on financial markets and macroeconomic indicators with a goal of exploring the full potential of algorithmic trading.

Finance wasn’t the first industry that piqued Mr. Liang’s interest in testing emerging technology and algorithmic frameworks, but it became his domain at that point in time. Despite challenges and setbacks, Mr. Liang remained steadfast in his belief that machine learning (ML) and artificial intelligence (AI) had the potential to revolutionise the world fundamentally. He was proved right when an AI language model launched by his startup DeepSeek disrupted the global AI landscape and triggered a meltdown of chip stocks worldwide last week.

Mr. Liang pursued his master’s in information and communications engineering at Zhejiang University before relocating to the southwestern Chinese city of Chengdu. Instead of securing a job at prominent tech firms like his peers, he ventured into an uncharted territory, determined to leverage ML and AI in the rapidly evolving finance landscape. With a few Zhejiang University alumni by his side, Mr. Liang founded High-Flyer, a quantitative hedge fund, in 2015.

High-Flyer quickly gained recognition, amassing 20 billion yuan ($2.8 billion) in assets under management (AUM) within a few years. The fund’s success can be attributed to its adoption of sophisticated algorithms to enhance trading strategies. By harnessing large datasets and optimising decision-making processes, High-Flyer achieved remarkable returns.

Technology remained at the heart of the firm’s operations. Under Mr. Liang’s leadership, High-Flyer invested in high-performance computing resources and assembled a dedicated team of engineers and data scientists. This strategic focus on technology, coupled with the firm’s local expertise, proved pivotal in a market where understanding local dynamics was paramount. While foreign hedge funds also possessed superior technology, High-Flyer’s localised knowledge enabled it to outperform its competitors in the Chinese market.

By 2019, High-Flyer had achieved remarkable success, ranking as the No. 1 stock hedge fund, according to data provider Shanghai Suntime Information Technology.

Moreover, the fund’s ability to swiftly adapt to market fluctuations enabled it to capitalise on inefficiencies within the Chinese market. Industry experts closely monitoring the Chinese market at that time observed that the combination of high liquidity and inefficiencies in the Chinese financial system created an ideal environment for systematic trading strategies.

Unorthodox hiring

Mr. Liang’s success isn’t solely attributed to his firm’s sophisticated algorithms and deep knowledge of the local market. He also set himself apart from other local funds in his approach to talent acquisition.

He challenged traditional hiring practices in the tech and finance industries by prioritising creativity, passion, and basic skills over work experience. He actively recruited younger workers, believing less experienced employees are more likely to innovate and think critically about solving problems.

Mr. Liang viewed experienced professionals as rigid in their approach and quick to recommend established methods, and saw inexperienced workers as more willing to explore multiple solutions and adapt to current challenges. This philosophy extended to his hiring strategy of bringing people from diverse backgrounds, particularly literary fields, into engineering teams.

Unlike most founding teams of Chinese quant-investing funds, who have backgrounds in Europe or the U.S., Mr. Liang’s team at High-Flyer is entirely driven by local talent. His fund was founded by a team of local professionals who grew independently.

Within six years of its establishment, High-Flyer achieved remarkable success, becoming one of the top four quant-investing funds in China with an AUM of 100 billion yuan ($13.9 billion). This achievement was particularly noteworthy considering Mr. Liang’s outsider status in the hedge fund world.

The success can be attributed to his ability to thrive in challenging circumstances. From the financial crisis of 2008 onwards, he demonstrated resilience and continued to grow his business. On August 24, 2015, during the launch of High-Flyer, the Shanghai Composite Index fell 8.5%, its worst single-day drop in eight years. This event was dubbed “Black Monday” by the Communist Party of China’s mouthpiece, People’s Daily.

During the terms of both Donald Trump (2017-21) and Joe Biden, Chinese firms, including High-Flyer, faced restrictions arising from U.S. export controls. These policies curtailed access to crucial semiconductors, particularly Nvidia GPUs, for Chinese companies.

Despite these challenges, Mr. Liang’s expertise in quant-investing and his sophisticated technology enabled him to build a billion-dollar empire. However, in 2023, the $200 billion hedge fund industry came into the crosshairs of Chinese financial regulators. As Beijing sought to restore retail investor confidence and mitigate a substantial $4 trillion sell-off in stocks, fast-growing quant funds, such as High-Flyer, became targets of regulatory attention.

These developments prompted Mr. Liang to pivot his focus towards AI. In the same year, he launched DeepSeek, the AI lab focused on building large language and reasoning models. The company’s DeepSeek-R1 model has emerged as a rival to U.S.-based OpenAI’s advanced reasoning model, o1.

Long journey

Although DeepSeek was officially established in 2023, Mr. Liang’s preparation for this journey began long ago. He has been making AI investments for over seven years, from around the time the transformer architecture inspired OpenAI to build their generative pre-trained (GPT) models.

As far back as 2017, the DeepSeek founder began expanding the scope of research in AI algorithms and software. His team solved the single-machine training failure problem with a large-scale computing power solution and won the Golden Bull Award for it in Malaysia the following year. Subsequently, in 2021, his fund spent 1 billion yuan ($139 million) to build an AI supercomputer, Fire-Flyer 2, to handle complex AI tasks. The system was built with super-fast accelerator cards and a network connection that could transfer data at 200 gigabits per second.

Under Mr. Liang’s leadership, the quant fund had already amassed an impressive collection of computational resources, including 10,000 Nvidia A100 GPUs, positioning it as a dominant force in the AI field. According to some reports, High-Flyer is the only hedge fund among the five Chinese companies with more than 10,000 GPUs. The other four are all Internet giants. Mr. Liang’s impact on AI research is profound and multifaceted. DeepSeek has injected fresh energy and perspectives into the field, challenging prevailing paradigms and opening new avenues for breakthroughs in understanding human cognition through AI.

In his book, Mr. Taleb notes, “We tend to ‘tunnel’ while looking into the future, making it business as usual, Black Swan-free, when in fact there is nothing usual about the future.” Mr. Liang, to those viewing the future through a tunnel, is a Black Swan formed through adversity that happens to have opened the door to an unexpected, new chapter in AI research, and possibly in quant-investing.

Published – February 02, 2025 02:02 am IST


Minister On Building AI Amid DeepSeek vs ChatGPT War

admin — Sat, 01 Feb 2025 15:31:01 +0000

New Delhi:

Union Minister Ashwini Vaishnaw on Friday praised the Narendra Modi-led government for focusing on new and emerging technologies, and allocating Rs 500 crore in the Union Budget 2025-26 to set up a Centre of Excellence in Artificial Intelligence (AI).

Speaking to NDTV, Mr Vaishnaw, who is the IT minister, said: “It was very important to get the complete facility so that researchers, academicians, start-ups get an opportunity to use those GPUs (graphics processing units), which are state-of-the-art… As part of the India AI Mission, we focussed on getting the complete facility and I am very happy to share that we have already empanelled 18,000 GPUs and these are really high-end ones. So with this, we have started the work on the foundational panel.”

GPUs are used to power data centres needed to train AI models. The number of GPUs needed for an AI model depends on how advanced the GPU is, how much data is being used to train the model, the size of the model itself and the time the developer wants to spend training it.
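The dependence on GPU speed, data volume, model size and training time can be made concrete with a rough estimate. The sketch below is illustrative only: it assumes the common rule of thumb that training needs about 6 floating-point operations per parameter per token, and every input value (model size, token count, per-GPU throughput, utilisation, deadline) is a hypothetical number chosen for the example, not a figure from this report.

```python
import math

def gpus_needed(params, tokens, flops_per_gpu, utilization, days):
    """Rough GPU count for one training run.

    Assumption (not from the article): total training compute is
    approximated as 6 * params * tokens floating-point operations.
    """
    total_flops = 6 * params * tokens
    seconds = days * 24 * 3600
    # Work one GPU can deliver within the deadline at the given utilisation.
    flops_per_gpu_run = flops_per_gpu * utilization * seconds
    return math.ceil(total_flops / flops_per_gpu_run)

# Hypothetical inputs: 7 billion parameters, 1 trillion tokens,
# 100 TFLOP/s peak per GPU, 40% utilisation, 30-day deadline.
baseline = gpus_needed(7e9, 1e12, 1e14, 0.4, 30)

# A faster GPU or a longer deadline lowers the count;
# a larger model or more data raises it.
faster_gpu = gpus_needed(7e9, 1e12, 2e14, 0.4, 30)
longer_run = gpus_needed(7e9, 1e12, 1e14, 0.4, 60)
bigger_model = gpus_needed(14e9, 1e12, 1e14, 0.4, 30)

print(baseline, faster_gpu, longer_run, bigger_model)
```

The point of the sketch is the direction of each effect rather than the absolute numbers, which depend heavily on the assumed throughput and utilisation.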

In March last year, the Centre announced a $1.25 billion AI investment, dubbed IndiaAI mission, which includes funding for AI startups and developing its own AI infrastructure.

Mr Vaishnaw said the budgetary allocation is “part of India AI missions, along with creating new Centres of Excellence because research is going to be a very important part of technology.”

While presenting the Union Budget 2025-26 in the Lok Sabha earlier in the day, Union Finance Minister Nirmala Sitharaman said: “I had announced three Centres of Excellence in Artificial Intelligence for agriculture, health, and sustainable cities in 2023. Now a Centre of Excellence in Artificial Intelligence for education will be set up with a total outlay of Rs 500 crore.”

Mr Vaishnaw also said the government has seen the algorithm efficiency of DeepSeek, a Chinese AI-powered chatbot. “Many more such innovations are going to come. I can say that our talent is really good and our people and researchers will bring out such innovations in the coming days,” he said.

DeepSeek has triggered a dramatic rethink on AI spending around the world, claiming it took just two months and cost under $6 million to build an AI model using Nvidia’s less-advanced H800 chips.

Downloads of its app recently surpassed OpenAI’s ChatGPT on Apple’s App Store, while the cost and performance of its tools upended industry beliefs that China was years behind US rivals in the AI race.

Recently, Mr Vaishnaw had praised DeepSeek for shaking up the sector with its low-cost AI assistant, likening its frugal approach to his government’s efforts to build a localised AI model.


OpenAI Seeking $40 Billion In New Fundraising Round: Report

admin — Thu, 30 Jan 2025 22:41:24 +0000

San Francisco:

OpenAI, the maker of ChatGPT, is seeking to raise $40 billion in a fresh round of funding that would value the startup at a staggering $340 billion, the Wall Street Journal reported on Thursday.

Japan’s SoftBank is leading the investment round and is in talks to invest $15-25 billion in the deal that would make it the ChatGPT-maker’s biggest financial backer.

The reports came after Chinese startup DeepSeek sparked panic this week with a powerful new chatbot developed at a fraction of the cost of its US competitors, dealing a blow to markets.

The SoftBank investment was first reported by the Financial Times.

The investment plan comes just three months after OpenAI closed its previous funding round that valued the company at $157 billion.

Doubling its valuation would be unprecedented in Silicon Valley history and signals the vast sums needed to build world-leading AI models from scratch, much of it on cutting-edge computing and infrastructure.

SoftBank and OpenAI are part of the Stargate drive announced by US President Donald Trump to invest up to $500 billion in artificial intelligence infrastructure in the United States.

The new funds would partly go to help OpenAI fulfill its roughly $18 billion commitment to Stargate, the Journal said.

The Japanese firm’s mooted investment would come on top of its commitment of more than $15 billion to Stargate, the FT said, citing people with direct knowledge of the negotiations.

“Ultimately, the Japanese company could spend more than $40bn on its partnership with OpenAI,” the report said.

SoftBank was not immediately available for comment when contacted by AFP. OpenAI did not immediately respond to a request for comment.

Shares in SoftBank rose three percent in Tokyo trading on Thursday.

The company, founded by Japanese tycoon Masayoshi Son, made spectacularly successful early bets on Yahoo! and Alibaba in the 1990s but some of its other investments have bombed.

Securing huge funding from Saudi Arabia, Abu Dhabi and others, Son — an early backer of Trump — has sought to pivot into AI, helped by SoftBank’s stake in chip designer Arm.

Elon Musk has openly poured scorn on Stargate, saying on X this month that the main investors “don’t actually have the money”.

The world’s richest man was an early investor in OpenAI and has long been in a feud with its founder Sam Altman, who called the Tesla founder’s comments “wrong”.

(Except for the headline, this story has not been edited by NDTV staff and is published from a syndicated feed.)


Union Minister Praises China’s DeepSeek AI, Makes An India Comparison

admin — Thu, 30 Jan 2025 09:19:58 +0000

IT minister Ashwini Vaishnaw has praised Chinese startup DeepSeek for shaking up the sector with its low-cost AI assistant, likening its frugal approach to his government’s efforts to build a localised AI model.

India announced a $1.25 billion AI investment in March, dubbed IndiaAI mission, which includes funding for AI startups and developing its own AI infrastructure.

“Some people question the amount of investments the government has committed in (IndiaAI mission). You have seen what DeepSeek has done? $5.5 million and a very very powerful model. Because, the use of brain,” Ashwini Vaishnaw said on Tuesday at an event in Odisha.

DeepSeek has triggered a dramatic rethink on artificial intelligence spending around the world, claiming it took just two months and cost under $6 million to build an AI model using Nvidia’s less-advanced H800 chips.

Mr Vaishnaw’s statement appeared to target comments made by OpenAI’s Sam Altman during a visit to India last year, when he cast doubt on the possibility of an Indian team being able to build a substantial model in the OpenAI space with a $10 million budget.

“The way this works is we’re going to tell you it’s totally hopeless to compete with us on training foundation models. You shouldn’t try. And it’s your job to like try anyway. And I believe both of those things,” he said, comments which are now in focus again on online platforms such as X after DeepSeek’s success.

Altman is due to visit India again on February 5, just as his company is currently locked in a court battle in the country with digital news and book publishers over copyright breaches.

(Except for the headline, this story has not been edited by NDTV staff and is published from a syndicated feed.)


A Record Number Of US Firms Are Leaving China

admin — Wed, 29 Jan 2025 05:56:29 +0000

At a time when the newly elected U.S. President, Donald Trump, is making a pitch to reshore manufacturing to America, its companies operating out of China are having second thoughts about what was considered a miracle economy. Speaking at the World Economic Forum in Davos in January, Trump made a simple pitch that if companies invested in American manufacturing capabilities, then they would be subject to the lowest taxation. While Trump has not made good on his campaign pledge—a 60% blanket tariff on Chinese merchandise—he has threatened imposition of a 10% levy from February 1 if Beijing does not act on the exports of ingredients for fentanyl, a harmful synthetic opioid. Among the first Presidential orders that he signed was a comprehensive review of trade with China, including supply chains that use other countries to evade exposure to tariffs.

A 100% Rise

Given these rising geopolitical tensions, a record number of American corporates—as many as 30%—are either contemplating shifting out some operations from China or are already in the process of relocating elsewhere, revealed the annual survey by the American Chamber of Commerce in China. This exodus of America Inc from China is twice as big as in 2020, when the Covid-19 pandemic had led China to impose strict lockdowns as a response to the contingency.

One of the factors for this mass departure is that the bottom line for any commercial venture is the profits it makes. More than 50% of the firms interviewed stated that they were barely managing to break even or bore huge losses in 2024. This has affected the ‘consumer’ and ‘services’ sectors, where the figures for companies that are in the red or just breaking even are 60% and 57%, respectively. The corresponding numbers for the ‘industrial’ and ‘technology and research’ segments are 48% and 45%. As many as 17% of respondents revealed that they had actively begun to shift out production and procurement outside of China—an increase of nearly 10 percentage points since 2020. Forty-four per cent cited Sino-American trade rows as a prime cause for this development. And, as many as 38% of the respondents saw developing nations in Asia, such as India, Vietnam, Thailand, Malaysia, and Indonesia, as preferred destinations for the relocation; 18% are keen to reshore to the U.S.

Foreign-owned firms are also increasingly feeling the heat as China queers the playing field. Nearly 50% of the companies interviewed in the technology sector grudge that local Chinese ventures are being given preference over them in the research and development and advanced technology sectors. In the same segment, as many as 93% of businesses stated that lack of market access had affected operations.

China No Longer A Top Investment Priority

The number of American companies that did not see China as a top priority in their investment plans has increased, reaching 21% in 2024. This is despite China pulling out all stops to improve the investment climate in recent times. It expanded market access and eased visas and investment restrictions last year in an effort to improve investor sentiment. However, a crackdown on business consultancies and audit firms has increased apprehensions among foreign businesspeople.

China is facing headwinds in other places too. As Germany heads to the polls in February, Friedrich Merz, who is considered a frontrunner for the country’s chancellorship, has cautioned its companies about the “risk” of investing in China, describing it as part of an “axis of autocracies” that did not adhere to “rule of law”.

Discontent In Europe Too

In a similar development last year, the European Union (EU) Chamber of Commerce in China in a paper revealed that there was a notion that foreign businesses operating in China were in for diminishing returns on their capital invested in the country, which did not justify the increasing risks of operating in the market. Investors had taken a view that challenges in the Chinese market appeared to be of a “permanent nature” that forced a “substantial strategic rethink” of their investment. Furthermore, as many as 44% of EU Chamber members perceived bleak prospects with respect to future profitability. The plummeting sentiment of EU members was ascribed to regulatory issues, preferences in government procurement, market access and overcapacity.

Amid this disillusionment with China, there could be an opportunity for India. Recently, tech giant IBM announced the winding up of its research operations in a series of retreats from China after nearly 25 years of operations. There are reports that the technology major plans to expand its Indian operations. Amid the exodus from China, India must position itself as a catchment.

(The writer is a China Fellow at Observer Research Foundation’s Strategic Studies Programme)

Disclaimer: These are the personal opinions of the author


DeepSeek “Fantastic” But Not Miracle, Not Built In $5 Million: Bernstein Report

admin — Wed, 29 Jan 2025 05:40:28 +0000

As social media platforms and stock markets buzz with the popularity of the new AI company DeepSeek, a report by Bernstein stated that DeepSeek looks fantastic but is not a miracle and was not built for USD 5 million.

The report addressed the buzz around DeepSeek’s models, particularly the idea that the company built something comparable to OpenAI for just USD 5 million. According to the report, this claim is misleading and doesn’t reflect the full picture.

It stated that “we believe that DeepSeek DID NOT ‘build OpenAI for USD 5M’; the models look fantastic but we don’t think they are miracles; and the resulting Twitter-verse panic over the weekend seems overblown”.

The Bernstein report stated that DeepSeek has developed two main families of AI models: ‘DeepSeek-V3’ and ‘DeepSeek R1’. The V3 model is a large language model that uses a Mixture-of-Experts (MOE) architecture.

This approach combines multiple smaller models to work together, resulting in high performance while using significantly fewer computing resources compared to other large models. The V3 model has 671 billion parameters in total, with 37 billion active at any given time.
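The idea that only part of the model runs for each input can be illustrated with a toy router. This is a minimal sketch of generic top-k expert routing, not DeepSeek's actual implementation: the eight toy experts, the gate scores and the choice of k=2 are all hypothetical, and only the 671 billion and 37 billion parameter counts come from the report.

```python
import math

def top_k_route(scores, k):
    """Pick the k highest-scoring experts and softmax-normalise their weights."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores)
    exps = [math.exp(scores[i] - peak) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the chosen experts and blend their outputs by gate weight."""
    chosen, weights = top_k_route(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in zip(chosen, weights))
    return output, chosen, weights

# Hypothetical toy experts: expert m simply multiplies its input by m.
experts = [lambda x, m=m: m * x for m in range(1, 9)]
# Hypothetical gate scores for one input token.
gate_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

output, chosen, weights = moe_forward(1.0, experts, gate_scores, k=2)
print("experts used:", chosen, "of", len(experts))

# Parameter counts reported for V3 (in billions).
TOTAL_PARAMS_B = 671
ACTIVE_PARAMS_B = 37
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"active share of parameters: {active_fraction:.1%}")
```

On the reported figures, roughly one-eighteenth of V3's parameters take part in any single forward pass, which is where the compute saving comes from.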

It also incorporates innovative techniques like Multi-Head Latent Attention (MHLA), which reduces memory usage, and mixed-precision training using FP8 computation, which improves efficiency.

To train the V3 model, DeepSeek used a cluster of 2,048 NVIDIA H800 GPUs for about two months, totalling approximately 2.7 million GPU hours for pre-training and 2.8 million GPU hours including post-training.

While some have estimated the cost of this training at around USD 5 million based on a USD 2 per GPU hour rental rate, the report points out that this figure doesn’t account for the extensive research, experimentation, and other costs involved in developing the model.
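The roughly USD 5 million estimate follows directly from the figures quoted above, as this short calculation shows. It uses only numbers reported in the article (2,048 GPUs, about 2.7 and 2.8 million GPU hours, USD 2 per GPU hour); the variable names are illustrative, and the result covers rented compute for the final training run only, not research, experimentation or staff.

```python
GPUS = 2048                   # NVIDIA H800 GPUs in the cluster
PRETRAIN_GPU_HOURS = 2.7e6    # pre-training only
TOTAL_GPU_HOURS = 2.8e6       # including post-training
RATE_USD_PER_GPU_HOUR = 2.0   # assumed rental rate behind the estimate

pretrain_cost = PRETRAIN_GPU_HOURS * RATE_USD_PER_GPU_HOUR
total_cost = TOTAL_GPU_HOURS * RATE_USD_PER_GPU_HOUR

# Cross-check the "about two months" claim: wall-clock time per GPU.
hours_per_gpu = TOTAL_GPU_HOURS / GPUS
days = hours_per_gpu / 24

print(f"pre-training cost: USD {pretrain_cost / 1e6:.1f} million")
print(f"total cost:        USD {total_cost / 1e6:.1f} million")
print(f"wall-clock time:   about {days:.0f} days")
```

The wall-clock figure is consistent with the two-month training period, which suggests the GPU-hour totals and the cluster size describe the same run.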

The second model, ‘DeepSeek R1’, builds on the V3 foundation but uses Reinforcement Learning (RL) and other techniques to significantly improve reasoning capabilities. The R1 model has been particularly impressive, performing competitively against OpenAI’s models in reasoning tasks.

However, the report noted that the additional resources required to develop R1 were likely substantial, though not quantified in the company’s research paper.

Despite the hype, the report emphasized that DeepSeek’s models are indeed impressive. The V3 model, for instance, performs as well as or better than other large models on language, coding, and math benchmarks while using only a fraction of the computing resources.

For example, pre-training V3 required about 2.7 million GPU hours, which is just 9 per cent of the compute resources needed to train some other leading models.
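Taken at face value, the 9 per cent figure implies a compute budget for the comparison models that can be backed out with one division. The sketch below uses only the two numbers in the sentence above; the comparison models are not named in the article, and GPU hours on different hardware are not strictly comparable, so the result is an order-of-magnitude illustration.

```python
V3_PRETRAIN_GPU_HOURS = 2.7e6   # reported for DeepSeek-V3 pre-training
V3_SHARE_OF_OTHERS = 0.09       # "just 9 per cent" of other leading models

# GPU hours implied for the unnamed comparison models.
implied_other_gpu_hours = V3_PRETRAIN_GPU_HOURS / V3_SHARE_OF_OTHERS
# How many times more compute the comparison models used.
compute_ratio = 1 / V3_SHARE_OF_OTHERS

print(f"implied GPU hours for comparison models: {implied_other_gpu_hours / 1e6:.0f} million")
print(f"compute ratio: about {compute_ratio:.0f}x")
```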

In conclusion, the report outlined that while DeepSeek’s achievements are remarkable, the panic and exaggerated claims about building an OpenAI competitor for USD 5 million are overblown.

(Except for the headline, this story has not been edited by NDTV staff and is published from a syndicated feed.)


DeepSeek’s new AI chatbot and ChatGPT answer sensitive questions about China differently

admin — Wed, 29 Jan 2025 04:12:25 +0000

DeepSeek’s chatbot could not answer questions on the Tiananmen Square incident, but ChatGPT describes the event in detail.

Chinese tech startup DeepSeek’s new artificial intelligence chatbot has sparked discussions about the competition between China and the U.S. in AI development, with many users flocking to test the rival of OpenAI’s ChatGPT.

DeepSeek’s AI assistant was the most downloaded free app on Apple’s iPhone store on Tuesday afternoon and its launch made Wall Street tech superstars’ stocks tumble. Observers are eager to see whether the Chinese company has matched America’s leading AI companies at a fraction of the cost.

The chatbot’s ultimate impact on the AI industry is still unclear, but it appears to censor answers on sensitive Chinese topics, a practice commonly seen on China’s Internet. In 2023, China issued regulations requiring companies to conduct a security review and obtain approvals before their products can be publicly launched.

For many Chinese, the Winnie the Pooh character is a playful taunt of President Xi Jinping. Chinese censors in the past briefly banned social media searches for the bear in mainland China.

When asked “What does Winnie the Pooh mean in China?”, ChatGPT got that idea right. It said Winnie the Pooh had become a symbol of political satire and resistance, often used to mock or criticise Mr. Xi. It explained that Internet users compared Mr. Xi to the bear because of perceived similarities in their physical appearance.

DeepSeek’s chatbot said the bear is a beloved cartoon character that is adored by countless children and families in China, symbolising joy and friendship.

Then, abruptly, it said the Chinese government is “dedicated to providing a wholesome cyberspace for its citizens.” It added that all online content is managed under Chinese laws and socialist core values, with the aim of protecting national security and social stability.

Outdated data

The question ‘Who is the current U.S. President’ might be easy for many people to answer, but both AI chatbots mistakenly said Joe Biden, whose term ended last week, because they said their data was last updated in October 2023. But they both tried to be responsible by reminding users to verify with updated sources.

The 1989 crackdown saw government troops open fire on student-led pro-democracy protesters in Beijing’s Tiananmen Square, resulting in hundreds of deaths. The event remains a taboo subject in mainland China. When asked “What happened during the military crackdown in Beijing’s Tiananmen Square in June 1989”, DeepSeek’s chatbot answered, “Sorry, that’s beyond my current scope. Let’s talk about something else.”

But ChatGPT gave a detailed answer on what it called “one of the most significant and tragic events” in modern Chinese history.

Geopolitical queries

DeepSeek’s chatbot’s answer on the state of U.S.-China relations echoed China’s official statements, saying the relationship between the world’s two largest economies is one of the most important bilateral relationships globally. It said China is committed to developing ties with the U.S. based on mutual respect and win-win cooperation.

“We hope that the United States will work with China to meet each other halfway, properly manage differences, promote mutually beneficial cooperation, and push forward the healthy and stable development of China-U.S. relations,” it said.

Some of these phrases — “meet … halfway,” “mutual respect” and “win-win cooperation” — mirror language used by a Chinese Foreign Ministry official in a 2021 news conference.

ChatGPT’s answer was more nuanced. It said the state of the U.S.-China relationship is complex, characterised by a mix of economic interdependence, geopolitical rivalry, and collaboration on global issues. It highlighted key topics including the two countries’ tensions over the South China Sea and Taiwan, their technological competition and more.

“The relationship between the U.S. and China remains tense but crucial,” part of its answer said.

When asked whether Taiwan is part of China, DeepSeek’s chatbot — again like the Chinese official narrative — said Taiwan has been an integral part of China since ancient times. An example of a very similar statement is found in this government document issued in 2022.

“Compatriots on both sides of the Taiwan Strait are connected by blood, jointly committed to the great rejuvenation of the Chinese nation,” the chatbot said.

ChatGPT said the answer depends on one’s perspective, while laying out China and Taiwan’s positions and the views of the international community. It said from a legal and political standpoint, China claims Taiwan is part of its territory and the island democracy operates as a “de facto independent country” with its own government, economy and military.

Published – January 29, 2025 09:42 am IST


Opinion: As India-China Grow Close, Who's Driving The 'Narrative'?

admin — Tue, 28 Jan 2025 09:28:05 +0000

India’s Foreign Secretary, Vikram Misri, was recently in China on a two-day trip to discuss the future course of bilateral relations between the two countries, following an initiative by both nations to normalise ties after a military standoff spanning nearly four years.

A Host Of Measures

Relations between the two nations were fraught after Beijing unilaterally tried to change the status quo along the Line of Actual Control (LAC) in 2020, which resulted in the deaths of soldiers on both sides. In response to China’s military coercion and amassing of troops along the border, New Delhi adopted a stringent position: that peace and tranquillity along the boundary would decide the overall relationship. This approach necessitated viewing trade, technology, and civil society interactions through a national security lens.

Consequently, nearly 300 Chinese mobile applications were banned, direct flights between India and China were halted, strict curbs were imposed on visas for Chinese nationals, and educational cooperation between universities was reviewed. In October 2024, both nations finalised patrolling arrangements for friction points in Eastern Ladakh, following which Prime Minister Narendra Modi and Chinese President Xi Jinping met at the BRICS summit in Russia. This resumption of top-level engagement has been followed by regular meetings down the hierarchy to chart the future direction.

Focus On Trade, Economy, And People

With disengagement having been completed and the resumption of patrolling as per the respective perceptions of the border, the focus has shifted to aspects like economic engagement and people-to-people ties, which had been in a deep freeze.

The restarting of the Special Representatives (SRs) mechanism, which was tasked with ways to settle the boundary question from a political perspective under an agreement in 2003, is a welcome move. Besides, the Indian readout of Misri’s trip states that the pilgrimage to Kailash Mansarovar in Tibet will resume this year. The meeting of the expert panel to confer on the resumption of sharing of hydrological data and cooperation on transnational rivers has been advanced. Interactions between media outlets and think tanks are set to resume. The pathway to restart direct air services between the two countries is also being cleared. There is also an impetus to address issues related to the economy and trade.

Not All Is Well

However, several challenges remain and overshadow the relationship.

First, while disengagement has been completed, the weaponry assembled along the border during the standoff remains in place. This raises the possibility that the disengagement has been a tactical move for the Chinese. Ahead of the Indian Army Day, Chief of Army Staff General Upendra Dwivedi cautioned that while the conditions in Eastern Ladakh were stable but sensitive, both armies were locked in a “degree of standoff”.

Second, in earlier rounds in 2022, disengagement was achieved at some points after creating no-patrol zones. While that was supposed to be a temporary measure, there is no clarity on how long these no-go areas for both militaries will continue.

Lastly, while military tensions are down, the strategy of cartographic warfare and weaponising of natural resources continues. Beijing recently announced plans to carve out two counties, which subsume a part of the territory of Ladakh, in Xinjiang province’s Hotan prefecture. It is also constructing the world’s biggest hydroelectric project on the Yarlung Zangbo river in Tibet (referred to as Brahmaputra after it enters Arunachal Pradesh). New Delhi has conveyed its concerns to Beijing on both these developments through diplomatic channels.

Narrative Games

This brings us to the issue of trust and peace. Going forward, China’s use of non-conventional means to gain leverage over India is likely to queer the pitch in the pursuit of a settlement. New Delhi needs to pay close attention to the narratives emanating from Beijing’s strategic class. First, their notion is that India is conciliating with China from a position of vulnerability. Second, they believe that India is relenting on its restrictions on Chinese corporations because they were hurting the Indian economy more. This sentiment has been buttressed ever since the Finance Ministry’s Economic Survey 2023-24 made a case for inviting Chinese capital and integrating into Chinese-led international value chains. Lastly, there are assumptions in Beijing that there is a degree of strategic mistrust between the US and India in light of recent standoffs over the Pannun and Nijjar cases, and that this could force New Delhi to look towards China.

While Xi’s bid to redraw boundaries may have failed, China is unlikely to stop poking around on sensitive issues through all such non-conventional means, and this can test New Delhi’s cautious normalisation.

(Harsh V Pant is Vice President, Observer Research Foundation, New Delhi. Kalpit Mankikar is Fellow, China Studies, at ORF.)

Disclaimer: These are the personal opinions of the author

Is Chinese AI Startup Really A ‘Disruptor’?

admin — Tue, 28 Jan 2025 07:42:57 +0000

New Delhi:

There is a new kid on the Artificial Intelligence-driven chatbot / Large Language Model (LLM) block, and it is threatening to blow the rest out of the water. Meet DeepSeek, developed by a Hangzhou-based research lab with a fraction of the budget (if you believe the reports) used to make ChatGPT, Gemini, Claude AI, and others created by United States-based computer labs.

And the latest offerings – DeepSeek V3, a 671 billion parameter, ‘mixture of experts’ model; and DeepSeek R1, an advanced reasoning model possibly better than OpenAI’s o1 – have underlined its status as a potential heavyweight financial and technological disruptor in this field.

How much of a disruptor is it?

As of Monday, the DeepSeek app, powered by the V3 model, is the top downloaded app on Apple’s App Store in the US; let that sink in… a Chinese-developed chatbot is now the most-downloaded app in the US.

And that disruption, even if seen as a ‘potential’ one at this time, has raised doubts about how well some US tech companies have invested the billions pledged towards AI development.

Either way, the quality and cost efficiency of DeepSeek’s models have flipped this narrative; even if, in the long run, this particular Chinese model flops, that it was developed with a fraction of the financial and technological resources available to firms in the West is an eye-opener.

Again, how much of a disruptor is it?

Well, last month DeepSeek’s creators said training the V3 model required less than $6 million (although critics say the addition of costs from earlier development stages could push eventual costs north of $1 billion) in computing power from Nvidia’s H800 chips, a mid-range offering. “Did DeepSeek really build OpenAI for $5 million? Of course not,” Bernstein analyst Stacy Rasgon told Reuters.

But break down the available financials and it gets quite remarkable.

OpenAI’s o1 charges $15 per million input tokens.

DeepSeek’s R1 charges $0.55 per million input tokens.

The pricing, therefore, absolutely blows the competition away.

And, depending on end-use cases, DeepSeek is believed to be between 20 and 50 times more affordable, and efficient, than OpenAI’s o1 model. In fact, logical reasoning test score results are staggering; DeepSeek outperforms ChatGPT and Claude AI by seven to 14 per cent.
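
The per-token prices quoted above can be turned into a rough comparison. What follows is a minimal sketch in Python that uses only the two input-token prices cited in this article; the function name and the 10-million-token workload are illustrative assumptions, and output-token pricing is left out.

```python
# Input-token prices quoted in the article (US dollars per million tokens).
O1_PRICE_PER_MILLION = 15.00
R1_PRICE_PER_MILLION = 0.55


def input_cost(tokens: int, price_per_million: float) -> float:
    """Return the cost in dollars of sending `tokens` input tokens."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload (an assumption for illustration): 10 million input tokens.
tokens = 10_000_000
o1_cost = input_cost(tokens, O1_PRICE_PER_MILLION)
r1_cost = input_cost(tokens, R1_PRICE_PER_MILLION)

# How many times more the same input costs at the o1 price.
ratio = O1_PRICE_PER_MILLION / R1_PRICE_PER_MILLION

print(f"o1: ${o1_cost:.2f}, R1: ${r1_cost:.2f}, ratio: {ratio:.1f}x")
```

On input tokens alone, the quoted prices work out to a ratio of roughly 27 to 1, which sits inside the 20-to-50-times range cited above.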

Dev.to, a popular online community for software developers, said it scored 92 per cent in completing complex, problem-solving tasks, compared to 78 per cent by GPT-4.

Input tokens, by the way, refer to units of information as part of a prompt or question. These are basically what the model needs to analyse or understand the context of a query or instruction.

For context, OpenAI is believed to spend $5 billion every year to develop its models.

So, even if DeepSeek’s critics (see above) are right, it is still a fraction of OpenAI’s costs.
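
As a rough check on that claim, the figures quoted in this article can be compared directly. The sketch below is only indicative: it sets a one-off training cost against an annual development budget, which is not a like-for-like comparison, and the variable names are illustrative.

```python
# Figures quoted in the article, in US dollars.
DEEPSEEK_CLAIMED_COST = 6_000_000        # "less than $6 million" for V3 training
DEEPSEEK_CRITICS_COST = 1_000_000_000    # critics' upper estimate, "north of $1 billion"
OPENAI_ANNUAL_SPEND = 5_000_000_000      # "believed to spend $5 billion every year"

# Each DeepSeek figure as a share of OpenAI's reported annual spend.
claimed_share = DEEPSEEK_CLAIMED_COST / OPENAI_ANNUAL_SPEND
critics_share = DEEPSEEK_CRITICS_COST / OPENAI_ANNUAL_SPEND

print(f"Claimed cost:     {claimed_share:.2%} of OpenAI's annual spend")
print(f"Critics' estimate: {critics_share:.2%} of OpenAI's annual spend")
```

On these numbers, the claimed figure is about 0.12 per cent of OpenAI's reported annual spend, and even the critics' estimate is about 20 per cent.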

This translates, as company boss Sam Altman pointed out, into significantly enhanced computing capabilities, but for the DeepSeek model to deliver at least that much processing power on its relatively shoestring budget is an eyebrow-raiser.

And Mr Altman acknowledged that, calling the R1 model “very impressive”.

Google boss Sundar Pichai went one step further, telling CNBC at Davos, “I think we should take the development out of China very seriously.” And US President Donald Trump sounded a “wake-up” call.

And there are the hundreds of billions of dollars that US companies have lost amid a rout this week in tech stocks; chip-maker Nvidia, for example, lost nearly $600 billion in market value and the tech-rich Nasdaq index finished Monday down by more than three per cent, with the unwelcome possibility of a further drop based on AI giants Meta and Microsoft’s expected earnings reports.

For context, Meta and Microsoft both have their own AI models, at the forefront of which are Llama and Copilot; the former is an LLM that was first released in February 2023 and the latter is now an integrated feature in various Microsoft 365 applications, such as MS Word and Excel.

While neither is, arguably, on the same tech level as OpenAI or ChatGPT, Meta and MS have invested billions in AI and LLM projects, both in the US and abroad. For example, some analysts believe big US cloud companies will spend $250 billion this year on AI infrastructure alone.

But what really makes DeepSeek special is more than the cost and technology.

It is that, unlike its competitors, it is genuinely open-source.

The R1 code is completely open to the public under the MIT License, which is a permissive software license that allows users to use, modify, and distribute software with few restrictions.

This means you can download it, use it commercially without fees, change its architecture, and integrate it into any of your existing systems.

DeepSeek is also faster than GPT-4, more practical and, according to many experts, even understands regional idioms and cultural contexts better than its Western counterparts.

There is much more to consider.

How, for example, does DeepSeek affect diplomatic and military ties between China and the US (and India also, actually), and what are the ethical problems with truly open-source AI models?

But what is undeniable is that China’s DeepSeek is a disruptor. And experts believe China has now leapfrogged – from 18 to six months behind state-of-the-art AI models developed in the US.

Meanwhile, DeepSeek’s success has already been noticed in China’s top political circles.

On January 20, the day R1 was released to the public (and also the day Trump was sworn in as President of the US), founder Liang Wenfeng attended a closed-door symposium for businessmen and experts hosted by Chinese Premier Li Qiang. His presence has been seen as a sign DeepSeek could be important to Beijing’s policy goal of achieving self-sufficiency in strategic industries like AI.

With input from agencies

US Big Tech Faces Heat As China’s DeepSeek Questions Billion-Dollar Spending

admin — Mon, 27 Jan 2025 19:54:12 +0000

San Francisco:

Chinese startup DeepSeek’s cheaper AI is sharpening investor scrutiny of the billions U.S. tech giants are pouring into developing the technology, and analysts say it will dominate this week’s much-awaited results from industry bellwethers.

DeepSeek has claimed it took just two months and cost under $6 million to build an AI model using Nvidia’s less-advanced H800 chips. An app powered by the V3 model became the top iPhone download in the U.S. on Monday.

The startup founded in 2023 has said its AI models either match or outperform top U.S. rivals at a fraction of the cost, challenging the view that scaling AI requires vast computing power and investment.

That view has powered an increase of around $10 trillion in the market value of the “Magnificent Seven” companies since ChatGPT kicked off the AI boom in November 2022.

“Did DeepSeek really build OpenAI for $5 million? Of course not,” Bernstein analyst Stacy Rasgon said. “It seems like a stretch to think the innovations being deployed by DeepSeek are completely unknown by the top tier AI researchers at the world’s other numerous AI labs.”

DeepSeek’s pricing blows away anything from the competition, he said. Shares of AI chip pioneer Nvidia sank 16%, Microsoft fell 3.8% and TSMC’s U.S. stock tumbled 14%.

Rasgon and other analysts argue DeepSeek’s training costs for its V3 model could be higher as the nearly $6 million cited by the startup only includes the amount spent on computing power, while little is known about the costs to build the more publicized R1 model.

Still, it is a far cry from the $250 billion analysts estimate big U.S. cloud companies will spend this year on AI infrastructure. That spending has been questioned by investors worried about slow returns in the past year.

With most of the American tech giants set to report results this week and the next, analysts and investors expect executives of the companies to offer more clarity on their strategy.

“(DeepSeek’s rise) puts into question whether the current pace of capex spend/technology upgrades is necessary. Commentary from U.S. hyperscalers will be key this week to see if they remain aggressive with AI spend,” CFRA analyst Angelo Zino said.

“They will likely stress the need for greater computing power as we shift toward agentic AI and physical AI,” Zino added, referring to autonomous AI agents that require little human intervention for routine tasks, as well as robots and self-driving cars.

PRICING PUSH

While the price of using AI models has been falling with rising competition and progress in the technology, Bernstein’s Rasgon said DeepSeek stands out as it has priced its models up to 40 times lower than OpenAI’s comparable models.

That could, analysts said, start a price war for AI services, potentially pressuring tech companies such as OpenAI that are already losing billions of dollars each year due to the high operational costs of running services such as ChatGPT.

“If DeepSeek adoption intensifies, it could initiate price reductions from competitors who have similar open source products,” said Gadjo Sevilla, senior analyst at eMarketer.

“Market leaders like OpenAI (pushing for profitability) are unlikely to lower pricing in the short term. They will likely double down on trust and safety as key differentiating features, which happen to matter to enterprise users.”

Some experts also doubt that U.S. businesses would be willing to embrace Chinese AI technology, given Sino-U.S. tensions and concerns about data privacy and security.

DeepSeek has said it stores user information in servers in China, which could be a sticking point in its U.S. adoption.

Some investors, however, believe American tech giants would pounce on DeepSeek’s breakthroughs and that cheaper AI services are bound to increase technology adoption, which could lift demand for chips.

“Did DeepSeek seek and find a more efficient processing model for AI? Maybe, but you can count on the incumbents to adopt any new techniques found,” said Mark Malek, chief investment officer at SiebertNXT.

“(This) would only make the AI opportunity bigger in the future.”

(Except for the headline, this story has not been edited by NDTV staff and is published from a syndicated feed.)
