SenseTime releases SenseNova 5.0, benchmarking against GPT-4 Turbo
On April 24th, trading in SenseTime Technology was urgently suspended after its stock price surged more than 30%. The day before, on April 23rd, SenseTime had held a technology exchange day event and launched SenseNova 5.0, a large model with 600 billion parameters. The official announcement called it “China’s first GPT-4 Turbo-level large model”. SenseNova 5.0 has stronger knowledge, mathematics, reasoning, and coding abilities, and its overall performance benchmarks against GPT-4 Turbo, matching or surpassing it on mainstream objective evaluations.
In practical use, SenseTime said the model’s natural-language abilities in creative writing, reasoning, and summarization have improved significantly, as have its image generation abilities. SenseTime added that its multimodal large model’s image-text perception has reached a globally leading level.
Comment: This update mainly enhances knowledge, mathematics, reasoning, and coding abilities. The gains in SenseNova 5.0 come partly from a Mixture of Experts (MoE) architecture, which activates only a small subset of parameters during inference, with a context window of about 200K at inference time. Second, the model is trained on more than 10TB of tokens, including synthetic chain-of-thought data at the hundred-billion-token scale. It also benefits from joint optimization of computing power and algorithm design on SenseTime’s SenseCore AI infrastructure. Recently, overseas companies such as Anthropic and Meta have released new models whose scores exceed GPT-4 on some benchmarks, with clear momentum to overtake OpenAI, and now China has SenseNova 5.0 as well. Competition may be entering a new stage: GPT-5 may launch this summer, and OpenAI has signaled that GPT-5’s performance far surpasses GPT-4.
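The sparse-activation idea behind MoE can be shown with a minimal sketch: a gate scores all experts, but only the top-k actually execute, so inference touches a small fraction of total parameters. This is a generic NumPy toy under that assumption, not SenseNova 5.0’s actual implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def moe_forward(x, experts, gate_w, k=2):
    """Route the input through only the top-k experts (sparse activation)."""
    scores = softmax(gate_w @ x)      # gating scores over all experts
    top = np.argsort(scores)[-k:]     # indices of the k highest-scoring experts
    weights = scores[top] / scores[top].sum()
    # Only k expert networks actually run; the rest stay inactive,
    # so inference uses a small fraction of the total parameters.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy usage: 8 linear "experts", only 2 of which run per input.
rng = np.random.default_rng(0)
experts = [lambda v, W=rng.normal(size=(4, 4)): W @ v for _ in range(8)]
gate_w = rng.normal(size=(8, 4))
y = moe_forward(rng.normal(size=4), experts, gate_w, k=2)
print(y.shape)  # (4,)
```

The appeal is the one captured in the comment above: total parameter count (and hence capacity) can grow with the number of experts while per-token inference cost stays roughly constant.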
Nvidia makes another move to acquire an Israeli company
On April 24th local time, Nvidia announced that it had signed a definitive acquisition agreement with the Israeli company Run:ai, which Nvidia believes can help customers use their AI computing resources more effectively. Founded in 2018, Run:ai is a workload management and orchestration software provider built on the open-source container orchestration platform Kubernetes; its products improve the resource utilization of GPU clusters. Reports put the value of Nvidia’s acquisition of Run:ai at $700 million. Nvidia has also recently acquired another Israeli startup, Deci, founded in 2019, which provides efficient generative AI and computer vision models; its approach is to resize AI models so they run more cheaply on AI chips.
Comment: Israeli startups are clearly favored by Nvidia. In 2019, Nvidia beat rival bidders such as Intel and Microsoft to acquire the Israeli networking supplier Mellanox for approximately $7 billion, a key acquisition for Nvidia’s high-speed networking portfolio. These two new acquisitions show Nvidia positioning itself around efficient use of GPU-cluster resources and helping customers lower the cost of using AI. The high computational cost of large AI models is evident: training one can cost tens of millions of dollars, and behind that lies the difficulty of improving AI chips’ energy efficiency and reducing power consumption. Beyond the chip manufacturing process, Nvidia is seeking more solutions to promote the popularization of AI.
Apple acquires Paris-based artificial intelligence startup Datakalab
According to French media reports, Apple has acquired the Paris startup Datakalab, which focuses on artificial intelligence compression algorithms and computer vision technology. The acquisition was completed in December last year, and both companies reported the transaction to the European Commission this month.
Datakalab is an AI startup headquartered in Paris, France, specializing in artificial-intelligence compression and computer vision. It describes itself as an expert in low-power, runtime-efficient deep learning algorithms whose systems can run on-device. In May 2020 the company worked with the French government to deploy AI tools on the Paris public transportation system to check whether passengers were wearing masks.
Comment: The market sees this acquisition as part of Apple’s broader AI strategy, aimed at bringing more sophisticated AI technology onto its devices, such as the upcoming iOS 18 and the future Apple Vision Pro. On the earnings call in February this year, Apple CEO Tim Cook revealed that there are some exciting things about Apple’s AI work that the company will discuss later this year. Apple has not moved fast in the large-model field. Whether it will build independently, acquire its way to AI capability, or ship products from external large-model companies directly on its devices, the outside world is still waiting for Apple to decide.
OpenAI CEO invests in energy startup Exowatt
According to foreign media reports, the energy startup Exowatt has recently raised a $20 million (approximately RMB 145 million) seed round from investors including OpenAI CEO Sam Altman and the well-known Silicon Valley venture capital firm Andreessen Horowitz.
Exowatt was founded in 2023 to use solar energy to meet large data centers’ demand for clean power. Unlike traditional solar panels, which convert sunlight directly into electricity, Exowatt reportedly takes the unusual approach of storing heat rather than electricity, keeping solar energy in a thermal battery. The company has developed a three-in-one modular energy system, combining collectors, thermal batteries, and heat engines, designed to supply data centers with dispatchable power and heat.
Comment: This is not Altman’s first investment in an energy company. He has previously invested in Helion Energy and Oklo, which work on controllable nuclear fusion and nuclear fission power generation respectively. The endgame of AI lies in energy: one study suggests ChatGPT may consume 500,000 kilowatt-hours of electricity per day, more than 17,000 times the average daily electricity consumption of an American household. At the Davos World Economic Forum earlier this year, Altman said that artificial intelligence will need breakthroughs in energy, as AI will consume far more electricity than people expect. Investing in energy companies suggests OpenAI hopes that changes in the energy sector will support AI computing power.
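As a quick sanity check, the two reported figures are mutually consistent: a back-of-envelope division (illustrative only, not taken from the cited study) implies an average US household uses about 29 kWh of electricity per day, which matches the commonly cited figure of roughly 10,500 kWh per year.

```python
# Back-of-envelope check of the reported figures (illustrative only,
# not numbers computed in the cited study itself).
chatgpt_kwh_per_day = 500_000   # reported daily consumption of ChatGPT
household_multiple = 17_000     # reported multiple of a US household

implied_household_kwh = chatgpt_kwh_per_day / household_multiple
print(f"Implied household use: {implied_household_kwh:.1f} kWh/day")
# About 29.4 kWh/day, consistent with roughly 10,500 kWh/year
# average US residential electricity consumption.
```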
AI search startup Perplexity AI receives a new round of $63 million in financing
On April 24th, the AI search engine startup Perplexity AI announced on social media that it had raised $62.7 million at a valuation of $1.04 billion. Daniel Gross led the round, with Stan Druckenmiller, NVIDIA, Jeff Bezos, Tobi Lutke, Garry Tan, Andrej Karpathy, and others participating.
Perplexity AI provides a generative AI search engine that lets users search for any information in natural language; it can be viewed as a search-engine version of ChatGPT.
According to public information, Perplexity AI was founded in August 2022 and raised $3.1 million in seed financing that September. In December it released its Q&A engine “Ask”, and the business grew rapidly: within four months its monthly active users exceeded 2 million, making it the dark horse of generative AI search. This is also an important reason it has been able to attract investment from Microsoft, Google, GitHub, and other companies.
Comment: Perplexity AI says that Q&A models like ChatGPT have upended how applications interact with users, simplifying much manual work and making information easier and broader to obtain. This also gives Perplexity AI an opening to challenge traditional search engines through intelligent technical innovation. Large financing rounds for overseas startups are still ongoing, indicating that AI search engines remain an important direction.
Aishi Technology completes A2 round of over 100 million yuan, led by Ant Group
On April 23rd, Light Source Capital officially announced that Aishi Technology has completed an A2 financing round exceeding 100 million yuan, led by Ant Group, with Light Source Capital serving as exclusive financial advisor. Aishi Technology has now raised over 200 million RMB within a year, making it the best-funded startup in China’s video large-model field. The round will reportedly be used to further iterate its self-developed video generation model, upgrade the team, and accelerate industry applications of AI video generation technology.
Comment: Aishi Technology was founded in April 2023. Its founder and CEO, Wang Changhu, was formerly head of ByteDance’s visual technology team. In January 2024 the company officially released its text-to-video product PixVerse; the official site says monthly traffic has exceeded one million and cumulative videos generated have exceeded ten million, with wide use by creators in film and television, advertising, anime, and other content production. Wang Changhu has expressed the hope that AI-native video can be integrated into the content industry’s production and consumption chain, bringing sustained vitality to the AIGC field. Since the launch of Sora, the domestic video-generation track has also heated up, with Shengshu Technology producing the Vidu large model; Aishi Technology’s next moves are highly anticipated.
Video large model Vidu released
At the Future Artificial Intelligence Pioneer Forum of the 2024 Zhongguancun Forum Annual Meeting, Tsinghua University and Shengshu Technology jointly released Vidu, China’s first video model with long duration, high consistency, and high dynamics. The model adopts the team’s original U-ViT architecture, which combines Diffusion and Transformer, and supports one-click generation of high-definition video up to 16 seconds long at resolutions up to 1080p, directly from a text description.
Comment: According to Zhu Jun, professor at Tsinghua University and chief scientist of Shengshu Technology, Vidu generates video in a “single step”: like Sora, the text-to-video conversion is direct and continuous, and the underlying algorithm is fully end-to-end from a single model, with no intermediate frame interpolation or other multi-step processing. After Sora’s release in February this year, the team drew on its understanding of the U-ViT architecture and long-accumulated experience to break through key technologies in long-video representation and processing within two months and launch the model. Judging from the released clips, Vidu’s visuals look quite realistic, but the published videos are still only seconds long and some motion transitions are not smooth; there should be significant room for improvement.
“First AIGC stock” Mobvoi breaks below its issue price on the first day of listing
On April 24th, billed as the “first AIGC stock”, Mobvoi (02438.HK) officially listed on the Hong Kong Stock Exchange and immediately broke below its issue price. The shares were priced at HKD 3.80 each and opened down 21.58% at HKD 2.98. By the close, Mobvoi’s stock stood at HKD 3.68, down 3.16%, for a market value of HKD 5.489 billion.
According to the prospectus Mobvoi filed, the company’s revenue from 2021 to 2023 was 398 million yuan, 500 million yuan, and 507 million yuan respectively, while cumulative total comprehensive loss attributable to equity shareholders over the period exceeded 2 billion yuan. After excluding changes in the carrying value of redeemable preferred and ordinary shares, share-based compensation, and listing expenses, the adjusted net loss for 2021 was 73 million yuan, and adjusted net profits for 2022 and 2023 were 109 million yuan and 18 million yuan respectively.
Comment: Founded in 2012 by former Google scientist Li Zhifei, Mobvoi is an AI company whose core business is generative AI and voice interaction technology. It has a self-developed large model called “Sequence Monkey” and provides AIGC (AI-generated content) solutions, AI enterprise solutions, and smart devices and accessories for content creators, enterprises, and consumers. According to Mobvoi, the company has more than 10 million AIGC-solution users worldwide, of whom approximately 840,000 are paying users. Breaking below the issue price on the first day may indicate that the capital market still has concerns about the commercialization ability of AIGC products.
30% of Tencent’s code is generated by its AI code assistant
Tencent Cloud recently disclosed that 30% of Tencent’s code is written by Tencent Cloud’s AI code assistant. R&D personnel make up over 74% of Tencent’s staff, and the company has launched an AI code assistant based on its Hunyuan model. Half of Tencent’s employees use the assistant daily, with a code generation rate above 30%. This assistant, an “AI programmer”, can intelligently complete code, speed up coding work, fix buggy code, explain existing code, and answer AI technology questions conversationally. Tencent Cloud’s AI code assistant is also preparing to enter more industries such as finance.
Comment: More and more “AI programmers” are going to work at technology giants. Interestingly, Baidu recently revealed that 27% of its daily new code is generated by its intelligent code assistant Comate, also close to 30%. This may suggest that, for now, AI can take over roughly 30% of a human programmer’s work, and that more may not be realistic. Not long ago, the AI programming product Devin from Cognition Labs was questioned over misleading demonstrations of its programming ability: many of the problems it “fixed” were ones Devin itself had introduced. Using AI as a helper for human programmers, rather than a replacement, is currently the more realistic path. Beyond using their own code assistants internally, it will be worth watching which industries these large companies open them up to, and what changes follow.
SK Hynix plans to invest over 100 billion yuan to expand production
SK Hynix reportedly plans to invest approximately 20 trillion Korean won (about $14.6 billion) to build new memory-chip capacity in South Korea and upgrade existing capacity to meet rapidly growing demand from AI development. The company will make an initial allocation of 5.3 trillion won to begin building a new fab around the end of April, with completion planned for November 2025.
Comment: SK Hynix is one of the major DRAM (Dynamic Random Access Memory) manufacturers. High AI demand has created a shortage of HBM (High Bandwidth Memory), which is built from multiple stacked DRAM dies. SK Hynix is a major supplier of Nvidia’s HBM, and Samsung is also striving to squeeze into Nvidia’s supply chain. With HBM demand rising, competition among DRAM makers has become competition over HBM: beyond technological leadership, whoever can scale capacity and mass-produce fastest will have more opportunities in the AI era.
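To illustrate why stacking DRAM dies matters, a single HBM3 stack exposes a very wide 1024-bit interface, so even a modest per-pin rate yields enormous bandwidth. The figures below are public JEDEC HBM3 spec numbers, not figures from the article:

```python
# Per-stack HBM3 bandwidth from public spec figures (JEDEC JESD238),
# not numbers taken from the article: 1024-bit bus, 6.4 Gb/s per pin.
bus_width_bits = 1024
pin_rate_gbps = 6.4

bandwidth_gb_per_s = bus_width_bits * pin_rate_gbps / 8  # bits -> bytes
print(f"{bandwidth_gb_per_s:.1f} GB/s per HBM3 stack")  # 819.2 GB/s
```

A GPU carrying several such stacks reaches multiple terabytes per second of memory bandwidth, which is what large-model inference and training workloads demand.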
Moonshot AI denies founder Yang Zhilin cashed out tens of millions of dollars
Reports claimed that Yang Zhilin, founder of the artificial intelligence startup Moonshot AI, had cashed out tens of millions of dollars by selling personal shares, with the founder and related personnel cashing out $40 million in total. Market rumors also said that Zhang Yutong, a former managing partner at GSR Ventures who participated in Yang Zhilin’s first startup, Recurrent AI, had resigned from GSR Ventures to join Moonshot AI. In response, a person in charge at Moonshot AI said the cash-out report was untrue: the company had previously announced an employee incentive plan, and Zhang Yutong has not joined Moonshot AI.
Comment: Judging by its past financing, Moonshot AI is indeed well funded. The company has completed three rounds; in February this year it closed a Series B of over $1 billion at a pre-money valuation above $1.5 billion, led by Alibaba with Li Si Capital and Xiaohongshu following, for a post-money valuation of roughly $2.5 billion, making it one of China’s most important large-model unicorns. Moonshot AI previously drew market attention with its long-text chat application Kimi, which even spawned a group of “Kimi concept stocks” in the capital market. But amid the rising heat, the outside world is more hopeful that Moonshot AI can deliver more commercially successful products to prove it can generate revenue on its own.
OpenAI CEO claims that the performance of GPT-5 far exceeds that of GPT-4
On April 25th, OpenAI co-founder and CEO Sam Altman gave a talk at Stanford University. In a leaked video, Altman said GPT-5 will be more intelligent and will be one of the most eye-catching developments in history; GPT-6 will be much smarter than GPT-5, and we are far from the limits. On product iteration, Altman believes it is very important to launch AI products early and often and to maintain iterative deployment, even if ChatGPT seems a bit awkward now and GPT-4 will come to seem dumb. Preparing society for technological progress depends on iterative deployment.
Comment: With multiple competitors releasing newer, stronger large models, the open question is what products OpenAI will respond with. Altman’s claim that GPT-5 far exceeds GPT-4 further raises expectations. Rumors about OpenAI’s next-generation model have been frequent recently: CITIC Securities reported that GPT-5 is in red-team testing and could launch as early as this summer, is expected to continue using an MoE (Mixture of Experts) architecture, and may reach new milestones in multimodal understanding, long-text input, and other areas. It seems GPT-5 will arrive soon.
Alibaba, Baidu, and Tencent Cloud compete for Llama 3 computing power
After Meta released two open-source models in the Llama 3 series, cloud vendors such as Baidu, Alibaba, and Tencent quickly moved to capture Llama 3 deployment demand. On April 22nd, Alibaba Cloud announced that its Bailian large-model service platform would offer limited-time free training, deployment, and inference for the Llama 3 series. The same day, Tencent Cloud announced that its TI platform was among the first in China to support the full Llama 3 series. On April 19th, Baidu AI Cloud’s Qianfan platform announced it was the first domestic cloud vendor to launch full training and inference solutions for Llama 3.
Comment: Unlike closed-source models, whose training and inference are concentrated with their vendors, open-source models disperse computing demand, giving many cloud vendors a chance to compete for deployment business. The benchmark results of the open-source Llama 3 are impressive: the instruction-tuned Llama 3 8B scored higher than Gemma 7B-It and Mistral 7B Instruct on five benchmarks, while Llama 3 70B scored higher than Gemini Pro 1.5 and Claude 3 Sonnet on three. As Grok-1, Llama 3, and others keep pushing up the parameter ceiling for open-source models, potential computing demand grows. If the 400-billion-plus-parameter Llama 3 version performs comparably to GPT-4, deployment demand will rise further, and cloud vendors that support Llama 3 will not miss the potential computing-power dividend.
Tsinghua University establishes the School of Artificial Intelligence
Tsinghua University has established a School of Artificial Intelligence focusing on two key directions, “core basic theory and architecture of artificial intelligence” and “artificial intelligence + X”, to provide strong support for high-level technological self-reliance. Turing Award winner and Chinese Academy of Sciences academician Yao Qizhi (Andrew Chi-Chih Yao) serves as the school’s first dean.
Comment: Tsinghua University is one of the earliest institutions in China to conduct artificial-intelligence teaching and research. It established an “artificial intelligence and intelligent control” teaching and research group in 1978, China’s first intelligent-robot laboratory in 1985, and in 1990 the first State Key Laboratory named after “intelligence”. It later founded the Brain and Intelligence Laboratory, the Future Laboratory, the Institute for Artificial Intelligence, the Institute for AI International Governance, and the Institute for AI Industry Research. Today a group of AI startup teams have Tsinghua backgrounds, including Zhipu AI, ModelBest (Mianbi Intelligence), Moonshot AI, and Shengshu Technology; the “Tsinghua faction” has become a shining presence. With a dedicated AI school and clearly chosen key directions, Tsinghua is expected to strengthen its leadership in artificial intelligence.