model, cool

Image source: Generated by Unbounded AI

After more than 200 days of large-model entrepreneurship, the mindset of Chinese explorers has shifted from idealistic excitement to realism.

Before this, starting a large AI model company carried meanings beyond the business itself, such as national sentiment and the trend of the times. Faced with the emergence of the general-purpose model ChatGPT, Chinese entrepreneurs quickly reached a consensus on one goal: building a Chinese version of OpenAI and ChatGPT.

There is no doubt that ChatGPT is the biggest sensation in the world this year. Because of it, visits to the OpenAI website exceeded 1.8 billion in April, placing it among the top 20 sites in global traffic rankings. However, according to data released by the web analytics company Similarweb, after six months of rapid growth, ChatGPT's traffic declined for the first time: visits in June fell 9.7% from the previous month.

The sudden drop in ChatGPT traffic has triggered concern and debate in the global technology community about the risk of a bubble in the AI industry. The Economist even concluded that "the road to AI that is bigger and better is no longer feasible." The idea of becoming a "Chinese version of ChatGPT" is also fading from China's startup circles.

Zhu Xiaohu, founding partner of GSR, wrote on WeChat Moments: "Don't be superstitious about the general model, because next year GPT-3.5 will become a commodity (general infrastructure), and three years later, GPT-4 will be too. For most entrepreneurs, the scenario comes first, and data is king!"

As practitioners' mindset has changed, the large-model startup market has begun to diverge.

China's large-model companies are no longer obsessed with the idealized goal of "becoming China's OpenAI", nor with the race for parameter counts and computing power. They have more pragmatic answers and pay more attention to solving problems in actual industrial scenarios.

The competition among large AI models has reached a new turning point.

Collective Pragmatism

At the beginning of this year, Liang Jianzhang, the founder and chairman of the board of directors of Ctrip, began using ChatGPT. In recent years, he has been active in academic and business circles as a demographer, but he was also one of the earliest and youngest programmers in China, and nearly earned a Ph.D. in artificial intelligence.

At the age of 13, Liang Jianzhang designed a program for writing metrical poems and won a national award. At the age of 15, he was admitted to the junior class of Fudan University. After graduation, he went to the United States to study and obtained a master's degree in computer science from the Georgia Institute of Technology at the age of 21. At the age of 22, Liang Jianzhang, who was studying for a Ph.D., suddenly realized the limitations of theoretical knowledge, and chose to give up his Ph.D. to join Oracle in the United States.

The emergence of ChatGPT made Liang Jianzhang glad that he did not continue to study for a doctorate in artificial intelligence, because "All these natural language processing algorithms were completely defeated by it (ChatGPT)". He recalled that when he first came into contact with ChatGPT, he was "very, very shocked", and the second feeling was humility, "The most intelligent algorithm we finally made was so close to the biological structure of the human brain."

Liang Jianzhang began to think about how to combine large AI models with Ctrip's business.

In Liang Jianzhang's view, in an increasingly intelligent society, tourism is a "spiritual demand industry that is difficult to automate": demand for it will grow as society becomes more affluent, and its share of the economy will expand. He set his sights on a vertical model for the tourism industry.

Over the past six months, members from Ctrip's various businesses formed the Ctrip large-model technical team. In the initial stage of development, Ctrip adjusted its organizational structure to fit its large-model strategy, forming separate groups including a general technical team, a content strategy team, and a list team, and it has continued to adapt the technical team as its large-model products are upgraded and iterated.

Liang Jianzhang said that Ctrip will spare no effort to invest in the large model, "We should invest very firmly in the long run if it is valuable to our customers or merchants. This (large model) is very new, and the technical team is constantly optimizing their investment amount, but we have no limit."

In mid-July, Ctrip officially released Ctrip Wendao ("Ask"), a large vertical model for the tourism industry. Wendao screened 20 billion pieces of unstructured travel data, combined them with Ctrip's existing structured real-time data and with the bots and search algorithms Ctrip has trained over the years, and used them to train a self-developed vertical model. "At the same time, we have invested a lot of manpower in generating and verifying the general travel reply content," Liang Jianzhang emphasized.

Before the model's release, Ctrip had conducted an internal test, and customer service agent Wang Yun clearly felt that her work had changed a great deal. In the past, she had to take more than 150 calls a day, answering customers' after-sales questions about order cancellations and lost luggage. Now she has moved into pre-trip recommendation service, stepping from behind the scenes into the livestream room to offer viewers advice on what to prepare before traveling.

In Liang Jianzhang’s view, even when building on a general-purpose large model, solving the accuracy problem in the tourism industry remains the key: “Travel is a consumption-heavy industry. Even if the planning saves half an hour, there may be a 5% chance that the recommended hotel or itinerary result will be wrong.” For this reason, Liang Jianzhang sees more opportunity in vertical large models than in general-purpose large models like ChatGPT.

A vertical large model does not match a general large model in parameter count, nor does it make the same harsh demands on computing power and other resources. It does, however, place higher requirements on data and scenarios.

The biggest challenge in training Ctrip's model lies in real-world use. Users obtain travel information through multiple rounds of interaction, and that interaction data must be collected and cleaned, with data volume and accuracy constantly corrected. The tourism industry has also changed enormously: destination information from three years ago may be completely outdated by now, and the impact of the epidemic on global tourism has made data timeliness an even sharper problem.

Like Ctrip, more and more companies are focusing their large-model efforts on vertical fields.

The Yanxi large model delivered by JD.com also makes its industrial focus a defining feature. According to Xu Ran, the new CEO of JD Group, artificial intelligence technology has been on the verge of an application explosion several times in the past, but each time it proved short-lived, and one important reason is that the technology never took solid root in industry applications.

At the press conference of Huawei Pangu Large Model 3.0 on July 7, Zhang Pingan, CEO of Huawei Cloud, even said bluntly, "Pangu Large Model has no time to write poems and chat. No matter how many parameters and how good the dialogue ability is, if it can't solve practical problems, it will not be of much use."

Tencent has not yet announced the progress of its general large model Hunyuan, but it has unveiled its industry large-model route in a high-profile manner, rolling out more than 50 solutions for 10 major industries in one go. Li Qiang, vice president of Tencent and president of Tencent's government and enterprise business, also said: "General large models are not the only direction for model application, and models for vertical industries will become the tipping point of the value of large models."

Mindset Shift

A clear turning point in the mindset of large-model entrepreneurs came when Wang Huiwen was diagnosed with depression and Light Years Beyond, the company he founded, was acquired by Meituan. Everyone suddenly realized that even a star company that carried high hopes might have to stop because of unforeseen events.

Just half a year ago, other hot trends seemed to dissipate overnight, and only the large model held center stage. Entrepreneurs and investors in China's technology circles flew to Silicon Valley to learn from OpenAI. Zhang Yiming, Ma Huateng, and Wang Xing, the top leaders or central decision-makers of their respective giants, returned overnight to the excitement and curiosity of their early startup days, reading papers and discussing technology late into the night.

At that time, it seemed that every explorer of Chinese large models approached the problem with a sense of national mission. Faced with the rapid iteration of ChatGPT, the goal Chinese entrepreneurs set was how quickly they could catch up and overtake it.

Robin Li (Li Yanhong) said that when Baidu's Wenxin Yiyan was in the research and development stage, Baidu's technical team ran a comparative test against ChatGPT: "At that time, the gap was 40 points, and it could catch up in a month." Wang Xiaochuan also said that he would build the best large language model in China by the end of the year.

Zhou Hongyi, the founder of 360, said in an interview with "Chinese Entrepreneur" that large models are no longer just a matter of commercial competition. If the closed nature of the Chinese Internet and the data silos between apps created by the mobile Internet are not resolved, the training of artificial intelligence engines is likely to be constrained, which could open an intergenerational gap between China and the United States in the new round of the AI revolution.

Chinese entrepreneurs even began to reflect at that time on why no company like OpenAI had been born in China. Their answer: in the past, domestic artificial intelligence research was too pragmatic and everything was KPI-oriented, so no one had the determination and patience to invest for the long term, and such an important turning point was missed.

Riding this enthusiasm, after Baidu Wenxin fired the first shot, more than 80 large AI models emerged within half a year, and more than 30 large models appeared at the 2023 World Artificial Intelligence Conference in Shanghai alone. It is no exaggeration to describe the large-model boom as a "war of a hundred models".

But do we really need so many big models? What kind of big model do we need?

In fact, Robin Li proposed very early on, "It doesn't make much sense for startups to recreate ChatGPT. I think there is a great opportunity to develop applications based on this large language model. There is no need to reinvent the wheel. After the wheel is available, the value of making cars and airplanes may be greater than that of the wheel."

He Xiaodong, vice president of technology at JD Group, also realized from the very beginning, "If the big model is to be valuable, it must be placed in the industry, and it is best to be in a field with high industrial value. Only in this way can it truly become a long-term sustainable thing, otherwise it may become a short-lived thing."

New Variable

Just when domestic entrepreneurs were struggling to explore the AI model, Zuckerberg's big move brought new variables to this ever-changing field.

In the early hours of July 19th, Beijing time, Meta released the open-source large model Llama 2, which once again set the AI world abuzz: Llama 2 is not only said to perform on par with GPT-3, but is also free, open-source, and available for commercial use. Llama 2 is the follow-up to the Llama model that Meta released earlier this year.

At the subsequent Microsoft Inspire partner conference, Microsoft CEO Satya Nadella announced the news of the cooperation between Meta and Microsoft. This cooperation allows Llama 2 to run on Microsoft's cloud service Microsoft Azure. At the same time, Amazon AWS cloud also joined the cooperation with Meta.

For large-model entrepreneurs, Llama 2 is what the Android system is to app development: developers do not need to reinvent the wheel and can obtain large-model infrastructure directly at the lowest cost, so that they can focus more on their own industrial scenarios.

To some extent, this also means that for most entrepreneurs, choosing to focus on industry vertical applications has proved to be a more practical path.

However, unlike general large models, industrial large models set different thresholds and requirements for industry participants. On the one hand, industrial large models require developers to have a certain level of technical accumulation and strength; on the other hand, they also require operators to have rich practical application scenarios in industry.

Liang Jianzhang told "Chinese Entrepreneur": "The most important indicator of a general large model may be how many parameters or how many GPUs are used, etc., but for a vertical large model, the large language model is only one part of it. It is also combined with other data, including manual verification, etc. These are more important. Ultimately, it comes down to whether the efficiency, accuracy, and reliability of the answers to customers' questions can be improved."

"The biggest problem in tourism is reliability. This is indeed more complicated than AI writing poems, articles, and novels. It is also a long-term work. Anything that can improve this to 80%, 90%, 95%, or even 99% is worth doing." Liang Jianzhang finally said.
