Qiu Lili, Vice President of Microsoft Research Asia: Wireless communication and sensing and large AI models empower each other

It is foreseeable that the application of large models in the field of wireless communication will become increasingly widespread, and the development of wireless communication will also bring more possibilities for the application of large models.

Large models are no longer unfamiliar to the public. Multimodal large models have moved beyond writing poetry and painting to improving entertainment experiences and work efficiency. However, Qiu Lili, vice president of Microsoft Research Asia and head of Microsoft Research Asia (Shanghai), believes that large models can do more.

Expanding the modalities that large models can understand

“AI technology is already capable of processing different data modalities, such as text, images, video, and speech. However, to support specific applications in industries such as healthcare, we also need to process data that differs from these traditional modalities, such as physiological signals and wireless sensing signals, including Wi-Fi, millimeter wave, and LiDAR. We are committed to better supporting these new modalities and exploring how to combine them with traditional ones. This field has huge development potential.” When discussing applications of large models, Qiu Lili’s focus is on making more modalities of data understandable to large models.

Qiu Lili is an expert in wireless communication. Her team’s wireless research mainly focuses on enhancing signals, improving data rates, and developing algorithms for transmission and sensing.

She showed media reporters, including First Financial, several lightweight boards smaller than a sheet of A4 paper and several plastic boards stacked like LEGO bricks. These are “metasurfaces”: two-dimensional materials with artificially designed structures. By designing each unit of a metasurface, the wavefront, phase, and amplitude of the reflected wave can be precisely controlled, enabling capabilities such as beam steering, focusing, and polarization conversion. By accurately modeling and optimizing acoustic signals or electromagnetic waves, metasurfaces can enable imaging with ordinary sound waves and extend the range and speed of wireless communication. These passive metasurfaces are low-cost, require no power, and are easy to deploy, and they effectively enhance wireless performance and functionality.
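To make the idea of beam steering by per-unit phase design concrete, here is a minimal sketch. It is not the team's design method: it assumes an idealized one-dimensional row of reflecting units, a wave arriving at normal incidence, and a simple linear phase gradient. The function names and all parameter values are hypothetical.

```python
import cmath
import math

def steering_phases(num_units, spacing_m, wavelength_m, steer_deg):
    """Phase shift (radians, wrapped to [0, 2*pi)) applied by each unit so that
    reflections add up in phase in the direction steer_deg.
    Assumes an idealized 1-D array and normal incidence."""
    k = 2 * math.pi / wavelength_m          # wavenumber
    theta = math.radians(steer_deg)
    return [(-k * n * spacing_m * math.sin(theta)) % (2 * math.pi)
            for n in range(num_units)]

def array_gain(phases, spacing_m, wavelength_m, look_deg):
    """Magnitude of the summed reflected field observed in direction look_deg."""
    k = 2 * math.pi / wavelength_m
    theta = math.radians(look_deg)
    total = sum(cmath.exp(1j * (k * n * spacing_m * math.sin(theta) + p))
                for n, p in enumerate(phases))
    return abs(total)

# Hypothetical example: 16 units spaced half a wavelength apart, steered to 30 degrees.
wavelength = 0.02   # metres (illustrative value only)
phases = steering_phases(16, wavelength / 2, wavelength, 30.0)
gain_at_target = array_gain(phases, wavelength / 2, wavelength, 30.0)
gain_at_normal = array_gain(phases, wavelength / 2, wavelength, 0.0)
```

In this idealized model, the contributions of all 16 units add coherently in the steered direction and largely cancel elsewhere, which is the effect the per-unit phase design is meant to produce.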

For example, metasurfaces can give smart speakers a kind of vision. A smart speaker has only a few speakers and microphones, but with a metasurface it can emit waves in different directions and collect the reflections from objects. As a person moves around the room, the reflected signal can be used not only to locate the person, but also to sense small changes in distance and thereby detect breathing. The speaker can also understand what is happening in the space without infringing on privacy.

Ultrasound imaging is common, but metasurfaces significantly lower the barrier to implementing it, allowing devices such as smart speakers to perform imaging. One advantage of ultrasound imaging is that it protects privacy relatively well while also allowing for transmission. Conventional ultrasound imaging uses a bandwidth of several megahertz and requires a professional probe with hundreds of transceivers. A smart speaker, by contrast, has a bandwidth of only a few kilohertz and 2 to 4 speakers and microphones, so both bandwidth and the number of transceivers are severely limited. “We have achieved imaging on a typical smart speaker for the first time,” said Qiu Lili.

Qiu Lili’s team has not only developed metasurfaces to improve the accuracy of wireless sensing, but has also combined them with machine learning. When a wireless signal is transmitted, basic information such as its angle and distance can be determined through signal processing, which makes 2D positioning possible. In practice, however, when the signal is too weak or the target moves quickly, conventional signal processing cannot accurately capture the target's position and often produces anomalous results. In such cases it is particularly important to bring in machine learning models for the analysis. Combining signal processing with machine learning solves these problems more effectively.
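As a small worked example of the 2D positioning step described above: once signal processing has produced a distance and an angle for the target, the position follows from a polar-to-Cartesian conversion. This is a sketch under simple assumptions (the sensor sits at the origin, and the angle is measured counterclockwise from the x-axis); the function name and sample values are hypothetical.

```python
import math

def locate_2d(distance_m, angle_deg):
    """Convert a (distance, angle) estimate into (x, y) coordinates in metres.
    Assumes the sensor is at the origin and the angle is measured
    counterclockwise from the positive x-axis."""
    theta = math.radians(angle_deg)
    return (distance_m * math.cos(theta), distance_m * math.sin(theta))

# Hypothetical measurement: a target 2 m away at an angle of 60 degrees.
x, y = locate_2d(2.0, 60.0)
```

Errors in either the distance or the angle estimate carry straight through to the position, which is why weak signals or fast-moving targets are a problem for this step on its own.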

Machine learning can be applied to the analysis of signals in different modalities in many systems, such as intelligent speech-based early screening. “Speech is a very useful signal. It protects privacy better than video, and it also carries a great deal of information, including information about a person's physiological health. For example, our pronunciation reflects the health of the articulatory organs, and it can also reflect the health of the brain and our emotional state. Based on this, our team has developed a speech ‘therapist’ for patients with hypernasal speech and an early screening system for Alzheimer’s disease. We are collaborating with hospitals to put the technology into practice. In addition, we are exploring the use of speech to sense emotions,” she said.

Qiu Lili said that the team is also exploring unsupervised anomaly detection in video, for example to detect the abnormal stereotyped behaviors that some people with autism exhibit. “We use modeling to extract 2D and 3D key point information, and use certain characteristics of stereotyped behaviors to achieve unsupervised monitoring of anomalous behavior.”

These are all directions for future applications.

Bidirectional Empowerment of Large Models and Wireless Communication

The continued development of large models is, in turn, empowering wireless communication.

“AI can improve the compression ratio of data, and much content can be generated directly at the receiving end, which reduces the amount of data transmitted and greatly relieves pressure on the network. If packets are lost, AI can also repair the loss automatically. Using AI for data transmission has also prompted Microsoft to develop new technologies, such as running AI on edge devices to avoid uploading all data to the cloud, which not only reduces transmission requirements but also better protects privacy. For example, Microsoft’s recently released Phi-3-mini can better protect privacy because data does not have to be sent to the cloud,” she said, describing the role that large models can play in wireless data transmission.

At the end of April, Microsoft announced the small language model (SLM) Phi-3-mini on its official website. One of the fourth-generation products in Microsoft’s Phi series, Phi-3-mini has 3.8 billion parameters and was trained on 3.3 trillion tokens. It comes in two context-length variants, 4k and 128k tokens. After pre-training and instruction tuning, it can better understand human language, expression, and logic, and can carry out different types of instructions.

“We need to combine communication algorithms with application requirements to understand the specific demands of data transmission, so that we can compress and recover data better and thereby reduce transmission costs. Each modality has its own characteristics,” she said.

As a result, today’s wireless communication networks are complex systems that incorporate many artificial intelligence algorithms.

“AI technology affects every aspect of wireless communication, from the physical layer and the network layer to the application layer. At the physical layer, the most basic task is to decode wireless signals, that is, to determine whether a 1 or a 0 was transmitted. Usually this task relies on signal processing, but with multi-antenna arrays, traditional signal processing methods are not optimal. Adopting AI may therefore improve processing efficiency, and many companies are currently pushing development in this area.” Qiu Lili told First Financial that large models have great potential for diagnosing and repairing anomalies at the network layer, “because these models have a huge knowledge base and can diagnose problems effectively. They do not necessarily need to rely on ready-made cases. By analyzing network protocols, they can also predict the possible causes of failures.” This allows large models to play a role in network anomaly diagnosis.
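The physical-layer task mentioned above, deciding whether a 1 or a 0 was sent, can be illustrated with the simplest conventional approach. The sketch below assumes binary phase-shift keying (BPSK) over a single antenna, with bit 1 sent as +1.0 and bit 0 sent as -1.0. The article does not name a modulation scheme, so this choice, the function names, and the noise values are illustrative assumptions only.

```python
def modulate_bpsk(bits):
    """Map bits to transmitted amplitudes: 1 -> +1.0, 0 -> -1.0."""
    return [1.0 if b == 1 else -1.0 for b in bits]

def decode_bpsk(samples):
    """Hard decision on each received sample: positive -> 1, otherwise 0."""
    return [1 if s > 0 else 0 for s in samples]

# Hypothetical transmission with a fixed, small disturbance added to each sample.
sent_bits = [1, 0, 1, 1, 0]
noise = [0.2, -0.1, -0.4, 0.3, 0.5]          # illustrative values only
received = [s + n for s, n in zip(modulate_bpsk(sent_bits), noise)]
decoded_bits = decode_bpsk(received)
```

A fixed threshold like this works when the noise is small compared with the signal. With many antennas and interfering signals the decision becomes a much harder joint problem, which is the setting where the article says learned methods may help.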

In the future, with the continuous progress of AI technology, it can be foreseen that the application of large models in the field of wireless communication will become increasingly widespread, and the development of wireless communication will also bring more possibilities for the application of large models.