-
Publication No.: US20240406860A1
Publication date: 2024-12-05
Application No.: US18204343
Filing date: 2023-05-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Anuj KALIA, Yu YAN, Xenofon FOUKAS, Bozidar RADUNOVIC, Nikita LAZAREV
IPC: H04W52/02, H04W72/12, H04W72/543
Abstract: Methods and apparatuses for improving the performance and energy efficiency of Radio Access Networks (RANs) are described. Various power control schemes may dynamically adjust RAN power consumption based on fluctuations in network traffic, throughput, latency, queue sizes, and/or packet error rates with the goal of increasing energy efficiency while maintaining quality of service metrics. The power control schemes may be implemented using a PRB controller for dynamically allocating physical resource blocks (PRBs) to user devices and a CPU controller for assigning CPU power profiles based on PRB allocations for the user devices. The PRB controller and CPU controller may periodically acquire real-time telemetry data and wireless network performance information and then adjust the number of PRBs for user devices and adjust the CPU power profiles for executing RAN functions based on the telemetry data and wireless network performance information.
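As a rough illustration of the two-controller loop this abstract describes, the Python sketch below sizes PRB allocations from queue telemetry and then picks a CPU power profile from the resulting PRB load. The `Telemetry` fields, the load thresholds, and the profile names are hypothetical and are not taken from the patent.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class Telemetry:
    queue_bytes: int    # bytes currently queued for this user device
    bytes_per_prb: int  # bytes one PRB carries at the current MCS (assumed known)

def allocate_prbs(telemetry: dict[str, Telemetry], total_prbs: int) -> dict[str, int]:
    """Give each device enough PRBs to drain its queue, scaled down if over budget."""
    # Ceiling division: PRBs needed to carry the queued bytes.
    demand = {ue: -(-t.queue_bytes // t.bytes_per_prb) for ue, t in telemetry.items()}
    total = sum(demand.values())
    if total <= total_prbs:
        return demand
    # Proportional scale-down with floor, so the sum never exceeds total_prbs.
    return {ue: d * total_prbs // total for ue, d in demand.items()}

def choose_cpu_profile(allocation: dict[str, int], total_prbs: int) -> str:
    """Map the fraction of PRBs in use to a CPU power profile (thresholds assumed)."""
    load = sum(allocation.values()) / total_prbs
    if load < 0.25:
        return "low_power"
    if load < 0.75:
        return "balanced"
    return "performance"

telemetry = {"ue1": Telemetry(1000, 100), "ue2": Telemetry(250, 100)}
allocation = allocate_prbs(telemetry, total_prbs=100)
print(allocation, choose_cpu_profile(allocation, total_prbs=100))
```

In a periodic loop, both functions would be re-run each time fresh telemetry arrives, so the CPU profile follows the PRB allocation rather than raw traffic.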
-
Publication No.: US20230388856A1
Publication date: 2023-11-30
Application No.: US17825596
Filing date: 2022-05-26
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yu YAN, Anuj KALIA, Sanjeev MEHROTRA, Paramvir BAHL
CPC classification number: H04W28/0942, H04W28/24, H04W28/0289, H04L1/0003
Abstract: A method for utilizing computing resources in a vRAN is described. A predicted resource load is determined for data traffic processing of wireless communication channels served by the vRAN using a trained neural network model. The data traffic processing comprises at least one of PHY data processing or MAC processing for a 5G RAN. Computing resources are allocated for the data traffic processing based on the predicted resource load. Wireless parameter limits are determined for the wireless communication channels that constrain utilization of the allocated computing resources using the trained neural network model, including setting one or more of a maximum number of radio resource units per timeslot or a maximum MCS index for the wireless parameter limits. The data traffic processing is performed using the wireless parameter limits to reduce load spikes that cause a violation of real-time deadlines for the data traffic processing.
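The relationship between predicted load, allocated computing resources, and wireless parameter limits can be sketched as follows. This is only an assumption-laden illustration: `predict_load` is a stand-in for the trained neural network model, and the default ceilings of 273 resource blocks and MCS index 28 are illustrative values, not figures from the patent.

```python
import math

def predict_load(recent_loads):
    # Stand-in for the trained neural network model (assumption): use the recent peak.
    return max(recent_loads)

def plan_slot(recent_loads, core_capacity, max_cores, full_rbs=273, full_mcs=28):
    """Allocate cores for the predicted load, then cap radio parameters to fit them."""
    predicted = predict_load(recent_loads)
    # Allocate enough cores for the prediction, bounded by what is available.
    cores = max(1, min(max_cores, math.ceil(predicted / core_capacity)))
    capacity = cores * core_capacity
    # If the allocated cores cannot absorb the predicted load, shrink the limits.
    headroom = 1.0 if predicted <= capacity else capacity / predicted
    return {
        "cores": cores,
        "max_rbs_per_slot": int(full_rbs * headroom),
        "max_mcs_index": int(full_mcs * headroom),
    }

print(plan_slot([2.0, 4.0, 3.0], core_capacity=1.0, max_cores=2))
```

Capping the resource units per timeslot and the MCS index bounds the work a slot can generate, which is how the limits keep processing within the allocated cores.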
-
Publication No.: US20240070223A1
Publication date: 2024-02-29
Application No.: US18064223
Filing date: 2022-12-09
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yu YAN, Timothy Lawrence HARRIS
Abstract: Example solutions for multi-stage 8-bit floating point (FP8) matrix multiplication with format conversion, that benefit computation efficiency of matrix multiplication operations by a processor, include: copying data values in FP8 format from global memory to shared memory; loading thread block tiles of FP8 data values from the shared memory into a set of registers; converting each of the multiple FP8 data values in the set of registers to 16-bit floating point (FP16) data values; submitting the FP16 data values to the tensor core; and performing, with the tensor core, matrix multiply accumulate computations.
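The format-conversion step can be illustrated on the CPU with a small Python sketch. It assumes the E4M3 variant of FP8 (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7), which the abstract does not specify, and it does not model global memory, shared memory, registers, or tensor cores. The function names are hypothetical.

```python
import struct

def fp8_e4m3_to_float(byte: int) -> float:
    """Decode one FP8 byte, assuming the E4M3 layout (bias 7, no infinities)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return float("nan")  # the only NaN encoding in this layout
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6  # subnormal
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

def to_fp16(x: float) -> float:
    """Round a value through IEEE half precision using struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def matmul_fp8(a, b):
    """Convert FP8 bytes to FP16 values, then multiply-accumulate (M x K times K x N)."""
    a16 = [[to_fp16(fp8_e4m3_to_float(v)) for v in row] for row in a]
    b16 = [[to_fp16(fp8_e4m3_to_float(v)) for v in row] for row in b]
    k, n = len(b16), len(b16[0])
    # Accumulation here uses Python floats, not a hardware accumulator.
    return [[sum(row[p] * b16[p][j] for p in range(k)) for j in range(n)] for row in a16]

# 0x38 = 1.0, 0x40 = 2.0, 0x3C = 1.5 under the assumed encoding
print(matmul_fp8([[0x38, 0x40]], [[0x40], [0x3C]]))
```

Every finite value in the assumed E4M3 layout is exactly representable in FP16, so the conversion itself introduces no rounding error.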
-
Publication No.: US20220417293A1
Publication date: 2022-12-29
Application No.: US17362708
Filing date: 2021-06-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Landon Prentice COX, Yu YAN
Abstract: A method for communication session management by a session controller is described. Usage data associated with a video communication session is received for data stream handlers of a first network topology, which handle data streams of the video communication session at a first time. The first network topology includes a plurality of client devices and at least a first media server. A second network topology is determined based on the usage data to handle the data streams when a network parameter and/or an application parameter reaches a corresponding update threshold. Data stream handlers of the second network topology include at least a second media server. The data stream handlers of the second network topology are configured to handle the data streams at a second time, including instructing the first media server to offload at least some of the data streams to the second media server.
-
Publication No.: US20220100676A1
Publication date: 2022-03-31
Application No.: US17178385
Filing date: 2021-02-18
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yu YAN, Jiusheng CHEN, Ruofei ZHANG
IPC: G06F12/122, G06N3/04, G06F40/40
Abstract: Systems and methods for dynamically modifying a cache associated with a neural network model of a natural language generator are described. In examples, a neural network model employs a beam search algorithm at a decoder when decoding output and generating predicted output candidates. The decoder utilizes caching techniques to improve the speed at which the neural network operates. When an amount of memory utilized by one or more caches of the neural network model is determined to exceed a threshold memory size, a layer-specific portion of a cache associated with a layer of the neural network model is identified. The identified layer-specific portion of the cache can be deleted when the amount of memory utilized by the cache of the neural network model exceeds the threshold memory size. In examples, data in the cache is deduplicated and/or deleted.
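A minimal sketch of threshold-triggered, layer-specific cache deletion follows. The cache layout (layer index mapped to named byte buffers) and the largest-layer-first selection rule are assumptions made for illustration; the abstract does not state how the layer-specific portion is chosen.

```python
from __future__ import annotations

def evict_until_under(cache: dict[int, dict[str, bytes]], threshold: int) -> list[int]:
    """Delete layer-specific cache portions until total size is within threshold.

    Returns the layer indices that were evicted, in eviction order.
    """
    def layer_size(layer: int) -> int:
        return sum(len(buf) for buf in cache[layer].values())

    evicted = []
    while cache and sum(layer_size(layer) for layer in cache) > threshold:
        # Assumed policy: drop the layer whose cached data is largest.
        victim = max(cache, key=layer_size)
        del cache[victim]
        evicted.append(victim)
    return evicted

decoder_cache = {
    0: {"keys": b"x" * 10},
    1: {"keys": b"x" * 30},
    2: {"keys": b"x" * 20},
}
print(evict_until_under(decoder_cache, threshold=35), sorted(decoder_cache))
```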
-
Publication No.: US20240405945A1
Publication date: 2024-12-05
Application No.: US18204332
Filing date: 2023-05-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Anuj KALIA, Yu YAN, Xenofon FOUKAS, Bozidar RADUNOVIC, Nikita LAZAREV
Abstract: Methods and apparatuses for improving the performance and energy efficiency of Radio Access Networks (RANs) are described. Various power control schemes may dynamically adjust RAN power consumption based on fluctuations in network traffic, throughput, latency, queue sizes, and/or packet error rates with the goal of increasing energy efficiency while maintaining quality of service metrics. The power control schemes may be implemented using a PRB controller for dynamically allocating physical resource blocks (PRBs) to user devices and a CPU controller for assigning CPU power profiles based on PRB allocations for the user devices. The PRB controller and CPU controller may periodically acquire real-time telemetry data and wireless network performance information and then adjust the number of PRBs for user devices and adjust the CPU power profiles for executing RAN functions based on the telemetry data and wireless network performance information.
-
Publication No.: US20230412335A1
Publication date: 2023-12-21
Application No.: US17825766
Filing date: 2022-05-26
Applicant: Microsoft Technology Licensing, LLC
Inventor: Manikanta KOTARU, Yu YAN, Paramvir BAHL, Neil AGARWAL
CPC classification number: H04L5/0048, H04B7/0617, H04W24/02, H04B7/0626, H04W72/1263
Abstract: Aspects of the present disclosure relate to determining reference symbol transmission times. In some examples, a method for determining reference symbol transmission times for cellular communications includes receiving signal feedback based on a wireless communication channel between a wireless communication device and a base station, identifying a periodic exchange of reference symbols that are used to adjust beamforming between the wireless communication device and the base station, generating a vector based on the signal feedback, and providing the vector as an input to a trained machine learning model. A training of the trained machine learning model includes calculating a plurality of rewards for a respective plurality of transmission time delays. The plurality of rewards are each calculated based on a function of downlink throughput and uplink overhead. The function of downlink throughput and uplink overhead is based upon a priority level of the wireless communication device.
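One way to picture a priority-dependent reward over candidate transmission time delays is the sketch below. The weighting formula, the units, and the function names are hypothetical; the abstract only states that the reward is a function of downlink throughput and uplink overhead that depends on the device's priority level.

```python
def reward(throughput: float, overhead: float, priority: float) -> float:
    """Trade downlink throughput against uplink overhead (assumed linear form)."""
    # Higher priority puts more weight on throughput and less on overhead.
    weight = priority / (priority + 1.0)
    return weight * throughput - (1.0 - weight) * overhead

def best_delay(candidates: dict, priority: float):
    """Pick the delay whose (throughput, overhead) pair has the highest reward."""
    return max(candidates, key=lambda delay: reward(*candidates[delay], priority))

# delay in ms -> (downlink throughput, uplink overhead), illustrative numbers only
candidates = {10: (100.0, 40.0), 40: (80.0, 10.0)}
print(best_delay(candidates, priority=1.0), best_delay(candidates, priority=9.0))
```

With these illustrative numbers, a low-priority device is steered toward the longer delay (less overhead), while a high-priority device gets the shorter delay (more throughput).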
-
Publication No.: US20240046037A1
Publication date: 2024-02-08
Application No.: US18268699
Filing date: 2020-12-25
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jian JIAO, Yeyun GONG, Nan DUAN, Weizhu CHEN, Kewen TANG, Qiang LOU, Ruofei ZHANG, Yu YAN, Jiusheng CHEN
IPC: G06F40/284, G06F40/40
CPC classification number: G06F40/284, G06F40/40
Abstract: Systems and methods are provided for training a data model based on training data. The training includes pre-training and fine-tuning the data model based on a combination of an autoregressive (AR) model and a non-autoregressive (NAR) model. Training data may be received and encoded into streams of tokens. A pre-trainer during decoding generates a continuum of data structures of the AR and NAR combined model including a main stream and a series of predicting streams. Masked tokens in predicting streams reference or attend to one or more preceding tokens in the main stream or the preceding predicting streams. A fine-tuner selects streams to generate a trained model according to a target data model. The target data model is determined based on balancing an accuracy constraint and an efficiency constraint for predicting tokens. The decoder acts as a bridge between the AR and NAR models in generating a trained data model.
-
Publication No.: US20230007056A1
Publication date: 2023-01-05
Application No.: US17363434
Filing date: 2021-06-30
Applicant: Microsoft Technology Licensing, LLC
Inventor: Landon Prentice COX, Yu YAN, Shadi ABDOLLAHIAN NOGHABI
IPC: H04L29/06
Abstract: A method for data stream prioritization by a session controller is described. Usage data associated with a video communication session is received for one or more client devices of the video communication session. The usage data is based on content within data streams of the video communication session. A first client device of the one or more client devices is identified as having a higher priority level during the video communication session based on the usage data. Instructions are sent to the first client device during the video communication session causing the first client device to improve a quality of a first data stream generated by the first client device for the video communication session.
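The prioritization step might look like the following sketch, which assumes the usage data is seconds of detected speech per client and that the instruction is a simple dictionary. Both the metric and the message format are hypothetical and are not described in the abstract.

```python
def pick_priority_client(usage: dict) -> str:
    """Return the client with the most activity (assumed metric: seconds of speech)."""
    return max(usage, key=usage.get)

def build_instruction(usage: dict, target_bitrate_kbps: int = 2500) -> dict:
    """Build a hypothetical instruction asking the top client to raise stream quality."""
    return {
        "client": pick_priority_client(usage),
        "action": "increase_quality",
        "target_bitrate_kbps": target_bitrate_kbps,
    }

usage = {"alice": 12.0, "bob": 45.5, "carol": 3.0}
print(build_instruction(usage))
```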
-
Publication No.: US20220417306A1
Publication date: 2022-12-29
Application No.: US17362474
Filing date: 2021-06-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ganesh ANANTHANARAYANAN, Yu YAN, Yuanchao SHU
Abstract: Systems and methods are provided for reducing stream data according to a data streaming protocol in a multi-access edge computing environment. In particular, an IoT device, such as a video image sensing device, may capture stream data and generate inference data by applying a machine-learning model trained to infer data based on the captured stream data. The inference data represents the captured stream data in a reduced data size based on performing data analytics on the captured data. The IoT device formats the inference data according to the data streaming protocol. In contrast to video data compression, the data streaming protocol includes instructions for transmitting the reduced volume of inference data through a data analytics pipeline.