Patent search ap:("Microsoft Technology Licensing Page LLC") AND inv:"Yang LIU"

1.

Invention application
SMART AUDIO SEGMENTATION USING LOOK-AHEAD BASED ACOUSTO-LINGUISTIC FEATURES [In force]

Publication (announcement) number: US20250054491A1

Publication (announcement) date: 2025-02-13

Application number: US18721121

Application date: 2021-12-22

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Sayan Dev PATHAK , Hosam Adel KHALIL , Naveen PARIHAR , Piyush BEHRE , Shuangyu CHANG , Christopher Hakan BASOGLU , Sharman W TAN , Eva SHARMA , Jian WU , Yang LIU , Edward C LIN , Amit Kumar AGARWAL

IPC: G10L15/04 , G10L15/01

Abstract: Systems and methods are provided for smart audio segmentation using look-ahead based acousto-linguistic features. For example, systems and methods are provided for obtaining audio, processing the audio, identifying a potential segmentation boundary within the audio, and determining whether to generate a segment break at the potential segmentation boundary. One or more look-ahead words occurring after the potential segmentation boundary are identified, wherein an acoustic segmentation score and a language segmentation score associated with the potential segmentation boundary and the one or more look-ahead words are generated. Systems then either refrain from generating a segment break at the potential segmentation boundary or generate the segment break at the potential segmentation boundary based on the acoustic and/or language segmentation score at least meeting or exceeding a segmentation score threshold.
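The decision step described in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: the abstract does not say how the two scores are combined, so the weighted average, the `BoundaryScores` type, the `should_break` function, and the default threshold of 0.5 are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BoundaryScores:
    """Scores for one potential segmentation boundary (hypothetical container)."""
    acoustic: float  # assumed to lie in [0, 1], e.g. derived from the pause at the boundary
    language: float  # assumed to lie in [0, 1], computed with the look-ahead words in view


def should_break(scores: BoundaryScores,
                 threshold: float = 0.5,
                 acoustic_weight: float = 0.5) -> bool:
    """Return True to generate a segment break, False to refrain.

    Assumption: the two scores are merged by a weighted average and the
    result is compared against a single threshold.
    """
    combined = (acoustic_weight * scores.acoustic
                + (1.0 - acoustic_weight) * scores.language)
    return combined >= threshold


# A long pause followed by words that start a new sentence: break.
strong = BoundaryScores(acoustic=0.9, language=0.8)
# A long pause, but the look-ahead words continue the sentence: refrain.
weak = BoundaryScores(acoustic=0.9, language=0.05)
print(should_break(strong), should_break(weak))
```

The second case shows why look-ahead matters in this sketch: a high acoustic score alone does not force a break when the language score, informed by the following words, is low.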

2.

Invention application
ENHANCED USER EXPERIENCE THROUGH BI-DIRECTIONAL AUDIO AND VISUAL SIGNAL GENERATION [In force]

Publication (announcement) number: US20220343543A1

Publication (announcement) date: 2022-10-27

Application number: US17240510

Application date: 2021-04-26

Applicant: Microsoft Technology Licensing, LLC

Inventor: Sunando SENGUPTA , Alexandros NEOFYTOU , Eric Chris Wolfgang SOMMERLADE , Yang LIU

IPC: G06T9/00 , G06T3/60 , G10L19/012 , G06K9/62 , G10L25/51

Abstract: In various embodiments, a computer-implemented method of training a neural network for creating an output signal of a different modality from an input signal is described. In embodiments, the first modality may be a sound signal or a visual image, in which case the output signal would be a visual image or a sound signal, respectively. In embodiments, a model is trained using a first pair of visual and audio networks to train a set of codebooks using known visual signals and audio signals, and using a second pair of visual and audio networks to further train the set of codebooks using the augmented visual signals and the augmented audio signals. Further, the first and the second visual networks are equally weighted, and the first and the second audio networks are equally weighted. In aspects of the present disclosure, the set of codebooks comprises a visual codebook, an audio codebook, and a correlation codebook. These codebooks are then used to create a visual image from a sound signal and/or a sound signal from a visual image.

3.

Invention application
SYSTEMS AND METHODS FOR PROVIDING A COMMENT-CENTERED NEWS READER [Under examination, published]

Publication (announcement) number: US20180159804A1

Publication (announcement) date: 2018-06-07

Application number: US15578203

Application date: 2015-05-29

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Furu WEI , Ming ZHOU , Yang LIU , Ziqiang CAO , Shaohan HUANG , Li DONG , Lei CUI

IPC: H04L12/58 , G06F17/27

CPC classification number: H04L51/04 , G06F17/18 , G06F17/2229 , G06F17/2235 , G06F17/241 , G06F17/277 , G06F17/2775 , G06F17/278 , G06F17/2785 , G06N5/022

Abstract: Methods and systems for linking comments to portions of content items. An example computing device receives information associated with a content item produced by a source system, the content item being accessible to other computing devices via a network, and receives a comment associated with the content item, the comment produced by one of the other computing devices. In response to receiving the information and the comment, the computing device predicts a subsection of the content item to link to the received comment based at least on details associated with the content item and the comment, then makes information associated with the predicted subsection of the content item available to other computing devices requesting access to the content item.
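As a rough illustration of predicting which subsection a comment should link to, the sketch below picks the paragraph with the highest word overlap with the comment. The abstract does not describe the prediction model, so Jaccard similarity over lowercase word sets is a stand-in chosen for brevity, and `tokens` and `predict_subsection` are hypothetical names.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase word set of a text (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def predict_subsection(paragraphs: List[str], comment: str) -> int:
    """Return the index of the paragraph most similar to the comment.

    Assumption: similarity is Jaccard overlap of word sets. Ties go to the
    earliest paragraph. Raises ValueError if `paragraphs` is empty.
    """
    comment_words = tokens(comment)

    def jaccard(paragraph: str) -> float:
        words = tokens(paragraph)
        union = words | comment_words
        return len(words & comment_words) / len(union) if union else 0.0

    return max(range(len(paragraphs)), key=lambda i: jaccard(paragraphs[i]))


article = [
    "The city council approved the new budget.",
    "The local team won the championship game.",
]
# The comment shares "the", "team" and "game" with the second paragraph.
print(predict_subsection(article, "Great game by the team!"))
```

A real system would presumably use richer features of the content item and the comment; the point here is only the shape of the step: score each subsection against the comment and link to the best one.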

4.

Invention application
COMMENT-CENTERED NEWS READER [Under examination, published]

Publication (announcement) number: US20180150450A1

Publication (announcement) date: 2018-05-31

Application number: US15578195

Application date: 2015-05-29

Applicant: Furu WEI , Ming ZHOU , Yang LIU , Ziqiang CAO , Shaohan HUANG , Li DONG , Lei CUI , MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Furu WEI , Ming ZHOU , Yang LIU , Ziqiang CAO , Shaohan HUANG , Li DONG , Lei CUI

IPC: G06F17/27 , G06F3/048 , G06F17/24

Abstract: Methods and systems for providing a comments-centered news reader. Configurations allow live comments to be presented along with the news or similar website content. While a user scrolls up and down in a browser presenting a news article on the user's computer device (e.g., mobile device), linked comments are shown in a selected region. The displayed comments automatically change to adapt to which parts (paragraphs, sentences) of the news article the user is currently reading. At the same time, users can publish their own comments without having to proceed to a separate section of the browser, thus saving the viewer actions and improving the user's experience. The user's system or a remote server records the comments along with the article or the place where users are in the article when the comment was entered.

5.

Invention application
SYSTEMS AND METHODS FOR PROVIDING A COMMENT-CENTERED NEWS READER [In force]

Publication (announcement) number: US20230076387A1

Publication (announcement) date: 2023-03-09

Application number: US18050287

Application date: 2022-10-27

Applicant: Microsoft Technology Licensing, LLC

Inventor: Furu WEI , Ming ZHOU , Yang LIU , Ziqiang CAO , Shaohan HUANG , Li DONG , Lei CUI

IPC: H04L51/04 , G06F40/30 , G06F40/131 , G06F40/134 , G06F40/169 , G06F40/284 , G06F40/289 , G06F40/295

Abstract: Methods and systems for linking comments to portions of content items. An example computing device receives information associated with a content item produced by a source system, the content item being accessible to other computing devices via a network, and receives a comment associated with the content item, the comment produced by one of the other computing devices. In response to receiving the information and the comment, the computing device predicts a subsection of the content item to link to the received comment based at least on details associated with the content item and the comment, then makes information associated with the predicted subsection of the content item available to other computing devices requesting access to the content item.

6.

Invention application
CLASSIFYING AUDIO SCENE USING SYNTHETIC IMAGE FEATURES [In force]

Publication (announcement) number: US20210216817A1

Publication (announcement) date: 2021-07-15

Application number: US16844930

Application date: 2020-04-09

Applicant: Microsoft Technology Licensing, LLC

Inventor: Eric Chris Wolfgang SOMMERLADE , Yang LIU , Alexandros NEOFYTOU , Sunando SENGUPTA

IPC: G06K9/62 , H04N7/14 , H04N5/272

Abstract: A computing system includes an encoder that receives an input image and encodes the input image into real image features, a decoder that decodes the real image features into a reconstructed image, a generator that receives first audio data corresponding to the input image and generates first synthetic image features from the first audio data, and receives second audio data and generates second synthetic image features from the second audio data, a discriminator that receives both the real and synthetic image features and determines whether a target feature is real or synthetic, and a classifier that classifies a scene of the second audio data based on the second synthetic image features.

7.

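Invention application
OCTREE-BASED CONVOLUTIONAL NEURAL NETWORK [Under examination, published]

Publication (announcement) number: US20200042863A1

Publication (announcement) date: 2020-02-06

Application number: US16606653

Application date: 2018-04-20

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Pengshuai WANG , Yang LIU , Xin TONG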

IPC: G06N3/04 , G06F17/15 , G06F16/901 , G06F1/20

Abstract: The implementations of the subject matter described herein relate to an octree-based convolutional neural network. In some implementations, there is provided a computer-implemented method for processing a three-dimensional shape. The method comprises obtaining an octree for representing the three-dimensional shape. Nodes of the octree include empty nodes and non-empty nodes. The empty nodes exclude the three-dimensional shape and are leaf nodes of the octree, and the non-empty nodes include at least a part of the three-dimensional shape. The method further comprises for nodes in the octree with a depth associated with a convolutional layer of a convolutional neural network, performing a convolutional operation of the convolutional layer to obtain an output of the convolutional layer.
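The octree structure described in this abstract, with empty nodes as leaves and non-empty nodes subdivided further, can be sketched as follows. This is an illustrative data structure only, not the method of the application: it assumes the shape is given as a point set inside the unit cube [0, 1)^3, and `OctreeNode`, `build_octree`, and `nodes_at_depth` are hypothetical names. The convolution itself is not shown; `nodes_at_depth` only gathers the nodes a layer tied to that depth would operate on.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class OctreeNode:
    depth: int
    origin: Point          # minimum corner of this node's cube
    size: float            # edge length of this node's cube
    points: List[Point]    # shape samples that fall inside the cube
    children: List["OctreeNode"] = field(default_factory=list)

    @property
    def is_empty(self) -> bool:
        return not self.points


def build_octree(points: List[Point], max_depth: int, depth: int = 0,
                 origin: Point = (0.0, 0.0, 0.0), size: float = 1.0) -> OctreeNode:
    """Subdivide non-empty nodes until max_depth; empty nodes stay leaves."""
    node = OctreeNode(depth, origin, size, list(points))
    if node.is_empty or depth == max_depth:
        return node
    half = size / 2.0
    for i in range(8):
        # Bits 0, 1, 2 of i select the lower or upper half along x, y, z.
        corner = tuple(origin[k] + half * ((i >> k) & 1) for k in range(3))
        inside = [p for p in node.points
                  if all(corner[k] <= p[k] < corner[k] + half for k in range(3))]
        node.children.append(build_octree(inside, max_depth, depth + 1, corner, half))
    return node


def nodes_at_depth(root: OctreeNode, depth: int) -> List[OctreeNode]:
    """Collect every node at the given depth (empty and non-empty)."""
    if root.depth == depth:
        return [root]
    return [n for child in root.children for n in nodes_at_depth(child, depth)]


root = build_octree([(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)], max_depth=2)
level2 = nodes_at_depth(root, 2)
print(len(level2), sum(not n.is_empty for n in level2))
```

Because empty nodes are never subdivided, the number of nodes at each depth grows with the occupied part of the shape rather than with the full grid resolution.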

8.

Invention application
TRAINING AND USING A DEEP LEARNING MODEL FOR TRANSCRIPT TOPIC SEGMENTATION [In force]

Publication (announcement) number: US20250061277A1

Publication (announcement) date: 2025-02-20

Application number: US18720606

Application date: 2021-12-15

Applicant: Chenguang ZHU , Yang LIU , David HUNG , Nanshan ZENG , Microsoft Technology Licensing, LLC

Inventor: Chenguang ZHU , Yang LIU , David Peace HUNG , Nanshan ZENG

IPC: G06F40/284 , G06F40/30 , G10L15/26

Abstract: The disclosure herein describes using a deep learning model to identify topic segments of a communication transcript. A communication transcript including a set of utterances is obtained. The set of utterances is divided into a plurality of utterance windows, wherein each utterance window of the plurality of utterance windows includes a different subset of utterances of the set of utterances, and wherein each utterance of the set of utterances is included in at least one utterance window of the plurality of utterance windows. For each utterance window of the plurality of utterance windows, each utterance in the utterance window is classified as a topic boundary or a non-boundary using a deep learning model. Topic segments of the communication transcript are identified based on utterances of the set of utterances that are classified as topic boundaries. A communication transcript summary is generated using the communication transcript and the identified topic segments.
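The windowing and boundary aggregation described in this abstract can be sketched as below. Several details are assumptions not stated in the abstract: windows are fixed-size with a fixed stride, a boundary utterance is taken to start a new topic, and an utterance counts as a boundary if any window containing it says so. The `is_boundary` callable stands in for the deep learning model, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence


def utterance_windows(n: int, size: int, stride: int) -> List[range]:
    """Index ranges of overlapping windows that together cover 0..n-1.

    Assumes size >= 1 and stride >= 1.
    """
    if n <= 0:
        return []
    starts = list(range(0, max(n - size, 0) + 1, stride))
    if starts[-1] + size < n:
        starts.append(n - size)  # extra window so the tail is covered
    return [range(s, min(s + size, n)) for s in starts]


def topic_segments(utterances: Sequence[str],
                   is_boundary: Callable[[List[str]], List[bool]],
                   size: int = 4, stride: int = 2) -> List[List[str]]:
    """Split a transcript into topic segments using per-window predictions."""
    boundary = [False] * len(utterances)
    for window in utterance_windows(len(utterances), size, stride):
        flags = is_boundary([utterances[i] for i in window])
        for i, flag in zip(window, flags):
            boundary[i] = boundary[i] or flag  # assumed aggregation: logical OR
    segments: List[List[str]] = []
    current: List[str] = []
    for utterance, starts_topic in zip(utterances, boundary):
        if starts_topic and current:
            segments.append(current)
            current = []
        current.append(utterance)
    if current:
        segments.append(current)
    return segments


def keyword_classifier(window: List[str]) -> List[bool]:
    """Toy stand-in for the model: flags utterances that announce a topic."""
    return [u.lower().startswith("next topic") for u in window]


transcript = ["hi all", "status is green", "next topic: budget",
              "budget is tight", "next topic: hiring", "two open roles"]
for segment in topic_segments(transcript, keyword_classifier):
    print(segment)
```

The summary generation step mentioned at the end of the abstract is out of scope for this sketch.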

9.

Invention publication
NATURAL LANGUAGE TRAINING AND/OR AUGMENTATION WITH LARGE LANGUAGE MODELS [Under examination, published]

Publication (announcement) number: US20240346254A1

Publication (announcement) date: 2024-10-17

Application number: US18133938

Application date: 2023-04-12

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yang LIU , Yichong XU , Dan ITER , Chenguang ZHU , Nanshan ZENG , Shuohang WANG , Hiteshi SHARMA

IPC: G06F40/40 , G06F40/186 , G06F40/20 , G06F40/35 , G06N20/00

CPC classification number: G06F40/40 , G06F40/186 , G06F40/20 , G06F40/35 , G06N20/00

Abstract: The techniques described herein enhance the operations of natural language generation systems through training and/or augmentation by a large language model. In a first example, the large language model can execute training operations by processing a training dataset to produce a natural language output. The natural language generation system can analyze the training dataset and the natural language output to generate a natural language output mimicking the output of the large language model. The large language model can then evaluate the output of the natural language generation system to iteratively adjust and improve the quality of natural language outputs. In a second example, the large language model can augment a small language model in executing natural language tasks. This is accomplished by retrieving external information using the large language model to generate an augmentation input to provide context and a language framework to the small language model to enhance overall outputs.

10.

Invention publication
ENHANCED USER EXPERIENCE THROUGH BI-DIRECTIONAL AUDIO AND VISUAL SIGNAL GENERATION [Under examination, published]

Publication (announcement) number: US20240054683A1

Publication (announcement) date: 2024-02-15

Application number: US18383956

Application date: 2023-10-26

Applicant: Microsoft Technology Licensing, LLC

Inventor: Sunando SENGUPTA , Alexandros NEOFYTOU , Eric Chris Wolfgang SOMMERLADE , Yang LIU

IPC: G06T9/00 , G06T3/60 , G10L19/012 , G10L25/51 , G06F18/21

CPC classification number: G06T9/00 , G06T3/60 , G10L19/012 , G10L25/51 , G06F18/21 , G10L2019/0002

Abstract: In various embodiments, a computer-implemented method of training a neural network for creating an output signal of a different modality from an input signal is described. In embodiments, the first modality may be a sound signal or a visual image, in which case the output signal would be a visual image or a sound signal, respectively. In embodiments, a model is trained using a first pair of visual and audio networks to train a set of codebooks using known visual signals and audio signals, and using a second pair of visual and audio networks to further train the set of codebooks using the augmented visual signals and the augmented audio signals. Further, the first and the second visual networks are equally weighted, and the first and the second audio networks are equally weighted.
