Patent search: ap:("MICROSOFT TECHNOLOGY LICENSING, LLC") AND inv:"Lijuan Qin" (page 1)

1.

Granted invention patent
Customized output to optimize for user preference in a distributed system (active)

Publication No.: US11023690B2

Publication date: 2021-06-01

Application No.: US16398836

Filing date: 2019-04-30

Applicant: Microsoft Technology Licensing, LLC

Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang

IPC classes: G06F40/58, G10L13/08, G10L15/26, H04L29/06

Abstract: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript is generated from the received audio streams. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.
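
The delivery step in this abstract (look up each user's preferred language, translate the transcript, send it to that user's device) can be illustrated with a minimal Python sketch. Everything named here (`Device`, `deliver_transcripts`, `toy_translate`, the language codes) is a hypothetical stand-in chosen for illustration; the abstract does not specify data structures or a translation API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Device:
    """Hypothetical record tying a distributed device to its identified user."""
    device_id: str
    user: str


def deliver_transcripts(
    transcript: List[str],
    devices: List[Device],
    preferred_language: Dict[str, str],
    translate: Callable[[str, str], str],
    source_language: str = "en",
) -> Dict[str, List[str]]:
    """Return, per device, the transcript in its user's preferred language."""
    # Translate once per language, not once per device.
    by_language: Dict[str, List[str]] = {source_language: list(transcript)}
    delivered: Dict[str, List[str]] = {}
    for device in devices:
        # Assumption: users without a stored preference get the source language.
        lang = preferred_language.get(device.user, source_language)
        if lang not in by_language:
            by_language[lang] = [translate(line, lang) for line in transcript]
        delivered[device.device_id] = by_language[lang]
    return delivered


def toy_translate(text: str, lang: str) -> str:
    """Placeholder translator: only tags the text with the target language."""
    return f"[{lang}] {text}"


devices = [Device("d1", "alice"), Device("d2", "bob")]
result = deliver_transcripts(["hello"], devices, {"bob": "ja"}, toy_translate)
# result["d1"] is the untranslated transcript; result["d2"] is tagged "[ja]".
```

Caching by language is a design choice of this sketch: several attendees who share a preferred language then trigger only one translation pass.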

2.

Granted invention patent
Audio stream processing for distributed device meeting (active)

Publication No.: US10812921B1

Publication date: 2020-10-20

Application No.: US16399122

Filing date: 2019-04-30

Applicant: Microsoft Technology Licensing, LLC

Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang

IPC classes: H04R3/00, H04S3/00, H04N21/43, H04N21/422, H04R5/04

Abstract: A computer-implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker-attributed meeting transcript.

3.

Granted invention patent
Intelligent user interfaces for multiple communication lines (active)

Publication No.: US09900423B2

Publication date: 2018-02-20

Application No.: US15249231

Filing date: 2016-08-26

Applicant: Microsoft Technology Licensing, LLC

Inventors: Tony He, Susan Chory, Gregory Howard, Peter Bergler, Lijuan Qin, Jon Arnett, Janis Jungeun Lee, Petteri Mikkola, Issa Y. Khoury

IPC classes: H04W8/22, H04M1/725, H04W4/12, H04W4/00, H04W8/18, H04W4/16, H04L12/24, H04L29/12, H04L29/06, H04W88/06

CPC classes: H04M1/72563, H04L41/22, H04L61/1594, H04L63/0853, H04M1/72519, H04W4/02, H04W4/12, H04W4/16, H04W4/50, H04W4/60, H04W8/18, H04W8/183, H04W8/22, H04W88/06

Abstract: Various user interfaces and other technologies for interacting with devices that support multiple communication lines can be implemented. Scenarios providing separate communication lines, such as voice over internet protocol (VoIP), social network communications, and the like can be supported. For example, communication-line-separated and communication-line-aggregated user interface paradigms can be supported. Intelligent selection of an appropriate paradigm can support user preferences, conversation user interfaces, and the like. Other features such as communication line defaults can help users deal with multiple communication line scenarios. A consistent, compact user interface for switching communication lines can be supported. Users can interact with their devices more efficiently and with less frustration. A wide variety of use scenarios are supported.

4.

Granted invention patent
Distributed device meeting initiation (active)

Publication No.: US11468895B2

Publication date: 2022-10-11

Application No.: US16399152

Filing date: 2019-04-30

Applicant: Microsoft Technology Licensing, LLC

Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang

IPC classes: G10L15/26, H04L65/403, H04R1/40

Abstract: A computer-implemented method includes receiving audio streams at a meeting server from two distributed devices that are streaming audio captured during an ad-hoc meeting between at least two users, comparing the received audio streams to determine that the received audio streams are representative of sound from the ad-hoc meeting, generating a meeting instance to process the audio streams in response to the comparing determining that the audio streams are representative of sound from the ad-hoc meeting, and processing the received audio streams to generate a transcript of the ad-hoc meeting.
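
The abstract does not say how the two streams are compared. As one plausible illustration only, the sketch below treats two streams as coming from the same room when their peak normalized cross-correlation over a small lag window exceeds a threshold. The function names, the lag window, and the 0.8 threshold are assumptions of this sketch, not details from the patent.

```python
import math
import random
from typing import Sequence


def normalized_correlation(a: Sequence[float], b: Sequence[float], lag: int) -> float:
    """Pearson correlation of a[n] with b[n + lag] over the overlapping samples."""
    pairs = [(a[n], b[n + lag]) for n in range(len(a)) if 0 <= n + lag < len(b)]
    if len(pairs) < 2:
        return 0.0
    mean_a = sum(x for x, _ in pairs) / len(pairs)
    mean_b = sum(y for _, y in pairs) / len(pairs)
    num = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x, _ in pairs)
        * sum((y - mean_b) ** 2 for _, y in pairs)
    )
    return num / den if den > 0 else 0.0


def same_meeting(a: Sequence[float], b: Sequence[float],
                 max_lag: int = 8, threshold: float = 0.8) -> bool:
    """Decide whether two streams look like recordings of the same sound."""
    best = max(normalized_correlation(a, b, lag)
               for lag in range(-max_lag, max_lag + 1))
    return best >= threshold


# Toy data: `near` is the source as heard by one device; `far` is the same
# sound arriving 3 samples later at half the level; `other` is unrelated audio.
rng = random.Random(0)
near = [rng.uniform(-1.0, 1.0) for _ in range(200)]
far = [0.0] * 3 + [0.5 * x for x in near[:-3]]
other_rng = random.Random(1)
other = [other_rng.uniform(-1.0, 1.0) for _ in range(200)]

start_meeting = same_meeting(near, far)         # streams match at lag 3
ignore_stranger = not same_meeting(near, other)
```

In a fuller system, a positive comparison would be the trigger for creating the meeting instance that then processes both streams.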

5.

Invention application
Audio Stream Processing for Distributed Device Meeting (pending, published)

Publication No.: US20200351603A1

Publication date: 2020-11-05

Application No.: US16399122

Filing date: 2019-04-30

Applicant: Microsoft Technology Licensing, LLC

Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang

IPC classes: H04S3/00, H04R3/00, H04R5/04, H04N21/43, H04N21/422

Abstract: A computer-implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker-attributed meeting transcript.

6.

Granted invention patent
Audio-visual diarization to identify meeting attendees (active)

Publication No.: US11875796B2

Publication date: 2024-01-16

Application No.: US16399081

Filing date: 2019-04-30

Applicant: Microsoft Technology Licensing, LLC

Inventors: Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, William Isaac Hinthorn, Xuedong Huang

IPC classes: G10L15/26, G10L15/22, H04L65/403, H04N7/15, H04R1/40

CPC classes: G10L15/26, G10L15/22, H04L65/403, H04N7/15, H04R1/406

Abstract: A computer-implemented method includes receiving information streams on a meeting server from a set of multiple distributed devices included in a meeting, receiving audio signals representative of speech by at least two users in at least two of the information streams, receiving at least one video signal of at least one user in the information streams, associating a specific user with speech in the received audio signals as a function of the received audio and video signals, and generating a transcript of the meeting with an indication of the specific user associated with the speech.

7.

Granted invention patent
Speaker attributed transcript generation (active)

Publication No.: US11322148B2

Publication date: 2022-05-03

Application No.: US16399166

Filing date: 2019-04-30

Applicant: Microsoft Technology Licensing, LLC

Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang

IPC classes: G10L15/26, G10L15/08, G10L19/018

Abstract: A computer-implemented method processes audio streams recorded during a meeting by a plurality of distributed devices. Operations include performing speech recognition on each audio stream by a corresponding speech recognition system to generate utterance-level posterior probabilities as hypotheses for each audio stream, aligning the hypotheses and formatting them as word confusion networks with associated word-level posterior probabilities, performing speaker recognition on each audio stream by a speaker identification algorithm that generates a stream of speaker-attributed word hypotheses, formatting speaker hypotheses with associated speaker label posterior probabilities and speaker-attributed hypotheses for each audio stream as a speaker confusion network, aligning the word and speaker confusion networks from all audio streams to each other to merge the posterior probabilities and align word and speaker labels, and creating a best speaker-attributed word transcript by selecting the sequence of word and speaker labels with the highest posterior probabilities.
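
The final two operations (merging posteriors across streams, then picking the highest-posterior word and speaker labels) can be sketched as follows. This is a simplified illustration that assumes the confusion networks have already been aligned slot by slot and that merging is a plain average; the abstract specifies neither. The names `merge_slots` and `best_path` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One slot of a confusion network: candidate label -> posterior probability.
Slot = Dict[str, float]


def merge_slots(networks: List[List[Slot]]) -> List[Slot]:
    """Average per-slot posteriors over pre-aligned networks (one per stream)."""
    n_slots = len(networks[0])
    if any(len(net) != n_slots for net in networks):
        raise ValueError("networks must already be aligned to the same length")
    merged: List[Slot] = []
    for i in range(n_slots):
        totals: Dict[str, float] = defaultdict(float)
        for net in networks:
            for label, posterior in net[i].items():
                totals[label] += posterior / len(networks)
        merged.append(dict(totals))
    return merged


def best_path(word_nets: List[List[Slot]],
              speaker_nets: List[List[Slot]]) -> List[Tuple[str, str]]:
    """Pick the highest-posterior (speaker, word) pair for every slot."""
    words = merge_slots(word_nets)
    speakers = merge_slots(speaker_nets)
    return [(max(s, key=s.get), max(w, key=w.get))
            for w, s in zip(words, speakers)]


# Three devices heard the same two-word utterance with differing confidence.
word_nets = [
    [{"hello": 0.6, "yellow": 0.4}, {"world": 0.9, "word": 0.1}],
    [{"hello": 0.3, "yellow": 0.7}, {"world": 0.8, "word": 0.2}],
    [{"hello": 0.9, "yellow": 0.1}, {"world": 0.7, "word": 0.3}],
]
speaker_nets = [
    [{"A": 0.7, "B": 0.3}, {"A": 0.4, "B": 0.6}],
    [{"A": 0.6, "B": 0.4}, {"A": 0.55, "B": 0.45}],
    [{"A": 0.8, "B": 0.2}, {"A": 0.2, "B": 0.8}],
]
transcript = best_path(word_nets, speaker_nets)
# transcript == [("A", "hello"), ("B", "world")]
```

Note that the second device alone would have chosen "yellow" for the first slot; pooling the posteriors across devices is what lets the other two streams overrule it.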

8.

Granted invention patent
Synchronization of audio signals from distributed devices (active)

Publication No.: US10743107B1

Publication date: 2020-08-11

Application No.: US16399369

Filing date: 2019-04-30

Applicant: Microsoft Technology Licensing, LLC

Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang

IPC classes: H04R5/04, G10L15/16, H04L29/06, H04R3/04, H04R3/12, G10L15/08

Abstract: A computer-implemented method includes receiving audio signals representative of speech via multiple audio channels transmitted from corresponding multiple distributed devices, designating one of the audio channels as a reference channel, and, for each of the remaining audio channels, determining a difference in time from the reference channel and correcting that audio channel by compensating for the corresponding difference in time from the reference channel.
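
The reference-channel scheme in this abstract can be sketched in a few lines of Python. The abstract does not state how the time difference is determined; the sketch below swaps in plain cross-correlation over a bounded lag window as a common way to estimate it. The function names and the `max_lag` parameter are assumptions made for illustration.

```python
from typing import List, Sequence


def estimate_delay(reference: Sequence[float], channel: Sequence[float],
                   max_lag: int) -> int:
    """Lag in samples by which `channel` trails `reference` (negative = leads)."""
    def score(lag: int) -> float:
        # Raw cross-correlation of reference[n] with channel[n + lag].
        return sum(reference[n] * channel[n + lag]
                   for n in range(len(reference))
                   if 0 <= n + lag < len(channel))
    return max(range(-max_lag, max_lag + 1), key=score)


def compensate(channel: Sequence[float], delay: int) -> List[float]:
    """Shift `channel` by `delay` samples, zero-padding to keep its length."""
    n = len(channel)
    if delay >= 0:
        return list(channel[delay:]) + [0.0] * min(delay, n)
    return [0.0] * min(-delay, n) + list(channel[:max(n + delay, 0)])


def synchronize(channels: List[Sequence[float]], reference_index: int = 0,
                max_lag: int = 16) -> List[List[float]]:
    """Align every channel to the designated reference channel."""
    reference = channels[reference_index]
    aligned = []
    for i, channel in enumerate(channels):
        if i == reference_index:
            aligned.append(list(reference))
        else:
            delay = estimate_delay(reference, channel, max_lag)
            aligned.append(compensate(channel, delay))
    return aligned


# A short burst surrounded by silence, plus a late and an early copy of it.
ref = [0.0] * 10 + [1.0, -2.0, 3.0, -1.0] + [0.0] * 10
late = [0.0] * 5 + ref[:-5]    # arrives 5 samples after the reference
early = ref[3:] + [0.0] * 3    # arrives 3 samples before the reference
aligned = synchronize([ref, late, early])
# After compensation all three channels equal `ref`.
```

Real device clocks also drift over time, so a practical system would likely re-estimate the delay periodically instead of once; that refinement is outside this sketch.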

9.

Invention application
ARTIFICIAL INTELLIGENCE SYSTEM UTILIZING MICROPHONE ARRAY AND FISHEYE CAMERA (pending, published)

Publication No.: US20190236416A1

Publication date: 2019-08-01

Application No.: US15885518

Filing date: 2018-01-31

Applicant: Microsoft Technology Licensing, LLC

Inventors: Zhenghao Wang, Xuedong Huang, Lijuan Qin, Kun Wu, Huaming Wang

IPC classes: G06K9/62, H04N5/232, H04N5/262, G06K9/00, G10L17/22, G06F3/16, G06F3/01, H04R1/22, G06K7/14, G06K7/10, G06N3/08

CPC classes: G06K9/6289, G06F3/017, G06F3/16, G06F3/167, G06K7/10722, G06K7/1417, G06K9/00288, G06N3/08, G10L17/22, H04N5/23216, H04N5/23238, H04N5/2628, H04N13/204, H04R1/222, H04R1/2892, H04R2201/401

Abstract: In some embodiments, the disclosed subject matter involves a system and method relating to using an ambient capture device including a fisheye camera and a microphone array to capture audio and video in an environment, for use in an artificial intelligence (AI) application. The device with fisheye camera may provide approximately a 360° audio and video view, at relatively low cost. An embodiment may utilize a speech and vision fusion model component. The speech and vision fusion model may be trained using deep learning to combine features from many different sources, including available sensor data from the capture device. A long short-term memory (LSTM) model may infer or identify features such as, but not limited to: audio direction; vision detection and tracking; voice signature; facial signature; gesture recognition; and object identification. The fusion processing may be performed by a cloud server, enabling the capture device to remain less complex.

10.

Granted invention patent
Processing overlapping speech from distributed devices (active)

Publication No.: US11138980B2

Publication date: 2021-10-05

Application No.: US16399175

Filing date: 2019-04-30

Applicant: Microsoft Technology Licensing, LLC

Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang

IPC classes: G10L21/0272, G10L25/30, G10L15/30, G10L15/16, G10L21/0208

Abstract: A computer-implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
