Patent search: ap:("Microsoft Technology Licensing, LLC") AND inv:"Sriram Srinivasan" (page 1)

1.

Granted invention patent
Audio pipeline for simultaneous keyword spotting, transcription, and real time communications (Active)

Publication No.: US11049496B2

Publication date: 2021-06-29

Application No.: US16203963

Filing date: 2018-11-29

Applicant: Microsoft Technology Licensing, LLC

Inventors: Senthil Velayutham, Sriram Srinivasan

IPC classification: G10L15/08, G10L15/26, G10L21/0208

Abstract: Disclosed in some examples are methods, systems, and machine-readable mediums for preventing unintended activation of voice command processing of a voice activated device. A first audio signal may be an audio signal that is to be output to a speaker communicatively coupled to the computing device. A second audio signal may be input from a microphone or other audio capture device. Both audio signals are input to a keyword detector to check for the presence of activation keywords. If the activation keyword(s) are detected in the second audio signal but not the first audio signal, the voice command processing of the device is activated, as this is likely a command from the user and not feedback from the loudspeaker.
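The decision rule in the abstract above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the detector is passed in as a hypothetical callable, and `toy_detect` is a stand-in that treats any loud sample as a "keyword".

```python
from typing import Callable, Sequence

# Hypothetical detector type: returns True if an activation keyword is present.
# The abstract does not say how the keyword detector works.
KeywordDetector = Callable[[Sequence[float]], bool]


def should_activate(
    playback: Sequence[float],
    captured: Sequence[float],
    detect: KeywordDetector,
) -> bool:
    """Activate only if the keyword is in the microphone signal
    but not in the signal being sent to the loudspeaker."""
    in_playback = detect(playback)   # first audio signal (to the speaker)
    in_capture = detect(captured)    # second audio signal (from the microphone)
    return in_capture and not in_playback


def toy_detect(samples: Sequence[float]) -> bool:
    """Stand-in detector for illustration only: 'keyword' = any loud sample."""
    return any(abs(s) > 0.9 for s in samples)
```

With this rule, a keyword that appears in both signals is treated as loudspeaker feedback and ignored.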

2.

Invention application
PROTECTING DEEP LEARNED MODELS (Active)

Publication No.: US20210133577A1

Publication date: 2021-05-06

Application No.: US16828889

Filing date: 2020-03-24

Applicant: Microsoft Technology Licensing, LLC

Inventors: Sriram Srinivasan, David Yuheng Zhao, Ming-Chieh Lee, Mu Han

IPC classification: G06N3/08, G06N20/00, G06N5/04, G06F17/16

Abstract: Apparatus and methods are disclosed for using machine learning models with private and public domains. Operations can be applied to transform input to a machine learning model in a private domain that is kept secret or otherwise made unavailable to third parties. In one example of the disclosed technology, a method includes applying a private transform to produce transformed input, providing the transformed input to a machine learning model that was trained using a training set modified by the private transform, and generating inferences with the machine learning model using the transformed input. Examples of suitable transforms that can be employed include matrix multiplication, time- or spatial-domain to frequency-domain conversion, and partitioning a neural network model such that an input and at least one hidden layer form part of the private domain, while the remaining layers form part of the public domain.
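As a rough illustration of the matrix-multiplication case in the abstract above, the sketch below applies the same secret matrix to the training set and to inference inputs. Everything here is an assumption made for the example: the 3x3 matrix `PRIVATE_MATRIX`, the helper names, and the nearest-centroid "model", which merely stands in for whatever machine learning model would actually be used.

```python
# Hypothetical secret transform: a signed permutation matrix (assumption).
PRIVATE_MATRIX = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, -1.0],
    [1.0, 0.0, 0.0],
]


def apply_transform(matrix, x):
    """Multiply the private matrix by the input vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def train_nearest_centroid(samples, labels):
    """Toy stand-in for a model: returns a predict(x) function."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)),
        )

    return predict


# Train on a training set modified by the private transform.
raw_training = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
training_labels = ["a", "b"]
model = train_nearest_centroid(
    [apply_transform(PRIVATE_MATRIX, x) for x in raw_training], training_labels
)

# Inference: the input must pass through the same private transform first.
query = [0.9, 0.1, 0.0]
prediction = model(apply_transform(PRIVATE_MATRIX, query))
```

In this toy setup the model classifies the transformed query as intended, while feeding it the untransformed query yields the other label, which is the sense in which the model is of limited use without the private transform.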

3.

Invention application
SYNCHRONIZED JITTER BUFFERS TO HANDLE CODEC SWITCHES (Under examination, published)

Publication No.: US20200244584A1

Publication date: 2020-07-30

Application No.: US16260771

Filing date: 2019-01-29

Applicant: Microsoft Technology Licensing, LLC

Inventors: Sriram Srinivasan, Vinod Prakash, Soren Skak Jensen

IPC classification: H04L12/841, H04N21/44, H04L29/06, H04L12/26, H04N21/43

Abstract: Techniques are described for managing synchronized jitter buffers for streaming data (e.g., for real-time audio and/or video communications). A separate jitter buffer can be maintained for each codec. For example, as data is received in network packets, the data is added to the jitter buffer corresponding to the codec that is associated with the received data. When data needs to be read, the same amount of data is read from each of the jitter buffers. In other words, at each instance where data needs to be obtained (e.g., for decoding and playback), the same amount of data is obtained from each of the jitter buffers. In addition, the multiple jitter buffers use the same playout timestamp that is synchronized across the multiple jitter buffers.
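The bookkeeping described in the abstract above might look like the sketch below. It is a simplified illustration under assumptions not stated in the abstract: timestamps are counted in whole frames, missing data is reported as `None`, and the class and codec names are hypothetical.

```python
class SynchronizedJitterBuffers:
    """One buffer per codec, all sharing a single playout timestamp."""

    def __init__(self, codecs):
        # Each buffer maps a frame timestamp to its payload.
        self.buffers = {codec: {} for codec in codecs}
        self.playout_ts = 0  # shared across all buffers

    def push(self, codec, timestamp, frame):
        """Store received data in the buffer of the codec that produced it."""
        self.buffers[codec][timestamp] = frame

    def read(self, count):
        """Read the same number of frames from every buffer, then advance
        the shared playout timestamp once for all of them."""
        span = range(self.playout_ts, self.playout_ts + count)
        out = {
            codec: [buf.pop(ts, None) for ts in span]
            for codec, buf in self.buffers.items()
        }
        self.playout_ts += count
        return out


# Usage: the sender switches from codec_a to codec_b at timestamp 2.
jb = SynchronizedJitterBuffers(["codec_a", "codec_b"])
jb.push("codec_a", 0, "a0")
jb.push("codec_a", 1, "a1")
jb.push("codec_b", 2, "b2")
jb.push("codec_b", 3, "b3")
first = jb.read(2)
second = jb.read(2)
```

Because every read consumes the same span from each buffer, the buffers stay aligned across the codec switch even though only one of them holds data for any given timestamp.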

4.

Granted invention patent
Handling timestamp inaccuracies for streaming network protocols (Active)

Publication No.: US10701124B1

Publication date: 2020-06-30

Application No.: US16216513

Filing date: 2018-12-11

Applicant: Microsoft Technology Licensing, LLC

Inventors: Sriram Srinivasan, Soren Skak Jensen, Koen Bernard Vos

IPC classification: H04L29/06

Abstract: Techniques are described for determining corrected timestamps for streaming data that is encoded using frames with a variable frame size. The streaming data is encoded into frames and transmitted in network packets in which the network packets or frames are associated with timestamps incremented in fixed steps. When a network packet is received after a lost packet, a corrected timestamp range can be calculated for the received packet based at least in part on the received timestamp value and attributes of the received network packet along with buffering characteristics.

5.

Granted invention patent
Activity feed service (Active)

Publication No.: US10693748B2

Publication date: 2020-06-23

Application No.: US15590858

Filing date: 2017-05-09

Applicant: Microsoft Technology Licensing, LLC

Inventors: Chani A. Doggett, Brian R. Meyers, John E. Gallardo, Abolade Gbadegesin, Michael J. Novak, Yisheng Yao, Bartosz H. Paliswiat, Kiran Tatapudi, Colleen E. Hamilton, Shawn P. Henry, Kenneth M. Tubbs, Sriram Srinivasan, Mahmut Arslan

IPC classification: H04L12/26, G06F11/34, G06F8/61, H04L12/24, H04L29/08, H04L29/06

Abstract: Technology related to an activity feed service is disclosed. In one example of the disclosed technology, a method can include receiving updates to activity streams, where a respective activity stream indicates an engagement of a respective user with applications executing on a respective client device connected to a network. The different activity streams associated with a particular user can be merged to generate a merged activity stream associated with the particular user. The different received activity streams can correspond to different respective client devices. The merged activity stream associated with the particular user can be transmitted over the network.
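The merge step in the abstract above can be illustrated with a short sketch. The abstract does not say how streams are ordered or represented, so the `Activity` record, its fields, and the choice to merge by timestamp are all assumptions made for this example.

```python
import heapq
from typing import NamedTuple


class Activity(NamedTuple):
    """Hypothetical activity record; the field names are assumptions."""
    timestamp: float
    device: str
    app: str


def merge_streams(streams):
    """Merge per-device streams (each already sorted by timestamp)
    into one stream for the user, ordered by timestamp."""
    return list(heapq.merge(*streams, key=lambda a: a.timestamp))


# Usage: two devices belonging to the same user.
laptop = [Activity(1.0, "laptop", "editor"), Activity(5.0, "laptop", "browser")]
phone = [Activity(3.0, "phone", "mail")]
merged = merge_streams([laptop, phone])
```

`heapq.merge` keeps the merge lazy and linear in the total number of activities, which suits streams that are already ordered on each device.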

6.

Granted invention patent
Artificially generated speech for a communication session (Active)

Publication No.: US10147415B2

Publication date: 2018-12-04

Application No.: US15422865

Filing date: 2017-02-02

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ross G. Cutler, Sriram Srinivasan, Ramin Mehran, Karlton David Sequeira, Jayant Ajit Gupchup, Senthil K. Velayutham

IPC classification: G10L13/033, G10L13/08, H04L12/26, H04L29/06, H04M7/00, G10L13/047, H04S7/00

Abstract: Content is received at a receiving equipment from a transmitting user terminal over a network in a communication session between a transmitting user and a receiving user. The received content comprises audio data representing speech spoken by a voice of the transmitting user, and further comprises text data generated from speech spoken by the voice of the transmitting user during the communication session. At the receiving equipment, at least a portion of the received text data is converted to artificially-generated audible speech based on a model of the transmitting user's voice stored at the receiving equipment (and in embodiments in dependence on the receive audio quality). The received audio data and the artificially-generated speech are supplied to be played out to the receiving user through one or more speakers.

7.

Invention application
ACTIVITY FEED SERVICE (Under examination, published)

Publication No.: US20180302302A1

Publication date: 2018-10-18

Application No.: US15590858

Filing date: 2017-05-09

Applicant: Microsoft Technology Licensing, LLC

Inventors: Chani A. Doggett, Brian R. Meyers, John E. Gallardo, Abolade Gbadegesin, Michael J. Novak, Yisheng Yao, Bartosz H. Paliswiat, Kiran Tatapudi, Colleen E. Hamilton, Shawn P. Henry, Kenneth M. Tubbs, Sriram Srinivasan, Mahmut Arslan

IPC classification: H04L12/26, H04L29/08, H04L12/24, H04L29/06, G06F9/445

Abstract: Technology related to an activity feed service is disclosed. In one example of the disclosed technology, a method can include receiving updates to activity streams, where a respective activity stream indicates an engagement of a respective user with applications executing on a respective client device connected to a network. The different activity streams associated with a particular user can be merged to generate a merged activity stream associated with the particular user. The different received activity streams can correspond to different respective client devices. The merged activity stream associated with the particular user can be transmitted over the network.

8.

Invention application
Artificially generated speech for a communication session (Under examination, published)

Publication No.: US20180218727A1

Publication date: 2018-08-02

Application No.: US15422865

Filing date: 2017-02-02

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ross G. Cutler, Sriram Srinivasan, Ramin Mehran, Karlton David Sequeira, Jayant Ajit Gupchup, Senthil K. Velayutham

IPC classification: G10L13/08, H04L12/26, H04L29/06, H04M7/00, G10L13/047, H04S7/00

CPC classification: G10L13/033, G10L13/04, G10L13/047, G10L13/08, G10L19/0018, H04L43/08, H04L65/1069, H04M3/2236, H04M7/0084, H04M2201/40, H04S7/30, H04S2420/01

Abstract: Content is received at a receiving equipment from a transmitting user terminal over a network in a communication session between a transmitting user and a receiving user. The received content comprises audio data representing speech spoken by a voice of the transmitting user, and further comprises text data generated from speech spoken by the voice of the transmitting user during the communication session. At the receiving equipment, at least a portion of the received text data is converted to artificially-generated audible speech based on a model of the transmitting user's voice stored at the receiving equipment (and in embodiments in dependence on the receive audio quality). The received audio data and the artificially-generated speech are supplied to be played out to the receiving user through one or more speakers.

9.

Granted invention patent
Phase reconstruction in a speech decoder (Active)

Publication No.: US11817107B2

Publication date: 2023-11-14

Application No.: US17875237

Filing date: 2022-07-27

Applicant: Microsoft Technology Licensing, LLC

Inventors: Soren Skak Jensen, Sriram Srinivasan, Koen Bernard Vos

IPC classification: G10L19/00, G10L19/26, G10L25/12, G10L25/69, G10L25/72

CPC classification: G10L19/0018, G10L19/265, G10L25/12, G10L25/69, G10L25/72

Abstract: Innovations in phase quantization during speech encoding and phase reconstruction during speech decoding are described. For example, to encode a set of phase values, a speech encoder omits higher-frequency phase values and/or represents at least some of the phase values as a weighted sum of basis functions. Or, as another example, to decode a set of phase values, a speech decoder reconstructs at least some of the phase values using a weighted sum of basis functions and/or reconstructs lower-frequency phase values then uses at least some of the lower-frequency phase values to synthesize higher-frequency phase values. In many cases, the innovations improve the performance of a speech codec in low bitrate scenarios, even when encoded data is delivered over a network that suffers from insufficient bandwidth or transmission quality problems.
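To make the two decoder-side ideas in the abstract above concrete, here is a deliberately simple sketch. The abstract names neither the basis functions nor the synthesis rule, so both are placeholders chosen for illustration: a cosine basis for the weighted sum, and plain repetition of the low band to fill the higher frequencies.

```python
import math


def reconstruct_low_phases(weights, num_low):
    """Rebuild num_low phase values as a weighted sum of basis functions.
    The cosine basis used here is an assumption, not taken from the patent."""
    return [
        sum(
            w * math.cos(math.pi * i * (k + 0.5) / num_low)
            for i, w in enumerate(weights)
        )
        for k in range(num_low)
    ]


def synthesize_all_phases(low_phases, total):
    """Fill higher-frequency phases from the lower-frequency ones.
    Repeating the low band is a placeholder rule, not the patented method."""
    return [low_phases[k % len(low_phases)] for k in range(total)]


# Usage: one transmitted weight, four low-band phases, eight phases overall.
low = reconstruct_low_phases([0.5], 4)
full = synthesize_all_phases(low, 8)
```

The point of the sketch is the data flow: only a few weights need to be transmitted, and the decoder derives the remaining phase values from them.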

10.

Granted invention patent
Reinforcement learning for jitter buffer control (Active)

Publication No.: US11558275B2

Publication date: 2023-01-17

Application No.: US16877257

Filing date: 2020-05-18

Applicant: Microsoft Technology Licensing, LLC

Inventors: Xiulian Peng, Vinod Prakash, Xiangyu Kong, Sriram Srinivasan, Yan Lu

IPC classification: H04L47/283, H04L43/087, G06K9/62, G06N20/00, H04L41/14

Abstract: Disclosed in some examples are methods, systems, and machine-readable mediums which determine jitter buffer delay by inputting jitter buffer and currently observed network status information to a machine learned model that is trained using a reinforcement learning (RL) method. The model maps these inputs to an action to compress, stretch, or hold the jitter buffer delay, which is used by a recipient computing device to optimize the jitter buffer delay. The model may be trained using a simulator that uses network traces of past real streaming sessions (e.g., communication sessions) of users. By training the model through reinforcement learning, the model learns to make better decisions through reinforcement in the form of reward signals that reflect the performance of each decision.
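The action space in the abstract above (compress, stretch, or hold) can be sketched as follows. This shows only how a chosen action might be applied to the delay; `toy_policy` is a hand-written rule standing in for the trained model, and the step size and bounds are illustrative assumptions.

```python
from enum import Enum


class Action(Enum):
    COMPRESS = -1  # shorten the jitter buffer delay
    HOLD = 0       # leave the delay unchanged
    STRETCH = 1    # lengthen the jitter buffer delay


def apply_action(delay_ms, action, step_ms=10, min_ms=0, max_ms=500):
    """Apply one action to the current delay, clamped to [min_ms, max_ms].
    The step size and bounds are assumptions made for this sketch."""
    return max(min_ms, min(max_ms, delay_ms + action.value * step_ms))


def toy_policy(buffered_ms, observed_jitter_ms):
    """Stand-in for the RL-trained model: a fixed rule, not a learned one."""
    if buffered_ms < observed_jitter_ms:
        return Action.STRETCH
    if buffered_ms > 2 * observed_jitter_ms:
        return Action.COMPRESS
    return Action.HOLD


# Usage: 20 ms buffered while 30 ms of jitter is observed.
new_delay = apply_action(40, toy_policy(20, 30))
```

In the patented approach the mapping from inputs to actions is learned from reward signals; only the shape of the interface is mirrored here.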
