Patent search: cpc:"G10L15/28" (page 1)

1.

Published patent application
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND COMPUTER PROGRAM PRODUCT [Under examination, published]

Publication number: US20240347048A1

Publication date: 2024-10-17

Application number: US18442441

Filing date: 2024-02-15

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventor: Takehiko KAGOSHIMA

IPC classification: G10L15/16, G10L15/08, G10L15/28

CPC classification: G10L15/16, G10L15/28, G10L2015/088

Abstract: According to an embodiment, an information processing apparatus includes one or more hardware processors configured to function as a memory control unit, a transformation unit, a first convolutional neural network (CNN) unit, and a second CNN unit. The memory control unit reads, from a memory device, a first stride parameter used for controlling an output resolution and a first dilation parameter used for controlling an input resolution. The transformation unit transforms the first stride parameter to a second stride parameter and transforms the first dilation parameter to a second dilation parameter by using a transformation parameter. The first CNN unit executes first CNN processing of a feature vector by using at least the second stride parameter. The second CNN unit executes second CNN processing with an output vector of the first CNN unit as an input by using at least the second dilation parameter.
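The parameter flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the applicant's method: the abstract does not say how the transformation parameter is applied, so dividing the stride and multiplying the dilation by an integer `factor` is an assumption made here, and `conv1d`, `transform_parameters`, and `run_pipeline` are hypothetical names.

```python
def conv1d(x, kernel, stride=1, dilation=1):
    """Valid 1-D cross-correlation over a list of numbers (no padding)."""
    span = (len(kernel) - 1) * dilation + 1  # input positions covered per output
    return [
        sum(k * x[start + i * dilation] for i, k in enumerate(kernel))
        for start in range(0, len(x) - span + 1, stride)
    ]


def transform_parameters(stride1, dilation1, factor):
    """Map the stored (first) parameters to the (second) ones actually used.

    ASSUMPTION: this scaling rule is illustrative only; the abstract does not
    specify it. Here the stride shrinks and the dilation grows by `factor`.
    """
    if factor < 1 or stride1 % factor != 0:
        raise ValueError("factor must be a positive divisor of the stride")
    return stride1 // factor, dilation1 * factor


def run_pipeline(features, kernel1, kernel2, stride1, dilation1, factor):
    """Run two convolution stages using the transformed parameters."""
    stride2, dilation2 = transform_parameters(stride1, dilation1, factor)
    hidden = conv1d(features, kernel1, stride=stride2)   # first CNN stage
    return conv1d(hidden, kernel2, dilation=dilation2)   # second CNN stage
```

Under this assumed rule, `run_pipeline(list(range(8)), [1, 1], [1, 1], 2, 1, 1)` yields `[6, 14, 22]`, while `factor=2` yields `[6, 10, 14, 18, 22]`: the same values at a finer output resolution, which is one plausible reading of why the stride and dilation would be transformed together.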

2.

Granted patent
Multimodal speech recognition method and system, and computer-readable storage medium [In force]

Publication number: US12112744B2

Publication date: 2024-10-08

Application number: US17684958

Filing date: 2022-03-02

Applicant: Zhejiang University

Inventors: Feng Lin, Tiantian Liu, Ming Gao, Chao Wang, Zhongjie Ba, Jinsong Han, Wenyao Xu, Kui Ren

IPC classification: G10L15/20, G01S13/88, G10L15/06, G10L15/18, G10L15/22, G10L15/28, G10L25/18, G10L25/78

CPC classification: G10L15/20, G01S13/88, G10L15/063, G10L15/1815, G10L15/22, G10L15/28, G10L25/18, G10L25/78

Abstract: The disclosure provides a multimodal speech recognition method and system, and a computer-readable storage medium. The method includes calculating a first logarithmic mel-frequency spectral coefficient and a second logarithmic mel-frequency spectral coefficient when a target millimeter-wave signal and a target audio signal both contain speech information corresponding to a target user; inputting the first and second logarithmic mel-frequency spectral coefficients into a fusion network to determine a target fusion feature, where the fusion network includes at least a calibration module and a mapping module, the calibration module is configured to perform mutual feature calibration on the target audio and millimeter-wave signals, and the mapping module is configured to fuse a calibrated millimeter-wave feature and a calibrated audio feature; and inputting the target fusion feature into a semantic feature network to determine a speech recognition result corresponding to the target user. The disclosure can implement high-accuracy speech recognition.

3.

Granted patent
Information processing device and information processing method [In force]

Publication number: US12062360B2

Publication date: 2024-08-13

Application number: US16972420

Filing date: 2019-03-12

Applicant: SONY CORPORATION

Inventors: Hiro Iwase, Yuhei Taki, Kunihito Sawai

IPC classification: G10L15/065, G10L15/08, G10L15/18, G10L15/22, G10L15/28

CPC classification: G10L15/065, G10L15/1815, G10L15/22, G10L15/28, G10L2015/088

Abstract: The present invention addresses the problem of effectively reducing the input load related to a voice trigger. There is provided an information processing device comprising a registration control unit that dynamically controls registration of startup phrases used as start triggers of a voice interaction session, in which the registration control unit temporarily and additionally registers at least one of the startup phrases based on input voice. There is also provided an information processing method comprising dynamically controlling, by a processor, registration of startup phrases used as start triggers of a voice interaction session, in which the controlling further includes temporarily and additionally registering at least one of the startup phrases based on input voice.
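The idea of temporarily registering extra startup phrases can be sketched as a small registry with expiry times. This is only an illustration under stated assumptions: the class `StartupPhraseRegistry`, the helper `on_input_voice`, the 60-second lifetime, and the rule that a timer request enables the phrase "stop the timer" are all hypothetical and do not come from the abstract.

```python
class StartupPhraseRegistry:
    """Tracks permanent startup phrases plus temporarily registered ones."""

    def __init__(self, permanent_phrases):
        self._permanent = {p.lower() for p in permanent_phrases}
        self._temporary = {}  # phrase -> expiry timestamp in seconds

    def register_temporarily(self, phrase, now, ttl_seconds):
        """Add a phrase that stays active until now + ttl_seconds."""
        self._temporary[phrase.lower()] = now + ttl_seconds

    def active_phrases(self, now):
        # Drop temporary phrases whose lifetime has elapsed.
        self._temporary = {p: t for p, t in self._temporary.items() if t > now}
        return self._permanent | set(self._temporary)

    def is_start_trigger(self, utterance, now):
        """True if the utterance matches a currently active startup phrase."""
        return utterance.strip().lower() in self.active_phrases(now)


def on_input_voice(registry, recognized_text, now):
    """ASSUMPTION: hypothetical rule tying registration to the input voice."""
    if "timer" in recognized_text.lower():
        registry.register_temporarily("stop the timer", now, ttl_seconds=60)
```

Passing `now` explicitly rather than reading the clock keeps the sketch deterministic; a real device would presumably use a monotonic clock.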

4.

Granted patent
Information processing apparatus and information processing method [In force]

Publication number: US12014736B2

Publication date: 2024-06-18

Application number: US17413158

Filing date: 2019-10-30

Applicant: SONY GROUP CORPORATION

Inventors: Tatsuma Sakurai, Ichitaro Kohara

IPC classification: G10L15/22, A63H5/00, A63H11/00, G10L15/28

CPC classification: G10L15/22, A63H5/00, A63H11/00, G10L15/28, A63H2200/00

Abstract: Provided is an information processing apparatus that includes a control unit controlling an action of an autonomous operation unit, in which the control unit controls transitions among a plurality of states relating to speech recognition processing through the autonomous operation unit based on a detected trigger, and the states include a first active state in which an action of the autonomous operation unit is restricted and a second active state in which the speech recognition processing is performed. Also provided is an information processing method in which a processor controls an action of an autonomous operation unit, the controlling includes controlling transitions among a plurality of states relating to speech recognition processing through the autonomous operation unit based on a detected trigger, and the states include a first active state in which an action of the autonomous operation unit is restricted and a second active state in which the speech recognition processing is performed.
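The trigger-driven state transitions in this abstract resemble a small finite-state machine, sketched below. Only the first and second active states come from the abstract; the `NORMAL` state, the trigger names, and the transition table are assumptions chosen for illustration.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()         # hypothetical idle state, not named in the abstract
    FIRST_ACTIVE = auto()   # actions of the autonomous unit are restricted
    SECOND_ACTIVE = auto()  # speech recognition processing is performed


# ASSUMPTION: trigger names and the transition table are illustrative only.
TRANSITIONS = {
    (State.NORMAL, "user_approach"): State.FIRST_ACTIVE,
    (State.FIRST_ACTIVE, "speech_start"): State.SECOND_ACTIVE,
    (State.FIRST_ACTIVE, "timeout"): State.NORMAL,
    (State.SECOND_ACTIVE, "speech_end"): State.NORMAL,
}


class Controller:
    """Holds the current state and advances it on detected triggers."""

    def __init__(self):
        self.state = State.NORMAL

    def on_trigger(self, trigger):
        # Unknown (state, trigger) pairs leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, trigger), self.state)
        return self.state

    def action_restricted(self):
        return self.state is State.FIRST_ACTIVE

    def recognition_running(self):
        return self.state is State.SECOND_ACTIVE
```

A table keyed by (state, trigger) keeps the transition rules in one place, so adding a state or trigger does not require touching the controller logic.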

5.

Granted patent
Clothes-processing device [In force]

Publication number: US12006619B2

Publication date: 2024-06-11

Application number: US17267073

Filing date: 2019-08-21

Applicant: LG ELECTRONICS INC.

Inventors: Inseong Hwang, Jeongbeom Kim

IPC classification: D06F34/34, G10L15/28, D06F105/60

CPC classification: D06F34/34, G10L15/28, D06F2105/60

Abstract: The present invention relates to a clothes-processing device comprising: a cabinet comprising a body, a body front surface fixed to the body and forming the front surface, and an introduction opening formed through the body front surface; a drum comprising a drum body disposed in the cabinet so as to store clothes and a drum introduction opening formed through the drum body to communicate with the introduction opening; a driving part for rotating the drum; a door rotatably disposed at the cabinet so as to open or close the introduction opening; a control part for controlling the driving part; and a voice recognition part disposed at the door so as to recognize a voice generated by a user and transmit a control command corresponding to the recognized voice to the control part.

6.

Published patent application
SPEAKER ASSEMBLY IN A DISPLAY ASSISTANT DEVICE [Under examination, published]

Publication number: US20240184340A1

Publication date: 2024-06-06

Application number: US18543629

Filing date: 2023-12-18

Applicant: Google LLC

Inventors: James Nelson Castro, Carl Alexander Cepress, Liang Ching Tseng, Darren Torrie, Frances Maria Hui Hong Kwee, Rex Pinegar Price

IPC classification: G06F1/16, G02F1/1333, G02F1/1337, G06F3/16, G06F21/83, G10L15/28, H04L12/28, H04R1/02, H04R1/34

CPC classification: G06F1/166, G02F1/133308, G02F1/133753, G06F1/1605, G06F1/1626, G06F1/1637, G06F1/1658, G06F1/1683, G06F1/1686, G06F1/1688, G06F1/1698, G06F3/167, G06F21/83, G10L15/28, H04R1/023, H04R1/025, H04R1/028, H04R1/345, G02F1/133325, G02F1/133761, H04L12/282, H04R2499/15

Abstract: In a display assistant device, a speaker is mounted in a waveguide structure which is at least partially disposed beneath a display screen. The waveguide structure is mounted in an exterior housing which includes speaker grills distributed on a plurality of surfaces of the exterior housing, permitting sound waves from the speaker to be projected outside the exterior housing. A cover structure is disposed on top of the waveguide structure to conceal the waveguide structure and speaker within the exterior housing. The cover structure has a tilted bottom surface configured to be suspended above the waveguide structure and to be separated by a first space. Sound waves projected from an upper portion of the speaker are reflected by the tilted bottom surface and are guided through the first space to exit the device from a speaker grill portion located on a rear side of the exterior housing.

7.

Granted patent
Multipurpose speaker enclosure in a display assistant device [In force]

Publication number: US11994917B2

Publication date: 2024-05-28

Application number: US17889683

Filing date: 2022-08-17

Applicant: Google LLC

Inventors: Xiaoping Qin, Christen Cameron Bilger, Frederic Heckmann, Frances Kwee, Justin Leong, James Castro

IPC classification: G06F1/16, G02F1/1333, G02F1/1337, G06F3/16, G06F21/83, G10L15/28, H04L12/28, H04R1/02, H04R1/34

CPC classification: G06F1/166, G02F1/133308, G02F1/133753, G06F1/1605, G06F1/1626, G06F1/1637, G06F1/1658, G06F1/1683, G06F1/1686, G06F1/1688, G06F1/1698, G06F3/167, G06F21/83, G10L15/28, H04R1/023, H04R1/025, H04R1/028, H04R1/345, G02F1/133325, G02F1/133761, H04L12/282, H04R2499/15

Abstract: This application is directed to a speaker assembly in which a speaker is mounted in an enclosure structure. The enclosure structure exposes a speaker opening of the speaker and provides a sealed enclosure for a rear portion of the speaker, and further includes an electrically conductive portion. One or more electronic components are coupled to the electrically conductive portion of the enclosure structure (which is grounded in some implementations). The electrically conductive portion of the enclosure structure is configured to provide electromagnetic shielding for the electronic components and forms part of the sealed enclosure of the speaker. In some implementations, the electrically conductive portion of the enclosure structure is thermally coupled to the electronic components and acts as a heat sink that is configured to absorb heat generated by the electronic components and dissipate the generated heat away from the electronic components.

8.

Granted patent
Information processing device to stop the turn off of power based on voice input for voice operation [In force]

Publication number: US11984121B2

Publication date: 2024-05-14

Application number: US17425444

Filing date: 2020-01-17

Applicant: SONY GROUP CORPORATION

Inventors: Akira Fukui, Hiroaki Ogawa, Yoshinori Maeda, Chie Kamada, Emiru Tsunoo, Akira Takahashi, Noriko Totsuka, Kazuya Tateishi, Yuichiro Koyama, Yuki Takeda, Hideaki Watanabe, Kan Kuroda

IPC classification: G10L15/22, G06F3/16, G10L15/28

CPC classification: G10L15/22, G06F3/16, G10L15/28, G10L2015/221, G10L2015/223, G10L2015/225, G10L2015/228

Abstract: In response to an occurrence of a predetermined state transition, an information processing device presents first information indicating that voice input for voice operation is possible and second information representing a domain of utterance in which voice operation is possible, and performs voice recognition on voice input by a user.

9.

Granted patent
Natural language processing routing [In force]

Publication number: US11978453B2

Publication date: 2024-05-07

Application number: US17347323

Filing date: 2021-06-14

Applicant: Amazon Technologies, Inc.

Inventors: Narendra Gyanchandani, Junqing Shang, Joe Pemberton, Rushi P Desai, Liyuan Zhang, Shubham Katiyar, Lawrence Mariadas Chettiar, Artun Kutchuk, Naushad Zaveri

IPC classification: G10L15/28, G06N5/04, G06N20/00, G10L15/16, G10L15/18, G10L15/22

CPC classification: G10L15/28, G06N5/04, G06N20/00, G10L15/16, G10L15/1815, G10L15/22

Abstract: Devices and techniques are generally described for a speech processing routing architecture. First input data representing an input request may be received. First data including a semantic interpretation of the input request may be determined. Metadata of the first input data may be determined. The metadata may identify an entity associated with the input request. In some examples, a query may be sent to a first component. The query may include the metadata. In some examples, second data that identifies a first skill associated with the entity may be received from the first component. In various examples, the first skill may be selected for processing the first input data based at least in part on the first data and the second data.
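The routing flow in this abstract (interpret the request, look up a skill for the entity named in the metadata, then select a skill from both pieces of data) might be sketched as below. Everything concrete here is hypothetical: the skill and entity names, the `ENTITY_SKILLS` and `SKILL_INTENTS` tables, and the selection rule that prefers the entity's skill only when it supports the interpreted intent.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    """'First data': a semantic interpretation of the input request."""
    intent: str


# ASSUMPTION: stand-in for the "first component" that maps entities to skills.
ENTITY_SKILLS = {"acme_pizza": "AcmePizzaSkill"}

# ASSUMPTION: hypothetical table of the intents each skill can handle.
SKILL_INTENTS = {
    "AcmePizzaSkill": {"OrderFood"},
    "GenericFoodSkill": {"OrderFood", "FindRestaurant"},
}


def query_entity_skill(metadata):
    """Return the skill associated with the entity in the metadata, if any."""
    return ENTITY_SKILLS.get(metadata.get("entity"))


def select_skill(interpretation, metadata, default="GenericFoodSkill"):
    """Pick a skill using both the interpretation and the entity lookup."""
    candidate = query_entity_skill(metadata)  # "second data"
    if candidate and interpretation.intent in SKILL_INTENTS.get(candidate, set()):
        return candidate
    return default
```

For example, `select_skill(Interpretation("OrderFood"), {"entity": "acme_pizza"})` returns `"AcmePizzaSkill"`, while a request with no entity in its metadata falls back to the default skill.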

10.

Published patent application
ELECTRONIC APPARATUS FOR PROVIDING VOICE RECOGNITION CONTROL AND OPERATING METHOD THEREFOR [Under examination, published]

Publication number: US20240106932A1

Publication date: 2024-03-28

Application number: US18528013

Filing date: 2023-12-04

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Kwang-Youn KIM, Won-Nam JANG

IPC classification: H04M3/493, G06F3/04817, G06F3/0482, G06F3/04842, G06F3/0487, G06F3/16, G06F16/33, G06F16/957, G10L15/22

CPC classification: H04M3/4938, G06F3/04817, G06F3/0482, G06F3/04842, G06F3/0487, G06F3/167, G06F16/3334, G06F16/957, G10L15/22, G10L15/28

Abstract: An example electronic apparatus for providing voice recognition control includes a display and a processor, wherein the processor may be configured to: obtain content including at least one object; distinguish the at least one object within the content; display an instruction text in correspondence with a non-text object among the at least one object; and select the non-text object corresponding to the instruction text if a voice command corresponding to the instruction text is input.
