Abstract:
A method for recognizing an environmental sound in a client device in cooperation with a server is disclosed. The client device includes a client database having a plurality of sound models of environmental sounds and a plurality of labels, each of which identifies at least one sound model. The client device receives an input environmental sound and generates an input sound model based on the input environmental sound. At the client device, a similarity value is determined between the input sound model and each of the sound models to identify one or more sound models from the client database that are similar to the input sound model. A label is selected from labels associated with the identified sound models, and the selected label is associated with the input environmental sound based on a confidence level of the selected label.
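The client-side matching step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes sound models are plain feature vectors, uses cosine similarity as the similarity value, and defines confidence as the winning label's share of the summed similarity among the top matches. The names `recognize`, `client_db`, `top_n`, and `min_confidence` are hypothetical.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(input_model, client_db, top_n=3, min_confidence=0.5):
    """Pick a label for input_model from client_db.

    client_db is a list of (label, model_vector) pairs (assumed layout).
    Returns (label, confidence); label is None when confidence is too low.
    """
    # Score every stored model and keep the top_n most similar ones.
    scored = sorted(
        ((cosine_similarity(input_model, model), label)
         for label, model in client_db),
        reverse=True,
    )[:top_n]

    # Accumulate similarity per label among the identified models.
    totals = defaultdict(float)
    for sim, label in scored:
        totals[label] += max(sim, 0.0)

    total = sum(totals.values())
    if total == 0:
        return None, 0.0

    best = max(totals, key=totals.get)
    confidence = totals[best] / total  # assumed confidence measure
    return (best if confidence >= min_confidence else None), confidence
```

In this sketch a `None` label stands in for the low-confidence case, where a client might defer to the server instead of labeling the sound locally.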
Abstract:
A method for generating an anti-model of a sound class is disclosed. A plurality of candidate sound data is provided for generating the anti-model. A similarity value is determined between each candidate sound data and a reference sound model of the sound class. The anti-model of the sound class is generated based on at least one candidate sound data whose similarity value falls within a similarity threshold range.
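The selection step can be illustrated with a short sketch. It assumes candidate sound data and the reference model are feature vectors, uses an inverse-distance similarity, and builds the anti-model as the mean of the selected candidates. All three choices are assumptions made for illustration; the abstract does not specify them. The names `similarity` and `build_anti_model` are hypothetical.

```python
import math


def similarity(a, b):
    """Assumed similarity: 1 / (1 + Euclidean distance), in (0, 1]."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)


def build_anti_model(candidates, reference_model, low, high):
    """Average the candidates whose similarity to the reference is in [low, high].

    Returns None when no candidate falls inside the threshold range.
    """
    selected = [
        c for c in candidates
        if low <= similarity(c, reference_model) <= high
    ]
    if not selected:
        return None
    dim = len(selected[0])
    # Element-wise mean of the selected candidate vectors.
    return [sum(c[i] for c in selected) / len(selected) for i in range(dim)]
```

The range excludes candidates that are too close to the reference (likely members of the class itself) as well as those that are too distant to be confusable with it.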
Abstract:
A particular method includes transitioning out of a low-power state at a processor. The method also includes retrieving audio feature data from a buffer after transitioning out of the low-power state. The audio feature data indicates features of audio data received during the low-power state of the processor.
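As a rough software analogy for the buffering described above, the sketch below models a bounded buffer that an always-on front end fills while the processor sleeps, and that the processor drains on wake-up. The class name `FeatureBuffer`, its methods, and the drop-oldest policy are assumptions for illustration, not details from the abstract.

```python
from collections import deque


class FeatureBuffer:
    """Bounded buffer of audio feature frames (hypothetical sketch)."""

    def __init__(self, capacity):
        # When full, the oldest frame is discarded (assumed policy).
        self._frames = deque(maxlen=capacity)

    def push(self, features):
        """Called by the front end while the processor is in the low-power state."""
        self._frames.append(features)

    def drain(self):
        """Called by the processor after it leaves the low-power state."""
        frames = list(self._frames)
        self._frames.clear()
        return frames
```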
Abstract:
A method for providing information for a conference at one or more locations is disclosed. One or more mobile devices monitor one or more starting requirements of the conference and transmit input sound information to a server when the one or more starting requirements of the conference are detected. The one or more starting requirements may include a starting time of the conference, a location of the conference, and/or acoustic characteristics of a conference environment. The server generates conference information based on the input sound information from each mobile device and transmits the conference information to each mobile device. The conference information may include information on attendees, a current speaker among the attendees, an arrangement of the attendees, and/or a meeting log of attendee participation at the conference.
Abstract:
A processor is configured to transition in and out of a low-power state at a first rate and to operate in a first mode or a second mode. In a particular method, the processor while coupled to a coder/decoder (CODEC) retrieves audio feature data from a buffer after transitioning out of the low-power state. The CODEC is configured to operate at a second rate in the first mode and at a third rate in the second mode, the second rate and the third rate each greater than the first rate. The audio feature data indicates features of audio data received during the low-power state of the processor. A ratio of CODEC activity to processor activity in the second mode is less than the ratio in the first mode.
Abstract:
A method for grouping a plurality of client devices is disclosed. Each client device extracts a sound descriptor from its environmental sound and transmits the sound descriptor to a server. The server receives the sound descriptors from the plurality of client devices, determines a similarity of the sound descriptors, and groups the plurality of client devices into at least one similar context group based on that similarity.
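The server-side grouping can be sketched as below. This is one possible approach under stated assumptions: sound descriptors are feature vectors, similarity is cosine similarity, and devices are grouped greedily against the first member of each group. The function `group_devices` and the `threshold` parameter are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_devices(descriptors, threshold=0.9):
    """Group device ids whose descriptors are similar.

    descriptors maps device_id -> descriptor vector (assumed layout).
    Each group is represented by the descriptor of its first member.
    """
    groups = []  # list of (representative_descriptor, [device_ids])
    for device_id, desc in descriptors.items():
        for rep, members in groups:
            if cosine_similarity(desc, rep) >= threshold:
                members.append(device_id)
                break
        else:
            # No existing group is similar enough: start a new one.
            groups.append((desc, [device_id]))
    return [members for _, members in groups]
```

Greedy grouping depends on arrival order; a production system would more likely use a proper clustering method, but the threshold test is the same idea.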
Abstract:
Provided are an apparatus and method for recognizing continuous speech using search space restriction based on phoneme recognition. In the apparatus and method, a search space can be primarily reduced by restricting connection words to be shifted at a boundary between words based on the phoneme recognition result. In addition, the search space can be secondarily reduced by rapidly calculating a degree of similarity between the connection word to be shifted and the phoneme recognition result using a phoneme code and shifting the corresponding phonemes to only connection words having degrees of similarity equal to or higher than a predetermined reference value. Therefore, the speed and performance of the speech recognition process can be improved in various speech recognition services.
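The second reduction step, keeping only connection words whose similarity to the phoneme recognition result meets a reference value, can be sketched as follows. The abstract computes similarity with a phoneme code; this sketch substitutes Python's `difflib.SequenceMatcher` ratio as a stand-in measure. The function `restrict_connection_words`, the `lexicon` layout, and the windowing of the recognized phonemes are assumptions for illustration.

```python
from difflib import SequenceMatcher


def restrict_connection_words(recognized_phonemes, lexicon, min_similarity=0.6):
    """Return the words allowed as transitions at a word boundary.

    recognized_phonemes: phonemes from the phoneme recognizer, starting
        at the boundary (list of strings).
    lexicon: maps word -> list of phonemes (assumed layout).
    """
    allowed = []
    for word, phonemes in lexicon.items():
        # Compare against a window as long as the candidate word.
        window = recognized_phonemes[:len(phonemes)]
        score = SequenceMatcher(None, phonemes, window).ratio()
        if score >= min_similarity:
            allowed.append(word)
    return allowed
```

Only the words returned here would be expanded by the decoder, which is how the search space shrinks.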