Abstract:
An enhanced low-bit-rate parametric voice coder groups a number of frames from an underlying frame-based vocoder, such as MELP, into a superframe structure. Parameters are extracted from the group of underlying frames and quantized into the superframe, which allows the bit rate of the underlying coding to be reduced without increasing the distortion. The speech data coded in the superframe structure can then be synthesized directly to speech, or may be transcoded to a format in which an underlying frame-based vocoder performs the synthesis. The superframe structure includes additional error detection and correction data to reduce the distortion caused by bit errors introduced during transmission.
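The bit-rate saving from grouping frames can be illustrated with a little arithmetic. The sketch below is not taken from the abstract: the frame duration (22.5 ms), the bit budgets (54 bits per frame, 81 bits per three-frame superframe) and the helper names are assumptions chosen for illustration only.

```python
FRAME_MS = 22.5  # assumed duration of one underlying frame, in milliseconds


def bit_rate_bps(bits: int, frames: int, frame_ms: float = FRAME_MS) -> float:
    """Bit rate of a block of `bits` bits that covers `frames` frames."""
    return bits * 1000.0 / (frames * frame_ms)


def group_into_superframes(frames, frames_per_superframe=3):
    """Split per-frame parameter sets into complete superframes.

    Trailing frames that do not fill a whole superframe are left out;
    a real coder would keep them buffered until more frames arrive.
    """
    usable = len(frames) - len(frames) % frames_per_superframe
    return [
        frames[i:i + frames_per_superframe]
        for i in range(0, usable, frames_per_superframe)
    ]


# Assumed figures: 54 bits per single frame vs. 81 bits per 3-frame superframe.
per_frame_rate = bit_rate_bps(bits=54, frames=1)
superframe_rate = bit_rate_bps(bits=81, frames=3)
```

Under these assumed figures, jointly quantizing three frames spends 81 bits where frame-by-frame coding would spend 162, which is where the rate reduction comes from.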
Abstract:
The invention concerns a system, which allows speech transmission between a mobile station and terminal equipment connected to a data network. The terminal equipment, preferably an Internet telephone, sends and receives data packets in accordance with the protocol of the data network in question and is provided with telephone characteristics. It has a speech coder in accordance with the mobile station system which synthesises the original speech from the speech parameters sent by the mobile station and contained in the data packets arriving from the data network and which correspondingly produces speech parameters for location in outgoing data packets. The speech parameters are conveyed as such between the mobile station and the terminal equipment without applying any additional coding to them. When the mobile telephone network is a packet switched network, it may be connected directly to the data network or, when the mobile telephone network is a circuit switched network, a gateway is used, which performs any necessary conversions, so that the call can be connected from one network to the other.
Abstract:
The invention refers to an apparatus for processing an encoded audio signal (100). The audio signal (100) comprises a sequence of access units (100'), each access unit comprising a core signal (101) with a first spectral width and parameters describing a spectrum above the first spectral width. The apparatus comprises: a demultiplexer (1) for generating, from an access unit (100') of the encoded audio signal (100), said core signal (101) and a set of said parameters (102); an upsampler (2) for upsampling said core signal (101) of said access unit (100') and outputting a first upsampled spectrum (103) and a temporally consecutive second upsampled spectrum (103'), the first upsampled spectrum (103) and the second upsampled spectrum (103') both having the same content as the core signal (101) and having a second spectral width greater than the first spectral width of the core signal (101); a parameter converter (3) for converting parameters of said set of parameters (102) of said access unit (100') to obtain converted parameters (104, 104'); and a spectral gap filling processor (4) for processing said first upsampled spectrum (103) and said second upsampled spectrum (103') using said converted parameters (104). The invention also refers to a corresponding method.
Abstract:
A method for generating a high-band target signal includes receiving, at an encoder, an input signal having a low-band portion and a high-band portion. The method also includes comparing a first autocorrelation value of the input signal to a second autocorrelation value of the input signal. The method further includes scaling the input signal by a scaling factor to generate a scaled input signal. The scaling factor is determined based on a result of the comparison. The method also includes generating a low-band signal based on the input signal and generating the high-band target signal based on the scaled input signal.
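The comparison-then-scale step can be sketched as follows. The abstract does not say which autocorrelation values are compared or how the scaling factor is chosen, so the lags (0 and 1), the threshold and the two candidate factors below are all assumptions made for illustration.

```python
def autocorr(x, lag):
    """Autocorrelation of sequence x at the given non-negative lag."""
    return sum(x[i] * x[i - lag] for i in range(lag, len(x)))


def choose_scaling_factor(x, threshold=0.95, strong=0.25, weak=0.5):
    """Pick a scaling factor from a comparison of two autocorrelation values.

    Assumption: the values compared are R(0) and R(1). When R(1) is close
    to R(0), the signal energy is concentrated at low frequencies, and this
    sketch applies the smaller (hypothetical) factor.
    """
    r0 = autocorr(x, 0)
    r1 = autocorr(x, 1)
    if r0 > 0 and r1 > threshold * r0:
        return strong
    return weak


def scale(x, factor):
    """Return the input signal scaled by `factor`."""
    return [factor * s for s in x]


signal = [1.0] * 100                 # strongly low-frequency toy input
factor = choose_scaling_factor(signal)
scaled_input = scale(signal, factor)  # would feed high-band target generation
```

In this sketch the low-band signal would still be derived from the unscaled input, while only the high-band target path uses `scaled_input`, mirroring the split described in the abstract.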
Abstract:
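A method of processing an audio signal according to an embodiment of the present invention, for solving the above technical problem, includes: receiving a signal for one channel pair element (CPE) to which internal channel gains (ICGs) have been pre-applied; if the reproduction channel configuration is not stereo, obtaining inverse internal channel gains (inverse ICGs) for the CPE based on MPS212 parameters and on rendering parameters corresponding to the MPS212 output channels defined in a format converter; and generating output signals based on the received signal for the one CPE and the obtained inverse internal channel gains.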
Abstract:
An audio signal (X) is represented by a bitstream (B) segmented into frames. An audio processing system (500) comprises a buffer (510) and a decoding section (520). The buffer joins sets of audio data (D1, D2, ..., DN) carried by N respective frames (F1, F2, ..., FN) into one decodable set of audio data (D) corresponding to a first frame rate and to a first number of samples of the audio signal per frame. The frames have a second frame rate corresponding to a second number of samples of the audio signal per frame. The first number of samples is N times the second number of samples. The decoding section decodes the decodable set of audio data into a segment of the audio signal by at least employing signal synthesis, based on the decodable set of audio data, with a stride corresponding to the first number of samples of the audio signal.
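The buffering behaviour can be sketched roughly as below. The class and function names are hypothetical, and joining the per-frame data by plain byte concatenation is an assumption; the abstract does not specify how the sets are combined.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FrameBuffer:
    """Collects N short frames and releases them as one decodable set."""
    n: int  # number of frames per decodable set (N in the abstract)
    pending: List[bytes] = field(default_factory=list, init=False)

    def push(self, frame_data: bytes) -> Optional[bytes]:
        """Add one frame's data; return the joined set once N have arrived."""
        self.pending.append(frame_data)
        if len(self.pending) < self.n:
            return None
        joined = b"".join(self.pending)  # assumed joining rule: concatenation
        self.pending.clear()
        return joined


def decoder_stride(samples_per_frame: int, n: int) -> int:
    """Samples covered by one decodable set: N times the per-frame count."""
    return n * samples_per_frame


buf = FrameBuffer(n=4)
outputs = [buf.push(bytes([i])) for i in range(4)]
```

Here the first three calls to `push` return `None` and the fourth returns the joined data, at which point a decoder would run its synthesis once with a stride of `decoder_stride(samples_per_frame, 4)` samples.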
Abstract:
A system for transmitting low-latency, synchronised audio includes an audio source, a processor, a controller and a sink zone with a DAC. In particular, the processor can selectively resample the audio source so that each data packet transmitted to the sink zone has a maximised payload size while the packet frequency remains a whole number.
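The payload constraint can be illustrated with a small search. This is a sketch under assumptions: the function name, the byte sizes and the 1400-byte payload limit are hypothetical, and the resampling step itself is not modelled. It only finds the largest packet, for a given sample rate, that yields a whole number of packets per second.

```python
def largest_whole_rate_packet(sample_rate_hz, bytes_per_sample_frame,
                              max_payload_bytes):
    """Return (samples_per_packet, packets_per_second).

    Picks the largest samples-per-packet count that fits in the payload
    limit and divides the sample rate exactly, so that the packet
    frequency is a whole number.
    """
    max_samples = max_payload_bytes // bytes_per_sample_frame
    for samples in range(max_samples, 0, -1):
        if sample_rate_hz % samples == 0:
            return samples, sample_rate_hz // samples
    raise ValueError("payload limit is smaller than one sample frame")


# 16-bit stereo (4 bytes per sample frame), assumed 1400-byte payload limit.
cd_audio = largest_whole_rate_packet(44100, 4, 1400)   # (350, 126)
dat_audio = largest_whole_rate_packet(48000, 4, 1400)  # (320, 150)
```

At 44.1 kHz the full 350-sample payload is usable, whereas at 48 kHz only 320 samples fit the whole-number constraint. A gap like that is the kind of case where resampling the source to a different rate could allow a fuller payload.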
Abstract:
Systems, devices and methods are provided for configuring matching rules related to voice input commands. For example, a first mapping relation between one or more first original terms in a preset term database and one or more first identification terms is established; the first mapping relation is stored in a first mapping relation table; one or more first voice input commands are configured for the first identification terms or one or more first statements including the first identification terms; and a second mapping relation between the first identification terms or the first statements and the first voice input commands is stored into a second mapping relation table.
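The two mapping relation tables can be sketched with a pair of dictionaries. The class name, the word-by-word substitution and the example terms and command are assumptions for illustration; the abstract does not describe how lookup is performed.

```python
class CommandMatcher:
    """Two-table matcher: original term -> identification term -> command."""

    def __init__(self):
        self.term_table = {}     # first mapping relation table
        self.command_table = {}  # second mapping relation table

    def map_term(self, original, identification):
        """Store a first mapping relation for one original term."""
        self.term_table[original.lower()] = identification.lower()

    def configure_command(self, statement, command):
        """Store a second mapping relation for a term or statement."""
        self.command_table[statement.lower()] = command

    def normalise(self, text):
        """Replace each known original term with its identification term."""
        return " ".join(self.term_table.get(w, w) for w in text.lower().split())

    def match(self, text):
        """Return the configured command for the input, or None."""
        return self.command_table.get(self.normalise(text))


matcher = CommandMatcher()
matcher.map_term("film", "movie")
matcher.map_term("flick", "movie")
matcher.configure_command("play movie", "PLAY_VIDEO")
command = matcher.match("Play film")
```

Because several original terms collapse onto one identification term, a single entry in the second table covers every synonym, so `command` here is `"PLAY_VIDEO"` even though "film" was never configured directly.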