摘要:
Methods of incrementally modifying a word-level finite state transducer (FST) are described for adding and removing sentences. A prefix subset of states and arcs in the FST is determined that matches a prefix portion of the sentence. A suffix subset of states and arcs in the FST is determined that matches a suffix portion of the sentence. A new sentence can then be added to the FST by appending a new sequence of states and arcs to the FST corresponding to a remainder of the sentence between the prefix and suffix. An existing sentence can be removed from the FST by removing any arcs and states between the prefix subset and the suffix subset. The resulting modified FST is locally efficient but does not satisfy global optimization criteria such as minimization.
摘要:
Systems and methods for processing results from plural speech services are described. A method includes receiving speech service results from plural speech services and service specifications corresponding to the speech service results. The results are at least one data structure representing information according to functionality of the speech services. The service specifications describe the data structure and its interpretation for each speech service. The speech service results are encoded into a unified conceptual knowledge representation of the results based on the service specification. The unified conceptual knowledge representation is provided to an application module. A method includes assessing speech service results received asynchronously from plural speech services to determine, based on a reliability measure, whether there is a reliable result among the speech service results received. If there is a reliable result, it is provided to an application module; otherwise, the method continues to assess the speech service results received.
摘要:
A method or associated system for motion adaptive speech processing includes dynamically estimating a motion profile that is representative of a user's motion based on data from one or more resources, such as sensors and non-speech resources, associated with the user. The method includes effecting processing of a speech signal received from the user, for example, while the user is in motion, the processing taking into account the estimated motion profile to produce an interpretation of the speech signal. Dynamically estimating the motion profile can include computing a motion weight vector using the data from the one or more resources associated with the user, and can further include interpolating a plurality of models using the motion weight vector to generate a motion adaptive model. The motion adaptive model can be used to enhance voice destination entry for the user and re-used for other users who do not provide motion profiles.