Abstract:
A method for processing a natural language input to a computerized system. The method parses the input to identify a query portion and a communication portion of the input. The system then determines an answer to the query portion, including identifying communication parameters from the communication portion. Upon determining the answer, the system prepares an answer to the communication and transmits that answer. If the answer requires information from a remote source, the system creates a subsidiary query to obtain that information and submits the subsidiary query to the remote source. The response to the subsidiary query is then used to compose the answer to the query. If the system concludes that the query portion does not require information from a remote source, it analyzes and answers the query locally.
Abstract:
A system and method are presented for performing dual mode speech recognition, employing a local recognition module on a mobile device and a remote recognition engine on a server device. The system accepts a spoken query from a user, and both the local recognition module and the remote recognition engine perform speech recognition operations on the query, each returning a transcription and confidence score, subject to a latency cutoff time. If both sources successfully transcribe the query, then the system accepts the result having the higher confidence score. If only one source succeeds, then that result is accepted. In either case, if the remote recognition engine does succeed in transcribing the query, then a client vocabulary is updated if the remote system result includes information not present in the client vocabulary.
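The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the names `Result` and `arbitrate` are hypothetical, a recognizer that fails or misses the latency cutoff is assumed to be represented by `None`, and the client vocabulary is assumed to be a plain set of lowercase words.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Result:
    """One recognizer's output (hypothetical type for this sketch)."""
    text: str
    confidence: float


def arbitrate(local: Optional[Result], remote: Optional[Result],
              client_vocabulary: Set[str]) -> Optional[Result]:
    """Pick a transcription; None means failure or a missed latency cutoff."""
    # Whenever the remote engine succeeded, add any words the client lacks.
    if remote is not None:
        client_vocabulary.update(remote.text.lower().split())

    if local is not None and remote is not None:
        # Both succeeded: keep the higher confidence score.
        # Ties go to the local result, an arbitrary choice made here.
        return local if local.confidence >= remote.confidence else remote

    # Only one source (or neither) succeeded: return whichever is present.
    return local if local is not None else remote
```

For example, `arbitrate(Result("play music", 0.6), Result("play muse", 0.9), vocab)` would return the remote result and add "muse" to `vocab` if it was missing.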
Abstract:
Developers can configure custom acoustic models by providing audio files with custom recordings. The custom acoustic model is trained by tuning a baseline model using the audio files. Audio files may contain custom noise to apply to clean speech for training. The custom acoustic model is provided as an alternative to a standard acoustic model. Device developers can select an acoustic model through a user interface. Speech recognition is performed on speech audio using one or more acoustic models. The result can be provided to developers through the user interface, and an error rate can be computed and also provided.
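The abstract does not say which error rate is computed. A common choice for comparing acoustic models is word error rate (WER), the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below assumes that metric; the function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

With the reference "turn on the light" and the hypothesis "turn on light", one word is deleted out of four, so the function returns 0.25.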
Abstract:
A client device receives a user request (e.g., in natural language form) to execute a command of an application. The client device delegates interpretation of the request to a response-processing server. Using domain knowledge previously provided by a developer of the application, the response-processing server determines the various possible responses that client devices could make in response to the request based on circumstances such as the capabilities of the client devices and the state of the application data. The response-processing server accordingly generates a response package that describes a number of different conditional responses that client devices could have to the request and provides the response package to the client device. The client device selects the appropriate response from the response package based on the circumstances as determined by the client device, executes the command (if possible), and provides the user with some representation of the response.
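One way to picture the response package is as an ordered list of conditional responses from which the client picks the first one whose condition holds. The sketch below is only an illustration under that assumption. `ConditionalResponse`, `select_response`, and the circumstance keys `can_place_calls` and `contact_found` are hypothetical, and a real package sent over a network would need serializable conditions rather than Python lambdas.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Circumstances = Dict[str, Any]


@dataclass
class ConditionalResponse:
    """A response that applies only when its condition holds on the client."""
    condition: Callable[[Circumstances], bool]
    message: str
    executes_command: bool


def select_response(package: List[ConditionalResponse],
                    circumstances: Circumstances) -> Optional[ConditionalResponse]:
    """Return the first response whose condition matches the client's circumstances."""
    for response in package:
        if response.condition(circumstances):
            return response
    return None


# Hypothetical package for a request such as "call Alice".
package = [
    ConditionalResponse(lambda c: not c.get("can_place_calls", False),
                        "This device cannot place calls.", False),
    ConditionalResponse(lambda c: c.get("contact_found", False),
                        "Calling now.", True),
    ConditionalResponse(lambda c: True,
                        "No matching contact was found.", False),
]
```

A client that can place calls and has found the contact would select the second entry and execute the command. A client without calling capability would select the first entry and only report the message.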
Abstract:
A command-processing server receives a natural language command from a user. The command-processing server has a set of domain command interpreters corresponding to different domains in which commands can be expressed, such as the domain of entertainment, or the domain of travel. Some or all of the domain command interpreters recognize user commands having a verbal prefix, an optional pre-filter, an object, and an optional post-filter; the pre- and post-filters may be compounded expressions involving multiple atomic filters. Different developers may independently specify the domain command interpreters and the sub-structure interpreters on which they are based.
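The command shape described above (verbal prefix, optional pre-filter, object, optional post-filter) can be modeled as a small data structure. The sketch below is an assumption-laden illustration. The `Command` class is hypothetical, filters are modeled as predicates over a dictionary, and compound filters are treated as a conjunction of atomic filters, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Item = Dict[str, Any]
AtomicFilter = Callable[[Item], bool]


@dataclass
class Command:
    """Parsed form of a command such as
    'show me nonstop flights to Paris under 500 dollars'."""
    verbal_prefix: str                      # e.g. "show me"
    obj: str                                # e.g. "flights"
    pre_filters: List[AtomicFilter] = field(default_factory=list)
    post_filters: List[AtomicFilter] = field(default_factory=list)

    def matches(self, item: Item) -> bool:
        # A compound filter is assumed here to be the AND of its atomic filters.
        return all(f(item) for f in self.pre_filters + self.post_filters)


command = Command(
    verbal_prefix="show me",
    obj="flights",
    pre_filters=[lambda item: item["stops"] == 0],            # "nonstop"
    post_filters=[lambda item: item["to"] == "Paris",         # "to Paris"
                  lambda item: item["price"] < 500],          # "under 500 dollars"
)
```

Calling `command.matches(...)` on a candidate flight record then tells whether it satisfies every filter in the command.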
Abstract:
A platform enables developers of applications with natural language interfaces, such as devices, to configure the availability of vertical domain modules in their applications. Modules can include grammars for parsing natural language expressions and interfaces to data sources. Third party developers can create modules with pricing models for their usage or access to their data. Device developers can browse or search available modules and test their performance for specific queries. The platform provides for device users to access the chosen modules as configured by device developers, and for charging and payment among users, application developers, and module developers.
Abstract:
A query-processing server provides natural language services to applications. More specifically, the query-processing server receives and stores domain knowledge information from application developers, the domain knowledge information comprising a linguistic description of the natural language user queries that application developers wish their applications to support. A first portion of the domain knowledge information is applied to transform a natural language query received from an application to an ordered sequence of question elements. A second portion of the domain knowledge information is applied to group the ordered sequence of question elements into a plurality of distinct structured questions posed by the natural language query. The distinct structured questions may then be provided to the application, which may then execute them and obtain the corresponding data referenced by the questions.
Abstract:
A method of building a natural language understanding application is provided. The method includes receiving at least one electronic record containing programming code and creating executable code from the programming code. The executable code, when executed by a processor, causes the processor to create a parse and an interpretation of a sequence of input tokens. The programming code includes an interpret-block, and the interpret-block includes an interpret-statement. The interpret-statement in turn includes a pattern expression and an action statement.
Abstract:
The technology disclosed relates to performing speech recognition for a plurality of different devices or devices in a plurality of conditions. This includes storing a plurality of acoustic models associated with different devices or device conditions, receiving speech audio including natural language utterances, receiving metadata indicative of a device type or device condition, selecting an acoustic model from the plurality in dependence upon the received metadata, and employing the selected acoustic model to recognize speech from the natural language utterances included in the received speech audio. Each of speech recognition and the storage of acoustic models can be performed locally by devices or on a network-connected server. Also provided is a platform and interface, used by device developers to select, configure, and/or train acoustic models for particular devices and/or conditions.
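Selecting an acoustic model from metadata amounts to a keyed lookup. The sketch below is a minimal illustration under stated assumptions. The model names, the metadata keys `device_type` and `device_condition`, and the fallback order (exact match, then device default, then a general model) are all hypothetical and do not come from the abstract.

```python
from typing import Dict, Optional, Tuple

ModelKey = Tuple[str, Optional[str]]

# Hypothetical registry: (device type, device condition) -> model identifier.
ACOUSTIC_MODELS: Dict[ModelKey, str] = {
    ("car", "windows_open"): "car-noisy-model",
    ("car", None): "car-default-model",
    ("smart_speaker", None): "far-field-model",
}
DEFAULT_MODEL = "general-model"


def select_acoustic_model(metadata: Dict[str, str]) -> str:
    """Choose a model identifier from the metadata received with the speech audio."""
    device_type = metadata.get("device_type", "")
    condition = metadata.get("device_condition")
    # Try the most specific key first, then the device-level default.
    for key in ((device_type, condition), (device_type, None)):
        if key in ACOUSTIC_MODELS:
            return ACOUSTIC_MODELS[key]
    return DEFAULT_MODEL
```

The same lookup could run on the device or on a server, since the abstract allows model storage and recognition in either place.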
Abstract:
The technology disclosed relates to authoring of vertical applications of natural language understanding (NLU), which analyze text or utterances and construct their meaning. In particular, it relates to new programming constructs and tools and data structures implementing those new applications.