摘要:
Architecture that monitors interaction data (e.g., search queries, query results and click-through rates), and provides users with links to other users that fall into similar categories with respect to the foregoing monitored activities (e.g., providing links to individuals and groups that share common interests and/or profiles). A search engine can be interactively coupled with one or more social networks, and that maps individuals and/or groups within respective social networks to subsets of categories associated with searches. A database stores mapped information which can be continuously updated and reorganized as links within the system mapping become stronger or weaker. The architecture can comprise a social network system that includes a database for mapping search-related information to an entity of a social network, and a search component for processing a search query for search results and returning a link to an entity of a social network based on the search query.
摘要:
A speech application accessible across a network is personalized for a particular user based on preferences for the user. The speech application can be modified based on the preferences.
摘要:
Random samples without replacement are extracted from a distributed set of items by leveraging techniques for aggregating sampled subsets of the distributed set. This provides a uniform random sample without replacement representative of the distributed set, allowing statistical information to be gleaned from extremely large sets of distributed information. Subset random samples without replacement are extracted from independent subsets of the distributed set of items. The subset random samples are then aggregated to provide a uniform random sample without replacement of a fixed size that is representative of a distributed set of items of unknown size. In one instance, a multivariate hyper-geometric distribution is sampled by breaking up the multivariate hyper-geometric distribution into a set of univariate hyper-geometric distributions. Individual items of a uniform random sample without replacement are then determined utilizing a normal approximation of the univariate hyper-geometric distributions and a finite population correction factor.
摘要:
The present invention leverages machine learning techniques to provide automatic generation of conditioning variables for constructing a data perspective for a given target variable. The present invention determines and analyzes the best target variable predictors for a given target variable, employing them to facilitate the conveying of information about the target variable to a user. It automatically discretizes continuous and discrete variables utilized as target variable predictors to establish their granularity. In other instances of the present invention, a complexity and/or utility parameter can be specified to facilitate generation of the data perspective via analyzing a best target variable predictor versus the complexity of the conditioning variable(s) and/or utility. The present invention can also adjust the conditioning variables (i.e., target variable predictors) of the data perspective to provide an optimum view and/or accept control inputs from a user to guide/control the generation of the data perspective.
摘要:
A system and method for generating staged mixture model(s) is provided. The staged mixture model includes a plurality of mixture components each having an associated mixture weight, and, an added mixture component having an initial structure, parameters and associated mixture weight. The added mixture component is modified based, at least in part, upon a case that is undesirably addressed by the plurality of mixture components using a structural expectation maximization (SEM) algorithm to modify at the structure, parameters and/or associated mixture weight of the added mixture component.The staged mixture model employs a data-driven staged mixture modeling technique, for example, for building density, regression, and classification model(s). The basic approach is to add mixture component(s) (e.g., sequentially) to the staged mixture model using an SEM algorithm.
摘要:
Visualizing Internet web traffic is disclosed. In one embodiment, a number of windows are displayed, corresponding to a number of clusters into which users have been partitioned based on similar web browsing behavior. The windows are ordered from the cluster having the greatest number of users to the cluster having the least number of users. Each window has one or more rows, where each row corresponds to a user within the cluster. Each row has an ordered number of visible units, such as blocks, where each block corresponds to a web page visited by the user. The blocks can be color coded by the type of web page they represent. In one embodiment, the corresponding cluster models for the clusters are alternatively displayed in the windows.
摘要:
Visualization of high-dimensional data sets is disclosed, particularly the display of a network model for a data set. The network, such as a dependency or a Bayesian network, has a number of nodes having dependencies thereamong. The network can be displayed items and connections, corresponding to nodes and dependencies, respectively. Selection of a particular item in one embodiment results in the display of the local distribution associated with the node for the item. In one embodiment, only a predetermined number of the items are shown, such as only the items representing the most popular nodes. Furthermore, in one embodiment, in response to receiving a user input, a sub-set of the connections is displayed, proportional to the user input. In another embodiment, a particular item is displayed in an emphasized manner, and the particular connections representing dependencies including the node represented by the particular item, as well as the items representing nodes also in these dependencies, are also displayed in the emphasized manner. Furthermore, in one embodiment, only an indicated sub-set of the items is displayed.
摘要:
The invention performs speech recognition using an array of mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of those states. In accordance with the invention, the MBNs encode the probabilities of observing the sets of acoustic observations given the utterance of a respective one of said parts of speech. Each of the HSBNs encodes the probabilities of observing the sets of acoustic observations given the utterance of a respective one of the parts of speech and given a hidden common variable being in a particular state. Each HSBN has nodes corresponding to the elements of the acoustic observations. These nodes store probability parameters corresponding to the probabilities with causal links representing dependencies between ones of said nodes.
摘要:
A method and system for specifying a selection query for a collection of data items. The system allows a user to define a various conditions (e.g., "Supervisor=Smith") that relate to the collection. A unique icon is then assigned to represent each condition. These icon can either be assigned automatically by the system or assigned by a user. When a selection query is to be specified, the system displays a selection query grid. The selection query grid contains a row for each possible combination of the defined conditions. Each possible combination is represented by displaying the icons for the conditions in that combination in the row. A user can then select which combinations should form the selection query by selecting rows of the selection query grid. The selection query is the logical-AND of each condition or logical inverse of each condition of a selected combination and the logical-OR of all the selected combinations. The system then uses this selection query to retrieve the data items from the collection.
摘要:
Architecture that monitors interaction data (e.g., search queries, query results and click-through rates), and provides users with links to other users that fall into similar categories with respect to the foregoing monitored activities (e.g., providing links to individuals and groups that share common interests and/or profiles). A search engine can be interactively coupled with one or more social networks, and that maps individuals and/or groups within respective social networks to subsets of categories associated with searches. A database stores mapped information which can be continuously updated and reorganized as links within the system mapping become stronger or weaker. The architecture can comprise a social network system that includes a database for mapping search-related information to an entity of a social network, and a search component for processing a search query for search results and returning a link to an entity of a social network based on the search query.