Abstract:
Extraction of semantic information and generation of semantic attributes allow for improved organization and management of data. Semantic attributes are generated automatically, eliminating the need for manual entry of attribute information. A semantic file network may further be constructed based on similarities between files, where the similarities are derived from the semantic attribute information. Semantic links representing a semantic relationship may be built between similar or relevant files. In addition, user operations and user operation patterns may be considered in building the file network. Semantic attributes and information may further facilitate browsing of file systems and improve the accuracy and speed of queries.
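The link-building step can be illustrated with a minimal sketch. The abstract does not name a similarity measure, so Jaccard overlap of attribute sets, the 0.5 threshold, and the file names and attributes below are all assumptions made for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two attribute sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_semantic_links(attrs, threshold=0.5):
    """Return {file: linked files} for pairs whose similarity meets the threshold.

    attrs maps a file name to its set of semantic attributes (hypothetical data).
    """
    links = {name: set() for name in attrs}
    for f1, f2 in combinations(attrs, 2):
        if jaccard(attrs[f1], attrs[f2]) >= threshold:
            links[f1].add(f2)
            links[f2].add(f1)
    return links


# Hypothetical files with automatically generated attributes.
files = {
    "trip_report.doc": {"travel", "budget", "paris"},
    "expenses.xls": {"budget", "travel", "receipts"},
    "recipe.txt": {"cooking", "pasta"},
}
network = build_semantic_links(files, threshold=0.5)
```

Here the two travel-related files share two of four distinct attributes, so they are linked, while the recipe file stays unlinked.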
Abstract:
A system and method are described for providing an encoding scheme for a bit stream and for displaying or printing the encoded bit stream. A pen with a camera may capture an image of a portion of the displayed or printed encoded bit stream. The captured image may then be decoded to provide an indication of the location of the image in relation to the encoded bit stream. The encoding scheme includes information regarding orientation, thus making decoding easier.
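The idea of locating a captured fragment within an encoded bit stream can be sketched as follows. The abstract does not describe the actual encoding, so this sketch swaps in a binary de Bruijn sequence, in which every 3-bit window occurs exactly once when read cyclically. The names `STREAM`, `build_position_table`, and `decode_position` are hypothetical, and the orientation information mentioned in the abstract is not modeled.

```python
# A binary de Bruijn sequence of order 3: each 3-bit window is unique (cyclically).
# Assumption for illustration; not the encoding described in the abstract.
STREAM = [0, 0, 0, 1, 0, 1, 1, 1]
WINDOW = 3


def build_position_table(stream, window):
    """Map every cyclic window of the stream to its starting position."""
    n = len(stream)
    table = {}
    for start in range(n):
        key = tuple(stream[(start + i) % n] for i in range(window))
        if key in table:
            raise ValueError("windows are not unique; position would be ambiguous")
        table[key] = start
    return table


def decode_position(table, captured_bits):
    """Return the stream position of the captured bits, or None if unknown."""
    return table.get(tuple(captured_bits))


table = build_position_table(STREAM, WINDOW)
position = decode_position(table, [1, 0, 1])
```

Because each window is unique, a lookup in the precomputed table is enough to recover the position of the captured bits.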
Abstract:
Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user and segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results, filtering the keyword sets by a threshold, and displaying one or more of the keyword sets with high semantic relevance.
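The segmentation and thresholding steps can be sketched under simplifying assumptions. The abstract does not say how sessions are delimited or exactly how frequency is measured, so this sketch assumes a 30-minute inactivity gap and counts how often two queries occur in the same session as a stand-in for the frequency signal. All names and the sample log are hypothetical.

```python
from collections import Counter
from itertools import combinations


def segment_sessions(log, max_gap=1800):
    """Split (timestamp_seconds, query) entries into sessions at gaps over max_gap."""
    sessions, current, last_ts = [], [], None
    for ts, query in sorted(log):
        if last_ts is not None and ts - last_ts > max_gap:
            sessions.append(current)
            current = []
        current.append(query)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions


def relevant_pairs(sessions, threshold=2):
    """Keep query pairs that co-occur in at least `threshold` sessions."""
    counts = Counter()
    for session in sessions:
        for pair in combinations(sorted(set(session)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= threshold}


# Hypothetical search log for one user.
log = [
    (0, "cheap flights"), (60, "hotel deals"),
    (10_000, "cheap flights"), (10_050, "hotel deals"), (10_100, "car rental"),
]
sessions = segment_sessions(log)
pairs = relevant_pairs(sessions)
```

With this log, only the pair that appears together in both sessions survives the threshold.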
Abstract:
A fast decoding technique is described for decoding a position of a bit in a pattern provided on a media surface. The technique can generate large numbers of solution candidates quickly by switching or flipping bits and utilizing a recursion scheme. The fast decoding technique may be employed to simultaneously decode multiple dimensions of a pattern on the media surface.
Abstract:
An optical resonator is provided, along with methods for making it and devices and sub-assemblies that include it. For example, an electronic device (FIG. 4) may include a first electronic component (172) designed to be photoactive to radiation having a first wavelength and a second electronic component (174) designed to be photoactive to radiation having a second wavelength. The device may also include a cavity that defines an optical resonator having a cavity length such that the optical resonator resonates in successive resonant modes located at the first and second wavelengths.
Abstract:
Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. The query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
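The time-dependent branch can be sketched with a simple periodicity test. The abstract does not specify how periodicity is detected, so this sketch assumes autocorrelation at a candidate period with a 0.5 cutoff, and a naive seasonal forecast that repeats the value from one period earlier. The function names and the sample series are hypothetical.

```python
def autocorrelation(series, lag):
    """Autocorrelation of the series at the given lag (0.0 for a constant series)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var


def is_time_dependent(series, period, threshold=0.5):
    """Treat a query as time-dependent if its frequency repeats with the period."""
    return autocorrelation(series, period) >= threshold


def forecast_periodic(series, period):
    """Naive seasonal forecast: repeat the value observed one period earlier."""
    return series[-period]


# Hypothetical daily counts with a weekend spike, repeated for four weeks.
weekly = [10, 10, 10, 10, 10, 50, 50] * 4
flat = [20] * 28
```

For the weekly series, `is_time_dependent(weekly, 7)` holds and the next-day forecast repeats the Monday value, while the flat series shows no periodicity.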
Abstract:
Described is a unified digital ink recognizer that recognizes various different types of digital ink data, such as handwritten character data and custom data, e.g., sketched shapes, handwritten gestures, and/or drawn pictures, without further participation by a user such as recognition mode selection or parameter input. For a custom item, the output may be a Unicode value from a private use area of Unicode. Building the unified digital ink recognizer may include defining the data set to be recognized, extracting features of training samples corresponding to the data set items to build a recognizer model, evaluating the recognizer model using testing data, and modifying the recognizer model using tuning data. The extracted features may be processed into feature data for a multi-dimensional nearest neighbor recognizer approach; the extracted features for the samples of each class are calculated and combined into the feature set for that class in the resulting recognizer model.
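A nearest neighbor recognizer over per-class feature sets can be sketched as follows. The abstract does not say how the features of a class are combined, so averaging them into one vector per class is an assumption, as are the two-dimensional feature vectors and the choice of U+E000 (the first code point of the Unicode Private Use Area) to stand for a custom shape.

```python
import math
from collections import defaultdict

TRIANGLE = "\ue000"  # hypothetical private-use code point for a sketched triangle


def build_model(samples):
    """Combine training samples into one mean feature vector per class.

    samples is an iterable of (label, feature_vector) pairs.
    """
    grouped = defaultdict(list)
    for label, vec in samples:
        grouped[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def recognize(model, vec):
    """Return the label whose class vector is closest to vec (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], vec))


# Hypothetical training data: a character class and a custom shape class.
training = [
    ("A", [0.9, 0.1]), ("A", [0.8, 0.2]),
    (TRIANGLE, [0.1, 0.9]), (TRIANGLE, [0.2, 0.8]),
]
model = build_model(training)
```

Characters and custom items share one model, so `recognize` needs no mode selection: the same call returns either an ordinary character or a private-use code point.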
Abstract:
Described is a technology that provides an integrated platform for users to use different kinds of digital ink (e.g., handwritten characters, sketched shapes, handwritten formulas) when interacting with computer programs. The platform interprets the user's digital ink input and outputs one or more associated items into an application program. The output items can be customized for different application programs. In one aspect, the platform includes an ink panel having different operating modes for receiving digital ink, and a recognition service that recognizes different types of digital ink. The recognition service may include a unified recognizer that recognizes different types of digital ink, e.g., characters and shapes. Another recognizer, such as an equation recognizer, may also be included. If the recognition result is text while in a non-text mode, the text may be used in a keyword search to locate items; otherwise, the recognition result may be used without keyword searching.
Abstract:
Described is a technology by which software instrumentation data collected from user program sessions are analyzed, including by determining program usage metrics and/or command usage metrics. Information representative of the program usage metrics and/or the command usage metrics is output, such as in the form of a report. The software instrumentation data may be further analyzed, such as to determine at least one usage trend over time, and to determine user groups. For example, a usage subset of sessions that meet specified session usage criteria based on a set of session data may be located, along with a subset of users based on users whose sessions meet specified user criteria. The usage and user subsets may be combined via Boolean logic to produce a result set.
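The combination of a usage subset and a user subset via Boolean logic can be sketched with plain Python sets. The session fields (`user`, `minutes`, `commands`), the criteria, and the sample data are all hypothetical; the abstract does not define a concrete schema.

```python
def usage_subset(sessions, session_criterion):
    """Sessions that meet the session usage criterion."""
    return [s for s in sessions if session_criterion(s)]


def user_subset(sessions, user_criterion):
    """Users whose own list of sessions meets the user criterion."""
    by_user = {}
    for s in sessions:
        by_user.setdefault(s["user"], []).append(s)
    return {user for user, own in by_user.items() if user_criterion(own)}


# Hypothetical instrumentation data, one record per program session.
sessions = [
    {"user": "ann", "minutes": 45, "commands": 120},
    {"user": "ann", "minutes": 5, "commands": 3},
    {"user": "bob", "minutes": 50, "commands": 200},
    {"user": "bob", "minutes": 40, "commands": 90},
    {"user": "bob", "minutes": 10, "commands": 15},
    {"user": "cat", "minutes": 2, "commands": 1},
]

long_sessions = usage_subset(sessions, lambda s: s["minutes"] >= 30)
long_session_users = {s["user"] for s in long_sessions}
frequent_users = user_subset(sessions, lambda own: len(own) >= 3)

both = long_session_users & frequent_users       # AND
either = long_session_users | frequent_users     # OR
only_long = long_session_users - frequent_users  # AND NOT
```

Representing each subset as a set of users makes the Boolean operators map directly onto set intersection, union, and difference.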
Abstract:
Described is a technology by which high-dimensional source data, corresponding to rows of records with identifiers and columns comprising dimensions of data values, is processed into a file model for efficient access. An inverted index corresponding to any dimension is built by mapping data from raw dimension values to mapped values based on mapping entries in a dimension table. The record identifiers are arranged into subgroups based on their mapped value; a count and/or an offset may be maintained for locating each of the subgroups. The raw values for a dimension are maintained within a raw value file. For sparse data, the raw value file may be compressed, e.g., by excluding nulls and associating a record identifier with each non-null. A data manager provides access to data in the data files, such as by offering various functions, using caching for efficiency.
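The inverted index layout can be sketched in memory for a single dimension. This is a simplified illustration, not the file model itself: the function names, the use of sorted order to assign mapped values, and the sample column are assumptions, and no files or caching are involved.

```python
def build_inverted_index(raw_values):
    """Build an inverted index for one dimension.

    raw_values[i] is the raw value of record i, or None for a null.
    Returns (dimension_table, record_ids, offsets, counts).
    """
    # Dimension table: raw value -> small integer mapped value.
    distinct = sorted({v for v in raw_values if v is not None})
    dimension_table = {raw: mapped for mapped, raw in enumerate(distinct)}

    # Arrange record identifiers into one subgroup per mapped value.
    groups = [[] for _ in distinct]
    for record_id, raw in enumerate(raw_values):
        if raw is not None:
            groups[dimension_table[raw]].append(record_id)

    # Flatten the subgroups, keeping an offset and a count to locate each one.
    record_ids, offsets, counts = [], [], []
    for group in groups:
        offsets.append(len(record_ids))
        counts.append(len(group))
        record_ids.extend(group)
    return dimension_table, record_ids, offsets, counts


def lookup(index, raw):
    """Record identifiers whose dimension value equals raw."""
    dimension_table, record_ids, offsets, counts = index
    mapped = dimension_table.get(raw)
    if mapped is None:
        return []
    start = offsets[mapped]
    return record_ids[start:start + counts[mapped]]


def compress_sparse(raw_values):
    """Drop nulls, pairing each remaining value with its record identifier."""
    return [(i, v) for i, v in enumerate(raw_values) if v is not None]


# Hypothetical sparse "country" dimension for six records.
column = ["US", None, "FR", "US", None, "DE"]
index = build_inverted_index(column)
```

A lookup touches only the dimension table and one contiguous slice of record identifiers, which is the access pattern the offsets and counts are meant to support.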