Automatic document classification via content analysis at storage time

    Publication No.: US09239876B2

    Publication date: 2016-01-19

    Application No.: US13692699

    Filing date: 2012-12-03

    Inventor: Michael Kraley

    IPC classification: G06F17/30

    Abstract: Techniques are disclosed for efficiently and automatically classifying textual documents or files. In some embodiments, the classification process is integrated into or otherwise made part of the storage function, such that when the user initiates a save process for a given file, the file is processed through a classifier prior to (or contemporaneously with) completing the save function. In some such embodiments, textual content of the file is analyzed using natural language processing to identify a main or substantial concept discussed in the file, and one or more corresponding tags are then assigned to that file. Subsequently, the user can access that file based on the one or more tags, for instance, through a user interface that allows the user to select one or more content categories associated with the assigned tags. The files can be text-based, but may include other content as well, such as images, video, and audio.
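
The save-time flow described in this abstract can be sketched roughly as follows. This is only an illustration, not the patented method: the keyword table `CATEGORY_KEYWORDS`, the dictionary used as a stand-in for storage, and the function names are all assumptions, and simple keyword counting replaces real natural language processing.

```python
import re
from collections import Counter

# Hypothetical category vocabulary; a real classifier would use an NLP model.
CATEGORY_KEYWORDS = {
    "finance": {"invoice", "payment", "budget", "tax"},
    "travel": {"flight", "hotel", "itinerary", "visa"},
}

def classify(text, max_tags=1):
    """Return the categories whose keywords occur most often in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = Counter()
    for category, keywords in CATEGORY_KEYWORDS.items():
        scores[category] = sum(1 for word in words if word in keywords)
    return [category for category, hits in scores.most_common(max_tags) if hits > 0]

def save_with_tags(store, name, text):
    """Classify the content as part of the save, then store content and tags together."""
    tags = classify(text)
    store[name] = {"content": text, "tags": tags}
    return tags

def find_by_tag(store, tag):
    """Look files up by an assigned tag rather than by name or location."""
    return sorted(name for name, record in store.items() if tag in record["tags"])
```

Because tagging happens inside `save_with_tags`, every stored file is classified without a separate user step, which is the point the abstract emphasizes.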

    Metadata-based virtual machine configuration
    2.
    Granted patent (in force)

    Publication No.: US09170834B2

    Publication date: 2015-10-27

    Application No.: US13665890

    Filing date: 2012-10-31

    Applicant: Google Inc.

    IPC classification: G06F9/455 G06F17/30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for associating one or more of a plurality of metadata collections with one or more respective identifiers, wherein each metadata collection includes one or more pairings of metadata attributes with metadata values, and wherein each identifier is one of a project identifier, a tag identifier or an instance identifier; identifying, based on identifier information associated with a virtual machine instance, one or more metadata values to be provided to the virtual machine instance, wherein the identifier information specifies one or more of a project identifier, a tag identifier and an instance identifier, and wherein each identified metadata value belongs to a metadata collection associated with an identifier that is specified in the identifier information; and providing, to the virtual machine instance, the identified one or more metadata values.
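
The lookup described in this abstract might be sketched as below. The data layout (a dictionary keyed by `(kind, identifier)` pairs), the function name, and the precedence order (project, then tag, then instance) are assumptions made for illustration; the abstract does not say how conflicting values are resolved.

```python
def resolve_metadata(collections, identifier_info):
    """Collect the metadata values that apply to one virtual machine instance.

    collections: {(kind, identifier): {attribute: value}}, kind in
                 {"project", "tag", "instance"}.
    identifier_info: e.g. {"project": "p1", "tags": ["web"], "instance": "vm-7"}.
    """
    keys = []
    if "project" in identifier_info:
        keys.append(("project", identifier_info["project"]))
    for tag in identifier_info.get("tags", []):
        keys.append(("tag", tag))
    if "instance" in identifier_info:
        keys.append(("instance", identifier_info["instance"]))

    resolved = {}
    for key in keys:
        # Assumed precedence: later (more specific) collections override earlier ones.
        resolved.update(collections.get(key, {}))
    return resolved
```

Only collections whose identifier appears in `identifier_info` contribute values, so a collection attached to an unrelated tag is never delivered to the instance.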

    ENABLING AND PERFORMING COUNT-DISTINCT QUERIES ON A LARGE SET OF DATA
    3.
    Patent application (in force)

    Publication No.: US20150161185A1

    Publication date: 2015-06-11

    Application No.: US14284080

    Filing date: 2014-05-21

    IPC classification: G06F17/30

    Abstract: A system, method, and apparatus are provided for supporting and/or executing count-distinct queries. A large set of data (e.g., tens or hundreds of millions of event records) is condensed daily to generate presence bitmaps to reflect the distinctiveness of a selected data dimension S (e.g., user ID) for one or more key dimensions g1, g2, ... (e.g., advertisement ID, campaign ID, advertiser ID). The condensation process eliminates duplication and yields a single value (e.g., 1 or 0) for each tuple [S, g1, ...] to represent the distinctiveness of each value in the S dimension to each combination of values in the grouping dimensions. On a monthly basis, the daily values are condensed to yield a single value for the month, and a similar process is applied on any other desired time granularities (e.g., year). The condensed data may be generated for any combination of selected dimension(s) and grouping dimension(s).
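
A minimal sketch of the presence-bitmap idea, assuming a single grouping dimension and using Python integers as bitmaps. The function names and the `user_index` mapping from user IDs to bit positions are hypothetical; the abstract does not describe a concrete encoding.

```python
from collections import defaultdict

def condense_day(events, user_index):
    """Condense one day's (user_id, group_key) events into {group_key: bitmap}.

    Each distinct user gets one bit position; duplicates collapse into a single 1 bit.
    """
    bitmaps = defaultdict(int)
    for user_id, group_key in events:
        bit = user_index.setdefault(user_id, len(user_index))
        bitmaps[group_key] |= 1 << bit
    return dict(bitmaps)

def condense_period(daily_bitmaps):
    """Merge daily bitmaps into one bitmap per group for a longer period (bitwise OR)."""
    merged = defaultdict(int)
    for day in daily_bitmaps:
        for group_key, bitmap in day.items():
            merged[group_key] |= bitmap
    return dict(merged)

def count_distinct(bitmaps, group_key):
    """Count distinct users for a group by counting set bits."""
    return bin(bitmaps.get(group_key, 0)).count("1")
```

The same `user_index` has to be shared across days so that a given user always maps to the same bit; otherwise the OR in `condense_period` would not deduplicate users who appear on several days.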

    TAGGED MANAGEMENT OF STORED ITEMS
    5.
    Patent application (in force)

    Publication No.: US20140359505A1

    Publication date: 2014-12-04

    Application No.: US13909965

    Filing date: 2013-06-04

    Applicant: Apple Inc.

    IPC classification: G06F3/0482 G06F17/30

    CPC classification: G06F17/30126 G06F17/30103

    Abstract: In one embodiment, a non-transitory computer-readable medium stores instructions for implementing tagged management of stored items, wherein an embodiment can receive an input indicating the selection of a graphical representation of a file in the GUI of an operating system, and can also receive an input indicating the intent to attach a tag to the file. The system can perform an automatic search through the metadata of files associated with the user and the user account to find the set of files having the tag, responsive to the request to display the set of files. Having located the set of files, an operation can be performed to display the set of files having the requested tag, regardless of the storage location of the files.

    METADATA-BASED VIRTUAL MACHINE CONFIGURATION
    6.
    Patent application (in force)

    Publication No.: US20140123136A1

    Publication date: 2014-05-01

    Application No.: US13665890

    Filing date: 2012-10-31

    Applicant: Google Inc.

    IPC classification: G06F9/455

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for associating one or more of a plurality of metadata collections with one or more respective identifiers, wherein each metadata collection includes one or more pairings of metadata attributes with metadata values, and wherein each identifier is one of a project identifier, a tag identifier or an instance identifier; identifying, based on identifier information associated with a virtual machine instance, one or more metadata values to be provided to the virtual machine instance, wherein the identifier information specifies one or more of a project identifier, a tag identifier and an instance identifier, and wherein each identified metadata value belongs to a metadata collection associated with an identifier that is specified in the identifier information; and providing, to the virtual machine instance, the identified one or more metadata values.

    Elastic Complex Event Processing
    7.
    Patent application (in force)

    Publication No.: US20140006384A1

    Publication date: 2014-01-02

    Application No.: US13536698

    Filing date: 2012-06-28

    IPC classification: G06F17/30

    Abstract: Systems and methods according to embodiments provide elasticity for complex event processing (CEP) systems. Embodiments may comprise at least the following three components: (1) incremental query optimization, (2) operator placement, and (3) cost explanation. Incremental query optimization avoids computing identical results more than once by performing operator-level query reuse and subsumption. Using automatic operator placement, a centralized CEP engine can be transformed into a distributed one by dynamically distributing and adjusting the execution according to unpredictable changes in data and query load. Cost explanation functionality can provide end users with near real-time insight into the monetary cost of the whole system, down to operator-level granularity. Combining these components allows a CEP system to be scaled up and down.
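
Operator-level query reuse can be illustrated with a small sketch. It assumes each query is a linear pipeline of `(operator, parameter)` steps and covers only exact reuse of identical operator prefixes; subsumption, placement, and cost explanation are not modeled. The class and method names are hypothetical.

```python
class OperatorGraph:
    """Registers query pipelines and shares operators that compute identical results."""

    def __init__(self):
        # Signature (upstream operator id, operator name, parameter) -> operator id.
        self.operators = {}

    def add_query(self, pipeline):
        """Add a pipeline; return (id of its final operator, number of new operators)."""
        upstream = None
        created = 0
        for op_name, param in pipeline:
            signature = (upstream, op_name, param)
            if signature not in self.operators:
                # No running operator produces this result yet, so create one.
                self.operators[signature] = len(self.operators)
                created += 1
            upstream = self.operators[signature]
        return upstream, created
```

Two queries that read the same source and apply the same filter share those two operators and differ only in their final step, so the second query adds a single new operator.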

    Cost Monitoring and Cost-Driven Optimization of Complex Event Processing System
    8.
    Patent application (in force)

    Publication No.: US20130346390A1

    Publication date: 2013-12-26

    Application No.: US13529681

    Filing date: 2012-06-21

    IPC classification: G06F17/30

    Abstract: A cost monitoring system can monitor a cost of queries executing in a complex event processing system, running on top of a pay-as-you-go cloud infrastructure. Certain embodiments may employ a generic, cloud-platform independent cost model, multi-query optimization, cost calculation, and/or operator placement techniques, in order to monitor and explain query cost down to an operator level. Certain embodiments may monitor costs in near real-time, as they are created. Embodiments may function independent of an underlying complex event processing system and the underlying cloud platform. Embodiments can optimize a work plan of the cloud-based system so as to minimize cost for the end user, matching the cost model of the underlying cloud platform.

    Methods and systems for managing permissions data
    9.
    Patent application (in force)

    Publication No.: US20080306954A1

    Publication date: 2008-12-11

    Application No.: US11811189

    Filing date: 2007-06-07

    Applicant: John M. Hornqvist

    Inventor: John M. Hornqvist

    IPC classification: G06F17/30

    Abstract: Methods, systems and computer readable media which use permissions checking when deciding whether to allow access to a file are described. In one exemplary embodiment, a method includes receiving a notification of a change of permissions of a directory in a hierarchical file system and determining, in response to the notification, whether to update partially a permissions cache which is used in screening access based on permissions, such as access to search results. The determining may include a comparison of an identifier of the directory to a data structure of cached directories which have files represented in the permissions cache.
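
The partial cache update can be sketched as follows. The class name, the use of POSIX-style path strings as directory identifiers, and the choice to also invalidate entries in subdirectories of the changed directory are assumptions for illustration; the abstract only states that the changed directory is compared against a structure of cached directories.

```python
import posixpath

class PermissionsCache:
    """Caches per-user access decisions and tracks which directories they came from."""

    def __init__(self):
        self.allowed = {}      # (user, file_path) -> bool
        self.cached_dirs = {}  # directory -> set of cache keys for files in it

    def put(self, user, file_path, allowed):
        key = (user, file_path)
        self.allowed[key] = allowed
        self.cached_dirs.setdefault(posixpath.dirname(file_path), set()).add(key)

    def on_directory_permissions_changed(self, directory):
        """Invalidate only entries under the changed directory; return how many."""
        prefix = directory.rstrip("/") + "/"
        affected = [d for d in self.cached_dirs
                    if d == directory or d.startswith(prefix)]
        removed = 0
        for d in affected:
            for key in self.cached_dirs.pop(d):
                self.allowed.pop(key, None)
                removed += 1
        return removed
```

When the changed directory has no files represented in the cache, `affected` is empty and the cache is left untouched, which is the cheap path the comparison is meant to enable.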

    Systems and methods for providing a user interface with an automatic search menu

    Publication No.: US20060167851A1

    Publication date: 2006-07-27

    Application No.: US11045171

    Filing date: 2005-01-27

    Applicant: Sergei Ivanov

    Inventor: Sergei Ivanov

    IPC classification: G06F17/30

    Abstract: Systems and methods are provided for a user interface with an automatic search menu. The interface exposes commands to the user as an instantly searchable hierarchy. Visually, this is represented as a tree view with an edit box above it. There is no "Search" or "Go" button to press. One second after any character is entered in the edit box, the computer reduces the displayed hierarchy down to only those items that match the keyword entered. Entering another character before one second expires resets the timer. This allows the user to type in as little or as much of the keyword as necessary to reduce the hierarchy to a few items, one of which can then be mouse-clicked. This method scales to a large number of commands.
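
The timer-reset behavior and the tree filtering can be sketched as below. The tree representation (`(label, children)` tuples), the class name, and the explicit `now` timestamps passed in by the caller are assumptions that keep the example free of any GUI toolkit; substring matching is also an assumption, since the abstract does not define what "match" means.

```python
def filter_tree(node, keyword):
    """Return a pruned copy keeping matching items and their ancestors, or None."""
    label, children = node
    kept = [child for child in (filter_tree(c, keyword) for c in children)
            if child is not None]
    if keyword.lower() in label.lower() or kept:
        return (label, kept)
    return None

class DebouncedSearch:
    """Filters the command tree once typing has paused for `delay` seconds."""

    def __init__(self, tree, delay=1.0):
        self.tree = tree
        self.delay = delay
        self.text = ""
        self.deadline = None
        self.view = tree

    def type_char(self, char, now):
        self.text += char
        self.deadline = now + self.delay  # every keystroke restarts the timer

    def tick(self, now):
        """Call periodically; applies the filter when the timer has expired."""
        if self.deadline is not None and now >= self.deadline:
            self.view = filter_tree(self.tree, self.text)
            self.deadline = None
        return self.view
```

In a real interface `tick` would be driven by the toolkit's timer facility; passing timestamps explicitly simply makes the reset logic easy to follow and to test.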