-
Publication No.: US07383234B2
Publication date: 2008-06-03
Application No.: US11157602
Filing date: 2005-06-21
Applicant: Raman S. Iyer, Ioan Bogdan Crivat, C. James MacLennan, Scott C. Oveson, Rong J. Guan, ZhaoHui Tang, Pyungchul Kim, Irina G. Gorbach
Inventor: Raman S. Iyer, Ioan Bogdan Crivat, C. James MacLennan, Scott C. Oveson, Rong J. Guan, ZhaoHui Tang, Pyungchul Kim, Irina G. Gorbach
IPC classification: G06N5/00
CPC classification: G06F17/30539, G06F2216/03
Abstract: The subject disclosure pertains to extensible data mining systems, means, and methodologies. For example, a data mining system is disclosed that supports plug-in or integration of non-native mining algorithms, perhaps provided by third parties, such that they function the same as built-in algorithms. Furthermore, non-native data mining viewers may also be seamlessly integrated into the system for displaying the results of one or more algorithms including those provided by third parties as well as those built-in. Still further yet, support is provided for extending data mining languages to include user-defined functions (UDFs).
-
Publication No.: US07398268B2
Publication date: 2008-07-08
Application No.: US11049031
Filing date: 2005-02-02
Applicant: Pyungchul Kim, ZhaoHui Tang, Ioan Bogdan Crivat, C. James MacLennan, Raman S. Iyer, Irina G. Gorbach
Inventor: Pyungchul Kim, ZhaoHui Tang, Ioan Bogdan Crivat, C. James MacLennan, Raman S. Iyer, Irina G. Gorbach
IPC classification: G06F17/30
CPC classification: G06F17/30539, G06F17/30595, G06Q40/00, Y10S707/99933
Abstract: A system that facilitates data mining comprises a reception component that receives command(s) in a declarative language that relate to utilizing an output of a first data mining model as an input to a second data mining model. An implementation component analyzes the received command(s) and implements the command(s) with respect to the first and second data mining models. In another aspect of the subject invention, the reception component can receive further command(s) in a declarative language with respect to causing one or more of the first and second data mining models to output a prediction, the prediction desirably generated without prediction input, the implementation component causes the one or more of the first and second data mining models to output the prediction.
-
Publication No.: US07689703B2
Publication date: 2010-03-30
Application No.: US11069342
Filing date: 2005-03-01
Applicant: Mosha Pasumansky, Marius Dumitru, Adrian Dumitrascu, Cristian Petculescu, Akshai M. Mirchandani, Paul J. Sanders, Thulusalamatom Krishnamurthi Anand, Richard R. Tkachuk, Raman S. Iyer, Thomas P. Conlon, Alexander Berger, Sergei Gringauze, Ioan Bogdan Crivat, C. James MacLennan, Rong J. Guan
Inventor: Mosha Pasumansky, Marius Dumitru, Adrian Dumitrascu, Cristian Petculescu, Akshai M. Mirchandani, Paul J. Sanders, Thulusalamatom Krishnamurthi Anand, Richard R. Tkachuk, Raman S. Iyer, Thomas P. Conlon, Alexander Berger, Sergei Gringauze, Ioan Bogdan Crivat, C. James MacLennan, Rong J. Guan
CPC classification: G06F17/30893
Abstract: The subject invention relates to systems and methods that extend the network data access capabilities of mark-up language protocols. In one aspect, a network data transfer system is provided. The system includes a protocol component that employs a computerized mark-up language to facilitate data interactions between network components, whereby the data interactions were previously limited or based on a statement command associated with the markup language. An extension component operates with the protocol component to support the data transactions, where the extension component supplies at least one other command from the statement command to facilitate the data interactions.
-
Publication No.: US07593927B2
Publication date: 2009-09-22
Application No.: US11373319
Filing date: 2006-03-10
CPC classification: G06F17/30943, G06F2216/03, Y10S707/99933
Abstract: A standard mechanism for directly accessing unstructured data types (e.g., image, audio, video, gene sequencing and text data) in accordance with data mining operations is provided. The subject innovation can enable access to unstructured data directly from within the data mining engine or tool. Accordingly, the innovation enables multiple vendors to provide algorithms for mining unstructured data on a data mining platform (e.g., an SQL-brand server), thereby increasing adoption. As well, the subject innovation allows users to directly mine unstructured data that is not fixed-length, without pre-processing and tokenizing the data external to the data mining engine. In accordance therewith, the innovation can provide a mechanism to expand declarative language content types to include an “unstructured” data type thereby enabling a user and/or application to affirmatively designate mining data as an unstructured type.
-
Publication No.: US07451137B2
Publication date: 2008-11-11
Application No.: US11069121
Filing date: 2005-02-28
IPC classification: G06F17/30
CPC classification: G06F17/30539, G06F17/30421, G06F17/30595, Y10S707/99932, Y10S707/99934, Y10S707/99944
Abstract: Architecture that facilitates syntax processing for data mining statements. The system includes a syntax engine that receives as an input a query statement which, for example, is a data mining request. The statement can be generated from many different sources, e.g., a client application and/or a server application, and requests query processing of a data source (e.g., a relational database) to return a result set. The syntax engine includes a binding component that converts the query statement into an encapsulated statement in accordance with a predefined grammar. The encapsulated statement includes both data and data operations to be performed on the data of the data source, and which is understood by the data source. An execution component processes the encapsulated statement against the data source to return the desired result set.
-
Publication No.: US07797264B2
Publication date: 2010-09-14
Application No.: US11670735
Filing date: 2007-02-02
IPC classification: G06F17/00
CPC classification: G06F17/30312
Abstract: Data expressed as tabular data having columns and rows can be analyzed and data determined to be an exception can be flagged. In addition, reasons for flagging such data as exceptions can be presented to a user to facilitate further analysis and action on the data. A predictive analysis component can utilize a clustering algorithm with predictive capabilities to autonomously analyze the data. Periodic re-analysis of the data can be performed to determine if exceptions have changed based on new or modified data.
-
Publication No.: US20080189238A1
Publication date: 2008-08-07
Application No.: US11670735
Filing date: 2007-02-02
CPC classification: G06F17/30312
Abstract: Data expressed as tabular data having columns and rows can be analyzed and data determined to be an exception can be flagged. In addition, reasons for flagging such data as exceptions can be presented to a user to facilitate further analysis and action on the data. A predictive analysis component can utilize a clustering algorithm with predictive capabilities to autonomously analyze the data. Periodic re-analysis of the data can be performed to determine if exceptions have changed based on new or modified data.
-
Publication No.: US20080189639A1
Publication date: 2008-08-07
Application No.: US11670783
Filing date: 2007-02-02
CPC classification: G06F17/245
Abstract: Fields contained in data expressed as tabular data having columns and rows can initially be marked as exceptions, wherein a column within a row can be the potential cause of the exception. A user configurable parameter can be utilized to change the sensitivity or allowable exceptions for each row and/or column, to increase or decrease the number of exceptions detected. As data within each field are modified, added or deleted, or when the configurable parameter is changed, the exceptions marked can be automatically updated. Such updated exceptions can be the same or different from the initially marked exceptions. As such, a user can evaluate data and determine whether various changes within the data will change various outcomes.
-
Publication No.: US20080189237A1
Publication date: 2008-08-07
Application No.: US11670656
Filing date: 2007-02-02
CPC classification: G06N7/00
Abstract: Seeking goals in data that can be expressed as rows and columns is provided through predictive analytics. If a desired goal is achievable, the changes to the rows and/or columns that can achieve the goal are presented to a user. If the desired goal is not achievable, an error message or other indicator can be presented to the user. Predictive analytics can include a predictive algorithm, various data mining techniques, or other predictive techniques. A confidence metric of a goal-seek result can be normalized to estimate the degree of confidence that a particular change will yield the desired outcome.
-
Publication No.: US07797356B2
Publication date: 2010-09-14
Application No.: US11670783
Filing date: 2007-02-02
IPC classification: G06F7/00
CPC classification: G06F17/245
Abstract: Fields contained in data expressed as tabular data having columns and rows can initially be marked as exceptions, wherein a column within a row can be the potential cause of the exception. A user configurable parameter can be utilized to change the sensitivity or allowable exceptions for each row and/or column, to increase or decrease the number of exceptions detected. As data within each field are modified, added or deleted, or when the configurable parameter is changed, the exceptions marked can be automatically updated. Such updated exceptions can be the same or different from the initially marked exceptions. As such, a user can evaluate data and determine whether various changes within the data will change various outcomes.