Granted invention patent
US09064047B2 Parallel processing of ETL jobs involving extensible markup language documents (status: in force)

Abstract:
Techniques for running an Extract Transform Load (ETL) job in parallel on one or more processors wherein the ETL job comprises use of an extensible markup language (XML) document are provided. The techniques include receiving an XML document input, identifying a node in the XML document at which partitioning of the XML document is to begin, sending partition information to each respective processor, performing a shallow parsing of the XML document in parallel on the one or more processors, wherein each processor performs shallow parsing using the identified partition node until it reaches its identified partition, using the shallow parsing to generate the partition of the input XML document, wherein each processor generates a different partition of the same XML document, and sending each partition in streaming format to an ETL job instance.
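The partitioning flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It makes several assumptions: the standard-library `xml.etree.ElementTree.iterparse` stands in for the shallow parser, partition information is a pair of occurrence indices `(start, stop)` for the partition node, partition nodes are not nested inside one another, and threads stand in for processors. The names `count_nodes`, `split_ranges`, `iter_partition`, and `run_partitions` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def count_nodes(xml_bytes, partition_tag):
    """Count occurrences of the partition node (assumed pre-pass)."""
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("end",))
    return sum(1 for _, elem in events if elem.tag == partition_tag)


def split_ranges(total, n_workers):
    """Split `total` node occurrences into contiguous (start, stop) ranges."""
    base, extra = divmod(total, n_workers)
    ranges, start = [], 0
    for i in range(n_workers):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def iter_partition(xml_bytes, partition_tag, start, stop):
    """Stream the partition nodes whose occurrence index is in [start, stop).

    Nodes before `start` are scanned and discarded; parsing stops as soon
    as the worker's own partition has been emitted.
    """
    if start >= stop:
        return
    index = 0
    for _, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag != partition_tag:
            continue
        if index >= start:
            elem.tail = None  # drop whitespace that follows the element
            yield ET.tostring(elem, encoding="unicode")
        index += 1
        elem.clear()  # release the subtree once it has been handled
        if index >= stop:
            break


def run_partitions(xml_bytes, partition_tag, n_workers):
    """Have each worker generate a different partition of the same document."""
    ranges = split_ranges(count_nodes(xml_bytes, partition_tag), n_workers)

    def work(bounds):
        # In a real job each fragment would be streamed to an ETL job instance.
        return list(iter_partition(xml_bytes, partition_tag, *bounds))

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(work, ranges))
```

Calling `run_partitions(doc, "row", 2)` on a document with three `row` elements would give the first worker rows 0 and 1 and the second worker row 2. Every worker reads the same input from the start, which mirrors the abstract's description of each processor parsing until it reaches its own partition.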