Abstract:
Embodiments of the present invention relate to systems, methods and computer storage media for providing Structured Computations Optimized for Parallel Execution (SCOPE) that facilitate analysis of a large-scale dataset utilizing row data of those data sets. SCOPE includes, among other features, an extract command for extracting data bytes from a data stream and structuring the data bytes as data rows having strictly defined columns. SCOPE also includes a process command and a reduce command that identify data rows as inputs. The reduce command also identifies a reduce key that facilitates the reduction based on the reduce key. SCOPE additionally includes a combine command that identifies two data row sets that are to be combined based on an identified joint condition. Additionally, SCOPE includes a select command that leverages SQL and C# languages to create an expressive script that is capable of analyzing large-scale data sets in a parallel computing environment.