Scaling HDFS for Hive
Abstract:
A non-transitory computer-readable storage medium storing program instructions which, when executed by one or more processors, cause the one or more processors to perform: receiving a query to a distributed file system; determining a particular partition, associated with a data warehouse system, targeted by the query; accessing a repository associated with the data warehouse system to determine whether a partition-to-cluster mapping entry for the particular partition exists in the repository; and, in response to a determination that the entry exists in the repository, obtaining, from the entry, an identifier of a particular cluster to which the particular partition is assigned, the particular cluster being one of a plurality of clusters of the distributed file system, each cluster of the plurality of clusters having one name node and a plurality of data nodes.
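The lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Cluster`, `Repository`, and `resolve_cluster`, the in-memory dictionary standing in for the repository, and the partition key format are all hypothetical. The abstract does not say what happens when no mapping entry exists, so the sketch simply returns `None` in that case.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cluster:
    """One cluster of the distributed file system (hypothetical model)."""
    cluster_id: str
    name_node: str          # each cluster has exactly one name node
    data_nodes: List[str]   # and a plurality of data nodes


@dataclass
class Repository:
    """Stand-in for the data warehouse repository holding the
    partition-to-cluster mapping entries (assumed to be a simple dict)."""
    partition_to_cluster: Dict[str, str] = field(default_factory=dict)

    def lookup(self, partition: str) -> Optional[str]:
        # Returns the cluster identifier, or None if no entry exists.
        return self.partition_to_cluster.get(partition)


def resolve_cluster(
    partition: str,
    repository: Repository,
    clusters: Dict[str, Cluster],
) -> Optional[Cluster]:
    """Return the cluster assigned to the partition targeted by a query.

    Returns None when the repository has no entry for the partition;
    the abstract leaves that branch unspecified.
    """
    cluster_id = repository.lookup(partition)
    if cluster_id is None:
        return None
    return clusters[cluster_id]
```

With a repository that maps the hypothetical partition `sales/ds=2020-01-01` to cluster `c2`, `resolve_cluster` returns the `Cluster` object for `c2`, whose name node would then serve the query.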