Abstract:
Systems and methods for source-to-source transformation for compiler optimization targeting many integrated core (MIC) coprocessors, including identifying data dependencies in candidate loops and the array elements used in each iteration, profiling the candidate loops to find a number m such that data transfer and computation for m iterations take an equal amount of time, and creating an outer loop around each candidate loop, with each iteration of the outer loop executing m iterations of the candidate loop. Data streaming is performed by determining an optimum buffer size for one or more arrays and inserting code before the outer loop to create buffers of that size; overlapping data transfer between central processing units (CPUs) and MICs with computation; reusing buffers to reduce the memory used on the MICs; and reusing threads on the MICs to repeatedly launch kernels for asynchronous data transfer.
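The loop blocking and data streaming described above can be illustrated with a minimal sketch. It is a Python stand-in, not the disclosed transformation, which would be applied to source code offloaded to a MIC. The names `transfer` and `streamed_sum_of_squares` are hypothetical, the block size m is passed in rather than found by profiling, the loop body (a sum of squares) is an arbitrary example, and a single worker thread stands in for the reused thread that performs asynchronous transfers.

```python
from concurrent.futures import ThreadPoolExecutor


def transfer(src, start, m, buf):
    """Hypothetical stand-in for a CPU-to-coprocessor copy of one block."""
    block = src[start:start + m]
    buf[:len(block)] = block  # same-length slice assignment keeps buf at size m
    return len(block)


def streamed_sum_of_squares(a, m):
    """Run `for i in range(n): total += a[i] ** 2` in blocks of m iterations."""
    n = len(a)
    buffers = [[0] * m, [0] * m]  # two fixed-size buffers, reused for every block
    total = 0
    # One worker thread is created once and reused for all transfers.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(transfer, a, 0, m, buffers[0])
        for block, start in enumerate(range(0, n, m)):  # outer loop
            count = pending.result()  # wait until the current block has arrived
            current = buffers[block % 2]
            next_start = start + m
            if next_start < n:
                # Start copying the next block while this one is processed.
                pending = pool.submit(
                    transfer, a, next_start, m, buffers[(block + 1) % 2]
                )
            for i in range(count):  # m iterations of the original loop body
                total += current[i] ** 2
    return total


if __name__ == "__main__":
    print(streamed_sum_of_squares(list(range(10)), 3))
```

Only two buffers of size m exist at any time, so memory on the device side does not grow with the array length. The transfer of block k+1 overlaps with the computation on block k, which is why m is chosen so that the two take about the same time.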
Abstract:
Systems and methods for recognizing a face are disclosed, including receiving images of faces; generating feature vectors of the images; generating clusters of feature vectors, each with a centroid or a cluster representative; for a query to search for a face, generating a corresponding feature vector for the face and comparing that feature vector with the centroids of all clusters; for clusters above a similarity threshold, comparing the cluster members with the query feature vector; and indicating as matching candidates the cluster members with similarity above a threshold.
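The two-stage search described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: cosine similarity is assumed as the similarity measure (the abstract does not name one), feature vectors are plain lists of floats, clusters are assumed to be precomputed, and the names `cosine` and `search` and the cluster layout are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def search(query, clusters, centroid_threshold, member_threshold):
    """Return (face_id, score) pairs for members similar to the query vector."""
    matches = []
    for cluster in clusters:
        # Stage 1: compare the query with the centroid only.
        if cosine(query, cluster["centroid"]) <= centroid_threshold:
            continue  # skip every member of a dissimilar cluster
        # Stage 2: compare the query with each member of the surviving cluster.
        for face_id, vector in cluster["members"]:
            score = cosine(query, vector)
            if score > member_threshold:
                matches.append((face_id, score))
    return sorted(matches, key=lambda match: match[1], reverse=True)


if __name__ == "__main__":
    clusters = [
        {"centroid": [1.0, 0.0],
         "members": [("a", [1.0, 0.0]), ("b", [0.6, 0.8])]},
        {"centroid": [0.0, 1.0],
         "members": [("c", [0.0, 1.0])]},
    ]
    for face_id, score in search([1.0, 0.0], clusters, 0.5, 0.9):
        print(face_id, score)
```

Comparing against centroids first means that member-level comparisons are made only inside clusters that pass the first threshold, so the number of comparisons per query depends on the number of clusters and the size of the surviving ones rather than on the total number of stored faces.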