Abstract:
Methods and apparatus for implementing a Brownian Bridge algorithm on Single Instruction, Multiple Data (SIMD) computing platforms are described. In one embodiment, a memory stores a plurality of data corresponding to an SIMD instruction. A processor may include a plurality of SIMD lanes. Each of the plurality of SIMD lanes may process one of the plurality of data stored in the memory in accordance with the SIMD instruction. Other embodiments are also described.
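The lane-per-path scheme described above can be sketched as follows. This is a minimal illustration, not the patented implementation: each list element plays the role of one SIMD lane, and the same Brownian-bridge midpoint formula is applied to every path's endpoints in lockstep, as a single SIMD instruction would.

```python
import random

def brownian_bridge_level(left, right, t_left, t_mid, t_right, rng):
    """Sample the midpoint of a Brownian bridge for several paths at once.

    Each list element stands in for one SIMD lane: identical arithmetic is
    applied across all paths, mirroring lockstep SIMD execution.
    """
    # Midpoint of a Brownian bridge: the mean interpolates the endpoints,
    # the variance is (t_mid - t_left)*(t_right - t_mid)/(t_right - t_left).
    a = (t_right - t_mid) / (t_right - t_left)
    b = (t_mid - t_left) / (t_right - t_left)
    sigma = ((t_mid - t_left) * (t_right - t_mid) / (t_right - t_left)) ** 0.5
    return [a * l + b * r + sigma * rng.gauss(0.0, 1.0)
            for l, r in zip(left, right)]

rng = random.Random(0)
# Four "lanes": four independent paths pinned at W(0) = 0 and W(1) = 1.
mid = brownian_bridge_level([0.0] * 4, [1.0] * 4, 0.0, 0.5, 1.0, rng)
```

On real SIMD hardware the list comprehension would be replaced by vector loads and a fused multiply-add over the lanes.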
Abstract:
In one embodiment, the present invention includes a method for performing a first level task of an application in a first processor of a system and dynamically allocating a second level task of the application to one of the first processor and a second processor based on architectural feedback information. In this manner, improved scheduling and application performance can be achieved by better utilizing system resources. Other embodiments are described and claimed.
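The two-level scheduling idea can be sketched in a few lines. The feedback metric and processor names below are illustrative assumptions; the abstract leaves the exact architectural feedback unspecified.

```python
def allocate(task, feedback):
    """Choose a processor for a second-level task from architectural feedback.

    `feedback` maps processor id -> utilization in [0, 1]; the task is
    dynamically allocated to the least-loaded processor. The metric is an
    illustrative stand-in for the patent's architectural feedback.
    """
    return min(feedback, key=feedback.get)

# The first-level task runs on "big"; the second-level task goes to
# whichever processor the feedback says is less utilized.
choice = allocate("second_level", {"big": 0.9, "little": 0.2})
```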
Abstract:
Methods and apparatus to provide virtualized vector processing are disclosed. In one embodiment, a processor includes a decode unit to decode a first instruction into a decoded first instruction and a second instruction into a decoded second instruction, and an execution unit to: execute the decoded first instruction to cause allocation of a first portion of one or more operations corresponding to a virtual vector request to a first processor core, and generation of a first signal corresponding to a second portion of the one or more operations to cause allocation of the second portion to a second processor core, and execute the decoded second instruction to cause a first computational result corresponding to the first portion of the one or more operations and a second computational result corresponding to the second portion of the one or more operations to be aggregated and stored to a memory location.
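The split-then-aggregate flow above can be sketched as follows. The two worker threads stand in for the first and second processor cores, and the final concatenation mirrors the aggregate-and-store step of the second decoded instruction; the element-wise add is an illustrative choice of operation.

```python
from concurrent.futures import ThreadPoolExecutor

def virtual_vector_add(xs, ys):
    """Split one 'virtual vector' add across two workers, then aggregate.

    The first half of the operations is allocated to one worker (the first
    core); the second half to another (the second core). The results are
    aggregated into a single list, standing in for the memory location.
    """
    mid = len(xs) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(lambda: [a + b for a, b in zip(xs[:mid], ys[:mid])])
        second = pool.submit(lambda: [a + b for a, b in zip(xs[mid:], ys[mid:])])
        return first.result() + second.result()

result = virtual_vector_add([1, 2, 3, 4], [10, 20, 30, 40])
```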
Abstract:
Methods and apparatus to provide virtualized vector processing are described. In one embodiment, one or more operations corresponding to a virtual vector request are distributed to one or more processor cores for execution.
Abstract:
A technique to perform concurrent updates to a shared data structure. At least one embodiment of the invention concurrently stores copies of a data structure within a plurality of local caches, updates the local caches with a partial result of a computation distributed among a plurality of processing elements, and returns the partial results to combining logic in parallel, which combines the partial results into a final result.
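The partial-result pattern can be sketched as below. Each worker's chunk plays the role of a local-cache copy, and the final reduction plays the role of the combining logic; a sum is used as an illustrative computation.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add

def parallel_sum(data, workers=4):
    """Distribute a computation, then combine partial results.

    Each worker computes a partial sum over its own chunk (its 'local
    cache' copy); the reduce step stands in for the combining logic that
    merges partial results into a final result.
    """
    chunks = [data[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return reduce(add, partials)  # combining logic

total = parallel_sum(list(range(100)))
```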
Abstract:
The present disclosure is directed to systems and methods of bit-serial, in-memory, execution of at least an nth layer of a multi-layer neural network in a first on-chip processor memory circuitry portion contemporaneous with prefetching and storing layer weights associated with the (n+1)st layer of the multi-layer neural network in a second on-chip processor memory circuitry portion. The storage of layer weights in on-chip processor memory circuitry beneficially decreases the time required to transfer the layer weights upon execution of the (n+1)st layer of the multi-layer neural network by the first on-chip processor memory circuitry portion. In addition, the on-chip processor memory circuitry may include a third on-chip processor memory circuitry portion used to store intermediate and/or final input/output values associated with one or more layers included in the multi-layer neural network.
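The execute-while-prefetch scheme is essentially double buffering, and can be sketched as follows. The loader function and layer signature are illustrative assumptions; on the described hardware the prefetch would run concurrently in a second memory portion rather than sequentially as here.

```python
def run_network(layers, weights_loader):
    """Execute layer n while the weights for layer n+1 are prefetched.

    `weights_loader(n)` stands in for transferring layer-n weights into an
    on-chip memory portion. Two logical buffers alternate roles, so each
    layer's weights are already resident when its execution begins.
    """
    buf = weights_loader(0)          # weights for the first layer
    outputs = None
    for n, layer in enumerate(layers):
        nxt = weights_loader(n + 1) if n + 1 < len(layers) else None  # prefetch
        outputs = layer(outputs, buf)                                 # execute layer n
        buf = nxt                                                     # swap buffers
    return outputs

# Illustrative three-layer "network": each layer multiplies by its weight.
layers = [lambda x, w: (x or 1) * w] * 3
out = run_network(layers, lambda n: n + 2)   # weights 2, 3, 4
```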
Abstract:
The present disclosure is directed to systems and methods of implementing a neural network using in-memory mathematical operations performed by pipelined SRAM architecture (PISA) circuitry disposed in on-chip processor memory circuitry. A high-level compiler may be provided to compile data representative of a multi-layer neural network model and one or more neural network data inputs from a first high-level programming language to an intermediate domain-specific language (DSL). A low-level compiler may be provided to compile the representative data from the intermediate DSL to multiple instruction sets in accordance with an instruction set architecture (ISA), such that each of the multiple instruction sets corresponds to a single respective layer of the multi-layer neural network model. Each of the multiple instruction sets may be assigned to a respective SRAM array of the PISA circuitry for in-memory execution. Thus, the systems and methods described herein beneficially leverage the on-chip processor memory circuitry to perform a relatively large number of in-memory vector/tensor calculations in furtherance of neural network processing without burdening the processor circuitry.
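The two-stage compilation flow (model, to intermediate DSL, to one instruction set per layer) can be sketched as follows. The DSL statements, mnemonic strings, and layer schema below are purely illustrative; the actual DSL and ISA are not specified in the abstract.

```python
def compile_model(model):
    """Two-stage compile: model -> intermediate DSL -> per-layer ISA sets.

    The high-level compiler emits one DSL statement per layer; the
    low-level compiler turns each statement into an instruction set
    assigned to its own SRAM array of the PISA circuitry. All strings
    here are illustrative placeholders, not a real ISA.
    """
    # High-level compiler: one intermediate-DSL statement per layer.
    dsl = [f"matmul in={layer['in']} out={layer['out']}" for layer in model]
    # Low-level compiler: one instruction set per statement, each destined
    # for a distinct SRAM array for in-memory execution.
    return [{"sram_array": i, "ops": [f"PISA.{stmt}"]}
            for i, stmt in enumerate(dsl)]

isa_sets = compile_model([{"in": 784, "out": 128}, {"in": 128, "out": 10}])
```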
Abstract:
Some embodiments of the invention provide a computer-based method of editing video that allows users to add multimedia objects such as sound effects, text, stickers, animations, and templates at a specific point in the timeline of the video. In some embodiments, the timeline of the video is represented by a simple scroll bar with a play control button, which allows a user to drag the control button to a specific point on the timeline. This enables the user to pinpoint a specific frame within the video to edit. In some embodiments, a multimedia panel allows the user to select specific multimedia objects to add to the frame and to manipulate them further. A mechanism for storing these multimedia objects in association with the selected frame is also defined. In some embodiments, frames to which multimedia objects have been added show indicators on the scroll bar, allowing the user to fast-forward to those frames and edit them further.
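The frame-to-objects association and the scroll-bar indicators can be sketched with a simple mapping. The class and method names below are illustrative, not the patent's terminology.

```python
class Timeline:
    """Store multimedia objects keyed by frame index.

    Models the abstract's mechanism for associating multimedia objects
    with a selected frame; names and the dict-based storage are
    illustrative assumptions.
    """
    def __init__(self):
        self.objects = {}            # frame index -> list of multimedia objects

    def add_object(self, frame, obj):
        self.objects.setdefault(frame, []).append(obj)

    def indicators(self):
        """Frames that should show an indicator on the scroll bar."""
        return sorted(self.objects)

    def next_indicator(self, frame):
        """Fast-forward target: the first annotated frame after `frame`."""
        return next((f for f in self.indicators() if f > frame), None)

tl = Timeline()
tl.add_object(120, {"type": "sticker", "name": "star"})
tl.add_object(48, {"type": "text", "value": "Hello"})
```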
Abstract:
A method in a computer system of distributing pills into containers for use by a patient includes obtaining from a memory a fill pattern including a mapping of each of a first plurality of pills to one of a first plurality of containers such that at least two of the first plurality of pills are mapped to the same one of the first plurality of containers, obtaining from the memory an attribute of each of the first plurality of pills such that at least two of the first plurality of pills differ in at least the obtained attribute, and automatically sorting the first plurality of pills by the obtained attribute in a predefined order to generate an ordered list corresponding to the order in which the first plurality of pills are deposited into the first plurality of containers.
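The sort-by-attribute step can be sketched as follows. The choice of size as the attribute and ascending order as the predefined order are illustrative assumptions.

```python
def deposit_order(fill_pattern, attributes):
    """Generate the ordered deposit list for a fill pattern.

    `fill_pattern` maps pill id -> container index; `attributes` maps
    pill id -> a sortable attribute (size, here, as an illustrative
    choice). Pills are sorted by the attribute in a predefined
    (ascending) order to give the sequence of (pill, container) deposits.
    """
    ordered = sorted(fill_pattern, key=lambda pill: attributes[pill])
    return [(pill, fill_pattern[pill]) for pill in ordered]

# Pills A and C share container 0; the smallest pill is deposited first.
plan = deposit_order({"A": 0, "B": 1, "C": 0},
                     {"A": 5.0, "B": 9.0, "C": 7.5})
```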
Abstract:
A device for extracting a mold core and a mold assembly using the same are disclosed. The extracting device comprises a first gear, a plurality of first racks, and a plurality of transmission assemblies, wherein the transmission assemblies and the corresponding first racks are spaced apart along the periphery of the first gear such that each transmission assembly is located between the first gear and its corresponding first rack. The rotary motion of the first gear is transformed into a reciprocating motion of each of the first racks between a first position and a second position.
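The rotary-to-linear conversion of a gear driving a rack follows the standard rack-and-pinion relation s = r·θ, which can be illustrated numerically. The pitch radius and angle below are illustrative; the intermediate transmission assemblies of the patent are not modeled.

```python
import math

def rack_travel(pitch_radius_mm, angle_deg):
    """Linear travel of a rack driven by a gear: s = r * theta (radians).

    A simple rack-and-pinion kinematics sketch; the patent's transmission
    assemblies between the gear and each rack are omitted.
    """
    return pitch_radius_mm * math.radians(angle_deg)

# A gear of 20 mm pitch radius turned 90 degrees moves each rack ~31.4 mm;
# reversing the rotation returns the rack, giving the reciprocating motion.
travel = rack_travel(20.0, 90.0)
```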