Abstract:
The present disclosure is directed to systems and methods of bit-serial, in-memory execution of at least an nth layer of a multi-layer neural network in a first on-chip processor memory circuitry portion contemporaneous with prefetching and storing layer weights associated with the (n+1)st layer of the multi-layer neural network in a second on-chip processor memory circuitry portion. The storage of layer weights in on-chip processor memory circuitry beneficially decreases the time required to transfer the layer weights upon execution of the (n+1)st layer of the multi-layer neural network by the first on-chip processor memory circuitry portion. In addition, the on-chip processor memory circuitry may include a third on-chip processor memory circuitry portion used to store intermediate and/or final input/output values associated with one or more layers included in the multi-layer neural network.
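As a software analogy (not the disclosed hardware), the double-buffering idea can be sketched in Python; fetch_weights, run_layer, and the buffer handling below are invented for illustration:

    from concurrent.futures import ThreadPoolExecutor

    def fetch_weights(n):
        """Stand-in for copying layer n's weights into an on-chip buffer."""
        return [n + 1] * 4  # placeholder weight data

    def run_layer(activations, weights):
        """Stand-in for bit-serial, in-memory execution of one layer."""
        return [a * w for a, w in zip(activations, weights)]

    def run_network(activations, num_layers):
        with ThreadPoolExecutor(max_workers=1) as pool:
            current = fetch_weights(0)  # first buffer filled before execution
            for n in range(num_layers):
                # Prefetch layer n+1's weights into the second buffer while
                # layer n executes out of the first buffer.
                nxt = pool.submit(fetch_weights, n + 1) if n + 1 < num_layers else None
                activations = run_layer(activations, current)
                if nxt is not None:
                    current = nxt.result()  # swap buffers for the next layer
        return activations

    print(run_network([1, 2, 3, 4], 3))  # [6, 12, 18, 24]

The prefetch hides the weight-transfer latency behind execution of the current layer, which is the benefit the abstract describes.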
Abstract:
The present disclosure is directed to systems and methods of implementing a neural network using in-memory mathematical operations performed by pipelined SRAM architecture (PISA) circuitry disposed in on-chip processor memory circuitry. A high-level compiler may be provided to compile data representative of a multi-layer neural network model and one or more neural network data inputs from a first high-level programming language to an intermediate domain-specific language (DSL). A low-level compiler may be provided to compile the representative data from the intermediate DSL to multiple instruction sets in accordance with an instruction set architecture (ISA), such that each of the multiple instruction sets corresponds to a single respective layer of the multi-layer neural network model. Each of the multiple instruction sets may be assigned to a respective SRAM array of the PISA circuitry for in-memory execution. Thus, the systems and methods described herein beneficially leverage the on-chip processor memory circuitry to perform a relatively large number of in-memory vector/tensor calculations in furtherance of neural network processing without burdening the processor circuitry.
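A toy lowering pipeline can illustrate the two-stage compilation; the model schema, DSL op names, and PISA_* mnemonics below are all invented for illustration:

    # Toy lowering: model -> DSL -> one ISA instruction set per layer.
    model = [
        {"type": "dense", "in": 4, "out": 8},
        {"type": "relu"},
        {"type": "dense", "in": 8, "out": 2},
    ]

    def to_dsl(layer):
        """High-level compiler: layer description -> DSL ops (names invented)."""
        if layer["type"] == "dense":
            return [("matmul", layer["in"], layer["out"]), ("bias_add", layer["out"])]
        return [("elementwise", layer["type"])]

    def to_isa(dsl_ops):
        """Low-level compiler: DSL ops -> an instruction set for one SRAM array."""
        return [f"PISA_{op[0].upper()} {' '.join(map(str, op[1:]))}".strip() for op in dsl_ops]

    instruction_sets = [to_isa(to_dsl(layer)) for layer in model]
    for sram_array, inst in enumerate(instruction_sets):
        print(f"SRAM array {sram_array}: {inst}")

Each layer yields exactly one instruction set, mirroring the one-layer-per-SRAM-array assignment in the abstract.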
Abstract:
Embodiments of an invention for a compact, low-power Advanced Encryption Standard circuit are disclosed. In one embodiment, an apparatus includes an encryption unit having a substitution box and an accumulator. The substitution box is to perform a substitution operation on one byte per clock cycle. The accumulator is to accumulate four bytes and perform a mix-column operation in four clock cycles. The encryption unit is implemented using optimum Galois field polynomial arithmetic for minimum area.
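The four-cycle mix-column accumulation can be sketched in Python; the byte-at-a-time schedule below mirrors the abstract's description, though the actual circuit details are not given in the abstract:

    def xtime(b):
        """Multiply by x (i.e., by 2) in GF(2^8) modulo the AES polynomial."""
        b <<= 1
        return (b ^ 0x1B) & 0xFF if b & 0x100 else b

    def gmul(b, c):
        """Multiply by the small MixColumns constants 1, 2, or 3."""
        return {1: b, 2: xtime(b), 3: xtime(b) ^ b}[c]

    COEF = (2, 3, 1, 1)  # first row of the AES MixColumns matrix

    def mix_column_serial(column):
        """Accumulate one S-box output byte per 'cycle'; after four cycles
        the accumulator holds the fully mixed column."""
        acc = [0, 0, 0, 0]
        for j, byte in enumerate(column):          # one byte per clock cycle
            for i in range(4):
                acc[i] ^= gmul(byte, COEF[(j - i) % 4])
        return acc

    # FIPS-197 example column: [db, 13, 53, 45] -> [8e, 4d, a1, bc]
    assert mix_column_serial([0xDB, 0x13, 0x53, 0x45]) == [0x8E, 0x4D, 0xA1, 0xBC]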
Abstract:
Some implementations disclosed herein provide techniques and arrangements for provisioning keys to integrated circuits/processors. A processor may include a physically unclonable function (PUF) component, which may generate a unique hardware key based on at least one physical characteristic of the processor. The hardware key may be employed to encrypt a key, such as a secret key. The encrypted key may be stored in a memory of the processor. The encrypted key may be validated. The integrity of the key may be protected by communicatively isolating at least one component of the processor.
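A stdlib-only Python sketch of the provisioning flow; the PUF is simulated (a real PUF derives its response from device-specific physical variation, not software), and XOR with a hash-derived keystream stands in for a proper block cipher:

    import hashlib, hmac, secrets

    def puf_response(challenge):
        """Simulated PUF; 'device_variation' stands in for physical variation."""
        device_variation = b"example-device-noise"  # hypothetical
        return hmac.new(device_variation, challenge, hashlib.sha256).digest()

    # 1. Derive the unique hardware key from the PUF.
    hw_key = puf_response(b"provisioning-challenge")

    # 2. Encrypt (wrap) the secret key under the hardware key.
    secret_key = secrets.token_bytes(32)
    keystream = hashlib.sha256(hw_key + b"wrap").digest()
    wrapped = bytes(a ^ b for a, b in zip(secret_key, keystream))

    # 3. Store the wrapped key plus a MAC so it can be validated later.
    tag = hmac.new(hw_key, wrapped, hashlib.sha256).digest()

    # 4. Validate and unwrap (same device -> same PUF response -> same hw_key).
    assert hmac.compare_digest(tag, hmac.new(hw_key, wrapped, hashlib.sha256).digest())
    recovered = bytes(a ^ b for a, b in zip(wrapped, keystream))
    assert recovered == secret_key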
Abstract:
Disclosed are a system, a device, and related methods for data manipulation, especially for SIMD operations such as permute, shift, and rotate. An apparatus includes a permute section that repositions data on sub-word boundaries and a shift section that repositions the data by distances smaller than the sub-word width. The sub-word width is configurable and selectable, and the permute section and shift section may operate on different boundary widths. In a first stage, the permute section repositions the data to the nearest sub-word boundary and, in a second stage, the shift section repositions the data to its final desired position. The shift section includes multiple stages arranged in a logarithmic cascade. Additionally, each shifter within each of the stages is highly connected, allowing fast and precise data movements.
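The two-stage scheme can be modeled in Python for a 32-bit rotate with 8-bit sub-words; the coarse stage lands on the nearest byte boundary and the logarithmic cascade finishes the move (parameters chosen purely for illustration):

    def rotl32(x, k):
        """Reference 32-bit rotate-left."""
        k %= 32
        return ((x << k) | (x >> (32 - k))) & 0xFFFFFFFF

    def permute_bytes(x, byte_rot):
        """Stage 1: coarse repositioning on 8-bit sub-word boundaries."""
        return rotl32(x, 8 * byte_rot)

    def log_shift(x, fine):
        """Stage 2: logarithmic cascade; each stage moves by a power of two."""
        for stage_amount in (4, 2, 1):
            if fine & stage_amount:
                x = rotl32(x, stage_amount)
        return x

    def two_stage_rotate(x, k):
        coarse, fine = divmod(k % 32, 8)  # nearest sub-word boundary + remainder
        return log_shift(permute_bytes(x, coarse), fine)

    for k in range(32):
        assert two_stage_rotate(0x12345678, k) == rotl32(0x12345678, k)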
Abstract:
A multi-core die is provided that allows packets to be communicated across the die using resources of a packet-switched network and a circuit-switched network.
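One illustrative (not disclosed) policy for such a hybrid fabric: use an established circuit-switched path for a flow when one exists, and fall back to hop-by-hop packet switching otherwise. All names and the routing policy below are invented:

    def mesh_hops(src, dst):
        """Dimension-ordered (X then Y) packet route between (x, y) cores."""
        path, (x, y) = [src], src
        while x != dst[0]:
            x += 1 if dst[0] > x else -1
            path.append((x, y))
        while y != dst[1]:
            y += 1 if dst[1] > y else -1
            path.append((x, y))
        return path

    def send(packet, circuits):
        flow = (packet["src"], packet["dst"])
        if flow in circuits:                 # reserved circuit-switched wires
            return ("circuit", circuits[flow])
        return ("packet", mesh_hops(*flow))  # store-and-forward hops

    circuits = {((0, 0), (3, 3)): "reserved-path-A"}  # hypothetical circuit
    print(send({"src": (0, 0), "dst": (3, 3)}, circuits))  # uses the circuit
    print(send({"src": (1, 0), "dst": (2, 2)}, circuits))  # routed hop by hop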
Abstract:
A system and method are provided for generating one or more menus having options that display insights from visualizations. The options presented in the menus enable users to determine relationships between elements of the visualization. The relationships may be displayed textually to enable users to navigate the menus using a keyboard, a text-to-voice converter, and/or pointers.
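A small sketch of turning a chart's data into textual menu options; the phrasing and data shape are invented for illustration:

    def insight_options(series):
        """Generate textual relationship options for a menu."""
        options = []
        items = sorted(series.items(), key=lambda kv: kv[1], reverse=True)
        (top, top_v), (low, low_v) = items[0], items[-1]
        options.append(f"{top} is the largest value ({top_v})")
        options.append(f"{low} is the smallest value ({low_v})")
        if low_v:
            options.append(f"{top} is {top_v / low_v:.1f}x {low}")
        return options

    # Each option is plain text, so it can be read by a screen reader or
    # selected with the keyboard rather than by pointing at the chart.
    for i, text in enumerate(insight_options({"East": 120, "West": 80, "South": 40}), 1):
        print(f"{i}. {text}")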
Abstract:
A system comprises reception of input data of a Galois field GF(2^k), mapping of the input data to a composite Galois field GF(2^nm), where k = nm, inputting of the mapped input data to an Advanced Encryption Standard round function, performance of two or more iterations of the Advanced Encryption Standard round function in the composite Galois field GF(2^nm), reception of output data of a last of the two or more iterations of the Advanced Encryption Standard round function, and mapping of the output data to the Galois field GF(2^k).
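A minimal sketch of arithmetic in the composite field for n = 4, m = 2 (so k = 8); the isomorphism matrices that map bytes between GF(2^8) and this representation are implementation-specific and omitted here. The payoff shown is that inversion, the expensive part of the AES S-box, reduces to GF(2^4) operations:

    # GF(2^4) defined by x^4 + x + 1; elements are 4-bit ints.
    def g16_mul(a, b):
        r = 0
        for _ in range(4):
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if a & 0x10:
                a ^= 0x13          # reduce modulo x^4 + x + 1
        return r

    def g16_inv(a):
        return next(b for b in range(1, 16) if g16_mul(a, b) == 1)

    LAMBDA = 0x8  # x^3; y^2 + y + LAMBDA is irreducible over GF(2^4)

    def comp_mul(p, q):
        """Multiply (a1*y + a0)(b1*y + b0) in GF((2^4)^2) with y^2 = y + LAMBDA."""
        a1, a0 = p
        b1, b0 = q
        t = g16_mul(a1, b1)
        return (t ^ g16_mul(a1, b0) ^ g16_mul(a0, b1),
                g16_mul(t, LAMBDA) ^ g16_mul(a0, b0))

    def comp_inv(p):
        """Inversion via a single GF(2^4) inversion: the composite-field payoff."""
        a1, a0 = p
        delta = g16_mul(g16_mul(a1, a1), LAMBDA) ^ g16_mul(a1, a0) ^ g16_mul(a0, a0)
        d = g16_inv(delta)
        return (g16_mul(a1, d), g16_mul(a0 ^ a1, d))

    # Self-check: every nonzero element times its inverse equals 1 = (0, 1).
    for a1 in range(16):
        for a0 in range(16):
            if (a1, a0) != (0, 0):
                assert comp_mul((a1, a0), comp_inv((a1, a0))) == (0, 1)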
Abstract:
A method for indicating the presence of federated calendar entries in a currently viewed time period of a calendar and/or scheduling application, includes: receiving a user's selection of a date range in a calendar and/or scheduling application; determining whether there are one or more federated calendars associated with the user's calendar and/or scheduling application; wherein in the event there are one or more federated calendars associated with the user's calendar and/or scheduling application: determining whether there are one or more events from the one or more federated calendars in the selected date range; and wherein in the event there are federated calendar events in the selected date range: generating a calendar and/or scheduling page with one or more indicators for federated calendars with events in the selected date range.
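The decision flow reads like the following Python sketch; the data shapes (owner, federated, events) are invented for illustration:

    def federated_indicators(user, date_range, calendars):
        start, end = date_range
        indicators = {}
        # Are any federated calendars associated with the user's application?
        federated = [c for c in calendars if c["owner"] == user and c["federated"]]
        for cal in federated:
            # Does this federated calendar have events in the selected range?
            hits = [e for e in cal["events"] if start <= e["date"] <= end]
            if hits:
                indicators[cal["name"]] = len(hits)  # drives the page indicator
        return indicators

    calendars = [
        {"owner": "pat", "federated": True, "name": "Team",
         "events": [{"date": "2024-05-02"}, {"date": "2024-06-11"}]},
        {"owner": "pat", "federated": False, "name": "Personal",
         "events": [{"date": "2024-05-03"}]},
    ]
    # ISO date strings compare correctly as plain strings.
    print(federated_indicators("pat", ("2024-05-01", "2024-05-31"), calendars))
    # -> {'Team': 1}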
Abstract:
For one disclosed embodiment, a converter converts 2N-bit data into an N-bit value indicating a number of bits in the data that have a predetermined logical value. The converter includes N comparators, each determining whether the number of bits in the data having the predetermined logical value exceeds a respective one of a plurality of reference values. The N-bit value is generated based on the outputs of the comparators. For another disclosed embodiment, a first delay element delays a signal based on a number of bits in a data value having a predetermined logical value, and a second delay element delays the signal based on a number of bits in a reference value having the predetermined logical value. A comparator then generates a bit value based on the delayed signals.
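The abstract does not fix how the comparator outputs combine into the N-bit value; a successive-approximation reading, where each reference value depends on the previously resolved bits, is one consistent possibility, sketched here:

    def popcount_by_comparators(data, n):
        """Resolve the count of set bits in 2n-bit data using n
        exceeds-a-reference comparisons (successive approximation is one
        plausible way to combine n comparator outputs into n bits)."""
        ones = bin(data & ((1 << (2 * n)) - 1)).count("1")  # models the counted bits
        value = 0
        for bit in range(n - 1, -1, -1):
            trial = value | (1 << bit)
            if ones >= trial:        # comparator: does the count exceed trial - 1?
                value = trial
        return value

    # Exhaustive self-check for n = 4 (8-bit data words).
    for data in range(256):
        assert popcount_by_comparators(data, 4) == bin(data).count("1")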