-
Publication number: US11714644B2
Publication date: 2023-08-01
Application number: US17459130
Filing date: 2021-08-27
Applicant: Arm Limited
Inventor: Abhishek Raja
CPC classification number: G06F9/30043 , G06F9/30036 , G06F9/30145 , G06F9/3836
Abstract: A predicated vector load micro-operation specifies a load target address, a destination vector register for which active vector elements of the destination vector register are to be loaded with data associated with addresses identified based on the load target address, and a predicate operand indicative of whether each vector element of the destination vector register is active or inactive. A predetermined type of predicated vector load micro-operation can be issued to the processing circuitry before the predicate operand is determined to meet an availability condition, and if issued in this way memory access circuitry can determine, based on the load target address, whether the predetermined type of predicated vector load micro-operation satisfies a predetermined condition, and if the predetermined condition is unsatisfied, perform a complete vector load assuming all vector elements of the destination vector register are active vector elements, independent of whether the predicate operand when available identifies any inactive vector element of the destination vector register.
-
Publication number: US11714641B2
Publication date: 2023-08-01
Application number: US16471185
Filing date: 2017-11-08
Applicant: ARM LIMITED
CPC classification number: G06F9/30036 , G06F9/3013 , G06F9/30043 , G06F9/30112 , G06F9/345 , G06F9/3552 , G06F9/3555
Abstract: An apparatus and method are provided for performing vector processing operations. In particular the apparatus has processing circuitry to perform the vector processing operations and an instruction decoder to decode vector instructions to control the processing circuitry to perform the vector processing operations specified by the vector instructions. The instruction decoder is responsive to a vector generating instruction identifying a scalar start value and wrapping control information, to control the processing circuitry to generate a vector comprising a plurality of elements. In particular, the processing circuitry is arranged to generate the vector such that the first element in the plurality is dependent on the scalar start value, and the values of the plurality of elements follow a regularly progressing sequence that is constrained to wrap as required to ensure that each value is within bounds determined from the wrapping control information. The vector generating instruction can be useful in a variety of situations, a particular use case being to implement a circular addressing mode within memory, where the vector generating instruction can be coupled with an associated vector memory access instruction. Such an approach can remove the need to provide additional logic within the memory access path to support such circular addressing.
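
The wrapping sequence described in this abstract can be illustrated with a minimal software sketch. It is not the patented instruction: the function names, the explicit `step` parameter, and the choice of a lower bound of zero with a single upper `wrap_limit` are assumptions made for illustration, since the abstract only says that the bounds are "determined from the wrapping control information".

```python
def generate_wrapping_vector(start, num_elements, step, wrap_limit):
    """Return a regularly progressing sequence that wraps into [0, wrap_limit).

    Hypothetical model: `start` plays the role of the scalar start value and
    `wrap_limit` stands in for the wrapping control information.
    """
    if wrap_limit <= 0:
        raise ValueError("wrap_limit must be positive")
    values = []
    current = start % wrap_limit  # first element depends on the scalar start value
    for _ in range(num_elements):
        values.append(current)
        # Python's % always yields a non-negative result for a positive
        # modulus, so negative steps wrap from 0 back to wrap_limit - 1.
        current = (current + step) % wrap_limit
    return values


def circular_gather(buffer, base_index, num_elements):
    """Read num_elements items from a circular buffer using wrapped offsets.

    The wrapped offsets are produced first and then fed to an ordinary
    indexed read, so the read itself needs no wrap-around logic.
    """
    offsets = generate_wrapping_vector(base_index, num_elements, 1, len(buffer))
    return [buffer[offset] for offset in offsets]
```

For example, `generate_wrapping_vector(6, 4, 1, 8)` yields `[6, 7, 0, 1]`. Generating the offsets separately from the memory access mirrors the abstract's point that the access path does not need its own circular-addressing logic.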
-
Publication number: US20230221866A1
Publication date: 2023-07-13
Application number: US18000761
Filing date: 2021-05-20
Applicant: Arm Limited
Inventor: Jamshed JALAL , Gurunath RAMAGIRI , Tushar P RINGE , Mark David WERKHEISER , Ashok Kumar TUMMALA , Dimitrios KASERIDIS
IPC: G06F3/06
CPC classification number: G06F3/0613 , G06F3/0659 , G06F3/0673 , G06F3/0653
Abstract: A technique for handling memory access requests is described. An apparatus has an interconnect for coupling a plurality of requester elements with a plurality of slave elements. The requester elements are arranged to issue memory access requests for processing by the slave elements. An intermediate element within the interconnect acts as a point of serialisation to order the memory access requests issued by requester elements via the intermediate element. The intermediate element has tracking circuitry for tracking handling of the memory access requests accepted by the intermediate element. Further, request acceptance management circuitry is provided to identify a target slave element amongst the plurality of slave elements for that given memory access request, and to determine whether the given memory access request is to be accepted by the intermediate element dependent on an indication of bandwidth capability for the target slave element.
-
Publication number: US20230214236A1
Publication date: 2023-07-06
Application number: US17998221
Filing date: 2021-05-13
Applicant: Arm Limited
Inventor: David Hennah MANSELL
Abstract: An apparatus comprises matrix processing circuitry to perform a matrix processing operation on first and second input operands to generate a result matrix, where the result matrix is a two-dimensional matrix; operand storage circuitry to store information for forming the first and second input operands for the matrix processing circuitry; and masking circuitry to perform a masking operation to mask at least part of the matrix processing operation or the information stored to the operand storage circuitry based on masking state data indicative of one or more masked row or column positions to be treated as representing a masking value. This is useful for improving performance of two-dimensional convolution operations, as the masking can be used to mask out selected rows or columns when performing the 2D convolution as a series of 1×1 convolution operations applied to different kernel positions.
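
The decomposition mentioned at the end of this abstract, a 2D convolution computed as a series of 1×1 convolutions over kernel positions with out-of-range rows and columns masked, can be sketched as follows. This is an illustrative model only, under several assumptions not stated in the abstract: a single input channel (so each 1×1 convolution reduces to a scalar multiply), an output the same size as the input, a masking value of zero, and the cross-correlation convention common in machine learning. The function name is hypothetical.

```python
def conv2d_as_1x1_series(image, kernel):
    """Compute a same-size 2D convolution one kernel position at a time.

    Each kernel position contributes a shifted copy of the image scaled by a
    single weight. Rows or columns that the shift pushes outside the image
    are masked, that is, treated as the masking value 0.
    """
    height, width = len(image), len(image[0])
    k_height, k_width = len(kernel), len(kernel[0])
    out = [[0] * width for _ in range(height)]

    for ky in range(k_height):
        for kx in range(k_width):
            # Offset of this kernel position relative to the kernel centre.
            dy, dx = ky - k_height // 2, kx - k_width // 2
            weight = kernel[ky][kx]
            for y in range(height):
                row_masked = not (0 <= y + dy < height)
                for x in range(width):
                    col_masked = not (0 <= x + dx < width)
                    if row_masked or col_masked:
                        value = 0  # masked position: use the masking value
                    else:
                        value = image[y + dy][x + dx]
                    out[y][x] += weight * value
    return out
```

In this sketch the mask is recomputed from the indices. The abstract instead describes masking state data held by dedicated circuitry, which is a hardware arrangement this model does not attempt to reproduce.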
-
Publication number: US11693796B2
Publication date: 2023-07-04
Application number: US17334960
Filing date: 2021-05-31
Applicant: Arm Limited
Inventor: Paul Nicholas Whatmough , Zhi-Gang Liu , Supreet Jeloka , Saurabh Pijuskumar Sinha , Matthew Mattina
CPC classification number: G06F13/1668 , G06F13/4004 , G06F7/5443 , G06F15/8046 , G06N3/063
Abstract: Various implementations described herein are directed to a device having a multi-layered logic structure with a first logic layer and a second logic layer arranged vertically in a stacked configuration. The device may have a memory array that provides data, and also, the device may have an inter-layer data bus that vertically couples the memory array to the multi-layered logic structure. The inter-layer data bus may provide multiple data paths to the first logic layer and the second logic layer for reuse of the data provided by the memory array.
-
Publication number: US11687464B2
Publication date: 2023-06-27
Application number: US16648041
Filing date: 2019-01-23
Applicant: ARM LIMITED
Inventor: Graeme Peter Barnes , Catalin Theodor Marinas , William James Deacon
IPC: G06F12/10
CPC classification number: G06F12/10 , G06F2212/657
Abstract: An apparatus comprises address translation circuitry (70) to perform a translation of a virtual address (80) comprising a virtual tag portion (88) and a virtual address portion (86) into a physical address (82) comprising a physical tag portion (92) and a physical address portion (90). The address translation circuitry comprises address tag translation circuitry (72) to perform a translation of the virtual tag portion into the physical tag portion and the address translation to be performed is selected in dependence on the virtual address.
-
Publication number: US20230196661A1
Publication date: 2023-06-22
Application number: US17558383
Filing date: 2021-12-21
Applicant: Arm Limited
Inventor: Olof Henrik Uhrenholt
Abstract: When generating a render output in which primitives to be rendered are to be clipped against a user-defined clip plane defined for the render output, and a primitive to be rendered is intersected by a user-defined clip plane defined for the render output, an edge representing the intersection of the primitive with the user-defined clip plane is determined. The rasteriser, when rasterising the primitive, then tests one or more regions of the render output being generated against the determined edge representing the intersection of the primitive with the user-defined clip plane to determine whether the region or regions should not be rendered for the primitive on the basis of the user-defined clip plane.
-
Publication number: US20230195638A1
Publication date: 2023-06-22
Application number: US18067180
Filing date: 2022-12-16
Applicant: Arm Limited
Inventor: Olof Henrik UHRENHOLT , Edvard FIELDING , Ole Henrik JAHREN
IPC: G06F12/0891 , G06F12/0811 , G06F12/0817
CPC classification number: G06F12/0891 , G06F12/0811 , G06F12/0824
Abstract: A method of operating a cache system is disclosed. Information indicating a link between associated header and payload cache entries is maintained. The link information may be used to reduce cache coherency traffic.
-
Publication number: US20230195466A1
Publication date: 2023-06-22
Application number: US17554573
Filing date: 2021-12-17
Applicant: Arm Limited
Inventor: Yasuo ISHII , Muhammad Umar FAROOQ , William Elton BURKY , Michael Brian SCHINZLER , Jason Lee SETTER , David Gum LIM
IPC: G06F9/38
CPC classification number: G06F9/384 , G06F9/3867
Abstract: A data processing apparatus is provided that comprises rename circuitry for performing a register rename stage of a pipeline in respect of a stream of operations. Move elimination circuitry performs a move elimination operation on the stream of operations in which a move operation is eliminated and the register rename stage performs an adjustment of an identity of registers in the stream of operations to compensate for the move operation being eliminated and demotion circuitry reverses or inhibits the adjustment in response to one or more conditions being met.
-
Publication number: US20230195419A1
Publication date: 2023-06-22
Application number: US17554024
Filing date: 2021-12-17
Applicant: Arm Limited
Inventor: Dibakar Gope , Jesse Garrett Beu , Milos Milosavljevic
CPC classification number: G06F7/5443 , G06N3/04 , G06F2207/4824
Abstract: A neural network system, method and apparatus are provided. A truth table matrix, an index vector and an input data tensor are read from a memory. At least a portion of the input data tensor is flattened into an input data vector. A scatter accumulate instruction is executed on the index vector and the input data vector to generate an intermediate vector. The truth table matrix and the intermediate vector are then multiplied to generate an output data vector.
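
The data flow in this abstract (flatten, scatter accumulate, then multiply by the truth table matrix) can be sketched in plain Python. The semantics of the scatter accumulate instruction are an assumption here: each input element is added into the intermediate slot named by the matching index entry. The function names, the row-major flattening order, and sizing the intermediate vector to the truth table's column count are also illustrative choices, not details taken from the patent.

```python
def scatter_accumulate(index_vector, input_vector, out_len):
    """Add input_vector[i] into slot index_vector[i] of a zeroed vector.

    Assumed semantics for illustration; inputs that share an index are summed.
    """
    intermediate = [0] * out_len
    for index, value in zip(index_vector, input_vector):
        intermediate[index] += value
    return intermediate


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def truth_table_layer(truth_table, index_vector, input_tensor):
    """Flatten a 2D input, scatter accumulate it, then apply the truth table."""
    # Flatten the 2D input tensor into an input data vector (row-major order).
    input_vector = [x for row in input_tensor for x in row]
    # The intermediate vector has one slot per truth table column.
    intermediate = scatter_accumulate(
        index_vector, input_vector, len(truth_table[0])
    )
    return mat_vec(truth_table, intermediate)
```

With `truth_table = [[1, 0, -1], [0, 1, 1]]`, `index_vector = [0, 2, 1, 2]` and `input_tensor = [[1, 2], [3, 4]]`, the intermediate vector is `[1, 3, 6]` and the output is `[-5, 9]`.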
-