-
Publication No.: US20240111441A1
Publication date: 2024-04-04
Application No.: US18523335
Application date: 2023-11-29
Applicant: Oracle International Corporation
Inventor: Rishabh Jain, Erik M. Schlanger
IPC: G06F3/06, G06F9/52, G06F9/54, G06F15/173, G06F15/78
CPC classification number: G06F3/0631, G06F3/0604, G06F3/067, G06F9/526, G06F9/547, G06F15/17331, G06F15/7825
Abstract: A hardware-assisted Distributed Memory System may include software configurable shared memory regions in the local memory of each of multiple processor cores. Accesses to these shared memory regions may be made through a network of on-chip atomic transaction engine (ATE) instances, one per core, over a private interconnect matrix that connects them together. For example, each ATE instance may issue Remote Procedure Calls (RPCs), with or without responses, to an ATE instance associated with a remote processor core in order to perform operations that target memory locations controlled by the remote processor core. Each ATE instance may process RPCs (atomically) that are received from other ATE instances or that are generated locally. For some operation types, an ATE instance may execute the operations identified in the RPCs itself using dedicated hardware. For other operation types, the ATE instance may interrupt its local processor core to perform the operations.
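The RPC flow in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class names `RPC` and `ATE`, the operation names in `HARDWARE_OPS`, and the `remote_call` helper are all hypothetical, and "atomic" is modeled simply as handling one RPC to completion before the next.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical operation names; the abstract does not list the real operation types.
HARDWARE_OPS = {"read", "write", "add"}


@dataclass
class RPC:
    op: str
    addr: int
    value: int = 0
    wants_response: bool = True  # RPCs may be issued with or without a response


@dataclass
class ATE:
    """Toy model of one atomic transaction engine guarding one core's shared region."""
    core_id: int
    memory: dict = field(default_factory=dict)
    queue: deque = field(default_factory=deque)
    interrupts: list = field(default_factory=list)

    def submit(self, rpc):
        # Locally generated and remotely received RPCs enter the same queue.
        self.queue.append(rpc)

    def step(self):
        # Handle exactly one RPC to completion before taking the next one.
        rpc = self.queue.popleft()
        if rpc.op not in HARDWARE_OPS:
            # Other operation types are handed to the local core (modeled as a list).
            self.interrupts.append(rpc)
            return None
        if rpc.op == "write":
            self.memory[rpc.addr] = rpc.value
        elif rpc.op == "add":
            self.memory[rpc.addr] = self.memory.get(rpc.addr, 0) + rpc.value
        result = self.memory.get(rpc.addr, 0)
        return result if rpc.wants_response else None


def remote_call(ates, target_core, rpc):
    """Route an RPC to the ATE that controls the target memory and process it."""
    ate = ates[target_core]
    ate.submit(rpc)
    return ate.step()


ates = {0: ATE(0), 1: ATE(1)}
remote_call(ates, 1, RPC("write", addr=0x10, value=5, wants_response=False))
total = remote_call(ates, 1, RPC("add", addr=0x10, value=3))
remote_call(ates, 1, RPC("custom_op", addr=0x10))
```

In this model the requester never touches the remote core's memory directly; every access goes through the owning ATE, which is the property the abstract emphasizes.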
-
Publication No.: US10599488B2
Publication date: 2020-03-24
Application No.: US15197436
Application date: 2016-06-29
Applicant: Oracle International Corporation
Inventor: David A. Brown, Rishabh Jain, Michael Duller, Erik Schlanger
Abstract: Techniques are provided for improving the performance of a constellation of coprocessors by hardware support for asynchronous events. In an embodiment, a coprocessor receives an event descriptor that identifies an event and a logic. The coprocessor processes the event descriptor to configure the coprocessor to detect whether the event has been received. Eventually a device, such as a CPU or another coprocessor, sends the event. The coprocessor detects that it has received the event. In response to detecting the event, the coprocessor performs the logic.
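The sequence described here (receive a descriptor, arm detection, detect the event, perform the logic) can be sketched in software. The names `EventDescriptor` and `Coprocessor` are hypothetical, and treating each descriptor as one-shot is an assumption made for the sketch; the abstract does not say whether a descriptor stays armed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EventDescriptor:
    event_id: int                 # identifies the event to wait for
    logic: Callable[[], None]     # the logic to perform once the event arrives


class Coprocessor:
    def __init__(self):
        self._armed = {}          # event_id -> logic, filled in by configure()
        self.log = []

    def configure(self, descriptor):
        # Processing the descriptor arms detection for that event.
        self._armed[descriptor.event_id] = descriptor.logic

    def receive(self, event_id):
        # Assumption: a descriptor is one-shot, so it is disarmed on first match.
        logic = self._armed.pop(event_id, None)
        if logic is None:
            return False          # nothing is waiting on this event
        logic()
        return True


cp = Coprocessor()
cp.configure(EventDescriptor(event_id=7, logic=lambda: cp.log.append("ran logic for 7")))
ignored = cp.receive(3)   # an event nobody armed
handled = cp.receive(7)   # sent later by a CPU or another coprocessor
```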
-
Publication No.: US20190324939A1
Publication date: 2019-10-24
Application No.: US16457793
Application date: 2019-06-28
Applicant: Oracle International Corporation
Inventor: David A. Brown, Daniel Fowler, Rishabh Jain, Erik Schlanger, Michael Duller
IPC: G06F13/42, G06F1/3234
Abstract: Techniques are provided for exchanging dedicated hardware signals to manage a first-in first-out (FIFO). In an embodiment, a first processor initiates content transfer into the FIFO. The first processor activates a first hardware signal that is reserved for indicating that content resides within the FIFO. A second processor activates a second hardware signal that is reserved for indicating that content is accepted. The second hardware signal causes the first hardware signal to be deactivated. This exchange of hardware signals demarcates a FIFO transaction, which is mediated by interface circuitry of the FIFO.
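The two-signal exchange can be modeled as a small state machine. This is a sketch only: the class `HandshakeFifo` and its attribute names are hypothetical, and real hardware would use dedicated wires rather than Python attributes.

```python
class HandshakeFifo:
    """Toy model of a FIFO whose transactions are framed by two reserved signals."""

    def __init__(self):
        self.slots = []
        self.content_present = False    # first signal: content resides in the FIFO
        self.content_accepted = False   # second signal: content has been accepted

    def push(self, items):
        # First processor: transfer content, then raise the first signal.
        if self.content_present:
            raise RuntimeError("previous transaction has not been accepted yet")
        self.slots.extend(items)
        self.content_accepted = False
        self.content_present = True

    def accept(self):
        # Second processor: take the content and raise the second signal.
        if not self.content_present:
            raise RuntimeError("no content to accept")
        items, self.slots = self.slots, []
        self.content_accepted = True
        # Interface circuitry: the second signal deactivates the first,
        # which closes the transaction.
        self.content_present = False
        return items


fifo = HandshakeFifo()
fifo.push([1, 2, 3])
present_after_push = fifo.content_present
received = fifo.accept()
```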
-
Publication No.: US20180067889A1
Publication date: 2018-03-08
Application No.: US15256936
Application date: 2016-09-06
Applicant: Oracle International Corporation
Inventor: David A. Brown, Daniel Fowler, Rishabh Jain, Erik Schlanger, Michael Duller
CPC classification number: G06F13/4221, G06F1/3243, Y02D10/152
Abstract: Techniques are provided for exchanging dedicated hardware signals to manage a first-in first-out (FIFO). In an embodiment, a first processor initiates content transfer into the FIFO. The first processor activates a first hardware signal that is reserved for indicating that content resides within the FIFO. A second processor activates a second hardware signal that is reserved for indicating that content is accepted. The second hardware signal causes the first hardware signal to be deactivated. This exchange of hardware signals demarcates a FIFO transaction, which is mediated by interface circuitry of the FIFO.
-
Publication No.: US11868628B2
Publication date: 2024-01-09
Application No.: US17663280
Application date: 2022-05-13
Applicant: Oracle International Corporation
Inventor: Rishabh Jain, Erik M. Schlanger
CPC classification number: G06F3/0631, G06F3/0604, G06F3/067, G06F9/526, G06F9/547, G06F15/17331, G06F15/7825
Abstract: A hardware-assisted Distributed Memory System may include software configurable shared memory regions in the local memory of each of multiple processor cores. Accesses to these shared memory regions may be made through a network of on-chip atomic transaction engine (ATE) instances, one per core, over a private interconnect matrix that connects them together. For example, each ATE instance may issue Remote Procedure Calls (RPCs), with or without responses, to an ATE instance associated with a remote processor core in order to perform operations that target memory locations controlled by the remote processor core. Each ATE instance may process RPCs (atomically) that are received from other ATE instances or that are generated locally. For some operation types, an ATE instance may execute the operations identified in the RPCs itself using dedicated hardware. For other operation types, the ATE instance may interrupt its local processor core to perform the operations.
-
Publication No.: US10055358B2
Publication date: 2018-08-21
Application No.: US15074248
Application date: 2016-03-18
Applicant: Oracle International Corporation
Inventor: David A. Brown, Rishabh Jain, Sam Idicula, Erik Schlanger, David Joseph Hawkins
CPC classification number: G06F12/1081, G06F9/30036, G06F9/3455, G06F12/02, G06F16/221, G06F2212/656
Abstract: Techniques are described herein for efficient movement of data from a source memory to a destination memory. In an embodiment, in response to a particular memory location being pushed into a first register within a first register space, the first set of electronic circuits accesses a descriptor stored at the particular memory location. The descriptor indicates a width of a column of tabular data, a number of rows of tabular data, and one or more tabular data manipulation operations to perform on the column of tabular data. The descriptor also indicates a source memory location for accessing the tabular data and a destination memory location for storing data manipulation result from performing the one or more data manipulation operations on the tabular data. Based on the descriptor, the first set of electronic circuits determines control information indicating that the one or more data manipulation operations are to be performed on the tabular data and transmits the control information, using a hardware data channel, to a second set of electronic circuits to perform the one or more operations. Based on the control information, the second set of electronic circuits retrieve the tabular data from source memory location and apply the one or more data manipulation operations to generate the data manipulation result. The second set of electronic circuits cause the data manipulation result to be stored at the destination memory location.
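A descriptor-driven transfer of this kind can be sketched as two cooperating functions. Everything named here is an assumption for illustration: the `Descriptor` field names, the operations `double` and `negate`, the dictionary standing in for memory, and the dictionary standing in for the control information sent over the hardware data channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    column_width: int     # bytes per column element
    num_rows: int
    operations: tuple     # names of data manipulation operations (hypothetical)
    source_addr: int
    dest_addr: int


# Hypothetical operations; the abstract does not enumerate the real ones.
OPERATIONS = {"double": lambda v: v * 2, "negate": lambda v: -v}


def pack(values, width):
    return b"".join(v.to_bytes(width, "little", signed=True) for v in values)


def unpack(raw, width, count):
    return [int.from_bytes(raw[i * width:(i + 1) * width], "little", signed=True)
            for i in range(count)]


def first_circuits(memory, pushed_location):
    # Triggered by a memory location being pushed into a register:
    # read the descriptor there and derive control information from it.
    d = memory[pushed_location]
    return {"ops": [OPERATIONS[name] for name in d.operations],
            "width": d.column_width, "rows": d.num_rows,
            "source": d.source_addr, "dest": d.dest_addr}


def second_circuits(memory, control):
    # Fetch the column, apply the operations in order, store the result.
    values = unpack(memory[control["source"]], control["width"], control["rows"])
    for op in control["ops"]:
        values = [op(v) for v in values]
    memory[control["dest"]] = pack(values, control["width"])


memory = {
    0x100: Descriptor(2, 3, ("double", "negate"), source_addr=0x200, dest_addr=0x300),
    0x200: pack([1, 2, 3], 2),
}
second_circuits(memory, first_circuits(memory, 0x100))
result = unpack(memory[0x300], 2, 3)
```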
-
Publication No.: US10783102B2
Publication date: 2020-09-22
Application No.: US15290357
Application date: 2016-10-11
Applicant: Oracle International Corporation
Inventor: David Brown, Rishabh Jain, David Hawkins
Abstract: Techniques are provided for configuring and operating hardware to sustain real-time hashing throughput. In an embodiment, during a first set of clock cycles, a particular amount of data items of a first data column are transferred into multiple hash lanes. During a second set of clock cycles, the same particular amount of data items of a second data column are transferred into the hash lanes. The transferred data items of the first and second data columns are then processed to calculate a set of hash values. When combined with techniques such as pipelining and horizontal scaling, the loading, hashing, and other processing occur in real time at the full speed of the underlying data path. For example, hashing throughput may sustainably equal or exceed the throughput of main memory.
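The column-by-column loading of hash lanes can be sketched as follows. This is a software illustration under assumptions, not the patented circuit: the lane count of 4, the 4-byte item width, and the use of CRC-32 (via Python's `zlib.crc32`) as the hash function are all choices made for the sketch, and the two loading phases stand in for the two sets of clock cycles.

```python
import zlib

NUM_LANES = 4  # assumption: the abstract does not state a lane count


def load_lanes(lanes, column, start):
    # One "set of clock cycles": move the same number of items from one
    # column into the lanes, one item per lane.
    for lane, item in zip(lanes, column[start:start + len(lanes)]):
        lane.append(item)


def hash_rows(col_a, col_b, num_lanes=NUM_LANES):
    """Hash each (col_a[i], col_b[i]) pair, processing num_lanes rows per batch."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of rows")
    hashes = []
    for start in range(0, len(col_a), num_lanes):
        lanes = [[] for _ in range(num_lanes)]
        load_lanes(lanes, col_a, start)   # first set of clock cycles
        load_lanes(lanes, col_b, start)   # second set of clock cycles
        for lane in lanes:
            if lane:  # the last batch may leave some lanes empty
                data = b"".join(v.to_bytes(4, "little") for v in lane)
                hashes.append(zlib.crc32(data))
    return hashes


col_a = [1, 2, 3, 4, 5]
col_b = [10, 20, 30, 40, 50]
row_hashes = hash_rows(col_a, col_b)
```

Each lane ends up holding one row's values from both columns, so the per-lane hash is a per-row hash over the combined key.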
-
Publication No.: US10176114B2
Publication date: 2019-01-08
Application No.: US15362693
Application date: 2016-11-28
Applicant: Oracle International Corporation
Inventor: David A. Brown, Sam Idicula, Erik Schlanger, Rishabh Jain, Michael Duller
IPC: G06F12/02, G06F12/1081, G06F13/28
Abstract: Techniques provide for hardware accelerated data movement between main memory and an on-chip data movement system that comprises multiple core processors that operate on the tabular data. The tabular data is moved to or from the scratch pad memories of the core processors. While the data is in-flight, the data may be manipulated by data manipulation operations. The data movement system includes multiple data movement engines, each dedicated to moving and transforming tabular data from main memory data to a subset of the core processors. Each data movement engine is coupled to an internal memory that stores data (e.g. a bit vector) that dictates how data manipulation operations are performed on tabular data moved from a main memory to the memories of a core processor, or to and from other memories. The internal memory of each data movement engine is private to the data movement engine. Tabular data is efficiently copied between internal memories of the data movement system via a copy ring that is coupled to the internal memories of the data movement system and/or is coupled to a data movement engine. Also, a data movement engine internally broadcasts data to other data movement engines, which then transfer the data to respective core processors. Partitioning may also be performed by the hardware of the data movement system. Techniques are used to partition data “in flight”. The data movement system also generates a column of row identifiers (RIDs). A row identifier is a number treated as identifying a row or element's position within a column. Row identifiers each identifying a row in column are also generated.
-
Publication No.: US10061714B2
Publication date: 2018-08-28
Application No.: US15073905
Application date: 2016-03-18
Applicant: Oracle International Corporation
Inventor: David A. Brown, Rishabh Jain, Michael Duller, Sam Idicula, Erik Schlanger, David Joseph Hawkins
CPC classification number: G06F12/1081, G06F9/30105, G06F12/023, G06F13/28, G06F2212/1044, Y02D10/14
Abstract: Techniques are described herein for efficient movement of data from a source memory to a destination memory. In an embodiment, in response to a particular memory location being pushed into a first register within a first register space, the first set of electronic circuits accesses a descriptor stored at the particular memory location. The descriptor indicates a width of a column of tabular data, a number of rows of tabular data, and one or more tabular data manipulation operations to perform on the column of tabular data. The descriptor also indicates a source memory location for accessing the tabular data and a destination memory location for storing data manipulation result from performing the one or more data manipulation operations on the tabular data. Based on the descriptor, the first set of electronic circuits determines control information indicating that the one or more data manipulation operations are to be performed on the tabular data and transmits the control information, using a hardware data channel, to a second set of electronic circuits to perform the one or more operations. Based on the control information, the second set of electronic circuits retrieve the tabular data from source memory location and apply the one or more data manipulation operations to generate the data manipulation result. The second set of electronic circuits cause the data manipulation result to be stored at the destination memory location.
-
Publication No.: US20180101530A1
Publication date: 2018-04-12
Application No.: US15290357
Application date: 2016-10-11
Applicant: Oracle International Corporation
Inventor: David Brown, Rishabh Jain, David Hawkins
CPC classification number: G06F13/28, G06F11/1004
Abstract: Techniques are provided for configuring and operating hardware to sustain real-time hashing throughput. In an embodiment, during a first set of clock cycles, a particular amount of data items of a first data column are transferred into multiple hash lanes. During a second set of clock cycles, the same particular amount of data items of a second data column are transferred into the hash lanes. The transferred data items of the first and second data columns are then processed to calculate a set of hash values. When combined with techniques such as pipelining and horizontal scaling, the loading, hashing, and other processing occur in real time at the full speed of the underlying data path. For example, hashing throughput may sustainably equal or exceed the throughput of main memory.