-
Publication No.: US20130246712A1
Publication date: 2013-09-19
Application No.: US13886467
Filing date: 2013-05-03
Applicant: WEI LIU, YOUFENG WU, CHRISTOPHER WILKERSON, HERBERT HUM
Inventor: WEI LIU, YOUFENG WU, CHRISTOPHER WILKERSON, HERBERT HUM
IPC classification: G06F12/08
CPC classification: G06F12/0888, G06F8/4442, G06F9/30043, G06F9/3826, G06F9/383, Y02D10/13
Abstract: Various embodiments of the invention concern methods and apparatuses for power and time efficient load handling. A compiler may identify producer loads, consumer reuse loads, consumer forwarded loads, and producer/consumer hybrid loads. Based on this identification, performance of the load may be efficiently directed to a load value buffer, store buffer, data cache, or elsewhere. Consequently, accesses to cache are reduced, through direct loading from load value buffers and store buffers, thereby efficiently processing the loads.
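The routing idea in the abstract above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the patented design: the class `ToyLoadUnit`, the `LoadKind` names, and the behavior assigned to each load kind are all hypothetical, since the abstract names the load categories without defining them. The sketch assumes that a producer load reads the data cache and deposits its value in the load value buffer (LVB), a consumer reuse load is served from the LVB, a consumer forwarded load is served from the store buffer, and a hybrid load does both the consuming and the producing.

```python
from enum import Enum, auto


class LoadKind(Enum):
    """Hypothetical tags a compiler might attach to each load."""
    PRODUCER = auto()
    CONSUMER_REUSE = auto()
    CONSUMER_FORWARDED = auto()
    HYBRID = auto()
    NORMAL = auto()


class ToyLoadUnit:
    """Toy model: routes a load to the store buffer, LVB, or data cache."""

    def __init__(self, memory):
        self.data_cache = dict(memory)  # stands in for cache plus memory
        self.store_buffer = {}          # pending stores (never drained here)
        self.lvb = {}                   # load value buffer
        self.cache_accesses = 0         # counts how often the cache is read

    def store(self, addr, value):
        self.store_buffer[addr] = value

    def load(self, addr, kind):
        # Assumed behavior: forwarded loads are served by the store buffer.
        if kind is LoadKind.CONSUMER_FORWARDED and addr in self.store_buffer:
            return self.store_buffer[addr]
        # Assumed behavior: reuse and hybrid loads are served by the LVB.
        if kind in (LoadKind.CONSUMER_REUSE, LoadKind.HYBRID) and addr in self.lvb:
            return self.lvb[addr]
        # Everything else falls back to the data cache.
        self.cache_accesses += 1
        value = self.data_cache[addr]
        # Assumed behavior: producer and hybrid loads fill the LVB.
        if kind in (LoadKind.PRODUCER, LoadKind.HYBRID):
            self.lvb[addr] = value
        return value


unit = ToyLoadUnit({0x10: 7})
unit.load(0x10, LoadKind.PRODUCER)            # reads the cache, fills the LVB
unit.load(0x10, LoadKind.CONSUMER_REUSE)      # served from the LVB
unit.store(0x20, 9)
unit.load(0x20, LoadKind.CONSUMER_FORWARDED)  # served from the store buffer
print(unit.cache_accesses)                    # 1
```

Three loads are issued but the cache is read only once, which is the kind of saving the abstract describes. The model ignores store-buffer draining and LVB invalidation, both of which a real design would need.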
-
Publication No.: US20150186183A1
Publication date: 2015-07-02
Application No.: US14143576
Filing date: 2013-12-30
Applicant: NALINI VASUDEVAN, YOUFENG WU, CHENG WANG, SARA BAGHSORKHI, ALBERT HARTONO
Inventor: NALINI VASUDEVAN, YOUFENG WU, CHENG WANG, SARA BAGHSORKHI, ALBERT HARTONO
CPC classification: G06F9/30036, G06F9/30043, G06F9/3824, G06F9/3834
Abstract: A processor includes a decoder to decode an instruction, a scheduler to schedule the instruction, and an execution unit to execute the instruction. The instruction is to load a memory operation applicable to a quantity of addresses into an execution vector. The execution vector includes a plurality of vector positions for respective addresses. The instruction is further to evaluate, for a given address in the execution vector at a vector position, whether a cache indicates that a previous memory operation was performed at a higher vector position than the vector position of the given address. The instruction is also to determine, based on the evaluation whether the cache indicates that the previous memory operation was performed at a higher vector position than the vector position of the given address, whether the memory operation will cause a memory error.
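The position check described in the abstract above can be sketched in Python. This is a minimal software model under assumptions: the function name `check_vector_op` and the use of a plain dictionary as the "cache" are hypothetical, and the sketch assumes the cache remembers, per address, the highest vector position that has touched it so far.

```python
def check_vector_op(position_cache, addresses):
    """Return the vector positions whose address was previously touched
    at a higher vector position, then record this operation.

    position_cache: dict mapping address -> highest vector position seen.
    addresses: one address per vector position (index = position).
    """
    flagged = []
    for pos, addr in enumerate(addresses):
        prev = position_cache.get(addr)
        # A previous access from a *higher* position means a later loop
        # iteration already touched this address: a potential memory error.
        if prev is not None and prev > pos:
            flagged.append(pos)
    # Record this operation so that later operations can be checked.
    for pos, addr in enumerate(addresses):
        position_cache[addr] = max(position_cache.get(addr, -1), pos)
    return flagged


cache = {}
print(check_vector_op(cache, [100, 104, 108, 112]))  # []
print(check_vector_op(cache, [108, 200, 204, 208]))  # [0]
```

In the second call, address 108 sits at position 0 but was already accessed at position 2, so position 0 is flagged. An address reused at the same or a lower earlier position is not flagged, which matches the "higher vector position" condition in the abstract.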
-
Publication No.: US20160092234A1
Publication date: 2016-03-31
Application No.: US14497833
Filing date: 2014-09-26
Applicant: NALINI VASUDEVAN, CHENG WANG, YOUFENG WU, ALBERT HARTONO, SARA S. BAGHSORKHI
Inventor: NALINI VASUDEVAN, CHENG WANG, YOUFENG WU, ALBERT HARTONO, SARA S. BAGHSORKHI
IPC classification: G06F9/38
CPC classification: G06F9/3842, G06F9/30032, G06F9/30036, G06F9/3004, G06F9/30043, G06F9/3013, G06F9/30174, G06F9/3824, G06F9/3834, G06F9/3838, G06F15/8053
Abstract: An apparatus and method for speculative vectorization. For example, one embodiment of a processor comprises: a queue comprising a set of locations for storing addresses associated with vectorized memory access instructions; and execution logic to execute a first vectorized memory access instruction to access the queue and to compare a new address associated with the first vectorized memory access instruction with existing addresses stored within a specified range of locations within the queue to detect whether a conflict exists, the existing addresses having been previously stored responsive to one or more prior vectorized memory access instructions.
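The queue comparison in the abstract above can be pictured with a short Python sketch. It is only an illustration: the class `AddressQueue`, the method `check_and_append`, and the half-open `[start, end)` range convention are assumptions, and the sketch models a single address per access rather than a full vector.

```python
class AddressQueue:
    """Toy queue of addresses recorded by earlier memory accesses."""

    def __init__(self):
        self.slots = []

    def check_and_append(self, address, start, end):
        """Compare `address` with the entries in slots[start:end],
        then store it in the next free location.

        Returns True if a matching address (a conflict) was found.
        """
        conflict = address in self.slots[start:end]
        self.slots.append(address)
        return conflict


q = AddressQueue()
q.check_and_append(0x100, 0, 0)         # empty range, no conflict
q.check_and_append(0x104, 0, 1)         # compared with 0x100 only
print(q.check_and_append(0x100, 0, 2))  # True: 0x100 is in slots[0:2]
```

Restricting the comparison to a range lets the caller ignore entries that cannot conflict, for example addresses recorded for the same or a later iteration.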
-
Publication No.: US20180018177A1
Publication date: 2018-01-18
Application No.: US15653403
Filing date: 2017-07-18
Applicant: NALINI VASUDEVAN, CHENG WANG, YOUFENG WU, ALBERT HARTONO, SARA S. BAGHSORKHI
Inventor: NALINI VASUDEVAN, CHENG WANG, YOUFENG WU, ALBERT HARTONO, SARA S. BAGHSORKHI
CPC classification: G06F9/3842, G06F9/30032, G06F9/30036, G06F9/3004, G06F9/30043, G06F9/3013, G06F9/30174, G06F9/3824, G06F9/3834, G06F9/3838, G06F9/384, G06F15/8053
Abstract: An apparatus and method for speculative vectorization. For example, one embodiment of a processor comprises: a queue comprising a set of locations for storing addresses associated with vectorized memory access instructions; and execution logic to execute a first vectorized memory access instruction to access the queue and to compare a new address associated with the first vectorized memory access instruction with existing addresses stored within a specified range of locations within the queue to detect whether a conflict exists, the existing addresses having been previously stored responsive to one or more prior vectorized memory access instructions.
-
Publication No.: US20160188392A1
Publication date: 2016-06-30
Application No.: US14582430
Filing date: 2014-12-24
Applicant: SARA S. BAGHSORKHI, ALBERT HARTONO, YOUFENG WU, CHENG WANG
Inventor: SARA S. BAGHSORKHI, ALBERT HARTONO, YOUFENG WU, CHENG WANG
IPC classification: G06F11/07
CPC classification: G06F9/50, G06F9/30, G06F9/3834, G06F9/3838
Abstract: The present disclosure is directed to fast approximate conflict detection. A device may comprise, for example, a memory, a processor and a fast conflict detection module (FCDM) to cause the processor to perform fast conflict detection. The FCDM may cause the processor to read a first and second vector from memory, and to then generate summaries based on the first and second vectors. The summaries may be, for example, shortened versions of write and read addresses in the first and second vectors. The FCDM may then cause the processor to distribute the summaries into first and second summary vectors, and may then determine potential conflicts between the first and second vectors by comparing the first and second summary vectors. The summaries may be distributed into the first and second summary vectors in a manner allowing all of the summaries to be compared to each other in one vector comparison transaction.
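The summary-based comparison in the abstract above can be sketched in Python. This is a software analogy under assumptions, not the disclosed hardware: the function names, the choice of the low 8 address bits as the "shortened version", and the repeat/tile layout of the summary vectors are all hypothetical. The element-wise comparison over the two lists stands in for the single vector comparison.

```python
def summarize(addr, bits=8):
    """Shorten an address to its low `bits` bits (an assumed summary)."""
    return addr & ((1 << bits) - 1)


def potential_conflicts(write_addrs, read_addrs, bits=8):
    """Return (write_index, read_index) pairs whose summaries match."""
    w = [summarize(a, bits) for a in write_addrs]
    r = [summarize(a, bits) for a in read_addrs]
    # Distribute the summaries so that every write/read pair lines up:
    # each write summary is repeated len(r) times, the read summaries
    # are tiled len(w) times.
    first = [s for s in w for _ in r]
    second = r * len(w)
    # One element-wise comparison covers all pairs.
    matches = [a == b for a, b in zip(first, second)]
    return [(i // len(r), i % len(r)) for i, hit in enumerate(matches) if hit]


# 0x2004 and 0x3004 differ, but their 8-bit summaries are both 0x04.
print(potential_conflicts([0x1000, 0x2004], [0x3004, 0x4008]))  # [(1, 0)]
```

The reported pair is a false positive, which is why the detection is approximate: equal addresses always produce equal summaries, so real conflicts are never missed, but distinct addresses can collide and must be resolved by a slower exact check.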