Generating local addresses and communication sets for data-parallel
programs
    1.
    Granted invention patent
    Status: Expired

    Publication No.: US5450313A

    Publication date: 1995-09-12

    Application No.: US217404

    Filing date: 1994-03-24

    IPC classification: G06F9/45 G06F15/16

    CPC classification: G06F8/447 G06F8/45

    Abstract: An optimizing compilation process generates executable code which defines the computation and communication actions that are to be taken by each individual processor of a computer having a distributed memory, parallel processor architecture to run a program written in a data-parallel language. To this end, local memory layouts of the one-dimensional and multidimensional arrays that are used in the program are derived from one-level and two-level data mappings consisting of alignment and distribution, so that array elements are laid out in canonical order and local memory space is conserved. Executable code is then generated to produce, at program run time, a set of tables for each individual processor for each computation requiring access to a regular section of an array, so that the entries of these tables specify the spacing between successive elements of said regular section resident in the local memory of said processor, and so that all the elements of said regular section can be located in a single pass through local memory using said tables. Further executable code is generated to produce, at program run time, another set of tables for each individual processor for each communication action requiring a given processor to transfer array data to another processor, so that the entries of these tables specify the identity of a destination processor to which the array data must be transferred and the location in said destination processor's local memory at which the array data must be stored, and so that all of said array data can be located in a single pass through local memory using these communication tables. Finally, executable node code is generated for each individual processor that uses the foregoing tables at program run time to perform the necessary computation and communication actions on each individual processor of the parallel computer.
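
    The abstract describes per-processor tables whose entries give the spacing between successive locally resident elements of a regular section. The following is a minimal sketch of that idea, not the patented table-construction method. It assumes a one-dimensional array with 0-based indices, an identity alignment, and a block-cyclic distribution with block size k over p processors. The function names (owner, local_address, access_table, walk) and the brute-force enumeration are illustrative assumptions.

```python
def owner(i, p, k):
    """Processor holding global element i under a block-cyclic layout (assumed)."""
    return (i // k) % p


def local_address(i, p, k):
    """Offset of global element i within its owner's local memory.

    Each processor stores its blocks back to back, so elements stay in
    increasing global order inside local memory.
    """
    return (i // (p * k)) * k + i % k


def access_table(lo, hi, stride, p, k, m):
    """Start address and gap table for section lo:hi:stride on processor m.

    Brute-force construction for illustration only: enumerate the section,
    keep the elements owned by m, and record the distance between
    consecutive local addresses.
    """
    addrs = [local_address(i, p, k)
             for i in range(lo, hi + 1, stride)
             if owner(i, p, k) == m]
    if not addrs:
        return None, []
    gaps = [b - a for a, b in zip(addrs, addrs[1:])]
    return addrs[0], gaps


def walk(start, gaps):
    """Visit every local element of the section in one pass using the gaps."""
    visited = [start]
    for gap in gaps:
        visited.append(visited[-1] + gap)
    return visited


if __name__ == "__main__":
    start, gaps = access_table(0, 39, 3, p=2, k=4, m=0)
    print(start, gaps, walk(start, gaps))
```

    With p = 2, k = 4 and the section 0:39:3, processor 0 starts at local address 0 and the gaps are 3, 2, 5, 2, 3, 2. The pattern 3, 2, 5, 2 repeats, which is why a small table is enough to traverse the section in a single pass.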

    Method of compilation optimization using an N-dimensional template for
relocated and replicated alignment of arrays in data-parallel programs
for reduced data communication during execution
    2.
    Granted invention patent
    Status: Expired

    Publication No.: US5475842A

    Publication date: 1995-12-12

    Application No.: US104755

    Filing date: 1993-08-11

    CPC classification: G06F8/453

    Abstract: When a data-parallel language like Fortran 90 is compiled for a distributed-memory machine, aggregate data objects (such as arrays) are distributed across the processor memories. The mapping determines the amount of residual communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: first, an alignment that maps all the objects to an abstract template, and then a distribution that maps the template to the processors. This disclosure deals with two facets of the problem of finding alignments that reduce residual communication; namely, alignments that vary in loops, and objects that permit replicated alignments. It is shown that loop-dependent dynamic alignment is sometimes necessary for optimum performance, and algorithms are provided so that a compiler can determine good dynamic alignments for objects within "do" loops. Situations are also identified in which replicated alignment is either required by the program itself (via spread operations) or can be used to improve performance. An algorithm based on network flow is proposed for determining which objects to replicate so as to minimize the total amount of broadcast communication in replication.
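
    The two-stage mapping in the abstract (alignment to a template, then distribution of the template to processors) can be illustrated with a small sketch. It is not the disclosed algorithm. It assumes one-dimensional arrays, an affine alignment with a hypothetical offset parameter, and a block-cyclic distribution with block size k over p processors, and it counts how many elements must move between processors for the statement A(i) = B(i+1). All function names are illustrative assumptions.

```python
def template_cell(i, offset, stride=1):
    """Alignment stage: map array index i to a template cell (affine, assumed)."""
    return stride * i + offset


def processor_of(cell, p, k):
    """Distribution stage: map a template cell to a processor (block-cyclic, assumed)."""
    return (cell // k) % p


def residual_moves(n, off_a, off_b, p, k):
    """Count elements that cross processors when executing A(i) = B(i+1).

    A and B both have n elements (0-based); the statement runs for
    i = 0 .. n-2. An element moves when the processor owning B(i+1)
    differs from the processor owning A(i).
    """
    moves = 0
    for i in range(n - 1):
        dest = processor_of(template_cell(i, off_a), p, k)
        src = processor_of(template_cell(i + 1, off_b), p, k)
        if dest != src:
            moves += 1
    return moves


if __name__ == "__main__":
    # Same offset for both arrays: operands straddle block boundaries.
    print(residual_moves(16, 0, 0, p=4, k=4))
    # Shifting A by one template cell lines A(i) up with B(i+1).
    print(residual_moves(16, 1, 0, p=4, k=4))
```

    In this toy setting, aligning both arrays at offset 0 leaves three elements to transfer (one per block boundary), while shifting A by one template cell removes the residual communication for this statement entirely.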

    Optical detector for a particle sorting system
    8.
    Granted invention patent
    Status: Active

    Publication No.: US07298478B2

    Publication date: 2007-11-20

    Application No.: US10915016

    Filing date: 2004-08-09

    IPC classification: G01N21/00

    Abstract: An optical system for acquiring fast spectra from spatially channel arrays includes a light source for producing a light beam that passes through the microfluidic chip or the channel to be monitored, one or more lenses or optical fibers for capturing the light from the light source after interaction with the particles or chemicals in the microfluidic channels, and one or more detectors. The detectors, which may include light amplifying elements, detect each light signal and transduce the light signal into an electronic signal. The electronic signals, each representing the intensity of an optical signal, pass from each detector to an electronic data acquisition system for analysis. The light amplifying element or elements may comprise an array of phototubes, a multianode phototube, or a multichannel plate based image intensifier coupled to an array of photodiode detectors.
