This page summarizes how to use A-ZPL's new "Grid dimension" feature. Grid dimensions are a means of localizing data on a per-processor basis, which provides power and flexibility not available in traditional ZPL. If you find any bugs or inconsistencies in our implementation of Grid dimensions, please let us know at email@example.com.
ZPL regions support both standard (for example, 1..n) and floodable (*) dimensions. A-ZPL regions support an additional dimension type known as a Grid dimension (::). A Grid dimension allocates a single element per processor in that dimension of the processor grid (hence the name). So, for example, the region [::,::] would result in a single element per processor, while the region [::,1..n] would result in one parallel row being allocated per row of the processor grid. In many respects, Grid dimensions are similar to floodable dimensions, except that there is no constraint or semantic check that all elements in the Grid dimension have identical values.
An element in a Grid dimension is conformable to any index owned by that processor. This allows Grid dimensions to interact with traditional arrays without associating an explicit numerical index with the Grid dimension.
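The conformability rule can be modeled outside of A-ZPL. The following is a minimal Python sketch, not A-ZPL code: it assumes an n x n region block-distributed over a p x p processor grid with n divisible by p, and the names owner, tot and read_over_r are hypothetical helpers introduced only for illustration.

```python
def owner(i, j, n=4, p=2):
    """Return the (row, col) of the processor owning 1-based index (i, j).

    Assumption: an n x n region is block-distributed over a p x p
    processor grid, with n divisible by p.
    """
    block = n // p
    return ((i - 1) // block, (j - 1) // block)


# Model of an array declared over [::,::]: exactly one value per processor.
# The values are arbitrary and chosen only so processors are distinguishable.
tot = {(r, c): 10 * r + c for r in range(2) for c in range(2)}


def read_over_r(grid_array, i, j):
    """Reading a Grid array at index (i, j) yields the owner's single value."""
    return grid_array[owner(i, j)]


# Indices (1,1) and (2,2) live on the same processor, so they see one value.
same_processor = read_over_r(tot, 1, 1) == read_over_r(tot, 2, 2)
```

In this model no numerical index is ever attached to the Grid array itself; the index of R is used only to find the owning processor.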
Note that by their very nature, Grid dimensions allow the user to break the determinism that is present in ZPL programs across varying processor set sizes since different processors can now take different actions based on values contained in arrays declared over regions with Grid dimensions.
Here is an example Grid program, which will be explained below:
    program test_grid;

    config var n:integer = 4;

    region R    = [1..n,1..n];
           Grid = [::,::];

    var A:[R] integer;
        Tot:[Grid] integer;
        total:integer;

    procedure test_grid();
    [R] begin
      -- initialize a normal array of integers and print
      A := (Index1-1)*n + Index2;
      writeln(A);

      -- initialize a local total and print both over R and [::,::]
      [Grid] Tot := 0;
      writeln(Tot);
      [Grid] writeln(Tot);

      -- do a local per-processor reduction and output it
      Tot += A;
      writeln(Tot);
      [Grid] writeln(Tot);

      -- do a full reduction over [R] of local values (incorrect)
      total := +<< Tot;
      writeln(total,"\n");

      -- do a full reduction over [::] of local values (correct)
      [Grid] total := +<< Tot;
      writeln(total,"\n");
    end;
This program creates a traditional 4x4 region "R" and then a region "Grid" whose dimensions are both Grid dimensions. It declares an array "A" over R, an array "Tot" over Grid, and a traditional scalar "total".
The program begins by initializing A to contain the integers 1-16 in row-major order and printing it out.
Next, it zeroes Tot over the region Grid. This has the effect of zeroing out the unique value of Tot stored per processor. Note that this could also be achieved by assigning "Tot := 0;" over region [R], but doing so would result in redundant assignments: each processor would reassign its single value of Tot once for each element of R that it owns.
To show that Tot is conformable to R as well as Grid, it is printed out over both regions. Printing Tot over R causes each processor to print out its single (zero) value once per element of R that it owns, resulting in 16 zeroes being printed out in all. Printing Tot over Grid causes each processor to print out its single (zero) value.
Next, each processor adds values of A into Tot over region R. Since Tot has a single value per processor and is conformable with all of a processor's elements of R, this causes each processor to accumulate its local elements of A into Tot. Thus if we run on 4 processors in a 2x2 grid, each processor would add up the 2x2 subarray of A for which it's responsible. Once again, we print out Tot over both R and Grid, though in this case we'll get different values per processor since each processor owned unique values of A. For example, on one processor, we would see:
    136 136 136 136
    136 136 136 136
    136 136 136 136
    136 136 136 136

    136
while a 4-processor run would result in:
    14 14 22 22
    14 14 22 22
    46 46 54 54
    46 46 54 54

    14 22
    46 54
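The per-processor totals above can be checked with a small model. This is a Python sketch rather than A-ZPL, and it assumes a block distribution of R over a p x p processor grid with n divisible by p; local_totals and view_over_r are hypothetical helper names.

```python
n = 4

# A := (Index1-1)*n + Index2, using 1-based indices as in the ZPL program.
A = [[(i - 1) * n + j for j in range(1, n + 1)] for i in range(1, n + 1)]


def local_totals(A, n, p):
    """Model of 'Tot += A' over R: each processor sums the elements it owns."""
    block = n // p
    tot = [[0] * p for _ in range(p)]
    for i in range(n):
        for j in range(n):
            tot[i // block][j // block] += A[i][j]
    return tot


def view_over_r(tot, n, p):
    """Model of printing Tot over R: one copy of the owner's value per element."""
    block = n // p
    return [[tot[i // block][j // block] for j in range(n)] for i in range(n)]


tot1 = local_totals(A, n, 1)  # one processor:   [[136]]
tot4 = local_totals(A, n, 2)  # 2x2 processors:  [[14, 22], [46, 54]]
```

Under these assumptions, view_over_r(tot4, n, 2) reproduces the 4x4 block pattern shown for the 4-processor run, and tot4 itself corresponds to the values printed over Grid.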
Next, we perform a complete reduction over Tot to get the single total sum. The first reduction is performed incorrectly. Since it is being done over R rather than Grid, each processor will add in its value of Tot once per element in R. For example, in the 4-processor example above, the upper-left processor would add four 14 values together as its local contribution to the reduction. The incorrect total is printed out.
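To see how far off the reduction over R is, the overcounting can be modeled directly. The sketch below is plain Python, not A-ZPL; it takes the per-processor totals from the runs above as given, assumes a block distribution with n divisible by p, and uses the hypothetical helper name reduce_over_r.

```python
def reduce_over_r(tot, n, p):
    """Model of '[R] total := +<< Tot'.

    Every element of R contributes its owner's single Tot value, so each
    processor's value is counted once per element it owns.
    """
    block = n // p
    return sum(tot[i // block][j // block] for i in range(n) for j in range(n))


n = 4
tot_4proc = [[14, 22], [46, 54]]  # per-processor totals on a 2x2 grid
tot_1proc = [[136]]               # single-processor total

wrong_4proc = reduce_over_r(tot_4proc, n, 2)  # 4 * (14 + 22 + 46 + 54) = 544
wrong_1proc = reduce_over_r(tot_1proc, n, 1)  # 16 * 136 = 2176
```

In this model the incorrect result depends on the processor count, which illustrates the loss of determinism mentioned earlier: the overcount factor is the number of elements of R that each processor owns.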
Next, a correct complete reduction is done over Tot, but this time over the Grid region, which will cause each value to be accumulated into the sum just once, as intended. The result of this reduction will be equal to the value that would have been achieved by simply doing a full reduction over region R in the first place ([R] total := +<< A).
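The equivalence between the reduction over Grid and a direct reduction of A can be checked for several processor grids. This is a Python model, not A-ZPL, under the assumption of a block distribution with n divisible by p; block_totals and reduce_over_grid are hypothetical names.

```python
n = 4

# Same initialization as the ZPL program: 1..16 in row-major order.
A = [[(i - 1) * n + j for j in range(1, n + 1)] for i in range(1, n + 1)]


def block_totals(A, n, p):
    """Per-processor sums of A on a p x p processor grid (block distribution)."""
    block = n // p
    tot = [[0] * p for _ in range(p)]
    for i in range(n):
        for j in range(n):
            tot[i // block][j // block] += A[i][j]
    return tot


def reduce_over_grid(tot):
    """Model of '[Grid] total := +<< Tot': each processor's value counted once."""
    return sum(sum(row) for row in tot)


direct = sum(sum(row) for row in A)  # model of '[R] total := +<< A'
per_grid = {p: reduce_over_grid(block_totals(A, n, p)) for p in (1, 2, 4)}
```

In this model the result is 136 regardless of the processor grid, matching the direct reduction of A.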
While we are confident of the usefulness of Grid dimensions, the number of applications that we've used them in so far has been limited (due primarily to their youth). We'd be interested in hearing about your experiences using Grid dimensions, and any possible improvements to their semantics or implementation.
Grid dimensions were implemented quickly and have not yet seen much use, so please don't be surprised if you run into bugs, and please don't hesitate to report them to us at firstname.lastname@example.org. In addition, certain semantics of Grid dimensions have not yet been determined (for example, what would [R] Tot := Tot@east; mean in the above example?), so not everything has been implemented in this regard. Please let us know if you have any suggestions.
If you have further questions about Grid dimensions in A-ZPL, please don't hesitate to contact us at email@example.com.