Common Mistake in Using OpenCL (1): Missing Vector Type When Constructing Vectors from Scalars
[Following the style of my 'Common Mistakes in Using OpenMP' series, I am starting a new series, 'Common Mistakes in Using OpenCL'.]
In OpenCL, one can construct a vector from a set of scalars or vectors by writing a parenthesized vector type followed by a parenthesized, comma-separated list of expressions. For example,
float4 fv4 = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
int2 x = (int2)(1, 2);
int2 y = (int2)(3, 4);
int4 iv4 = (int4)(x, y);
The vector type in front of the parenthesized list is important. Take a look at the following example.
int4 y = (int4)(10, 10, 10, 10);
int4 z = (1, 2, 3, 4) + y;
// z equals (int4)(14, 14, 14, 14) at this point.
It took a colleague and me quite a while to figure out why the vector 'z' had the value (int4)(14, 14, 14, 14) after the second assignment.
Without a vector type, (1, 2, 3, 4) is a comma expression (OpenCL spec 1.0 rev 43, p.145 item l). Just like in C, the operands are evaluated from left to right and the expression takes the value of the last one, 4. The addition then becomes a binary operation with a scalar operand and a vector operand. According to the promotion rule on p. 142 (item a), the scalar 4 is widened to the vector (int4)(4, 4, 4, 4). Therefore (int4)(14, 14, 14, 14) is the sum! Had we written (int4)(1, 2, 3, 4) + y instead, z would have been (int4)(11, 12, 13, 14).
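The two steps of that explanation can be reproduced in plain C, which shares the comma operator with OpenCL C. This is only a rough sketch: plain C has no built-in vector types, so int4_sim, add_scalar, buggy_sum and run_demo are hypothetical names invented for this illustration, and the loop merely imitates the scalar widening that an OpenCL compiler performs on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OpenCL's int4 (assumption: not a real OpenCL type). */
typedef struct { int s[4]; } int4_sim;

/* A parenthesized list with no vector type in front is a comma expression:
   the operands are evaluated left to right and the result is the last one. */
static int comma_value(void) {
    return (1, 2, 3, 4);
}

/* Imitates scalar + vector: the scalar is widened to every component. */
static int4_sim add_scalar(int scalar, int4_sim v) {
    int4_sim r;
    for (size_t i = 0; i < 4; ++i) {
        r.s[i] = scalar + v.s[i];
    }
    return r;
}

/* Plain-C analogue of: int4 z = (1, 2, 3, 4) + y; */
static int4_sim buggy_sum(void) {
    int4_sim y = {{10, 10, 10, 10}};
    return add_scalar(comma_value(), y);
}

/* Checks the values described in the text above. */
static void run_demo(void) {
    int4_sim z = buggy_sum();
    assert(comma_value() == 4);
    for (size_t i = 0; i < 4; ++i) {
        assert(z.s[i] == 14);
    }
}
```

Calling run_demo() from a main function should pass without tripping an assertion. Many C compilers also warn that the left operands of such a comma expression have no effect, and a similar warning from an OpenCL compiler, where available, is a useful hint that a vector type is missing.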

