AMSC 662 / CMSC 662 Fall 2013

Frequently Asked Questions for Homework 5

Two typos in the formula following equation (1) were fixed on 10-29-13: the subscripts "i" were changed to "j".

Question: 11-07-13 In Part 2b, when we determine cycles per element, is it cycles per element of the t vector, or cycles per element of the tj vector?

Answer: Cycles per element of the t vector, k elements in all.
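
As a rough sketch of the arithmetic, the measured time for one call can be converted to cycles per element as below. The function name, the parameter names, and the idea of supplying the clock rate as a number are assumptions made for illustration; they are not taken from the homework.

```c
#include <assert.h>

/* Convert a measured time into cycles per element (CPE).
   seconds  : elapsed time for one call that produces all k elements
   clock_hz : processor clock rate in cycles per second (assumed known)
   k        : number of elements in the t vector */
double cycles_per_element(double seconds, double clock_hz, long k)
{
    assert(k > 0);
    return seconds * clock_hz / (double)k;
}
```

For example, a call that takes one microsecond on a 2 GHz machine and fills k = 1000 elements works out to about 2 cycles per element.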

Question: 11-07-13 For 1d, should students base the calculation on the definition of L_l, which involves a quotient and a product for each value of j?

Answer: Yes, it says "from its definition, without any reordering of operations".
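
To make "from its definition" concrete, here is a minimal sketch that assumes L_l has the usual Lagrange basis form, the product over j != l of (t - x_j) / (x_l - x_j). That form, the node array x, and the function name are assumptions; the definition in the homework handout is the one to count operations from.

```c
#include <assert.h>

/* Evaluate the l-th Lagrange basis polynomial at t directly from the
   (assumed) definition: the product over j != l of
   (t - x[j]) / (x[l] - x[j]).  x holds n distinct nodes. */
double lagrange_basis(const double *x, int n, int l, double t)
{
    assert(n > 0 && l >= 0 && l < n);
    double prod = 1.0;
    for (int j = 0; j < n; j++) {
        if (j == l)
            continue;                            /* skip the j == l term */
        prod *= (t - x[j]) / (x[l] - x[j]);      /* one quotient, one product */
    }
    return prod;
}
```

Under this assumed form, each value of j other than l costs two subtractions, one division, and one multiplication, with no reordering of operations.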

Question: Tyler has been telling students that any "code motion" done in Lagrange2.c should also be done in Lagrange.c, so that it does not contribute to any measured speedup. Is that correct?

Answer: Yes, any code motion done for Lagrange2.c should be done, if possible, for Lagrange.c, too.
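
For readers unsure of the term, here is a generic illustration of code motion. It is not taken from Lagrange.c or Lagrange2.c; the function names and the computation are hypothetical.

```c
/* Before: the loop-invariant product a * b is recomputed on every pass. */
void scale_before(double *y, const double *x, int n, double a, double b)
{
    for (int i = 0; i < n; i++)
        y[i] = x[i] / (a * b);
}

/* After code motion: the invariant is computed once, outside the loop. */
void scale_after(double *y, const double *x, int n, double a, double b)
{
    double ab = a * b;
    for (int i = 0; i < n; i++)
        y[i] = x[i] / ab;
}
```

Both versions perform the same arithmetic on each element, so they give the same results. Applying such a change to only one of the two programs would credit it with a speedup that has nothing to do with the optimization being studied.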

Question: 11-09-13 When I try to obtain timing data for Lagrange.c I am always presented with an elapsed time of 0.000000 seconds. It is only by putting in statements which print the value of the loop indices that the program is slowed down enough to get timing information. However, that seems kludgy and doesn't provide insight into what may really improve as I try different optimizations.

What are your thoughts on this?

Answer: You are correct -- print statements increase the elapsed time without giving insight.

Instead, enclose what you want to time (the call to Lagrange) in a loop:

for (i = 0; i < S; i++)
    Lagrange ...

Time the loop and divide the resulting time by S.

Choose S so that the time is on the order of a second or so. This should give you good precision in the timings.

This is mentioned briefly in Homework 3, p.2.
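
A minimal sketch of such a timing loop follows, assuming the standard C clock() function is an acceptable timer (it reports CPU time, not wall-clock time). The names time_per_call, work, and sink are hypothetical; work is only a stand-in for the real call to Lagrange, whose arguments are not shown here.

```c
#include <assert.h>
#include <time.h>

/* Writing the result to a volatile variable keeps the compiler from
   discarding the work as dead code. */
static volatile double sink;

/* Hypothetical stand-in for the routine being timed. */
static void work(void)
{
    double s = 0.0;
    for (int i = 1; i <= 1000; i++)
        s += 1.0 / i;
    sink = s;
}

/* Call fn S times and return the average CPU time per call, in seconds. */
double time_per_call(void (*fn)(void), long S)
{
    assert(S > 0);
    clock_t start = clock();
    for (long i = 0; i < S; i++)
        fn();
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC / (double)S;
}
```

Calling time_per_call(work, S) with S increased until the whole loop runs for about a second keeps the total well above the resolution of the timer.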

Question: 11-11-13 When calculating cycles per element, should I include memory reads and writes, and the loading and saving of floating-point numbers, in the CPE analysis?

Answer: Because floating-point operations have long latency, the memory reads and writes usually do not affect the CPE. They can occur in parallel with the arithmetic.

So I expect you will get the same answer whether you include them or not.
