AMSC 662 / CMSC 662 Fall 2013

Frequently Asked Questions for Homework 6

Question: 11-11-13 I'm getting negative times. The problem seems to be that "temp = temp + index" is compiled to "add %esi,0x28(%esp)", which updates temp in memory instead of keeping it in a register.

Answer: Try using a higher optimization level in gcc (for example, -O1). If that doesn't help, subtract off the time for an empty loop. It isn't as bad as it seems, though: add %esi,0x28(%esp) causes no cache misses, so you can still work with the data.
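The empty-loop correction can be sketched as follows. This is only an illustration, not the homework's timing code: the function names (work_loop_ns, empty_loop_ns, net_ns_per_iter) are hypothetical, and the sketch assumes a POSIX system where clock_gettime with CLOCK_MONOTONIC is available. The loop counter is declared volatile so that both loops pay the same overhead and neither is removed at higher optimization levels.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC readings. */
static double elapsed_ns(struct timespec t0, struct timespec t1) {
    return (double)(t1.tv_sec - t0.tv_sec) * 1e9
         + (double)(t1.tv_nsec - t0.tv_nsec);
}

/* Loop containing the statement under test. The volatile counter keeps
   the compiler from deleting or collapsing the loop. */
double work_loop_ns(long iters, long *sink) {
    struct timespec t0, t1;
    long temp = 0;
    assert(iters > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (volatile long index = 0; index < iters; index++)
        temp = temp + index;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    *sink = temp;  /* store the sum so the additions are not dead code */
    return elapsed_ns(t0, t1);
}

/* Same loop with an empty body: measures loop overhead only. */
double empty_loop_ns(long iters) {
    struct timespec t0, t1;
    assert(iters > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (volatile long index = 0; index < iters; index++)
        ;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return elapsed_ns(t0, t1);
}

/* Time per iteration attributable to the loop body alone. */
double net_ns_per_iter(long iters) {
    long sink;
    double work = work_loop_ns(iters, &sink);
    double empty = empty_loop_ns(iters);
    return (work - empty) / (double)iters;
}
```

Because of timing noise, a single call to net_ns_per_iter can still come out slightly negative for small iteration counts. Repeating the measurement and keeping the smallest time for each loop is one common way to reduce that.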

Question: 11-18-13 For problem 2, it says to create an ijk function. Should that cover all six permutations of ijk (i.e., ijk, jik, ...) or just ijk? Also, for the block matrices, should we create a blocked ijk version?

Answer: I'm sorry that the instructions are not clear. Write two versions and compare them:

  • Unblocked i,j,k
  • The fastest blocked version you can build.

Really fast blocked versions might receive a bonus.
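For orientation, the two versions being compared might look like the sketch below. It is one possible starting point, not the expected solution and not a claim about the fastest blocking. The names matmul_ijk and matmul_blocked, the row-major n-by-n layout, the convention that both routines compute C = C + A*B, and the block size parameter bs are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Unblocked i,j,k product: C = C + A*B for n-by-n row-major matrices. */
void matmul_ijk(size_t n, const double *A, const double *B, double *C) {
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = C[i * n + j];
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Blocked product: the same arithmetic, carried out on bs-by-bs tiles so
   that each tile of A, B, and C is reused while it is still in cache.
   Edge tiles are clipped, so n need not be a multiple of bs. */
void matmul_blocked(size_t n, size_t bs,
                    const double *A, const double *B, double *C) {
    assert(bs > 0);
    for (size_t ii = 0; ii < n; ii += bs)
        for (size_t kk = 0; kk < n; kk += bs)
            for (size_t jj = 0; jj < n; jj += bs) {
                size_t i_end = min_sz(ii + bs, n);
                size_t k_end = min_sz(kk + bs, n);
                size_t j_end = min_sz(jj + bs, n);
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double a = A[i * n + k];
                        /* innermost loop walks rows of B and C with stride 1 */
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}
```

Zero C before calling either routine to get the plain product A*B. The best value of bs depends on the cache sizes of the machine, so it is worth timing several values.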

Question: 11-19-13 I can't get a reasonable picture from cache.m

Answer: I can't tell you what is going wrong on your particular machine. Make sure that you compile with the least optimization available (-O0). You can also glance at the assembly language code to make sure that the compiler isn't doing something unexpected. Also make sure that you know whether you have a 64-bit or 32-bit machine.

If you despair of getting a reasonable picture for your machine, you may use the cache.m generated by cache.c on my antiquated desktop machine.

If you are having problems with cache.m, you can send it to Tyler. He has offered to look at it and see whether anything needs to be fixed.

Other hints from Tyler:

  • You need to manipulate the numbers a little: the results have units of ns/4B, but the throughput surface 6.43 in the book, which you are trying to recreate, has units of MB/s.
  • Make sure you are not doing anything else on the computer you are testing on, and make sure it is plugged in so that it doesn't go into any power-saving mode.
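The unit conversion in the first hint is a one-line calculation: 4 bytes in t nanoseconds is 4/t bytes per ns, which is 4000/t MB/s. The sketch below shows it in C, although the same arithmetic applies wherever you do the conversion. The function name ns_per_4b_to_mb_per_s is hypothetical, and the sketch assumes 1 MB = 10^6 bytes; if the book's figure uses 2^20 bytes per MB, divide by 1048576.0 instead of 1e6.

```c
#include <assert.h>

/* Convert a time in nanoseconds per 4-byte access into throughput in MB/s.
   Assumes 1 MB = 1e6 bytes (an assumption; check the book's convention). */
double ns_per_4b_to_mb_per_s(double ns_per_4b) {
    assert(ns_per_4b > 0.0);
    double bytes_per_ns = 4.0 / ns_per_4b;    /* 4 bytes moved per access   */
    double bytes_per_s = bytes_per_ns * 1e9;  /* 1e9 ns in one second       */
    return bytes_per_s / 1e6;                 /* bytes/s to MB/s            */
}
```

For example, a measured 4 ns per access corresponds to 1000 MB/s under this convention.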
Question: 11-20-13 In problem 2, what type of data do the matrices contain? (Are A, B, C supposed to be int, long, float, or double, etc.?)

Answer: I prefer double. Float is acceptable if you have already completed the problem.