Question: 09-28-10 What is the due date? How should I submit it?
Answer: This assignment is due by 9:30am on Tuesday October 19.
Question: 09-28-10 What are the point values for the challenges?
Challenge 1a : 2 points
Challenge 1b : 2 points
Challenge 2a : 10 points
Challenge 2b : 2 points
Challenge 3a : 5 points
Challenge 3b : 2 points
Challenge 4a : 7 points
Challenge 4b : 10 points
Question: 09-28-10 What should I submit?
Answer: We grade everyone's Challenge 1, then everyone's Challenge 2, etc., to help insure uniformity in grading, so I would really appreciate your help in making this easy.Grading notes:
Question: 09-29-10 When I look at the data files, I see trash.
Answer: The files are in Matlab format. Save the file, don't look at it. (Depending on your browser, you may need to right-click on the file in order to bring up the "save" option as an alternative to "view".) To load cisimatrix.mat into Matlab, type "load cisimatrix". Then type "whos" to see what you got, or look at the changes in storage on the right subwindow of the Matlab screen.
Question: 09-29-10 Do we need any Matlab toolboxes for this case study?
Correction: 10-01-10 In Algorithm 0.1, the matrix U has n columns, so "Let U = [u1, . . . , uk]" should say "Let U = [u1, . . . , un]." Similarly, on p.2, and in Challenge 0.1(b), we determine the columns of U by solving n least squares problems, not k. (These changes are marked in green on the posted copy.)
10-05-10 Question: I was wondering with the assignment what Matlab routines can we use? Specifically I am thinking of 2)a). Can we use qr(...), matrix multiplication etc or should everything be re written in loops?
Answer: Yes, use Mathworks functions (qr, norm, etc) and commands (matrix multiply, etc.) to make your life easier. These also tend to make the computations more efficient and introduce fewer bugs.
1. Remove the assumption that m is greater than or equal to n, since not all of the test matrices satisfy this assumption.
2. In Challenge 4, choose values of k between 1 and min(m,n)/4.
3. Stop NonNegApprox after 50 iterations if it has not converged.
10-08-10 In Challenge 4, the CISI data is taking a long time.
Answer: Use k = 20, 40, ... , 360. If your run takes much longer than an hour, you may cut the number of random trials in half.
Corrections to Challenge 2b:
1. In Challenge 2b, it is ok to do the multiplication count for a matrix A that has no zero entries. Doing it for a matrix with nz nonzeros is rather difficult.
2. Also in Challenge 2b, change "that that" to "that".
10-11-10 Question: If I try to do a RR-QR on the CISI dataset I get back an error message.
Answer: If you use Matlab's qr, apply it to full(S), where S is the matrix from the datafile.
10-11-10 Question: Can we add optional input arguments to NonNegApprox() for the convergence threshold and the maximum number of iterations?
10-14-10 Question: How efficient do my programs need to be?
Answer: They should not take an order of magnitude longer than necessary, so if you introduce an extra factor of n or m, you will probably lose a point or two. In particular, Matlab's pivoted qr will give you the right answer but it cannot be stopped after k steps.
10-15-10 Question: When I use 'svd' on the 'cisimatrix.mat', I get an error. I am able to use it on the other two matricies just fine. Should I not compute an SVD decomposition for the 'cisimatrix.mat' ?
Answer: Yes, it is fine to use svds on the large matrix. svd(full(A)) also works on the cisimatrix, but this is slow.
10-17-10 Question: If there are negative elements in the initial guess for the nonnegative matrix factorization, can we set them to zero?
10-18-10 Question: Is Matlab matrix multiplication left or right associative?
Answer: Matlab evaluates matrix expressions from left to right, so A*B*C is the same as (A*B)*C.
10-18-10 Question: When calculating the big-O of the number of multiplications, can we assume that m and n are much larger than k?
Answer: m and n can be assumed to be larger than k, but k could be a fixed fraction of min(m,n).