Posted: April 1, 2017
Due: April 21, 2017
Last Update: March 23, 2017

Gradient Descent

Problem 1 Implement the gradient descent algorithm (either batch or stochastic versions) for multiple linear regression. I.e., extend the version of the algorithm in the lecture notes to multiple parameters.
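As a reference point, one way to sketch the batch version for linear regression is below (the function and variable names are illustrative, not required, and the gradient is averaged over the \(n\) observations as a common convention):

```python
import numpy as np

def linreg_gradient_descent(X, y, alpha=0.01, n_iter=1000):
    """Batch gradient descent for multiple linear regression.
    X: (n, p) predictor matrix; y: (n,) outcome vector.
    Returns (beta0, beta): intercept and coefficient estimates."""
    n, p = X.shape
    beta0, beta = 0.0, np.zeros(p)
    for _ in range(n_iter):
        resid = y - (beta0 + X @ beta)       # y_i minus the current fit
        beta0 += alpha * resid.sum() / n     # intercept update
        beta += alpha * (X.T @ resid) / n    # coefficient updates
    return beta0, beta
```

The stochastic version would instead update the parameters using one observation (or a small batch) at a time.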

The gradient descent update equation for logistic regression is given by:

\[ \beta^{k+1} = \beta^k + \alpha \sum_{i=1}^{n} (y_i - p_i(\beta^k))\mathbf{x_i} \]

where (from the definition of log-odds):

\[ p_i(\beta^k) = \frac{e^{f_i(\beta^k)}}{1+e^{f_i(\beta^k)}} \]

and \(f_i(\beta^k) = \beta_0^k + \beta_1^k x_{i1} + \beta_2^k x_{i2} + \cdots + \beta_p^k x_{ip}\).
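For concreteness, a single batch update of the equation above can be sketched as follows (names are illustrative; the intercept is handled by a leading column of ones):

```python
import numpy as np

def logistic_step(beta, X1, y, alpha=0.01):
    """One batch update of the logistic-regression gradient equation.
    beta: (p+1,) current parameters, intercept first;
    X1: (n, p+1) predictors with a leading column of ones;
    y: (n,) 0/1 outcomes. Returns the updated parameter vector."""
    f = X1 @ beta                        # f_i(beta) for each observation
    p = np.exp(f) / (1.0 + np.exp(f))    # p_i(beta), the logistic function
    return beta + alpha * (X1.T @ (y - p))
```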

Problem 2 Derive the above update equation. Write the derivation in a Markdown cell of your ipynb notebook.

Problem 3 Implement the gradient descent algorithm (either batch or stochastic versions) for multiple logistic regression. I.e., modify your code in problem 1 for the logistic regression update equation.

Make sure your submission writeup states which version of the algorithm you implemented (stochastic or batch), and comment your code to help us understand your implementation.

Problem 4 To test your programs, simulate data from the linear regression and logistic regression models and check that your implementations recover the simulation parameters properly.

Use the following functions to simulate data for your testing:

import sklearn.datasets

# simulate data for linear regression
gen_data_x, gen_data_y = sklearn.datasets.make_regression(n_samples=100, n_features=20, noise=1.5)

# simulate data for logistic regression.  This is similar to linear,
# only now the outcome values are either 0 or 1.
log_gen_data_x, dump_y = sklearn.datasets.make_regression(n_samples=100, n_features=20, noise=1.5)
log_gen_data_y = [0 if i > 0 else 1 for i in dump_y]

You can use this function as follows in your submission:

import numpy as np
import matplotlib.pyplot as plt

# a really bad estimator:
# returns a random matrix in place of estimated parameters
dummy = np.empty((100, 20))
for index, row in enumerate(dummy):
    dummy[index] = np.random.normal(0, .1, 20)
plt.plot(gen_data_x, dummy)

Include a similar plot in your writeup and comment on how your gradient descent implementation is working.

Try it out

  1. Find a dataset on which to try out different classification (or regression) algorithms.

  2. Choose two of the following algorithms:

  1. Linear Discriminant Analysis (LDA) (classification only)
  2. classification (or regression) trees
  3. random forests
  4. linear SVM
  5. non-linear SVM
  6. k-NN classification (or regression)

and compare their prediction performance on your chosen dataset to your logistic regression gradient descent implementation using 10-fold cross-validation and a paired \(t\)-test (one for each of the two algorithms vs. your logistic regression code). Note: for those algorithms that have hyper-parameters, i.e., all of the above except for LDA, you need to specify in your writeup which model selection procedure you used.
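A sketch of the comparison machinery is below, assuming scikit-learn and scipy, with random forests and k-NN as two example choices standing in for whichever algorithms (and for your own logistic regression implementation) you use:

```python
import numpy as np
from scipy import stats
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# toy data; substitute your chosen dataset here
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# per-fold error rates (1 - accuracy) for the two candidate algorithms
rf_err = 1 - cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)
knn_err = 1 - cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=cv)

# mean 10-fold CV error with its standard error
for name, err in [("random forest", rf_err), ("5-NN", knn_err)]:
    print(f"{name}: {err.mean():.3f} +/- {err.std(ddof=1) / np.sqrt(len(err)):.3f}")

# paired t-test on the per-fold errors of the two algorithms
t, p = stats.ttest_rel(rf_err, knn_err)
print(f"paired t-test: t = {t:.3f}, p = {p:.3f}")
```

In your own comparison, the per-fold errors of your logistic regression implementation would take the place of one of the two arrays in each paired test.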

Handing in:

  1. For Problems 1 and 3 include your code in the writeup. Make sure they are commented and that the code is readable in your final writeup (e.g., check line widths).

  2. For Problem 2, include the derivation of the gradient descent update in the writeup.

  3. For Problem 4, make sure you run the provided code and include the output in the writeup.

  4. For the next section organize your writeup as follows:

  1. Describe the dataset you are using, including: what is the outcome you are predicting (remember this should be a classification task) and what are the predictors you will be using.

  2. Include code to obtain and prepare your data as a dataframe to use with your three classification algorithms. In case your dataset includes non-numeric predictors, include the code you are using to transform these predictors into numeric predictors you can use with your logistic regression implementation.

  3. Specify the two additional algorithms you have chosen in step 2 above, and for algorithms that have hyper-parameters specify the method you are using for model selection.

  4. Include all code required to perform the 10-fold cross-validation procedure on your three algorithms.

  5. Write up the results of your 10-fold cross-validation procedure. Make sure to report the 10-fold CV error estimate (with standard error) for each of the three algorithms. Also report the results of the two paired \(t\)-tests comparing your logistic regression implementation with your two chosen algorithms.

Submit to ELMS as Project 3: https://myelms.umd.edu/courses/1218364/assignments/4389209.
