{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Name: **Your name here** \n", "UID: **Your student ID number here**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Optional Homework: MCMC \n", "In this homework you will work with the loss function (the negative log likelihood) of a logistic regression. Unlike your previous homework assignments, where you \"solved\" for the optimal regression parameters using gradient optimization, in this assignment you will sample the parameters with MCMC and use the samples to create a confidence interval for the slope of the separation line between two classes." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from utility import *\n", "import numpy as np\n", "from numpy.random import randn, rand\n", "import matplotlib.pyplot as plt\n", "np.random.seed(0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a classification problem in two dimensions\n", "The two classes will be separated by the line\n", " $$w^Tx = 0$$\n", "where $w$ is a 2-vector. Writing $x=(x_0,x_1)$, this line is $x_1 = -(w[0]/w[1])\\,x_0$, so its slope is $m=-w[0]/w[1]$." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a matrix of data points and a vector of labels\n", "X, y = create_classification_problem(100, 2, cond_number=3)\n", "\n", "# Define the logistic loss function (the negative log likelihood)\n", "nll = lambda w: logreg_objective(w,X,y)\n", "\n", "# An initial guess of the minimizer (may not be close to the center of the distribution)\n", "# Note: I'm choosing a \"bad\" initial guess to produce burn-in samples for instructional purposes\n", "w_guess = np.array([[-10],[10]]) \n", "\n", "# Test the negative log likelihood function\n", "f = nll(w_guess)\n", "print('The NLL of the initial guess is ', f)\n", "\n", "# Plot the two classes: label 1 in blue, the other class in red\n", "ind = y.ravel()==1\n", "plt.scatter(X[ind,0], X[ind,1], color='blue')\n", "plt.scatter(X[~ind,0], X[~ind,1], color='red')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generate many samples from the posterior distribution\n", "Note: the NLL function above returns $-\\log(p(w))$. \n", "\n", "**You will have to fill in the formula for the acceptance probability, alpha.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "iters = 5000 # number of MCMC samples to draw\n", "sigma = 3 # standard deviation of the Gaussian proposal distribution\n", "\n", "# Counters to keep track of how many proposals have been rejected and accepted \n", "reject_count = 0\n", "accept_count = 0\n", "\n", "# Arrays to store all the iterates produced\n", "samps = np.zeros((iters,2)) # The samples of w from the distribution\n", "slopes = np.zeros((iters,1)) # The slopes of the samples\n", "nlls = np.zeros((iters,1)) # The NLL values of the samples\n", "\n", "# Run the Metropolis sampler \n", "w = w_guess\n", "for i in range(iters):\n", " # Make a proposal\n", " wp = w+sigma*randn(2,1) \n", " \n", " # The acceptance probability\n", " alpha = ######## FILL IN THIS LINE OF CODE #######\n", " \n", " # Should you accept this sample?\n", " 
if rand()