I am writing a program in Python to do neural networks, and I am trying to set up the backpropagation algorithm. The basic idea is that I loop through 5,000 training examples, accumulate the errors, work out which direction the thetas need to move, and then move them in that direction. There are the training examples, then one hidden layer, and then an output layer. However, I am getting the gradient/derivative/error wrong here, because the thetas are not moving the way they should. I've put 8 hours into this today and I'm not sure what I'm doing wrong. Thanks for your help!!
```python
import numpy as np

# x = 401x5000 matrix (inputs, bias row included)
# y = 10x5000 matrix: 10 possible output classes, so one column will look like
#     [0, 0, 0, 1, 0, ..., 0] to indicate the output class was 4
# theta_1 = 25x401, theta_2 = 10x26
m = 5000
alpha = .01
sigmoid = lambda theta, x: 1 / (1 + np.exp(-(theta * x)))  # theta*x is a matrix product for np.matrix

# move thetas in the right direction for each iteration
for iter in range(0, 1):
    all_delta_1, all_delta_2 = 0, 0
    # loop through each training example, 1...m
    for t in range(0, m):
        hidden_layer = np.matrix(np.concatenate((np.ones((1, 1)), sigmoid(theta_1, x[:, t]))))
        output_layer = sigmoid(theta_2, hidden_layer)

        delta_3 = output_layer - y[:, t]
        delta_2 = np.multiply((theta_2.T * delta_3), np.multiply(hidden_layer, (1 - hidden_layer)))

        all_delta_2 += delta_3 * hidden_layer.T
        all_delta_1 += delta_2[1:] * x[:, t].T  # drop the bias row of delta_2

    delta_gradient_2 = all_delta_2 / m
    delta_gradient_1 = all_delta_1 / m

    theta_1 = theta_1 - (alpha * delta_gradient_1)
    theta_2 = theta_2 - (alpha * delta_gradient_2)
```
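To check whether deltas like these actually match the true derivative, the standard sanity test is a numerical gradient check on a tiny network. Below is a minimal sketch of that idea; note it is an illustration, not the code above: the layer sizes are made up, it uses plain `ndarray` with `@` instead of `np.matrix`, and it assumes the cross-entropy cost (which is what makes `delta_3 = output - y` the correct output delta).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny made-up network: 3 inputs (+ bias) -> 4 hidden (+ bias) -> 2 outputs, 5 examples.
n_in, n_hid, n_out, m = 3, 4, 2, 5
x = np.vstack([np.ones((1, m)), rng.normal(size=(n_in, m))])      # 4x5, bias row first
y = np.zeros((n_out, m))
y[rng.integers(n_out, size=m), np.arange(m)] = 1                  # one-hot columns
theta_1 = rng.normal(scale=0.5, size=(n_hid, n_in + 1))           # 4x4
theta_2 = rng.normal(scale=0.5, size=(n_out, n_hid + 1))          # 2x5

def forward(theta_1, theta_2):
    a2 = np.vstack([np.ones((1, m)), sigmoid(theta_1 @ x)])       # hidden layer with bias
    a3 = sigmoid(theta_2 @ a2)                                    # output layer
    return a2, a3

def cost(theta_1, theta_2):
    # Cross-entropy cost averaged over the m examples (assumed, not stated in the question)
    _, a3 = forward(theta_1, theta_2)
    return -np.sum(y * np.log(a3) + (1 - y) * np.log(1 - a3)) / m

# Analytic gradients: same deltas as in the question, vectorised over all examples.
a2, a3 = forward(theta_1, theta_2)
delta_3 = a3 - y
delta_2 = (theta_2.T @ delta_3) * a2 * (1 - a2)
grad_2 = delta_3 @ a2.T / m
grad_1 = delta_2[1:] @ x.T / m                                    # drop the bias row

def numeric_grad(theta, eps=1e-5):
    # Central finite differences, one parameter at a time.
    g = np.zeros_like(theta)
    for i in np.ndindex(*theta.shape):
        orig = theta[i]
        theta[i] = orig + eps; hi = cost(theta_1, theta_2)
        theta[i] = orig - eps; lo = cost(theta_1, theta_2)
        theta[i] = orig
        g[i] = (hi - lo) / (2 * eps)
    return g

# If the deltas are right, both differences should be close to zero.
print(np.max(np.abs(grad_1 - numeric_grad(theta_1))))
print(np.max(np.abs(grad_2 - numeric_grad(theta_2))))
```

If the analytic and numeric gradients disagree, the backpropagation step that first diverges (output delta, hidden delta, or the accumulated products) is where the bug lives.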