Backprop and SGD From Scratch 2022-10-13

[[my backprop SGD from scratch 2022-Aug]] 13:16 So per yesterday, I'm wondering why the network I have is producing basically the same result, around 0.48, for any inputs. And that's true both in my original matrix-multiplication code and in the manually constructed version too. So let's say for a simple network, y_prob = sigmoid(x1*w1 + x2*w2), where x1 and x2 are also outputs of sigmoids, in (0, 1), what are the possible values for y_prob?...

October 13, 2022 · (updated February 26, 2023) · 2 min · 288 words · Michal Piekarczyk
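One way to explore the question in the excerpt above: since x1 and x2 each lie in (0, 1), the pre-activation x1*w1 + x2*w2 lies strictly between the sum of the negative weights and the sum of the positive weights, and sigmoid is increasing, so y_prob is squeezed between the sigmoids of those two sums. Below is a minimal sketch of that bound, assuming no bias term (as in the formula above). The helper name `y_prob_bounds` and the sample weights are made up for illustration and are not from the original code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def y_prob_bounds(w1, w2):
    """Open-interval bounds on sigmoid(x1*w1 + x2*w2) for x1, x2 in (0, 1).

    Hypothetical helper: each term x*w ranges over (min(0, w), max(0, w)),
    so the sum ranges between the two totals below.
    """
    lowest_z = min(0.0, w1) + min(0.0, w2)
    highest_z = max(0.0, w1) + max(0.0, w2)
    return sigmoid(lowest_z), sigmoid(highest_z)


# Illustrative weights only (assumed, not taken from the trained network).
for w1, w2 in [(0.1, -0.2), (1.0, 1.0), (5.0, -5.0)]:
    low, high = y_prob_bounds(w1, w2)
    print(f"w1={w1}, w2={w2}: y_prob in ({low:.3f}, {high:.3f})")
```

With small weights the interval is a narrow band around 0.5, so seeing nearly the same output for every input would be consistent with small weights in the last layer, though this sketch alone does not show that this is the cause here.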

Back prop from scratch 2022-10-12

ok [[my backprop SGD from scratch 2022-Aug]] looking over results from last time, it is indeed strange how the microbatch loss was going back and forth, and that the plot of my change in loss, deltas = [x["loss_after"] - x["loss_before"] for x in metrics["micro_batch_updates"]], is eventually trending upward, although initially some of the values were negative as well. But I wonder, does it indeed mean something is terribly wrong if this number ever goes up at all?...

October 12, 2022 · (updated February 26, 2023) · 7 min · 1280 words · Michal Piekarczyk
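As a rough sanity check on the question in the excerpt above, here is a toy sketch, not the network from the post. It uses a one-weight logistic model with a hand-derived gradient, records `loss_before` and `loss_after` on the same micro batch, and computes `deltas` the same way as above. The model, data, learning rates, and the helper `one_update` are all assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w, xs, ys):
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)


def gradient(w, xs, ys):
    # d(log loss)/dw for p = sigmoid(w * x) is mean((p - y) * x).
    return sum((sigmoid(w * x) - y) * x for x, y in zip(xs, ys)) / len(xs)


def one_update(w, xs, ys, learning_rate):
    """Hypothetical helper: one gradient step, loss measured on the same batch."""
    before = log_loss(w, xs, ys)
    w_new = w - learning_rate * gradient(w, xs, ys)
    return {"loss_before": before, "loss_after": log_loss(w_new, xs, ys)}


# Toy micro batch with conflicting labels, so the best weight is w = 0.
xs, ys = [1.0, 1.0], [1, 0]
metrics = {
    "micro_batch_updates": [one_update(1.0, xs, ys, lr) for lr in (0.1, 20.0)]
}
deltas = [x["loss_after"] - x["loss_before"] for x in metrics["micro_batch_updates"]]
print(deltas)
```

In this toy case the small step lowers the loss on its own batch while the oversized step overshoots and raises it, even though the gradient is correct. So a positive delta by itself does not prove the backprop math is wrong, but a loss that keeps rising on the same batch with a small learning rate would point toward a gradient or sign bug.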

Not sure how I managed to catch my bus to DC this morning

I have no idea how I managed to get on my bus to DC this morning. The Citibike station I planned on was full, then my phone was Unavailable until like 10:42. The bus boards at 11:00am. I still had to find a Citibike dock to park the bike, because the stolen Citibike fee is $1,250. At 10:42, of course, I open my phone and have absolutely no cell service. The Citibike station does not have a printed map on it....

October 6, 2022 · (updated March 4, 2023) · 2 min · 303 words · Michal Piekarczyk

Back prop from scratch 2022-10-02

my backprop SGD from scratch 2022-Aug 14:13 ok reviewing from last time. Yea, so I had switched from relu to sigmoid on commit b88ef76daf, but log loss is still going up during training. So I for sure got rid of the bug where it did not make sense to map that relu output to a sigmoid, since a relu only produces non-negative numbers and so the sigmoid therefore was only able to produce values greater than 0....

October 2, 2022 · (updated February 26, 2023) · 4 min · 811 words · Michal Piekarczyk
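The relu-into-sigmoid issue mentioned in the excerpt above can be seen with a few numbers. Since relu never returns a negative value, and sigmoid is increasing with sigmoid(0) = 0.5, the composition sigmoid(relu(z)) can never fall below 0.5. A minimal sketch follows, where the sample pre-activations are made up for illustration.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative pre-activation values (assumed, not from the actual network).
pre_activations = [-10.0, -1.0, 0.0, 1.0, 10.0]
outputs = [sigmoid(relu(z)) for z in pre_activations]

for z, out in zip(pre_activations, outputs):
    print(f"z={z:6.1f}  sigmoid(relu(z))={out:.4f}")
```

Every negative pre-activation collapses to the same output, so a final layer built this way cannot express a probability below one half for the negative class.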

Backprop and SGD From Scratch 2022-09-25

[[my back prop SGD from scratch 2022-Aug]] 13:35 yea so last time I had noticed, hey, on a random initialization why was the y_prob 0.5? I had literally just initialized a new network and got this first example, ipdb> p x, y (array([10.31816265, -8.80044688]), 1) while running in the ipdb debug mode, and inside of train_network(), ran --> 186 y_prob = feed_forward(x, model.layers, verbose=False) and got ipdb> p y_prob 0....

September 25, 2022 · (updated February 26, 2023) · 4 min · 683 words · Michal Piekarczyk