{"id":19150708,"url":"https://github.com/mantasu/cs231n","last_synced_at":"2025-04-06T03:06:44.027Z","repository":{"id":50321803,"uuid":"401074012","full_name":"mantasu/cs231n","owner":"mantasu","description":"Shortest solutions for CS231n 2021-2024","archived":false,"fork":false,"pushed_at":"2024-05-25T03:36:34.000Z","size":15162,"stargazers_count":306,"open_issues_count":4,"forks_count":64,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-30T02:04:46.650Z","etag":null,"topics":["computer-vision","convolutional-neural-networks","cs231n","deep-learning","pytorch","tensorflow","visual-recognition"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mantasu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-29T15:18:02.000Z","updated_at":"2025-03-29T02:05:29.000Z","dependencies_parsed_at":"2023-12-24T20:30:48.245Z","dependency_job_id":null,"html_url":"https://github.com/mantasu/cs231n","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mantasu%2Fcs231n","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mantasu%2Fcs231n/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mantasu%2Fcs231n/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mantasu%2Fcs231n/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mantasu","do
wnload_url":"https://codeload.github.com/mantasu/cs231n/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247427006,"owners_count":20937201,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","convolutional-neural-networks","cs231n","deep-learning","pytorch","tensorflow","visual-recognition"],"created_at":"2024-11-09T08:12:51.138Z","updated_at":"2025-04-06T03:06:43.986Z","avatar_url":"https://github.com/mantasu.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003eCS231n: Assignment Solutions\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\u003cb\u003eConvolutional Neural Networks for Visual Recognition\u003c/b\u003e\u003c/p\u003e\n\u003cp align=\"center\"\u003e\u003ci\u003eStanford - Spring 2021-2024\u003c/i\u003e\u003c/p\u003e\n\n## About\n### Overview\nSolutions for the **CS231n** course assignments offered by Stanford University (Spring 2021-2024). Inline questions are explained in detail, and the code is brief and commented (see the examples below). As far as I could find, these should be the shortest code solutions (excluding open-ended challenges). In assignment 2, _DenseNet_ is used in the _PyTorch_ notebook and _ResNet_ in the _TensorFlow_ notebook. \n\n\u003e Check out the solutions for **[CS224n](https://github.com/mantasu/cs224n)**. 
They contain more comprehensive explanations than others.\n\n### Main sources (official)\n* [**Course page**](http://cs231n.stanford.edu/index.html)\n* [**Assignments**](http://cs231n.stanford.edu/assignments.html)\n* [**Lecture notes**](https://cs231n.github.io/)\n* [**Lecture videos** (2017)](https://www.youtube.com/playlist?list=PLC1qU-LWwrF64f4QKQT-Vg5Wr4qEE1Zxk)\n\n\u003cbr\u003e\n\n## Solutions\n### Assignment 1\n* [Q1](assignment1/knn.ipynb): k-Nearest Neighbor classifier. (_Done_)\n* [Q2](assignment1/svm.ipynb): Training a Support Vector Machine. (_Done_)\n* [Q3](assignment1/softmax.ipynb): Implement a Softmax classifier. (_Done_)\n* [Q4](assignment1/two_layer_net.ipynb): Two-Layer Neural Network. (_Done_)\n* [Q5](assignment1/features.ipynb): Higher Level Representations: Image Features. (_Done_)\n\n### Assignment 2\n* [Q1](assignment2/FullyConnectedNets.ipynb): Fully-connected Neural Network. (_Done_)\n* [Q2](assignment2/BatchNormalization.ipynb): Batch Normalization. (_Done_)\n* [Q3](assignment2/Dropout.ipynb): Dropout. (_Done_)\n* [Q4](assignment2/ConvolutionalNetworks.ipynb): Convolutional Networks. (_Done_)\n* [Q5](assignment2/PyTorch.ipynb) _option 1_: PyTorch on CIFAR-10. (_Done_)\n* [Q5](assignment2/TensorFlow.ipynb) _option 2_: TensorFlow on CIFAR-10. 
(_Done_)\n\n### Assignment 3\n* [Q1](assignment3/RNN_Captioning.ipynb): Image Captioning with Vanilla RNNs (_Done_)\n* [Q2](assignment3/Transformer_Captioning.ipynb): Image Captioning with Transformers (_Done_)\n* [Q3](assignment3/Network_Visualization.ipynb): Network Visualization: Saliency Maps, Class Visualization, and Fooling Images (_Done_)\n* [Q4](assignment3/Generative_Adversarial_Networks.ipynb): Generative Adversarial Networks (_Done_)\n* [Q5](assignment3/Self_Supervised_Learning.ipynb): Self-Supervised Learning for Image Classification (_Done_)\n* [Q6](assignment3/LSTM_Captioning.ipynb): Image Captioning with LSTMs (_Done_)\n\n\u003cbr\u003e\n\n## Running Locally\n\nRunning in [Colab](https://colab.research.google.com/) is recommended, but you can also run the notebooks locally. To do so, first set up your environment, either through [conda](https://docs.conda.io/en/latest/) or [venv](https://docs.python.org/3/library/venv.html). It is advisable to install [PyTorch](https://pytorch.org/get-started/locally/) with GPU acceleration in advance. Then follow these steps:\n1. Install the required packages:\n   ```bash\n   pip install -r requirements.txt\n   ```\n2. Change the first code cell in every `.ipynb` file to:\n   ```bash\n   %cd cs231n/datasets/\n   !bash get_datasets.sh\n   %cd ../../\n   ```\n3. Change the first code cell in the **Fast Layers** section of [ConvolutionalNetworks.ipynb](assignment2/ConvolutionalNetworks.ipynb) to:\n   ```bash\n   %cd cs231n\n   !python setup.py build_ext --inplace\n   %cd ..\n   ```\n\nI've gathered the requirements for all 3 assignments into one file, [requirements.txt](requirements.txt), so there is no need to separately install the requirements specified under each assignment folder. 
If you plan to complete [TensorFlow.ipynb](assignment2/TensorFlow.ipynb), then you also need to install [TensorFlow](https://www.tensorflow.org/install).\n\n\n\u003e **Note**: to use MPS acceleration on Apple M1, see the comment in [#4](https://github.com/mantasu/cs231n/issues/4#issuecomment-1492202538).\n\n## Examples\n\n\u003cdetails\u003e\u003csummary\u003e\u003cb\u003eInline question example\u003c/b\u003e\u003c/summary\u003e\n\u003cbr\u003e\n\u003cb\u003eInline Question 1\u003c/b\u003e\n\n\u003chr\u003e\n\u003cp align=\"justify\"\u003e\u003csub\u003eIt is possible that once in a while a dimension in the gradcheck will not match exactly. What could such a discrepancy be caused by? Is it a reason for concern? What is a simple example in one dimension where a gradient check could fail? How would changing the margin affect the frequency of this happening? \u003ci\u003eHint: the SVM loss function is not strictly speaking differentiable\u003c/i\u003e\u003c/sub\u003e\u003c/p\u003e\n\u003chr\u003e\n\n\u003cbr\u003e\n\n\u003cb\u003eYour Answer\u003c/b\u003e\n\n\u003chr\u003e\n\n\u003csub\u003e\nFirst, we need to make some assumptions. To compute our \u003cb\u003eSVM loss\u003c/b\u003e, we use the \u003cb\u003ehinge loss\u003c/b\u003e, which takes the form $\\max(0, \\cdot)$. For the \u003ccode\u003e1D\u003c/code\u003e case, we can define it as follows ( $\\hat{y}$ - score, $i$ - any class, $c$ - correct class, $\\Delta$ - margin):\n\u003c/sub\u003e\n\n\u003csub\u003e\n$$f(x)=\\max(0, x),\\ \\text{ where } x=\\hat{y}_i-\\hat{y}_c+\\Delta$$\n\u003c/sub\u003e\n\n\u003csub\u003e\nLet's now see how our $\\max$ function fits the definition of the derivative. 
It is the formula we use for computing the gradient \u003ci\u003enumerically\u003c/i\u003e when, instead of taking the limit as $h$ approaches $0$, we choose some arbitrarily small $h$:\n\u003c/sub\u003e\n\n\u003csub\u003e\n$$\\frac{df(x)}{dx}=\\lim_{h \\to 0}\\frac{\\max(0,x+h)-\\max(0,x)}{h}$$\n\u003c/sub\u003e\n\n\u003csub\u003e\nNow we can talk about the possible mismatches between \u003ci\u003enumeric\u003c/i\u003e and \u003ci\u003eanalytic\u003c/i\u003e gradient computation:\n\u003c/sub\u003e\n\n1. \u003csub\u003e**Cause of mismatch** \u003c/sub\u003e\n    * \u003csub\u003e _Relative error_ - the discrepancy is caused by the arbitrary choice of a small but finite $h$, whereas by definition $h$ should approach `0`. _Analytic_ computation produces an exact result (up to floating-point precision), while the _numeric_ solution only approximates it. \u003c/sub\u003e\n    * \u003csub\u003e _Kinks_ - $\\max$ only has a subgradient: at the point where both of its arguments are equal, the function is not smooth and its gradient is undefined. Such points, referred to as _kinks_, may cause the _numeric_ gradient to differ from the _analytic_ one, again because of the arbitrary choice of $h$. \u003c/sub\u003e\n2. \u003csub\u003e **Concerns** \u003c/sub\u003e\n    * \u003csub\u003e When comparing _analytic_ and _numeric_ methods, _kinks_ are more dangerous than small inaccuracies where the function is smooth. Small derivative inaccuracies still change the weights by approximately the same amount, but _kinks_ may cause unintended updates, as seen in the example below. If the unintended values have a noticeable effect on parameter updates, it is a reason for concern. \u003c/sub\u003e\n3. \u003csub\u003e **`1D` example of a numeric gradient check failing** \u003c/sub\u003e\n    * \u003csub\u003e Assume $x=-10^{-9}$. Then the _analytic_ computation of the derivative of $\\max(0, x)$ would yield `0`. 
However, if we choose our $h=10^{-8}$, then the _numeric_ computation would yield `0.9`, since $\\max(0, x+h)=9\\cdot 10^{-9}$ and $\\max(0, x)=0$. \u003c/sub\u003e\n4. \u003csub\u003e **Relation between margin and mismatch** \u003c/sub\u003e\n    * \u003csub\u003e Assuming all other parameters remain **unchanged**, increasing $\\Delta$ will lower the frequency of _kinks_. This is because a higher $\\Delta$ will cause more $x$ to be positive, thus reducing the probability of landing on a kink. In practice, though, it would not have a big effect: if we increase the margin $\\Delta$, the **SVM** will simply learn to increase the (negative) gap between $\\hat y_i - \\hat y_c$ and `0` (when $i\\ne c$). So once $\\Delta$ is added back, $x$ has the same chance of landing near the kink. \u003c/sub\u003e\n\n\u003chr\u003e\n\u003c/details\u003e\n\n\u003cdetails\u003e\u003csummary\u003e\u003cb\u003ePython code example\u003c/b\u003e\u003c/summary\u003e\n\u003csub\u003e\n\n```python\nimport numpy as np\n\ndef conv_forward_naive(x, w, b, conv_param):\n    \"\"\"A naive implementation of the forward pass for a convolutional layer.\n\n    The input consists of N data points, each with C channels, height H and\n    width W. We convolve each input with F different filters, where each filter\n    spans all C channels and has height HH and width WW.\n\n    Input:\n    - x: Input data of shape (N, C, H, W)\n    - w: Filter weights of shape (F, C, HH, WW)\n    - b: Biases, of shape (F,)\n    - conv_param: A dictionary with the following keys:\n      - 'stride': The number of pixels between adjacent receptive fields in the\n        horizontal and vertical directions.\n      - 'pad': The number of pixels that will be used to zero-pad the input.\n\n    During padding, 'pad' zeros should be placed symmetrically (i.e. equally on both sides)\n    along the height and width axes of the input. 
Be careful not to modify the original\n    input x directly.\n\n    Returns a tuple of:\n    - out: Output data, of shape (N, F, H', W') where H' and W' are given by\n      H' = 1 + (H + 2 * pad - HH) / stride\n      W' = 1 + (W + 2 * pad - WW) / stride\n    - cache: (x, w, b, conv_param)\n    \"\"\"\n    out = None\n    ###########################################################################\n    # TODO: Implement the convolutional forward pass.                         #\n    # Hint: you can use the function np.pad for padding.                      #\n    ###########################################################################\n    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n\n    P1 = P2 = P3 = P4 = conv_param['pad'] # padding: up = right = down = left\n    S1 = S2 = conv_param['stride']        # stride:  vertical = horizontal\n    N, C, HI, WI = x.shape                # input dims\n    F, _, HF, WF = w.shape                # filter dims\n    HO = 1 + (HI + P1 + P3 - HF) // S1    # output height\n    WO = 1 + (WI + P2 + P4 - WF) // S2    # output width\n\n    # Helper function (warning: numpy version 1.20 or above is required for usage)\n    to_fields = lambda x: np.lib.stride_tricks.sliding_window_view(x, (WF,HF,C,N))\n\n    w_row = w.reshape(F, -1)                                            # weights as rows\n    x_pad = np.pad(x, ((0,0), (0,0), (P1, P3), (P2, P4)), 'constant')   # padded inputs\n    x_col = to_fields(x_pad.T).T[...,::S1,::S2].reshape(N, C*HF*WF, -1) # inputs as cols\n\n    out = (w_row @ x_col).reshape(N, F, HO, WO) + np.expand_dims(b, axis=(2,1))\n    \n    x = x_pad # we will use padded version as well during backpropagation\n\n    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n    ###########################################################################\n    #                             END OF YOUR CODE                            #\n    
###########################################################################\n    cache = (x, w, b, conv_param)\n    return out, cache\n```\n\n\u003c/sub\u003e\n\u003c/details\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmantasu%2Fcs231n","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmantasu%2Fcs231n","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmantasu%2Fcs231n/lists"}