{"id":13499065,"url":"https://github.com/bamos/densenet.pytorch","last_synced_at":"2025-04-04T17:09:43.777Z","repository":{"id":50505151,"uuid":"81465000","full_name":"bamos/densenet.pytorch","owner":"bamos","description":"A PyTorch implementation of DenseNet.","archived":false,"fork":false,"pushed_at":"2018-08-16T19:33:50.000Z","size":3135,"stargazers_count":837,"open_issues_count":9,"forks_count":188,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-03-28T16:11:07.638Z","etag":null,"topics":["deep-learning","densenet","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bamos.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-09T15:33:23.000Z","updated_at":"2025-03-23T03:33:22.000Z","dependencies_parsed_at":"2022-08-02T20:01:05.681Z","dependency_job_id":null,"html_url":"https://github.com/bamos/densenet.pytorch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bamos%2Fdensenet.pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bamos%2Fdensenet.pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bamos%2Fdensenet.pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bamos%2Fdensenet.pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bamos","download_url":"https://codeload.github.com/bamos/densenet.pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"htt
ps://github.com","kind":"github","repositories_count":247217221,"owners_count":20903009,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","densenet","pytorch"],"created_at":"2024-07-31T22:00:27.888Z","updated_at":"2025-04-04T17:09:43.759Z","avatar_url":"https://github.com/bamos.png","language":"Python","funding_links":[],"categories":["Papers\u0026Codes","Python","Paper implementations｜论文实现","Paper implementations","Densely Connected Convolutional Networks","Paper Implementations"],"sub_categories":["DenseNet","Other libraries｜其他库:","Other libraries:","Implementations"],"readme":"# A PyTorch Implementation of DenseNet\n\nThis is a [PyTorch](http://pytorch.org/) implementation of the\nDenseNet-BC architecture as described in the\npaper [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993)\nby G. Huang, Z. Liu, K. Weinberger, and L. 
van der Maaten.\nThis implementation gets a CIFAR-10+ error rate of\n4.77 with a 100-layer DenseNet-BC with a growth rate of 12.\nTheir official implementation and links to many other\nthird-party implementations are available in the\n[liuzhuang13/DenseNet](https://github.com/liuzhuang13/DenseNet)\nrepo on GitHub.\n\n![](images/header.png)\n\n# Why DenseNet?\n\nAs this table from the DenseNet paper shows, it provides\ncompetitive state-of-the-art results on CIFAR-10,\nCIFAR-100, and SVHN.\n\n![](images/densenet-err-table.png)\n\n# Why yet another DenseNet implementation?\n\nPyTorch is a great new framework, and it's nice to have these\nkinds of re-implementations around so that they can be integrated\nwith other PyTorch projects.\n\n# How do you know this implementation is correct?\n\nInterestingly, while implementing this, I had a lot of\ntrouble getting it to converge and looked at every part\nof the code more closely than I usually would.\nI compared all of the model's hidden states and gradients\nwith the official implementation to make sure my code was correct\nand even trained a VGG-style network on CIFAR-10 with the\ntraining code here.\nIt turns out that I uncovered a new critical PyTorch\nbug (now fixed) that was causing this.\n\nI have kept my original message about how this\nwasn't working and the things that I checked\n[in this document](attic/debugging-discussion.md).\nI think it should be interesting for other people to\nsee my development and debugging strategies when\nhaving issues implementing a model that's known\nto converge.\nI also started\n[this PyTorch forum thread](https://discuss.pytorch.org/t/help-debugging-densenet-model-on-cifar-10/412),\nwhich has a few other discussion points.\nYou may also be interested in\n[my script that\ncompares PyTorch gradients to Torch gradients](https://github.com/bamos/densenet.pytorch/blob/master/attic/compare-pytorch-and-torch-grads.py)\nand\n[my script that numerically checks PyTorch 
gradients](https://github.com/bamos/densenet.pytorch/blob/master/attic/numcheck-grads.py).\n\nMy convergence issues were due to a critical PyTorch bug\nrelated to using `torch.cat` with convolutions with cuDNN\nenabled (which it is by default when CUDA is used).\nThis bug caused incorrect gradients, and the workaround\nwas to disable cuDNN (which no longer has\nto be done because the bug is fixed).\nThe oversight in my debugging strategies that caused me to\nnot find this error is that I did not think to disable cuDNN.\nUntil then, I had assumed that the cuDNN options in frameworks\nare bug-free, but I have learned that this is not always the case.\nI might also have found something if I had numerically\ndebugged `torch.cat` layers with convolutions instead of\nfully connected layers.\n\nAdam fixed the PyTorch bug that caused this in\n[this PR](https://github.com/pytorch/pytorch/pull/708),\nwhich has been merged into PyTorch's master branch.\n**If you are interested in using the DenseNet code in\nthis repository, make sure your PyTorch version\ncontains [this PR](https://github.com/pytorch/pytorch/pull/708)\nand was downloaded after 2017-02-10.**\n\n# What does the PyTorch compute graph of the model look like?\n\nYou can see the compute graph [here](images/model.png),\nwhich I created with [make_graph.py](https://github.com/bamos/densenet.pytorch/blob/master/make_graph.py),\nwhich I copied from\n[Adam Paszke's gist](https://gist.github.com/apaszke/01aae7a0494c55af6242f06fad1f8b70).\nAdam says PyTorch will soon have a better way to create\ncompute graphs.\n\n# How does this implementation perform?\n\nBy default, this repo trains a 100-layer DenseNet-BC with\na growth rate of 12 on the CIFAR-10 dataset with\ndata augmentations.\nDue to GPU memory sizes, this is the largest model I am able to run.\nThe paper reports a final test error of 4.51 with this\narchitecture and we obtain a final test error of 4.77.\n\n![](images/sgd-loss-error.png)\n\n# Why don't people use 
ADAM instead of SGD for training ResNet-style models?\n\nI also tried training a net with ADAM and found that it didn't\nconverge as well with the default hyper-parameters compared\nto SGD with a reasonable learning rate schedule.\n\n![](images/adam-loss-error.png)\n\n# What about the non-BC version?\n\nI haven't tested this as thoroughly, so you should make sure\nit's working as expected if you plan to use and modify it.\nLet me know if you find anything wrong with it.\n\n# A paradigm for ML code\n\nI like to include a few features in my projects\nthat I don't see in some other re-implementations.\nThey are present in this repo.\nThe training code in `train.py` uses `argparse` so the batch size\nand some other hyper-params can easily be changed.\nAs the model trains, progress is written\nout to CSV files in a work directory also defined\nby the arguments.\nThen a separate script `plot.py` plots the\nprogress written out by the training script.\nThe training script calls `plot.py` after every epoch,\nbut, importantly, it can also be run on its own so figures\ncan be tweaked without re-running the entire experiment.\n\n# Help wanted: Improving memory utilization and multi-GPU support\n\nI think there are ways to improve the memory utilization\nin this code as in\n[the official space-efficient Torch implementation](https://github.com/gaohuang/DenseNet_lite).\nI would also be interested in multi-GPU support.\n\n# Running the code and viewing convergence\n\nFirst install PyTorch (ideally in an anaconda3 distribution).\n[./train.py](./train.py) will create a model, start training it,\nand save progress to `args.save`, which is\n`work/cifar10.base` by default.\nThe training script will call [plot.py](./plot.py) after\nevery epoch to create plots from the saved progress.\n\n# Citations\n\nThe following is a [BibTeX](http://www.bibtex.org/)\nentry for the DenseNet paper that you should cite\nif you use this model.\n\n```\n@article{Huang2016Densely,\n  author = {Huang, Gao 
and Liu, Zhuang and Weinberger, Kilian Q.},\n  title = {Densely Connected Convolutional Networks},\n  journal = {arXiv preprint arXiv:1608.06993},\n  year = {2016}\n}\n```\n\nIf you use this implementation, please also consider citing this implementation and\ncode repository with the following BibTeX or plaintext entry.\nThe BibTeX entry requires the `url` LaTeX package.\n\n```\n@misc{amos2017densenet,\n  title = {{A PyTorch Implementation of DenseNet}},\n  author = {Amos, Brandon and Kolter, J. Zico},\n  howpublished = {\\url{https://github.com/bamos/densenet.pytorch}},\n  note = {Accessed: [Insert date here]}\n}\n\nBrandon Amos, J. Zico Kolter\nA PyTorch Implementation of DenseNet\nhttps://github.com/bamos/densenet.pytorch.\nAccessed: [Insert date here]\n```\n\n# Licensing\n\nThis repository is\n[Apache-licensed](https://github.com/bamos/densenet.pytorch/blob/master/LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbamos%2Fdensenet.pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbamos%2Fdensenet.pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbamos%2Fdensenet.pytorch/lists"}