{"id":13419214,"url":"https://github.com/wenwei202/caffe","last_synced_at":"2025-03-15T05:30:34.542Z","repository":{"id":83105199,"uuid":"38183099","full_name":"wenwei202/caffe","owner":"wenwei202","description":"Caffe for Sparse and Low-rank Deep Neural Networks","archived":false,"fork":false,"pushed_at":"2020-03-08T18:33:18.000Z","size":73001,"stargazers_count":375,"open_issues_count":23,"forks_count":134,"subscribers_count":35,"default_branch":"master","last_synced_at":"2024-07-31T22:46:16.587Z","etag":null,"topics":["acceleration","caffe","compression","deep-neural-networks","low-rank-approximation","sparse-convolution","sparsity"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wenwei202.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2015-06-28T02:19:00.000Z","updated_at":"2024-07-30T02:11:58.000Z","dependencies_parsed_at":null,"dependency_job_id":"0d557e38-7b88-4041-bcd4-0cc772040374","html_url":"https://github.com/wenwei202/caffe","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wenwei202%2Fcaffe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wenwei202%2Fcaffe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wenwei202%2Fcaffe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wenwei202%2Fcaffe/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wenwei202","do
wnload_url":"https://codeload.github.com/wenwei202/caffe/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243690113,"owners_count":20331726,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["acceleration","caffe","compression","deep-neural-networks","low-rank-approximation","sparse-convolution","sparsity"],"created_at":"2024-07-30T22:01:12.867Z","updated_at":"2025-03-15T05:30:33.461Z","avatar_url":"https://github.com/wenwei202.png","language":"C++","readme":"# ABOUT\n## Repo summary\n### Lower-rank deep neural networks (ICCV 2017)\nPaper: [Coordinating Filters for Faster Deep Neural Networks](http://openaccess.thecvf.com/content_ICCV_2017/papers/Wen_Coordinating_Filters_for_ICCV_2017_paper.pdf).\n\nThe [poster](/docs/ICCV17-Poster.pdf) is available.\n\nThe source code is in this master branch.\n\n### Sparse Deep Neural Networks (NIPS 2016)\nSee the source code in branch [scnn](https://github.com/wenwei202/caffe/tree/scnn).\n\n### (NIPS 2017 Oral) Ternary Gradients to Reduce Communication in Distributed Deep Learning\nA work to accelerate training. 
[code](https://github.com/wenwei202/terngrad)\n\n### Direct sparse convolution and guided pruning (ICLR 2017)\nOriginally in branch [intel](https://github.com/wenwei202/caffe/tree/intel), but merged into [IntelLabs/SkimCaffe](https://github.com/IntelLabs/SkimCaffe) with contributions also by @jspark1105.\n\n### Caffe version\nThe master branch is based on Caffe commit [eb4ba30](https://github.com/BVLC/caffe/commit/eb4ba30e3c4899edc7a9713158d61503fa8ecf90).\n\n## Lower-rank deep neural networks (ICCV 2017)\nTutorials on using Python to decompose DNNs into low-rank space are [here](/python).\n\nFor any problems, bugs, or questions, you are welcome to open an issue and we will respond as soon as possible.\n\nDetails of Force Regularization are in the paper: [Coordinating Filters for Faster Deep Neural Networks](https://arxiv.org/abs/1703.09746).\n\n### Training with Force Regularization for Lower-rank DNNs\nIt is easy to use the code to train DNNs toward lower-rank versions.\nOnly three additional protobuf configurations are required:\n\n1. `force_decay` in `SolverParameter`: Specified in the solver. The coefficient that trades off accuracy against ranks: the larger `force_decay` is, the smaller the ranks and usually the lower the accuracy.\n2. `force_type` in `SolverParameter`: Specified in the solver. The kind of force used to coordinate filters. `Degradation` - the strength of the pairwise attractive force decreases as the distance decreases; this is the L2-norm force in the paper. `Constant` - the strength of the pairwise attractive force stays constant regardless of the distance; this is the L1-norm force in the paper.\n3. `force_mult` in `ParamSpec`: Specified for the `param` of the weights in each layer. The local multiplier of `force_decay` for filters in a specific layer, i.e., `force_mult*force_decay` is the final coefficient for the specific layer. 
You can set `force_mult: 0.0` to eliminate force regularization in any layer.\n\nSee details and implementations in [caffe.proto](/src/caffe/proto/caffe.proto#L190-L193) and [SGDSolver](/src/caffe/solvers/sgd_solver.cpp#L223).\n\n### Examples\nAn example of training LeNet with L1-norm force regularization:\n\n```\n##############################################################\\\n# The train/test net with local force decay multiplier       \nnet: \"examples/mnist/lenet_train_test_force.prototxt\"        \n##############################################################/\n\ntest_iter: 100\ntest_interval: 500\n# The base learning rate. For large-scale DNNs, you might try a base_lr 0.1x smaller than the one used to train the original DNNs from scratch.\nbase_lr: 0.01\nmomentum: 0.9\nweight_decay: 0.0005\n\n##############################################################\\\n# The coefficient of force regularization.                   \n# The hyper-parameter to tune for the trade-off              \nforce_decay: 0.001                                           \n# The type of force - L1-norm force                          \nforce_type: \"Constant\"                                       \n##############################################################/\n\n# The learning rate policy\nlr_policy: \"multistep\"\ngamma: 0.9\nstepvalue: 5000\nstepvalue: 7000\nstepvalue: 8000\nstepvalue: 9000\nstepvalue: 9500\n# Display every 100 iterations\ndisplay: 100\n# The maximum number of iterations\nmax_iter: 10000\n# Snapshot intermediate results\nsnapshot: 5000\nsnapshot_prefix: \"examples/mnist/lower_rank_lenet\"\nsnapshot_format: HDF5\nsolver_mode: GPU\n```\n\nRetraining a trained DNN with force regularization may yield better results compared with training from scratch.\n\n### Hyperparameter\nWe included the hyperparameter \"lambda_s\" for AlexNet in [Figure 6](https://arxiv.org/pdf/1703.09746.pdf). 
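\n\nAs a concrete sketch of how the per-layer multiplier attaches to a layer (the layer configuration below is illustrative, not taken from this repo), the weight `param` of a convolution layer in the net prototxt carries its own `force_mult`:\n\n```\nlayer {\n  name: \"conv1\"\n  type: \"Convolution\"\n  bottom: \"data\"\n  top: \"conv1\"\n  # param for the weights: the effective coefficient is force_mult*force_decay\n  param {\n    lr_mult: 1\n    force_mult: 1.0\n  }\n  # param for the biases\n  param {\n    lr_mult: 2\n  }\n  convolution_param {\n    num_output: 20\n    kernel_size: 5\n    stride: 1\n  }\n}\n```\n\nSetting `force_mult: 0.0` in the weight `param` of a layer turns force regularization off for that layer only, while the global `force_decay` still applies elsewhere.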
\n\n### Some open research topics\nForce Regularization can squeeze/coordinate weight information into a much lower-rank space, but after low-rank decomposition at the same approximation precision, it is more challenging to recover accuracy from the much more lightweight DNNs.\n\n## License and Citation\nPlease cite our ICCV paper and Caffe if they are useful for your research:\n\n    @InProceedings{Wen_2017_ICCV,\n      author = {Wen, Wei and Xu, Cong and Wu, Chunpeng and Wang, Yandan and Chen, Yiran and Li, Hai},\n      title = {Coordinating Filters for Faster Deep Neural Networks},\n      booktitle = {The IEEE International Conference on Computer Vision (ICCV)},\n      month = {October},\n      year = {2017}\n    }\n\nCaffe is released under the [BSD 2-Clause license](https://github.com/BVLC/caffe/blob/master/LICENSE).\nThe BVLC reference models are released for unrestricted use.\n\nPlease cite Caffe in your publications if it helps your research:\n\n    @article{jia2014caffe,\n      Author = {Jia, Yangqing and Shelhamer, Evan and Donahue, Jeff and Karayev, Sergey and Long, Jonathan and Girshick, Ross and Guadarrama, Sergio and Darrell, Trevor},\n      Journal = {arXiv preprint arXiv:1408.5093},\n      Title = {Caffe: Convolutional Architecture for Fast Feature Embedding},\n      Year = {2014}\n    }\n","funding_links":[],"categories":["TODO scan for Android support in followings","**Conference Papers**"],"sub_categories":["**ICCV 2017**"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwenwei202%2Fcaffe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwenwei202%2Fcaffe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwenwei202%2Fcaffe/lists"}