{"id":15065043,"url":"https://github.com/zfturbo/classification_models_1d","last_synced_at":"2025-09-21T21:57:38.153Z","repository":{"id":44412462,"uuid":"483825527","full_name":"ZFTurbo/classification_models_1D","owner":"ZFTurbo","description":"Classification models 1D Zoo - Keras and TF.Keras","archived":false,"fork":false,"pushed_at":"2024-07-18T06:20:57.000Z","size":156,"stargazers_count":27,"open_issues_count":3,"forks_count":2,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-09-03T18:34:10.234Z","etag":null,"topics":["1d-cnn","1d-convolution","1d-model","audio-processing","keras","keras-tensorflow","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ZFTurbo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-04-20T21:57:21.000Z","updated_at":"2025-06-07T22:09:27.000Z","dependencies_parsed_at":"2025-02-17T07:41:23.210Z","dependency_job_id":null,"html_url":"https://github.com/ZFTurbo/classification_models_1D","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/ZFTurbo/classification_models_1D","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZFTurbo%2Fclassification_models_1D","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZFTurbo%2Fclassification_models_1D/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZFTurbo%2Fclassification_models_1D/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZFTurbo%2Fclassification_models_1D/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ZFTurbo","download_url":"https://codeload.github.com/ZFTurbo/classification_models_1D/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZFTurbo%2Fclassification_models_1D/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":276312988,"owners_count":25620627,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-21T02:00:07.055Z","response_time":72,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["1d-cnn","1d-convolution","1d-model","audio-processing","keras","keras-tensorflow","tensorflow"],"created_at":"2024-09-25T00:30:01.414Z","updated_at":"2025-09-21T21:57:38.138Z","avatar_url":"https://github.com/ZFTurbo.png","language":"Python","readme":"# Classification models 1D Zoo - Keras and TF.Keras\n\nThis repository contains 1D 
This repository contains 1D variants of popular CNN classification models such as ResNet, DenseNet, and VGG, together with weights obtained by converting the ImageNet weights of the corresponding 2D models.
It can be useful for classifying audio or other time-series data.

This repository is based on the great [classification_models](https://github.com/qubvel/classification_models) repo by [@qubvel](https://github.com/qubvel/).

### Architectures:
- [VGG](https://arxiv.org/abs/1409.1556) [16, 19]
- [ResNet](https://arxiv.org/abs/1512.03385) [18, 34, 50, 101, 152]
- [ResNeXt](https://arxiv.org/abs/1611.05431) [50, 101]
- [SE-ResNet](https://arxiv.org/abs/1709.01507) [18, 34, 50, 101, 152]
- [SE-ResNeXt](https://arxiv.org/abs/1709.01507) [50, 101]
- [SE-Net](https://arxiv.org/abs/1709.01507) [154]
- [DenseNet](https://arxiv.org/abs/1608.06993) [121, 169, 201]
- [Inception ResNet V2](https://arxiv.org/abs/1602.07261)
- [Inception V3](http://arxiv.org/abs/1512.00567)
- [MobileNet](https://arxiv.org/pdf/1704.04861.pdf)
- [MobileNet v2](https://arxiv.org/abs/1801.04381)
- [EfficientNet](https://arxiv.org/abs/1905.11946)
- [EfficientNet v2](https://arxiv.org/abs/2104.00298)

### Installation

`pip install classification-models-1D`

### Examples

##### Loading a model:

```python
from classification_models_1D.tfkeras import Classifiers

ResNet18, preprocess_input = Classifiers.get('resnet18')
model = ResNet18(input_shape=(224*224, 2), weights='imagenet')
```

All possible nets for the `Classifiers.get()` method:

Based on Conv1D: `'resnet18', 'resnet34', 'resnet50', 'resnet101', 'resnet152', 'seresnet18', 'seresnet34', 'seresnet50', 'seresnet101', 'seresnet152', 'seresnext50', 'seresnext101', 'senet154', 'resnext50', 'resnext101', 'vgg16', 'vgg19', 'densenet121', 'densenet169', 'densenet201', 'mobilenet', 'mobilenetv2', 'inceptionresnetv2', 'inceptionv3', 'EfficientNetB0', 'EfficientNetB1', 'EfficientNetB2', 'EfficientNetB3', 'EfficientNetB4', 'EfficientNetB5', 'EfficientNetB6', 'EfficientNetB7', 'EfficientNetV2B0', 'EfficientNetV2B1', 'EfficientNetV2B2', 'EfficientNetV2B3', 'EfficientNetV2S', 'EfficientNetV2M', 'EfficientNetV2L'`

Non-standard nets (Conv1D): `resnet18_pool8`

Based on spectrograms and Conv2D: `'EfficientNetB0_spectre', 'EfficientNetB1_spectre', 'EfficientNetB2_spectre', 'EfficientNetB3_spectre', 'EfficientNetB4_spectre', 'EfficientNetB5_spectre', 'EfficientNetB6_spectre', 'EfficientNetB7_spectre'`

### Convert ImageNet weights (2D -> 1D)

Code to convert 2D ImageNet weights to the 1D variant is available here: [convert_imagenet_weights_to_1D_models.py](convert_imagenet_weights_to_1D_models.py).

### How to choose input shape

If the initial 2D model had input shape (224, 224, 3), use a 1D shape (W, C) with `W ~= 224*224`, so something like (224*224, 2) will work well.
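As a quick illustration of this rule, here is a minimal sketch (not part of the package) that pads or trims a stereo signal to roughly 224*224 samples before prediction. The `audio` array is random data standing in for a real recording, and `preprocess_input` is simply the preprocessing function returned by `Classifiers.get` above:

```python
import numpy as np
from classification_models_1D.tfkeras import Classifiers

ResNet18, preprocess_input = Classifiers.get('resnet18')

W = 224 * 224  # roughly the number of "pixels" the original 2D model saw
model = ResNet18(input_shape=(W, 2), weights='imagenet')

# Placeholder stereo recording of arbitrary length, shape (n_samples, 2)
audio = np.random.randn(120000, 2).astype(np.float32)

# Zero-pad or trim so the length matches the chosen model input
if audio.shape[0] < W:
    audio = np.pad(audio, ((0, W - audio.shape[0]), (0, 0)))
else:
    audio = audio[:W]

x = preprocess_input(audio)    # preprocessing function returned by Classifiers.get
x = np.expand_dims(x, 0)       # add batch dimension -> (1, W, 2)
preds = model.predict(x)       # predictions from the ImageNet-converted head
```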
### Additional features

* The default pooling/stride size for the 1D models is set to 4 to match the (2, 2) pooling of the 2D nets, and the default kernel size is 9 to match their (3, 3) kernels. You can change both for your needs via the `stride_size` and `kernel_size` parameters. Example:

```python
from classification_models_1D.tfkeras import Classifiers

ResNet18, preprocess_input = Classifiers.get('resnet18')
model = ResNet18(
    input_shape=(224*224, 2),
    stride_size=6,
    kernel_size=3,
    weights=None
)
```

* You can set a different pooling size for each pooling block. For example, if you don't need pooling after the first convolution, pass a tuple as the value of `stride_size`:

```python
from classification_models_1D.tfkeras import Classifiers

ResNet34, preprocess_input = Classifiers.get('resnet34')
model = ResNet34(
    input_shape=(65536, 2),
    stride_size=(1, 4, 4, 8, 8),
    kernel_size=9,
    weights='imagenet'
)
```

* For some models (resnet, resnext, senet, vgg16, vgg19, densenet) it's possible to change the number of blocks/poolings, for example if you want to switch to pooling/stride = 2 but apply more poolings overall. You can do it like this:

```python
from classification_models_1D.tfkeras import Classifiers

ResNet34, preprocess_input = Classifiers.get('resnet34')
model = ResNet34(
    input_shape=(224*224, 2),
    include_top=False,
    weights=None,
    stride_size=(2, 4, 4, 4, 2, 2, 2, 2),
    kernel_size=3,
    repetitions=(2, 2, 2, 2, 2, 2, 2),
    init_filters=16,
)
```

**Note**: Since the number of filters doubles from block to block, you can control the initial number of filters with the `init_filters` parameter.

### Pretrained weights

#### ImageNet weights

ImageNet weights are available for all models except 'inceptionresnetv2' and 'inceptionv3'. They are available only for `kernel_size == 3` or `kernel_size == 9` and 2-channel input (e.g. stereo sound). The weights were converted from the 2D models to their 1D variants and can be loaded with any pooling scheme.

#### AudioSet weights

[AudioSet](https://research.google.com/audioset/) is a large audio dataset for multilabel classification over 527 different classes. All available data was used for training: around 1.9 million audio tracks, each roughly 10 seconds long.
* AudioSet weights were obtained with the default parameters `kernel_size = 9`, `stride_size = (4, 4, 4, 4, 4)`.
* Random class sampling was used during training: to form a batch, first choose a random class, then choose a random sample that contains this class (see the sketch after this list).
* Validation data can be found here: [AudioSet validation](https://www.kaggle.com/datasets/zfturbo/audioset-valid).
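For illustration, a minimal sketch of that sampling scheme, assuming a hypothetical multi-hot `labels` array of shape `(n_tracks, 527)` (this helper is not part of the package):

```python
import numpy as np

def sample_batch_indices(labels, batch_size, rng=None):
    """Pick `batch_size` track indices: draw a random class first,
    then a random track that actually contains that class."""
    rng = rng or np.random.default_rng()
    n_classes = labels.shape[1]
    indices = []
    for _ in range(batch_size):
        while True:
            cls = rng.integers(n_classes)                 # 1) random class
            candidates = np.flatnonzero(labels[:, cls])   # tracks containing it
            if candidates.size > 0:                       # skip classes with no samples
                break
        indices.append(int(rng.choice(candidates)))       # 2) random track with that class
    return np.array(indices)
```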
Quality table below:

| Model name | Eval mAP (macro) | Eval mAP (micro) | Eval AUC (macro) | Eval AUC (local) | Eval LL | Eval Acc (macro) | Eval Acc (per sample) |
| :--------: | :--------------: | :---------------: | :--------------: | :--------------: | :-----: | :--------------: | :-------------------: |
| resnet18 | 0.2812 | 0.3712 | 0.9541 | 0.9666 | 8.5059 | 0.2401 | 0.2372 |
| resnet34 | 0.3350 | 0.4390 | 0.9594 | 0.9705 | 8.1962 | 0.2769 | 0.2787 |
| EfficientNetB5 | 0.3514 | 0.4725 | 0.9662 | 0.9767 | 8.0650 | 0.2832 | 0.2873 |
| EfficientNetV2L | 0.3307 | 0.4559 | 0.9608 | 0.9726 | 8.3544 | 0.2642 | 0.2648 |
| resnet18_pool8 | 0.3125 | 0.4318 | 0.9602 | 0.9718 | 8.3810 | 0.2596 | 0.2576 |
| EfficientNetB5_spectre | 0.3801 | 0.5056 | 0.9695 | 0.9787 | 7.7415 | 0.3167 | 0.3295 |
| Ensemble (EfficientNetB5 + EfficientNetB5_spectre) | 0.4046 | 0.5215 | 0.9737 | 0.9821 | 7.4294 | 0.3059 | 0.3104 |

### Model comparison list

| Model name | Number of params (millions) | Req. memory for 1 sample (GB) | Time to process one sample (sec) |
| :--------: | :-------------------------: | :---------------------------: | :------------------------------: |
| resnet18 | 11 | 0.416 | 0.1450 |
| resnet34 | 21 | 0.639 | 0.2680 |
| resnet50 | 23 | 1.380 | 0.3950 |
| resnet101 | 42 | 2.094 | 0.5375 |
| resnet152 | 58 | 2.946 | 0.7941 |
| seresnet18 | 11 | 0.441 | 0.1283 |
| seresnet34 | 21 | 0.685 | 0.2287 |
| seresnet50 | 26 | 1.534 | 0.3108 |
| seresnet101 | 47 | 2.368 | 0.5387 |
| seresnet152 | 64 | 3.366 | 0.7853 |
| seresnext50 | 25 | 2.202 | 0.5495 |
| seresnext101 | 47 | 3.345 | 0.9465 |
| senet154 | 113 | 6.132 | 2.7225 |
| resnext50 | 23 | 2.015 | 0.7168 |
| resnext101 | 42 | 3.037 | 0.9152 |
| vgg16 | 14 | 0.552 | 0.6331 |
| vgg19 | 20 | 0.614 | 0.7746 |
| densenet121 | 7 | 1.656 | 0.4552 |
| densenet169 | 12 | 2.010 | 0.5861 |
| densenet201 | 18 | 2.595 | 0.7707 |
| mobilenet | 3 | 0.563 | 0.1101 |
| mobilenetv2 | 2 | 0.722 | 0.1391 |
| inceptionresnetv2 | 80 | 2.046 | 0.7017 |
| inceptionv3 | 41 | 0.833 | 0.3453 |
| EfficientNetB0 | 3 | 0.825 | 0.2259 |
| EfficientNetB1 | 6 | 1.142 | 0.3066 |
| EfficientNetB2 | 7 | 1.198 | 0.3217 |
| EfficientNetB3 | 10 | 1.590 | 0.4202 |
| EfficientNetB4 | 17 | 2.082 | 0.5470 |
| EfficientNetB5 | 27 | 2.870 | 0.7400 |
| EfficientNetB6 | 40 | 3.685 | 0.9357 |
| EfficientNetB7 | 63 | 4.955 | 1.2509 |
| EfficientNetV2B0 | 5 | 0.535 | 0.1710 |
| EfficientNetV2B1 | 6 | 0.698 | 0.2207 |
| EfficientNetV2B2 | 8 | 0.759 | 0.2526 |
| EfficientNetV2B3 | 12 | 0.958 | 0.3317 |
| EfficientNetV2S | 20 | 1.396 | 0.4392 |
| EfficientNetV2M | 53 | 2.340 | 0.7458 |
| EfficientNetV2L | 117 | 4.205 | 1.3081 |
| EfficientNetB0_spectre | 4 | 0.029 | 0.1647 |
| EfficientNetB1_spectre | 6 | 0.039 | 0.2184 |
| EfficientNetB2_spectre | 7 | 0.043 | 0.2220 |
| EfficientNetB3_spectre | 10 | 0.055 | 0.2915 |
| EfficientNetB4_spectre | 17 | 0.081 | 0.3644 |
| EfficientNetB5_spectre | 28 | 0.121 | 0.4704 |
| EfficientNetB6_spectre | 40 | 0.168 | 0.5964 |
| EfficientNetB7_spectre | 64 | 0.254 | 0.7912 |

* **Note**: Required memory is for an input shape of (441000, 2), i.e. classification of 10 seconds of stereo audio at 44.1 kHz (as in AudioSet).

### Related repositories

* [https://github.com/qubvel/classification_models](https://github.com/qubvel/classification_models) - original 2D repo
* [https://github.com/ZFTurbo/classification_models_3D](https://github.com/ZFTurbo/classification_models_3D) - 3D variant repo

### ToDo List

* Create pretrained weights obtained on [AudioSet](https://research.google.com/audioset/)