{"id":16938709,"url":"https://github.com/dsnet/matrix-transpose","last_synced_at":"2026-03-10T09:02:06.122Z","repository":{"id":132959750,"uuid":"41650329","full_name":"dsnet/matrix-transpose","owner":"dsnet","description":"Experiments in the efficient transpose of bit-matrices.","archived":false,"fork":false,"pushed_at":"2015-08-31T02:04:28.000Z","size":232,"stargazers_count":10,"open_issues_count":1,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-13T15:34:56.459Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dsnet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-08-31T01:56:05.000Z","updated_at":"2025-04-03T07:10:26.000Z","dependencies_parsed_at":"2023-03-10T01:37:22.915Z","dependency_job_id":null,"html_url":"https://github.com/dsnet/matrix-transpose","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dsnet/matrix-transpose","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dsnet%2Fmatrix-transpose","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dsnet%2Fmatrix-transpose/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dsnet%2Fmatrix-transpose/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dsnet%2Fmatrix-transpose/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dsnet","download_url":"https://codeload.github.com/dsnet/matrix-transpose/tar.gz/refs/heads/master","sbom_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/dsnet%2Fmatrix-transpose/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30328269,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-10T05:25:20.737Z","status":"ssl_error","status_checked_at":"2026-03-10T05:25:17.430Z","response_time":106,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-13T21:02:14.041Z","updated_at":"2026-03-10T09:02:06.047Z","avatar_url":"https://github.com/dsnet.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Bit-Matrix Transpose #\n\n## Introduction ##\n\n![matrix-transpose-intro](doc/matrix-transpose-intro.png)\n\nThis code was written to test an algorithm to perform transpose on a bit-matrix.\nIn this example, the transpose is performed by swapping all bits in a square\nbit-matrix across the top-right to bottom-left diagonal.\n\nThe application of this is in embedded applications to take advantage of a\nmodulation technique called\n[Binary Code Modulation](http://www.batsocks.co.uk/readme/art_bcm_1.htm). This\ntechnique allows the value of a register to control the duty-cycle and only\nrequires a single hardware timer to control multiple outputs.\n\nThus in BCM usage, imagine that 8x 8-bit registers contains the values of the\nintended output duty-cycle for 8x output pins. 
The problem is that when\noutputting the values, the lowest bit of each register should be written\ntogether to the output, followed by the second lowest bit, and so on.\nThe naive way of doing this, masking and shifting one bit at a time, would be\nfar too inefficient. Thus, by coming up with an efficient way to do bit-matrix\ntransposes, we can transform the values in the registers such that the first\nregister now contains the lowest bit of all outputs, the second register\ncontains the second lowest bit of all outputs, and so on. Now, all that needs\nto be done is to write each register to the output whenever the hardware timer\ntriggers an interrupt.\n\n\n## Theory ##\n\nThis technique exploits the fact that an 8-bit microcontroller almost always has\nan 8-bit ALU. Rather than using bit-shifts and bit-masks to transpose one bit\nat a time, the ALU is used to shift 8 bits at a time. Note that the number of\nbit swaps does not go down; rather, this method exploits a form of\nparallelism by utilizing the properties of the hardware.\n\nThe code example committed is generalized to operate on any n-bit matrix. Since\nmodern processors are now 64-bit, this allows efficient transposing of 8b, 16b,\n32b, and 64b square bit-matrices. The operational complexity to perform a\ntranspose is _O(n\\*log(n))_ as opposed to _O(n\\*n)_ without this method.\n\n![matrix-transpose-method](doc/matrix-transpose-method.png)\n\nThe above figure shows a visual representation of how recursively swapping in\nsmaller and smaller square blocks can transpose the entire bit-matrix.\nThis specific solution starts by swapping large 4x4 blocks and ends by\nswapping small 1x1 blocks. Please note that the order in which the blocks are\nswapped does not affect the end result. It is perfectly valid to swap by 4x4\nfirst, then by 1x1, and then finally by 2x2. 
This would be harder to code, but\nit is a valid solution.\n\nInterestingly, the bit-masks needed to select the 4x4, 2x2, and 1x1 blocks look\nsomething like the following:\n```c\nuint8_t mask_4 = 0b11110000;\nuint8_t mask_2 = 0b11001100;\nuint8_t mask_1 = 0b10101010;\n```\n\nHard-coding these mask constants works, but would not scale well if this method\nwere applied to 16b, 32b, or 64b bit-matrices. Looking at the pattern generated,\none can notice that the transition frequency of each mask is twice that of\nthe previous one. This pattern can be easily generated by starting with a mask\nfilled with all 1's. Take that mask and XOR it with a copy of itself shifted\nright by half the current width (a 90° phase shift of the pattern). Repeat,\nhalving the shift each time, until the shift width reaches 1. One further step\nwould XOR the mask with itself and leave it all zeroes.\n\nThe following C code demonstrates how to generate the masks on each iteration\nof the loop:\n```c\nint width = 8;\nuint8_t mask = 0b11111111;\nwhile (width \u003e 1) {\n\twidth \u003e\u003e= 1;\n\t// mask takes the values 0b11110000, 0b11001100, 0b10101010 in turn\n\tmask ^= mask \u003e\u003e width;\n}\n```\n\n\n## References ##\n\n* [Binary code modulation](http://www.batsocks.co.uk/readme/art_bcm_1.htm)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdsnet%2Fmatrix-transpose","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdsnet%2Fmatrix-transpose","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdsnet%2Fmatrix-transpose/lists"}