{"id":13658839,"url":"https://github.com/COMP1511UNSW/dcc","last_synced_at":"2025-04-24T11:32:58.180Z","repository":{"id":43881915,"uuid":"133503905","full_name":"COMP1511UNSW/dcc","owner":"COMP1511UNSW","description":"dcc - a C compiler which explains errors to novice programmers","archived":false,"fork":false,"pushed_at":"2024-12-05T05:33:58.000Z","size":1061,"stargazers_count":157,"open_issues_count":28,"forks_count":18,"subscribers_count":17,"default_branch":"master","last_synced_at":"2024-12-05T06:27:35.036Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/COMP1511UNSW.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-05-15T11:05:57.000Z","updated_at":"2024-12-05T05:34:00.000Z","dependencies_parsed_at":"2023-11-20T06:28:36.414Z","dependency_job_id":"0c04f41a-c318-41ba-ba36-19b0f727a39f","html_url":"https://github.com/COMP1511UNSW/dcc","commit_stats":null,"previous_names":[],"tags_count":65,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/COMP1511UNSW%2Fdcc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/COMP1511UNSW%2Fdcc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/COMP1511UNSW%2Fdcc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/COMP1511UNSW%2Fdcc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/COMP1511UNSW","download_url":"https://codeload.
github.com/COMP1511UNSW/dcc/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250618697,"owners_count":21460139,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T05:01:03.057Z","updated_at":"2025-04-24T11:32:58.098Z","avatar_url":"https://github.com/COMP1511UNSW.png","language":"Python","readme":"\n# Introduction\n\ndcc helps novice C programmers by catching common errors and providing easy-to-understand explanations.\n\nFor example:\n\ndcc adds extra runtime checking for errors and prints information\nlikely to be helpful to novice programmers, including\nprinting values of variables and expressions.\nRun-time checking includes array indices, for example:\n\n```\n$ gcc count_zero.c\n$ ./a.out\n9\n$ dcc count_zero.c\n$ ./a.out\ncount_zero.c.c:7:7: runtime error - index 10 out of bounds for type 'int [10]'\ndcc explanation: You are using an illegal array index: 10\n  Valid indices for an array of size 10 are 0..9\n  Make sure the size of your array is correct.\n  Make sure your array indices are correct.\nExecution stopped in main() in count_zero.c at line 7:\n\nint main(void) {\n\tint numbers[10] = {0};\n\tint count = 0;\n\tfor (int i = 1; i \u003c= 10; i++) {\n--\u003e\t\tif (numbers[i] \u003e 0) {\n\t\t\tcount++;\n\t\t}\n\t}\n\nValues when execution stopped:\ncount = 0\ni = 10\nnumbers = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}\nnumbers[i] = \u003cuninitialized value\u003e\n```\n\nRun-time checking also includes pointer dereferences, for example:\n\n```\n$ gcc linked_list.c\n$ 
a.out\nSegmentation fault (core dumped)\n$ dcc linked_list.c\n$ a.out\nlinked_list.c:12:15: runtime error - accessing a field via a NULL pointer\ndcc explanation: You are using a pointer which is NULL\n  A common error is  using p-\u003efield when p == NULL.\nExecution stopped in main() in linked_list.c at line 12:\n\nint main(void) {\n    struct list_node s = {0};\n    struct list_node *a = \u0026s;\n    while (a != NULL) {\n--\u003e     a-\u003enext-\u003edata = 42;\n        a = a-\u003enext;\n    }\n}\n\nValues when execution stopped:\ns = {next = NULL, data = 0}\na-\u003enext = NULL\n```\n\ndcc also embeds code to detect use of uninitialized variables, for example:\n\n```\n$ gcc uninitialised_variable.c\n$ a.out\n0\n$ dcc uninitialised_variable.c\n$ a.out\nRuntime error: uninitialized variable accessed.\nExecution stopped in main() in uninitialised_variable.c at required:\n\nint main(void) {\n    int numbers[10];\n    for (int i = 1; i \u003c 10; i++) {\n        numbers[i] = i;\n    }\n--\u003e printf(\"%d\\n\", numbers[0]);\n}\n\nValues when execution stopped:\nnumbers = {\u003cuninitialized value\u003e,1,2,3,4,5,6,7,8,9}\nnumbers[0] = \u003cuninitialized value\u003e\n```\n\ndcc compiles C programs using clang and adds explanations suitable for novice programmers\nto compiler messages novice programmers are likely to encounter and not understand.\nFor example:\n\n```\n$ dcc a.c\na.c:3:15: warning: address of stack memory associated with local variable 'counter' returned [-Wreturn-stack-address]\n        return \u0026counter;\n\ndcc explanation: you are trying to return a pointer to the local variable 'counter'.\n  You can not do this because counter will not exist after the function returns.\n  See more information here: https://comp1511unsw.github.io/dcc/stack_use_after_return.html\n```\n\nUninitialized variables are detected by running valgrind simultaneously as a separate process.\n\nThe synchronisation of the 2 processes is only effective for the standard 
C library (signal.h and threads.h excepted),\nwhich should include almost all typical programs written by novice programmers.\nIf synchronisation is lost the 2nd process should terminate silently.\n\nIf libraries other than the standard C library are used, detection of uninitialized variables does not occur.\n\n# Leak checking\n\ndcc can also embed code to check for memory leaks:\n\n```\n$ dcc  --leak-check leak.c\n$ ./a.out\nError: free not called for memory allocated with malloc in function main in leak.c at line 3.\n```\n\n# Runtime Helper Script\n\nAfter reporting a runtime error an executable produced by `dcc` can optionally run an external program.\n\nAfter reporting a runtime error a `dcc` executable checks if an executable\nnamed **dcc-runtime-helper** exists in `$PATH` and if so runs it.\n\nAn alternate name for the executable file can be supplied in the environment variable `DCC_RUNTIME_HELPER`.\n\nThe helper executable is run with a different working directory to the original executable.\nIt is run in a temporary directory created by the dcc executable which contains the source\nto the original executable and dcc infrastructure files.\n\nThese environment variables are supplied to the helper script. 
They may be empty.\n\n- `DCC_PWD` - the original directory where the executable was run\n- `HELPER_FILENAME` - source filename where error occurred\n- `HELPER_LINE_NUMBER` - source line number where error occurred\n- `HELPER_COLUMN` - source column where error occurred\n- `HELPER_SOURCE` - source lines surrounding error\n- `HELPER_CALL_STACK` - function call stack\n- `HELPER_VARIABLES` - current values of variables near the error location\n- `HELPER_JSON` - above variables encoded as JSON\n\n# Compile Helper Script\n\nAfter reporting a compiler message `dcc` can optionally run an external program.\n\nAfter reporting a compiler message `dcc` checks if an executable\nnamed **dcc-compile-helper** exists in `$PATH` and if so runs it.\n\nAn alternate name for the executable file can be supplied in the environment variable `DCC_COMPILE_HELPER`.\n\nThese environment variables are supplied to the helper script. They may be empty.\n\n- `LOGGER_ARGV` - compiler command-line arguments\n- `LOGGER_RETURNCODE` - compiler exit status\n- `LOGGER_JSON` - above variables encoded as JSON\n\n# Compile Logger Script\n\nAfter completing a compilation `dcc` can optionally log the details.\n\nAfter reporting a compiler message `dcc` checks if an executable\nnamed **dcc-compile-logger** exists in `$PATH` and if so runs it.\n\nAn alternate name for the executable file can be supplied in the environment variable `DCC_COMPILE_LOGGER`.\n\nThese environment variables are supplied to the logger script. They may be empty.\n\n- `HELPER_COMPILER_MESSAGE` - compiler message\n- `HELPER_MESSAGE_TYPE` - message type (e.g. warning)\n- `HELPER_FILENAME` - source filename where error occurred\n- `HELPER_LINE_NUMBER` - source line number where error occurred\n- `HELPER_COLUMN` - source column where error occurred\n- `HELPER_EXPLANATION` - dcc text explaining error\n- `HELPER_JSON` - above variables encoded as JSON\n\n# Output checking\n\ndcc can check that a program's output is correct.  
If a program outputs an incorrect line, the program is stopped.  A description of why the output is incorrect is printed.  The current execution location is shown with the current values of variables \u0026 expressions.\n\nThe environment variable `DCC_EXPECTED_STDOUT` should be set to the expected output.\n\nIf `DCC_IGNORE_CASE` is true, case is ignored when checking expected output.  Default false.\n\nIf `DCC_IGNORE_WHITE_SPACE` is true, white space is ignored when checking expected output.  Default false.\n\nIf `DCC_IGNORE_TRAILING_WHITE_SPACE` is true, trailing white space is ignored when checking expected output.  Default true.\n\nIf `DCC_IGNORE_EMPTY_LINES` is true, empty lines are ignored when checking expected output.  Default false.\n\nIf `DCC_COMPARE_ONLY_CHARACTERS` is set to a non-empty string, the characters not in the string are ignored when checking expected output. Newlines cannot be ignored.\n\nIf `DCC_IGNORE_CHARACTERS` is set to a non-empty string, the characters in the string are ignored when checking expected output. Newlines cannot be ignored.\n\n`DCC_IGNORE_CHARACTERS` and `DCC_IGNORE_WHITE_SPACE` take precedence over `DCC_COMPARE_ONLY_CHARACTERS`.\n\nEnvironment variables are considered true if their value is a non-empty string starting with a character other than '0', 'f' or 'F'.  
They are considered false otherwise.\n\n# Local Variable Use After Function Return Detection\n\n```\n$ dcc --use-after-return bad_function.c\n$ ./a.out\nbad_function.c:22 runtime error - stack use after return\n\ndcc explanation: You have used a pointer to a local variable that no longer exists.\n  When a function returns its local variables are destroyed.\n\nFor more information see: https://comp1511unsw.github.io/dcc//stack_use_after_return.html\nExecution stopped here in main() in bad_function at line 22:\n\n\n\tint *a = f(42);\n--\u003e\tprintf(\"%d\\n\", a[0]);\n}\n```\n\nvalgrind also usually detects this type of error, e.g.:\n\n```\n$ dcc --use_after_return bad_function.c\n$ ./a.out\nRuntime error: access to function variables after function has returned\nYou have used a pointer to a local variable that no longer exists.\nWhen a function returns its local variables are destroyed.\n\nFor more information see: https://comp1511unsw.github.io/dcc//stack_use_after_return.html'\n\n\nExecution stopped here in main() in tests/run_time/bad_function.c at line 22:\n\n\nint main(void) {\n--\u003e\tprintf(\"%d\\n\", *f(50));\n}\n```\n\n# Installation\n\n* Deb-based Systems including Debian, Ubuntu, Mint and Windows Subsystem for Linux\n\n\t```bash\n\tcurl -L https://github.com/COMP1511UNSW/dcc/releases/download/2.37/dcc_2.37_all.deb -o /tmp/dcc_2.37_all.deb\n\tsudo apt install /tmp/dcc_2.37_all.deb\n\t```\n\n\tor\n\n\t```bash\n\tsudo apt install  clang gcc gdb valgrind python3 curl\n\tsudo curl -L https://github.com/COMP1511UNSW/dcc/releases/latest/download/dcc -o /usr/local/bin/dcc\n\tsudo chmod o+rx  /usr/local/bin/dcc\n\t```\n\n\t```bash\n\t# on Windows Subsystem for Linux (only) this might be necessary to run programs\n\tsudo bash -c \"echo 0 \u003e /proc/sys/kernel/yama/ptrace_scope;echo 1 \u003e/proc/sys/vm/overcommit_memory\"\n\t```\n\n\tThe Ubuntu \u0026 Mint UndefinedBehaviorSanitizer builds appear not to allow `__ubsan_on_report` to be intercepted,\n\twhich degrades 
some error reporting.\n\n* Arch Linux\n\n\t```bash\n\tsudo pacman -S clang gcc gdb valgrind python3 curl\n\tsudo curl -L https://github.com/COMP1511UNSW/dcc/releases/latest/download/dcc -o /usr/local/bin/dcc\n\tsudo chmod o+rx  /usr/local/bin/dcc\n\t```\n\n* RPM-based Systems including CentOS, Fedora\n\n\t```bash\n\tsudo yum install clang gcc gdb valgrind python3 curl\n\tsudo curl -L https://github.com/COMP1511UNSW/dcc/releases/latest/download/dcc -o /usr/local/bin/dcc\n\tsudo chmod o+rx  /usr/local/bin/dcc\n\t```\n\n\tOn openSUSE:\n\n\t```bash\n\tsudo zypper install clang gcc gdb valgrind python3 curl\n\tsudo curl -L https://github.com/COMP1511UNSW/dcc/releases/latest/download/dcc -o /usr/local/bin/dcc\n\tsudo chmod o+rx  /usr/local/bin/dcc\n\t```\n\n* macOS\n\n\tInstall python3 - see https://docs.python-guide.org/starting/install3/osx/\n\n\tInstall gdb - see https://sourceware.org/gdb/wiki/PermissionsDarwin\n\n\tIn your terminal, run:\n\n\t```bash\n\tbash \u003c(curl -s https://raw.githubusercontent.com/COMP1511UNSW/dcc/master/install_scripts/macos_install.sh)\n\t```\n\n\tNote: It is usually not a good idea to blindly run remote bash scripts in your terminal. You can inspect the script first by opening the URL and reading what it does.\n\n\t```bash\n\tsudo curl -L https://github.com/COMP1511UNSW/dcc/releases/latest/download/dcc -o /usr/local/bin/dcc\n\tsudo chmod o+rx  /usr/local/bin/dcc\n\t```\n\n\tvalgrind and MemorySanitizer are not currently supported on macOS, which prevents checking for uninitialized variables.\n\n\n# C++ Support\n\nThere is experimental support for C++ programs if `dcc` is invoked as `d++` or `dcc++`.\n\nInstall by creating a symbolic link, e.g.:\n\n```bash\nsudo ln  -sf dcc /usr/local/bin/d++\n```\n\n\n# Run-time Error Handling Implementation\n\n* dcc by default enables clang's AddressSanitizer (`-fsanitize=address`) and UndefinedBehaviorSanitizer (`-fsanitize=undefined`) extensions.\n\n* dcc embeds in the binary produced an 
xz-compressed tar file (see [compile.py]) containing the C source files for the program and some Python code which is executed if a runtime error occurs.\n\n* Sanitizer errors are intercepted by a shim for the function `__asan_on_error` in [dcc_util.c].\n\n* A set of signals produced by runtime errors is trapped by `_signal_handler` in [dcc_util.c].\n\n* Both functions call `_explain_error` in [dcc_util.c] which creates a temporary directory,\nextracts into it the program source and Python from the embedded tar file, and executes the Python code, which:\n\n    * runs the Python ([start_gdb.py]) to print an error message that a novice programmer will understand, then\n\n    * starts gdb, and uses it to print current values of variables used in source lines near where the error occurred.\n\n# Facilitating Clear Errors from Uninitialized Variables\n\nLinux initializes stack pages to zero.  As a consequence novice programmers writing small programs with few function calls\nare likely to find zero in uninitialized local variables.  This often results in apparently correct behaviour from an\ninvalid program with uninitialized local variables.\n\ndcc embeds code in the binary which initializes the first few megabytes of the stack to 0xbe (see `clear-stack` in [dcc_util.c]).\n\nFor valgrind dcc uses its --malloc-fill and --free-fill options to achieve the same result (see [dcc_util.c]).  AddressSanitizer \u0026 MemorySanitizer use a malloc which does this by default.\n\nWhen printing variable values, dcc prints ints, doubles \u0026 pointers consisting of 0xbe bytes as \"\u003cuninitialized\u003e\".\n\nIndirection using pointers consisting of 0xbe bytes will produce an unaligned access error from UndefinedBehaviorSanitizer, unless the pointer is to char.  
dcc intercepts these and prints explanations suitable for novice programmers (see `explain_ubsan_error` in [drive_gdb.py]).\n\n```\n$ dcc dereference_uninitialized.c\n$ ./a.out\ntests/run_time/dereference_uninitialized_with_arrow.c:9:14: runtime error - accessing a field via an uninitialized pointer\n\ndcc explanation: You are using a pointer which has not been initialized\n  A common error is using p-\u003efield without first assigning a value to p.\n\nExecution stopped here in main() in dereference_uninitialized.c at line 9:\n\nint main(void) {\n    struct list_node *a = malloc(sizeof *a);\n--\u003e a-\u003enext-\u003edata = 42;\n}\n\nValues when execution stopped:\n\na-\u003enext = \u003cuninitialized value\u003e\n```\n\n# Build Instructions\n\n```bash\ngit clone https://github.com/COMP1511UNSW/dcc\ncd dcc\nmake\ncp -p ./dcc /usr/local/bin/dcc\n```\n\n# Compilation Diagram\n\n```mermaid\nflowchart\n    dcc[\"dcc program.c -o program\u003cbr\u003e (python)\"] --\u003e user_code[program.c]\n    user_code --\u003e gcc[gcc\u003cbr\u003efor extra error-detection only]\n    gcc --\u003e |compile-time error| dcc_explanation[dcc python]\n    dcc_explanation ---\u003e |outputs| explanation[error message with explanation added\u003cbr\u003e suitable for novice]\n    dcc --\u003e wrapper_code[dcc wrapper C code]\n    dcc --\u003e embedded_Python[embedded Python\u003cbr\u003efor runtime error-handling]\n    user_code --\u003e clang1[\"clang with options for valgrind\u003cbr\u003e(no sanitizers)\"]\n    wrapper_code --\u003e clang1\n    clang1 --\u003e |compile-time error| dcc_explanation\n    clang1 --\u003e temporary_executable[temporary executable]\n    wrapper_code --\u003e clang2[clang with options for AddressSanitizer]\n    user_code --\u003e clang2\n    embedded_Python --\u003e |tar file embedded by\u003cbr\u003eencoding as array initializer| clang2\n    temporary_executable --\u003e |binary embedded by\u003cbr\u003eencoding as array initializer| clang2\n    clang2 --\u003e 
program\n```\n\nAssumes the default option of AddressSanitizer + valgrind run in parallel.\n\n# Runtime Overview\n![DCC Runtime Overview](docs/dcc_overview.png)\n\n# Runtime Error Handling Diagram\n\n```mermaid\nflowchart\n    user1[user runs binary from dcc] --\u003e main\n    main[execute dcc wrapper code in binary]  --\u003e |stack pages initialized to 0xbe| Sanitizer1[\"execute user's code in binary\u003cbr\u003e(compiled with AddressSanitizer)\"]\n    main --\u003e |extract embedded binary\u003cbr\u003eto temporary file \u0026 fork| Sanitizer2[\"valgrind executes user's code in temporary file\u003cbr\u003e(not compiled with sanitizers)\"]\n    main --\u003e |fork| Watcher[valgrind watcher]\n    \n    Sanitizer1 --\u003e |runtime error| embedded_C[intercepted by dcc code in binary]\n    Sanitizer1 \u003c--\u003e | synchronize at system calls | Sanitizer2\n\n    embedded_C --\u003e embedded_Python\n\n    Watcher --\u003e |error details| embedded_Python\n\n    Sanitizer2 --\u003e |runtime error| Watcher\n    gdb \u003c--\u003e |stack backtrace \u0026 variable values| embedded_Python[embedded Python]\n    embedded_Python --\u003e |outputs| user2[novice friendly error message\u003cbr\u003elocation in source code\u003cbr\u003evariable values\u003cbr\u003eextra explanation]\n```\n\nAssumes the default option of AddressSanitizer + valgrind run in parallel.\n   \n# Papers\n\n[Foundations First: Improving C's Viability in Introductory Programming Courses with the Debugging C Compiler. In Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 
1 (SIGCSE 2023).](https://dl.acm.org/doi/10.1145/3545945.3569768)\n\nIf you have used DCC in your teaching or research, please cite the paper:\n```\n@inproceedings{10.1145/3545945.3569768,\nauthor = {Taylor, Andrew and Renzella, Jake and Vassar, Alexandra},\ntitle = {Foundations First: Improving C's Viability in Introductory Programming Courses with the Debugging C Compiler},\nyear = {2023},\nisbn = {9781450394314},\npublisher = {Association for Computing Machinery},\naddress = {New York, NY, USA},\nurl = {https://doi.org/10.1145/3545945.3569768},\ndoi = {10.1145/3545945.3569768},\npages = {346–352},\nnumpages = {7},\nkeywords = {educational compiler, c in cs1, cs1 programming languages},\nlocation = {Toronto ON, Canada},\nseries = {SIGCSE 2023}\n}\n```\n\n# Dependencies\n\nclang, python3, gdb, valgrind\n\n# Author\n\nAndrew Taylor (andrewt@unsw.edu.au)\n\nCode for ANSI colors in colors.py is by Giorgos Verigakis\n\n# License\n\nGPLv3\n\n","funding_links":[],"categories":["Python (144)"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FCOMP1511UNSW%2Fdcc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FCOMP1511UNSW%2Fdcc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FCOMP1511UNSW%2Fdcc/lists"}