{"id":19795584,"url":"https://github.com/dkogan/clockfunction","last_synced_at":"2025-02-28T10:25:21.632Z","repository":{"id":136714168,"uuid":"99993753","full_name":"dkogan/clockfunction","owner":"dkogan","description":"Runs an executable, measuring execution time statistics for given functions","archived":false,"fork":false,"pushed_at":"2023-09-02T19:13:30.000Z","size":53,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-11T04:50:19.830Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dkogan.png","metadata":{"files":{"readme":"README.org","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-08-11T05:10:22.000Z","updated_at":"2024-04-27T14:31:53.000Z","dependencies_parsed_at":"2025-01-11T04:49:33.387Z","dependency_job_id":"d3070947-1fbe-43d4-9b2f-bd6b41462c3e","html_url":"https://github.com/dkogan/clockfunction","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkogan%2Fclockfunction","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkogan%2Fclockfunction/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkogan%2Fclockfunction/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkogan%2Fclockfunction/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dkogan","download_url":"h
ttps://codeload.github.com/dkogan/clockfunction/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241137693,"owners_count":19916178,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T07:16:46.464Z","updated_at":"2025-02-28T10:25:21.625Z","avatar_url":"https://github.com/dkogan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Profiling tools generally answer the question *where's all my time going?*, and\nthey answer this question by periodically sampling a running executable to see\nwhat it's doing at that point in time. When many samples are collected, they can\nbe analyzed and tallied. Execution hotspots then show up as a large sample\ncount: the application spent more time executing the hotspots. As expected, a\nhigher sampling rate yields more precise results at the cost of a higher\ninstrumentation overhead.\n\nRecently I needed to answer a slightly different question: *how quickly does THIS\nSPECIFIC function run?* A sampling profiler is both too much and too little\nhere. 
Too much because I don't need to sample when the function of interest is\nnot running, and too little because a very high sampling resolution may be\nnecessary to report the timings with sufficient precision.\n\nI looked around, and didn't see any available tools that solved this problem.\nThe [[https://perf.wiki.kernel.org/][=perf=]] tool from the Linux kernel does 99% of what is needed to construct\nsuch a tool, so I wrote a simple tool that uses =perf= to give me the facilities\nI need: =clockfunction=.\n\n=perf= is able to probe executables at /entry/ points to a function and at\n/exit/ points of a function. Probing the entry is trivial, since the addresses\ncan be looked up in the symbol table or in the debug symbols. Probing the exit\nis /not/ trivial, since multiple =ret= statements could be present. One could\neither do some minor static analysis to find the =ret= statements, or one could\nlook at the return address upon entry, and then dynamically place a probe there.\nAnd if there are any non-local exits, these would both break. I'm not 100% sure,\nbut I suspect that =perf= does neither, but instead uses some special hardware\nto do this. In any case, I don't care: =perf= allows me to probe function\nreturns somehow, and that's all I care about.\n\nAfter placing the probes, I run the executable being evaluated while =perf= is\nrecording all probe crossings. When the executable exits, the probe log can be\nanalyzed to extract the timing information.\n\nThe =clockfunction= tool automates this. Multiple functions can be sampled, with\neach one specified as a =func@lib= string (ltrace-style). =func= is the name of\nthe function we care about. This could be a shell pattern to pick out multiple\nfunctions. =lib= is the ELF library or executable that contains this function;\nthis must be an absolute path. 
An example:\n\n#+BEGIN_EXAMPLE\n$ ./clockfunction.py '*rand*'@/usr/bin/perl perl_run@/usr/bin/perl perl -e 'for $i (0..100000) { $s = rand(); }'\n\n# function mean min max stdev Ncalls\n## All timings in seconds\nPerl_drand48_init_r 7.55896326154e-06 7.55896326154e-06 7.55896326154e-06 0.0               1\nPerl_drand48_r      1.95271501819e-06 1.76404137164e-06 3.67719912902e-05 4.0105865074e-07  100001\nPerl_pp_rand        5.23026800056e-06 4.78199217469e-06 0.000326015986502 1.71576428687e-06 100001\nperl_run            0.662568764063    0.662568764063    0.662568764063    0.0               1\n#+END_EXAMPLE\n\nThe table was re-spaced for readability. We see that the main perl application\ntook 0.66 seconds. And =Perl_pp_rand= was called 100001 times, taking 5.23us\neach time, on average, for a total of 0.523 seconds. A lower-level\n=Perl_drand48_r= function took about 1/3 of the time of =Perl_pp_rand=. If one\ncared about this detail of perl, this would be very interesting to know. And we\nfound it out without any compile-time instrumentation of our binary and without\neven bothering to find out what the =*rand*= functions are called.\n\nRecursive or parallel invocations are supported insofar as the mean and Ncalls\nwill be reported correctly. The min, max and stdev of the timings will not be\navailable, however.\n\n\n* Caveats\n\nThis tool is a quick hack, and all the actual work is done by 'perf'. This tool\ncalls 'sudo' all over the place, which is ugly.\n\nA relatively recent 'perf' is required. The devs have been tinkering with the\nsemantics of 'perf probe -F'. The following should produce reasonable output:\n\n#+BEGIN_EXAMPLE\n  perf probe -x `which python` -F 'Py*'\n#+END_EXAMPLE\n\n(I.e. it should print out a long list of instrumentable functions in the python\nexecutable that start with 'Py'). Older versions of the 'perf' tool will barf\ninstead. Note that 'perf' is a userspace tool that lives in the Linux kernel\nsource tree. 
And it doesn't directly depend on specific kernel versions.\nGrabbing a very recent kernel tree and rebuilding JUST 'perf' usually works. And\nyou don't need to rebuild the kernel and reboot. Usually.\n\nWhen instrumenting C++ functions you generally need to use the mangled symbol\nnames. At this time 'perf' has partial support for demangled names, but it's not\ncomplete enough to work fully ('perf probe -F' can report demangled names, but\nyou can't insert probes with ':' in their names since ':' is already taken in\n'perf probe' syntax). So I use 'perf probe --no-demangle', which again requires\na relatively recent 'perf'. If you aren't looking at C++, but your perf is too\nold to have --no-demangle, you'll get needless barfing; take out the\n'--no-demangle' in that case.\n\n\n* License\n\nreleased into the public domain; I'm giving up all copyright.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdkogan%2Fclockfunction","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdkogan%2Fclockfunction","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdkogan%2Fclockfunction/lists"}