{"id":20433778,"url":"https://github.com/intelpython/profiling_guide","last_synced_at":"2025-03-05T06:22:13.132Z","repository":{"id":169073638,"uuid":"642040060","full_name":"IntelPython/Profiling_Guide","owner":"IntelPython","description":null,"archived":false,"fork":false,"pushed_at":"2024-10-01T15:18:20.000Z","size":29964,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-15T19:20:14.789Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IntelPython.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-17T17:29:42.000Z","updated_at":"2024-09-30T20:41:32.000Z","dependencies_parsed_at":null,"dependency_job_id":"4385b1df-c334-42e4-9da3-3502636e7485","html_url":"https://github.com/IntelPython/Profiling_Guide","commit_stats":null,"previous_names":["intelpython/profiling_guide"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2FProfiling_Guide","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2FProfiling_Guide/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2FProfiling_Guide/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2FProfiling_Guide/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IntelPython","do
wnload_url":"https://codeload.github.com/IntelPython/Profiling_Guide/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241976153,"owners_count":20051587,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-15T08:21:09.036Z","updated_at":"2025-03-05T06:22:13.099Z","avatar_url":"https://github.com/IntelPython.png","language":"Jupyter Notebook","readme":"# Performance Profiling Guide for Python\nProfilers help to identify performance problems. These are tools designed to give the metrics to find the slowest parts of the code so that we can optimize what really matters. Profilers can gather a wide variety of metrics: wall time, CPU time, network or memory consumption, I/O operations, etc.\n\u003cbr\u003e\nProfilers can answer questions like,\n- How many times is each method in my code called? \n- How long does each of these methods take?\n- How much memory does the method consume?\n\n\u003cbr\u003e\nThere are different types of profilers:\n\n- **Deterministic Profiling:** Deterministic profilers execute trace functions at various points of interest (function call, function return) and record precise timings of these events. It means the code runs slower under profiling. Its use in production systems is often impractical.\n\n\n- **Statistical profiling:**  Instead of tracking every event (call to every function), statistical profilers interrupt applications periodically and collect samples of the execution state (call stack snapshots). 
The call stacks are then analyzed to determine the execution time of different parts of the application. This method is less accurate, but it also reduces the overhead.\n\nAll the profilers we are going to discuss here are **Deterministic Profilers** because they capture precise timings of events. Please note that the **Memory Profiler** package also has the **mprof** module that does **statistical profiling**. It is discussed briefly in Memory Profiler notebook. \n\nThis GitHub aims to show different profilers for Python and explain in detail the procedure to profile different workloads with different profilers. Below is the list of all the profilers we will be discussing. Each profiler has a separate folder with a Jupyter Notebook to guide you.  \n\n| Performance Profiler | Lines or Function | Description of Profiler |\n| ----------- | ----------- | ----------- |\n| **[Memory Profiler](https://github.com/pythonprofilers/memory_profiler)** | lines | \u003cul\u003e\u003cli\u003e It provides memory consumption of each individual line inside the function. \u003c/li\u003e \u003cli\u003e Minimal code modification is required.\u003c/li\u003e\u003cli\u003e It is generally used after identifying **hotspot functions** from a function profiler.\u003c/li\u003e\u003cli\u003eIt does not profile GPU workloads.\u003c/li\u003e\u003cli\u003eIt cannot profile individual threads.\u003c/li\u003e\u003cli\u003e It does not provide execution time information.\u003c/li\u003e\u003c/ul\u003e|\n| **[Line Profiler](https://github.com/pyutils/line_profiler)** | lines | \u003cul\u003e\u003cli\u003e It times the execution of each individual line inside the function. 
\u003c/li\u003e \u003cli\u003e No code modification is required.\u003c/li\u003e\u003cli\u003e It is generally used after identifying **hotspot functions** from a function profiler.\u003c/li\u003e\u003cli\u003eIt does not profile GPU workloads.\u003c/li\u003e\u003cli\u003eIt cannot profile individual threads.\u003c/li\u003e\u003cli\u003e It does not provide memory consumption information.\u003c/li\u003e\u003c/ul\u003e|\n| **[cProfile](https://docs.python.org/3/library/profile.html)** | function | \u003cul\u003e\u003cli\u003e It times the execution of different functions. \u003c/li\u003e \u003cli\u003e No code modification is required.\u003c/li\u003e\u003cli\u003e It provides a call stack graph and execution time of functions that help identify hotspots.\u003c/li\u003e\u003cli\u003eIt does not profile GPU workloads.\u003c/li\u003e\u003cli\u003eIt cannot profile individual threads.\u003c/li\u003e\u003cli\u003e It does not provide memory consumption information.\u003c/li\u003e\u003c/ul\u003e|\n| **[Profile](https://docs.python.org/3/library/profile.html)** | function | \u003cul\u003e\u003cli\u003e It times the execution of different functions. \u003c/li\u003e \u003cli\u003e No code modification is required.\u003c/li\u003e\u003cli\u003e It provides a call stack graph and execution time of functions that help identify hotspots.\u003c/li\u003e\u003cli\u003eIt does not profile GPU workloads.\u003c/li\u003e\u003cli\u003eUnlike cProfile, it can profile individual threads but has more overhead compared to cProfile.\u003c/li\u003e\u003cli\u003e It does not provide memory consumption information\u003c/li\u003e\u003c/ul\u003e|\n| **[FunctionTrace](https://functiontrace.com/)** | function | \u003cul\u003e\u003cli\u003e It times the execution of different functions but only supports Python\u003e3.5. 
\u003c/li\u003e \u003cli\u003e No code modification is required.\u003c/li\u003e\u003cli\u003e It provides stack charts, flame graphs, and call trees that help identify hotspots.\u003c/li\u003e\u003cli\u003eIt does not profile GPU workloads.\u003c/li\u003e\u003cli\u003eIt can profile individual threads.\u003c/li\u003e\u003cli\u003e It does not provide memory consumption information.\u003c/li\u003e\u003cli\u003eProfiling results can be shared very easily through browser.\u003c/li\u003e\u003c/ul\u003e|\n| **[Scalene](https://github.com/plasma-umass/scalene)** | function and line | \u003cul\u003e\u003cli\u003e It times the execution of different functions and lines but only supports Python\u003e3.7. \u003c/li\u003e \u003cli\u003e No code modification is required.\u003c/li\u003e\u003cli\u003e It does not provide call stack information.\u003c/li\u003e\u003cli\u003eIt can profile GPU workloads.\u003c/li\u003e\u003cli\u003eIt can profile individual threads.\u003c/li\u003e\u003cli\u003e It provides memory consumption information.\u003c/li\u003e\u003cli\u003eProfiling results can be shared very easily through browser.\u003c/li\u003e\u003cli\u003eIt has integration to GPT3, when activated it can suggest changes to optimize code\u003c/li\u003e\u003c/ul\u003e|\n| **[VTune](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html)** | function and line | \u003cul\u003e\u003cli\u003e It times the execution of different functions and lines and supports other languages like C, Java, etc. \u003c/li\u003e \u003cli\u003e Minimal code modification is required. 
It also provides a GUI that is easy to use.\u003c/li\u003e\u003cli\u003e It provides call stack information, flame graph, and hardware utilization.\u003c/li\u003e\u003cli\u003eIt can profile GPU workloads.\u003c/li\u003e\u003cli\u003eIt can profile individual threads.\u003c/li\u003e\u003cli\u003e It provides memory consumption information.\u003c/li\u003e\u003cli\u003eProfiling results can be shared very easily through web browser interface.\u003c/li\u003e\u003cli\u003eIt also gives low-level C, C++ functions that can be potential hotspots.\u003c/li\u003e\u003cli\u003e The profiling overhead is high as compared to other profilers.\u003c/li\u003e\u003c/ul\u003e|\n\n\n\nWe will also use the following Intel AI Reference Kit in our profiling examples:\n- **Scikit-Learn Intelligent Indexing for Incoming Correspondence – [Ref Kit](https://github.com/oneapi-src/intelligent-indexing)**\n\n![image](https://user-images.githubusercontent.com/113541458/226619059-f5ea3ec5-a297-43d4-a6d4-c173265379e2.png)\n\n**Follow the steps mentioned in the [intelligent-Indexing Ref Kit GitHub ReadMe](https://github.com/oneapi-src/intelligent-indexing) to setup the environments accordingly.** \u003cbr\u003e\nThe process involves\n- Setting up a virtual environment for both stock and Intel®-accelerated machine learning packages\n- Preprocessing data using Pandas*/Intel® Distribution of Modin and NLTK\n- Training an NLP model for text classification using Scikit-Learn*/Intel® Extension for Scikit-Learn*\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fintelpython%2Fprofiling_guide","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fintelpython%2Fprofiling_guide","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fintelpython%2Fprofiling_guide/lists"}
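To make the deterministic-profiling workflow described above concrete, here is a minimal cProfile run. The workload functions (`build_squares`, `run_workload`) are hypothetical stand-ins, not part of the Ref Kit; the profiling and reporting calls are standard `cProfile`/`pstats` usage:

```python
import cProfile
import io
import pstats

def build_squares(n):
    # Hypothetical workload: build a list of squares.
    return [i * i for i in range(n)]

def run_workload():
    total = 0
    for _ in range(100):
        total += sum(build_squares(1000))
    return total

# Deterministic profiling: every function call/return event is traced.
profiler = cProfile.Profile()
profiler.enable()
run_workload()
profiler.disable()

# Sort the collected stats by cumulative time and show the top entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists each function with its call count, total time, and cumulative time, which is usually enough to spot the hotspot function before reaching for a line-level or memory profiler.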
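The statistical approach can be sketched with a toy sampler built on CPython's `sys._current_frames()`: a second thread periodically snapshots the worker's call stack instead of tracing every call. All function names here are hypothetical, and real statistical profilers such as mprof are far more robust; this is only an illustration of the sampling idea:

```python
import collections
import sys
import threading
import time

def sample_stacks(target_thread_id, samples, stop_event, interval=0.001):
    # Periodically snapshot the target thread's call stack (statistical profiling).
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)  # CPython-specific
        if frame is not None:
            stack = []
            while frame:
                stack.append(frame.f_code.co_name)
                frame = frame.f_back
            samples[tuple(stack)] += 1
        time.sleep(interval)

def hot_function():
    # Hypothetical CPU-bound hotspot.
    total = 0
    for i in range(200_000):
        total += i * i
    return total

def workload():
    for _ in range(20):
        hot_function()

samples = collections.Counter()
stop = threading.Event()
worker = threading.Thread(target=workload)
worker.start()
sampler = threading.Thread(target=sample_stacks, args=(worker.ident, samples, stop))
sampler.start()
worker.join()
stop.set()
sampler.join()

# Stacks containing hot_function should dominate the collected samples.
hot_hits = sum(c for stack, c in samples.items() if "hot_function" in stack)
print(f"{hot_hits} of {sum(samples.values())} samples landed in hot_function")
```

Because the sampler only wakes up every millisecond, its overhead stays low regardless of how many function calls the workload makes; the trade-off is that short-lived functions may be missed entirely.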
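Memory Profiler itself is a third-party package, but the kind of memory accounting it performs can be illustrated with the standard library's `tracemalloc`. The allocation workload below is a hypothetical example chosen only to produce a visible delta:

```python
import tracemalloc

def allocate_lists():
    # Hypothetical workload that allocates a noticeable amount of memory.
    return [list(range(1000)) for _ in range(100)]

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()
data = allocate_lists()
after, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"allocated ~{(after - before) / 1024:.0f} KiB, peak {peak / 1024:.0f} KiB")
```

Memory Profiler goes further by attributing consumption to individual source lines, which is why it pairs well with a function profiler that has already narrowed down the hotspot.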