[![test](https://github.com/michurin/pprofiler/actions/workflows/ci.yaml/badge.svg)](https://github.com/michurin/pprofiler/actions/workflows/ci.yaml)
[![codecov](https://codecov.io/gh/michurin/pprofiler/branch/master/graph/badge.svg)](https://codecov.io/gh/michurin/pprofiler)
[![Python 3.8](https://img.shields.io/badge/python-3.8-blue.svg)](https://www.python.org/)
[![Python 2.7](https://img.shields.io/badge/python-2.7-blue.svg)](https://www.python.org/)

pprofiler
=========

`pprofiler` ("Pretty profiler", "Python code level profiler", or "Pure Python profiler")
is a Python-level profiler like `timeit` (as opposed to C-code-level profilers like `cProfile`).

Key features
------------

* Can be used as a
  * context manager
  * decorator
* Provides (see the sketch after this list)
  * total time
  * average
  * standard deviation
  * min/max
* Nested scopes
* Flexible reporting formats
  * simple printing to stdout
  * output using `logging` or another engine
  * data as a nested structure, to store in a document-oriented database like MongoDB
  * data as a flat structure, to store in a relational database like PostgreSQL
* Easy usage/integration
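
For reference, here is a minimal sketch of how the per-scope statistics in the report are presumably computed from the recorded timings. The formulas are an assumption based on the column names (`sum`, `n`, `avg`, `max`, `min`, `dev`); check `pprofiler.py` for the actual implementation:

```python
import math

def summarize(timings):
    """Assumed formulas for one report line; not pprofiler's actual code."""
    n = len(timings)
    total = sum(timings)
    avg = total / n
    # sample standard deviation; the built-in report prints '-' when n == 1
    dev = math.sqrt(sum((t - avg) ** 2 for t in timings) / (n - 1)) if n > 1 else None
    return {'num': n, 'sum': total, 'avg': avg,
            'max': max(timings), 'min': min(timings), 'dev': dev}

print(summarize([0.18, 0.22, 0.19]))
```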
Quick overview
--------------

You can find all of the examples in the `examples/` directory.

To use `pprofiler` you only need to import one symbol. You can even drop `pprofiler.py` into your
source tree and use it without installing the `pprofiler` package.

### Using as a context manager
```python
#!/usr/bin/python
# coding: U8

import time
import random

from pprofiler import profiler

def rand_sleep(dt):
    time.sleep(dt * (1 + 0.4 * (random.random() - 0.5)))  # dt ± 20%

def main():
    with profiler('prepare input'):
        rand_sleep(.1)
    for n in range(3):
        with profiler('process chunk'):
            rand_sleep(.2)
    with profiler('push result'):
        rand_sleep(.1)
    profiler.print_report()

if __name__ == '__main__':
    main()
```

Output:
```
name perc sum n avg max min dev
---------------- ---- ----- -- ----- ----- ----- -----
process chunk .. 73% 0.58 3 0.19 0.22 0.18 0.02
push result .... 14% 0.11 1 0.11 0.11 0.11 -
prepare input .. 13% 0.10 1 0.10 0.10 0.10 -
```

### Using as a decorator
```python
#!/usr/bin/python
# coding: U8

import time
import random

from pprofiler import profiler

def rand_sleep(dt):
    time.sleep(dt * (1 + 0.4 * (random.random() - 0.5)))  # dt ± 20%

@profiler('prepare input')
def prepare_input():
    rand_sleep(.1)

@profiler('process chunk')
def process_chunk():
    rand_sleep(.2)

@profiler('push result')
def push_result():
    rand_sleep(.1)

def main():
    prepare_input()
    for n in range(3):
        process_chunk()
    push_result()
    profiler.print_report()

if __name__ == '__main__':
    main()
```

Output:
```
name perc sum n avg max min dev
---------------- ---- ----- -- ----- ----- ----- -----
process chunk .. 74% 0.64 3 0.21 0.23 0.19 0.03
prepare input .. 14% 0.12 1 0.12 0.12 0.12 -
push result .... 13% 0.11 1 0.11 0.11 0.11 -
```

### Report formats
You can get reports in different ways and formats. Look at this simple example:
```python
#!/usr/bin/python
# coding: U8

import time
import random
import pprint
import sys
import logging

from pprofiler import profiler

logger = logging.getLogger(__name__)

def rand_sleep(dt):
    time.sleep(dt * (1 + 0.4 * (random.random() - 0.5)))  # dt ± 20%

@profiler('demo')
def demo():
    rand_sleep(.2)

def main():
    logging.basicConfig(
        format='%(asctime)s [%(levelname)s] [%(process)d] %(message)s',
        datefmt='%H:%M:%S',
        level=logging.DEBUG,
        handlers=[logging.StreamHandler(sys.stdout)])
    demo()
    demo()
    profiler.print_report()            # print to stdout
    profiler.print_report(logger.info) # print using a custom logger
    pprint.pprint(profiler.report)     # get the report as a nested structure
    pprint.pprint(list(profiler))      # get the report as flat lines

if __name__ == '__main__':
    main()
```

You can just print the formatted report to `stdout` using `profiler.print_report()`:
```
name perc sum n avg max min dev
------- ----- ----- -- ----- ----- ----- -----
demo .. 100% 0.37 2 0.19 0.21 0.16 0.03
```

You can use your own logger like this: `profiler.print_report(logger.info)`:
```
13:06:22 [INFO] [27038] name perc sum n avg max min dev
13:06:22 [INFO] [27038] ------- ----- ----- -- ----- ----- ----- -----
13:06:22 [INFO] [27038] demo .. 100% 0.37 2 0.19 0.21 0.16 0.03
```

You can get the report as a nested structure (`profiler.report`) to store it in some document-oriented storage like MongoDB:
```python
[{'avg': 0.1868218183517456,
'dev': 0.03486269297645324,
'max': 0.2114734649658203,
'min': 0.1621701717376709,
'name': 'demo',
'num': 2,
'percent': 100.0,
'sum': 0.3736436367034912}]
```

And you can get the report as a flat list of items, using `profiler` as an iterator, to store the report line by line in a relational database:
```python
[{'avg': 0.18060684204101562,
'dev': 0.0004245030582017247,
'level': 0,
'max': 0.1809070110321045,
'min': 0.18030667304992676,
'name': 'demo',
'num': 2,
'percent': 100.0,
'sum': 0.36121368408203125}]
```

You can find more complex examples in the `examples/` directory.
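
For example, here is a minimal sketch of storing the flat report in a relational database using the standard `sqlite3` module; the `timings` table and its schema are illustrative, not part of `pprofiler`:

```python
import sqlite3
import time

from pprofiler import profiler

for _ in range(2):
    with profiler('demo'):
        time.sleep(.1)  # ... profiled code ...

conn = sqlite3.connect(':memory:')  # illustrative; use a real file or DSN in practice
conn.execute('CREATE TABLE timings '
             '(level INTEGER, name TEXT, num INTEGER, percent REAL, '
             'sum REAL, avg REAL, max REAL, min REAL, dev REAL)')
# every flat item is assumed to carry exactly the keys shown above
conn.executemany(
    'INSERT INTO timings VALUES '
    '(:level, :name, :num, :percent, :sum, :avg, :max, :min, :dev)',
    list(profiler))
conn.commit()
```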
### Nested scopes
Nested scopes help you study your timings in more depth. Look at this:
```python
#!/usr/bin/python
# coding: U8

import time
import random

from pprofiler import profiler

def rand_sleep(dt):
    time.sleep(dt * (1 + 0.4 * (random.random() - 0.5)))  # dt ± 20%

def main():
    with profiler('cook document'):
        rand_sleep(.1)  # we do not want to give individual attention to this
        with profiler('create title'):
            rand_sleep(.2)
        with profiler('create body'):
            rand_sleep(.3)
    with profiler('push document'):
        rand_sleep(.5)
    profiler.print_report()

if __name__ == '__main__':
    main()
```

Here we have two high-level tasks: (i) 'cook document' and (ii) 'push document'. But we also want to pay attention
to some subtasks of 'cook document': 'create title' and 'create body'. We get a report like this:
```
name perc sum n avg max min dev
----------------- ---- ----- -- ----- ----- ----- ---
cook document ... 52% 0.63 1 0.63 0.63 0.63 -
. create body ... 65% 0.34 1 0.34 0.34 0.34 -
. create title .. 35% 0.18 1 0.18 0.18 0.18 -
push document ... 48% 0.59 1 0.59 0.59 0.59 -
```

You can read it level by level. These are the high-level tasks:
```
...
cook document ... 52% 0.63 1 0.63 0.63 0.63 -
...
push document ... 48% 0.59 1 0.59 0.59 0.59 -
```

You can see that cooking takes 52% of the time and pushing takes 48%. But you can also look deeper
inside the cooking:

```
...
. create body ... 65% 0.34 1 0.34 0.34 0.34 -
. create title .. 35% 0.18 1 0.18 0.18 0.18 -
...
```

Here you can analyze the parts of the 'cook document' task. Note that the percentages on each nesting level appear to be relative to that level's total.
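
Since each flat report item carries a `level` key (see the iterator output above), you can also render your own indented view of a nested report. A minimal sketch, assuming the item keys shown earlier:

```python
from pprofiler import profiler

def print_tree(prof):
    # iterating yields one dict per scope; 'level' is the nesting depth,
    # shown as the '. ' prefix in the built-in report
    for item in prof:
        print('{indent}{name}: {sum:.2f}s over {num} call(s)'.format(
            indent='. ' * item['level'], **item))

# after the profiled code above has run:
# print_tree(profiler)
```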
Multithreading/multiprocessing
------------------------------

`pprofiler` is not designed for concurrent computing, but it is possible to
use it in multithreading/multiprocessing applications. The simplest way is
to build an autonomous instance of `pprofiler` in each worker:

```python
#!/usr/bin/python
# coding: U8

import time
import random
import multiprocessing
import sys
import logging

from pprofiler import profiler

logger = logging.getLogger()

def rand_sleep(dt):
    time.sleep(dt * (1 + 0.4 * (random.random() - 0.5)))  # dt ± 20%

def subprocess_worker(x):
    local_profiler = type(profiler)()  # create and use a local profiler in each child process
    logger.debug('Worker %d start', x)
    with local_profiler('worker_%d' % x):
        rand_sleep(3)
    logger.debug('Worker %d done', x)
    local_profiler.print_report(logger.info)

def main():
    logging.basicConfig(
        format='%(asctime)s,%(msecs)03d [%(process)d] [%(levelname)s] %(message)s',
        datefmt='%H:%M:%S',
        level=logging.DEBUG,
        handlers=[logging.StreamHandler(sys.stdout)])
    logger.debug('Master start')
    with profiler('master (total)'):
        pool = multiprocessing.Pool(processes=2)
        for i in pool.imap_unordered(subprocess_worker, range(2)):
            pass
    logger.debug('Master done')
    profiler.print_report(logger.info)

if __name__ == '__main__':
    main()
```

You can get something like this:
```
22:53:11,092 [1216] [DEBUG] Master start
22:53:11,236 [1217] [DEBUG] Worker 0 start
22:53:11,237 [1218] [DEBUG] Worker 1 start
22:53:14,189 [1218] [DEBUG] Worker 1 done
22:53:14,190 [1218] [INFO] name perc sum n avg max min dev
22:53:14,190 [1218] [INFO] ----------- ----- ----- -- ----- ----- ----- ---
22:53:14,191 [1218] [INFO] worker_1 .. 100% 2.95 1 2.95 2.95 2.95 -
22:53:14,402 [1217] [DEBUG] Worker 0 done
22:53:14,403 [1217] [INFO] name perc sum n avg max min dev
22:53:14,403 [1217] [INFO] ----------- ----- ----- -- ----- ----- ----- ---
22:53:14,404 [1217] [INFO] worker_0 .. 100% 3.16 1 3.16 3.16 3.16 -
22:53:14,405 [1216] [DEBUG] Master done
22:53:14,406 [1216] [INFO] name perc sum n avg max min dev
22:53:14,406 [1216] [INFO] ----------------- ----- ----- -- ----- ----- ----- ---
22:53:14,407 [1216] [INFO] master (total) .. 100% 3.31 1 3.31 3.31 3.31 -
```

You can see three processes: one master (PID=1216) and two workers (PID=1217 and PID=1218).
Each of them uses a separate instance of `pprofiler` and gets its own report.

This is not the only solution. You can collect all the reports in one process/thread
and store all the results or prepare your own custom report:

```python
#!/usr/bin/python
# coding: U8

import time
import random
import multiprocessing
import sys
import logging
import itertools

from pprofiler import profiler

logger = logging.getLogger()

def rand_sleep(dt):
    time.sleep(dt * (1 + 0.4 * (random.random() - 0.5)))  # dt ± 20%

def subprocess_worker(x):
    local_profiler = type(profiler)()  # create and use a local profiler in each child process
    logger.debug('Worker %d start', x)
    with local_profiler('worker_%d' % x):
        rand_sleep(3)
    logger.debug('Worker %d done', x)
    return local_profiler

def main():
    logging.basicConfig(
        format='%(asctime)s,%(msecs)03d [%(process)d] [%(levelname)s] %(message)s',
        datefmt='%H:%M:%S',
        level=logging.DEBUG,
        handlers=[logging.StreamHandler(sys.stdout)])
    logger.debug('Master start')
    with profiler('master (total)'):
        pool = multiprocessing.Pool(processes=2)
        workers_reports = list(pool.imap_unordered(subprocess_worker, range(2)))
    logger.debug('Master done')
    logger.info('== Native reports:')
    profiler.print_report(logger.info)
    for r in workers_reports:
        r.print_report(logger.info)
    logger.info('== Simple custom report:')
    for r in itertools.chain.from_iterable(x.report for x in [profiler] + workers_reports):
        logger.info('{name:.<20s} {sum:.3f} seconds'.format(**r))

if __name__ == '__main__':
    main()
```

Result:
```
10:24:48,085 [1353] [DEBUG] Master start
10:24:48,136 [1355] [DEBUG] Worker 1 start
10:24:48,136 [1354] [DEBUG] Worker 0 start
10:24:51,391 [1355] [DEBUG] Worker 1 done
10:24:51,705 [1354] [DEBUG] Worker 0 done
10:24:51,707 [1353] [DEBUG] Master done
10:24:51,708 [1353] [INFO] == Native reports:
10:24:51,708 [1353] [INFO] name perc sum n avg max min dev
10:24:51,709 [1353] [INFO] ----------------- ----- ----- -- ----- ----- ----- ---
10:24:51,709 [1353] [INFO] master (total) .. 100% 3.62 1 3.62 3.62 3.62 -
10:24:51,710 [1353] [INFO] name perc sum n avg max min dev
10:24:51,710 [1353] [INFO] ----------- ----- ----- -- ----- ----- ----- ---
10:24:51,710 [1353] [INFO] worker_1 .. 100% 3.25 1 3.25 3.25 3.25 -
10:24:51,711 [1353] [INFO] name perc sum n avg max min dev
10:24:51,711 [1353] [INFO] ----------- ----- ----- -- ----- ----- ----- ---
10:24:51,711 [1353] [INFO] worker_0 .. 100% 3.57 1 3.57 3.57 3.57 -
10:24:51,712 [1353] [INFO] == Simple custom report:
10:24:51,712 [1353] [INFO] master (total)...... 3.621 seconds
10:24:51,712 [1353] [INFO] worker_1............ 3.253 seconds
10:24:51,713 [1353] [INFO] worker_0............ 3.567 seconds
```