https://github.com/rrze-hpc/pylikwid

Python interface for the LIKWID C API (https://github.com/RRZE-HPC/likwid)
Host: GitHub
URL: https://github.com/rrze-hpc/pylikwid
Owner: RRZE-HPC
License: gpl-2.0
Created: 2015-10-02T17:08:30.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2025-03-29T09:35:32.000Z (2 months ago)
Last Synced: 2025-04-24T00:13:08.969Z (about 1 month ago)
Topics: likwid, python
Language: C
Homepage:
Size: 94.7 KB
Stars: 46
Watchers: 5
Forks: 14
Open Issues: 4
Metadata Files:
- Readme: README.rst
- License: LICENSE

pylikwid
========

Python interface for the C API of LIKWID
(https://github.com/RRZE-HPC/likwid)

.. image:: https://travis-ci.com/RRZE-HPC/pylikwid.svg?branch=master
    :target: https://travis-ci.com/RRZE-HPC/pylikwid?branch=master

Installation
============

A ``setup.py`` script handles the installation. It builds the C module
and copies it to the proper destination.

::

    $ git clone https://github.com/RRZE-HPC/pylikwid.git
    $ cd pylikwid
    # Build C interface
    $ python setup.py build_ext -I  -L  -R 
    # Install module to the proper location
    $ python setup.py install (--prefix=)
    # Testing
    $ python -c "import pylikwid"
    $ ./testlib.py

Functions
=========

After ``import pylikwid`` you can call the following functions:

Marker API
----------

-  ``pylikwid.markerinit()``: Initialize the Marker API of the LIKWID library.
   Must be called before all other functions.

-  ``pylikwid.markerthreadinit()``: Add the current thread to the Marker API.
   Since Python is commonly single-threaded, simply call it directly
   after ``pylikwid.markerinit()``.

-  ``rr = pylikwid.registerregion(regiontag)``: Register a region with the
   Marker API. This optional function reduces the overhead of
   region registration at ``pylikwid.markerstartregion``. If you don't call
   ``pylikwid.registerregion(regiontag)``, the registration is done at
   ``pylikwid.markerstartregion(regiontag)``. On success, 0 is returned. If you
   haven't called ``pylikwid.markerinit()``, a negative number is returned.

-  ``err = pylikwid.markerstartregion(regiontag)``: Start measurements under
   the name ``regiontag``. On success, 0 is returned. If you haven't called
   ``pylikwid.markerinit()``, a negative number is returned.

-  ``err = pylikwid.markerstopregion(regiontag)``: Stop measurements under the
   name ``regiontag`` again. On success, 0 is returned. If you haven't
   called ``pylikwid.markerinit()``, a negative number is returned.

-  ``num_events, events[], time, count = pylikwid.markergetregion(regiontag)``:
   Get the intermediate results of the region identified by
   ``regiontag``. On success, it returns the number of events in the
   current group, a list with all the aggregated event results, the
   measurement time for the region and the number of calls.

-  ``pylikwid.nextgroup()``: Switch to the next event set in a
   round-robin fashion. If you have set only one event set on the
   command line, this function performs no operation.

-  ``pylikwid.markerreset(regiontag)``: Reset the values stored under the region
   name ``regiontag``. On success, 0 is returned.

-  ``pylikwid.markerclose()``: Close the connection to the LIKWID Marker API
   and write out measurement data to file. This file will be evaluated
   by ``likwid-perfctr``.

-  ``pylikwid.getprocessorid()``: Return the ID of the currently
   executing CPU.

-  ``pylikwid.pinprocess(cpuid)``: Pin the current process to the CPU
   given as ``cpuid``.

-  ``pylikwid.pinthread(cpuid)``: Pin the current thread to the CPU
   given as ``cpuid``.

Topology
--------

-  ``pylikwid.inittopology()``: Initialize the topology module (reads in
   system topology)

-  ``infodict = pylikwid.getcpuinfo()``: Return a dict with general
   information about the system (CPU model, CPU family, ...)

   -  ``osname``: Name of the CPU retrieved from the CPUID leaves
   -  ``name``: Name of the micro architecture
   -  ``short_name``: Short name of the micro architecture
   -  ``family``: ID of the CPU family
   -  ``model``: Vendor-specific model number of the CPU
   -  ``stepping``: Stepping (revision) of the CPU
   -  ``perf_version``: Version number of the hardware performance
      monitoring capabilities
   -  ``perf_num_ctr``: Number of general-purpose counter registers per
      hardware thread
   -  ``perf_num_fixed_ctr``: Number of fixed-purpose counter registers
      per hardware thread
   -  ``perf_width_ctr``: Bit length of the counter registers
   -  ``clock``: CPU clock (only nonzero if the timer module is
      initialized)
   -  ``turbo``: Is turbo mode supported?
   -  ``isIntel``: Is it an Intel CPU?
   -  ``supportUncore``: Does the system have performance monitoring
      counters in the Uncore?
   -  ``features``: String with performance-relevant CPU features (AVX,
      SSE, ...)
   -  ``featureFlags``: Bitmask for all available CPU features

-  ``topodict = pylikwid.getcputopology()``: Return a dict with the
   topology of the system. Here is a list of fields in the dict:

   -  ``numSockets``: Number of CPU sockets
   -  ``numHWThreads``: Number of hardware threads (physical +
      hyperthreading cores)
   -  ``activeHWThreads``: Number of active/usable hardware threads
   -  ``numCoresPerSocket``: Number of physical cores per CPU socket
   -  ``numThreadsPerCore``: Number of hardware threads in
      every physical CPU core
   -  ``numCacheLevels``: Number of levels in the cache hierarchy
   -  ``cacheLevels``: Dict with information about the cache levels,
      keys are the levels (1, 2, 3,...)

      -  ``level``: Level of the cache in the hierarchy
      -  ``lineSize``: Size of a cache line
      -  ``sets``: Number of sets
      -  ``inclusive``: Is the cache inclusive or exclusive?
      -  ``threads``: Number of threads attached to the cache
      -  ``associativity``: Associativity of the cache
      -  ``type``: Type of the cache: data (= data cache) or unified
         (= data + instruction cache)
      -  ``size``: Size of the cache in bytes

   -  ``threadPool``: Dict with information about the hardware threads.
      Keys are the OS-generated IDs of the hardware threads

      -  ``coreId``: ID of the corresponding physical core
      -  ``apicId``: ID set by the operating system
      -  ``threadId``: ID of the hardware thread in the physical core
      -  ``packageId``: ID of the CPU socket hosting the hardware thread

-  ``pylikwid.printsupportedcpus()``: Print all supported micro
   architecture names to stdout

-  ``pylikwid.finalizetopology()``: Delete all information in the
   topology module
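
As a minimal sketch of how the topology dict might be consumed, the helper
below groups hardware threads by socket using only the ``threadPool`` and
``packageId`` fields listed above. The helper name ``threads_by_socket`` and
the sample dict are made up for illustration. On a real system the dict would
come from ``pylikwid.getcputopology()`` after ``pylikwid.inittopology()``.

::

    from collections import defaultdict


    def threads_by_socket(topodict):
        """Map each socket ID to the sorted hardware thread IDs on it."""
        sockets = defaultdict(list)
        for hwthread_id, info in topodict["threadPool"].items():
            sockets[info["packageId"]].append(hwthread_id)
        return {sock: sorted(ids) for sock, ids in sorted(sockets.items())}


    # Hypothetical sample following the documented layout
    # (2 sockets with 2 hardware threads each); not real measurement data.
    sample_topology = {
        "numSockets": 2,
        "numHWThreads": 4,
        "threadPool": {
            0: {"coreId": 0, "apicId": 0, "threadId": 0, "packageId": 0},
            1: {"coreId": 1, "apicId": 2, "threadId": 0, "packageId": 0},
            2: {"coreId": 0, "apicId": 16, "threadId": 0, "packageId": 1},
            3: {"coreId": 1, "apicId": 18, "threadId": 0, "packageId": 1},
        },
    }

    print(threads_by_socket(sample_topology))

The resulting per-socket lists could be used, for example, to choose the CPU
IDs passed to ``pylikwid.pinprocess(cpuid)`` or ``pylikwid.init(cpus)``.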

NUMA
----

-  ``numadict = pylikwid.initnuma()``: Initialize the NUMA module and
   return the gathered values

   -  ``numberOfNodes``: Number of NUMA nodes in the system
   -  ``nodes``: Dict holding the information about the NUMA domains.
      Keys are the NUMA domain IDs

      -  ``id``: ID of the NUMA domain (should be equal to dict key)
      -  ``numberOfProcessors``: Number of hardware threads attached to
         the NUMA domain
      -  ``processors``: List of all CPU IDs attached to the NUMA domain
      -  ``freeMemory``: Amount of free memory in the NUMA domain (in
         Kbytes)
      -  ``totalMemory``: Amount of total memory in the NUMA domain (in
         Kbytes)
      -  ``numberOfDistances``: Number of distances to self/other NUMA
         domains
      -  ``distances``: List with distances, NUMA domain IDs are the
         destination indexes in the list

-  ``pylikwid.finalizenuma()``: Delete all information in the NUMA
   module

Affinity
--------

-  ``affdict = pylikwid.initaffinity()``: Initialize the affinity domain
   module and return the gathered values

   -  ``numberOfAffinityDomains``: Number of affinity domains
   -  ``numberOfSocketDomains``: Number of CPU socket related affinity
      domains
   -  ``numberOfNumaDomains``: Number of NUMA related affinity domains
   -  ``numberOfCacheDomains``: Number of last level cache related
      affinity domains
   -  ``numberOfProcessorsPerSocket``: Number of hardware threads per
      CPU socket
   -  ``numberOfCoresPerCache``: Number of physical CPU cores per last
      level cache
   -  ``numberOfProcessorsPerCache``: Number of hardware threads per
      last level cache
   -  ``domains``: Dict holding the information about the affinity
      domains

      -  ``tag``: Name of the affinity domain (N = node, SX = socket X,
         CY = cache Y, MZ = memory domain Z)
      -  ``numberOfProcessors``: Number of hardware threads in the
         domain
      -  ``numberOfCores``: Number of physical CPU cores in the domain
      -  ``processorList``: List holding the CPU IDs in the domain

-  ``pylikwid.finalizeaffinity()``: Delete all information in the
   affinity domain module

-  ``pylikwid.cpustr_to_cpulist()``: Transform a valid CPU string in
   LIKWID syntax into a list of CPU IDs

Timer
-----

-  ``pylikwid.getcpuclock()``: Return the CPU clock

-  ``t_start = pylikwid.startclock()``: Start the clock and return the
   current timestamp

-  ``t_end = pylikwid.stopclock()``: Stop the clock and return the
   current timestamp

-  ``t = pylikwid.getclock(t_start, t_end)``: Return the time in seconds
   between ``t_start`` and ``t_end``

-  ``c = pylikwid.getclockcycles(t_start, t_end)``: Return the number of
   CPU cycles between ``t_start`` and ``t_end``
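
The timer calls above could be combined as in the following sketch. The
helper ``time_call`` and the ``FakeTimer`` class are hypothetical: the fake
mimics the four timer functions with Python's ``time`` module and an assumed
2 GHz clock so the example runs without LIKWID. With pylikwid installed, the
``pylikwid`` module itself would be passed as the first argument. Whether any
extra initialization is needed beforehand is not stated here.

::

    import time


    def time_call(likwid, func, *args):
        """Run func(*args); return (result, seconds, cycles)."""
        t_start = likwid.startclock()
        result = func(*args)
        t_end = likwid.stopclock()
        seconds = likwid.getclock(t_start, t_end)
        cycles = likwid.getclockcycles(t_start, t_end)
        return result, seconds, cycles


    class FakeTimer:
        """Hypothetical stand-in for the timer functions; not pylikwid."""

        CLOCK_HZ = 2000000000  # assumed 2 GHz, for illustration only

        def startclock(self):
            return time.perf_counter_ns()

        def stopclock(self):
            return time.perf_counter_ns()

        def getclock(self, t_start, t_end):
            return (t_end - t_start) / 1e9

        def getclockcycles(self, t_start, t_end):
            return int((t_end - t_start) * self.CLOCK_HZ / 1e9)


    result, seconds, cycles = time_call(FakeTimer(), sum, range(100000))
    print("sum=%d took %.6f s (%d cycles)" % (result, seconds, cycles))

Passing the module as a parameter keeps the measurement logic testable on
machines where LIKWID is not available.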

Temperature
-----------

-  ``pylikwid.inittemp(cpu)``: Initialize the temperature module for CPU
   ``cpu``

-  ``pylikwid.readtemp(cpu)``: Read the current temperature of CPU
   ``cpu``

Energy
------

-  ``pinfo = pylikwid.getpowerinfo()``: Initialize the energy module
   and return the gathered information. If it returns ``None``, there is no
   energy support

   -  ``minFrequency``: Minimal possible frequency of a CPU core
   -  ``baseFrequency``: Base frequency of a CPU core
   -  ``hasRAPL``: Are energy readings supported?
   -  ``timeUnit``: Time unit
   -  ``powerUnit``: Power unit
   -  ``domains``: Dict holding the information about the energy
      domains. Keys are PKG, PP0, PP1, DRAM

      -  ``ID``: ID of the energy domain
      -  ``energyUnit``: Unit to convert raw register counts to uJ
      -  ``supportInfo``: Is the information register available?
      -  ``tdp``: TDP of the domain (only if supportInfo == True)
      -  ``minPower``: Minimal power consumption by the domain (only if
         supportInfo == True)
      -  ``maxPower``: Maximal power consumption by the domain (only if
         supportInfo == True)
      -  ``maxTimeWindow``: Maximal time window between updates of the
         energy registers
      -  ``supportStatus``: Are energy readings from the domain
         possible?
      -  ``supportPerf``: Is power capping etc. available?
      -  ``supportPolicy``: Can we set a power policy for the domain?

-  ``e_start = pylikwid.startpower(cpu, domainid)``: Return the start
   value for a CPU for the domain with ``domainid``. The ``domainid``
   can be found in ``pinfo["domains"][domainname]["ID"]``

-  ``e_stop = pylikwid.stoppower(cpu, domainid)``: Return the stop value
   for a CPU for the domain with ``domainid``. The ``domainid`` can be
   found in ``pinfo["domains"][domainname]["ID"]``

-  ``e = pylikwid.getpower(e_start, e_stop, domainid)``: Calculate the
   uJ from the values retrieved by ``startpower`` and ``stoppower``.
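
The sequence described above (look up the domain ID, take a start and a stop
value, convert to uJ) might be wrapped as follows. This is a sketch under
assumptions: ``measure_energy`` is a hypothetical helper, and ``FakeEnergy``
is a stand-in with canned values and a made-up conversion factor so the
example runs without RAPL access. With pylikwid installed, the ``pylikwid``
module would be passed instead of the fake.

::

    def measure_energy(likwid, cpu, domain_name, workload):
        """Return the energy in uJ used while workload() runs, or None."""
        pinfo = likwid.getpowerinfo()
        if pinfo is None or domain_name not in pinfo["domains"]:
            return None  # no energy support, or the domain is not listed
        domainid = pinfo["domains"][domain_name]["ID"]
        e_start = likwid.startpower(cpu, domainid)
        workload()
        e_stop = likwid.stoppower(cpu, domainid)
        return likwid.getpower(e_start, e_stop, domainid)


    class FakeEnergy:
        """Hypothetical stand-in with canned values; not part of pylikwid."""

        def __init__(self):
            self.raw = 1000  # pretend raw energy counter

        def getpowerinfo(self):
            return {"hasRAPL": True, "domains": {"PKG": {"ID": 0}}}

        def startpower(self, cpu, domainid):
            return self.raw

        def stoppower(self, cpu, domainid):
            self.raw += 500  # pretend the workload consumed 500 counts
            return self.raw

        def getpower(self, e_start, e_stop, domainid):
            return (e_stop - e_start) * 2.0  # made-up factor: 2 uJ per count


    pkg_energy = measure_energy(FakeEnergy(), 0, "PKG", lambda: sum(range(1000)))
    print("PKG energy: %.1f uJ" % pkg_energy)

Returning ``None`` for a missing domain mirrors the documented behavior of
``getpowerinfo()`` on systems without energy support.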

Configuration
-------------

-  ``pylikwid.initconfiguration()``: Read in the config file from different
   places. Default is ``/etc/likwid.cfg``

-  ``config = pylikwid.getconfiguration()``: Get the dict with the
   configuration options

   -  ``configFileName``: Path to the config file
   -  ``topologyCfgFileName``: Path to the topology file, if one was
      created with ``likwid-genTopoCfg`` and found by ``initconfiguration()``
   -  ``daemonPath``: Path to the access daemon executable
   -  ``groupPath``: Path to the base directory with the performance
      group files
   -  ``daemonMode``: Configured access mode (0=direct, 1=accessDaemon)
   -  ``maxNumThreads``: Maximal number of hardware threads that can be
      handled by LIKWID
   -  ``maxNumNodes``: Maximal number of CPU sockets that can be handled
      by LIKWID

-  ``pylikwid.destroyconfiguration()``: Destroy all information about
   the configuration

Access module
-------------

-  ``pylikwid.hpmmode(mode)``: Set access mode. For x86 there are two
   modes:

   -  ``mode = 0``: Access the MSR and PCI devices directly. May require
      root access
   -  ``mode = 1``: Access the MSR and PCI devices through access daemon
      instances

-  ``pylikwid.hpminit()``: Initialize the access functions according to
   the access mode

-  ``pylikwid.hpmaddthread(cpu)``: Add CPU ``cpu`` to the access layer
   (opens device files or a connection to an access daemon)

-  ``pylikwid.hpmfinalize()``: Unregister all CPUs from the access layer
   and close files/connections

Performance Monitoring
----------------------

-  ``pylikwid.init(cpus)``: Initialize the perfmon module for the CPUs
   given in list ``cpus``

-  ``pylikwid.getnumberofthreads()``: Return the number of threads
   initialized in the perfmon module

-  ``pylikwid.getnumberofgroups()``: Return the number of groups
   currently registered in the perfmon module

-  ``pylikwid.getgroups()``: Return a list of all available groups. Each
   list entry is a dict:

   -  ``Name``: Name of the performance group
   -  ``Short``: Short information about the performance group
   -  ``Long``: Long description of the performance group

-  ``gid = pylikwid.addeventset(estr)``: Add a performance group or a
   custom event set to the perfmon module. The ``gid`` is required to
   specify the event set later

-  ``pylikwid.getnameofgroup(gid)``: Return the name of the group
   identified by ``gid``. If it is a custom event set, the name is set
   to ``Custom``

-  ``pylikwid.getshortinfoofgroup(gid)``: Return the short information
   about a performance group

-  ``pylikwid.getlonginfoofgroup(gid)``: Return the description of a
   performance group

-  ``pylikwid.getnumberofevents(gid)``: Return the number of events in
   the group

-  ``pylikwid.getnumberofmetrics(gid)``: Return the number of derived
   metrics in the group. Always 0 for custom event sets.

-  ``pylikwid.getnameofevent(gid, eidx)``: Return the name of the event
   identified by ``gid`` and the index in the list of events

-  ``pylikwid.getnameofcounter(gid, eidx)``: Return the name of the
   counter register identified by ``gid`` and the index in the list of
   events

-  ``pylikwid.getnameofmetric(gid, midx)``: Return the name of a derived
   metric identified by ``gid`` and the index in the list of metrics

-  ``pylikwid.setup(gid)``: Program the counter registers to measure all
   events in group ``gid``

-  ``pylikwid.start()``: Start the counter registers

-  ``pylikwid.stop()``: Stop the counter registers

-  ``pylikwid.read()``: Read the counter registers (stop->read->start)

-  ``pylikwid.switch(gid)``: Switch to group ``gid``
   (stop->setup(gid)->start)

-  ``pylikwid.getidofactivegroup()``: Return the ``gid`` of the currently
   configured group

-  ``pylikwid.getresult(gid, eidx, tidx)``: Return the raw counter
   register result of all measurements identified by group ``gid`` and
   the indices for event ``eidx`` and thread ``tidx``

-  ``pylikwid.getlastresult(gid, eidx, tidx)``: Return the raw counter
   register result of the last measurement cycle identified by group
   ``gid`` and the indices for event ``eidx`` and thread ``tidx``

-  ``pylikwid.getmetric(gid, midx, tidx)``: Return the derived metric
   result of all measurements identified by group ``gid`` and the
   indices for metric ``midx`` and thread ``tidx``

-  ``pylikwid.getlastmetric(gid, midx, tidx)``: Return the derived
   metric result of the last measurement cycle identified by group
   ``gid`` and the indices for metric ``midx`` and thread ``tidx``

-  ``pylikwid.gettimeofgroup(gid)``: Return the measurement time for
   the group identified by ``gid``

-  ``pylikwid.finalize()``: Reset all used registers and delete internal
   measurement results
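
The index-based getters above lend themselves to a small loop that gathers
all results of an event set. The following is a sketch, not part of
pylikwid: ``collect_results`` is a hypothetical helper, and ``fake_likwid``
is a stand-in with canned values that exposes only the four calls used, so
the example runs without counter access. It assumes event and thread indices
start at 0.

::

    from types import SimpleNamespace


    def collect_results(likwid, gid):
        """Return {event name: [raw result per thread]} for event set gid."""
        num_threads = likwid.getnumberofthreads()
        results = {}
        for eidx in range(likwid.getnumberofevents(gid)):
            name = likwid.getnameofevent(gid, eidx)
            results[name] = [likwid.getresult(gid, eidx, tidx)
                             for tidx in range(num_threads)]
        return results


    # Hypothetical stand-in with made-up values; not real measurements.
    _EVENTS = ["INSTR_RETIRED_ANY", "CPU_CLK_UNHALTED_CORE"]
    fake_likwid = SimpleNamespace(
        getnumberofthreads=lambda: 2,
        getnumberofevents=lambda gid: len(_EVENTS),
        getnameofevent=lambda gid, eidx: _EVENTS[eidx],
        getresult=lambda gid, eidx, tidx: float(1000 * (eidx + 1) + tidx),
    )

    for name, values in collect_results(fake_likwid, 0).items():
        print(name, values)

With real hardware, ``collect_results(pylikwid, gid)`` would be called after
``pylikwid.stop()``. Keying by event name assumes each event appears only
once in the set; otherwise the counter name from
``pylikwid.getnameofcounter(gid, eidx)`` may be the better key.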

Marker API result file reader
-----------------------------

-  ``pylikwid.markerreadfile(filename)``: Read in the result file of an
   application run instrumented by the LIKWID Marker API

-  ``pylikwid.markernumregions()``: Return the number of regions in an
   application run

-  ``pylikwid.markerregiontag(rid)``: Return the region tag for the
   region identified by ``rid``

-  ``pylikwid.markerregiongroup(rid)``: Return the group name for the
   region identified by ``rid``

-  ``pylikwid.markerregionevents(rid)``: Return the number of events for
   the region identified by ``rid``

-  ``pylikwid.markerregionthreads(rid)``: Return the number of threads
   that executed the region identified by ``rid``

-  ``pylikwid.markerregiontime(rid, tidx)``: Return the accumulated
   measurement time for the region identified by ``rid`` and the thread
   index ``tidx``

-  ``pylikwid.markerregioncount(rid, tidx)``: Return the call count for
   the region identified by ``rid`` and the thread index ``tidx``

-  ``pylikwid.markerregionresult(rid, eidx, tidx)``: Return the event
   result for the region identified by ``rid``, the event index ``eidx``
   and the thread index ``tidx``

-  ``pylikwid.markerregionmetric(rid, midx, tidx)``: Return the derived
   metric result for the region identified by ``rid``, the metric index
   ``midx`` and the thread index ``tidx``
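
A result file could be walked region by region roughly as follows. This is a
sketch under assumptions: ``load_regions`` is a hypothetical helper, region
and thread indices are assumed to start at 0, and ``fake_likwid`` is a
stand-in with canned values (taken from the example output further down) so
that the code runs without a real result file.

::

    from types import SimpleNamespace


    def load_regions(likwid, filename):
        """Read a Marker API result file and return one dict per region."""
        likwid.markerreadfile(filename)
        regions = []
        for rid in range(likwid.markernumregions()):
            threads = range(likwid.markerregionthreads(rid))
            regions.append({
                "tag": likwid.markerregiontag(rid),
                "group": likwid.markerregiongroup(rid),
                "time": [likwid.markerregiontime(rid, t) for t in threads],
                "count": [likwid.markerregioncount(rid, t) for t in threads],
            })
        return regions


    # Hypothetical stand-in exposing only the calls used above.
    fake_likwid = SimpleNamespace(
        markerreadfile=lambda filename: None,
        markernumregions=lambda: 1,
        markerregiontag=lambda rid: "listappend",
        markerregiongroup=lambda rid: "L3",
        markerregionthreads=lambda rid: 1,
        markerregiontime=lambda rid, tidx: 0.091028,
        markerregioncount=lambda rid, tidx: 1,
    )

    for region in load_regions(fake_likwid, "/tmp/likwid_17275.txt"):
        print(region)

With pylikwid installed, the ``pylikwid`` module and the path of an actual
Marker API output file would be passed instead.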

GPU Topology (if LIKWID is built with Nvidia interface)
-------------------------------------------------------

-  ``pylikwid.initgputopology()``: Initialize the GPU topology module (reads
   in the GPU topology)

-  ``topolist = pylikwid.getgputopology()``: Return a list with the
   GPU topology of the system. Each GPU is represented by a dict. The entries in
   the dicts are:

   -  ``devid``: Device identifier for the GPU
   -  ``numaNode``: Identifier of the NUMA node the GPU is attached to
   -  ``name``: Name of the device
   -  ``mem``: Memory capacity of the device
   -  ``ccapMajor``: Major number of the compute capability
   -  ``ccapMinor``: Minor number of the compute capability
   -  ``maxThreadsDim[3]``: Maximum sizes of each dimension of a block
   -  ``maxGridSize[3]``: Maximum sizes of each dimension of a grid
   -  ``maxThreadsPerBlock``: Maximum number of threads per block
   -  ``sharedMemPerBlock``: Total amount of shared memory available per block
   -  ``totalConstantMemory``: Total amount of constant memory available on the device
   -  ``simdWidth``: SIMD width of arithmetic units = warp size
   -  ``memPitch``: Maximum pitch allowed by the memory copy functions that
      involve memory regions allocated through cuMemAllocPitch()
   -  ``regsPerBlock``: Total number of registers available per block
   -  ``clockRatekHz``: Clock frequency in kilohertz
   -  ``textureAlign``: Alignment requirement for textures
   -  ``surfaceAlign``: Alignment requirement for surfaces
   -  ``l2Size``: L2 cache in bytes. 0 if the device doesn't have L2 cache
   -  ``memClockRatekHz``: Peak memory clock frequency in kilohertz
   -  ``pciBus``: PCI bus identifier of the device
   -  ``pciDev``: PCI device (also known as slot) identifier of the device
   -  ``pciDom``: PCI domain identifier of the device
   -  ``maxBlockRegs``: Maximum number of 32-bit registers available to a thread block
   -  ``numMultiProcs``: Number of multiprocessors on the device
   -  ``maxThreadPerMultiProc``: Maximum resident threads per multiprocessor
   -  ``memBusWidth``: Global memory bus width in bits
   -  ``unifiedAddrSpace``: 1 if the device shares a unified address space
      with the host, or 0 if not
   -  ``ecc``: 1 if error correction is enabled on the device, 0 if error
      correction is disabled or not supported by the device
   -  ``asyncEngines``: Number of asynchronous engines
   -  ``mapHostMem``: 1 if the device can map host memory into the CUDA address space
   -  ``integrated``: 1 if the device is an integrated (motherboard) GPU and
      0 if it is a discrete (card) component

-  ``pylikwid.finalizegputopology()``: Delete all information in the
   GPU topology module

Performance Monitoring for Nvidia GPUs (if LIKWID is built with Nvidia interface)
---------------------------------------------------------------------------------

-  ``pylikwid.nvinit(gpus)``: Initialize the nvmon module for the GPUs
   given in list ``gpus``

-  ``pylikwid.nvgetnumberofgpus()``: Return the number of GPUs
   initialized in the nvmon module

-  ``pylikwid.nvgetnumberofgroups()``: Return the number of groups
   currently registered in the nvmon module

-  ``pylikwid.nvgetgroups()``: Return a list of all available groups. Each
   list entry is a dict:

   -  ``Name``: Name of the performance group
   -  ``Short``: Short information about the performance group
   -  ``Long``: Long description of the performance group

-  ``gid = pylikwid.nvaddeventset(estr)``: Add a performance group or a
   custom event set to the nvmon module. The ``gid`` is required to
   specify the event set later

-  ``pylikwid.nvgetnameofgroup(gid)``: Return the name of the group
   identified by ``gid``. If it is a custom event set, the name is set
   to ``Custom``

-  ``pylikwid.nvgetshortinfoofgroup(gid)``: Return the short information
   about a performance group

-  ``pylikwid.nvgetlonginfoofgroup(gid)``: Return the description of a
   performance group

-  ``pylikwid.nvgetnumberofevents(gid)``: Return the number of events in
   the group

-  ``pylikwid.nvgetnumberofmetrics(gid)``: Return the number of derived
   metrics in the group. Always 0 for custom event sets.

-  ``pylikwid.nvgetnameofevent(gid, eidx)``: Return the name of the event
   identified by ``gid`` and the index in the list of events

-  ``pylikwid.nvgetnameofcounter(gid, eidx)``: Return the name of the
   counter register identified by ``gid`` and the index in the list of
   events

-  ``pylikwid.nvgetnameofmetric(gid, midx)``: Return the name of a derived
   metric identified by ``gid`` and the index in the list of metrics

-  ``pylikwid.nvsetup(gid)``: Program the counter registers to measure all
   events in group ``gid``

-  ``pylikwid.nvstart()``: Start the counter registers

-  ``pylikwid.nvstop()``: Stop the counter registers

-  ``pylikwid.nvread()``: Read the counter registers (stop->read->start)

-  ``pylikwid.nvswitch(gid)``: Switch to group ``gid``
   (stop->setup(gid)->start)

-  ``pylikwid.nvgetidofactivegroup()``: Return the ``gid`` of the currently
   configured group

-  ``pylikwid.nvgetresult(gid, eidx, tidx)``: Return the raw counter
   register result of all measurements identified by group ``gid`` and
   the indices for event ``eidx`` and thread ``tidx``

-  ``pylikwid.nvgetlastresult(gid, eidx, tidx)``: Return the raw counter
   register result of the last measurement cycle identified by group
   ``gid`` and the indices for event ``eidx`` and thread ``tidx``

-  ``pylikwid.nvgetmetric(gid, midx, tidx)``: Return the derived metric
   result of all measurements identified by group ``gid`` and the
   indices for metric ``midx`` and thread ``tidx``

-  ``pylikwid.nvgetlastmetric(gid, midx, tidx)``: Return the derived
   metric result of the last measurement cycle identified by group
   ``gid`` and the indices for metric ``midx`` and thread ``tidx``

-  ``pylikwid.nvgettimeofgroup(gid)``: Return the measurement time for
   the group identified by ``gid``

-  ``pylikwid.nvfinalize()``: Reset all used registers and delete internal
   measurement results

Nvmon Marker API (if LIKWID is built with Nvidia interface)
-----------------------------------------------------------

-  ``pylikwid.gpumarkerinit()``: Initialize the Nvmon Marker API of the LIKWID library.
   Must be called before all other functions.

-  ``rr = pylikwid.gpuregisterregion(regiontag)``: Register a region with the
   Nvmon Marker API. This optional function reduces the overhead of
   region registration at ``pylikwid.gpumarkerstartregion``. If you don't call
   ``pylikwid.gpumarkerregisterregion(regiontag)``, the registration is done at
   ``pylikwid.gpumarkerstartregion(regiontag)``. On success, 0 is returned. If you
   haven't called ``pylikwid.gpumarkerinit()``, a negative number is returned.

-  ``err = pylikwid.gpumarkerstartregion(regiontag)``: Start measurements under
   the name ``regiontag``. On success, 0 is returned. If you haven't called
   ``pylikwid.gpumarkerinit()``, a negative number is returned.

-  ``err = pylikwid.gpumarkerstopregion(regiontag)``: Stop measurements under the
   name ``regiontag`` again. On success, 0 is returned. If you haven't
   called ``pylikwid.gpumarkerinit()``, a negative number is returned.

-  ``num_gpus, num_events, events[][], time[], count[] = pylikwid.gpumarkergetregion(regiontag)``:
   Get the intermediate results of the region identified by
   ``regiontag``. On success, it returns the number of GPUs, the number of
   events in the current group, a list with all the aggregated event
   results per GPU, the measurement times for the region and the numbers
   of calls.

-  ``pylikwid.gpunextgroup()``: Switch to the next event set in a
   round-robin fashion. If you have set only one event set on the
   command line, this function performs no operation.

-  ``pylikwid.gpumarkerreset(regiontag)``: Reset the values stored under the region
   name ``regiontag``. On success, 0 is returned.

-  ``pylikwid.gpumarkerclose()``: Close the connection to the LIKWID Nvmon Marker API
   and write out measurement data to file. This file will be evaluated
   by ``likwid-perfctr``.

Usage
=====

Marker API
----------

Code
~~~~

Here is a small example script showing how to use the LIKWID Marker API
in Python:

::

    #!/usr/bin/env python

    import pylikwid

    # Initialize the Marker API and register the current thread
    pylikwid.markerinit()
    pylikwid.markerthreadinit()

    liste = []

    # Measure the appending loop as region "listappend"
    pylikwid.markerstartregion("listappend")
    for i in range(0,1000000):
        liste.append(i)
    pylikwid.markerstopregion("listappend")

    # Fetch and print the intermediate results of the region
    nr_events, eventlist, time, count = pylikwid.markergetregion("listappend")
    for i, e in enumerate(eventlist):
        print(i, e)

    # Close the Marker API and write the results to file
    pylikwid.markerclose()

This code simply measures the hardware performance counters for
appending 1000000 elements to a list. First the API is initialized with
``pylikwid.markerinit()`` and ``pylikwid.markerthreadinit()``. Afterwards
it creates an empty list, starts the measurements with
``pylikwid.markerstartregion("listappend")`` and executes the appending loop.
When the loop has finished, we stop the measurements again using
``pylikwid.markerstopregion("listappend")``. Just for the example, we get the
values inside our script using ``pylikwid.markergetregion("listappend")`` and
print out the results. Finally, we close the connection to the LIKWID
Marker API.

You always have to use ``likwid-perfctr`` to program the hardware
performance counters and specify the CPUs that should be measured. Since
Python is commonly single-threaded, the CPU set only contains one entry,
as in ``likwid-perfctr -C 0 -g L3 -m ./test.py``. This pins the
Python interpreter to CPU 0 and measures the event set given with ``-g``
(here ``L3``) for all regions in the Python script. You can set multiple
event sets by adding multiple ``-g`` options to the command line. Please
see the LIKWID page for further information on how to use
``likwid-perfctr``: https://github.com/RRZE-HPC/likwid

Example
~~~~~~~

Using the above Python script we can measure the L2 to L3 cache data
volume:

::

    $ likwid-perfctr -C 0 -g L3 -m ./test.py
    --------------------------------------------------------------------------------
    CPU name:   Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
    CPU type:   Intel Core Haswell processor
    CPU clock:  3.39 GHz
    --------------------------------------------------------------------------------
    (0, 926208305.0)
    (1, 325539316.0)
    (2, 284626172.0)
    (3, 1219118.0)
    (4, 918368.0)
    Wrote LIKWID Marker API output to file /tmp/likwid_17275.txt
    --------------------------------------------------------------------------------
    ================================================================================
    Group 1 L3: Region listappend
    ================================================================================
    +-------------------+----------+
    |    Region Info    |  Core 0  |
    +-------------------+----------+
    | RDTSC Runtime [s] | 0.091028 |
    |     call count    |     1    |
    +-------------------+----------+
    +-----------------------+---------+--------------+
    |         Event         | Counter |    Core 0    |
    +-----------------------+---------+--------------+
    |   INSTR_RETIRED_ANY   |  FIXC0  | 9.262083e+08 |
    | CPU_CLK_UNHALTED_CORE |  FIXC1  | 3.255393e+08 |
    |  CPU_CLK_UNHALTED_REF |  FIXC2  | 2.846262e+08 |
    |    L2_LINES_IN_ALL    |   PMC0  | 1.219118e+06 |
    |     L2_TRANS_L2_WB    |   PMC1  | 9.183680e+05 |
    +-----------------------+---------+--------------+
    +-------------------------------+--------------+
    |             Metric            |    Core 0    |
    +-------------------------------+--------------+
    |      Runtime (RDTSC) [s]      |  0.09102752  |
    |      Runtime unhalted [s]     | 9.596737e-02 |
    |          Clock [MHz]          | 3.879792e+03 |
    |              CPI              | 3.514753e-01 |
    |  L3 load bandwidth [MBytes/s] | 8.571425e+02 |
    |  L3 load data volume [GBytes] |  0.078023552 |
    | L3 evict bandwidth [MBytes/s] | 6.456899e+02 |
    | L3 evict data volume [GBytes] |  0.058775552 |
    |    L3 bandwidth [MBytes/s]    | 1.502832e+03 |
    |    L3 data volume [GBytes]    |  0.136799104 |
    +-------------------------------+--------------+

First, a header with the current system type and clock is printed.
Afterwards, the output of the Python script lists the results of the
measurements we got internally with ``pylikwid.markergetregion``. The next
output is the region results evaluated by ``likwid-perfctr``. It starts
with a headline stating the measured event set, here ``L3``, and the
region name ``listappend``. Afterwards 2 or 3 tables are printed. The
first contains some basic information about the region, like the run time
(or rather the measurement time) and the number of calls of the region.
The next table contains the raw values for each event in the event set.
These numbers match the ones we got internally with
``pylikwid.markergetregion``. If you have set a performance group (here
``L3``) instead of a custom event set, the raw results are converted into
commonly used metrics, here the ``CPI`` (cycles per instruction, lower is
better) and different bandwidths and data volumes. You can see that the
load bandwidth for the small loop is about 857 MByte/s and the evict
(write) bandwidth is about 646 MByte/s. In total we have a bandwidth of
about 1503 MByte/s.
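
To make the step from raw counts to metrics concrete, the following sketch
recomputes some of the table values from the raw event counts shown above.
The formulas and the 64-byte cache line size are assumptions inferred from
the printed numbers; the authoritative definitions are in LIKWID's
performance group files. The function name ``l3_metrics`` is made up for
illustration.

::

    CACHELINE_BYTES = 64  # assumed cache line size of the system shown above


    def l3_metrics(instr, clk_core, l2_lines_in, l2_wb, runtime_s):
        """Recompute CPI, data volumes and bandwidths from raw counts."""
        load_bytes = l2_lines_in * CACHELINE_BYTES
        evict_bytes = l2_wb * CACHELINE_BYTES
        total_bytes = load_bytes + evict_bytes
        return {
            "CPI": clk_core / instr,
            "L3 load data volume [GBytes]": load_bytes / 1e9,
            "L3 load bandwidth [MBytes/s]": load_bytes / 1e6 / runtime_s,
            "L3 evict data volume [GBytes]": evict_bytes / 1e9,
            "L3 evict bandwidth [MBytes/s]": evict_bytes / 1e6 / runtime_s,
            "L3 bandwidth [MBytes/s]": total_bytes / 1e6 / runtime_s,
        }


    # Raw values printed by the example script and the RDTSC runtime
    metrics = l3_metrics(
        instr=926208305.0,        # INSTR_RETIRED_ANY
        clk_core=325539316.0,     # CPU_CLK_UNHALTED_CORE
        l2_lines_in=1219118.0,    # L2_LINES_IN_ALL
        l2_wb=918368.0,           # L2_TRANS_L2_WB
        runtime_s=0.09102752,     # Runtime (RDTSC) [s]
    )
    for name, value in metrics.items():
        print("%-30s %.6f" % (name, value))

For instance, 1219118 cache lines of 64 bytes give 78023552 bytes, which is
the 0.078023552 GBytes reported as L3 load data volume.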

Full API
--------

Code
~~~~

::

    #!/usr/bin/env python

    import pylikwid

    liste = []
    cpus = [0,1]

    # Initialize the perfmon module for the given CPUs
    pylikwid.init(cpus)

    # Add a custom event set and program the counter registers
    group = pylikwid.addeventset("INSTR_RETIRED_ANY:FIXC0")
    pylikwid.setup(group)

    # Measure the appending loop
    pylikwid.start()
    for i in range(0,1000000):
        liste.append(i)
    pylikwid.stop()

    # Print the result of event 0 for every measured CPU
    for thread in range(0,len(cpus)):
        print("Result CPU %d : %f" % (cpus[thread], pylikwid.getresult(group,0,thread)))

    pylikwid.finalize()

Example
~~~~~~~

::

    $ ./test.py
    Result CPU 0 : 87335.000000
    Result CPU 1 : 5222188.000000

Further comments
================

Please be aware that Python is a high-level language and even simple
code is translated into a lot of machine instructions. The ``CPI`` value
is commonly low (=> good) for high-level languages because they have to
perform type checking and similar bookkeeping that executes quickly
relative to the CPU clock. If you compare the results to a lower-level
language like C or Fortran, the ``CPI`` will be worse for them, but
the performance will be higher because no type checking and
transformations need to be done.