{"id":45740836,"url":"https://github.com/dimitriwauters/PANDI","last_synced_at":"2026-04-08T09:00:43.905Z","repository":{"id":156016447,"uuid":"586616485","full_name":"dimitriwauters/PANDI","owner":"dimitriwauters","description":null,"archived":false,"fork":false,"pushed_at":"2024-01-03T14:30:05.000Z","size":15066,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-02-20T16:03:28.959Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dimitriwauters.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-01-08T18:53:42.000Z","updated_at":"2024-02-20T16:03:28.961Z","dependencies_parsed_at":"2024-01-03T15:51:05.043Z","dependency_job_id":null,"html_url":"https://github.com/dimitriwauters/PANDI","commit_stats":null,"previous_names":["dimitriwauters/pandi"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dimitriwauters/PANDI","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dimitriwauters%2FPANDI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dimitriwauters%2FPANDI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dimitriwauters%2FPANDI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dimitriwauters%2FPANDI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dimitriwauters","download_url":"https://codeload.github.com/dimitriwauters/PANDI/tar.gz/refs
/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dimitriwauters%2FPANDI/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31547845,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T16:28:08.000Z","status":"online","status_checked_at":"2026-04-08T02:00:06.127Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-25T14:00:28.947Z","updated_at":"2026-04-08T09:00:43.880Z","avatar_url":"https://github.com/dimitriwauters.png","language":"Jupyter Notebook","readme":"# PANDI\n\nPANDI is a ***dynamic packing detection*** solution built on top of PANDA (https://github.com/panda-re/panda), a platform for Architecture-Neutral Dynamic Analysis. \nPANDI is currently developed at UCLouvain (Belgium) and is available under the MIT license.\n\n## How to use\n***First, put the malware(s) that need to be analysed under the folder \"payload\".***  \nThen, there are two main ways of running this program:\n- Use launch.py\n- Use the docker-compose provided\n\n### Launching the packing detector\n#### Launch.py\nThis script is an interface between the Docker implementation of PANDA and the host machine.  \nIt allows the user to easily modify the parameters of the software, as shown below.  
\nIt will also download automatically the Windows VM needed to run PANDA and build the Docker image.\n```\nusage: launch.py [-h] [--build] [--silent] [--debug] [--executable EXECUTABLE] [--force_complete_replay] [--max_memory_write_exe_list_length MAX_MEMORY_WRITE_EXE_LIST_LENGTH] [--entropy_granularity ENTROPY_GRANULARITY]\n                 [--max_entropy_list_length MAX_ENTROPY_LIST_LENGTH] [--dll_discover_granularity DLL_DISCOVER_GRANULARITY] [--max_dll_discover_fail MAX_DLL_DISCOVER_FAIL] [--force_dll_rediscover] [--memcheck] [--entropy] [--dll]\n                 [--dll_discover] [--sections_perms] [--first_bytes]\n\noptions:\n  -h, --help            show this help message and exit\n  --build               rebuild panda image\n  --silent              only print the result in JSON format\n  --debug               activate verbose mode\n  --executable EXECUTABLE\n                        force the selection of one software\n  --force_complete_replay\n                        read the replay until the end\n  --max_memory_write_exe_list_length MAX_MEMORY_WRITE_EXE_LIST_LENGTH\n                        maximum length of the returned list before exiting\n  --entropy_granularity ENTROPY_GRANULARITY\n                        number of basic blocks between samples. 
Lower numbers result in higher run times\n  --max_entropy_list_length MAX_ENTROPY_LIST_LENGTH\n                        maximum length of entropy list before exiting\n  --dll_discover_granularity DLL_DISCOVER_GRANULARITY\n                        maximum length of the returned list before exiting\n  --max_dll_discover_fail MAX_DLL_DISCOVER_FAIL\n                        maximum length of the returned list before exiting\n  --force_dll_rediscover\n                        read the replay until the end\n  --memcheck            activate memory write and executed detection\n  --entropy             activate entropy analysis\n  --dll                 activate syscalls analysis\n  --dll_discover        activate dll discovering system\n  --sections_perms      activate sections permission analysis\n  --first_bytes         activate first bytes analysis\n  --count_instr         activate the instruction counter\n```\n\n#### docker-compose\nBefore using the provided docker-compose, the virtual machine that will be used to perform the analysis needs to be downloaded.   \nYou can find it by following this link: https://uclouvain-my.sharepoint.com/:u:/g/personal/d_wauters_uclouvain_be/EZXz0Kf1U_VEhQSddwlPOI4B_oKqEwY-HmxC5Nv6Wd4WSA?e=09Zg7E   \n(or by following the procedure to create a new virtual machine)\n\nOnce the virtual machine is downloaded, the process can be launched like any docker-compose project.\n\n### Importing additional DLLs\nYou can add new DLLs that are not present in the virtual machine.\nThis might be useful in the case of a sample that needs a specific, non-standard DLL.   \nTo add these DLLs to the virtual machine, you can simply put them in the `additional-dll` folder, and they will be loaded\nin parallel to the sample.\n\n## Usage\nThe analysis options of this software can be combined, but at least one must be enabled.  \nSome parameters can be tweaked to modify the behavior of the whole software. 
These parameters are:\n- `--build`: rebuild the docker image\n- `--silent` (or `panda_silent=False`): only print the result in JSON format\n- `--debug` (or `panda_debug=False`): save the output of the replay to the `.debug` folder (verbose)\n- `--executable` (or `panda_executable=None`): choose one executable from the `payload` folder to be executed\n\n### Memory Write\u0026Execution Detection\n\u003eThis option must be activated with the `--memcheck` parameter on `launch.py` or by modifying the `docker-compose.yml` file by adding `panda_memcheck=True` in the environment variables.\n\nThis memory check analyses the memory of a given program by detecting each time it tries\nto execute a piece of memory that it has previously written to. If the list of written-then-executed memory regions is\nnot empty and contains some consecutive addresses, we can consider that the analysed software is in fact packed.\n\nThis analysis works by using the `@panda.cb_virt_mem_after_write` callback from PANDA to register each memory address\nwritten. A second callback, `@panda.cb_before_block_exec`, then detects\nwhen a previously written address (recorded by the first callback) is executed.\n\nSome parameters can be tweaked to modify the behavior of this option. These parameters are:\n- `--max_memory_write_exe_list_length=1000` (or `panda_max_memory_write_exe_list_length=1000`) defines the maximum length\nof the written-then-executed list before the analysis is cut short. This reduces the execution time once enough\ndata has been collected. The default value of this parameter is 1000.\n\n### Entropy Analysis\n\u003eThis option must be activated with the `--entropy` parameter on `launch.py` or by modifying the `docker-compose.yml` file by adding `panda_entropy=True` in the environment variables.   
\n\u003e This option will need the help of machine learning to give the result.\n\nThe entropy analysis gathers the entropy of each of the program's sections at every execution of a basic block\n(with a defined granularity). These entropy points are then used to compute statistics to determine, with the\nhelp of some machine learning, if the analysed software is packed or not.   \nThe entry point of the software and the entry point of the unpacked software (if any) will also be used to extract statistics.\n\nSome parameters can be tweaked to modify the behavior of this option. These parameters are:\n- `--entropy_granularity=1000` (or `panda_entropy_granularity=1000`) defines the granularity between two measurements\nof the entropy. We use the basic blocks of PANDA as our metric and analyse only a portion of them to minimize the time\nneeded to finish the whole analysis. Here the entropy is computed every 1000 basic blocks.\n- `--max_entropy_list_length=0` (or `panda_max_entropy_list_length=0`) defines the maximum length of the list containing\nthe computed entropy of the sections. If the length of this list reaches the limit, the entropy analysis is stopped. The\ndefault value for this parameter is 0, meaning that the list can be any size.\n\n### Syscalls Analysis\n\u003eThis option must be activated with the `--dll` parameter on `launch.py` or by modifying the `docker-compose.yml` file by adding `panda_dll=True` in the environment variables.   \n\u003e This option will need the help of machine learning to give the result.\n\nThis analysis recovers the initially imported functions by recovering the IAT (Import Address Table) of the software\nand raising an event when the address currently executed corresponds to an imported function. 
It also detects (thanks to\nthe `syscalls2` PANDA plugin) when a DLL is used and produces a list of used DLLs before and after the detected entry point\nof the unpacked part of the software (if any).\n\nIf the initial IAT contains `GetProcAddress` or `LoadLibrary`, this module is able to count the number of times these\nfunctions are called and also recover the imported function or DLL to later detect their usage.   \nThe data collected on these syscalls will be used, in a machine learning algorithm, to determine if the analysed software\nis packed or not.\n\n### Automatic DLL Discovery\n\u003eThis option must be activated with the `--dll_discover` parameter on `launch.py` or by modifying the `docker-compose.yml` file by adding `panda_dll_discovery=True` in the environment variables.\n\nThis option is useful in the case of a corrupted/missing/stripped IAT (Import Address Table).\nIt parses the DLLs loaded in memory by the analysed sample to discover the functions exported by these DLLs.\nThis means that we no longer need to parse the IAT to recover the address of each function that will be called by the sample:\nwe will know them before the execution of the sample.\n\nTo limit the overhead of this option, the result is saved into a file available at `/payload/dll` and will be used\nfor the next analysis. If you don't want to reuse the previously computed result, you can delete the file or add the\n`--force_dll_rediscover` parameter on `launch.py` or add `panda_force_dll_rediscover=True` in the environment variables of the `docker-compose.yml`.\n\nThis option must be used together with the [Syscalls Analysis](#syscalls-analysis) option, which benefits from the discovered\nDLLs to perform its analysis. This option will not work alone.\n\nSome parameters can be tweaked to modify the behavior of this option. 
These parameters are:\n- `--dll_discover_granularity=1000` (or `panda_dll_discover_granularity=1000`) defines the granularity between \ntwo analyses of the DLLs loaded in memory. This parameter works exactly like the one for the entropy. The default value is \nan analysis every 1000 basic blocks.\n- `--max_dll_discover_fail=10000` (or `panda_max_dll_discover_fail=10000`) defines the maximum number of failures allowed \nbefore the discovery of DLL functions is shut down. Not every function of a DLL is mapped in memory when the DLL is loaded,\nmeaning that some will throw an error when trying to get information about them; this is the type of failure counted here.\nThe default value is 10 000 errors.\n- `--force_dll_rediscover=False` (or `panda_force_dll_rediscover=False`) forces the re-discovery of DLL functions even if\nit was already done in the past, as explained above.\n\n### Section Permissions Modification Detection\n\u003eThis option must be activated with the `--sections_perms` parameter on `launch.py` or by modifying the `docker-compose.yml` file by adding `panda_section_perms=True` in the environment variables.   \n\u003e This option will need the help of machine learning to give the result.\n\nThis analysis makes an additional verification regarding the headers of the executable. It recovers the initial permissions\nof the different sections at the beginning of the execution and tries, at multiple points during the execution of the sample,\nto write to each section that was initially declared read-only.\n\nThis reveals whether the section permissions have been changed during the execution of the program, giving an indication\nthat the sample may perform an unpacking procedure.\n\n### First Bytes Extraction\n\u003eThis option must be activated with the `--first_bytes` parameter on `launch.py` or by modifying the `docker-compose.yml` file by adding `panda_first_bytes=True` in the environment variables.   
\n\u003e This option will need the help of machine learning to give the result.\n\nThis option extracts the first 64 bytes executed by the sample. These are not necessarily the bytes located at the address\npointed to by the entry point in the PE header, as the entry point can be bypassed. For example, with TLS (Thread Local Storage)\ncallbacks, malware can execute some code before executing the code pointed to by the entry point. This can be used by some\nmalware to evade debugging.\n\nThis option detects the real first bytes executed by the sample, which makes it possible to detect this kind of evasive behavior.\n\n## Output (results)\nAn output directory named `ouput`, containing the collected data, is created when the analysis ends. This directory\ncontains a sub-folder for each sample analysed, with the name of the sample as the name of the sub-folder.   \nThese results can be read easily as they are serialized with the Python module `pickle`, but a script is provided for convenience.\nThis script is called `output_read.py` and takes the name of the sample you want to analyse as a parameter. It will then\nprint the result collected by each activated option during the analysis.\n\n## Improvements\n\n### Exact Unpacked Entry-Point Detection\nCurrently, the entry-point of the unpacked program is detected (if only packed once) but this detection is not precise.\nSince not every instruction is observed, in order to reduce the processing time, the exact entry-point can fall between two lookups.\nThis means that the analysis will detect an approximation of the entry-point (for example 1000 instructions later) but not the exact one.\n\nTo cope with that, we can think of two approaches:\n- The first one is simply to set the granularity to zero with the variable `panda_entropy_granularity`, but this will\nsignificantly increase the time needed to finish the analysis as every instruction will be analysed.\n- The second needs a little more work. 
It can be done by temporarily setting the variable `panda_entropy_granularity`\nto 0 when we see that the entry-point of the unpacked code is close, for example when a dynamically imported function is \ncalled (DYNAMIC_DLL).\n\nThis functionality has not been implemented, as the purpose of this software is not to extract the unpacked data but only\nto determine whether the provided sample is packed or not.\n\n## Build VM from scratch\nIf you want to build your own VM or the given link is broken, this section explains how to rebuild the VM for PANDA.\n\n* Download a Windows 7 x86 virtual machine from a known source   \nFor example: https://az792536.vo.msecnd.net/vms/VMBuild_20150916/VirtualBox/IE8/IE8.Win7.VirtualBox.zip\n* Transform the virtual machine into a QEMU compatible one (if not already)   \nFor example with `qemu-utils` on Linux\n* Launch the virtual machine, open a prompt and make a snapshot of the machine (in slot 1)\n  * To do that, a script is provided. It can be launched with `docker-compose -f docker-compose.newvm.yml run pandare`\n  * You can add, prior to launching the script, any files you want to launch on the VM. 
Those files can be put in a folder called `new-vm`\n  * Now connect to the VM with a VNC viewer (like Remmina) through the IP of the docker container\n  * Install the programs you want and finish by opening a prompt\n  * Once it's done, type `finished` in the console (where you typed `docker-compose -f docker-compose.newvm.yml run pandare`)\n  * The VM is now ready to be used!\n  * (If you don't want to save the modifications you have made, you can type anything else in the console and the snapshot will not be saved)\n* Name the virtual machine `vm.qcow2` and place the file under the folder `./docker/.panda`\n* ENJOY\n","funding_links":[],"categories":[":wrench: Tools"],"sub_categories":["Before 2000"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdimitriwauters%2FPANDI","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdimitriwauters%2FPANDI","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdimitriwauters%2FPANDI/lists"}