{"id":19662203,"url":"https://github.com/efeslab/igor-ae","last_synced_at":"2026-06-03T23:31:36.653Z","repository":{"id":127885785,"uuid":"336148837","full_name":"efeslab/igor-ae","owner":"efeslab","description":"Artifact evaluation for IGOR (RTAS '21) ","archived":false,"fork":false,"pushed_at":"2021-04-12T18:25:26.000Z","size":64292,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-02-27T03:19:12.812Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/efeslab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-05T03:07:20.000Z","updated_at":"2023-07-03T08:05:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"af788815-a7bc-427d-b10e-72b928a8ce06","html_url":"https://github.com/efeslab/igor-ae","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/efeslab/igor-ae","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/efeslab%2Figor-ae","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/efeslab%2Figor-ae/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/efeslab%2Figor-ae/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/efeslab%2Figor-ae/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/efeslab","download_url":"https://codeload.github.com/efeslab/igor-ae/tar.gz/refs/heads/main
","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/efeslab%2Figor-ae/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":285704833,"owners_count":27217837,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-21T02:00:06.175Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T16:09:55.726Z","updated_at":"2025-11-21T23:02:35.863Z","avatar_url":"https://github.com/efeslab.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# IGOR (RTAS '21)\n\nThis repository contains artifacts for our RTAS '21 paper on IGOR, an approach\nfor accelerating BFT SMR by eagerly executing on sensor data on multiple cores.\n\nThe repo contains the following directories.\n\n- `data/` - raw timing data logged from our prototype\n- `evaluation/` - scripts for reproducing results in our paper evaluation\n- `paper/` - a copy of the IGOR paper (once the camera-ready version is available)\n- `scripts/` - scripts used for timing analysis and visualization\n- `setup/` - a schematic of the circuit we used for time synchronization in our prototype\n- `software/` - a Core Flight System project with our implementation source code\n\nWe split this README file into two main sections.\n\nThe [first section](#running-igor) describes how to set up the RPi cluster used\nin our evaluation, 
as well as how to build and run the IGOR software. This\nsection is intended for researchers seeking to test and extend IGOR in their\nown environment.\n\n**For AE Committee:** The [second section](#repeating-results) describes how to\nrepeat the key results from the IGOR paper. This section is intended primarily\nfor the artifact evaluation committee.\n\n\n## Deploying IGOR\n\nThis section is intended for researchers setting up their own RPi cluster in\norder to test or extend IGOR.\n\n\n### Setup\n\nOur implementation of IGOR is designed to run on a cluster of RPi 3B+ single\nboard computers. One RPi emulates sensors and actuators. The other RPis act as \nreplicas.  We tested IGOR with RPis running Raspbian 9.4 with kernel version\n4.14.34 and the PREEMPT RT patch.\n\n#### Network setup\n\nThe RPis are connected by an Ethernet switch. The IP addresses of the RPis are\ndefined in the `AFDX_LIB_IpTable` array in `igor-ae/software/igor/igor_defs/tables/afdx_lib_vls.c`.\n\n```c\nAFDX_IpEntry_t AFDX_LIB_IpTable[CPU_COUNT] =\n{\n    {1,  \"10.0.0.21\"}, /* replica 0 */\n    {2,  \"10.0.0.22\"}, /* replica 1 */\n    ...\n    {11, \"10.0.0.31\"}, /* sim interface */     \n};\n```\n\nBefore running IGOR, add the following at the end of `/etc/sysctl.conf` to\nincrease socket buffer sizes. Then reboot the RPis.\n\n```script\nnet.core.rmem_default=512000\nnet.core.wmem_default=512000\nnet.core.rmem_max=512000\nnet.core.wmem_max=512000\n```\n#### Timing setup\n\nThe RPis must be synchronized via an external timing circuit. The flight software\nperforms one processing step each time the voltage level on GPIO pin 4 changes.\nWe provide a diagram of the timing circuit in `igor-ae/setup/`.\n\nAdjust the resistor and capacitor values to change the schedule frequency. 
The\ncFS scheduler is currently configured to run at a 500 Hz rate (see `software/igor/igor_defs/sch_sync_platform_cfg.h`).\nBecause a processing step runs on every voltage change and each cycle of the circuit produces two changes, the circuit needs a frequency of 250 Hz.\nTo get approximately this rate, you can use Ra = 1 kΩ, Rb = 3 MΩ, and C = 1 nF. \n\n#### Additional Setup\n\nBefore running IGOR, it is necessary to increase the number of threads that one\ntask can spawn. Do this by adding the following to the bottom of \n`/etc/systemd/logind.conf`.  Then reboot the RPis.\n\n```script\nUserTasksMax=infinity\n```\n\n**Note:** Always safely shut down the RPis before powering them off. Otherwise, it is\neasy to corrupt the SD cards. Safely shut down with the following.\n\n```script\n$ sudo shutdown -h now\n```\n\n\n### Steps to minimize timing variability\n\nTo minimize timing variability, it is necessary to perform additional configuration\non the RPis. Specifically, it is necessary to:\n\n#### Isolate a core for inter-replica communication\n\nThis implementation assumes core 0 is isolated for I/O threads. To prevent the\nLinux scheduler from running tasks on core 0, add the following to `/boot/cmdline.txt`\non each RPi, then reboot the RPis.\n\n```script\nisolcpus=0\n```\n#### Slow down background tasks\n\nCertain background tasks will cause unnecessary timing variability. One such culprit\nis `x2gocleansessions`, which will periodically consume 1% of the CPU if using X2Go.\nTo slow it down, run `$ sudo nano /usr/sbin/x2gocleansessions` and change `while (sleep 2)`\nto `while (sleep 1000)`.\n\n#### Built-in steps\n\nIn addition to the above, IGOR performs additional configuration changes automatically\nin order to reduce timing variability. These include:\n\n- Setting real-time task priorities. \n- Pinning processes and threads to specific cores.\n- Preventing the RPi from varying its clock speed automatically. \n- Preventing tasks from being forced to yield the CPU to the OS scheduler. \n\n**Note:** Despite these steps, other factors can also impact timing. 
These include the IRQ affinity settings (set via `smp_affinity` in `/proc/irq/\u003cirq_num\u003e/`), the runlevel for your platform, and whether the inter-replica traffic is sharing network resources with other traffic (e.g. for SSH).\n\n\n### Building IGOR\n\nIGOR is implemented in NASA's Core Flight System (cFS). The top-level cFS\ndirectory is `software/igor`. We developed 11 supporting libraries and\napplications for IGOR, which can be found in `software/igor/apps`.\n\n- **afdx_lib** - emulates an AFDX network over standard Ethernet\n- **bcast_lib** - implements reliable/Byzantine broadcast primitive\n- **comp_lib** - executes mock speculative computations\n- **exchange_lib** - primitives for exchanging information between replicas\n- **io_lib** - primitives for communicating between sensors and replicas\n- **log_lib** - used for recording activities to a log file\n- **select_lib** - executes mock source selection algorithms\n- **state_lib** - used for distributing state between replicas\n- **vote_lib** - implements various voting functions\n- **sch_sync** - application for synchronizing the execution of cFS tasks\n- **sim** - emulates sensors and actuators\n\nWe also implemented the following applications for testing each BFT protocol.\n\n- **test_bft_ef** - an agree-execute system using Lamport, Shostak, and Pease’s Oral Messages protocol (OM)\n- **test_bft_turpin_ef** - an agree-execute system using Turpin and Coan's reduction protocol (TC)\n- **test_igor_ef** - multi-fault version of IGOR that does use Filtering (IGOR for f \u003e 1)\n- **test_igor_nofilter_ef** - single-fault version of IGOR that does not use Filtering (IGOR for f = 1)\n- **test_no_rep** - a simple non-replicated system (NoRep)\n\n\n#### Configuring IGOR\n\nEach build of IGOR is configured with two configuration files found at\n`software/igor/igor_defs/cpuN_cfe_es_startup.scr` and\n`software/igor/igor_defs/targets_config.cmake`.\n\nThe `cpuN_cfe_es_startup.scr` controls which applications 
run when cFS is started,\nwhere N indicates the RPi that uses the startup script. Each startup file must\nbe edited to start the BFT protocol you want to run. To select a protocol,\nedit the `cpuN_cfe_es_startup.scr` file for each replica\n(e.g. `cpu1_` for the first replica, `cpu4_` for the fourth replica) and ensure\nit ends in one of the following lines:\n\n```script\nCFE_APP, /cf/test_igor_nofilter_ef.so, TEST_IGOR_NOFILTER_EF_AppMain, TEST_IGOR_EF_NOFILTER, 90, 64000,  0x0, 0, 0;\nCFE_APP, /cf/test_igor_ef.so,          TEST_IGOR_EF_AppMain,          TEST_IGOR_EF,          90, 64000,  0x0, 0, 0;\nCFE_APP, /cf/test_bft_turpin_ef.so,    TEST_BFT_TURPIN_EF_AppMain,    TEST_BFT_TURPIN_EF,    90, 64000,  0x0, 0, 0;\nCFE_APP, /cf/test_bft_ef.so,           TEST_BFT_EF_AppMain,           TEST_BFT_EF,           90, 64000,  0x0, 0, 0;\nCFE_APP, /cf/test_no_rep.so,           TEST_NO_REP_AppMain,           TEST_NO_REP,           90, 64000,  0x0, 0, 0;\n```\nDo not alter the other applications in the startup file.\nNote that all contents of the file after the first `!` character are ignored.\n\nNext, we must edit the `targets_config.cmake` file, which contains many configuration\nparameters that control the behavior of the software. The comments in the file\ndescribe each of the configuration parameters. Below, we focus on the most\nimportant parameters.\n\nFirst, edit `SIM_MODE` according to the BFT protocol you are using. It should\nbe set to `SIM_MODE=SIM_MODE_NO_REP` when using the **test_no_rep** protocol,\nand `SIM_MODE=SIM_MODE_VOTE` in all other cases.\n\nNext, we must choose the cFS schedule tables that control in which slots each\ntask is executed. The `REPLICA_SCH_TABLE` parameter controls the task schedule\non the replica RPis. The `SIM_SCH_TABLE` parameter controls the task schedule on\nthe sensor/actuator RPi.\n\nBy default, these are set to \"default\" schedule tables located in \n`software/igor/igor_defs/tables/default`. 
These default tables spread each\nexecution of the protocol over a one-second major frame, which is useful when\nmeasuring the worst-case timing of each constituent code segment.\n\n```script\n# Specify the replica schedule table.\nSET(REPLICA_SCH_TABLE \"default/default_replica_table_bft_ef.c\")\n\n# Specify the sim schedule table.\nSET(SIM_SCH_TABLE \"default/default_sim_table.c\")\n```\n\nThey can easily be set to any custom tables stored in `software/igor/igor_defs/tables`.\n\n\n#### Running IGOR\n\nWe build IGOR by first copying the software to each of the RPis, then building\nIGOR on the RPis in parallel. We do this by running the `build-cfs-rpi.sh` script,\nwhich can be found in `software/igor`.\n\n```script\n$ cd software/igor\n$ ./build-cfs-rpi.sh\n```\n\nThe `USER` and `USER_RPI` variables at the top of the script need to be set\nto your username on your host computer (where you are copying IGOR from) and\non your RPis, respectively. After IGOR builds on each RPi, it will be copied to\na `cfs/` directory on the user's desktop on the RPi.\n\nTo run IGOR, ssh into each RPi (i.e. each of the replicas, and the sensor/actuator\nRPi) and execute the following.\n\n```script\n$ cd ~/Desktop/cfs\n$ sudo ./core-cpuN   # where N is the RPi's ID\n```\n\nEach RPi will initialize the scheduler application and start waiting for \ninterrupts from the timing circuit (described in [this section](#timing-setup)).\nPress the push-button in the circuit to start the software on the RPis. Stop\nthe RPis by pressing the push-button again, then typing `CTRL-C` on each RPi.\n\n\n### Processing timing data\n\nAfter running IGOR, a log `log_cpu=N.dat` is produced on each RPi. The log \ncontains timestamps of when each major code segment in the protocol started and\nfinished execution.\n\nWe provide a variety of scripts for analyzing the timing data. The scripts are\ntested on Ubuntu 20.04 LTS. 
We provide a list of dependencies to install \nin `scripts/dependencies.txt`.\n\n#### Visualizing the execution\n\nTo visualize the contents of the log, we developed a script `view_timing` found\nin the `scripts/` directory. To run it, use the following. Note that this script\ncan take considerable time to run on large logs. In general, we recommend it\nonly be used on logs that are tens of seconds or shorter.\n\n```script\n$ cd scripts/view_timing/\n$ ./main.py   # then choose log_cpu=N.dat\n```\n\nIf you want to stop the script early, press `CTRL-C`. The `output/` directory\nwill be populated with a visualization of the log. Solid blocks represent the\ntimes during which each activity (specified on the left) is performed.\n\n#### Generating compressed schedules\n\nOther useful scripts are the `compress_schedule` scripts, which can be used to\ngenerate the schedule tables for a specific configuration by analyzing the\ntiming data produced when using the spread-out \"default\" schedules (described\nin [this section](#configuring-igor)).\n\nEssentially, the scripts measure the worst-case execution time for each major\ncode segment in the protocol, then generate a compressed schedule in which \neach activity is allocated as little time as possible (with a configurable\nmargin).\n\nThe scripts are configured by editing the `scripts/compress_schedule/config.py`\nconfiguration file. Other parameters that can be adjusted besides margin include\nthe number of slots in each one-second major frame (called `MINOR_FRAME_COUNT`)\nand the length of each slot (called `MINOR_FRAME_MS`). \n\nTo run the scripts, execute the following. Note that there is a different version\nof the script for each BFT protocol.\n\n```script\n$ cd scripts/compress_schedule/\n$ ./compress_bft_ef.py   # then choose log_cpu=N.dat\n```\nThe script produces two schedule tables, one for the replicas, and one for the\nsensor/actuator RPi. 
To use these schedules, copy them to\n`software/igor/igor_defs/tables` and specify that they should be used in\n`software/igor/igor_defs/targets_config.cmake`. This process is described in\n[this section](#configuring-igor).\n\nIn addition, it is necessary to set the timeouts used for each stage of the BFT\nprotocol to match the time allocated to each stage within the schedule. This is\nnecessary in cases where messages can be dropped by the network, or faulty\ndevices may fail to send messages. To set timeouts, edit the \n`software/igor/apps/test_X/fsw/src/test_X.c` file corresponding to your protocol.\nUnfortunately, this process must currently be done\nmanually. In the future it should be automated.\n\n**Note:** Because `compress_schedule` works by measuring the worst-case execution\ntime of each code segment over a given run, it is heavily impacted by factors that\nincrease timing variability (described in [this section](#steps-to-minimize-timing-variability)).\n\nTo determine whether your worst-case timing measurements are being inflated by\nuncontrolled factors, set the `PRINT_HIST` parameter in `config.py` to `True`.\nThis will cause a histogram to be printed for each major code segment that shows\nthe distribution of measurements taken from the log. If a small number of \nmeasurements are significantly higher than the others, it is likely they are \nbeing inflated by [other factors](#steps-to-minimize-timing-variability).\n\n**Note:** Certain protocols naturally exhibit different timing distributions.\nFor example, **test_bft_ef** typically has wide timing distributions during the\nagreement phase when tolerating multiple faults, since it needs to break\nmessages into multiple fragments, each of which can be delayed variably due to a\nvariety of factors. 
In contrast, **test_bft_turpin_ef** and **test_igor_ef**\ntypically have narrow timing distributions in the agreement phase, since they each\nonly need to send a small number of minimally-sized messages.\n\n\n## Repeating Results\n\nThis section is intended for researchers who want to reproduce results for the\nIGOR paper. To minimize effort on the part of reviewers, we provide an Ubuntu\n20.04 LTS virtual machine that already contains the required scripts and\ndependencies outlined [above](#processing-timing-data). The virtual machine\nhas been tested with VirtualBox 6.1.\n\nThe virtual machine image can be downloaded from [here](https://drive.google.com/file/d/1yImohmgo7i_Mb3-7MkVPr02hBEE2hpz7/view?usp=sharing).\n\n\n### Setting up the VM\n\nDownload the VM `.ovf` file from the above link.  Next, import it into your\nchosen virtual machine manager software. On VirtualBox, this is done by going\nto **File \u003e Import Appliance**, selecting the `.ovf` file, and following the\nwizard.\n\nOnce imported, run the VM. The username and password are both `rtas21`.\n\nThe `igor-ae` repository has already been cloned to `/home/rtas21`.\nThe file structure is described at the top of this README.\n\n\n### Latency\n\nAs described in Section 6A of the paper, we generated latency plots by first\nmeasuring the worst-case latencies of each major code segment for each BFT\nprotocol, then generating schedules that ran each code segment in sequence with\na 10% margin. The end-to-end latency for each BFT protocol was determined as the time\ndifference in the resulting schedules between when the sensors were scheduled to\nsend inputs to the replicas and when the actuators were scheduled to read outputs\nfrom the replicas.  \n\nTo get the initial timing of each code segment, we ran each protocol in each\nconfiguration using its \"default\" schedule (described [here](#configuring-igor))\nover 100 iterations. The raw logs produced from this activity are provided in\n`~/igor-ae/data`. 
The first two columns of the logs contain application and\nactivity codes. The third column contains timestamps in microseconds. \n\nTo generate the new schedules, we ran the `compress_schedule` script (described \n[here](#generating-compressed-schedules)) on each log. The resulting schedules\nwere then recompiled and tested in the IGOR software.\n\nTo repeat this compression step on the raw data and generate new schedules for\nall test cases, do the following:\n\n```script\n$ cd ~/igor-ae/evaluation/partA_latency\n$ ./batch.py \n```\n\nThe script will run for around 30 seconds. When it finishes, the following\ndirectory will be populated with all the schedule tables that were generated.\n\n```script\n~/igor-ae/evaluation/partA_latency/output\n```\n\nIn addition, the following directory will be populated with the latency plots\ndetermined by measuring the generated schedules. These plots should match the\nplots in **Fig. 4** of the paper.\n\n```script\n~/igor-ae/evaluation/partA_latency/figures\n```\n\n\n### Schedulability\n\nAs described in Section 6B of the paper, we generated schedulability plots by varying core\nutilization from 0.1 to 1, and randomly generating 1000 tasksets for each utilization.\nWe scheduled tasksets in periodic rate groups from smallest period to largest. For\neach taskset, we determined if the taskset was schedulable with the default BFT protocol\n(OM for f = 1, TC for f = 2). 
If not, we replaced any tasks that did not meet deadlines\nwith speculative IGOR tasks, then determined whether the resulting taskset was schedulable.\n\nThe worst-case number of time slots needed to execute each stage of each protocol was\ndetermined by parsing the raw timing data in `~/igor-ae/data`.\n\nTo repeat these steps, including re-processing the raw timing data,\nexecute the following:\n\n```script\n$ cd ~/igor-ae/evaluation/partB_schedulability\n$ ./batch.py \n```\n\nThe script will randomly generate the tasksets and report the fraction of\nschedulable tasksets at each utilization.\n\nIn addition, the following directory will be populated with the resulting\nschedulability plots. These plots should be similar to **Fig. 5** of the paper.\nNote that there will be some differences, since the tasksets are randomly\ngenerated for each execution.\n\n```script\n~/igor-ae/evaluation/partB_schedulability/figures\n```\n\n\n### Computation Overhead\n\nAs described in Section 6C of the paper, we evaluated IGOR's computation\noverhead by taking the schedules generated in Section 6B (using the default\nBFT protocol alone, as well as using IGOR for unschedulable tasks), and for each\nutilization, calculating the average CPU capacity remaining per core. Note that\nunschedulable tasksets were excluded from the calculation.\n\nTo repeat these steps, execute the following:\n\n```script\n$ cd ~/igor-ae/evaluation/partC_capacity\n$ ./batch.py \n```\n\nThe scripts will parse the schedules generated in `partB_schedulability/` and\nreport the average remaining capacity per core.\n\nIn addition, the following directory will be populated with the capacity plots.\nThese plots should be similar to **Fig. 6** of the paper. 
Note that there will\nbe some differences, since the tasksets are randomly generated for each\nexecution.\n\n```script\n~/igor-ae/evaluation/partC_capacity/figures\n```\n\nA missing histogram bar means that no tasksets were schedulable at the given\nutilization for the given protocol, and thus the average remaining capacity\ncould not be calculated.\n\n\n### Communication Cost\n\nAs described in Section 6D of the paper, we evaluated IGOR's communication\noverhead by running each BFT protocol for 100 iterations, sniffing the traffic\nover the network, and counting the total number of bytes communicated. The\nprogram we used for sniffing the network is provided in `igor-ae/scripts/count_bytes`.\nThe program generates logs of the total number of bytes communicated. We provide\nthese raw logs in `igor-ae/evaluation/partD_communication/data`.\n\nTo parse the logs and compare the communication overheads of the protocols,\nexecute the following.\n\n```script\n$ cd ~/igor-ae/evaluation/partD_communication\n$ ./batch.py \n```\n\nThe following directory will be populated with the communication overhead plots.\nThese plots should match **Fig. 7** in the paper.\n\n```script\n~/igor-ae/evaluation/partD_communication/figures\n```\n\n### Case Study\n\nThe flight software and simulation used in our case study are NASA proprietary\nand cannot be shared. Moreover, the raw data is not approved for distribution\nbeyond what is included in the paper. However, our results in Section 6E are\nconsistent with our latency measurements from Section 6A. \n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fefeslab%2Figor-ae","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fefeslab%2Figor-ae","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fefeslab%2Figor-ae/lists"}