# Automatic-Conformance-Checking

We developed the library **pyinsights** to get automatic conformance-checking insights into business processes.
We aim at seamless integration with [Celonis](https://www.celonis.com/), one of the leading process-mining tools.

## Dependencies

- pm4py
- streamlit
- scikit-learn
- prince
- seaborn
- plotly

## Install

Just run

```sh
pip install --extra-index-url=https://pypi.celonis.cloud/ .
```

and pip will take care of the rest!
## Usage Examples

### Resource Profiling Example

Our library pyinsights can compute the resource profile of an event log and identify deviating cases with batches based on it. We define the resource profile as the number of times a resource executes an activity within a certain time unit. A batch occurs when these counts exceed a certain threshold. You can also group the batches into types.

```python
from pyinsights import Connector
from pyinsights.organisational_profiling import ResourceProfiler

celonis_url = <celonis_url>
api_token = <celonis api token>

# define connector and connect to celonis
connector = Connector(api_token=api_token, url=celonis_url, key_type="USER_KEY")

# choose data model
print("Available datamodels:")
print(connector.celonis.datamodels)
print("Input id of datamodel:")
id = input()
connector.set_parameters(model_id=id, end_timestamp="END_DATE",
                         resource_column="CE_UO")

# init resource profiler
res_profiler = ResourceProfiler(connector=connector)

# compute resource profile (not needed for next step)
res_profile = res_profiler.resource_profile(time_unit="HOURS",
                                            reference_unit="DAY")

# get cases with batches
batches_df = res_profiler.cases_with_batches(time_unit="HOURS", reference_unit="DAY",
                                             min_batch_size=2, batch_percentage=0.1,
                                             grouped_by_batches=True, batch_types=True)
batches_df
```

<p align="center">
  <img width="" src="docs/images/batch_detection_with_groups.png" />
</p>

You can also identify cases violating the four-eyes principle.

```python
from pyinsights.organisational_profiling import segregation_of_duties

activities = ['Pending Liquidation Request', 'Request completed with account closure']
segregation_of_duties(connector=connector, resource_column="CE_UO", activities=activities)
```

<p align="center">
  <img src="docs/images/4-eyes.png" />
</p>
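The resource-profile idea above can be sketched in plain Python, independent of Celonis and pyinsights. The toy event log, the `hour_bucket` field standing in for the time unit, and the `min_batch_size` value are all illustrative assumptions, not the library's API:

```python
from collections import Counter

# Toy event log: (case_id, activity, resource, hour_bucket)
# hour_bucket stands in for the "time unit" (e.g. the hour an event occurred in).
events = [
    ("c1", "Approve", "alice", 9),
    ("c2", "Approve", "alice", 9),
    ("c3", "Approve", "alice", 9),
    ("c4", "Approve", "bob", 9),
    ("c5", "Reject", "bob", 10),
]

# Resource profile: number of executions per (resource, activity, time unit)
profile = Counter((res, act, t) for _, act, res, t in events)

# A batch is a (resource, activity, time unit) whose count reaches the threshold
min_batch_size = 3
batches = {key for key, count in profile.items() if count >= min_batch_size}

# Cases taking part in a batch
cases_with_batches = sorted(
    case for case, act, res, t in events if (res, act, t) in batches
)
print(cases_with_batches)  # ['c1', 'c2', 'c3']
```

Here `alice` executing `Approve` three times in the same hour crosses the threshold, so cases `c1`-`c3` are flagged; the library additionally supports reference units, batch percentages, and batch-type grouping.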
### Temporal Profiling Example

Our library pyinsights can compute the temporal profile of an event log and identify deviating cases based on it.

```python
from pyinsights import Connector
from pyinsights.temporal_profiling import TemporalProfiler

celonis_url = <celonis_url>
api_token = <celonis api token>

# define connector and connect to celonis
connector = Connector(api_token=api_token, url=celonis_url, key_type="USER_KEY")

# choose data model
print("Available datamodels:")
print(connector.celonis.datamodels)
print("Input id of datamodel:")
id = input()
connector.set_parameters(model_id=id, end_timestamp="END_DATE")

# init temporal profiler
temporal_profiler = TemporalProfiler(connector=connector)

# compute temporal profile (not necessary for next steps)
temporal_profile = temporal_profiler.temporal_profile()

# compute deviating cases with deviation cost
df_temporal_profile = temporal_profiler.deviating_cases(sigma=6, extended_view=False)

df_temporal_profile
```

<p align="center">
  <img width="" src="docs/images/temporal_deviations_example.PNG" />
</p>

### Log Skeleton Example

Pyinsights can compute the log skeleton of a log.

```python
from pyinsights.log_skeleton import LogSkeleton

skeleton = LogSkeleton(connector)

# get lsk as pm4py-conforming dict
lsk_dict = skeleton.get_log_skeleton(noise_threshold=0)
```

To use the log skeleton for conformance checking, use the following code:

```python
from pyinsights.log_skeleton import LogSkeleton

skeleton = LogSkeleton(connector)

# get non-conforming cases
df_log_skeleton = skeleton.get_non_conforming_cases(noise_threshold=0)

df_log_skeleton
```

This returns a data frame with the non-conforming cases.

<p align="center">
  <img src="docs/images/log_skeleton_example.png" />
</p>

### Anomaly Detection Example

Pyinsights can identify anomalous cases based on isolation forests.

```python
from pyinsights.anomaly_detection import anomaly_detection

connector.set_parameters(model_id=id, end_timestamp="END_DATE")
anomaly_detection_df = anomaly_detection(connector=connector)

anomaly_detection_df
```

<p align="center">
  <img width="" src="docs/images/anomaly_ex.PNG" />
</p>

### Combiner

Lastly, each method can be used in combination with all the others.

```python
from pyinsights import Combiner

combiner = Combiner(connector=connector)

deviations = {"Log Skeleton": df_log_skeleton,
              "Temporal Profile": df_temporal_profile,
              "Anomaly Detection": anomaly_detection_df
              }

df = combiner.combine_deviations(deviations=deviations, how="union")

df
```
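The combination semantics can be sketched with plain sets of case ids standing in for the per-method data frames; the function name, the `how` values, and the set-based semantics below are illustrative assumptions, not pyinsights' implementation:

```python
from functools import reduce

# Per-method sets of deviating case ids (stand-ins for the data frames above)
deviations = {
    "Log Skeleton": {"c1", "c2", "c3"},
    "Temporal Profile": {"c2", "c4"},
    "Anomaly Detection": {"c2", "c3"},
}

def combine_deviations(deviations, how="union"):
    """Combine per-method sets of deviating case ids."""
    sets = list(deviations.values())
    if how == "union":           # deviating according to at least one method
        return reduce(set.union, sets)
    elif how == "intersection":  # deviating according to every method
        return reduce(set.intersection, sets)
    raise ValueError(f"unknown combination mode: {how!r}")

print(sorted(combine_deviations(deviations, how="union")))         # ['c1', 'c2', 'c3', 'c4']
print(sorted(combine_deviations(deviations, how="intersection")))  # ['c2']
```

Union casts a wide net (any method flagging a case suffices), while intersection keeps only cases every method agrees on.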
## Web Frontend

The easiest way to interact with our library is through the frontend we developed for it. To get started, run the following command in your terminal:

```bash
streamlit run user_interface.py --theme.base "dark"
```

This will open the web interface in your browser.
<img width="" src="docs/images/streamlit/login.png" />
Log in with your credentials.
<img width="" src="docs/images/streamlit/config.png" />
In the left tab, you can choose your event log and select the end timestamp and resource column.
You can pick which deviation methods to run and how you want to combine the results.
You can also configure the parameters for each method.
On the main tab, you can now click on `Get deviations`.
This will run each method that you selected and combine the results into a single data frame. The results should look as follows.
(Note that the deviation distribution will only show if you selected the Temporal Profiler.)
<img width="" src="docs/images/streamlit/deviations.png" />
<img width="" src="docs/images/streamlit/deviationtable.png" />
You can also export the data frame as `CSV` by clicking on `Download data as CSV`.

## Citations

Pyinsights implements some conformance-checking approaches first suggested in research.
Some of the papers we used include:

- [Temporal Conformance Checking at Runtime based on Time-infused Process Models](https://arxiv.org/abs/2008.07262)
- [Log Skeletons: A Classification Approach to Process Discovery](https://arxiv.org/abs/1806.08247)

## Adjustments to Log Skeleton

While implementing the log skeleton according to the definition in the paper, we concluded that the original approach is not computationally feasible, as the runtime for determining all possible subsets that need to be checked is exponential. The authors acknowledge this in their paper.

We therefore took a simpler approach. When constructing the log skeleton, we use a `noise_threshold`. Instead of requiring that a particular relation holds for every trace, we require it to hold for at least `noise_threshold * case_count` traces.

For computing the deviations, we first calculate the relations for each trace individually. We say that a trace conforms if there are fewer than `noise_threshold * case_count` differences between the trace's relations and the overall relations across all traces.
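The noise-threshold idea can be sketched on a single example relation, the directly-follows pairs of each trace. The relation chosen, the toy traces, and the exact counting scheme below are illustrative assumptions, not pyinsights' implementation:

```python
from collections import Counter

def directly_follows(trace):
    """Set of (a, b) pairs where b directly follows a in the trace."""
    return {(a, b) for a, b in zip(trace, trace[1:])}

traces = [
    ["register", "check", "approve"],
    ["register", "check", "approve"],
    ["register", "approve", "check"],  # swapped order
]

noise_threshold = 0.4
case_count = len(traces)

# Overall relation: keep only pairs observed in at least
# noise_threshold * case_count traces, instead of in every trace.
pair_counts = Counter(p for t in traces for p in directly_follows(t))
overall = {p for p, c in pair_counts.items() if c >= noise_threshold * case_count}

# A trace conforms if its relation differs from the overall relation
# in fewer than noise_threshold * case_count pairs (symmetric difference).
def conforms(trace):
    diff = directly_follows(trace) ^ overall
    return len(diff) < noise_threshold * case_count

print([conforms(t) for t in traces])  # [True, True, False]
```

With a threshold of `0.4 * 3 = 1.2`, the rare swapped-order pairs are dropped from the overall relation, and the third trace's four-pair difference marks it as non-conforming.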