{"id":15063971,"url":"https://github.com/coelhosilva/flight-ad","last_synced_at":"2025-04-10T11:26:37.341Z","repository":{"id":49019815,"uuid":"377359420","full_name":"coelhosilva/flight-ad","owner":"coelhosilva","description":"flight-ad is a Python package for anomaly detection in the aviation domain built on top of scikit-learn.","archived":false,"fork":false,"pushed_at":"2022-05-15T21:00:45.000Z","size":86,"stargazers_count":6,"open_issues_count":0,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-03-21T05:33:41.600Z","etag":null,"topics":["anomaly-detection","data-science","fdm","flight-data","flight-data-analysis","flight-data-monitoring","machine-learning","python","scikit-learn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/coelhosilva.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-06-16T03:26:57.000Z","updated_at":"2024-05-10T15:31:14.000Z","dependencies_parsed_at":"2022-08-27T22:54:17.290Z","dependency_job_id":null,"html_url":"https://github.com/coelhosilva/flight-ad","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coelhosilva%2Fflight-ad","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coelhosilva%2Fflight-ad/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coelhosilva%2Fflight-ad/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coelhosilva%2Fflight-ad/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/coelhosilva","download_url":"https://codeload.github.com/coelhosilva/flight-ad/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248208623,"owners_count":21065203,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anomaly-detection","data-science","fdm","flight-data","flight-data-analysis","flight-data-monitoring","machine-learning","python","scikit-learn"],"created_at":"2024-09-25T00:09:32.237Z","updated_at":"2025-04-10T11:26:37.316Z","avatar_url":"https://github.com/coelhosilva.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# flight-ad\n\n[![Codacy Badge](https://app.codacy.com/project/badge/Grade/d2f06dedcb044256828e1c907d9c511a)](https://www.codacy.com/gh/coelhosilva/flight-ad/dashboard?utm_source=github.com\u0026amp;utm_medium=referral\u0026amp;utm_content=coelhosilva/flight-ad\u0026amp;utm_campaign=Badge_Grade)\n[![PyPI version](https://badge.fury.io/py/flight-ad.svg)](https://badge.fury.io/py/flight-ad)\n\n`flight-ad` is a Python package for anomaly detection in the aviation domain built on top of scikit-learn.\n\nIt provides:\n\n-   An implementation of an anomaly detection pipeline;\n-   A DataBinder object for loading and transforming the data within the pipeline on the fly;\n-   A DataWrangler object for building a data wrangling pipeline;\n-   A StatisticalLearner object for binding scikit-learn's pipelines and integrating them on the anomaly detection workflow;\n-   Visualization tools for assessing potential anomalies;\n-   
Reporting tools for analyzing results;\n-   Sample airplane sensor data, repackaged from NASA's DASHlink for the purpose of evaluating and advancing data mining capabilities that can be used to promote aviation safety;\n-   Adaptations of machine learning algorithms, such as a DBSCAN implementation that calculates the hyperparameter epsilon from the input data.\n\n## Installation\n\nThe easiest way to install `flight-ad` is using pip from your virtual environment.\n\nFrom PyPI:\n\n`pip install flight-ad`\n\nOr directly from GitHub:\n\n`pip install git+https://github.com/coelhosilva/flight-ad.git`\n\n## Examples\n\nThis is a sample usage of the package for constructing an anomaly detection pipeline. Beware that the sample dataset \nmay take up roughly 1 GB in disk space.\n\n```python\nfrom flight_ad.datasets import load_dashlink_bindings\nfrom flight_ad.utils.data import DataBinder\nfrom flight_ad.wrangling import DataWrangler\nfrom wrangling_functions import preprocess, change_col, resample, select\nfrom flight_ad.transformations import reshape_df_interspersed\nfrom sklearn.preprocessing import StandardScaler\nfrom sklearn.decomposition import PCA\nfrom flight_ad.cluster import DBSCAN\nfrom flight_ad.learn import FunctionTransformer\nfrom flight_ad.learn import StatisticalLearner\nfrom flight_ad.pipeline import AnomalyDetectionPipeline\nfrom flight_ad.report import clustering_info, silhouette\n\n# Binder\ndata_bindings = load_dashlink_bindings(download=True)\nbinder = DataBinder(data_bindings)\n\n# Wrangler\nwrangling_steps = [\n    ('preprocess_flight', preprocess),\n    ('resample_dataframe', resample),\n    ('change_col', change_col),\n    ('select_col', select)\n\n]\nwrangler = DataWrangler(wrangling_steps, memorize='change_col')\n\n# Learner\nlearning_steps = {\n    'preprocessing': [\n        ('reshaper', FunctionTransformer(reshape_df_interspersed)),\n        ('scaler', StandardScaler()),\n        ('pca', PCA())\n    ],\n    'training': [\n        ('dbscan', 
DBSCAN())\n    ]\n}\nlearner = StatisticalLearner(learning_steps, record='pca')\n\n# Pipeline\nad_pipeline = AnomalyDetectionPipeline(binder, wrangler, learner)\nad_pipeline.fit()\n\n# Results\nlabels, n_clusters, n_noise = clustering_info(learner.pipeline['dbscan'])\navg_silhouette, _ = silhouette(learner.partial_data['pca'], labels)\n```\n\n## Package structure\n\nTBD.\n\n## Dependencies\n\n`flight-ad` requires:\n\n-   Python (\u003e=3.6)\n-   NumPy\n-   pandas\n-   scikit-learn\n-   matplotlib\n-   tqdm\n\n## Contributions\n\nWe welcome and encourage new contributors to help test `flight-ad` and add new functionality. Any input, feedback,\nbug report, or contribution is welcome.\n\nTo contact the author, email coelho@ita.br.\n\n## Citation\n\nIf you use `flight-ad` in a scientific publication, we would appreciate citations.\n\nBibTeX: TBD.\n\nCitation string: TBD.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoelhosilva%2Fflight-ad","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcoelhosilva%2Fflight-ad","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoelhosilva%2Fflight-ad/lists"}