{"id":21692995,"url":"https://github.com/johnsmithm/ner-accidente","last_synced_at":"2026-05-11T16:32:57.644Z","repository":{"id":55961246,"uuid":"313216553","full_name":"johnsmithm/ner-accidente","owner":"johnsmithm","description":null,"archived":false,"fork":false,"pushed_at":"2020-12-29T08:02:46.000Z","size":28790,"stargazers_count":0,"open_issues_count":5,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-20T14:04:30.575Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/johnsmithm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-11-16T07:01:35.000Z","updated_at":"2020-11-16T07:02:35.000Z","dependencies_parsed_at":"2022-08-15T10:20:38.448Z","dependency_job_id":null,"html_url":"https://github.com/johnsmithm/ner-accidente","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/johnsmithm/ner-accidente","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnsmithm%2Fner-accidente","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnsmithm%2Fner-accidente/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnsmithm%2Fner-accidente/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnsmithm%2Fner-accidente/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/johnsmithm","download_url":"https://codeload.github.com/johnsmithm/ner-accidente/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/johnsmithm%2Fner-accidente/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32903353,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-10T13:40:02.631Z","status":"online","status_checked_at":"2026-05-11T02:00:05.975Z","response_time":120,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-25T18:18:05.083Z","updated_at":"2026-05-11T16:32:57.627Z","avatar_url":"https://github.com/johnsmithm.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n# ner-accidente\n\nIdentifying the location of an accident from the text of news articles using NLP\n\n## Setup\n1. Install git and check out the [git code repository]\n2. Install [anaconda] Python version 3.6+\n3. Change working directory into the git code repository root\n4. Create the self-contained conda environment. In a terminal, go to the git code repository root and enter the command:\n\n   `conda env create --file conda_env.yml`\n\n5. Any Python modules under src need to be available to other scripts. This can be done in a couple of ways. You can \nset up and install the Python modules by executing the setup.py command below, which will install the packages to the \nconda environment's site-packages folder but with a symlink to the src folder so modifications are reflected immediately. 
\n\n   `python setup.py develop`\n   \n    As an alternative, you may prefer to set the Python path directly from the console, within notebooks, test scripts \n    etc. From PyCharm you can also right-click the src folder and select the _Mark Directory As | Source Root_ option.\n\n6. .. Place your own project-specific setup steps here, e.g. copying data files ...\n\nWhen distributing your module, you can create a Python egg with the command `python setup.py bdist_egg` and upload the egg.\n\nNOTE: When working in the project notebooks from within the Equinor network, you may need to include the lines below if your proxy is not otherwise set up.\n\n`os.environ['HTTP_PROXY']=\"http://www-proxy.statoil.no:80\"`\u003cbr /\u003e\n`os.environ['HTTPS_PROXY']=\"http://www-proxy.statoil.no:80\"`\n\n## Using the Python Conda environment\n\nOnce the Python Conda environment has been set up, you can\n\n* Activate the environment using the following command in a terminal window:\n\n  * Windows: `activate ner-accidente`\n  * Linux, OS X: `source activate ner-accidente`\n  * The __environment is activated per terminal session__, so you must activate it every time you open a terminal.\n\n* Deactivate the environment using the following command in a terminal window:\n\n  * Windows: `deactivate ner-accidente`\n  * Linux, OS X: `source deactivate ner-accidente`\n               \n* Delete the environment using the command (can't be undone):\n\n  * `conda remove --name ner-accidente --all`\n\n## Initial File Structure\n\n```\n├── .gitignore               \u003c- Files that should be ignored by git. 
Add separate .gitignore files in subfolders if \n│                               needed\n├── conda_env.yml            \u003c- Conda environment definition for ensuring consistent setup across environments\n├── LICENSE\n├── README.md                \u003c- The top-level README for developers using this project.\n├── requirements.txt         \u003c- The requirements file for reproducing the analysis environment, e.g.\n│                               generated with `pip freeze \u003e requirements.txt`. Might not be needed if using conda.\n├── setup.py                 \u003c- Metadata about your project for easy distribution.\n│\n├── data\n│   ├── interim_[desc]       \u003c- Interim files - give these folders whatever name makes sense.\n│   ├── processed            \u003c- The final, canonical data sets for modeling.\n│   ├── raw                  \u003c- The original, immutable data dump.\n│   ├── temp                 \u003c- Temporary files.\n│   └── training             \u003c- Files relating to the training process\n│\n├── docs                     \u003c- Documentation\n│   ├── data_science_code_of_conduct.md  \u003c- Code of conduct.\n│   ├── process_documentation.md         \u003c- Standard template for documenting process and decisions.\n│   └── writeup              \u003c- Sphinx project for project writeup including auto generated API.\n│      ├── conf.py           \u003c- Sphinx configuration file.\n│      ├── index.rst         \u003c- Start page.\n│      ├── make.bat          \u003c- For generating documentation (Windows)\n│      └── Makefile          \u003c- For generating documentation (make)\n│\n├── examples                 \u003c- Add folders as needed e.g. 
examples, eda, use case\n│\n├── extras                   \u003c- Miscellaneous extras.\n│   └── add_explorer_context_shortcuts.reg    \u003c- Adds additional Windows Explorer context menus for starting Jupyter.\n│\n├── notebooks                \u003c- Notebooks for analysis and testing\n│   ├── eda                  \u003c- Notebooks for EDA\n│   │   └── example.ipynb    \u003c- Example Python notebook\n│   ├── features             \u003c- Notebooks for generating and analysing features (1 per feature)\n│   ├── modelling            \u003c- Notebooks for modelling\n│   └── preprocessing        \u003c- Notebooks for preprocessing\n│\n├── scripts                  \u003c- Standalone scripts\n│   ├── deploy               \u003c- MLOps scripts for deployment (WIP)\n│   │   └── score.py         \u003c- Scoring script\n│   ├── train                \u003c- MLOps scripts for training\n│   │   ├── submit-train.py  \u003c- Script for submitting a training run to Azure ML Service\n│   │   ├── submit-train-local.py \u003c- Script for local training using Azure ML\n│   │   └── train.py         \u003c- Example training script using the iris dataset\n│   ├── example.py           \u003c- Example script\n│   └── MLOps.ipynb          \u003c- End to end MLOps example (To be refactored into the above)\n│\n├── src                      \u003c- Code for use in this project.\n│   └── neraccidente         \u003c- Example Python package - place shared code in such a package\n│       ├── __init__.py      \u003c- Python package initialisation\n│       ├── examplemodule.py \u003c- Example module with functions and naming / commenting best practices\n│       ├── features.py      \u003c- Feature engineering functionality\n│       ├── io.py            \u003c- IO functionality\n│       └── pipeline.py      \u003c- Pipeline functionality\n│\n└── tests                    \u003c- Test cases (named after module)\n    ├── test_notebook.py     \u003c- Example testing that Jupyter notebooks run without 
errors\n    └── neraccidente         \u003c- neraccidente tests\n        ├── examplemodule    \u003c- examplemodule tests (1 file per method tested)\n        ├── features         \u003c- features tests\n        ├── io               \u003c- io tests\n        └── pipeline         \u003c- pipeline tests\n```\n\n## MLOps\nStarter scripts for MLOps with Azure ML Service are included as a part of this template in the scripts folder and may be\ncustomised for your own purposes. Please browse the contents of the scripts folder for more details.\n\nFor model training, the provided setup allows for running locally without any dependency on Azure ML by running train.py\nin the scripts/train folder directly. Alternatively, you can submit local or remote runs using the submit scripts in the \nsame folder.\n\n## Testing\nReproducibility and the correct functioning of code are essential to avoid wasted time. If a code block is copied more \nthan once, it should be placed into a common script / module under src and unit tests added. The same applies to \nany other non-trivial code, to ensure that it functions correctly.\n\nTo run tests, install pytest using pip or conda (it should already be set up if you used the conda_env.yml file) and \nthen from the repository root run\n \n```\npytest\n```\n\n## Automated Document Generation\nA [Sphinx](https://www.sphinx-doc.org/) project is provided under docs/writeup that will generate a writeup that\nalso includes automatically generated API information for any packages. The output can be created in multiple\nformats including HTML and PDF. If you are using CI then this can be run automatically. 
To run \nlocally, execute the following commands:\n \n```\ncd docs/writeup\nmake html\n```\n\nOn Windows this will run the make.bat; a Makefile is also included for those using the 'make' command.\n\n## Development Process\nContributions to this template are greatly appreciated and encouraged.\n\nTo contribute an update:\n* Create a new branch / fork for your updates.\n* Check that your code follows the PEP 8 guidelines (line lengths up to 120 are OK) and other general conventions within this document.\n* Ensure that, as far as possible, there are unit tests covering the functionality of any new code.\n* Check that all existing unit tests still pass.\n* Edit this document if needed to describe new files or other important information.\n* Create a pull request.\n\n## Important Links\n* https://wiki.equinor.com/wiki/index.php/Statoil_Data_Science_Technical_Standards - Data Science Technical Standards (Equinor internal)\n* https://dataplatformwiki.azurewebsites.net/doku.php - Data Platform wiki (Equinor internal)\n* https://github.com/equinor/data-science-shared - Shared Data Science Code Repository (Equinor internal)\n\n## References\n* https://github.com/equinor/data-science-template/ - The master template for this project\n* http://docs.python-guide.org/en/latest/writing/structure/\n* https://github.com/Azure/Microsoft-TDSP\n* https://drivendata.github.io/cookiecutter-data-science/\n\n[//]: #\n   [anaconda]: \u003chttps://www.continuum.io/downloads\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohnsmithm%2Fner-accidente","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjohnsmithm%2Fner-accidente","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohnsmithm%2Fner-accidente/lists"}