{"id":23090537,"url":"https://github.com/goodtocode/analytics","last_synced_at":"2025-04-03T17:44:42.060Z","repository":{"id":231785816,"uuid":"403363824","full_name":"goodtocode/analytics","owner":"goodtocode","description":"GoodToCode Analytics supports file infrastructure (Excel), AI services (Azure Cognitive Services, Text Analytics) and persistence (Azure Storage Tables, CosmosDb) for Data Lake analytics workflows.","archived":false,"fork":false,"pushed_at":"2023-01-10T21:17:24.000Z","size":3033,"stargazers_count":1,"open_issues_count":6,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-01T14:48:27.049Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/goodtocode.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2021-09-05T16:53:52.000Z","updated_at":"2024-04-05T21:37:30.000Z","dependencies_parsed_at":"2024-04-05T23:40:29.802Z","dependency_job_id":null,"html_url":"https://github.com/goodtocode/analytics","commit_stats":null,"previous_names":["goodtocode/analytics"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/goodtocode%2Fanalytics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/goodtocode%2Fanalytics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/goodtocode%2Fanalytics/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/goodtocode%2Fanalytics/manifests","owner_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/goodtocode","download_url":"https://codeload.github.com/goodtocode/analytics/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247051995,"owners_count":20875677,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-16T21:00:20.709Z","updated_at":"2025-04-03T17:44:42.040Z","avatar_url":"https://github.com/goodtocode.png","language":"C#","readme":"# GoodToCode Analytics Library for Azure Cognitive Services\r\n[![Build Status](https://dev.azure.com/GoodToCode/GoodToCode.com/_apis/build/status/gtc-rg-analytics?branchName=main)](https://dev.azure.com/GoodToCode/GoodToCode.com/_build/latest?definitionId=65\u0026branchName=main)\r\n\r\n\u003csup\u003eGoodToCode Analytics supports file infrastructure (Excel), AI services (Azure Cognitive Services, Text Analytics) and persistence (Azure Storage Tables, CosmosDb) for Data Lake analytics workflows.\u003c/sup\u003e \u003cbr\u003e\r\n\r\nThis is a simple, low-dependency library for managing Azure Cognitive Services and Text Analytics, and persisting the results to Azure Storage Tables and CosmosDb. These services rely on Azure Machine Learning and Artificial Intelligence in the [Azure Cognitive Services](https://azure.microsoft.com/en-us/services/cognitive-services/) suite. 
The APIs supported are text analytics and cognitive services, expanding to others such as computer vision, facial recognition, video indexing, etc.\r\n\r\n#### /src Contents\r\nPath | Item | Contents\r\n--- | --- | ---\r\nsrc | - | Contains the C# solution, project files and source code.\r\nsrc | Analytics.Activities | Workflow activities that serve as the steps of a Durable Functions orchestration.\r\nsrc | Analytics.Domain | Domain entities for this solution's services.\r\nsrc | Analytics.Tests | Tests against fake and real cognitive services and text analytics.\r\n\r\n#### /infrastructure ARM Templates\r\nPath | Item | Contents\r\n--- | --- | ---\r\ninfrastructure | - | Contains Azure DevOps YML files, Windows PowerShell scripts, and variables to support Azure DevOps YML Pipelines.\r\ninfrastructure | *.json | ARM template for that Azure resource.\r\ninfrastructure | *.parameters.json | Parameter definition for the ARM template for that Azure resource.\r\n\r\n#### /pipeline YML Files\r\nPath | Item | Contents\r\n--- | --- | ---\r\npipelines | - | Contains Azure DevOps YML files, Windows PowerShell scripts, and variables to support Azure DevOps YML Pipelines.\r\npipelines | gtc-rg-analytics-src.yml | Azure DevOps Pipeline main file.\r\npipelines | scripts | Command Line Interface files (.cmd) for Windows/bash commands. Windows PowerShell scripts (Set-Version.ps1).\r\npipelines | steps | Azure DevOps Pipeline step templates.\r\npipelines | variables | Variables (non-secret only) for the Azure landing zone, Azure infrastructure and NuGet packages.\r\n\r\n#### Azure Cognitive Services\r\nCognitive Service | Purpose\r\n:---------------------:| --- \r\n[Computer Vision](https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/)|Inspects each image associated with an incoming article to (1) extract written words from the image and (2) determine what types of objects are present in the image. 
\r\n[Face API](https://azure.microsoft.com/en-us/services/cognitive-services/face/)|Inspects each image associated with an incoming article to find faces, determine whether each face represents a male or female, and associate an estimated age with each face.\r\n[Text Analytics](https://azure.microsoft.com/en-us/services/cognitive-services/text-analytics/) | Used to find \u003ci\u003ekey word phrases\u003c/i\u003e and \u003ci\u003eentities\u003c/i\u003e in title and body text after it has been translated.\r\n[Translation API](https://azure.microsoft.com/en-us/services/cognitive-services/translator-text-api/) | Determines the language of the incoming title and body, when present, then translates them to English. However, the target language is just another input and can be changed from English to any [supported language](https://docs.microsoft.com/en-us/azure/cognitive-services/translator/reference/v3-0-languages) of your choice.\r\n\r\n#### Azure Services used in GoodToCode repositories\r\nAzure Service | Purpose\r\n:---------------------:| --- \r\n[Azure Cosmos DB](https://azure.microsoft.com/en-us/services/cosmos-db/)| NoSQL database where original content as well as processing results are stored.\r\n[Azure Functions](https://azure.microsoft.com/en-us/try/app-service/)|Code blocks that analyze the documents stored in Azure Cosmos DB.\r\n[Azure Service Bus](https://azure.microsoft.com/en-us/services/service-bus/)|Service Bus queues are used as triggers for durable Azure Functions.\r\n[Azure Storage](https://azure.microsoft.com/en-us/services/storage/)|Holds images from articles and hosts the code for the Azure Functions.\r\n\r\n\u003e \u003cb\u003e Note \u003c/b\u003e This design uses service collection extensions, dependency inversion, queue notification, and serverless patterns for simplicity. 
While these are useful patterns, they are not the only patterns that can be used to accomplish this data flow.\r\n\u003e\r\n\u003e [Azure Service Bus Topics](https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-dotnet-how-to-use-topics-subscriptions) could be used instead, which would allow different parts of the article to be processed in parallel, as opposed to the serial processing done in this example. Topics would be useful if article inspection processing time is critical.  A comparison between Azure Service Bus Queues and Azure Service Bus Topics can be found [here](https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-dotnet-how-to-use-topics-subscriptions).\r\n\u003e\r\n\u003eThe Azure Functions could also be implemented in an [Azure Logic App](https://azure.microsoft.com/en-us/services/logic-apps/).  However, with parallel processing the user would have to implement record-level locking such as [Redlock](https://redis.io/topics/distlock) until Cosmos DB supports [partial document updates](https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/6693091-be-able-to-do-partial-updates-on-document). \r\n\u003e\r\n\u003eA comparison between Durable Functions and Logic Apps can be found [here](https://docs.microsoft.com/en-us/azure/azure-functions/functions-compare-logic-apps-ms-flow-webjobs).\r\n\r\n# Contributing\r\n\r\nThis project welcomes contributions and suggestions.  Most contributions require you to agree to a\r\nContributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us\r\nthe rights to use your contribution. 
For details, visit https://cla.microsoft.com.\r\n\r\nThis project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).\r\nFor more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or\r\ncontact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.\r\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoodtocode%2Fanalytics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoodtocode%2Fanalytics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoodtocode%2Fanalytics/lists"}