{"id":22124030,"url":"https://github.com/matteocontrini/textify","last_synced_at":"2025-07-25T15:31:45.955Z","repository":{"id":53739977,"uuid":"224043936","full_name":"matteocontrini/Textify","owner":"matteocontrini","description":"HTML to plaintext conversion library (C#/.NET Standard 2.0)","archived":false,"fork":false,"pushed_at":"2022-08-28T16:52:25.000Z","size":52,"stargazers_count":17,"open_issues_count":1,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-12-01T12:52:43.478Z","etag":null,"topics":["csharp","dotnet","dotnet-standard","html","plaintext","textify"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/matteocontrini.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-11-25T21:21:41.000Z","updated_at":"2024-09-17T14:03:45.000Z","dependencies_parsed_at":"2022-09-16T09:41:17.038Z","dependency_job_id":null,"html_url":"https://github.com/matteocontrini/Textify","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matteocontrini%2FTextify","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matteocontrini%2FTextify/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matteocontrini%2FTextify/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matteocontrini%2FTextify/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/matteocontrini","download_url":"https://codeload.github.com/matteocontrini/Textify
/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227590146,"owners_count":17790446,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csharp","dotnet","dotnet-standard","html","plaintext","textify"],"created_at":"2024-12-01T15:46:14.797Z","updated_at":"2024-12-01T15:46:16.457Z","avatar_url":"https://github.com/matteocontrini.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Textify\n\n[![build](https://github.com/matteocontrini/Textify/workflows/Build%20and%20tests/badge.svg)](https://github.com/matteocontrini/Textify/actions) [![codecov](https://codecov.io/gh/matteocontrini/Textify/branch/master/graph/badge.svg)](https://codecov.io/gh/matteocontrini/Textify) [![NuGet](https://img.shields.io/nuget/v/Textify?color=success)](https://www.nuget.org/packages/Textify) [![License](https://img.shields.io/github/license/matteocontrini/Textify?color=success)](https://github.com/matteocontrini/Textify/blob/master/LICENSE)\n\nAn HTML to plaintext conversion library for **.NET Standard 2.0** written in C#.\n\n## Features\n\n- Supports HTML **headings, paragraphs, containers, lists and tables** (basic support)\n- Takes an **HTML string** as input or an `INode` from [AngleSharp](https://github.com/AngleSharp/AngleSharp)\n- Outputs a readable text representation of the web page\n- Targets **.NET Standard 2.0**\n- **Full test coverage**\n\n## Installation\n\nInstall [from NuGet](https://www.nuget.org/packages/Textify/):\n\n```powershell\nInstall-Package 
Textify\n```\n\nor\n\n```powershell\ndotnet add package Textify\n```\n\n## Usage\n\n```csharp\nHtmlToTextConverter converter = new HtmlToTextConverter();\nstring output = converter.Convert(html);\n```\n\nBy default, the whole page will be converted.\n\nIf you only want to convert part of it, parse the page yourself with AngleSharp and pass the `INode` you're interested in. You don't need to install AngleSharp because Textify already depends on it.\n\n```csharp\nHtmlParser parser = new HtmlParser();\nIHtmlDocument doc = parser.ParseDocument(html);\nIElement element = doc.QuerySelector(\"#main\");\n\nHtmlToTextConverter converter = new HtmlToTextConverter();\nstring output = converter.Convert(element);\n```\n\n## Example\n\n**Input:**\n\n```html\n\u003cdiv id=\"page\"\u003e\n    \u003cheader\u003e\n        \u003ca href=\"http://example.com\" class=\"site-logo\"\u003e\n        \t\u003cimg src=\"logo.png\" alt=\"Logo\" /\u003e\n        \u003c/a\u003e\n        \u003ch1\u003e\n            Site title\n        \u003c/h1\u003e\n    \u003c/header\u003e\n    \u003cmain\u003e\n    \t\u003carticle\u003e\n        \t\u003ch2\u003eArticle title\u003c/h2\u003e\n            \n            \u003cp\u003e\n                \u003cstrong\u003eLorem ipsum\u003c/strong\u003e dolor sit amet, consectetur adipiscing elit,\n                sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\n            \u003c/p\u003e\n\n            \u003cp\u003eUt enim ad minim veniam, quis nostrud exercitation ullamco\n            laboris nisi ut aliquip ex ea commodo consequat.\u003c/p\u003e\n            \n            Here is a list of things anyway:\n\n            \u003cul\u003e\n                \u003cli\u003eOne\u003c/li\u003e\n                \u003cli\u003eTwo\u003c/li\u003e\n                \u003cli\u003eThree\u003c/li\u003e\n            \u003c/ul\u003e\n\n            But maybe a table is nicer:\u003cbr\u003e\u003cbr\u003e\n            \n            \u003ctable\u003e\n      
          \u003cthead\u003e\n                \t\u003cth\u003eKey\u003c/th\u003e\n                    \u003cth\u003eValue\u003c/th\u003e\n                \u003c/thead\u003e\n                \u003ctr\u003e\n                \t\u003ctd\u003eOne\u003c/td\u003e\n                    \u003ctd\u003eValue\u003c/td\u003e\n                \u003c/tr\u003e\n            \u003c/table\u003e\n        \u003c/article\u003e\n    \u003c/main\u003e\n\u003c/div\u003e\n```\n\n**Output:**\n\n```\n[IMG: Logo] [1]\n\n++++++++++\nSite title\n++++++++++\n\n-------------\nArticle title\n-------------\n\nLorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\n\nUt enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.\n\nHere is a list of things anyway:\n\n* One\n* Two\n* Three\n\nBut maybe a table is nicer:\n\n| Key | Value |\n\n| One | Value |\n\n[1] http://example.com\n```\n\n## License\n\nMIT license.\n\nThanks to Jay Taylor for the inspiration with his [html2text](https://github.com/jaytaylor/html2text) Go module.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatteocontrini%2Ftextify","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmatteocontrini%2Ftextify","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatteocontrini%2Ftextify/lists"}