{"id":13527850,"url":"https://github.com/isaacs/sax-js","last_synced_at":"2025-05-13T20:19:05.369Z","repository":{"id":804808,"uuid":"508894","full_name":"isaacs/sax-js","owner":"isaacs","description":"A sax style parser for JS","archived":false,"fork":false,"pushed_at":"2024-05-27T23:46:20.000Z","size":484,"stargazers_count":1111,"open_issues_count":100,"forks_count":326,"subscribers_count":25,"default_branch":"main","last_synced_at":"2025-05-13T00:06:59.236Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/isaacs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["isaacs"]}},"created_at":"2010-02-09T01:49:11.000Z","updated_at":"2025-05-06T12:54:03.000Z","dependencies_parsed_at":"2023-07-05T19:16:23.206Z","dependency_job_id":"68e6d24f-8c6a-491e-9729-ee35b59a6528","html_url":"https://github.com/isaacs/sax-js","commit_stats":{"total_commits":245,"total_committers":40,"mean_commits":6.125,"dds":"0.27755102040816326","last_synced_commit":"25ab118a3184d2070495e7eecf2e762f9044442c"},"previous_names":[],"tags_count":47,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isaacs%2Fsax-js","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isaacs%2Fsax-js/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isaacs%2Fsax-js/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isaacs%2F
sax-js/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/isaacs","download_url":"https://codeload.github.com/isaacs/sax-js/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254020659,"owners_count":22000757,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T06:02:03.619Z","updated_at":"2025-05-13T20:19:05.353Z","avatar_url":"https://github.com/isaacs.png","language":"JavaScript","funding_links":["https://github.com/sponsors/isaacs"],"categories":["JavaScript","file format (文件格式)"],"sub_categories":[],"readme":"# sax js\n\nA sax-style parser for XML and HTML.\n\nDesigned with [node](http://nodejs.org/) in mind, but should work fine in\nthe browser or other CommonJS implementations.\n\n## What This Is\n\n* A very simple tool to parse through an XML string.\n* A stepping stone to a streaming HTML parser.\n* A handy way to deal with RSS and other mostly-ok-but-kinda-broken XML\n  docs.\n\n## What This Is (probably) Not\n\n* An HTML Parser - That's a fine goal, but this isn't it.  
It's just\n  XML.\n* A DOM Builder - You can use it to build an object model out of XML,\n  but it doesn't do that out of the box.\n* XSLT - No DOM = no querying.\n* 100% Compliant with (some other SAX implementation) - Most SAX\n  implementations are in Java and do a lot more than this does.\n* An XML Validator - It does a little validation when in strict mode, but\n  not much.\n* A Schema-Aware XSD Thing - Schemas are an exercise in fetishistic\n  masochism.\n* A DTD-aware Thing - Fetching DTDs is a much bigger job.\n\n## Regarding `\u003c!DOCTYPE`s and `\u003c!ENTITY`s\n\nThe parser will handle the basic XML entities in text nodes and attribute\nvalues: `\u0026amp; \u0026lt; \u0026gt; \u0026apos; \u0026quot;`. It's possible to define additional\nentities in XML by putting them in the DTD. This parser doesn't do anything\nwith that. If you want to listen to the `ondoctype` event, and then fetch\nthe doctypes, and read the entities and add them to `parser.ENTITIES`, then\nbe my guest.\n\nUnknown entities will fail in strict mode, and in loose mode, will pass\nthrough unmolested.\n\n## Usage\n\n```javascript\nvar sax = require(\"./lib/sax\"),\n  strict = true, // set to false for html-mode\n  parser = sax.parser(strict);\n\nparser.onerror = function (e) {\n  // an error happened.\n};\nparser.ontext = function (t) {\n  // got some text.  t is the string of text.\n};\nparser.onopentag = function (node) {\n  // opened a tag.  node has \"name\" and \"attributes\"\n};\nparser.onattribute = function (attr) {\n  // an attribute.  
attr has \"name\" and \"value\"\n};\nparser.onend = function () {\n  // parser stream is done, and ready to have more stuff written to it.\n};\n\nparser.write('\u003cxml\u003eHello, \u003cwho name=\"world\"\u003eworld\u003c/who\u003e!\u003c/xml\u003e').close();\n\n// stream usage\n// takes the same options as the parser\nvar fs = require(\"fs\")\nvar options = {} // settings bag, see Arguments below\nvar saxStream = require(\"sax\").createStream(strict, options)\nsaxStream.on(\"error\", function (e) {\n  // unhandled errors will throw, since this is a proper node\n  // event emitter.\n  console.error(\"error!\", e)\n  // clear the error\n  this._parser.error = null\n  this._parser.resume()\n})\nsaxStream.on(\"opentag\", function (node) {\n  // same object as above\n})\n// pipe is supported, and it's readable/writable\n// same chunks coming in also go out.\nfs.createReadStream(\"file.xml\")\n  .pipe(saxStream)\n  .pipe(fs.createWriteStream(\"file-copy.xml\"))\n```\n\n\n## Arguments\n\nPass the following arguments to the parser function.  All are optional.\n\n`strict` - Boolean. Whether or not to be a jerk. Default: `false`.\n\n`opt` - Object bag of settings regarding string formatting.  All default to `false`.\n\nSettings supported:\n\n* `trim` - Boolean. Whether or not to trim text and comment nodes.\n* `normalize` - Boolean. If true, then turn any whitespace into a single\n  space.\n* `lowercase` - Boolean. If true, then lowercase tag names and attribute names\n  in loose mode, rather than uppercasing them.\n* `xmlns` - Boolean. If true, then namespaces are supported.\n* `position` - Boolean. If false, then don't track line/col/position.\n* `strictEntities` - Boolean. If true, only parse [predefined XML\n  entities](http://www.w3.org/TR/REC-xml/#sec-predefined-ent)\n  (`\u0026amp;`, `\u0026apos;`, `\u0026gt;`, `\u0026lt;`, and `\u0026quot;`)\n* `unquotedAttributeValues` - Boolean. If true, then unquoted\n  attribute values are allowed. 
Defaults to `false` when `strict`\n  is true, `true` otherwise.\n\n## Methods\n\n`write` - Write bytes onto the stream. You don't have to do this all at\nonce. You can keep writing as much as you want.\n\n`close` - Close the stream. Once closed, no more data may be written until\nit is done processing the buffer, which is signaled by the `end` event.\n\n`resume` - To gracefully handle errors, assign a listener to the `error`\nevent. Then, when the error is taken care of, you can call `resume` to\ncontinue parsing. Otherwise, the parser will not continue while in an error\nstate.\n\n## Members\n\nAt all times, the parser object will have the following members:\n\n`line`, `column`, `position` - Indications of the position in the XML\ndocument where the parser is currently looking.\n\n`startTagPosition` - Indicates the position where the current tag starts.\n\n`closed` - Boolean indicating whether or not the parser can be written to.\nIf it's `true`, then wait for the `ready` event to write again.\n\n`strict` - Boolean indicating whether or not the parser is a jerk.\n\n`opt` - Any options passed into the constructor.\n\n`tag` - The current tag being dealt with.\n\nAnd a bunch of other stuff that you probably shouldn't touch.\n\n## Events\n\nAll events emit with a single argument. To listen to an event, assign a\nfunction to `on\u003ceventname\u003e`. Functions get executed in the this-context of\nthe parser object. The list of supported events is also in the exported\n`EVENTS` array.\n\nWhen using the stream interface, assign handlers using the EventEmitter\n`on` function in the normal fashion.\n\n`error` - Indication that something bad happened. The error will be hanging\nout on `parser.error`, and must be deleted before parsing can continue. By\nlistening to this event, you can keep an eye on that kind of stuff. Note:\nthis happens *much* more in strict mode. Argument: instance of `Error`.\n\n`text` - Text node. 
Argument: string of text.\n\n`doctype` - The `\u003c!DOCTYPE` declaration. Argument: doctype string.\n\n`processinginstruction` - Stuff like `\u003c?xml foo=\"blerg\" ?\u003e`. Argument:\nobject with `name` and `body` members. Attributes are not parsed, as\nprocessing instructions have implementation-dependent semantics.\n\n`sgmldeclaration` - Random SGML declarations. Stuff like `\u003c!ENTITY p\u003e`\nwould trigger this kind of event. This is a weird thing to support, so it\nmight go away at some point. SAX isn't intended to be used to parse SGML,\nafter all.\n\n`opentagstart` - Emitted immediately when the tag name is available,\nbut before any attributes are encountered.  Argument: object with a\n`name` field and an empty `attributes` set.  Note that this is the\nsame object that will later be emitted in the `opentag` event.\n\n`opentag` - An opening tag. Argument: object with `name` and `attributes`.\nIn non-strict mode, tag names are uppercased, unless the `lowercase`\noption is set.  If the `xmlns` option is set, then it will contain\nnamespace binding information on the `ns` member, and will have a\n`local`, `prefix`, and `uri` member.\n\n`closetag` - A closing tag. In loose mode, tags are auto-closed if their\nparent closes. In strict mode, well-formedness is enforced. Note that\nself-closing tags will have `closetag` emitted immediately after `opentag`.\nArgument: tag name.\n\n`attribute` - An attribute node.  Argument: object with `name` and `value`.\nIn non-strict mode, attribute names are uppercased, unless the `lowercase`\noption is set.  If the `xmlns` option is set, it will also contain namespace\ninformation.\n\n`comment` - A comment node.  Argument: the string of the comment.\n\n`opencdata` - The opening tag of a `\u003c![CDATA[` block.\n\n`cdata` - The text of a `\u003c![CDATA[` block. Since `\u003c![CDATA[` blocks can get\nquite large, this event may fire multiple times for a single block, if it\nis broken up into multiple `write()`s. 
Argument: the string of random\ncharacter data.\n\n`closecdata` - The closing tag (`]]\u003e`) of a `\u003c![CDATA[` block.\n\n`opennamespace` - If the `xmlns` option is set, then this event will\nsignal the start of a new namespace binding.\n\n`closenamespace` - If the `xmlns` option is set, then this event will\nsignal the end of a namespace binding.\n\n`end` - Indication that the closed stream has ended.\n\n`ready` - Indication that the stream has reset, and is ready to be written\nto.\n\n`noscript` - In non-strict mode, `\u003cscript\u003e` tags trigger a `\"script\"`\nevent, and their contents are not checked for special xml characters.\nIf you pass `noscript: true`, then this behavior is suppressed.\n\n## Reporting Problems\n\nIt's best to write a failing test if you find an issue.  I will always\naccept pull requests with failing tests if they demonstrate intended\nbehavior, but it is very hard to figure out what issue you're describing\nwithout a test.  Writing a test is also the best way for you yourself\nto figure out if you really understand the issue you think you have with\nsax-js.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fisaacs%2Fsax-js","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fisaacs%2Fsax-js","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fisaacs%2Fsax-js/lists"}