{"id":13551927,"url":"https://github.com/WICG/scroll-to-text-fragment","last_synced_at":"2025-04-03T02:32:17.428Z","repository":{"id":43483789,"uuid":"167226608","full_name":"WICG/scroll-to-text-fragment","owner":"WICG","description":"Proposal to allow specifying a text snippet in a URL fragment","archived":false,"fork":false,"pushed_at":"2024-02-03T02:42:59.000Z","size":1794,"stargazers_count":585,"open_issues_count":28,"forks_count":43,"subscribers_count":28,"default_branch":"main","last_synced_at":"2024-07-30T05:18:22.142Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WICG.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"security-privacy-questionnaire.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-01-23T17:43:36.000Z","updated_at":"2024-07-30T05:18:26.689Z","dependencies_parsed_at":"2024-07-30T05:18:25.863Z","dependency_job_id":"cca9645f-a398-440f-96f0-a0eb88d2fcda","html_url":"https://github.com/WICG/scroll-to-text-fragment","commit_stats":null,"previous_names":["wicg/scrolltotextfragment"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WICG%2Fscroll-to-text-fragment","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WICG%2Fscroll-to-text-fragment/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WICG%2Fscroll-to-text-fragment/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WIC
G%2Fscroll-to-text-fragment/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WICG","download_url":"https://codeload.github.com/WICG/scroll-to-text-fragment/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246925459,"owners_count":20855887,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T12:01:55.915Z","updated_at":"2025-04-03T02:32:16.180Z","avatar_url":"https://github.com/WICG.png","language":"HTML","funding_links":[],"categories":["HTML","others"],"sub_categories":[],"readme":"# Text Fragments\n\n[Draft Spec](https://wicg.github.io/scroll-to-text-fragment/)  \n[Web Platform Tests](https://wpt.fyi/results/scroll-to-text-fragment?label=experimental\u0026label=master\u0026aligned)  \n[ChromeStatus entry](https://chromestatus.com/feature/4733392803332096)  \n\n## Introduction\n\nTo enable users to easily link to specific content in a web page, we propose\nadding support for specifying a text snippet in the URL. When navigating to\nsuch a URL, the browser understands more precisely what the user is interested\nin on the destination page. 
It may then provide an improved experience, for\nexample: visually emphasizing the text or automatically bringing it into view\nor allowing the user to jump directly to it.\n\n\nWeb standards currently specify support for scrolling to anchor elements with\nname attributes, as well as DOM elements with ids, when [navigating to a\nfragment](https://html.spec.whatwg.org/multipage/browsing-the-web.html#scroll-to-fragid).\nWhile named anchors and elements with ids enable scrolling to limited specific\nparts of web pages, not all documents make use of these elements, and not all\nparts of pages are addressable by named anchors or elements with ids.\n\n### Current Status\n\nThis feature, as currently [specified in this repo](https://wicg.github.io/scroll-to-text-fragment/),\nis shipping to stable channel in Chrome M80.\n\n### Motivating Use Cases\n\nWhen following a link to read a specific part of a web page, finding the\nrelevant part of the document after navigating can be cumbersome. This is\nespecially true on mobile devices, where it can be difficult to find specific\ncontent when scrolling through long pages or using the browser's \"find in page\"\nfeature. Fewer than 1% of clients use the \"Find in Page\" feature in Chrome on\nAndroid.\n\nTo enable users to more quickly find the content they're interested in, we\npropose generalizing the existing support for scrolling to elements based on\nthe fragment identifier. We believe this capability could be used by a variety\nof websites (e.g. search engine results pages, Wikipedia reference links), as\nwell as by end users when sharing links from a browser.\n\n#### Search Engines\n\nSearch engines, which link to pages that contain content relevant to user\nqueries, would benefit from being able to scroll users directly to the part of\nthe page most relevant to their query.\n\nFor example, Google Search currently links to named anchors and elements with\nids when they are available.  
For the query \"lincoln gettysburg address\nsources\", Google Search provides a link to the named anchor\n[#Lincoln’s_sources](https://en.wikipedia.org/wiki/Gettysburg_Address#Lincoln's_sources)\nfor the [wikipedia page for Gettysburg Address](https://en.wikipedia.org/wiki/Gettysburg_Address)\nas a \"Jump to\" link:\n\n![Example \"Jump to\" link in search results](jumpto.png)\n\nHowever, there are many pages with relevant passages with no named anchor or\nid, and search engines cannot provide a \"Jump to\" link in such cases.\n\n#### Citations / Reference links\n\nLinks are sometimes used as citations in web pages where the author wishes to\nsubstantiate a claim by referencing another page (e.g. references in\nWikipedia). These reference pages can often be large, so finding the exact\npassage that supports the claim can be very time consuming. By linking to the\npassage that supports their underlying claim, authors can make it more\nefficient for readers to follow their overall argument.\n\n#### Sharing a specific passage in a web page\n\nWhen referencing a specific section of a web page, for example as part of\nsharing that content via email or on social media, it is desirable to be able\nto link directly to the specific section. If a section is not linkable by a\nnamed anchor or element with id, it is not currently possible to share a link\ndirectly to a specific section.\n\nUsers may work around this by sharing screenshots of the relevant portion of\nthe document (preventing the recipient of the content from engaging with the\nactual web page that hosts the content), or by including extra instructions to\nscroll to a specific part of the document (e.g. \"skip to the sixth paragraph\").\n\nWe would like to enable users to link to the relevant section of a document\ndirectly. 
Linking directly to the relevant section of a document preserves\nattribution, and allows the user following the URL to engage directly with the\noriginal publisher.\n\n## Proposed Solution\n\n### tl;dr\n\nAllow specifying text as part of the URL fragment:\n\nhttps://example.com#:~:text=prefix-,startText,endText,-suffix\n\nUsing this syntax\n\n```\n:~:text=[prefix-,]textStart[,textEnd][,-suffix]\n\n         context  |-------match-----|  context\n```\n_(Square brackets indicate an optional parameter)_\n\nNavigating to such a URL will cause the browser to indicate the first instance\nof the matched text. The exact details of what a browser should do once it\nfinds a match are mostly beyond the scope of this proposal. Browsers are mostly\nfree to choose what kind of UI to surface, whether or not to scroll the text\ninto view on load, and how to visually emphasize it.\n\nTo restrict an attacker's ability to exfiltrate information across origins,\nseveral restrictions are applied on when such an anchor is activated. A user\nactivation is required and consumed; text matching can only occur on word\nboundaries. Additionally, the fragment will activate only if the document is\nsufficiently isolated from other pages (is the only one in its browsing context\ngroup, e.g.  no window.opener or iframes).\n\nThe text directive is delimited from the rest of the fragment using the `:~:`\ntoken to indicate that it is a _fragment directive_ that the user agent should\nprocess and then remove from the URL fragment that is exposed to the page. 
The\ndirective syntax solves the issue of compatibility with pages that rely on the\nURL fragment for routing/state, see\n[issue #15](https://github.com/WICG/ScrollToTextFragment/issues/15).\n\n### Background\n\nWe propose generalizing [existing\nsupport](https://html.spec.whatwg.org/multipage/browsing-the-web.html#find-a-potential-indicated-element)\nfor scrolling to elements as part of a navigation by adding support for\nspecifying a text snippet in the URL. We modify the [indicated part of the\ndocument](https://html.spec.whatwg.org/multipage/browsing-the-web.html#the-indicated-part-of-the-document)\nprocessing model to allow using a text snippet as the indicated part. The\nuser agent may then follow the existing logic for [scrolling to the fragment identifier](https://html.spec.whatwg.org/multipage/browsing-the-web.html#scroll-to-the-fragment-identifier)\nand/or apply other UI effects.\n\nThis extends the existing support for scrolling to anchor elements with name\nattributes, as well as DOM elements with ids, to scrolling to other textual\ncontent on a web page. Browsers first attempt to find an element that matches\nthe fragment using the existing support for elements with id attributes and\nanchor elements with name attributes. If no matches are found, browsers then\nwill process the text snippet specification.\n\n### Usability Goals\n\n * Users should be able to specify multiple, non-contiguous passages. There are\n   two reasons this is important. The first is intrinsic; users sometimes want\n   to emphasize multiple snippets of a larger text. [Examples](https://twitter.com/KingJames/status/1158904415618662400)\n   [abound](https://twitter.com/surn_name/status/1205397168342716416) on\n   [Twitter](https://twitter.com/anildash/status/574389867154661377).\n\n   The second is to deal with complicated DOM cases where DOM order and text\n   order don't align. 
A common example would be a column in a table, or a\n   contiguous paragraph with an inline ad.\n\n * The user may wish to specify text that spans multiple paragraphs, list items,\n   table entries, and other structures. Our proposal aims to allow users to\n   target text crossing arbitrary DOM and visual boundaries.\n\n * The text the user wishes to target may not be unique on the page. The\n   solution must account for this by providing ways to disambiguate multiple\n   matches on a page.\n\n * Such links should be creatable for arbitrary pages across the web. This\n   means they must be compatible with the vast majority of existing and future\n   web sites.\n\n### Identifying a Text Snippet\n\nHere's an example URL encoding some text to indicate on the destination page:\n\nhttps://en.wikipedia.org/w/index.php?title=Cat\u0026oldid=916388819#:~:text=Claws-,Like%20almost,the%20Felidae%2C,-cats\n\n```\n:~:text=[prefix-,]textStart[,textEnd][,-suffix]\n\n         context  |-------match-----|  context\n```\n_(Square brackets indicate an optional parameter)_\n\nThough existing HTML support for id and name attributes specifies the target\nelement directly in the fragment, most other mime types make use of this x=y\npattern in the fragment, such as [Media\nFragments](https://www.w3.org/TR/media-frags/#media-fragment-syntax) (e.g.\n#track=audio\u0026t=10,20), [PDF](https://tools.ietf.org/html/rfc3778#section-3)\n(e.g. #page=12) or [CSV](https://tools.ietf.org/html/rfc7111#section-2) (e.g.\n#row=4).\n\nThe _text_ keyword will be used to identify a block of text that should be\nindicated.  The provided text is percent-decoded before matching. 
Dash (-),\nampersand (\u0026), and comma (,) characters in text snippets must be\npercent-encoded to avoid being interpreted as part of the text fragment\nsyntax.\n\nThe [URL standard](https://url.spec.whatwg.org/) specifies that a fragment can\ncontain [URL code points](https://url.spec.whatwg.org/#url-code-points), as\nwell as [UTF-8 percent encoded\ncharacters](https://url.spec.whatwg.org/#utf-8-percent-encode). Characters in\nthe [fragment percent encode\nset](https://url.spec.whatwg.org/#fragment-percent-encode-set) must be percent\nencoded.\n\nThere are two kinds of terms specified in the text directive: the _match_ and\nthe _context_. The match is the portion of text that’s to be indicated. The\ncontext is used only to disambiguate the match and is not highlighted.\n\nContext is optional, it need not be provided. However, the text directive must\nalways specify a match term.\n\n#### Match\nA match can be specified as either a single argument or as a pair.\n\nIf the match is provided using two arguments, the left argument is considered\nthe starting snippet and the right argument is considered the ending snippet\n(e.g. `text=_startText_,_endText_`). In this case, the browser will perform\na \"range search\" for a block of text that starts with _startText_ and ends with\n_endText_. If multiple blocks match the first in DOM order is chosen (i.e. find\nthe first occurrence of startText, from there find the first occurrence of\nendText). When a match is specified with two arguments, we allow highlighting\ntext that spans multiple elements.\n\nIf the match is specified as a single argument, we consider it an \"exact\nsearch\" (e.g. `text=_textSnippet_`). The browser will highlight the first\noccurrence of exactly the _textSnippet_ string. 
In this case, the specified text\nwill be matched only if it is contained within a single node.\n\nRange matches are useful when the desired text match is extremely long.\nFor example, selecting multiple paragraphs of text using an exact match would\nresult in a very long and cumbersome URL.\n\n\u003ctable\u003e\u003ctr\u003e\u003ctd\u003e\nE.g. Given:\n            \n * Text1\n * Text2\n * Text3\n * Text4\n\n`text=Text2,Text4` will highlight all items except the first:\n\n* Text1\n* __Text2__\n* __Text3__\n* __Text4__\n\n`text=Text2` will highlight just the second item:\n\n* Text1\n* __Text2__\n* Text3\n* Text4\n\n\u003c/td\u003e\u003c/tr\u003e\u003c/table\u003e\n\n#### Context\n\nTo disambiguate non-unique snippets of text on a page, arguments can\nspecify optional _prefix_ and _suffix_ terms. If provided, the match term will\nonly match text that is immediately preceded by the _prefix_ text and/or\nimmediately followed by the _suffix_ text (allowing for an arbitrary amount of\nwhitespace in between). Immediately preceded, in these cases, means there are\nno other text nodes between the match and the context term in DOM order. There\nmay be arbitrary whitespace and the context text may be the child of a\ndifferent element (i.e. searching for context crosses element boundaries).\n\nIf provided, the prefix must end (and suffix must begin) with a dash (-)\ncharacter. This is to disambiguate the prefix and suffix in the presence of\noptional parameters. 
It also leaves open the possibility of extending the\nsyntax in the future to allow multiple context terms, allowing more complicated\ncontext matching across elements.\n\nIf provided, the prefix must be the first argument to the text directive.\nSimilarly, the suffix must be the last argument.\n\n\u003ctable\u003e\u003ctr\u003e\u003ctd\u003e\n\nFor example, suppose we want to perform the following highlight:\n\n![The highlighted text appears multiple times](draft96.png)\n\nSince the text “United States” is ambiguous, we must provide a suffix to disambiguate it:\n\n`text=United States,-Minnesota Timberwolves`\n\n\u003c/td\u003e\u003c/tr\u003e\u003c/table\u003e\n\n### Multiple Text Directives\n\nUsers can specify multiple snippets by providing additional text directives in\nthe _fragment directive_, separated by the ampersand (\u0026) character.\n\nEach `text=` directive is considered independent in the sense that success or\nfailure to match in one does not affect matching of any others. Each starts\nsearching from the top of the document.\n\nOnly the left-most, successfully matched, directive will be the indicated part\nof the document (i.e. used as the CSS target, scrolled into view). That is, if\n“foo” did not appear anywhere on the page but “bar” does, we scroll “bar” into\nview. However, all matched directives will be visually indicated on the page.\n\n\u003ctable\u003e\u003ctr\u003e\u003ctd\u003e\nFor example:\n\n```\nexample.com#:~:text=foo\u0026text=bar\u0026text=baz\n```\n\nwill target each of “foo”, “bar”, and “baz” and use the “foo” result as the\nindicated part of the document, assuming all appear on the page.\n\n\u003c/td\u003e\u003c/tr\u003e\u003c/table\u003e\n\nMultiple terms can be useful when the desired text has unrelated inline\nelements like images, ads, tables, etc:\n\n![Highlighted text has an unrelated table inline](baracuda.png)\n\nUsers may also wish to emphasize multiple passages of a larger text. 
We've\nfound many such examples online:\n\n![Example of a screenshot with multiple highlights](twitter.png)\n\n### Fragment Directive\n\nSome existing pages on the web use fragments for their own state/routing. These\npages may break if an unexpected fragment is provided. See\n[#15](https://github.com/WICG/ScrollToTextFragment/issues/15).\n\nElement-id based fragments also cause these pages to break; however, text\nfragments are much more likely to be user-generated and are thus more likely to\ncause unexpected breakage. Pages that rely on fragment routing are also\nunlikely to provide anchor points, whereas they are likely to have text.\n\nOur solution to this is to introduce the concept of a _fragment directive_.\nThe fragment directive is a specially-delimited part of the URL fragment that\nis meant for UA instructions only. It's stripped out from the URL during\ndocument loading so that it's completely invisible to the page.\n\nThis allows specifying UA instructions like a text fragment in a way that's\nguaranteed not to interfere with page script and ensures maximal compatibility\nwith the existing web.\n\nHowever, stripping arbitrary parts of a fragment may not be web compatible! We\nwent through several ideas here:\n\n#### The Double-Hash\n\nWe tried delimiting the fragment directive using `##`. It's ergonomic and works\nwell since, if the original URL doesn't have a fragment, the double-hash\ndelimiter will already be parsed as a fragment!\n\nHowever, `#` is [not a valid code\npoint](https://url.spec.whatwg.org/#url-code-points) in the URL spec. As was\nexplained in a thread on the [w3.org URI mailing\nlist](https://lists.w3.org/Archives/Public/uri/2019Sep/0000.html), some URL\nparsers parse from right to left. Having an additional `#` character will cause\nthese parsers to break. 
Worse, we don't have a good way to measure the risk.\n\nUse counters we added to Chrome in M77 showed that, on Windows, about 0.08% of\npage loads already have a `#` character in the fragment. While small, that's a\nnon-trivial percentage.\n\n#### Enter :~:\n\nA new delimiter would have to be both spec-compliant with the URL spec and have\nsufficiently low usage on the existing web such that this change would be\nweb-compatible.\n\nWe assumed this would preclude any single or double character sequences and\nproduced a list of candidates to consider:\n* !~!\n* !~~!\n* \\~\u0026\\~\n* :~:\n* \\~@\\~\n* \\~\\_\\~\n* \\_~\\_\n\nWe also considered using a more verbose delimiter:\n* \u0026directive\n* @directive\n* $directive\n* /directive\n* -directive\n\nLooking through links seen in the last 5 years by the Google Search crawler, we\neliminated some of this list. None of the \"verbose\" list had been seen;\nhowever, given valid candidates in the first list, we preferred them for\nsuccinctness and to reduce English-centric keywords.\n\nOf the above list, the following had rarely or never been seen in a URL fragment by the\ncrawler:\n\n* \\~\u0026\\~ no hits\n* :~: no hits\n* \\~@\\~ one hit\n\nWhile this doesn't guarantee compatibility, it did give us some confidence.  We\nchose `:~:` from this list somewhat arbitrarily. However, we've also added\nChrome use-counters to M78 for all these delimiters. `:~:` is seen on fewer\nthan 0.0000039% of page loads (or about 1 in 25 million) so we currently\nbelieve this is a safe choice.\n\n#### Directives and Delimiters\n\nWhen appending the `:~:` token to a URL, it must appear inside a fragment so a\n`#` must also be added:\n\n`https://example.com` --\u003e `https://example.com#:~:text=foo`\n\nHowever, a URL with an existing fragment can simply be appended to:\n\n`https://example.com#fallback:~:text=foo`\n\nIn this case, if the text match isn't found, the browser can fall back to\nscrolling the element-id specified in the fragment (e.g. 
id=\"fallback\" in this\ncase). Note that the text directive will always begin searching at the top of\nthe document, even if a matching element-id fragment is provided.\n\n#### Compatibility and Interop\n\nUser agents that haven't implemented this feature won't know how to process the\nfragment directive. Because it is part of the fragment, on most pages this will\nsimply be processed as a non-existent fragment so the page will load scrolled\nto the top, as if a fragment weren't supplied. This is a graceful fallback.\n\nA riskier scenario is apps that use the fragment for state and routing. In\nthese cases, the page is using the fragment in an application-defined manner and\nadding any content to it may impact how the page operates (this is one of the\nmotivating cases for using the fragment delimiter for `text=`).\n\nIn the worst case, such a URL on an unimplementing UA may navigate to a broken\npage. However, most such pages we've seen handle this gracefully, e.g.:\n\nhttps://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/OOZIrtSPLeM:~:text=test\n\nis a Google Groups post with a directive appended. Loading it in an\nunimplementing UA displays a \"The input is invalid.\" toast in the corner but the\npage otherwise loads as if without the directive. We expect many cases will\nbehave similarly but the potential for more serious breakage does exist.\n\nNote: the fragment directive behavior (stripping everything after and including\nthe `:~:` delimiter from the fragment) can be implemented independently of the\nlarger proposal.\n\n### Feature Detection and Future APIs\n\nAn author may wish to detect whether a UA has implemented support for\ntext-fragments. This can be used by pages that generate such links to avoid\ngenerating fragment-directives for non-implementing UAs. 
It can also be used by\nlibraries or authors to strip the fragment-directive from user or author\ngenerated links.\n\nThis proposal includes a new property on the `document` object:\n\n```\ndocument.fragmentDirective\n```\n\nAuthors can check for the existence of this (currently empty) object to\ndetermine if a UA has implemented support for text-fragments.\n\nThis also serves as an extension point for future APIs. For example, we'd like\nto expose information about the text-fragments included in the URL so that\nauthors can build functionality on it. See\n[#128](https://github.com/WICG/scroll-to-text-fragment/issues/128) for more\ndetails.\n\n### :target\n\nFor element-id based fragments (e.g.\nhttps://en.wikipedia.org/wiki/Cat#References), navigation causes the identified\nelement to receive the `:target` CSS pseudo-class. This is a nice feature as it\nallows the page to add some customized highlighting or styling for an element\nthat’s been targeted. For example, note that navigating to a citation on a\nWikipedia page highlights the citation text:\nhttps://en.wikipedia.org/w/index.php?title=Cat\u0026direction=prev\u0026oldid=916388819#cite_note-Linaeus1758-1\nThe `:target` CSS pseudo-class can only apply to elements whereas a text\nsnippet may only be a portion of the text in a node or span multiple nodes.\n\nThe `:target` pseudo-class is applied to the first common ancestor element that\ncontains all the matching text, for the left-most matching `text=` directive.\n\n### Security Considerations\n\n_Some of the more detailed reasoning behind the security decisions is described\nin our [security review doc](https://docs.google.com/document/d/1YHcl1-vE_ZnZ0kL2almeikAj2gkwCq8_5xwIae7PVik/edit#heading=h.g7hd03ifqsc)_\n\nIf an attacker can detect a side-effect of a successful match, this feature\ncould be used to detect the presence of arbitrary text on the page. 
For\nexample, if the UA scrolls to the targeted text on navigation, an attacker\nmight be able to determine whether a scroll occurred by listening to network\nrequests or using an IntersectionObserver from an attacker-controlled iframe\nembedded on the target page.\n\nA related attack is possible if the existence of a match takes significantly\nmore or less work than non-existence. An attacker can navigate to a text\n_fragment directive_ and time how busy the JS thread is; a high load may imply\nthe existence or non-existence of an arbitrary text snippet. This is a\nvariation of a documented\n[proof-of-concept](https://blog.sheddow.xyz/css-timing-attack/).\n\nUAs are free to determine how a successfully matched text fragment should be\nsurfaced to the user based on their own assessment of how much risk certain\nactions present. For example, whether scrolling on navigation is likely to be\ndetectable in enough cases.\n\nTo prevent brute force attacks from guessing important words on a page (e.g.\npasswords, pin codes), matches and prefix/suffix are only matched on word\nboundaries. E.g.  “range” will match in “mountain range” but not in “color\norange” nor “forest ranger”.\n\nWord boundaries are simple in languages with spaces but can become more subtle\nin languages without breaks (e.g. Chinese). A library like ICU [provides\nsupport](http://userguide.icu-project.org/boundaryanalysis#TOC-Word-Boundary)\nfor finding word boundaries across all supported languages based on the Unicode\nText Segmentation standard. Some browsers already allow word-boundary\nmatching for the window.find API which allows specifying wholeWord as an\nargument. We hope this existing usage can be leveraged in the same way.\n\nAdditionally, a text directive is invoked only if a user activation occurred and\nthe loaded document is the only one in its browsing context group. 
The latter\nrestriction effectively requires that `rel=noopener` be specified on a\nnavigation.\n\nVisual emphasis is performed using a visual-only indicator (i.e. one that doesn’t cause\nselection), styled by the UA and undetectable from script. This helps prevent\ndrag-and-drop or copy-paste attacks.\n\n#### Client-Side Redirects\n\nDue to the prevalence of client-side redirects (i.e. loading a document that\nnavigates via e.g. `window.location`), special care is taken to enable these\nscenarios, despite the fact they lack a user activation. See\n[redirects.md](redirects.md) for details.\n\n### Opting Out\n\nFor product reasons, or acute privacy restrictions, pages may wish to disallow\nscrolling to a text fragment (or regular fragment) on load, see\n[#80](https://github.com/WICG/ScrollToTextFragment/issues/80). To allow websites\nto opt out of text fragments, we propose adding a [Document\nPolicy](https://github.com/w3c/webappsec-feature-policy/blob/master/document-policy-explainer.md)\nnamed force-load-at-top that ensures the page is loaded without any form of\nscrolling, including via text fragments, regular element fragments, and scroll\nrestoration. Websites can use this document policy by serving the HTTP header:\n\n```\nDocument-Policy: force-load-at-top\n```\n\n## Alternatives Considered\n\n### Text Fragment Directive 0.1\n\nA prior revision of this document contained a somewhat similar proposal. The\nmain difference in the updated proposal is that it adds context terms to the\ntext directive. This helps to disambiguate text on a page and\nbrings this proposal more in line with Open Annotation's\n[TextQuoteSelector](https://www.w3.org/TR/annotation-model/#text-quote-selector).\nMany use cases and details were considered while iterating on the initial\nrevision. 
The updated proposal is a sum of lessons learned and improved\nunderstanding as we experimented with and considered the initial version and\nits limitations.\n\n### CSS Selector Fragments\n\nOur initial idea, explored in some detail, was to allow encoding a CSS selector\nin the URL fragment. The selector would determine which element on the page\nshould be the \"indicated element\" in the [navigating to a\nfragment](https://html.spec.whatwg.org/multipage/browsing-the-web.html#scroll-to-fragid)\nsteps. In fact, this explainer is based on @bryanmcquade's original [CSS\nSelector Fragment\nexplainer](https://github.com/bryanmcquade/scroll-to-css-selector).\n\nThe main drawback with this approach was making it secure. Allowing scroll on\nload to a CSS selector opens several ways for an attacker to exfiltrate hidden\ninformation (e.g. CSRF tokens) from the page. One such attack is demonstrated\n[here](https://blog.sheddow.xyz/css-timing-attack/) but others were quickly\ndiscovered as well.\n\nTrying to pare down the allowable set of primitives to make selectors secure\nturned out to be quite complex. Text snippets, which can be searched\nasynchronously and are generally less security sensitive, became our preferred\nsolution. As an additional bonus, we expect text snippets to be more stable and\neasier for non-technical users to understand.\n\n### Increase use of elements with named anchors / id attributes in existing web pages\n\nAs an alternative, we could ask web developers to include additional named\nanchor tags in their pages, and reference those new anchors. There are two\nissues that make this less appealing. First, legacy content on the web won’t\nget updated, but users consuming that legacy content could still benefit from\nthis feature. Second, it is difficult for web developers to reason about all of\nthe possible points other sites might want to scroll to in their pages. 
Thus,\nto be most useful, we prefer a solution that supports scrolling to any point in\na web page.\n\n### JavaScript-based API (instead of URL fragment)\n\nWe also considered specifying the target element via a JavaScript-based\nnavigation API, such as via a new parameter to location.assign(). We\nconcluded that such an API is less useful, as it can only be used in contexts\nwhere JavaScript is available. Sharing a link to a specific part of a document\nis one use case that would not be possible if the target element was specified\nvia a JavaScript API. Using a JavaScript API is also less consistent with\nexisting cases where a scroll target is specified in a URL, such as the\nexisting support in HTML, as well as support for other document formats such as\nPDF and CSV.\n\n## Future Work\n\nOne important use case that's not covered by this proposal is being able to\nscroll to an image. A nearby text snippet can be used to scroll to the image\nbut it depends on the page and is indirect. We'd eventually like to support\nthis use case more directly.\n\nA potential option is to consider this just one of many available [Open\nAnnotation selectors](https://www.w3.org/TR/annotation-model/#selectors).\nFuture specification and implementation work could allow using selectors other\nthan TextQuote to allow targeting various kinds of content.\n\nAnother avenue of exploration is allowing users to specify highlighting in more\ndetail. There are also cases where the user may wish to prevent highlights\naltogether, as in the image search case described above.\n\nWe've thought about these cases insofar as making sure our proposed solution\ndoesn't preclude these enhancements in the future. 
However, the work of\nactually realizing them will be left for future iterations of this effort.\n\n## Additional Considerations\n\n### Constructing Arguments to Text Fragments\n\nWe imagine URLs with text fragment directives to primarily be\nmachine-generated rather than crafted by hand by users. At the same time, we\nbelieve there's a benefit to keeping the URL relatively\nhuman-readable: in most cases, simply copying and pasting the desired passage\nshould generate a text fragment directive that will scroll and highlight the\ndesired passage.\n\nThe two systems that we believe will generate the bulk of such URLs are\nbrowsers and search engines. We foresee users selecting text from the browser,\nwith an option to \"share a link to here\". These links can then be shared\nfurther as Wikipedia reference links or over channels like social media or\nemail.\n\nSearch engines can also generate text directive URLs as links to search results\nfor user queries; these links may scroll to and highlight relevant passages to\nthe user's query. Note that even though using the selected text as the\ntextStart argument to the text directive may work reasonably well in practice\nas a heuristic, generating URLs targeting arbitrary text requires access to\nthe full document text up to the desired text. Both browsers and search\nengines have access to the entire visible text of the page, so it is indeed\npossible for these systems to generate proper URLs with text directive\narguments that scroll and highlight any arbitrary text.\n\n### Web and Browser Compatibility\n\nAs noted in [issue #15](https://github.com/WICG/ScrollToTextFragment/issues/15),\nweb pages could potentially be using the fragment to store parameters, e.g.\n`http://example.com/#name=test`. If sites don't handle unexpected tokens when\nprocessing the fragment, this feature could break those sites. In particular,\nsome frameworks use the fragment for routing. 
This is solved by the user agent\nhiding the :~:text part of the fragment from the site, but browsers that\ndo not have this feature implemented would still break such sites.\n\nFor pages that don't process the fragment, a browser that doesn't yet support\nthis feature will attempt to process the fragment and _fragment directive_\n(i.e. :~:text) using the existing logic to find a [potential indicated\nelement](https://html.spec.whatwg.org/multipage/browsing-the-web.html#find-a-potential-indicated-element).\nIf a fragment exists in the URL alongside the _fragment directive_, the browser\nmay not scroll to the desired fragment due to the confusion with parsing the\n_fragment directive_.  If a fragment does not exist alongside the _fragment\ndirective_, the browser will just load the page and won't initiate any\nscrolling.  In either case, the browser will just fall back to the default\nbehavior of not scrolling the document.\n\n### Relation to existing support for navigating to a fragment\n\nBrowsers currently support scrolling to elements with ids, as well as anchor\nelements with name attributes. This proposal is intended to extend this\nexisting support, to allow navigating to additional parts of a document. As\nShaun Inman [notes](https://shauninman.com/archive/2011/07/25/cssfrag) (in\nsupport of CSS selector fragments), this feature is \"not meant to replace more\nconcise, author-designed urls\" using id attributes, but rather \"enables a\nsite’s users to address specific sub-content that the site’s author may not\nhave anticipated as being interesting\".\n\n## Related Work / Additional Resources\n\n### Using CSS Selectors as Fragment Identifiers\n\nSimon St. Laurent and Eric Meyer\n[proposed](http://simonstl.com/articles/cssFragID.html) using CSS Selectors as\nfragment identifiers (last updated in 2012). Their proposal differs only in\nsyntax used: St. 
Laurent and Meyer proposed specifying the CSS selector using a\n```#css(...)``` syntax, for example ```#css(.myclass)```. This syntax is based\non the XML Pointer Language (XPointer) Framework, an \"extensible system for XML\naddressing\" ... \"intended to be used as a basis for fragment identifiers\".\nXPointer does not appear to be supported by commonly used browsers, so we have\nelected to not depend on it in this proposal.\n\n[Shaun Inman](https://shauninman.com/archive/2011/07/25/cssfrag) and others\nlater implemented browser extensions using this #css() syntax for Firefox,\nSafari, Chrome, and Opera, which shows that it is possible to implement this\nfeature across a variety of browsers.\n\nThe [Open Annotation Community\nGroup](https://www.w3.org/community/openannotation/) aims to allow annotating\narbitrary content. There is significant overlap in our goal of specifying a\nsnippet of text in a resource. In fact, they've already specified a\n[TextQuoteSelector](https://www.w3.org/TR/annotation-model/#text-quote-selector)\nfor similar purposes.\n\nThis proposal has been made similar to the TextQuoteSelector in hopes that we\ncan extend and reuse that processing model rather than inventing a new one,\nalbeit with a stripped down syntax for ease of use in a URL. 
Our work has been\ninformed specifically by prior efforts at selecting arbitrary textual content\nfor an annotation.\n\nScroll Anchoring\n\n* [https://drafts.csswg.org/css-scroll-anchoring/](https://github.com/WICG/ScrollAnchoring/blob/master/explainer.md)\n* [https://docs.google.com/document/d/1YaxJ0cxFADA_xqUhGgHkVFgwzf6KXHaxB9hPksim7nc/edit](https://docs.google.com/document/d/1YaxJ0cxFADA_xqUhGgHkVFgwzf6KXHaxB9hPksim7nc/edit)\n\nScroll to text\n\n* [https://indieweb.org/fragmention](https://indieweb.org/fragmention)\n* [http://zesty.ca/crit/draft-yee-url-textsearch-00.txt](http://zesty.ca/crit/draft-yee-url-textsearch-00.txt)\n* [http://1997.webhistory.org/www.lists/www-talk.1995q1/0284.html](http://1997.webhistory.org/www.lists/www-talk.1995q1/0284.html)\n* [Fragment Search - A Greasemonkey script by Gervase Markham](http://www.gerv.net/software/fragment-search/)\n* [NYT Emphasis](https://open.blogs.nytimes.com/2011/01/11/emphasis-update-and-source/)\n\nOther\n\n* [https://en.wikipedia.org/wiki/Fragment_identifier#Examples](https://en.wikipedia.org/wiki/Fragment_identifier#Examples)\n* [https://www.w3.org/TR/2017/REC-annotation-model-20170223/](https://www.w3.org/TR/2017/REC-annotation-model-20170223/)\n\n## Acknowledgements\n\nMany people have contributed greatly to the ideas and content in this repo, both through excellent work on linking\nto text as well as direct feedback and comments in issues on this repo which helped to improve this feature. In particular,\nwe'd like to thank:\n\n * @BigBlueHat\n * Ivan Herman\n * Randall Leeds\n * Kevin Marks\n * Isiah Meadows\n * Wes Turner\n * Dan Whaley\n * Gerben\n * And many others who've provided comments, questions, examples, and opinions. 
Thank you!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FWICG%2Fscroll-to-text-fragment","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FWICG%2Fscroll-to-text-fragment","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FWICG%2Fscroll-to-text-fragment/lists"}