{"id":50665660,"url":"https://github.com/samuraiwriter7/purity-detection-algorithm-v0.1","last_synced_at":"2026-06-08T06:04:46.691Z","repository":{"id":360105133,"uuid":"1248723604","full_name":"SamuraiWriter7/purity-detection-algorithm-v0.1","owner":"SamuraiWriter7","description":"A draft algorithmic specification for estimating origin purity, AI-generated ratio, warning flags, and review readiness in AI source-preservation systems.","archived":false,"fork":false,"pushed_at":"2026-05-25T02:11:23.000Z","size":25,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-25T03:25:39.673Z","etag":null,"topics":["ai-governance","ai-provenance","civilization-os","epicenter-preservation","model-collapse","purity-protocol","rag-provenance","royalty-os","source-preservation","trace-protocol"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SamuraiWriter7.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-25T01:49:11.000Z","updated_at":"2026-05-25T02:04:11.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/SamuraiWriter7/purity-detection-algorithm-v0.1","commit_stats":null,"previous_names":["samuraiwriter7/purity-detection-algorithm-v0.1"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/SamuraiWriter7/purity-detection-algorithm-v0.1","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamuraiWriter7%2Fpurity-detection-algorithm-v0.1","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamuraiWriter7%2Fpurity-detection-algorithm-v0.1/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamuraiWriter7%2Fpurity-detection-algorithm-v0.1/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamuraiWriter7%2Fpurity-detection-algorithm-v0.1/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SamuraiWriter7","download_url":"https://codeload.github.com/SamuraiWriter7/purity-detection-algorithm-v0.1/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamuraiWriter7%2Fpurity-detection-algorithm-v0.1/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34050246,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-08T02:00:07.615Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-governance","ai-provenance","civilization-os","epicenter-preservation","model-collapse","purity-protocol","rag-provenance","royalty-os","source-preservation","trace-protocol"],"created_at":"2026-06-08T06:04:31.706Z","updated_at":"2026-06-08T06:04:46.685Z","avatar_url":"https://github.com/SamuraiWriter7.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Epicenter Preservation OS Protocol v0.1\n\n**Status:** Draft v0.1.1  \n**Version:** 0.1.1  \n**Release Date:** 2026-05-25\n\nEpicenter Preservation OS Protocol is a draft specification for preserving primary sources, reference traces, value circulation, and model-collapse monitoring in AI civilization.\n\nThis protocol defines a minimal structure for identifying and preserving **Epicenters** — primary sources of thought, expression, observation, practice, and structural invention — while making AI reference events, purity signals, and royalty-readiness review more traceable.\n\nThe goal is not to prohibit AI-assisted creation.\n\nThe goal is to preserve the source ecology of AI civilization.\n\n---\n\n## Concept\n\nAI systems increasingly depend on human-created sources:\n\n- articles\n- books\n- notes\n- logs\n- datasets\n- talks\n- protocols\n- structural concepts\n- firsthand observations\n\nHowever, when AI systems consume, summarize, rewrite, and recursively reuse these sources, the original Epicenter can become difficult to identify.\n\nThis creates several risks:\n\n- primary sources become invisible\n- reference events are not recorded\n- creators are not credited or compensated\n- synthetic data begins to replace natural data\n- model-collapse risk increases\n- AI civilization loses source diversity\n\nEpicenter Preservation OS Protocol proposes a minimal structure for addressing these risks.\n\n```text\nEpicenter\n↓\nReferenceEvent\n↓\nTraceProtocol\n↓\nPurityProtocol\n↓\nCollapseMonitor\n↓\nRoyaltyProtocol / Allocation Review\n```\n\n---\n\n## Core Idea\n\nThe protocol separates five layers:\n\n1. **Epicenter Layer**  \n   Identifies the original source unit.\n\n2. **Reference Layer**  \n   Records when an AI system references or uses an Epicenter.\n\n3. **Purity Layer**  \n   Estimates the origin composition of a source, including natural, synthetic, hybrid, or recursive synthetic signals.\n\n4. **Collapse Monitoring Layer**  \n   Observes ecosystem-level risks caused by declining natural-data ratios or recursive synthetic reuse.\n\n5. **Royalty / Value Circulation Layer**  \n   Provides a path toward auditable value return, without directly automating final allocation decisions.\n\n---\n\n## Design Principles\n\n### 1. Preserve the Source\n\nAI civilization depends on primary sources.\n\nIf the source layer collapses, downstream AI systems become increasingly shallow, repetitive, and derivative.\n\n### 2. Trace Before Allocation\n\nReference events must be recorded before any royalty or compensation logic is applied.\n\n```text\nNo trace\n↓\nNo reliable allocation\n```\n\n### 3. Purity Is Not Moral Judgment\n\nOrigin purity is not a measure of artistic value, legal authorship, or human worth.\n\nIt is an ecological signal for understanding data composition.\n\n```text\nPurity ≠ Value\nPurity ≠ Copyright\nPurity ≠ Authorship\nPurity = estimated origin composition\n```\n\n### 4. AI-Assisted Creation Is Not Invalid\n\nThe protocol does not reject AI-assisted work.\n\nInstead, it distinguishes:\n\n- human-origin sources\n- AI-assisted sources\n- hybrid sources\n- synthetic sources\n- recursively generated sources\n\n### 5. Review Before Royalty\n\nPurity scores and trace records should support review.\n\nThey should not automatically determine legal ownership or final payment.\n\n---\n\n## Repository Structure\n\n```text\n.\n├── README.md\n├── LICENSE\n├── CHANGELOG.md\n├── CITATION.cff\n├── spec/\n│   └── epicenter-preservation-os-v0.1.yaml\n├── schemas/\n│   ├── epicenter-preservation-os.schema.json\n│   └── purity-assessment.schema.json\n├── docs/\n│   └── purity-detection-algorithm.md\n├── examples/\n│   ├── epicenter.sample.yaml\n│   ├── reference-event.sample.yaml\n│   ├── royalty-record.sample.yaml\n│   ├── model-profile.sample.yaml\n│   └── purity-assessment.sample.yaml\n└── .github/\n    └── workflows/\n        └── validate-examples.yml\n```\n\n---\n\n## Key Documents\n\n### `spec/epicenter-preservation-os-v0.1.yaml`\n\nDefines the core entities and protocol layers of the Epicenter Preservation OS, including:\n\n- `Epicenter`\n- `ReferenceEvent`\n- `RoyaltyRecord`\n- `ModelProfile`\n- `TraceProtocol`\n- `RoyaltyProtocol`\n- `PurityProtocol`\n- `CollapseMonitor`\n\n---\n\n### `docs/purity-detection-algorithm.md`\n\nDefines the draft algorithmic framework for estimating:\n\n- `origin_purity_score`\n- `ai_generated_ratio`\n- `warning_flags`\n\nThis document supports:\n\n- natural / synthetic data separation\n- hybrid data classification\n- recursive synthetic data risk detection\n- model-collapse monitoring\n- reference auditing\n- royalty-readiness review\n\n---\n\n### `schemas/epicenter-preservation-os.schema.json`\n\nProvides a simplified machine-readable schema for validating the main protocol objects.\n\n---\n\n### `schemas/purity-assessment.schema.json`\n\nProvides a machine-readable validation schema for purity assessment records, including:\n\n- signal scores\n- confidence values\n- warning flags\n- review status\n- downstream-use permissions\n\n---\n\n### `examples/`\n\nContains sample objects for:\n\n- Epicenter registration\n- AI reference logging\n- royalty records\n- model profiles\n- purity assessment results\n\n---\n\n## Start Here\n\nThe recommended reading order is:\n\n### 1. Core Protocol\n\nStart with:\n\n```text\nspec/epicenter-preservation-os-v0.1.yaml\n```\n\nThis file defines the main structure of the protocol.\n\nIt introduces the core entities:\n\n- `Epicenter`\n- `ReferenceEvent`\n- `RoyaltyRecord`\n- `ModelProfile`\n\nAnd the core protocol layers:\n\n- `TraceProtocol`\n- `RoyaltyProtocol`\n- `PurityProtocol`\n- `CollapseMonitor`\n\n---\n\n### 2. Purity Detection Algorithm\n\nThen read:\n\n```text\ndocs/purity-detection-algorithm.md\n```\n\nThis document explains how the system may estimate the origin purity of an Epicenter.\n\nIt defines:\n\n- natural data\n- synthetic data\n- hybrid data\n- recursive synthetic data\n- `origin_purity_score`\n- `ai_generated_ratio`\n- `warning_flags`\n- human / multi-wing review triggers\n\nThis layer is especially important because the protocol does not treat AI-assisted creation as invalid.\n\nInstead, it separates ecological data-quality signals from legal authorship, copyright ownership, or final compensation decisions.\n\n---\n\n### 3. Purity Assessment Example\n\nReview:\n\n```text\nexamples/purity-assessment.sample.yaml\n```\n\nThis file shows how a purity assessment result may be represented in practice.\n\nIt includes:\n\n- input signal scores\n- score basis notes\n- `origin_purity_score`\n- `ai_generated_ratio`\n- `warning_flags`\n- review status\n- downstream-use decisions\n\n---\n\n### 4. Schema Layer\n\nReview:\n\n```text\nschemas/epicenter-preservation-os.schema.json\nschemas/purity-assessment.schema.json\n```\n\nThese files provide machine-readable validation structures for the core protocol objects and purity assessment outputs.\n\n---\n\n### 5. Example Objects\n\nFinally, review the remaining sample files in:\n\n```text\nexamples/\n```\n\nThese examples show how Epicenters, reference events, royalty records, model profiles, and purity assessment results may be represented in practice.\n\n---\n\n## Validation\n\nThis repository includes a GitHub Actions workflow for validating example files against their corresponding JSON Schemas.\n\nThe validation workflow is defined in:\n\n```text\n.github/workflows/validate-examples.yml\n```\n\nCurrent validation targets:\n\n```text\nexamples/purity-assessment.sample.yaml\n↓\nschemas/purity-assessment.schema.json\n```\n\nThe workflow checks that the sample purity assessment object conforms to the schema definition, including:\n\n- required fields\n- score ranges from `0.0` to `1.0`\n- allowed `method` values\n- allowed `warning_flags`\n- review status structure\n- downstream-use permission structure\n- ISO 8601 date-time format for `assessed_at`\n\nThis validation layer supports the `PurityProtocol` defined in the core protocol by ensuring that `origin_purity_score`, `ai_generated_ratio`, and `warning_flags` can be represented in a machine-readable and testable format.\n\nTo run the validation automatically, push changes to the `main` branch or open a pull request.\n\nThe workflow can also be triggered manually from the GitHub Actions tab.\n\n### Local Validation Concept\n\nThe GitHub Actions workflow uses:\n\n```text\nPython 3.12\njsonschema\nPyYAML\n```\n\nThe validation process is:\n\n```text\nLoad YAML example\n↓\nLoad JSON Schema\n↓\nValidate example against schema\n↓\nReport pass / fail\n```\n\nIf validation fails, the workflow prints the failing field path and the corresponding schema error.\n\nThis makes the protocol easier to maintain as the specification evolves.\n\n---\n\n## Core Entities\n\n### Epicenter\n\nAn Epicenter is a primary source unit.\n\nIt may represent:\n\n- an article\n- a note\n- a book\n- a talk\n- a log\n- a dataset\n- a protocol\n- a structural concept\n\nExample fields include:\n\n- `id`\n- `type`\n- `source_platform`\n- `author_id`\n- `created_at`\n- `updated_at`\n- `language`\n- `tags`\n- `structure_fingerprint`\n- `origin_purity_score`\n- `ai_generated_ratio`\n- `metadata`\n\n---\n\n### ReferenceEvent\n\nA ReferenceEvent records when an AI system references, uses, retrieves, trains on, or reasons from an Epicenter.\n\nExample fields include:\n\n- `id`\n- `model_id`\n- `epicenter_id`\n- `timestamp`\n- `context_window_hash`\n- `usage_type`\n- `request_id`\n- `trace_id`\n- `weight`\n\nSupported usage types may include:\n\n- `training`\n- `inference`\n- `rag`\n- `fine_tune`\n\n---\n\n### RoyaltyRecord\n\nA RoyaltyRecord represents a possible value-circulation record derived from reference activity.\n\nExample fields include:\n\n- `id`\n- `epicenter_id`\n- `creator_id`\n- `total_reference_weight`\n- `amount`\n- `period`\n- `settlement_status`\n- `settlement_channel`\n\nThis object does not automatically prove legal payment entitlement.\n\nIt represents a candidate accounting layer for future review and settlement.\n\n---\n\n### ModelProfile\n\nA ModelProfile describes model-side purity and source-dependency characteristics.\n\nExample fields include:\n\n- `id`\n- `provider`\n- `type`\n- `training_data_purity`\n- `collapse_risk_score`\n- `epicenter_dependency_index`\n\nThis object helps monitor whether AI systems remain connected to diverse primary sources or drift toward recursive synthetic dependency.\n\n---\n\n## Protocol Layers\n\n### TraceProtocol\n\nDefines the minimum requirements for recording reference events.\n\nRequired fields include:\n\n- `ReferenceEvent.id`\n- `ReferenceEvent.epicenter_id`\n- `ReferenceEvent.model_id`\n- `ReferenceEvent.timestamp`\n- `ReferenceEvent.usage_type`\n- `ReferenceEvent.trace_id`\n\nExpected guarantees include:\n\n- immutability\n- auditability\n- creator accessibility\n- regulator accessibility\n\n---\n\n### PurityProtocol\n\nDefines how the protocol may estimate the origin composition of an Epicenter.\n\nInput signals may include:\n\n- `ai_generated_ratio`\n- structure-fingerprint similarity to known AI patterns\n- author self-declaration\n- provenance evidence\n- revision lineage\n- citation transparency\n- structural originality\n\nOutputs may include:\n\n- `origin_purity_score`\n- `ai_generated_ratio`\n- `warning_flags`\n\nThe detailed algorithm is defined in:\n\n```text\ndocs/purity-detection-algorithm.md\n```\n\n---\n\n### CollapseMonitor\n\nDefines how ecosystem-level risk may be monitored.\n\nInputs may include:\n\n- model training-data purity\n- synthetic-data ratio\n- natural-data ratio\n- reference statistics\n- collapse-risk score\n- Epicenter production decline\n\nPossible alerts include:\n\n- `natural_data_ratio_below_threshold`\n- `epicenter_production_decline`\n- `model_collapse_risk_high`\n\n---\n\n### RoyaltyProtocol\n\nDefines an abstract structure for aggregating reference weight into possible value-circulation records.\n\nA draft formula may look like:\n\n```text\nΣ(ReferenceEvent.weight.value) × rate_table[usage_type]\n```\n\nPossible modifiers include:\n\n- `origin_purity_score`\n- `ai_generated_ratio`\n- review status\n- dispute status\n- platform policy\n- creator permission\n\nRoyaltyProtocol is intentionally abstract in v0.1.1.\n\nIt should not be treated as a final legal or financial settlement mechanism.\n\n---\n\n## Example Flow\n\n```text\n1. A creator publishes an original article.\n   ↓\n2. The article is registered as an Epicenter.\n   ↓\n3. The Epicenter receives a structure fingerprint and metadata.\n   ↓\n4. An AI system references the Epicenter through RAG or inference.\n   ↓\n5. A ReferenceEvent is logged.\n   ↓\n6. A purity assessment estimates origin composition.\n   ↓\n7. CollapseMonitor uses aggregated purity signals.\n   ↓\n8. RoyaltyProtocol may prepare candidate value-circulation records.\n   ↓\n9. Human / multi-wing review determines readiness.\n```\n\n---\n\n## Non-Goals\n\nThis repository does not attempt to:\n\n- define final copyright ownership\n- prove legal authorship\n- automatically determine payment\n- ban AI-assisted creation\n- classify all AI-generated content as low value\n- replace human review\n- provide a production-ready payment system\n- define universal originality\n- make moral judgments about creators or content\n\nThis protocol is a draft structural layer for traceability, review, and ecosystem-health monitoring.\n\n---\n\n## Relationship to Other Concepts\n\nEpicenter Preservation OS Protocol may relate to:\n\n- Trace Protocol\n- Structure Fingerprint\n- Royalty OS\n- Allocation Readiness\n- Model Collapse monitoring\n- RAG provenance\n- AI data governance\n- creator compensation systems\n- natural / synthetic data separation\n- AI reference auditing\n\n---\n\n## Version History\n\nSee:\n\n```text\nCHANGELOG.md\n```\n\nCurrent version:\n\n```text\n0.1.1\n```\n\n### v0.1.1\n\nAdds the PurityProtocol implementation layer:\n\n- purity detection algorithm document\n- purity assessment sample\n- purity assessment JSON Schema\n- GitHub Actions validation workflow\n- README validation documentation\n\n### v0.1.0\n\nInitial draft protocol.\n\nDefined the core entities and high-level protocol layers:\n\n- `Epicenter`\n- `ReferenceEvent`\n- `RoyaltyRecord`\n- `ModelProfile`\n- `TraceProtocol`\n- `RoyaltyProtocol`\n- `PurityProtocol`\n- `CollapseMonitor`\n\n---\n\n## Citation\n\nIf you use this specification, please cite it using:\n\n```text\nCITATION.cff\n```\n\n---\n\n## License\n\nThis repository is released under the license defined in:\n\n```text\nLICENSE\n```\n\n---\n\n## Summary\n\nEpicenter Preservation OS Protocol v0.1.1 defines a draft structure for preserving primary sources in AI civilization.\n\nIt connects:\n\n```text\nsource preservation\n↓\nreference tracing\n↓\npurity assessment\n↓\ncollapse monitoring\n↓\nvalue-circulation readiness\n```\n\nThe protocol does not claim to solve legal authorship or compensation automatically.\n\nInstead, it provides a machine-readable foundation for preserving the source layer that AI civilization depends on.\n\nIn short:\n\n```text\nIf AI civilization is a river,\nEpicenters are the springs.\nThis protocol is a first draft of the water-source registry.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamuraiwriter7%2Fpurity-detection-algorithm-v0.1","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsamuraiwriter7%2Fpurity-detection-algorithm-v0.1","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamuraiwriter7%2Fpurity-detection-algorithm-v0.1/lists"}