{"id":50035063,"url":"https://github.com/ozefe/yoktez","last_synced_at":"2026-05-21T00:01:35.120Z","repository":{"id":357723526,"uuid":"1191835461","full_name":"ozefe/yoktez","owner":"ozefe","description":"Typed Python client for searching, fetching metadata, and downloading theses from the National Thesis Center of Turkey (YÖK Ulusal Tez Merkezi)","archived":false,"fork":false,"pushed_at":"2026-05-14T03:09:41.000Z","size":611,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-14T03:25:38.095Z","etag":null,"topics":["academic-project","api-client","api-wrapper","httpx-client","thesis","ulusal-tez-merkezi","web-scraping"],"latest_commit_sha":null,"homepage":"https://github.com/ozefe/yoktez","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ozefe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":".github/SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-25T16:23:07.000Z","updated_at":"2026-05-14T03:09:17.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ozefe/yoktez","commit_stats":null,"previous_names":["ozefe/yoktez"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/ozefe/yoktez","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ozefe%2Fyoktez","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ozefe%2Fyoktez/tags
","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ozefe%2Fyoktez/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ozefe%2Fyoktez/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ozefe","download_url":"https://codeload.github.com/ozefe/yoktez/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ozefe%2Fyoktez/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33281294,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-20T15:12:43.734Z","status":"ssl_error","status_checked_at":"2026-05-20T15:12:42.300Z","response_time":356,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["academic-project","api-client","api-wrapper","httpx-client","thesis","ulusal-tez-merkezi","web-scraping"],"created_at":"2026-05-21T00:00:58.038Z","updated_at":"2026-05-21T00:01:35.104Z","avatar_url":"https://github.com/ozefe.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# yoktez\n\n\u003cimg alt=\"yoktez mascot generated by Google's Nano Banana 2\" align=\"right\" src=\".github/mascot.png\" width=\"200\" /\u003e\n\nTyped Python client for the [National Thesis Center of Turkey](https://tez.yok.gov.tr/UlusalTezMerkezi/).\n\n`yoktez` wraps the YOK 
NTC JSP/AJAX surface behind a single synchronous `Client` with frozen-dataclass return types, a deterministic exception hierarchy, and bilingual-aware fields. Built for application and CLI developers who need a typed surface and a small install footprint without writing bespoke scraping code for each project.\n\n## Installation\n\n```bash\npip install yoktez\n```\n\nRequires Python 3.14+.\n\n## Quickstart\n\n```python\n\"\"\"End-to-end yoktez quickstart: search -\u003e metadata -\u003e assets.\n\nDemonstrates the typical three-call flow without writing files to disk.\n\nRun with: `python examples/quickstart.py`\n\"\"\"\n\nfrom yoktez import AssetStatus, Client\n\n_QUERY = \"yapay zeka\"\n\n\nwith Client() as client:\n    results = client.search.simple(_QUERY)\n    print(f\"{results.total} matches for {_QUERY!r}\")\n\n    thesis = results[0]\n    print(f\"  title:   {thesis.title}\")\n    print(f\"  author:  {thesis.author}\")\n    print(f\"  year:    {thesis.year}\")\n    print(f\"  keys:    {thesis.registration_no} / {thesis.thesis_no}\")\n\n    metadata = client.metadata.get(thesis)\n    print(f\"  advisor: {metadata.supervisor}\")\n    if metadata.affiliation is not None:\n        print(f\"  uni:     {metadata.affiliation.university}\")\n    if metadata.keywords is not None:\n        print(f\"  tags:    {len(metadata.keywords)} keywords\")\n\n    assets = client.assets.get(thesis)\n    print(f\"  status:  {assets.status.name}\")\n    if assets.status is AssetStatus.AVAILABLE:\n        print(f\"  pdf_key: {assets.pdf_key}\")\n```\n\nSample output:\n\n```text\n6841 matches for 'yapay zeka'\n  title:   Kimya eğitiminde yapay zekâ araştırmalarına ilişkin bir meta-sentez çalışması\n  author:  MURAT EBUBEKİR YAYLA\n  year:    2026\n  keys:    nslbSyAODG1_FIruL8qUAA / THvIvDpZXvJIiHZpuqpKVw\n  advisor: PROF. DR. 
MUSA ÜCE\n  uni:     MARMARA ÜNİVERSİTESİ\n  tags:    5 keywords\n  status:  AVAILABLE\n  pdf_key: 5T1_CZ5-UGb9QCmoURec4AbpuuyvqUeed_1PcCh_6DVZ4b1fbX7Gcu-DQFLIcE11\n```\n\n## Features\n\n- **Four search modes:** `simple`, `advanced`, `detail`, and `recent` from a single `client.search` namespace, all returning a sliceable `SearchResults` carrying the database-wide match total alongside the result window.\n- **Structured metadata:** `client.metadata.get(thesis)` returns a typed `ThesisMetadata` with bilingual keywords (`Bilingual(raw, tr, en)`), a tiered `Affiliation`, and pre-formatted citation strings (APA / IEEE / MLA / Chicago / Harvard).\n- **Two-step asset download:** `client.assets.get(thesis)` resolves to one of `AVAILABLE` / `UNDER_EMBARGO` / `NO_PERMIT` / `PREPARING` before any bytes move; the available branch exposes a `pdf_key` (and optional `appendix_key`) to feed `download_pdf` / `download_appendix`.\n- **Catalog lookups:** `client.lookups` covers universities (TR / INT), institutes, divisions, subjects, departments, sections, and keywords, with per-instance memoization and an explicit `refresh()`.\n- **Typed value objects:** every returned record is a `@dataclass(frozen=True, slots=True)`; values are immutable, hashable where field types allow, and ship with `py.typed` for downstream type checkers.\n- **Sync-only, thread-friendly:** no `async`/`await` surface; the recommended concurrency pattern is one `Client` per thread.\n- **Small dependency surface:** `httpx`, `beautifulsoup4`, and `lxml`. 
No Rust core, no auth, no hidden state.\n\n## Usage\n\nAll snippets assume `with Client() as client:` for deterministic cleanup of the underlying HTTP connection pool.\n\n### Search\n\nSimple search by free text, optionally narrowed to a single field:\n\n```python\nfrom yoktez import Client, SearchField\n\nwith Client() as client:\n    results = client.search.simple(\"yapay zeka\", field=SearchField.ABSTRACT)\n\n    print(f\"{results.total} matches\")\n    for thesis in results[:5]:\n        print(thesis.year, thesis.title)\n```\n\nAdvanced search joins up to three terms with boolean operators:\n\n```python\nfrom yoktez import AdvancedOperator, Client, MatchType\n\nwith Client() as client:\n    results = client.search.advanced(\n        \"sosyal\",\n        term2=\"medya\",\n        op1=AdvancedOperator.AND,\n        match=MatchType.INCLUDES,\n    )\n```\n\nDetail search accepts the full filter surface; enum-shaped parameters also accept the member name as a string or the raw int code:\n\n```python\nfrom yoktez import Client, ThesisType\n\nwith Client() as client:\n    unis = client.lookups.universities()\n    results = client.search.detail(\n        university=unis[0],\n        year_min=2020,\n        year_max=2025,\n        degree_type=ThesisType.MASTER,  # also accepts \"MASTER\" or 1\n    )\n```\n\nRecently added theses (server-fixed 15-day window):\n\n```python\nfrom yoktez import Client\n\nwith Client() as client:\n    results = client.search.recent()\n```\n\n### Metadata\n\n```python\nfrom yoktez import Client\n\nwith Client() as client:\n    thesis = client.search.simple(\"makine öğrenmesi\")[0]\n    metadata = client.metadata.get(thesis)\n\n    if metadata.affiliation is not None:\n        print(metadata.affiliation.university)\n    if metadata.keywords:\n        print(metadata.keywords[0].tr, \"=\", metadata.keywords[0].en)\n    if metadata.references is not None:\n        print(metadata.references.apa)\n```\n\n### Assets (two-step 
download)\n\n```python\nfrom yoktez import AssetStatus, Client\n\nwith Client() as client:\n    thesis = client.search.simple(\"yapay zeka\")[0]\n    assets = client.assets.get(thesis)\n\n    if assets.status is AssetStatus.AVAILABLE and assets.pdf_key is not None:\n        client.assets.download_pdf(assets.pdf_key, \"thesis.pdf\")\n\n        if assets.appendix_key is not None:\n            client.assets.download_appendix(assets.appendix_key, \"thesis-ek.rar\")\n```\n\n`download_pdf` and `download_appendix` accept a filesystem path (`Path` or `str`, opened and closed for you) or a pre-opened binary file-like object (written to but not closed; ownership stays with the caller).\n\n### Lookups\n\n```python\nfrom yoktez import Client, UniversitySource\n\nwith Client() as client:\n    unis = client.lookups.universities(UniversitySource.TR)\n    institutes = client.lookups.institutes(unis[0])\n    divisions = client.lookups.divisions(unis[0], institutes[0])\n\n    # Bulk catalogs; keywords() also accepts group / language / first_letter / search.\n    keywords = client.lookups.all_keywords()\n```\n\nEvery `client.lookups.*` call is memoized on the `Client` instance. Call `client.lookups.refresh()` to clear the cache if YOKSIS IDs are suspected to have rotated.\n\n### HTTP client configuration\n\n`Client` accepts keyword-only overrides for the underlying `httpx.Client`:\n\n```python\nfrom yoktez import Client\n\nwith Client(timeout=60, retries=5, user_agent=\"my-app/1.0\") as client:\n    ...\n```\n\nFor full control, inject a pre-built `httpx.Client` via `http_client=`. Ownership stays with the caller; `Client.close()` is a no-op for an injected client:\n\n```python\nimport httpx\nfrom yoktez import Client\n\nhttp = httpx.Client(timeout=30.0, follow_redirects=True)\ntry:\n    with Client(http_client=http) as client:\n        ...\nfinally:\n    http.close()\n```\n\n## Concurrency\n\n`yoktez.Client` is single-threaded by design: create one per thread and never share an instance across threads. 
The library ships no concurrency primitives; threading strategy is the caller's choice.\n\n## Design principles\n\n- **Synchronous-only API:** Sync is sufficient for YOK NTC's IO patterns; an async surface would double the API and complicate testing for no proven benefit. Concurrency strategy belongs to the caller, and `examples/multithreaded_pool.py` demonstrates the one-`Client`-per-thread pattern.\n- **Frozen-dataclass value objects:** Every returned record is `@dataclass(frozen=True, slots=True)`. Stdlib-only, immutable, fast, and hashable where field types allow.\n- **Coerce-on-input enum handling:** Enum-shaped parameters accept the matching `Enum` member, its name (e.g., `\"MASTER\"`), or its raw int code; the raw-`int` passthrough tolerates new YOK NTC codes the library hasn't yet enumerated, so wire-side additions don't gate a release.\n- **Two-step download flow:** `client.assets.get(...)` resolves status first; `download_pdf` and `download_appendix` run only on the available branch. This mirrors the underlying YOK NTC flow and lets callers inspect embargo dates and appendix availability before committing to a second request.\n- **Hierarchical logger naming:** Every sub-package logs under `yoktez.\u003cconcern\u003e` (`yoktez.http`, `yoktez.search`, `yoktez.lookups`, `yoktez.assets`). Operators can silence the high-volume HTTP DEBUG channel while preserving the rarer parser WARNING channels; a single `logging.getLogger(\"yoktez\").setLevel(...)` still catches every child through parent propagation.\n\n## Limitations\n\n`yoktez` is intentionally narrow. 
The following are out of scope and will not land in this package:\n\n- **No async API:** Synchronous code throughout; no `async def`, no asyncio surface.\n- **No multi-threaded helper functions:** Concurrency strategy is the caller's choice.\n- **No authentication or login flows (e-Devlet):** Anonymous public-data access only; features requiring login (favorites, history) are excluded.\n- **No bypassing access restrictions:** Embargoed and no-permit theses surface their state via `AssetStatus` and the matching exception types; the library does not attempt to circumvent these.\n- **No data hosting or mirroring:** The library fetches on demand; no bundled snapshots of the YOK NTC database.\n- **No CLI shipped from this package:** A separate package may add one later — out of scope here.\n\n## License\n\nMIT — see [`LICENSE`](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fozefe%2Fyoktez","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fozefe%2Fyoktez","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fozefe%2Fyoktez/lists"}