{"id":21615975,"url":"https://github.com/pharo-ai/tf-idf","last_synced_at":"2025-03-18T17:19:39.809Z","repository":{"id":88081752,"uuid":"181912812","full_name":"pharo-ai/tf-idf","owner":"pharo-ai","description":"Implementation of TF-IDF in Pharo","archived":false,"fork":false,"pushed_at":"2022-02-04T10:32:01.000Z","size":39,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-24T20:45:45.794Z","etag":null,"topics":["pharo","statistics","term-frequency","tf-idf"],"latest_commit_sha":null,"homepage":null,"language":"Smalltalk","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pharo-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-04-17T14:51:07.000Z","updated_at":"2024-03-01T13:45:55.000Z","dependencies_parsed_at":"2023-05-18T05:00:33.851Z","dependency_job_id":null,"html_url":"https://github.com/pharo-ai/tf-idf","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pharo-ai%2Ftf-idf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pharo-ai%2Ftf-idf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pharo-ai%2Ftf-idf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pharo-ai%2Ftf-idf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pharo-ai","download_url":"https://codeload.github.com/pharo-ai/tf-idf/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"http
s://github.com","kind":"github","repositories_count":244267475,"owners_count":20425835,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pharo","statistics","term-frequency","tf-idf"],"created_at":"2024-11-24T22:13:17.109Z","updated_at":"2025-03-18T17:19:39.789Z","avatar_url":"https://github.com/pharo-ai.png","language":"Smalltalk","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Term Frequency - Inverse Document Frequency (TF-IDF)\n\n[![Build status](https://github.com/pharo-ai/tf-idf/workflows/CI/badge.svg)](https://github.com/pharo-ai/tf-idf/actions/workflows/test.yml)\n[![Coverage Status](https://coveralls.io/repos/github/pharo-ai/TF-IDF/badge.svg?branch=master)](https://coveralls.io/github/pharo-ai/TF-IDF?branch=master)\n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://raw.githubusercontent.com/pharo-ai/TF-IDF/master/LICENSE)\n\nThis repository contains an implementation of the TF-IDF algorithm in Pharo.\n\nFor more information, please refer to the Pharo-AI wiki: https://github.com/pharo-ai/wiki\n\n## How to install it\n\nTo install `TF-IDF`, go to the Playground (Ctrl+OW) in your [Pharo](https://pharo.org/) image and execute the following Metacello script (select it and press the Do-it button or Ctrl+D):\n\n```Smalltalk\nMetacello new\n  baseline: 'AITfIdf';\n  repository: 'github://pharo-ai/tf-idf/src';\n  load.\n```\n\n## How to depend on it\n\nIf you want to add a dependency on `TF-IDF` to your project, include the following lines in your baseline method:\n\n```Smalltalk\nspec\n  baseline: 'AITfIdf'\n  with: [ spec repository: 
'github://pharo-ai/tf-idf/src' ].\n```\n\nIf you are new to baselines and Metacello, check out the [Baselines](https://github.com/pharo-open-documentation/pharo-wiki/blob/master/General/Baselines.md) tutorial on Pharo Wiki.\n\n## How to use it\n\nHere is a simple example of how you can train a TF-IDF model and use it to assign scores to words. Suppose you are given an array of sentences, where each sentence is represented as an array of words:\n\n```Smalltalk\nsentences := #(\n  (I am Sam)\n  (Sam I am)\n  (I 'don''t' like green eggs and ham)).\n```\n\nTrain a TF-IDF model on those sentences:\n\n```Smalltalk\ntfidf := AITermFrequencyInverseDocumentFrequency new.\ntfidf trainOn: sentences.\n```\n\nUse it to assign TF-IDF scores to words:\n\n```Smalltalk\ntfidf scoreOf: 'Sam' in: #(I am Sam). \"0.4054651081081644\"\n```\n\nYou can also encode any given text as a TF-IDF vector:\n\n```Smalltalk\ntfidf vectorFor: #(I am green green ham). \"#(0.0 0.0 0.4054651081081644 0.0 0.0 0.0 2.1972245773362196 1.0986122886681098 0.0)\"\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpharo-ai%2Ftf-idf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpharo-ai%2Ftf-idf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpharo-ai%2Ftf-idf/lists"}