{"id":24611118,"url":"https://github.com/jonathanporta/headline-generator","last_synced_at":"2025-06-25T07:05:16.716Z","repository":{"id":150178029,"uuid":"171723551","full_name":"JonathanPorta/headline-generator","owner":"JonathanPorta","description":"Hacking on code snippets from around the web","archived":false,"fork":false,"pushed_at":"2019-02-20T18:03:50.000Z","size":32444,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-03-18T15:25:10.563Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JonathanPorta.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-20T18:02:39.000Z","updated_at":"2024-06-13T19:02:16.000Z","dependencies_parsed_at":"2023-04-09T14:17:04.805Z","dependency_job_id":null,"html_url":"https://github.com/JonathanPorta/headline-generator","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/JonathanPorta/headline-generator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JonathanPorta%2Fheadline-generator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JonathanPorta%2Fheadline-generator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JonathanPorta%2Fheadline-generator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JonathanPorta%2Fheadline-generator/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JonathanPorta","download_url":"https://codeload.github.com/JonathanPorta/headline-generator/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JonathanPorta%2Fheadline-generator/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261823738,"owners_count":23215141,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-24T19:34:49.336Z","updated_at":"2025-06-25T07:05:16.666Z","avatar_url":"https://github.com/JonathanPorta.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Automatically generate headlines to short articles\n\n\u003ca target=\"_blank\" href=\"http://twitter.com/udibr\"\u003e\u003cimg alt='Twitter followers' src=\"https://img.shields.io/twitter/follow/udibr.svg?style=social\"\u003e\u003c/a\u003e\n\n\nThis project attempts to reproduce the results in the paper:\n[Generating News Headlines with Recurrent Neural Networks](http://arxiv.org/abs/1512.01712)\n\n## How to run\n### Software\n* The code is running with [jupyter notebook](http://jupyter.org/)\n* Install [Keras](http://keras.io/)\n* `pip install python-Levenshtein`\n\n### Data\nIt is assumed that you already have training and test data.\nThe data is made from many examples (I'm using 684K examples),\neach example is made from the text\nfrom the start of the article, which I call description (or `desc`),\nand the text of the original headline (or `head`).\nThe texts should be already 
tokenized and the tokens separated by spaces.\n\nOnce the data is ready, save it in a Python pickle file as a tuple:\n`(heads, descs, keywords)` where `heads` is a list of all the head strings,\n`descs` is a list of all the article strings in the same order and length as `heads`.\nI ignore the `keywords` information so you can set it to `None`.\n\n### Build a vocabulary of words\nThe [vocabulary-embedding](./vocabulary-embedding.ipynb)\nnotebook describes how a dictionary is built for the tokens and how\nan initial embedding matrix is built from [GloVe](http://nlp.stanford.edu/projects/glove/).\n\n### Train a model\nThe [train](./train.ipynb) notebook describes how a model is trained on the data using [Keras](http://keras.io/).\n\n### Use model to generate new headlines\nThe [predict](./predict.ipynb) notebook generates headlines with the trained model and\nshows the attention weights used to pick words from the description.\nThe text generation includes a feature that was\nnot described in the original paper: it allows words that are outside\nthe training vocabulary to be copied from the description to the generated headline.\n\n## Examples of headlines generated\nGood (cherry-picked) examples of generated headlines:\n![cherry picking of generated headlines](./cherry_picking.png)\n![cherry picking of generated headlines](./cherry_picking1.png)\n\n## Examples of attention weights\n![attention weights](./attention_weights.png)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjonathanporta%2Fheadline-generator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjonathanporta%2Fheadline-generator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjonathanporta%2Fheadline-generator/lists"}