{"id":13561467,"url":"https://github.com/facebookresearch/personal-timeline","last_synced_at":"2025-04-05T11:13:42.250Z","repository":{"id":175683644,"uuid":"653337426","full_name":"facebookresearch/personal-timeline","owner":"facebookresearch","description":"A public release of TimelineBuilder for building personal digital data timelines.","archived":false,"fork":false,"pushed_at":"2024-09-03T16:37:14.000Z","size":48378,"stargazers_count":351,"open_issues_count":0,"forks_count":27,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-03-29T10:11:25.062Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/facebookresearch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-13T21:36:53.000Z","updated_at":"2025-03-26T20:25:21.000Z","dependencies_parsed_at":"2023-10-05T03:45:20.754Z","dependency_job_id":"3a57c84c-c3a2-45c4-a0dc-bf45297595a5","html_url":"https://github.com/facebookresearch/personal-timeline","commit_stats":null,"previous_names":["facebookresearch/personal-timeline"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fpersonal-timeline","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fpersonal-timeline/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fpersonal-timeline/releases","manifest
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fpersonal-timeline/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/facebookresearch","download_url":"https://codeload.github.com/facebookresearch/personal-timeline/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247325696,"owners_count":20920714,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T13:00:56.993Z","updated_at":"2025-04-05T11:13:42.229Z","avatar_url":"https://github.com/facebookresearch.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook","Open Source Projects"],"sub_categories":["Wealth"],"readme":"\u003c!-- This file explains how to create LifeLog entries from several data sources. 
--\u003e\n\n# TimelineBuilder\n\n## Table of Contents\n\n- [Setup](#general-setup): how to set up this repo\n- [Importers](#digital-data-importers): how to create LifeLog entries from several data sources.\n  - [Downloading Digital Data](#downloading-your-personal-data)\n  - [Running the importers](#running-the-code)\n- [Sample Dataset](DATASET.md): a sampled set of anonymized data for testing\n- [Data Visualization](#visualization-of-the-personal-timeline): a ReactJS-based visualization frontend of the personal timeline\n- [Question Answering](#question-answer-over-the-personal-timeline): an LLM-based QA engine over the personal timeline\n- [TimelineQA](#timelineqa-a-benchmark-for-question-answer-over-the-personal-timeline): a synthetic benchmark for evaluating personal timeline QA systems\n\n## General Setup\n\n## Step 0: Create environment\n\n1. Install Docker Desktop from [this link](https://docs.docker.com/desktop/).\n\n2. Follow the install steps and use the Desktop app to start the Docker engine.\n\n3. Install `git-lfs` and clone the repo. You may need a conda env to do that:\n```\nconda create -n personal-timeline python=3.10\nconda activate personal-timeline\n\nconda install -c conda-forge git-lfs\ngit lfs install\n\ngit clone https://github.com/facebookresearch/personal-timeline\ncd personal-timeline\n```\n\n4. Run the init script (requires Python):\n```\nsh src/init.sh\n```\nThis creates the files, folders, and symlinks needed to run the app.\nIt also creates a new directory under your home folder, `~/personal-data`, where your personal data will reside.\n\n## Step 1: Setting up\n\n\n## For Data Ingestion\n\nIngestion is controlled by parameters in the `conf/ingest.conf` file. The default values\nare set for optimized processing and don't need to be changed. \nYou can adjust these parameters to run the importer with a different configuration.\n\n\n\n## For Data visualization\n\n1. 
To set up a Google Maps API key (free), follow these [instructions](https://developers.google.com/maps/documentation/embed/quickstart#create-project).\n\nCopy the following line to `env/frontend.env.list`:\n```\nGOOGLE_MAP_API=\u003cthe API key goes here\u003e\n```\n\n2. To embed Spotify, you need to set up Spotify API access (free) by following [these instructions](https://developer.spotify.com/dashboard/applications). You need to log in with a Spotify account, create a project, and reveal the `secret`.\n\nCopy the following lines to `env/frontend.env.list`:\n```\nSPOTIFY_TOKEN=\u003cthe token goes here\u003e\nSPOTIFY_SECRET=\u003cthe secret goes here\u003e\n```\n\n## For Question-Answering\n\nSet up an OpenAI API key following these [instructions](https://openai.com/api/).\n\nCopy the following line to `env/frontend.env.list`:\n```\nOPENAI_API_KEY=\u003cthe API key goes here\u003e\n```\n\n## Digital Data Importers\n\n\n## Downloading your personal data\n\nWe currently support 9 data sources. Here is a summary table:\n\n| Digital Services | Instructions                                                                        | Destinations                                                             | Use cases                                              |\n|------------------|-------------------------------------------------------------------------------------|--------------------------------------------------------------------------|--------------------------------------------------------|\n| Apple Health     | [Link](https://github.com/facebookresearch/personal-timeline#apple-health)  | personal-data/apple-health                                               | Exercise patterns, calorie counts                      |\n| Amazon           | [Link](https://github.com/facebookresearch/personal-timeline#amazon)        | personal-data/amazon                                                     | Product recommendation, purchase history summarization |\n| Amazon Kindle    | 
[Link](https://github.com/facebookresearch/personal-timeline#amazon)        | personal-data/amazon-kindle                                              | Book recommendation                                    |\n| Spotify          | [Link](https://github.com/facebookresearch/personal-timeline#spotify)       | personal-data/spotify                                                    | Music / streaming recommendation                       |\n| Venmo            | [Link](https://github.com/facebookresearch/personal-timeline#venmo)         | personal-data/venmo                                                      | Monthly spend summarization                            |\n| Libby            | [Link](https://github.com/facebookresearch/personal-timeline#libby)         | personal-data/libby                                                      | Book recommendation                                    |\n| Google Photos    | [Link](https://github.com/facebookresearch/personal-timeline#google-photos-and-google-timeline) | personal-data/google_photos                                              | Food recommendation, object detection, and more        |\n| Google Location  | [Link](https://github.com/facebookresearch/personal-timeline#google-photos-and-google-timeline) | personal-data/google-timeline/Location History/Semantic Location History | Location tracking / visualization                      |\n| Facebook posts   | [Link](https://github.com/facebookresearch/personal-timeline#facebook-data) | personal-data/facebook                                                   | Question-Answering over FB posts / photos              |\n\nIf you have a different data source not listed above, follow the instructions [here](NEW_DATASOURCE.md)\nto add this data source to the importer.\n\n### GOOGLE PHOTOS and GOOGLE TIMELINE\n\u003c!--1. You need to download your Google photos from [Google Takeout](https://takeout.google.com/).  \nThe download from Google Takeout would be in multiple zip files. Unzip all the files.\n\n2. 
It may be the case that some of your photo files are .HEIC. In that case follow the steps below to convert them to .jpeg  \nThe easiest way to do this on a Mac is:\n\n     -- Select the .HEIC files you want to convert.   \n     -- Right click and choose \"quick actions\" and then you'll have an option to convert the image.  \n     -- If you're converting many photos, this may take a few minutes. \n\n2. Move all the unzipped folders inside `~/personal-data/google_photos/`. There can be any number of sub-folders under `google_photos`.--\u003e\n\n1. You can download your Google Photos and location data (as well as Gmail, Maps, and Google Calendar data) from [Google Takeout](https://takeout.google.com/).\n2. The download from Google Takeout may arrive as multiple zip files. Unzip all the files.\n3. For Google Photos, move all the unzipped folders inside `~/personal-data/google_photos/`. There can be any number of sub-folders under `google_photos`.\n4. For Google locations, move the unzipped files to `personal-data/google-timeline/Location History/Semantic Location History`.\n\n### FACEBOOK DATA\n1. Go to [Facebook Settings](https://www.facebook.com/settings?tab=your_facebook_information) \n2. Click on \u003cb\u003eDownload your information\u003c/b\u003e and download your FB data in JSON format.\n3. Unzip the downloaded file and copy the `posts` sub-folder to `~/personal-data/facebook`. The `posts` folder should sit directly under the `facebook` folder.\n\n### APPLE HEALTH\n1. Go to the Apple Health app on your phone and ask to export your data. This will create a file called iwatch.xml, which is the input file to the importer.\n2. Move the exported file to `~/personal-data/apple-health`.\n\n### AMAZON\n1. Request your data from Amazon here: https://www.amazon.com/gp/help/customer/display.html?nodeId=GXPU3YPMBZQRWZK2\nThey say it can take up to 30 days, but it took about 2 days. 
They'll email you when it's ready.\n\nThey separate Amazon purchases from Kindle purchases into two different directories.\n\nThe file you need for Amazon purchases is `Retail.OrderHistory.1.csv`.\nThe file you need for Kindle purchases is `Digital Items.csv`.\n\n2. Move the Amazon purchase data to the `~/personal-data/amazon` folder and the Kindle purchase data to the `~/personal-data/amazon-kindle` folder.\n\n### VENMO\n1. Download your data from Venmo here -- https://help.venmo.com/hc/en-us/articles/360016096974-Transaction-History\n\n2. Move the data into the `~/personal-data/venmo` folder.\n\n### LIBBY\n1. Download your data from Libby here -- https://libbyapp.com/timeline/activities. Click on `Actions`, then `Export Timeline`.\n\n2. Move the data into the `~/personal-data/libby` folder.\n\n\n### SPOTIFY\n\n1. Download your data from Spotify here -- https://support.spotify.com/us/article/data-rights-and-privacy-settings/\nThey say it can take up to 30 days, but it took about 2 days. They'll email you when it's ready.\n\n2. 
Move the data into the `~/personal-data/spotify` folder.\n\n# Running the code\nNow that we have all the data and settings in place, we can either run individual steps or the end-to-end system.\nThis will import your photo data into SQLite (this is what will go into the episodic database), build summaries,\nand make data available for visualization and search.\n\n\nRunning the ingestion container adds two types of files to the `~/personal-data/app_data` folder:\n - an SQLite DB named `raw_data.db`, into which your data is imported\n - CSV exports of your personal data, such as `books.csv`, `exercise.csv`, etc.\n\n### Option 1:\nTo run the pipeline end-to-end (with frontend and QA backend), simply run \n```\ndocker-compose up -d --build\n```\n\n### Option 2:\nYou can also run ingestion, visualization, and the QA engine separately.\nTo start data ingestion, use  \n```\ndocker-compose up -d backend --build\n```\n\n## Check progress\nOnce the docker command has been run, you can see the running backend and frontend containers in the Docker Desktop UI.\nCopy the container ID for ingest and view its logs by running the following command:  \n```\ndocker logs -f \u003ccontainer_id\u003e\n```\n\n\u003c!-- # Step 5: Visualization and Question Answering --\u003e\n\n## Visualization of the personal timeline\n\nTo start the visualization frontend:\n```\ndocker-compose up -d frontend --build\n```\n\nRunning the frontend will start a ReactJS UI at `http://localhost:3000`. 
See [here](src/frontend/) for more details.\n\nWe provide an anonymized digital data [dataset](sample_data/) for testing the UI and QA system; see [here](DATASET.md) for more details.\n\n![Timeline Visualization](ui.png)\n\n\n## Question Answer over the personal timeline\n\nThe QA engine is based on PostText, a QA system for answering queries that require computing aggregates over personal data.\n\nPostText reference: [https://arxiv.org/abs/2306.01061](https://arxiv.org/abs/2306.01061)\n```\n@article{tan2023posttext,\n      title={Reimagining Retrieval Augmented Language Models for Answering Queries},\n      author={Wang-Chiew Tan and Yuliang Li and Pedro Rodriguez and Richard James and Xi Victoria Lin and Alon Halevy and Scott Yih},\n      journal={arXiv preprint arXiv:2306.01061},\n      year={2023},\n}\n```\n\nTo start the QA engine, run:\n```\ndocker-compose up -d qa --build\n```\nThe QA engine will be running on a Flask server inside a Docker container at `http://localhost:8085`. \n\nSee [here](src/qa) for more details.\n\n![QA Engine](qa.png)\n\nThere are 3 options for the QA engine.\n* *ChatGPT*: uses OpenAI's gpt-3.5-turbo [API](https://platform.openai.com/docs/models/overview) without the personal timeline as context. It answers world knowledge questions such as `what is the GDP of US in 2021` but not personal questions.\n* *Retrieval-based*: answers questions by retrieving the top-k most relevant episodes from the personal timeline as the LLM's context. It can answer questions over the personal timeline such as `show me some plants in my neighborhood`.\n* *View-based*: translates the input question to a (customized) SQL query over tabular views (e.g., books, exercise, etc.) of the personal timeline. 
This QA engine is good at answering aggregate queries (`how many books did I purchase?`) and min/max queries (`when was the last time I traveled to Japan`).\n\n\nExample questions you may try:\n* `Show me some photos of plants in my neighborhood`\n* `Which cities did I visit when I traveled to Japan?`\n* `How many books did I purchase in April?`\n\n## TimelineQA: a benchmark for Question Answer over the personal timeline\n\nTimelineQA is a synthetic benchmark for accelerating progress on querying personal timelines. \nTimelineQA generates lifelogs of imaginary people. The episodes in the lifelog range from major life episodes such as high\nschool graduation to those that occur on a daily basis such as going for a run. We have evaluated SOTA models for atomic and multi-hop QA on the benchmark. \n\nPlease check out the TimelineQA GitHub [repo](https://github.com/facebookresearch/TimelineQA) and the TimelineQA paper, [https://arxiv.org/abs/2306.01069](https://arxiv.org/abs/2306.01069):\n```\n@article{tan2023timelineqa,\n  title={TimelineQA: A Benchmark for Question Answering over Timelines},\n  author={Tan, Wang-Chiew and Dwivedi-Yu, Jane and Li, Yuliang and Mathias, Lambert and Saeidi, Marzieh and Yan, Jing Nathan and Halevy, Alon Y},\n  journal={arXiv preprint arXiv:2306.01069},\n  year={2023}\n}\n```\n\n## License\n\nThe codebase is licensed under the [Apache 2.0 license](LICENSE).\n\n## Contributing\n\nSee [contributing](CONTRIBUTING.md) and the [code of conduct](CODE_OF_CONDUCT.md).\n\n## Contributor Attribution\n\nWe'd like to thank the following contributors for their contributions to this project:\n- [Tripti Singh](https://github.com/tripti-singh)\n  - Design and implementation of the SQLite DB backend\n  - Designing a pluggable data import and enrichment layer and building the pipeline orchestrator.\n  - Importers for all six [data sources](https://github.com/facebookresearch/personal-timeline#digital-data-importers)\n  - Generic csv and json data sources 
importer with [instructions](https://github.com/facebookresearch/personal-timeline/blob/main/NEW_DATASOURCE.md)\n  - Dockerization\n  - Contributions to the documentation\n- [Wang-Chiew Tan](https://github.com/wangchiew)\n  - Implementation of the [PostText](https://arxiv.org/abs/2306.01061) query engine\n- [Pierre Moulon](https://github.com/SeaOtocinclus) for providing open-sourcing guidelines and suggestions\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Fpersonal-timeline","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffacebookresearch%2Fpersonal-timeline","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Fpersonal-timeline/lists"}