{"id":15347393,"url":"https://github.com/vtlim/patfam","last_synced_at":"2026-01-07T04:43:19.280Z","repository":{"id":54488673,"uuid":"320127982","full_name":"vtlim/patfam","owner":"vtlim","description":"Web app to determine whether patent applications from different jurisdictions (USPTO, EPO, WIPO, etc.) are of the same family.","archived":false,"fork":false,"pushed_at":"2021-04-16T00:48:54.000Z","size":672,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-01T21:44:44.214Z","etag":null,"topics":["css","d3js","epo","html","javascript","materializecss","patent-applications","patent-offices","python","selenium","uspto","web-scraping","wipo"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vtlim.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-12-10T01:43:34.000Z","updated_at":"2023-05-15T19:20:50.000Z","dependencies_parsed_at":"2022-08-13T17:31:04.705Z","dependency_job_id":null,"html_url":"https://github.com/vtlim/patfam","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vtlim%2Fpatfam","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vtlim%2Fpatfam/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vtlim%2Fpatfam/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vtlim%2Fpatfam/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vtlim","dow
nload_url":"https://codeload.github.com/vtlim/patfam/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245905447,"owners_count":20691782,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["css","d3js","epo","html","javascript","materializecss","patent-applications","patent-offices","python","selenium","uspto","web-scraping","wipo"],"created_at":"2024-10-01T11:33:07.944Z","updated_at":"2026-01-07T04:43:19.242Z","avatar_url":"https://github.com/vtlim.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PatFam: Patent Families\n\n_README last edited_: 16 Dec 2020\n\nThe purpose of this tool is to allow users to easily determine whether patent applications are in the same family. Instead of having to check each individual patent office's website, users can enter patent, application, or publication numbers for various patent offices and see the relationship between the documents. 
\n\nCurrently supports: TBD\n\n![screenshot](https://github.com/vtlim/patfam/blob/foundations/archive/screenshot.png)\n\n\n## Setup\n\n### Summary of conda installs\nSee below for important notes including system dependencies, version compatibility, access keys, etc.\n```\nconda create -n patfam python=3.6\nconda activate patfam\n\nconda install pip\npip install uspto-opendata-python\n\nconda install -c anaconda lxml networkx pygraphviz graphviz flask gunicorn\nconda install -c conda-forge selenium matplotlib\n```\n\n### [USPTO] [USPTO Open Data API Client](https://docs.ip-tools.org/uspto-opendata-python/index.html)\nFor data from the United States Patent and Trademark Office (USPTO), we'll use the API client for the USPTO Patent Examination Data System (PEDS). Do not mix up PEDS with PBD, which is the USPTO PAIR Bulk Data (PBD) system. The PBD has been decommissioned.\n\n```\n# install libxml2 dependencies (make sure you have -dev versions)\nsudo apt-get update\nsudo apt-get install libxml2-dev libxslt1-dev python-lxml\n\n# create conda environment and install some packages (see note below)\nconda create -n patfam python=3.6\nconda activate patfam\nconda install -c anaconda lxml\n\n# install pip (if you don't already have it) and the api client\nconda install pip\npip install uspto-opendata-python\n\n# run a test query\nuspto-peds get \"15431686\" --type=application --format=xml\n```\nNOTE: As of Dec. 2020, `uspto-opendata-python` will not install if using Python 3.9.0 (returning gcc compilation error of [this type](https://github.com/pandas-dev/pandas/issues/32114)).\n\n#### See my tutorial walkthrough of the USPTO Open Data API Client [here](uspto/explore_uspto_data.ipynb).\n\n### [EPO] [Python EPO OPS Client](https://github.com/gsong/python-epo-ops-client)\nFor data from the European Patent Office (EPO), we'll use the Open Patent Services (OPS) client developed by George Song et al. 
To get API access, you will need to [request access credentials](https://developers.epo.org/) from OPS. After I submitted my request, I was granted access from EPO the next day.\n```\nconda activate patfam\npip install python-epo-ops-client\n```\n#### See my tutorial walkthrough of the Python EPO OPS Client [here](epo/explore_epo_data.ipynb).\n\n### Web Scraping\n\nAs far as I can tell, there is no freely available API access to the World Intellectual Property Organization (WIPO; for PCT applications) or to the Japan Patent Office (JPO). For those applications, we'll request data from the web directly.\n\nBefore this, I tried the Python `requests` package to obtain site data. I made a GET search query to WIPO PatentScope for `docId=WO2001029057`. However, no patent-related data could be found in the HTML content. It looks like the site is rendered dynamically using JavaScript, so we'll use Selenium instead:\n\n```\nconda activate patfam\nconda install -c conda-forge selenium\n```\n\nSince I'm working in the Windows Subsystem for Linux (WSL2), I needed to download an Internet browser and related driver (following the [tutorial](https://www.gregbrisebois.com/posts/chromedriver-in-wsl2/) from Greg Brisebois). 
If you already have an Internet browser in the same OS that you're working in, you would just need to obtain the relevant driver for your browser, browser version, and OS.\n\n\n```\n# install dependencies\nsudo apt-get update\nsudo apt-get install -y curl unzip xvfb libxi6 libgconf-2-4\n\n# get chrome browser\nwget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb\nsudo apt install ./google-chrome-stable_current_amd64.deb\n\n# optional - prevent updates of google chrome (if it updates, you also have to update the chromedriver)\nsudo apt-mark hold google-chrome-stable\n\n# check browser install\ngoogle-chrome --version\n\n# get chrome driver; USE THE DRIVER VERSION RELEVANT TO YOUR SETUP\nwget https://chromedriver.storage.googleapis.com/87.0.4280.88/chromedriver_linux64.zip\nunzip chromedriver_linux64.zip\nsudo mv chromedriver /usr/bin/chromedriver\nsudo chown root:root /usr/bin/chromedriver\nsudo chmod +x /usr/bin/chromedriver\n\n# check driver install\nchromedriver --version\n```\n\nOptionally, if you want to work with the Chrome window (as opposed to running it in \"headless\" mode), try the command `google-chrome` to check that the window will come up. I had to resolve a few issues with WSL2 on my end before it worked. More details are [here](https://github.com/vtlim/patfam/blob/main/wsl2_xserver.md).\n\n#### See an example of using Selenium to obtain WIPO PatentScope data [here](wipo/explore_wipo_data.ipynb).\n#### See an example of using Selenium to obtain JPO patent data [here](jpo/explore_jpo_data.ipynb).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvtlim%2Fpatfam","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvtlim%2Fpatfam","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvtlim%2Fpatfam/lists"}