{"id":18081995,"url":"https://github.com/vindarel/bookshops","last_synced_at":"2025-08-24T11:42:09.173Z","repository":{"id":45443627,"uuid":"58755266","full_name":"vindarel/bookshops","owner":"vindarel","description":"Search for books, CDs and DVDs on real bookshops' websites. Mirror of https://gitlab.com/vindarel/bookshops/ (sometimes out of sync)","archived":false,"fork":false,"pushed_at":"2022-07-06T19:15:33.000Z","size":359,"stargazers_count":3,"open_issues_count":2,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-12T04:17:40.568Z","etag":null,"topics":["books","bookshop","dvd","isbn","library"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vindarel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null},"funding":{"ko_fi":"vindarel","liberapay":"vindarel","patreon":"vindarel"}},"created_at":"2016-05-13T16:19:43.000Z","updated_at":"2020-09-22T18:07:33.000Z","dependencies_parsed_at":"2022-07-15T15:17:19.537Z","dependency_job_id":null,"html_url":"https://github.com/vindarel/bookshops","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vindarel%2Fbookshops","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vindarel%2Fbookshops/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vindarel%2Fbookshops/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vindarel%2Fbookshops/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/owners/vindarel","download_url":"https://codeload.github.com/vindarel/bookshops/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247411241,"owners_count":20934650,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["books","bookshop","dvd","isbn","library"],"created_at":"2024-10-31T13:17:44.198Z","updated_at":"2025-04-05T22:29:19.418Z","avatar_url":"https://github.com/vindarel.png","language":"Python","funding_links":["https://ko-fi.com/vindarel","https://liberapay.com/vindarel","https://patreon.com/vindarel"],"categories":[],"sub_categories":[],"readme":"# Bibliographic search of BOOKS, CDs and DVDs\n\nThis library was initially meant to search for bibliographic data of books,\nand it was later extended to **DVDs** and **CDs**. We can search with\n**keywords**, with the **isbn** (so that we can use barcode scanners),\nwith some advanced search, and we have pagination.\n\nWe get the data from existing websites. We scrape:\n\n- for French books:\n  - [Dilicom](https://dilicom-prod.centprod.com/)'s professional provider (**new in v0.4, jan. 2020**),\n  - http://www.librairie-de-paris.fr See [its doc](doc/frenchscraper.md) ![](http://gitlab.com/vindarel/bookshops/badges/master/build.svg?job=french_scraper)\n- for Switzerland: [lelivre.ch](https://www.lelivre.ch) (**new in 0.5, feb. 
2020**)\n- for Spain: http://www.casadellibro.com ![](http://gitlab.com/vindarel/bookshops/badges/master/build.svg?job=spanish_scraper)\n- for Germany: http://www.buchlentner.de ![](http://gitlab.com/vindarel/bookshops/badges/master/build.svg?job=german_scraper)\n- for DVDs: https://www.momox-shop.fr\n- and for CDs: https://www.discogs.com (may need more testing)\n\nWe retrieve: the title and authors, the price, the ISBN, the publisher(s), the cover, etc.\n\nThis library forms the heart of\n[Abelujo](https://gitlab.com/vindarel/abelujo/), free software for\nbookshops.\n\n![](cli-search.png)\n\n# Install\n\nInstall from PyPI:\n\n    pip install bookshops\n\n# Use\n\n## Command line\n\nYou can try this lib on the command line with the following commands:\n- `livres`: French books\n- `lelivre`: Swiss books\n- `dilicom`: Dilicom search (only searches ISBNs, no free search)\n- `libros`: Spanish books\n- `bucher`: German books\n- `discogs`: CDs\n- `movies`: DVDs\n- come and ask for more :)\n\nFor example:\n\n    livres antigone\n\nor\n\n    livres 9782918059363\n\nand you get the above screenshot.\n\n**Options**: (these may vary between scrapers; check them with `-h`)\n- `-i` or `--isbn` to make sure we get all the ISBNs. By default, the command line\n  tool doesn't fetch them when that requires\n  another HTTP request for each book. That depends on the website.\n\n## Settings\n\nThe Dilicom interface requires you to set two environment variables:\n\n    export DILICOM_USER=\"300xxx\"\n    export DILICOM_PASSWORD=\"xyz\"\n\nIn addition, you can set a third one, which allows you to view a\nbook's product page on Dilicom's website (within your account). You\nfind it in the URL of your account. 
For example, when I am visiting this\nbook page:\n\n    https://dilicom-prod.centprod.com/catalogue/detail_article_consultation.html?ean=9782840550877\u0026emet=3010xxxxx0100\n\nI set the environment variable like so:\n\n    export DILICOM_EMET=\"3010xxxxx0100\"\n\nIn Abelujo, this sets the \"details_url\" Card slot accordingly and you\ncan click on the \"source\" link when viewing a book's page.\n\n## As a library\n\nBut most of all, from within your program:\n\n    from bookshops.frFR.librairiedeparis.librairiedeparisScraper import Scraper as frenchScraper\n\n    scraper = frenchScraper(\"search keywords\")\n    cards = scraper.search()\n    # we get a list of dictionaries with the title, the authors, etc.\n\n## Caching\n\nResults are cached in memory for about 1 day (except Dilicom results,\non purpose). It allows long-running software based on this library\n(e.g., Abelujo) to feel more dynamic in certain cases.\n\n## Advanced search\n\nWork in progress.\n\nYou can search ``ed:agone`` to search for a specific publisher.\n\n## Pagination\n\nWe do pagination:\n\n    scraper = frenchScraper(\"search keywords\", page=2)\n\n\n# Why not… ?\n\n## Why not Amazon ?\n\nAmazon kills the book industry and its employees.  But moreover, we\ncan add value to our results. We can link to a good and independent\nbookshop from within our application, we could order books from it,\nwe could say if it has copies in stock or not, etc.\n\nTechnically speaking, the Amazon API web service can be too limiting\nand not appropriate. 
One must register for Amazon Product Advertising\nand for AWS, which makes things harder for deployment and for independent\nusers, and it changes far more often than our resellers' websites.\n\n## Why not Google books ?\n\nIt has very little data.\n\n## Why not the BNF (Bibliothèque Nationale de France) ?\n\nBecause, for bookshops, we need recent books (they enter the BNF\ndatabase after a few months), and the price.\n\n\n# Develop and test\n\nSee http://dev.abelujo.cc/webscraping.html\n\nDevelopment mode:\n\n    pip install -e .\n\nNow you can edit the project and run the development version the way the\nlib is meant to be run, i.e. with the `entry_points`: `livres`,\n`libros`, etc.\n\ndoc: https://python-packaging-user-guide.readthedocs.org/en/latest/distributing/#working-in-development-mode\n\n\n# Bugs and shortcomings\n\nNote: the Dilicom interface is not affected by these limitations.\n\nThis is web scraping, so it is not without pitfalls:\n\n- the site can go down. It happened already.\n- the site can change, in which case we would have to change our\n  scraper too. To catch this early we run automatic tests every\n  week. The actual website didn't change in 3 years.\n\n\n# Changelog\n\n## 0.7\n\n- multiple ISBN search for Dilicom (in batches of one hundred).\n\n## 0.6\n\n- results are cached again. Simply in memory for about 1 day.\n\n## 0.5\n\n- added a Swiss interface.\n- added support to fetch and print prices in another currency.\n\n## 0.4\n\n- added Dilicom interface. It only provides search of ISBN(s), it doesn't provide free and advanced search.\n\n## 0.3.1\n\n- added search of DVDs\n- updated French scrapers (first time needed in four years).\n\n## 0.2.2\n\n- remove deprecated import from ods/csv feature. 
Might do a simpler one in the future.\n\n## 0.2.1\n\n- German scraper: search by ISBN\n\n## 0.2.0\n\n- German scraper\n- multiprocessing for the German scraper (from 15 to 9s) (see [issue #1](https://gitlab.com/vindarel/bookshops/issues/1))\n- `--isbn` option for it\n\n## 0.1.x\n\n- French and Spanish scrapers\n- command line tool\n\n# Licence\n\nLGPLv3\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvindarel%2Fbookshops","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvindarel%2Fbookshops","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvindarel%2Fbookshops/lists"}