https://github.com/droduit/pictdbm
⚙️ Pictures Database Manager
haystack image-database vips
- Host: GitHub
- URL: https://github.com/droduit/pictdbm
- Owner: droduit
- Created: 2017-12-25T10:29:11.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2023-08-05T10:27:52.000Z (over 2 years ago)
- Last Synced: 2025-01-02T20:19:02.512Z (about 1 year ago)
- Topics: haystack, image-database, vips
- Language: C
- Homepage: https://dominique.roduit.com/en/pict-dbm
- Size: 2.4 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Pictures Database Manager
## Description
This project is a command-line utility for managing images stored in a purpose-built database format. It is a simplified version inspired by the "Haystack" system used by Facebook.
Social networks have to manage hundreds of millions of images. Ordinary file systems (such as the one used on your hard disk) become inefficient with that many files. Moreover, they are not designed to keep each image in several resolutions, for example very small (icon), medium for a quick "preview", and normal size (original resolution).
In the “Haystack” approach, several images are stored in the same file. Also, different resolutions of the same image are stored automatically. This single file contains both data (images) and metadata (information about each image). The key idea is that the image server has a copy of this metadata in memory, in order to allow very fast access to a specific photo, and in the correct resolution.
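
To make the idea of in-memory metadata concrete, here is a minimal sketch in C. The struct layout, field names, and function names (`pict_meta`, `find_picture`, `locate_image`) are assumptions made for illustration and are not taken from the project's source. It only shows how a lookup can be answered from memory before a single read is issued on the database file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, for illustration only (not the project's structs). */
enum { RES_THUMB = 0, RES_SMALL = 1, RES_ORIG = 2, NB_RES = 3 };
enum { MAX_ID_LEN = 127 };

struct pict_meta {
    char     id[MAX_ID_LEN + 1]; /* picture identifier */
    int      is_valid;           /* 0 if the slot is empty or deleted */
    uint64_t offset[NB_RES];     /* byte offset of each resolution in the file */
    uint32_t size[NB_RES];       /* byte size of each resolution, 0 if absent */
};

/* Scan the in-memory table for a valid entry with the given id.
 * Returns its index, or -1 if there is none. No disk access is involved. */
int find_picture(const struct pict_meta *table, size_t count, const char *id)
{
    assert(table != NULL && id != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (table[i].is_valid && strcmp(table[i].id, id) == 0)
            return (int)i;
    }
    return -1;
}

/* Report where the requested resolution lives in the database file.
 * Returns 1 and fills offset/size on success, 0 if that resolution
 * is unknown or has not been stored. */
int locate_image(const struct pict_meta *m, int res,
                 uint64_t *offset, uint32_t *size)
{
    assert(m != NULL && offset != NULL && size != NULL);
    if (res < 0 || res >= NB_RES || m->size[res] == 0)
        return 0;
    *offset = m->offset[res];
    *size = m->size[res];
    return 1;
}
```

A server built this way would call `find_picture` and then `locate_image`, and only afterwards seek to the returned offset in the database file to read the image bytes.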
This approach has a number of advantages. First, it reduces the number of files managed by the operating system. Second, it makes it possible to elegantly implement two important aspects of image database management:
- automatic management of different image resolutions, in our case the three supported resolutions;
- the possibility of not duplicating identical images submitted under different names (e.g. by different users at Facebook), an extremely useful optimization in any social network.
This “deduplication” is done using a “hash function”, which summarizes binary content (in our case an image) into a much shorter signature. Here we use the “SHA-256” function, which summarizes any binary content in 256 bits and has the useful cryptographic property of being resistant to collisions: for a given image, it is practically impossible to create another image with the same signature.
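
The deduplication step can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's code: the `entry` struct and the functions `find_same_content` and `add_entry` are hypothetical, and the 32-byte digest is assumed to have been computed beforehand by any SHA-256 implementation. Two images are treated as identical when their digests match, in which case the new entry reuses the stored content instead of appending a second copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SHA256_LEN = 32 }; /* 256 bits */

/* Hypothetical metadata entry, for illustration only. */
struct entry {
    char          id[64];           /* name under which the image was submitted */
    unsigned char sha[SHA256_LEN];  /* SHA-256 digest of the image bytes */
    size_t        content_offset;   /* where the image bytes live in the file */
};

/* Return the index of an entry whose digest equals sha, or -1. */
int find_same_content(const struct entry *entries, size_t count,
                      const unsigned char sha[SHA256_LEN])
{
    assert(entries != NULL && sha != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (memcmp(entries[i].sha, sha, SHA256_LEN) == 0)
            return (int)i;
    }
    return -1;
}

/* Record a new entry. If the same content is already stored, share its
 * offset; otherwise use append_offset, where the caller would write the
 * new bytes. The caller guarantees that entries has room for one more.
 * Returns the offset the new entry points to. */
size_t add_entry(struct entry *entries, size_t *count, const char *id,
                 const unsigned char sha[SHA256_LEN], size_t append_offset)
{
    assert(entries != NULL && count != NULL && id != NULL && sha != NULL);
    int dup = find_same_content(entries, *count, sha);
    struct entry *e = &entries[*count];
    strncpy(e->id, id, sizeof e->id - 1);
    e->id[sizeof e->id - 1] = '\0';
    memcpy(e->sha, sha, SHA256_LEN);
    e->content_offset = (dup >= 0) ? entries[dup].content_offset
                                   : append_offset;
    ++*count;
    return e->content_offset;
}
```

Comparing 32-byte digests rather than full image contents keeps the check cheap, and the collision resistance of SHA-256 is what makes equal digests a reliable stand-in for equal images.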
## Preview

## How to install
1. Clone this git repository locally
2. Make sure the following packages are installed:
   - [glib](https://docs.gtk.org/glib/)
   - [pkg-config](https://en.wikipedia.org/wiki/Pkg-config)
   - [libvips](https://github.com/libvips/libvips/tree/master)

   If not, on macOS: `brew install libvips pkg-config`
3. From the root of the project, run `cd libmongoose && make clean && make all`.
4. From the root of the project, run `make clean-all && make all`.
5. Copy `libmongoose/libmongoose.so` into the root folder: `cp libmongoose/libmongoose.so libmongoose.so`.
6. Run the server with `make server`.
7. Open `localhost:8000` in any browser.
## Makefile commands
* `make clean-all` Remove all object files and executables generated by a call to `make`
* `make server` Launch the server, reachable in your web browser at `localhost:8000` (default value)
* `make style` Apply `astyle` on the whole project's `.c` and `.h` files
## Commands available
```sh
./pictDBM [COMMAND] [ARGUMENTS]
```
* **help**
  Display this help.
* **list** `<dbfilename>`
  List the pictDB content.
* **create** `<dbfilename> [options]`
  Create a new pictDB. Options are:
  * `-max_files`: maximum number of files.
  * `-thumb_res`: resolution for thumbnail images.
  * `-small_res`: resolution for small images.
* **read** `<dbfilename> <pictID> [original|orig|thumbnail|thumb|small]`
  Read an image from the pictDB and save it to a file. The default resolution is "original".
* **insert** `<dbfilename> <pictID> <filename>`
  Insert a new image in the pictDB.
* **delete** `<dbfilename> <pictID>`
  Delete picture `pictID` from the pictDB.
* **gc** `<dbfilename> <tmp dbfilename>`
  Perform garbage collection on the pictDB. Requires a temporary filename for copying the pictDB.
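
The need for a temporary file in `gc` can be illustrated with a small sketch. One plausible approach, assumed here rather than taken from the project's source, is to copy only the entries that are still valid into a fresh database and then replace the original with it. The `slot` struct and `compact_slots` function below are hypothetical and work on in-memory arrays to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical metadata slot, for illustration only. */
struct slot {
    int    is_valid; /* 0 once the picture has been deleted */
    size_t size;     /* bytes of image data owned by this slot */
};

/* Copy the valid slots of src into dst, which stands in for the
 * temporary database. dst must have room for count slots.
 * Returns the number of slots kept and stores in *reclaimed the bytes
 * that belonged to deleted slots. This simplification ignores content
 * shared between deduplicated entries. */
size_t compact_slots(const struct slot *src, size_t count,
                     struct slot *dst, size_t *reclaimed)
{
    assert(src != NULL && dst != NULL && reclaimed != NULL);
    size_t kept = 0;
    *reclaimed = 0;
    for (size_t i = 0; i < count; ++i) {
        if (src[i].is_valid)
            dst[kept++] = src[i];
        else
            *reclaimed += src[i].size;
    }
    return kept;
}
```

In a file-based version, `dst` would be the database created at the temporary filename, and the original would be replaced by it once the copy has succeeded.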
## Authors
- Dominique Roduit ([@droduit](https://github.com/droduit))
- Thierry Treyer ([@ttreyer](https://github.com/ttreyer))
## Note
Project completed as part of the EPFL course « [System programming project](https://edu.epfl.ch/coursebook/en/system-programming-project-CS-212) ».