Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.
Awesome Lists | Featured Topics | Projects
https://github.com/trevorhobenshield/amazon_photos

Amazon Photos API
https://github.com/trevorhobenshield/amazon_photos
amazon amazon-photos api archive automation data
Last synced: 2 months ago
JSON representation
Amazon Photos API
Host: GitHub
URL: https://github.com/trevorhobenshield/amazon_photos
Owner: trevorhobenshield
License: mit
Created: 2023-08-14T19:13:12.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-01-13T20:33:00.000Z (about 1 year ago)
Last Synced: 2024-04-14T10:48:07.390Z (10 months ago)
Topics: amazon, amazon-photos, api, archive, automation, data
Language: Python
Homepage: https://pypi.org/project/amazon_photos
Size: 130 KB
Stars: 38
Watchers: 3
Forks: 4
Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

README

        # Amazon Photos API

## Table of Contents

* [Installation](#installation)

* [Setup](#setup)

* [Examples](#examples)

* [Search](#search)

* [Nodes](#nodes)

    * [Restrictions](#restrictions)

    * [Range Queries](#range-queries)

* [Notes](#notes)

    * [Known File Types](#known-file-types)

* [Custom Image Labeling (Optional)](#custom-image-labeling-optional)

> It is recommended to use this API in a [Jupyter Notebook](https://jupyter.org/install), as the results from most

> endpoints

> are a [DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html#pandas.DataFrame)

> which can be neatly displayed and efficiently manipulated with vectorized ops. This becomes

> increasingly important if you have "large" amounts of data (e.g. >1 million photos/videos).

## Installation

```bash

pip install amazon-photos -U

```

### Output Examples

`ap.db`

|   | dateTimeDigitized        | id                     | name              | ... | model             | apertureValue | focalLength | width | height |   size |

|--:|:-------------------------|:-----------------------|:------------------|:----|:------------------|:--------------|:------------|------:|-------:|-------:|

| 0 | 2019-07-06T18:22:00.000Z | HeMReF-vvJiTTkdPIeWuoP | 1694252973839.png | ... | iPhone XS         | 54823/32325   | 17/4        |  3024 |   4032 | 432777 |

| 1 | 2023-01-18T09:36:22.000Z | z_HiIvASAKqWmdrkjWiqMZ | 1692626817154.jpg | ... | iPhone XS         | 54823/32325   | 17/4        |  3024 |   4032 | 234257 |

| 2 | 2022-08-14T14:13:21.000Z | LKXEZbqoVrhrOYBezisGEQ | 1798219686789.jpg | ... | iPhone 11 Pro Max | 54823/32325   | 17/4        |  3024 |   4032 | 423987 |

| 3 | 2020-06-28T19:32:30.000Z | EPUeciHtfKkGiYkfUyEuMa | 1593482220567.jpg | ... | iPhone XS         | 54823/32325   | 17/4        |  3024 |   4032 | 898957 |

| 4 | 2021-07-07T17:12:55.000Z | fdfKzRJbEyoVeGcfCoJgE- | 1592299282720.png | ... | iPhone XR         | 54823/32325   | 17/4        |  3024 |   4032 | 432556 |

| 5 | 2021-08-18T18:32:41.000Z | crskJSmKPFRhxbpfkivyLm | 1592902159105.png | ... | iPhone XR         | 54823/32325   | 17/4        |  3024 |   4032 | 123123 |

| 6 | 2023-08-23T19:12:21.000Z | qkBFUlyIdkUwVVSaVWWKEF | 1598138358650.png | ... | iPhone 11         | 54823/32325   | 17/4        |  3024 |   4032 | 437887 |

| 7 | 2021-06-19T17:14:13.000Z | TXKMKC-mHvSUrtRfwmtyDe | 1622199863606.jpg | ... | iPhone 12 Pro     | 14447/10653   | 21/5        |  1536 |   2048 | 758432 |

| 8 | 2023-02-15T22:45:40.000Z | FRDvvjcZdpFWiwrIZfTNHO | 1581874518054.jpg | ... | iPhone 8 Plus     | 54823/32325   | 399/100     |  1348 |   2049 | 862883 |

`ap.print_tree()`

```text

~ 

├── Documents 

├── Pictures 

│   ├── iPhone 

│   └── Web 

│       ├── foo 

│       └── bar

├── Videos 

└── Backup 

    ├── LAPTOP-XYZ 

    │   └── Desktop 

    └── DESKTOP-IJK 

        └── Desktop

```

## Setup

> [Update] Jan 04 2024: To avoid confusion, setting env vars is no longer supported. One must pass cookies directly as

> shown below.

Log in to Amazon Photos and copy the following cookies:

- `session-id`

- `ubid`*

- `at`*

### Canada/Europe

where `xx` is the TLD (top-level domain)

- `ubid-acbxx`

- `at-acbxx`

### United States

- `ubid_main`

- `at_main`

E.g.

```python

from amazon_photos import AmazonPhotos

ap = AmazonPhotos(

    ## US

    # cookies={

    #     'ubid_main': ...,

    #     'at_main': ...,

    #     'session-id': ...,

    # },

    ## Canada

    # cookies={

    #     'ubid-acbca': ...,

    #     'at-acbca': ...,

    #     'session-id': ...,

    # }

    ## Italy

    # cookies={

    #     'ubid-acbit': ...,

    #     'at-acbit': ...,

    #     'session-id': ...,

    # }

)

```

## Examples

> A database named `ap.parquet` will be created during the initial setup. This is mainly used to reduce upload conflicts

> by checking your local file(s) md5 against the database before sending the request.

```python

from amazon_photos import AmazonPhotos

ap = AmazonPhotos(

    # see cookie examples above

    cookies={...},

    # optionally cache all intermediate JSON responses

    tmp='tmp',

    # pandas options

    dtype_backend='pyarrow',

    engine='pyarrow',

)

# get current usage stats

ap.usage()

# get entire Amazon Photos library

nodes = ap.query("type:(PHOTOS OR VIDEOS)")

# query Amazon Photos library with more filters applied

nodes = ap.query("type:(PHOTOS OR VIDEOS) AND things:(plant AND beach OR moon) AND timeYear:(2023) AND timeMonth:(8) AND timeDay:(14) AND location:(CAN#BC#Vancouver)")

# sample first 10 nodes

node_ids = nodes.id[:10]

# move a batch of images/videos to the trash bin

ap.trash(node_ids)

# get trash bin contents

ap.trashed()

# permanently delete a batch of images/videos

ap.delete(node_ids)

# restore a batch of images/videos from the trash bin

ap.restore(node_ids)

# upload media (preserves local directory structure and copies to Amazon Photos root directory)

ap.upload('path/to/files')

# download a batch of images/videos

ap.download(node_ids)

# convenience method to get photos only

ap.photos()

# convenience method to get videos only

ap.videos()

# get all identifiers calculated by Amazon.

ap.aggregations(category="all")

# get specific identifiers calculated by Amazon.

ap.aggregations(category="location")

```

## Search

*Undocumented API, current endpoints valid Dec 2023.*

For valid **location** and **people** IDs, see the results from the `aggregations()` method.

| name            | type | description                                                                                                                                                                                                                                               |

|:----------------|:-----|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

| ContentType     | str  | `"JSON"`                                                                                                                                                                                                                                                  |

| _               | int  | `1690059771064`                                                                                                                                                                                                                                           |

| asset           | str  | `"ALL"`
`"MOBILE"`
`"NONE`
`"DESKTOP"`

default: `"ALL"`                                                                                                                                                                              |

| filters         | str  | `"type:(PHOTOS OR VIDEOS) AND things:(plant AND beach OR moon) AND timeYear:(2019) AND timeMonth:(7) AND location:(CAN#BC#Vancouver) AND people:(CyChdySYdfj7DHsjdSHdy)"`

default: `"type:(PHOTOS OR VIDEOS)"`                                   |

| groupByForTime  | str  | `"day"`
`"month"`
`"year"`                                                                                                                                                                                                                        |

| limit           | int  | `200`                                                                                                                                                                                                                                                     |

| lowResThumbnail | str  | `"true"`
`"false"`

default: `"true"`                                                                                                                                                                                                         |

| resourceVersion | str  | `"V2"`                                                                                                                                                                                                                                                    |

| searchContext   | str  | `"customer"`
`"all"`
`"unknown"`
`"family"`
`"groups"`

default: `"customer"`                                                                                                                                                     |

| sort            | str  | `"['contentProperties.contentDate DESC']"`
`"['contentProperties.contentDate ASC']"`
`"['createdDate DESC']"`
`"['createdDate ASC']"`
`"['name DESC']"`
`"['name ASC']"`

default: `"['contentProperties.contentDate DESC']"` |

| tempLink        | str  | `"false"`
`"true"`

default: `"false"`                                                                                                                                                                                                        |             |

## Nodes

*Docs last updated in 2015*

| FieldName                     | FieldType                | Sort Allowed | Notes                                                                                                                                                                                                                                       |

|-------------------------------|--------------------------|--------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

| isRoot                        | Boolean                  |              | Only lower case `"true"` is supported.                                                                                                                                                                                                      |

| name                          | String                   | Yes          | This field does an exact match on the name and prefix query. Consider `node1{ "name" : "sample" }` `node2 { "name" : "sample1" }` Query filter
`name:sample` will return node1
`name:sample*` will return node1 and node2             |

| kind                          | String                   | Yes          | To search for all the nodes which contains kind as FILE `kind:FILE`                                                                                                                                                                         |

| modifiedDate                  | Date (in ISO8601 Format) | Yes          | To Search for all the nodes which has modified from time `modifiedDate:{"2014-12-31T23:59:59.000Z" TO *]`                                                                                                                                   |

| createdDate                   | Date (in ISO8601 Format) | Yes          | To Search for all the nodes created on  `createdDate:2014-12-31T23:59:59.000Z`                                                                                                                                                              |

| labels                        | String Array             |              | Only Equality can be tested with arrays.
if labels contains `["name", "test", "sample"]`.
Label can be searched for name or combination of values.
To get all the labels which contain name and test
`labels: (name AND test)`  |

| description                   | String                   |              | To Search all the nodes for description with value 'test'
`description:test`                                                                                                                                                             |

| parents                       | String Array             |              | Only Equality can be tested with arrays.
if parents contains `["id1", "id2", "id3"]`.
Parent can be searched for name or combination of values.
To get all the parents which contains id1 and id2
`parents:id1 AND parents:id2` |

| status                        | String                   | Yes          | For searching nodes with AVAILABLE status.
`status:AVAILABLE`                                                                                                                                                                            |

| contentProperties.size        | Long                     | Yes          |                                                                                                                                                                                                                                             |

| contentProperties.contentType | String                   | Yes          | If prefix query, only the major content-type (e.g. `image*`, `video*`, etc.) is supported as a prefix.                                                                                                                                      |

| contentProperties.md5         | String                   |              |                                                                                                                                                                                                                                             |

| contentProperties.contentDate | Date (in ISO8601 Format) | Yes          | RangeQueries and equals queries can be used with this field                                                                                                                                                                                 |

| contentProperties.extension   | String                   | Yes          |                                                                                                                                                                                                                                             |

### Restrictions

> Max # of Filter Parameters Allowed is 8

| Filter Type | Filters                                                                               |

|:------------|:--------------------------------------------------------------------------------------|

| Equality    | createdDate, description, isRoot, kind, labels, modifiedDate, name, parentIds, status |

| Range       | contentProperties.contentDate, createdDate, modifiedDate                              |

| Prefix      | contentProperties.contentType, name                                                   |

### Range Queries

| Operation            | Syntax                                                           |

|----------------------|------------------------------------------------------------------|

| GreaterThan          | `{"valueToBeTested" TO *}`                                       |

| GreaterThan or Equal | `["ValueToBeTested" TO *]`                                       |

| LessThan             | `{* TO "ValueToBeTested"}`                                       |

| LessThan or Equal    | `{* TO "ValueToBeTested"]`                                       |

| Between              | `["ValueToBeTested_LowerBound" TO "ValueToBeTested_UpperBound"]` |

## Notes

#### `https://www.amazon.ca/drive/v1/batchLink`

- This endpoint is called when downloading a batch of photos/videos in the web interface. It then returns a URL to

  download a zip file, then makes a request to that url to download the content.

  When making a request to download data for 1200 nodes (max batch size), it turns out to be much slower (~2.5 minutes)

  than asynchronously downloading 1200 photos/videos individually (~1 minute).

### Known File Types

| Extension | Category |

|-----------|----------|

| \.pdf     | pdf      |

| \.doc     | doc      |

| \.docx    | doc      |

| \.docm    | doc      |

| \.dot     | doc      |

| \.dotx    | doc      |

| \.dotm    | doc      |

| \.asd     | doc      |

| \.cnv     | doc      |

| \.mp3     | mp3      |

| \.m4a     | mp3      |

| \.m4b     | mp3      |

| \.m4p     | mp3      |

| \.wav     | mp3      |

| \.aac     | mp3      |

| \.aif     | mp3      |

| \.mpa     | mp3      |

| \.wma     | mp3      |

| \.flac    | mp3      |

| \.mid     | mp3      |

| \.ogg     | mp3      |

| \.xls     | xls      |

| \.xlm     | xls      |

| \.xll     | xls      |

| \.xlc     | xls      |

| \.xar     | xls      |

| \.xla     | xls      |

| \.xlb     | xls      |

| \.xlsb    | xls      |

| \.xlsm    | xls      |

| \.xlsx    | xls      |

| \.xlt     | xls      |

| \.xltm    | xls      |

| \.xltx    | xls      |

| \.xlw     | xls      |

| \.ppt     | ppt      |

| \.pptx    | ppt      |

| \.ppa     | ppt      |

| \.ppam    | ppt      |

| \.pptm    | ppt      |

| \.pps     | ppt      |

| \.ppsm    | ppt      |

| \.ppsx    | ppt      |

| \.pot     | ppt      |

| \.potm    | ppt      |

| \.potx    | ppt      |

| \.sldm    | ppt      |

| \.sldx    | ppt      |

| \.txt     | txt      |

| \.text    | txt      |

| \.rtf     | txt      |

| \.xml     | markup   |

| \.htm     | markup   |

| \.html    | markup   |

| \.zip     | zip      |

| \.rar     | zip      |

| \.7z      | zip      |

| \.jpg     | img      |

| \.jpeg    | img      |

| \.png     | img      |

| \.bmp     | img      |

| \.gif     | img      |

| \.tif     | img      |

| \.svg     | img      |

| \.mp4     | vid      |

| \.m4v     | vid      |

| \.qt      | vid      |

| \.mov     | vid      |

| \.mpg     | vid      |

| \.mpeg    | vid      |

| \.3g2     | vid      |

| \.3gp     | vid      |

| \.flv     | vid      |

| \.f4v     | vid      |

| \.asf     | vid      |

| \.avi     | vid      |

| \.wmv     | vid      |

| \.swf     | exe      |

| \.exe     | exe      |

| \.dll     | exe      |

| \.ax      | exe      |

| \.ocx     | exe      |

| \.rpm     | exe      |

## Custom Image Labeling (Optional)

Categorize your images into folders using computer vision models.

```bash

pip install amazon-photos[extras] -U

```

See the [Model List](https://www.hobenshield.com/stats/bench/index.html) for a list of all available models.

### Sample Models

**Very Large**

```

eva02_base_patch14_448.mim_in22k_ft_in22k_in1k

```

**Large**

```

eva02_large_patch14_448.mim_m38m_ft_in22k_in1k

```

**Medium**

```

eva02_small_patch14_336.mim_in22k_ft_in1k

vit_base_patch16_clip_384.laion2b_ft_in12k_in1k

vit_base_patch16_clip_384.openai_ft_in12k_in1k

caformer_m36.sail_in22k_ft_in1k_384

```

**Small**

```

eva02_tiny_patch14_336.mim_in22k_ft_in1k

tiny_vit_5m_224.dist_in22k_ft_in1k

edgenext_small.usi_in1k

xcit_tiny_12_p8_384.fb_dist_in1k

```

```python

run(

    'eva02_base_patch14_448.mim_in22k_ft_in22k_in1k',

    path_in='images',

    path_out='labeled',

    thresh=0.0,  # threshold for predictions, 0.9 means you want very confident predictions only

    topk=5,

    # window of predictions to check if using exclude or restrict, if set to 1, only the top prediction will be checked

    exclude=lambda x: re.search('boat|ocean', x, flags=re.I),

    # function to exclude classification of these predicted labels

    restrict=lambda x: re.search('sand|beach|sunset', x, flags=re.I),

    # function to restrict classification to only these predicted labels

    dataloader_options={

        'batch_size': 4,  # *** adjust ***

        'shuffle': False,

        'num_workers': psutil.cpu_count(logical=False),  # *** adjust ***

        'pin_memory': True,

    },

    accumulate=False,

    # accumulate results in path_out, if False, everything in path_out will be deleted before running again

    device='cuda',

    naming_style='name',  # use human-readable label names, optionally use the label index or synset

    debug=0,

)

```