https://github.com/dmitryduev/penquins

A python client for Kowalski

kowalski python-client ztf ztf-ii

Host: GitHub
URL: https://github.com/dmitryduev/penquins
Owner: dmitryduev
License: mit
Created: 2020-05-18T21:29:17.000Z (about 5 years ago)
Default Branch: master
Last Pushed: 2024-06-25T09:39:45.000Z (about 1 year ago)
Last Synced: 2024-10-06T12:41:58.866Z (9 months ago)
Topics: kowalski, python-client, ztf, ztf-ii
Language: Python
Homepage:
Size: 368 KB
Stars: 6
Watchers: 4
Forks: 8
Open Issues: 3
Metadata Files:
- Readme: readme.md
- License: LICENSE
- Citation: CITATION.cff

# penquins: a python client for [Kowalski](https://github.com/skyportal/kowalski)

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5651471.svg)](https://doi.org/10.5281/zenodo.5651471)

`penquins` is a python client for [Kowalski](https://github.com/skyportal/kowalski), a multi-survey data archive and alert broker for time-domain astronomy.

## Quickstart

Install `penquins` from [PyPI](https://pypi.org/project/penquins/):

```bash
pip install penquins --upgrade
```

Connect to a Kowalski instance:

```python
from penquins import Kowalski

username = ""
password = ""
protocol, host, port = "https", "", 443

kowalski = Kowalski(
    username=username,
    password=password,
    protocol=protocol,
    host=host,
    port=port
)
```

*When you connect to only one instance, it is labeled "default". Use that key when retrieving the results of your queries.*

Connect to multiple Kowalski instances:

```python
from penquins import Kowalski

instances = {
    "kowalski": {
        "name": "kowalski",
        "host": "",
        "protocol": "https",
        "port": 443,
        "token": ""  # or username and password
    },
    # ... more instances
}

kowalski = Kowalski(instances=instances)
```

*When using multiple instances at once, you can target a single instance by passing its name to `query(name=...)`, or omit the name. If no name is provided and the queried catalogs are available on multiple instances, penquins will divide the load between instances automagically.*

*When retrieving the results, use the instance name(s) instead of "default", or simply iterate over the results by instance and merge them.*

It is recommended to authenticate once and then just reuse the generated token:

```python
token = kowalski.token
print(token)

kowalski = Kowalski(
    token=token,
    protocol=protocol,
    host=host,
    port=port
)
```
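One way to reuse a token across sessions is to keep it outside the script. The sketch below is only an illustration: the helper `connection_kwargs` and the environment variable names `KOWALSKI_TOKEN`, `KOWALSKI_USERNAME`, and `KOWALSKI_PASSWORD` are hypothetical and not part of penquins. It builds the keyword arguments shown above, preferring a stored token and falling back to username and password.

```python
import os


def connection_kwargs(host, protocol="https", port=443, env=None):
    """Build keyword arguments for Kowalski(...), preferring a stored token.

    The environment variable names are hypothetical; pick your own.
    """
    env = os.environ if env is None else env
    kwargs = {"protocol": protocol, "host": host, "port": port}
    token = env.get("KOWALSKI_TOKEN")
    if token:
        # A previously generated token is available: reuse it.
        kwargs["token"] = token
    else:
        # No token stored: fall back to username and password.
        kwargs["username"] = env.get("KOWALSKI_USERNAME", "")
        kwargs["password"] = env.get("KOWALSKI_PASSWORD", "")
    return kwargs
```

The result can then be passed on as `Kowalski(**connection_kwargs(host))`.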

Check connection:

```python
kowalski.ping()
```
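If the instance may be briefly unreachable, the connection check can be wrapped in a retry loop. This is a minimal sketch that assumes `ping()` returns a truthy value on success; `wait_until_reachable` and its parameters are hypothetical names, not part of penquins.

```python
import time


def wait_until_reachable(client, attempts=3, delay_s=1.0):
    """Call client.ping() up to `attempts` times; return True on the first success.

    Assumes ping() returns a truthy value when the instance responds.
    """
    for attempt in range(attempts):
        try:
            if client.ping():
                return True
        except Exception:
            # Assumption: treat a raised error the same as a failed ping.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

With a connected client this would be called as `wait_until_reachable(kowalski)`.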

### Querying a Kowalski instance

Most users will be interacting with Kowalski using the `Kowalski.query` method.

Retrieve available catalog names:

```python
query = {
    "query_type": "info",
    "query": {
        "command": "catalog_names",
    }
}

response = kowalski.query(query=query)
data = response.get("default").get("data")
```
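The chained `response.get("default").get("data")` raises an `AttributeError` if the instance key is missing, because the first `get` then returns `None`. A small helper can make that case explicit. This sketch assumes only the response shape used in the examples here (a dict keyed by instance name, each entry holding a `"data"` field); `extract_data` is a hypothetical name.

```python
def extract_data(response, instance="default"):
    """Return response[instance]["data"], or None if either level is missing."""
    per_instance = response.get(instance) or {}
    return per_instance.get("data")
```

Usage: `data = extract_data(response)` for a single instance, or `extract_data(response, "kowalski")` for a named one.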

Query for the 7 nearest sources to a sky position, sorted by spherical distance, with a `near` query:

```python
query = {
    "query_type": "near",
    "query": {
        "max_distance": 2,
        "distance_units": "arcsec",
        "radec": {"query_coords": [281.15902595, -4.4160933]},
        "catalogs": {
            "ZTF_sources_20210401": {
                "filter": {},
                "projection": {"_id": 1},
            }
        },
    },
    "kwargs": {
        "max_time_ms": 10000,
        "limit": 7,
    },
}

response = kowalski.query(query=query)
data = response.get("default").get("data")
```
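When the same `near` query is run for many positions, it can help to build the dict with a small function. The sketch below reproduces the query structure shown above; `make_near_query` and its parameter names are hypothetical, and the defaults simply mirror the example values.

```python
def make_near_query(ra, dec, catalog, max_distance=2, units="arcsec",
                    limit=7, max_time_ms=10000, projection=None):
    """Build a `near` query dict with the same layout as the example above."""
    return {
        "query_type": "near",
        "query": {
            "max_distance": max_distance,
            "distance_units": units,
            "radec": {"query_coords": [ra, dec]},
            "catalogs": {
                catalog: {
                    "filter": {},
                    # Default to returning only the _id, as in the example.
                    "projection": projection if projection is not None else {"_id": 1},
                }
            },
        },
        "kwargs": {"max_time_ms": max_time_ms, "limit": limit},
    }
```

Usage: `response = kowalski.query(query=make_near_query(281.15902595, -4.4160933, "ZTF_sources_20210401"))`.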

Run a `cone_search` query:

```python
query = {
    "query_type": "cone_search",
    "query": {
        "object_coordinates": {
            "cone_search_radius": 2,
            "cone_search_unit": "arcsec",
            "radec": {
                "ZTF20acfkzcg": [
                    115.7697847,
                    50.2887778
                ]
            }
        },
        "catalogs": {
            "ZTF_alerts": {
                "filter": {},
                "projection": {
                    "_id": 0,
                    "candid": 1,
                    "objectId": 1
                }
            }
        }
    },
    "kwargs": {
        "filter_first": False
    }
}

response = kowalski.query(query=query)
data = response.get("default").get("data")
```
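In the example above, `radec` is a dict keyed by object name, which suggests that several named positions can be searched in one query. The following sketch builds such a query from a mapping of names to coordinates. Treat the multi-object use as an assumption to verify against your Kowalski instance; `make_cone_search_query` is a hypothetical helper.

```python
def make_cone_search_query(positions, catalog, radius=2, unit="arcsec",
                           projection=None):
    """Build a `cone_search` query dict for one or more named sky positions.

    `positions` maps an object name to an (ra, dec) pair, in the same
    coordinate convention as the example above.
    """
    return {
        "query_type": "cone_search",
        "query": {
            "object_coordinates": {
                "cone_search_radius": radius,
                "cone_search_unit": unit,
                "radec": {name: [ra, dec] for name, (ra, dec) in positions.items()},
            },
            "catalogs": {
                catalog: {
                    "filter": {},
                    "projection": projection if projection is not None else {"_id": 0},
                }
            },
        },
        "kwargs": {"filter_first": False},
    }
```

Usage: `kowalski.query(query=make_cone_search_query({"ZTF20acfkzcg": (115.7697847, 50.2887778)}, "ZTF_alerts", projection={"_id": 0, "candid": 1, "objectId": 1}))`.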

Run a `find` query:

```python
q = {
    "query_type": "find",
    "query": {
        "catalog": "ZTF_alerts",
        "filter": {
            "objectId": "ZTF20acfkzcg"
        },
        "projection": {
            "_id": 0,
            "candid": 1
        }
    }
}

response = kowalski.query(query=q)
data = response.get("default").get("data")
```

Run a batch of queries in parallel:

```python
# `data` holds the alerts returned by the `find` query above
queries = [
    {
        "query_type": "find",
        "query": {
            "catalog": "ZTF_alerts",
            "filter": {
                "candid": alert["candid"]
            },
            "projection": {
                "_id": 0,
                "candid": 1
            }
        }
    }
    for alert in data
]

responses = kowalski.query(queries=queries, use_batch_query=True, max_n_threads=4)
```

### Querying multiple instances at once

When using multiple instances at once, you can target a single instance by passing its name to `query(name=...)`, or omit the name. If no name is provided and the queried catalogs are available on multiple instances, penquins will divide the load between instances automagically.

When retrieving the results, use the instance name(s) instead of "default", or simply iterate over the results by instance and merge them.

All of the queries shown for a single instance also work here.

#### Examples

No instance name specified:

```python
q = {
    "query_type": "find",
    "query": {
        "catalog": "ZTF_alerts",
        "filter": {
            "objectId": "ZTF20acfkzcg"
        },
        "projection": {
            "_id": 0,
            "candid": 1
        }
    }
}

response = kowalski.query(query=q)

data = response.get()

data = response.get(

```
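Iterating over the results by instance and merging them can be sketched as follows. This assumes the response is a dict keyed by instance name and that each entry carries a list under `"data"`, as in the single-instance examples; `merge_results` is a hypothetical helper, not part of penquins.

```python
def merge_results(response):
    """Concatenate the "data" lists of every instance in a response dict."""
    merged = []
    for name, result in response.items():
        # Skip instances that returned nothing or have no "data" field.
        data = (result or {}).get("data")
        if data:
            merged.extend(data)
    return merged
```

Usage: `data = merge_results(response)`. Entries are concatenated as-is, so records returned by more than one instance are not deduplicated.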
