https://github.com/nyu-mlab/inspector-core-library
Library for core functionalities of IoT Inspector
internet-of-things iot iot-device iot-platform raspberry-pi raspberrypi
- Host: GitHub
- URL: https://github.com/nyu-mlab/inspector-core-library
- Owner: nyu-mlab
- License: apache-2.0
- Created: 2025-02-23T00:54:48.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-03-04T05:08:14.000Z (4 months ago)
- Last Synced: 2026-03-04T08:35:11.778Z (4 months ago)
- Topics: internet-of-things, iot, iot-device, iot-platform, raspberry-pi, raspberrypi
- Language: Python
- Homepage: https://inspector.engineering.nyu.edu/
- Size: 14.6 MB
- Stars: 1
- Watchers: 1
- Forks: 2
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
# inspector-core-library
Library for core functionalities of IoT Inspector
## Installation
To install the `libinspector` module via pip, use the following command:
```sh
pip install libinspector
```
## Usage
### Running the Inspector
For debugging purposes, you can set the following environment variables to control the behavior of the Inspector Core:
| Variable | Description | Default |
|:-------------------|:-----------------------------------------------------------------------------------------------------|:--------|
| `USE_IN_MEMORY_DB` | Set to `false` to use a physical `.db` file on disk. Useful for debugging the core library/database. | `true` |
| `SCAN_ALL_DEVICES` | Set to `true` to ARP-spoof all devices on the network automatically. | `false` |
| `ARP_SPOOF_ROUTER` | Set to `false` to NOT ARP-spoof the router. | `true` |
| `ARP_SPOOF_DEVICE` | Set to `false` to NOT ARP-spoof the device. | `true` |
To run the Inspector, first activate the virtual environment, then run the following command. Pass any environment variables inline on the same command line, as shown:
```sh
sudo USE_IN_MEMORY_DB=false SCAN_ALL_DEVICES=true PYTHONPATH=~/.local/lib/python3.11/site-packages python3 -m libinspector.core
```
#### How to set environment variables (Linux/macOS):
```bash
export USE_IN_MEMORY_DB=false
export SCAN_ALL_DEVICES=true
export ARP_SPOOF_ROUTER=false
export ARP_SPOOF_DEVICE=false
```
#### How to set environment variables (Windows):
```powershell
$env:USE_IN_MEMORY_DB = "false"
$env:SCAN_ALL_DEVICES = "true"
$env:ARP_SPOOF_ROUTER = "false"
$env:ARP_SPOOF_DEVICE = "false"
```
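When embedding the library, the same flags could also be set from Python for the current process. The following is a minimal sketch under stated assumptions: `env_flag` is a hypothetical helper (it is not part of `libinspector`) that shows one common way to interpret `true`/`false` strings, and this README does not say exactly when or how `libinspector` reads these variables, so setting them before the library is started is the safer guess.

```python
import os


def env_flag(name: str, default: bool) -> bool:
    """Hypothetical helper: interpret an environment variable as a boolean."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


# Set flags for this process only. Assumption: they need to be in place
# before the Inspector threads are started in order to take effect.
os.environ["USE_IN_MEMORY_DB"] = "false"
os.environ["SCAN_ALL_DEVICES"] = "true"

print("in-memory DB:", env_flag("USE_IN_MEMORY_DB", default=True))
print("scan all devices:", env_flag("SCAN_ALL_DEVICES", default=False))
```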
### Embedding in Your Own Python Application
The preferred way to use `libinspector` is to embed it within your own Python application. You can do this by importing `libinspector.core` and calling the `start_threads()` method, which returns almost instantaneously. Your Python script will then need to read the in-memory SQLite database for information about the devices and the network traffic flows.
```python
import time
import libinspector.core
import libinspector.global_state
# This method returns almost instantaneously
libinspector.core.start_threads()
# Sleep and/or do other work here, such as analyzing the in-memory SQLite
# database. For example, keep printing the device list from the `devices` table.
db_conn, rwlock = libinspector.global_state.db_conn_and_lock
while True:
    with rwlock:
        for device in db_conn.execute('SELECT mac_address, ip_address FROM devices').fetchall():
            print(f'MAC: {device["mac_address"]}, IP: {device["ip_address"]}')
    # Sleep outside the lock so other threads can use the database.
    time.sleep(5)
```
If you want to add packet parsing capabilities, you can specify a custom callback when you start Inspector. Here's an example that prints a summary of each captured packet:
```python
import libinspector.core

libinspector.core.start_threads(
    custom_packet_callback_func=lambda pkt: print(f'Packet captured: {pkt.summary()}')
)
```
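A callback usually needs to do more than print. The sketch below tallies packets by their second protocol layer, guessed from the text of `pkt.summary()`. It rests on several assumptions: that the callback runs on a capture thread (hence the lock), that summaries look like `Ether / IP / TCP ...`, and that `FakePacket` (a hypothetical stand-in defined here, not a `libinspector` class) is an acceptable substitute for a real packet when trying the code offline.

```python
import threading
from collections import Counter

# Shared state updated by the callback. The lock is there because the
# callback is assumed to run on a capture thread, not the main thread.
layer_counts = Counter()
counts_lock = threading.Lock()


def count_packet(pkt) -> None:
    """Tally a packet by the second layer named in its summary text."""
    parts = [p.split()[0] for p in pkt.summary().split(" / ") if p.strip()]
    if len(parts) > 1:
        key = parts[1]
    elif parts:
        key = parts[0]
    else:
        key = "unknown"
    with counts_lock:
        layer_counts[key] += 1


class FakePacket:
    """Hypothetical stand-in for a captured packet (offline testing only)."""

    def __init__(self, text: str) -> None:
        self._text = text

    def summary(self) -> str:
        return self._text


for text in ("Ether / IP / TCP", "Ether / ARP who has", "Ether / IP / UDP"):
    count_packet(FakePacket(text))

print(dict(layer_counts))
```

To use it with the library, pass the function instead of the lambda: `custom_packet_callback_func=count_packet`.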
### Data Schema
The data schema is defined in `mem_db.py` and includes the following tables:
- `devices`: Stores information about devices on the network.
  - `mac_address` (TEXT, PRIMARY KEY): The MAC address of the device.
  - `ip_address` (TEXT, NOT NULL): The IP address assigned to the device.
  - `is_inspected` (INTEGER, DEFAULT 0): Indicates whether the device is being inspected (1) or not (0).
  - `is_gateway` (INTEGER, DEFAULT 0): Indicates whether the device is a gateway (1) or not (0).
  - `updated_ts` (INTEGER, DEFAULT 0): The timestamp of the last update.
  - `metadata_json` (TEXT, DEFAULT '{}'): Additional metadata in JSON format.
- `hostnames`: Stores hostnames associated with IP addresses.
  - `ip_address` (TEXT, PRIMARY KEY): The IP address associated with the hostname.
  - `hostname` (TEXT, NOT NULL): The hostname of the device.
  - `updated_ts` (INTEGER, DEFAULT 0): The timestamp of the last update.
  - `data_source` (TEXT, NOT NULL): The source of the hostname data.
  - `metadata_json` (TEXT, DEFAULT '{}'): Additional metadata in JSON format.
- `network_flows`: Stores information about network flows.
  - `timestamp` (INTEGER): The timestamp of the network flow.
  - `src_ip_address` (TEXT): The source IP address of the flow.
  - `dest_ip_address` (TEXT): The destination IP address of the flow.
  - `src_hostname` (TEXT): The source hostname of the flow.
  - `dest_hostname` (TEXT): The destination hostname of the flow.
  - `src_mac_address` (TEXT): The source MAC address of the flow.
  - `dest_mac_address` (TEXT): The destination MAC address of the flow.
  - `src_port` (TEXT): The source port of the flow.
  - `dest_port` (TEXT): The destination port of the flow.
  - `protocol` (TEXT): The protocol used in the flow.
  - `byte_count` (INTEGER, DEFAULT 0): The number of bytes transferred in the flow.
  - `packet_count` (INTEGER, DEFAULT 0): The number of packets transferred in the flow.
  - `metadata_json` (TEXT, DEFAULT '{}'): Additional metadata in JSON format.
  - PRIMARY KEY (`timestamp`, `src_mac_address`, `dest_mac_address`, `src_ip_address`, `dest_ip_address`, `src_port`, `dest_port`, `protocol`): The composite primary key for the table.
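To experiment with queries without running the Inspector, the schema above can be approximated in a standalone SQLite database. This is a sketch only: the `CREATE TABLE` statements are reconstructed from the column list in this README and may differ from what `mem_db.py` actually executes, and every row inserted below is made-up sample data. It sums bytes per device and destination hostname by joining `network_flows` to `devices` on the source MAC address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # enables row["column_name"] access

# Reconstructed from the README's column list; not copied from mem_db.py.
conn.executescript("""
CREATE TABLE devices (
    mac_address TEXT PRIMARY KEY,
    ip_address TEXT NOT NULL,
    is_inspected INTEGER DEFAULT 0,
    is_gateway INTEGER DEFAULT 0,
    updated_ts INTEGER DEFAULT 0,
    metadata_json TEXT DEFAULT '{}'
);
CREATE TABLE network_flows (
    timestamp INTEGER,
    src_ip_address TEXT, dest_ip_address TEXT,
    src_hostname TEXT, dest_hostname TEXT,
    src_mac_address TEXT, dest_mac_address TEXT,
    src_port TEXT, dest_port TEXT,
    protocol TEXT,
    byte_count INTEGER DEFAULT 0,
    packet_count INTEGER DEFAULT 0,
    metadata_json TEXT DEFAULT '{}',
    PRIMARY KEY (timestamp, src_mac_address, dest_mac_address,
                 src_ip_address, dest_ip_address, src_port, dest_port, protocol)
);
""")

# Made-up sample data for illustration.
dev, gw = "aa:bb:cc:dd:ee:01", "aa:bb:cc:dd:ee:ff"
conn.execute(
    "INSERT INTO devices (mac_address, ip_address, is_inspected) VALUES (?, ?, 1)",
    (dev, "192.0.2.10"),
)
conn.executemany(
    "INSERT INTO network_flows (timestamp, src_ip_address, dest_ip_address,"
    " dest_hostname, src_mac_address, dest_mac_address, src_port, dest_port,"
    " protocol, byte_count, packet_count) VALUES (?,?,?,?,?,?,?,?,?,?,?)",
    [
        (1700000000, "192.0.2.10", "198.51.100.5", "example.com", dev, gw, "50000", "443", "tcp", 1200, 4),
        (1700000001, "192.0.2.10", "198.51.100.5", "example.com", dev, gw, "50000", "443", "tcp", 800, 2),
        (1700000001, "192.0.2.10", "198.51.100.7", "example.org", dev, gw, "50001", "443", "tcp", 300, 1),
    ],
)

# Total bytes sent by each known device to each destination hostname.
rows = conn.execute("""
    SELECT d.ip_address, f.dest_hostname, SUM(f.byte_count) AS total_bytes
    FROM network_flows AS f
    JOIN devices AS d ON d.mac_address = f.src_mac_address
    GROUP BY d.ip_address, f.dest_hostname
    ORDER BY total_bytes DESC
""").fetchall()

for row in rows:
    print(row["ip_address"], row["dest_hostname"], row["total_bytes"])
```

Note that the ports are stored as TEXT in this schema, so comparisons such as `dest_port = '443'` need string literals.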
### How `libinspector` Works
The `libinspector` module works by starting various threads to monitor and inspect network traffic. Here is a high-level overview of the `start_threads` function in `core.py`:
1. **Ensure Single Instance**: The function first ensures that only one instance of the Inspector core is running.
2. **Initialize Database**: It initializes the database by calling `mem_db.initialize_db()`.
3. **Initialize Networking Variables**: It enables IP forwarding and updates the network information.
4. **Start Threads**: It starts several threads to perform various tasks:
   - Update network info from the OS every 60 seconds.
   - Discover devices on the network every 10 seconds.
   - Collect and process packets from the network.
   - Spoof internet traffic.
   - Start the mDNS and UPnP scanner threads.
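The periodic tasks in step 4 follow a familiar background-thread pattern, which also explains why `start_threads()` can return almost immediately. The sketch below illustrates that general pattern only; it is not the code in `core.py`. `run_periodically` and the task names are hypothetical, and the intervals are shortened from 60 and 10 seconds so the example finishes quickly.

```python
import threading
import time


def run_periodically(task, interval_sec: float, stop_event: threading.Event) -> threading.Thread:
    """Run task() repeatedly on a daemon thread until stop_event is set."""

    def loop() -> None:
        while not stop_event.is_set():
            task()
            # Event.wait() acts as a sleep that ends early on shutdown.
            stop_event.wait(interval_sec)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()  # returns immediately; the work continues in the background
    return thread


stop = threading.Event()
calls = {"update_network_info": 0, "discover_devices": 0}
calls_lock = threading.Lock()


def make_task(name: str):
    """Build a placeholder task that only counts how often it ran."""

    def task() -> None:
        with calls_lock:
            calls[name] += 1

    return task


# Hypothetical stand-ins for the real tasks, with scaled-down intervals.
threads = [
    run_periodically(make_task("update_network_info"), 0.06, stop),
    run_periodically(make_task("discover_devices"), 0.01, stop),
]

time.sleep(0.2)  # the main thread is free to do other work here
stop.set()
for t in threads:
    t.join()

print(calls)
```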
### Testing and Development
To test locally, run these commands:
```sh
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install .
```
## Notes
TODO:
- Create more test cases to obtain higher code coverage.
## Contributing
Contributions are welcome! Please feel free to submit a pull request or open an issue.
## License
This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
## Contact
Ask Prof. Danny Y. Huang (dhuang@nyu.edu) or Andrew Quijano (andrew.quijano@nyu.edu).