https://github.com/wanghui5801/tsdss

Time series analysis tools.
Host: GitHub
URL: https://github.com/wanghui5801/tsdss
Owner: wanghui5801
License: mit
Created: 2024-11-18T04:36:11.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-11-21T17:41:55.000Z (over 1 year ago)
Last Synced: 2025-09-13T12:17:53.232Z (9 months ago)
Topics: filtering, nonlinear, signal-processing, surrogate, timeseries
Language: Python
Homepage:
Size: 139 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# TSDSS 📊 🔮 📈

[![PyPI version](https://badge.fury.io/py/tsdss.svg)](https://badge.fury.io/py/tsdss)

[![Python](https://img.shields.io/pypi/pyversions/tsdss.svg)](https://pypi.org/project/tsdss/)

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

[![Downloads](https://pepy.tech/badge/tsdss)](https://pepy.tech/project/tsdss)

[![Build Status](https://github.com/wanghui5801/tsdss/workflows/Python%20Tests/badge.svg)](https://github.com/wanghui5801/tsdss/actions)

TSDSS is a comprehensive Python package for time series analysis and surrogate data generation. It provides a wide range of tools for statistical analysis, preprocessing, feature extraction, and surrogate data generation for both univariate and multivariate time series.

## Features

### Time Series Analysis

- Basic statistics (mean, std, skewness, kurtosis)

- Stationarity tests (ADF test, Ljung-Box test)

- Correlation analysis (Pearson, Spearman, Kendall)

- Spectral analysis

- Nonlinear analysis (Lyapunov exponent, phase space reconstruction)

- Entropy measures
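
Phase space reconstruction is usually done by time-delay embedding. The sketch below is a plain NumPy illustration of that idea, not TSDSS's own routine; the function name `delay_embed` and its `dim` and `tau` parameters are hypothetical.

```python
import numpy as np

def delay_embed(x, dim=3, tau=1):
    """Return the time-delay embedding of a 1D series as an (n_vectors, dim) array."""
    x = np.asarray(x, dtype=float)
    n_vectors = x.size - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    # Column j holds the series shifted by j * tau samples.
    return np.column_stack([x[j * tau : j * tau + n_vectors] for j in range(dim)])

embedded = delay_embed(np.arange(10), dim=3, tau=2)
print(embedded.shape)  # (6, 3)
print(embedded[0])     # [0. 2. 4.]
```

Each row is one reconstructed state vector; quantities such as a Lyapunov exponent are then typically estimated from how neighbouring rows diverge over time.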

### Time Series Preprocessing

- Missing value interpolation

- Outlier detection

- Normalization

- Resampling

- Feature extraction

### Surrogate Data Generation

- IAAFT (Iterative Amplitude Adjusted Fourier Transform)

- IAAFT+ (Enhanced IAAFT)

- IPFT (Iterative Phase-adjusted Fourier Transform)

- AIAAFT (Adaptive IAAFT)

- IAAWT (Iterative Amplitude Adjusted Wavelet Transform)

- Multivariate surrogate methods

- Bootstrap methods
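
For readers unfamiliar with IAAFT, the following is a minimal NumPy sketch of the textbook algorithm: it alternates between imposing the original amplitude spectrum and imposing the original value distribution. It is an illustration only and does not reproduce TSDSS's implementation; `iaaft_sketch` and its parameters are hypothetical names.

```python
import numpy as np

def iaaft_sketch(x, n_iterations=100, seed=None):
    """Generate one IAAFT-style surrogate of a 1D series (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    sorted_x = np.sort(x)                    # target value distribution
    target_amp = np.abs(np.fft.rfft(x))      # target amplitude spectrum
    s = rng.permutation(x)                   # start from a random shuffle
    for _ in range(n_iterations):
        # Step 1: impose the target amplitude spectrum, keep the current phases.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_amp * np.exp(1j * phases), n=x.size)
        # Step 2: impose the original values by rank ordering.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
    return s

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 512)
x = np.sin(2 * np.pi * 0.5 * t) + 0.3 * rng.normal(size=t.size)
surrogate = iaaft_sketch(x, n_iterations=50, seed=1)

# The surrogate is a reordering of the original values.
print(np.array_equal(np.sort(surrogate), np.sort(x)))  # True
```

Because the loop ends on the rank-ordering step, the value distribution is matched exactly while the amplitude spectrum is matched only approximately.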

### Time Series Filtering

Each filter has its own characteristics and use cases:

- **Moving Average Filter**: Simple and effective for reducing random noise

- **Exponential Filter**: Gives more weight to recent data points

- **Savitzky-Golay Filter**: Preserves higher moments of the data while smoothing

- **Kalman Filter**: Recursive state estimator for tracking time-varying signals; optimal when the system is linear with Gaussian noise

- **Butterworth Filter**: Frequency-selective filtering with a maximally flat passband response

- **Median Filter**: Excellent for removing impulse noise and outliers
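
To make the differences between these filters concrete, here is a sketch that applies rough equivalents built from NumPy and SciPy to a noisy sine wave. It does not call TSDSS; the helper names `moving_average` and `exponential_smoothing`, the cutoff, and the window sizes are assumptions chosen for illustration.

```python
import numpy as np
from scipy.signal import butter, filtfilt, medfilt, savgol_filter

def moving_average(x, window_size=5):
    """Centered moving average (edges are zero-padded by np.convolve)."""
    kernel = np.ones(window_size) / window_size
    return np.convolve(x, kernel, mode="same")

def exponential_smoothing(x, alpha=0.3):
    """Simple exponential smoothing: larger alpha weights recent points more."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    out[0] = x[0]
    for i in range(1, x.size):
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 1000)            # roughly 100 samples per time unit
clean = np.sin(2 * np.pi * 0.5 * t)
noisy = clean + 0.5 * rng.normal(size=t.size)

smoothed = {
    "moving average": moving_average(noisy, 5),
    "exponential": exponential_smoothing(noisy, 0.3),
    "savitzky-golay": savgol_filter(noisy, window_length=15, polyorder=3),
    "median": medfilt(noisy, kernel_size=5),
}
# 4th-order low-pass Butterworth with a 2 Hz cutoff, applied forward and backward.
b, a = butter(4, 2.0, btype="low", fs=100.0)
smoothed["butterworth"] = filtfilt(b, a, noisy)

noisy_rmse = np.sqrt(np.mean((noisy - clean) ** 2))
for name, y in smoothed.items():
    rmse = np.sqrt(np.mean((y - clean) ** 2))
    print(f"{name}: RMSE vs. clean signal = {rmse:.3f}")
```

Comparing each RMSE against that of the unfiltered signal (`noisy_rmse`) gives a quick sense of how much noise each method removes on this particular signal; the ranking will differ for other signals and noise types.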

For multivariate time series, the `multivariate_filter` function provides a unified interface to apply any of these filters to each dimension of the data. Key features:

- Supports all single-variable filtering methods

- Maintains correlations between dimensions

- Handles errors gracefully for each dimension

- Preserves the original data structure
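
Applying a univariate filter to each dimension can be pictured as a loop over columns. The sketch below shows that pattern with SciPy's median filter; `apply_per_column` is a hypothetical helper, and how `multivariate_filter` is actually implemented is not shown in this README.

```python
import numpy as np
from scipy.signal import medfilt

def apply_per_column(data, filter_func, **kwargs):
    """Apply a 1D filter to every column of an (n_samples, n_features) array."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]  # treat a 1D series as a single column
    columns = [filter_func(data[:, j], **kwargs) for j in range(data.shape[1])]
    return np.column_stack(columns)

rng = np.random.default_rng(1)
mv_data = rng.normal(size=(200, 3))
mv_filtered = apply_per_column(mv_data, medfilt, kernel_size=5)
print(mv_filtered.shape)  # (200, 3)
```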

## Installation

```bash

pip install tsdss

```

## Input Data Format

TSDSS accepts the following input formats:

- NumPy arrays (1D for univariate, 2D for multivariate)

- Pandas Series (for univariate)

- Pandas DataFrame (for multivariate)

Example shapes:

- Univariate: (n_samples,) or (n_samples, 1)

- Multivariate: (n_samples, n_features)
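
If you want to normalize your own inputs to these shapes before passing them on, a helper along the following lines works. `as_2d_array` is a hypothetical name used for illustration and is not part of TSDSS.

```python
import numpy as np
import pandas as pd

def as_2d_array(data):
    """Convert a NumPy array, Series, or DataFrame to shape (n_samples, n_features)."""
    arr = np.asarray(data, dtype=float)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)  # univariate series becomes one column
    if arr.ndim != 2:
        raise ValueError(f"expected 1D or 2D input, got {arr.ndim}D")
    return arr

print(as_2d_array(np.arange(5)).shape)                    # (5, 1)
print(as_2d_array(pd.Series([1.0, 2.0, 3.0])).shape)      # (3, 1)
print(as_2d_array(pd.DataFrame(np.zeros((4, 2)))).shape)  # (4, 2)
```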

## Quick Start Examples

### Basic Statistics and Analysis

```python
import numpy as np
import pandas as pd
from tsdss import ts_statistics, plot_decomposition, calculate_entropy

# Basic time series statistics
ts = np.random.normal(0, 1, 1000)
stats = ts_statistics(ts)
print(stats)

# Plot time series decomposition
plot_decomposition(ts)

# Calculate entropy
entropy = calculate_entropy(ts)
print(f"Entropy: {entropy}")
```

### Time Series Preprocessing

```python
import numpy as np
import pandas as pd
from tsdss import interpolate_missing, detect_outliers, normalize_ts, resample_ts

# Handle missing values
ts = pd.Series([1, np.nan, 3, np.nan, 5])
ts_clean = interpolate_missing(ts, method='linear')  # Options: linear, ffill, bfill, cubic, spline

# Detect outliers
ts = np.random.normal(0, 1, 1000)
outliers = detect_outliers(ts, method='zscore', threshold=3)  # Options: zscore, iqr, mad

# Normalize data
ts_norm = normalize_ts(ts, method='zscore')  # Options: zscore, minmax, robust

# Resample time series (requires datetime index)
dates = pd.date_range('2023-01-01', periods=100, freq='D')
ts = pd.Series(np.random.randn(100), index=dates)
ts_resampled = resample_ts(ts, freq='W', method='mean')
```
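
The three outlier options named above (`zscore`, `iqr`, `mad`) correspond to well-known rules. The sketch below shows one common formulation of each in NumPy; it is not TSDSS's code, and `outlier_mask`, the 1.5 IQR multiplier, and the 0.6745 scaling constant are conventional choices assumed here.

```python
import numpy as np

def outlier_mask(x, method="zscore", threshold=3.0):
    """Return a boolean mask that is True where a point is flagged as an outlier."""
    x = np.asarray(x, dtype=float)
    if method == "zscore":
        # Distance from the mean in units of standard deviation.
        return np.abs(x - x.mean()) / x.std() > threshold
    if method == "iqr":
        # Tukey's rule with the conventional 1.5 multiplier (threshold unused).
        q1, q3 = np.percentile(x, [25, 75])
        iqr = q3 - q1
        return (x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)
    if method == "mad":
        # Modified z-score based on the median absolute deviation.
        med = np.median(x)
        mad = np.median(np.abs(x - med))
        return 0.6745 * np.abs(x - med) / mad > threshold
    raise ValueError(f"unknown method: {method}")

rng = np.random.default_rng(0)
series = rng.normal(size=500)
series[10], series[20] = 15.0, -12.0  # inject two obvious outliers

for method in ("zscore", "iqr", "mad"):
    mask = outlier_mask(series, method=method)
    print(method, "flags", int(mask.sum()), "points")
```

The z-score rule is itself distorted by large outliers because they inflate the mean and standard deviation; the IQR and MAD rules rely on quantiles and are more robust in that respect.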

### Feature Extraction

```python
import numpy as np
from tsdss import extract_time_features, extract_freq_features

# Extract time domain features
ts = np.random.normal(0, 1, 1000)
time_features = extract_time_features(ts)
print("Time domain features:", time_features)

# Extract frequency domain features
freq_features = extract_freq_features(ts)
print("Frequency domain features:", freq_features)
```

### Correlation Analysis

```python
import numpy as np
from tsdss import mutual_information, kendall_correlation

# Calculate mutual information
x = np.random.normal(0, 1, 1000)
y = 0.5 * x + np.random.normal(0, 1, 1000)
mi = mutual_information(x, y)
print(f"Mutual Information: {mi}")

# Calculate Kendall correlation
kendall = kendall_correlation(x, y)
print(f"Kendall Correlation: {kendall}")
```
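
As a point of comparison, the same two quantities can be approximated with NumPy and SciPy. The histogram estimator below is one simple way to estimate mutual information and may differ from the estimator TSDSS uses; `mutual_information_hist` and the bin count are assumptions.

```python
import numpy as np
from scipy.stats import kendalltau

def mutual_information_hist(x, y, bins=10):
    """Histogram-based mutual information estimate in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    nz = pxy > 0                             # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y_dep = 0.5 * x + rng.normal(size=5000)   # depends on x
y_ind = rng.normal(size=5000)             # independent of x

mi_dep = mutual_information_hist(x, y_dep)
mi_ind = mutual_information_hist(x, y_ind)
tau, p_value = kendalltau(x, y_dep)

print(f"MI (dependent):   {mi_dep:.3f}")
print(f"MI (independent): {mi_ind:.3f}")
print(f"Kendall tau:      {tau:.3f}")
```

Histogram estimates are biased upward for small samples or many bins, so the independent pair will usually give a small positive value instead of exactly zero.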

### Surrogate Data Generation

```python
import numpy as np
from tsdss import (
    iaaft, iaaft_plus, ipft, aiaaft,
    multivariate_iaaft, block_bootstrap,
    stationary_bootstrap
)

# Generate univariate surrogate data
ts = np.random.normal(0, 1, 1000)

# IAAFT method
surrogate_iaaft = iaaft(ts, n_iterations=1000, num_surrogates=1)[0]

# IAAFT+ method
surrogate_iaaft_plus = iaaft_plus(ts, n_iterations=1000, num_surrogates=1)[0]

# IPFT method
surrogate_ipft = ipft(ts, n_iterations=1000, num_surrogates=1)[0]

# Generate multivariate surrogate data
data = np.random.normal(0, 1, (1000, 3))  # 3-dimensional time series
mv_surrogate = multivariate_iaaft(data, max_iter=100, num_surrogates=1)[0]

# Bootstrap methods
block_samples = block_bootstrap(ts, block_length=50, num_bootstrap=100)
stat_samples = stationary_bootstrap(ts, mean_block_length=50, num_bootstrap=100)
```

### Wavelet Analysis

```python
import numpy as np
from tsdss import dwt, idwt, iaawt

# Perform discrete wavelet transform
ts = np.random.normal(0, 1, 1024)  # Length should be a power of 2
coeffs = dwt(ts, level=3)

# Perform inverse wavelet transform
reconstructed = idwt(coeffs)

# Generate wavelet-based surrogate
surrogate = iaawt(ts, n_iterations=1000, num_surrogates=1)[0]
```
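
The power-of-2 length requirement is easiest to see with the Haar wavelet, where each level halves the series. The sketch below is a self-contained Haar transform in NumPy; which wavelet TSDSS's `dwt` uses is not stated here, so `haar_dwt` and `haar_idwt` are hypothetical stand-ins.

```python
import numpy as np

def haar_dwt(x, level=3):
    """Multi-level Haar DWT: returns (approximation, [detail_1, ..., detail_level])."""
    approx = np.asarray(x, dtype=float)
    if approx.size % (2 ** level) != 0:
        raise ValueError("length must be divisible by 2**level")
    details = []
    for _ in range(level):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))  # high-pass: local differences
        approx = (even + odd) / np.sqrt(2)         # low-pass: local averages
    return approx, details

def haar_idwt(approx, details):
    """Invert haar_dwt by undoing the levels from coarsest to finest."""
    for d in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2)
        out[1::2] = (approx - d) / np.sqrt(2)
        approx = out
    return approx

rng = np.random.default_rng(0)
signal = rng.normal(size=1024)
approx, details = haar_dwt(signal, level=3)
restored = haar_idwt(approx, details)

print(approx.size)                    # 128
print(np.allclose(restored, signal))  # True
```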

### Advanced Multivariate Analysis

```python
import numpy as np
from tsdss import (
    mvts_surrogate_s_transform,
    mvts_surrogate_wavelet,
    mvts_surrogate_pca,
    copula_surrogate
)

# Generate multivariate data
data = np.random.normal(0, 1, (1000, 5))

# Different multivariate surrogate methods
surrogate_st = mvts_surrogate_s_transform(data, num_surrogates=1)[0]
surrogate_wavelet = mvts_surrogate_wavelet(data, num_surrogates=1)[0]
surrogate_pca = mvts_surrogate_pca(data, num_surrogates=1)[0]
surrogate_copula = copula_surrogate(data, num_surrogates=1)[0]
```

### Bootstrap Methods

```python
import numpy as np
import pandas as pd
from tsdss import block_bootstrap, stationary_bootstrap

# 1. Block Bootstrap
# Fixed block length, suitable for data with strong local dependencies
ts = np.random.normal(0, 1, 1000)
block_samples = block_bootstrap(
    data=ts,
    block_length=50,  # Fixed block length
    num_bootstrap=100
)

# 2. Stationary Bootstrap
# Random block length (geometric distribution), preserves stationarity
stat_samples = stationary_bootstrap(
    data=ts,
    mean_block_length=50,  # Average block length
    num_bootstrap=100
)

# Compare the two methods
print("Block Bootstrap first sample:", block_samples[0][:10])
print("Stationary Bootstrap first sample:", stat_samples[0][:10])

# Using with pandas Series
ts_series = pd.Series(ts)
block_samples_pd = block_bootstrap(ts_series, block_length=50, num_bootstrap=100)
stat_samples_pd = stationary_bootstrap(ts_series, mean_block_length=50, num_bootstrap=100)

# Key differences:
# 1. Block Bootstrap: Uses fixed block length
# 2. Stationary Bootstrap: Uses random block length (geometric distribution)
#    - Better preserves stationarity
#    - More suitable for time series with varying dependence structures
```
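
The difference between fixed and geometric block lengths can be seen in a few lines of NumPy. The sketch below follows the standard moving-block and stationary bootstrap constructions and is not taken from TSDSS; the `_sketch` function names are hypothetical.

```python
import numpy as np

def block_bootstrap_sketch(x, block_length, rng):
    """One moving-block bootstrap sample: concatenate fixed-length blocks."""
    x = np.asarray(x)
    n = x.size
    n_blocks = -(-n // block_length)  # ceiling division
    starts = rng.integers(0, n - block_length + 1, size=n_blocks)
    sample = np.concatenate([x[s : s + block_length] for s in starts])
    return sample[:n]  # trim to the original length

def stationary_bootstrap_sketch(x, mean_block_length, rng):
    """One stationary bootstrap sample: block lengths are geometric on average."""
    x = np.asarray(x)
    n = x.size
    p = 1.0 / mean_block_length  # probability of starting a new block
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    for t in range(1, n):
        if rng.random() < p:
            idx[t] = rng.integers(n)         # jump to a new random start
        else:
            idx[t] = (idx[t - 1] + 1) % n    # continue the block, wrapping around
    return x[idx]

rng = np.random.default_rng(0)
series = np.arange(100)  # indices make the block structure visible
block_sample = block_bootstrap_sketch(series, block_length=10, rng=rng)
stat_sample = stationary_bootstrap_sketch(series, mean_block_length=10, rng=rng)

print(block_sample[:20])
print(stat_sample[:20])
```

In the printed output, the block sample changes position every 10 values, while the stationary sample changes at random points that are about 10 values apart on average.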

### Time Series Filtering

```python
from tsdss import (
    moving_average_filter,
    exponential_filter,
    savitzky_golay_filter,
    kalman_filter,
    butterworth_filter,
    median_filter,
    multivariate_filter
)
import numpy as np
import matplotlib.pyplot as plt

# 1. Univariate Filtering Example
t = np.linspace(0, 10, 1000)
noisy_signal = np.sin(2*np.pi*0.5*t) + 0.5*np.random.normal(0, 1, 1000)

# Apply different filters
ma_filtered = moving_average_filter(noisy_signal, window_size=5)
ema_filtered = exponential_filter(noisy_signal, alpha=0.3)
sg_filtered = savitzky_golay_filter(noisy_signal, window_size=15, poly_order=3)
kalman_filtered = kalman_filter(noisy_signal, Q=1e-5, R=1e-2)

# 2. Multivariate Filtering Example
# Generate sample multivariate data
mv_data = np.column_stack([
    np.sin(2*np.pi*0.5*t) + 0.5*np.random.normal(0, 1, 1000),
    np.cos(2*np.pi*0.3*t) + 0.3*np.random.normal(0, 1, 1000),
    0.5*t + np.random.normal(0, 0.2, 1000)
])

# Apply multivariate filter
mv_filtered = multivariate_filter(
    mv_data,
    filter_type='kalman',
    Q=1e-5,
    R=1e-2
)

# You can also try different filter types
mv_ma = multivariate_filter(mv_data, filter_type='ma', window_size=5)
mv_butter = multivariate_filter(
    mv_data,
    filter_type='butter',
    cutoff=0.1,
    fs=100
)
```

## Performance

The package uses optimized C++ implementations for core computations:

- Trend decomposition

- Skewness and kurtosis calculation

- ACF computation

- Ljung-Box test
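
For reference, the ACF and the Ljung-Box statistic are straightforward to write in pure Python. The sketch below uses NumPy and SciPy to show what is being computed; it is not the package's C++ code, and the function names `acf` and `ljung_box` are assumptions.

```python
import numpy as np
from scipy.stats import chi2

def acf(x, n_lags):
    """Sample autocorrelation for lags 0..n_lags."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: x.size - k], x[k:]) / denom for k in range(n_lags + 1)])

def ljung_box(x, n_lags=10):
    """Ljung-Box Q statistic and its chi-squared p-value."""
    n = len(x)
    rho = acf(x, n_lags)[1:]                   # drop lag 0
    lags = np.arange(1, n_lags + 1)
    q = n * (n + 2) * np.sum(rho ** 2 / (n - lags))
    p_value = chi2.sf(q, df=n_lags)            # upper-tail probability
    return q, p_value

# AR(1) series with coefficient 0.8: strongly autocorrelated.
rng = np.random.default_rng(0)
noise = rng.normal(size=500)
ar1 = np.empty(500)
ar1[0] = noise[0]
for i in range(1, 500):
    ar1[i] = 0.8 * ar1[i - 1] + noise[i]

q_stat, p_val = ljung_box(ar1, n_lags=10)
print(f"Q = {q_stat:.1f}, p-value = {p_val:.3g}")
```

A small p-value rejects the hypothesis that the series has no autocorrelation up to the chosen lag, which is the expected outcome for the AR(1) series above.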

## Requirements

- Python >= 3.7

- NumPy >= 1.19.0

- Pandas >= 1.0.0

- SciPy >= 1.6.0

- Statsmodels >= 0.13.0

- Scikit-learn >= 0.24.0

- Matplotlib >= 3.0.0

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Version History

### 0.2.0

- Added comprehensive time series filtering functionality

- Added multivariate filtering support

- Improved documentation and examples

- Bug fixes and performance improvements

### 0.1.0

- Initial release