https://github.com/wanghui5801/tsdss
Time series analysis tools.
https://github.com/wanghui5801/tsdss
filtering nonlinear signal-processing surrogate timeseries
Last synced: 8 months ago
JSON representation
Time series analysis tools.
- Host: GitHub
- URL: https://github.com/wanghui5801/tsdss
- Owner: wanghui5801
- License: mit
- Created: 2024-11-18T04:36:11.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-11-21T17:41:55.000Z (over 1 year ago)
- Last Synced: 2025-09-13T12:17:53.232Z (9 months ago)
- Topics: filtering, nonlinear, signal-processing, surrogate, timeseries
- Language: Python
- Homepage:
- Size: 139 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# TSDSS 📊 🔮 📈
[](https://badge.fury.io/py/tsdss)
[](https://pypi.org/project/tsdss/)
[](https://opensource.org/licenses/MIT)
[](https://pepy.tech/project/tsdss)
[](https://github.com/wanghui5801/tsdss/actions)
TSDSS is a comprehensive Python package for time series analysis and surrogate data generation. It provides a wide range of tools for statistical analysis, preprocessing, feature extraction, and surrogate data generation for both univariate and multivariate time series.
## Features
### Time Series Analysis
- Basic statistics (mean, std, skewness, kurtosis)
- Stationarity tests (ADF test, Ljung-Box test)
- Correlation analysis (Pearson, Spearman, Kendall)
- Spectral analysis
- Nonlinear analysis (Lyapunov exponent, phase space reconstruction)
- Entropy measures
### Time Series Preprocessing
- Missing value interpolation
- Outlier detection
- Normalization
- Resampling
- Feature extraction
### Surrogate Data Generation
- IAAFT (Iterative Amplitude Adjusted Fourier Transform)
- IAAFT+ (Enhanced IAAFT)
- IPFT (Iterative Phase-adjusted Fourier Transform)
- AIAAFT (Adaptive IAAFT)
- IAAWT (Iterative Amplitude Adjusted Wavelet Transform)
- Multivariate surrogate methods
- Bootstrap methods
### Time Series Filtering
Each filter has its own characteristics and use cases:
- **Moving Average Filter**: Simple and effective for reducing random noise
- **Exponential Filter**: Gives more weight to recent data points
- **Savitzky-Golay Filter**: Preserves higher moments of the data while smoothing
- **Kalman Filter**: Optimal for tracking time-varying signals
- **Butterworth Filter**: Frequency domain filtering with flat response
- **Median Filter**: Excellent for removing impulse noise and outliers
For multivariate time series, the `multivariate_filter` function provides a unified interface to apply any of these filters to each dimension of the data. Key features:
- Supports all single-variable filtering methods
- Maintains correlations between dimensions
- Handles errors gracefully for each dimension
- Preserves the original data structure
## Installation
```bash
pip install tsdss
```
## Input Data Format
TSDSS accepts the following input formats:
- NumPy arrays (1D for univariate, 2D for multivariate)
- Pandas Series (for univariate)
- Pandas DataFrame (for multivariate)
Example shapes:
- Univariate: (n_samples,) or (n_samples, 1)
- Multivariate: (n_samples, n_features)
## Quick Start Examples
### Basic Statistics and Analysis
```python
import numpy as np
import pandas as pd
from tsdss import ts_statistics, plot_decomposition, calculate_entropy
# Basic time series statistics
ts = np.random.normal(0, 1, 1000)
stats = ts_statistics(ts)
print(stats)
# Plot time series decomposition
plot_decomposition(ts)
# Calculate entropy
entropy = calculate_entropy(ts)
print(f"Entropy: {entropy}")
```
### Time Series Preprocessing
```python
from tsdss import interpolate_missing, detect_outliers, normalize_ts, resample_ts
# Handle missing values
ts = pd.Series([1, np.nan, 3, np.nan, 5])
ts_clean = interpolate_missing(ts, method='linear') # Options: linear, ffill, bfill, cubic, spline
# Detect outliers
ts = np.random.normal(0, 1, 1000)
outliers = detect_outliers(ts, method='zscore', threshold=3) # Options: zscore, iqr, mad
# Normalize data
ts_norm = normalize_ts(ts, method='zscore') # Options: zscore, minmax, robust
# Resample time series (requires datetime index)
dates = pd.date_range('2023-01-01', periods=100, freq='D')
ts = pd.Series(np.random.randn(100), index=dates)
ts_resampled = resample_ts(ts, freq='W', method='mean')
```
### Feature Extraction
```python
from tsdss import extract_time_features, extract_freq_features
# Extract time domain features
ts = np.random.normal(0, 1, 1000)
time_features = extract_time_features(ts)
print("Time domain features:", time_features)
# Extract frequency domain features
freq_features = extract_freq_features(ts)
print("Frequency domain features:", freq_features)
```
### Correlation Analysis
```python
from tsdss import mutual_information, kendall_correlation
# Calculate mutual information
x = np.random.normal(0, 1, 1000)
y = 0.5 * x + np.random.normal(0, 1, 1000)
mi = mutual_information(x, y)
print(f"Mutual Information: {mi}")
# Calculate Kendall correlation
kendall = kendall_correlation(x, y)
print(f"Kendall Correlation: {kendall}")
```
### Surrogate Data Generation
```python
from tsdss import (
iaaft, iaaft_plus, ipft, aiaaft,
multivariate_iaaft, block_bootstrap,
stationary_bootstrap
)
# Generate univariate surrogate data
ts = np.random.normal(0, 1, 1000)
# IAAFT method
surrogate_iaaft = iaaft(ts, n_iterations=1000, num_surrogates=1)[0]
# IAAFT+ method
surrogate_iaaft_plus = iaaft_plus(ts, n_iterations=1000, num_surrogates=1)[0]
# IPFT method
surrogate_ipft = ipft(ts, n_iterations=1000, num_surrogates=1)[0]
# Generate multivariate surrogate data
data = np.random.normal(0, 1, (1000, 3)) # 3-dimensional time series
mv_surrogate = multivariate_iaaft(data, max_iter=100, num_surrogates=1)[0]
# Bootstrap methods
block_samples = block_bootstrap(ts, block_length=50, num_bootstrap=100)
stat_samples = stationary_bootstrap(ts, mean_block_length=50, num_bootstrap=100)
```
### Wavelet Analysis
```python
from tsdss import dwt, idwt, iaawt
# Perform discrete wavelet transform
ts = np.random.normal(0, 1, 1024) # Length should be power of 2
coeffs = dwt(ts, level=3)
# Perform inverse wavelet transform
reconstructed = idwt(coeffs)
# Generate wavelet-based surrogate
surrogate = iaawt(ts, n_iterations=1000, num_surrogates=1)[0]
```
### Advanced Multivariate Analysis
```python
from tsdss import (
mvts_surrogate_s_transform,
mvts_surrogate_wavelet,
mvts_surrogate_pca,
copula_surrogate
)
# Generate multivariate data
data = np.random.normal(0, 1, (1000, 5))
# Different multivariate surrogate methods
surrogate_st = mvts_surrogate_s_transform(data, num_surrogates=1)[0]
surrogate_wavelet = mvts_surrogate_wavelet(data, num_surrogates=1)[0]
surrogate_pca = mvts_surrogate_pca(data, num_surrogates=1)[0]
surrogate_copula = copula_surrogate(data, num_surrogates=1)[0]
```
### Bootstrap Methods
```python
from tsdss import block_bootstrap, stationary_bootstrap
# 1. Block Bootstrap
# Fixed block length, suitable for data with strong local dependencies
ts = np.random.normal(0, 1, 1000)
block_samples = block_bootstrap(
data=ts,
block_length=50, # Fixed block length
num_bootstrap=100
)
# 2. Stationary Bootstrap
# Random block length (geometric distribution), preserves stationarity
stat_samples = stationary_bootstrap(
data=ts,
mean_block_length=50, # Average block length
num_bootstrap=100
)
# Compare the two methods
print("Block Bootstrap first sample:", block_samples[0][:10])
print("Stationary Bootstrap first sample:", stat_samples[0][:10])
# Using with pandas Series
ts_series = pd.Series(ts)
block_samples_pd = block_bootstrap(ts_series, block_length=50, num_bootstrap=100)
stat_samples_pd = stationary_bootstrap(ts_series, mean_block_length=50, num_bootstrap=100)
# Key differences:
# 1. Block Bootstrap: Uses fixed block length
# 2. Stationary Bootstrap: Uses random block length (geometric distribution)
# - Better preserves stationarity
# - More suitable for time series with varying dependence structures
```
### Time Series Filtering
```python
from tsdss import (
moving_average_filter,
exponential_filter,
savitzky_golay_filter,
kalman_filter,
butterworth_filter,
median_filter,
multivariate_filter
)
import numpy as np
import matplotlib.pyplot as plt
# 1. Univariate Filtering Example
t = np.linspace(0, 10, 1000)
noisy_signal = np.sin(2*np.pi*0.5*t) + 0.5*np.random.normal(0, 1, 1000)
# Apply different filters
ma_filtered = moving_average_filter(noisy_signal, window_size=5)
ema_filtered = exponential_filter(noisy_signal, alpha=0.3)
sg_filtered = savitzky_golay_filter(noisy_signal, window_size=15, poly_order=3)
kalman_filtered = kalman_filter(noisy_signal, Q=1e-5, R=1e-2)
# 2. Multivariate Filtering Example
# Generate sample multivariate data
mv_data = np.column_stack([
np.sin(2*np.pi*0.5*t) + 0.5*np.random.normal(0, 1, 1000),
np.cos(2*np.pi*0.3*t) + 0.3*np.random.normal(0, 1, 1000),
0.5*t + np.random.normal(0, 0.2, 1000)
])
# Apply multivariate filter
mv_filtered = multivariate_filter(
mv_data,
filter_type='kalman',
Q=1e-5,
R=1e-2
)
# You can also try different filter types
mv_ma = multivariate_filter(mv_data, filter_type='ma', window_size=5)
mv_butter = multivariate_filter(
mv_data,
filter_type='butter',
cutoff=0.1,
fs=100
)
```
## Performance
The package uses optimized C++ implementations for core computations:
- Trend decomposition
- Skewness and kurtosis calculation
- ACF computation
- Ljung-Box test
## Requirements
- Python >= 3.7
- NumPy >= 1.19.0
- Pandas >= 1.0.0
- SciPy >= 1.6.0
- Statsmodels >= 0.13.0
- Scikit-learn >= 0.24.0
- Matplotlib >= 3.0.0
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Version History
### 0.2.0
- Added comprehensive time series filtering functionality
- Added multivariate filtering support
- Improved documentation and examples
- Bug fixes and performance improvements
### 0.1.0
- Initial release