{"id":35492669,"url":"https://github.com/vivek-v-rao/cpp-dataframe","last_synced_at":"2026-01-04T17:01:26.586Z","repository":{"id":330995780,"uuid":"1124779884","full_name":"vivek-v-rao/cpp-dataframe","owner":"vivek-v-rao","description":"C++17 dataframe featuring CSV/binary I/O, date/datetime indices, arithmetic transforms,  statistics (including rolling windows and multiple correlation types), and random data generators","archived":false,"fork":false,"pushed_at":"2025-12-29T16:48:15.000Z","size":632,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-01T20:14:17.901Z","etag":null,"topics":["cpp","cpp17","dataframe","statistics","time-series","time-series-analysis"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vivek-v-rao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-29T15:49:17.000Z","updated_at":"2025-12-30T18:55:25.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/vivek-v-rao/cpp-dataframe","commit_stats":null,"previous_names":["vivek-v-rao/cpp-dataframe"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/vivek-v-rao/cpp-dataframe","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vivek-v-rao%2Fcpp-dataframe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vivek-v-r
ao%2Fcpp-dataframe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vivek-v-rao%2Fcpp-dataframe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vivek-v-rao%2Fcpp-dataframe/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vivek-v-rao","download_url":"https://codeload.github.com/vivek-v-rao/cpp-dataframe/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vivek-v-rao%2Fcpp-dataframe/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28206344,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2026-01-04T02:00:06.065Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpp","cpp17","dataframe","statistics","time-series","time-series-analysis"],"created_at":"2026-01-03T16:11:56.229Z","updated_at":"2026-01-04T17:01:26.579Z","avatar_url":"https://github.com/vivek-v-rao.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# C++ DataFrame Library\r\n\r\nA header-only (with small cpp helpers) C++17 implementation of a DataFrame that mimics some features familiar from Python's pandas—CSV/binary I/O, typed indices (including dates/datetimes), arithmetic transformations, statistics, rolling windows, correlations, sampling, and more. 
This repository also contains multiple sample programs demonstrating common workflows (basic loading, arithmetic, statistics, indexing, I/O, custom construction, intraday data).\r\n\r\n## Motivation\r\n\r\nC++ lacks a convenient, flexible in-memory tabular structure. Most teams end up reinventing custom row/column utilities or fall back to Python/R for data munging even when their core systems are C++. This project aims to provide a pragmatic DataFrame with the following goals:\r\n\r\n- **Native C++17**: no external dependencies apart from the standard library.\r\n- **Typed indices**: indices can be integers, strings, `Date`, `DateTime`, etc., with custom parsing/formatting.\r\n- **Rich operations**: arithmetic, transforms, rolling statistics, correlations, resampling, selection, sorting.\r\n- **Simple I/O**: read/write CSVs, binary snapshots, contiguous buffers for hand-off to other libraries.\r\n- **Testable examples**: multiple `x_*` sample apps show end-to-end usage.\r\n\r\n## Key Features\r\n\r\n- **Construction \u0026 I/O**\r\n  - `from_csv`, `from_vectors`, `random_normal`, `random_uniform`, `from_binary`, `from_binary_file`.\r\n  - `to_csv`, `to_csv_file`, `to_binary`, `to_binary_file`, `to_row_major`, `to_column_major`.\r\n- **Index support**\r\n  - Template `DataFrame\u003cIndexT\u003e` with built-in `Date` and `DateTime` helpers and parsing/formatting.\r\n  - Selection: `select_rows`, `slice_rows_range`, `head/tail`, `sort_rows_by_column`, `sort_columns_by_row`.\r\n- **Column operations**\r\n  - Arithmetic (`add`, `subtract`, `multiply`, `divide`), log/exp, power, normalization, standardization, scaling by scalars or other frames.\r\n  - `add_column` for derived series.\r\n- **Statistics \u0026 Analytics**\r\n  - Column stats, summary with missing-data info, percentiles, rolling mean/std/rms, EMA, correlations (Pearson, Spearman, Kendall), covariance.\r\n  - Resampling, NaN removal, random resampling, random-data generators (normal with 
optional correlation, uniform).\r\n- **Printing utilities**\r\n  - `print_frame`, column summaries, percentiles, autocorrelations.\r\n  - Sample programs (`x_basic`, `x_arithmetic`, `x_stats`, `x_indexing`, `x_io`, `x_construct`, `x_intraday`) cover different use cases.\r\n\r\n## Sample Programs\r\n\r\nAll executables build via `make`. `make run` executes:\r\n\r\n| Program        | Description |\r\n|----------------|-------------|\r\n| `df_demo`      | Comprehensive tour: CSV load, returns, stats, rolling metrics, correlations, binary I/O, random data, percentiles. |\r\n| `x_basic`      | Load prices and print shapes/head/tail. |\r\n| `x_arithmetic` | Scalar and element-wise arithmetic/log/exp transforms. |\r\n| `x_stats`      | Returns, summary stats, correlations, rolling stats. |\r\n| `x_indexing`   | Row slicing, selection, sorting. |\r\n| `x_io`         | CSV/binary round trip, contiguous buffer export. |\r\n| `x_construct`  | Build frames from vectors and add columns. |\r\n| `x_intraday`   | Intraday datetime indices, sorting, rolling mean. |\r\n\r\n## Limitations / Future Work\r\n\r\n- **Group-by / joins**: Not yet implemented; all operations treat columns independently. Users needing relational joins or aggregations must implement them manually or extend the library.\r\n- **Type coverage**: Columns are `double` only. 
Adding string or integer data columns would require significant rework.\r\n- **Performance**: Current storage is `std::vector\u003cstd::vector\u003cdouble\u003e\u003e`; heavy numeric workloads might prefer contiguous storage and SIMD-friendly operations.\r\n- **Error handling**: Many functions throw `std::runtime_error` for invalid input; there is no soft error mode.\r\n- **Thread safety**: No synchronization primitives; users must guard access if using from multiple threads.\r\n- **Binary format**: Custom, undocumented beyond code comments; subject to change.\r\n- **Dependencies**: Standard library only means no GPU/BLAS acceleration; integration with third-party libraries could be added.\r\n\r\n## Building\r\n\r\n```bash\r\n# GCC (default)\r\nmake            # build df_demo + samples\r\nmake run        # run df_demo + samples sequentially\r\n\r\n# Clang\r\nmake -f Makefile.clang\r\nmake -f Makefile.clang run\r\n\r\n# MSVC (cl.exe in PATH)\r\nmake -f Makefile.msvc\r\nmake -f Makefile.msvc run\r\n\r\n./df_demo       # run the main demo manually\r\n```\r\n\r\nEnsure the CSV inputs (e.g., `prices_2000_on.csv`, `SPY_intraday.csv`) are in the working directory.\r\n\r\n## Contributing\r\n\r\nIssues and pull requests are welcome.\r\n\r\n## License\r\n\r\nMIT License.\r\n\r\n## Sample Output\r\n\r\nHere is the output of the sample programs compiled with g++.\r\n\r\n```\r\n--- running df_demo ---\r\nloaded prices dataframe with 6536 rows and 10 columns\r\n\r\nprice data\r\n        Date          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n  2000-01-03          nan    91.617100          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-04          nan    88.034200          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-05          nan    88.191800          nan          nan        
  nan          nan          nan          nan          nan          nan\r\n  2000-01-06          nan    86.774400          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-07          nan    91.813900          nan          nan          nan          nan          nan          nan          nan          nan\r\n...\r\n  2025-12-19   335.269000   680.590000    95.460000    53.720000    96.270000    87.550000    96.240000    82.760000   110.130000    80.360000\r\n  2025-12-22   337.600000   684.830000    95.700000    54.010000    96.310000    87.360000    96.140000    82.720000   110.110000    80.430000\r\n  2025-12-23   338.720000   687.960000    96.290000    54.310000    96.260000    87.500000    96.100000    82.680000   110.220000    80.490000\r\n  2025-12-24   339.880000   690.380000    96.410000    54.420000    96.550000    88.030000    96.350000    82.730000   110.650000    80.640000\r\n  2025-12-26   339.670000   690.310000    96.570000    54.800000    96.540000    87.740000    96.440000    82.790000   110.640000    80.600000\r\n\r\nreturn scaling factor: 100\r\n\r\ncomputed simple returns (proportional changes)\r\n\r\nreturns\r\n        Date          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n  2000-01-04          nan    -3.910733          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-05          nan     0.179021          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-06          nan    -1.607179          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-07          nan     5.807588          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-10          nan     0.343085          nan          
nan          nan          nan          nan          nan          nan          nan\r\n...\r\n  2025-12-19     0.891223     0.906347     0.674963     0.977444    -0.048797    -0.373245    -0.224971    -0.018121    -0.257214    -0.048508\r\n  2025-12-22     0.695263     0.622989     0.251414     0.539836     0.041550    -0.217019    -0.103907    -0.048333    -0.018160     0.087108\r\n  2025-12-23     0.331754     0.457048     0.616510     0.555453    -0.051916     0.160256    -0.041606    -0.048356     0.099900     0.074599\r\n  2025-12-24     0.342466     0.351765     0.124624     0.202541     0.301267     0.605714     0.260146     0.060474     0.390129     0.186359\r\n  2025-12-26    -0.061787    -0.010139     0.165958     0.698273    -0.010357    -0.329433     0.093409     0.072525 -9.037506e-03    -0.049603\r\n\r\nreturn statistics\r\n   statistic          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n           n         6169         6535         6119         5713         4533         5891         5891         5891         5891         4709\r\n      median       0.0815       0.0693       0.0734       0.0958       0.0356       0.0517       0.0238       0.0000       0.0411       0.0319\r\n        mean       0.0437       0.0384       0.0327       0.0499       0.0206       0.0189       0.0154   7.9134e-03       0.0198       0.0217\r\n          sd       1.2084       1.2219       1.3128       1.7051       0.6834       0.9060       0.4307       0.0960       0.5286       0.6936\r\n        skew      -0.0994       0.0412      -0.0294       0.5021      -1.6047       0.0545       0.0818       0.2979       0.0122       0.7469\r\n ex_kurtosis      11.1870      12.0317      13.1951      18.4334      35.4483       3.4569       2.5309       7.1314      57.7707      42.2739\r\n         min     -11.3808     -10.9424     -11.1630     -16.1662     -10.1104      -6.6683      -2.5073      -0.6567      -9.1110 
     -8.0974\r\n         max      12.8299      14.5197      15.8874      22.7701       6.7298       7.5195       3.4263       0.9975       9.7678      12.2690\r\n\r\nreturn summary with missing data\r\n      column   first_idx    last_idx             n        median          mean            sd          skew       ex_kurt           min           max\r\n         VTI  2001-06-18  2025-12-26          6169        0.0815        0.0437        1.2084       -0.0994       11.1870      -11.3808       12.8299\r\n         SPY  2000-01-04  2025-12-26          6535        0.0693        0.0384        1.2219        0.0412       12.0317      -10.9424       14.5197\r\n         EFA  2001-08-28  2025-12-26          6119        0.0734        0.0327        1.3128       -0.0294       13.1951      -11.1630       15.8874\r\n         EEM  2003-04-15  2025-12-26          5713        0.0958        0.0499        1.7051        0.5021       18.4334      -16.1662       22.7701\r\n         EMB  2007-12-20  2025-12-26          4533        0.0356        0.0206        0.6834       -1.6047       35.4483      -10.1104        6.7298\r\n         TLT  2002-07-31  2025-12-26          5891        0.0517        0.0189        0.9060        0.0545        3.4569       -6.6683        7.5195\r\n         IEF  2002-07-31  2025-12-26          5891        0.0238        0.0154        0.4307        0.0818        2.5309       -2.5073        3.4263\r\n         SHY  2002-07-31  2025-12-26          5891        0.0000        0.0079        0.0960        0.2979        7.1314       -0.6567        0.9975\r\n         LQD  2002-07-31  2025-12-26          5891        0.0411        0.0198        0.5286        0.0122       57.7707       -9.1110        9.7678\r\n         HYG  2007-04-12  2025-12-26          4709        0.0319        0.0217        0.6936        0.7469       42.2739       -8.0974       12.2690\r\n\r\nreturn percentiles\r\n  percentile          VTI          SPY          EFA          EEM          EMB          TLT          
IEF          SHY          LQD          HYG\r\n           0     -11.3808     -10.9424     -11.1630     -16.1662     -10.1104      -6.6683      -2.5073      -0.6567      -9.1110      -8.0974\r\n           1      -3.3743      -3.4067      -3.6176      -4.6095      -1.7526      -2.2932      -1.1116      -0.2538      -1.2139      -1.9958\r\n           5      -1.8150      -1.9025      -1.9351      -2.3772      -0.8258      -1.4526      -0.6793      -0.1316      -0.7004      -0.8363\r\n          25      -0.4428      -0.4596      -0.5514      -0.7309      -0.2157      -0.5124      -0.2433      -0.0358      -0.2153      -0.1913\r\n          50       0.0815       0.0693       0.0734       0.0958       0.0356       0.0517       0.0238       0.0000       0.0411       0.0319\r\n          75       0.5980       0.5992       0.6649       0.8551       0.2755       0.5461       0.2684       0.0484       0.2667       0.2503\r\n          95       1.7093       1.7258       1.7947       2.2937       0.8211       1.4115       0.6765       0.1606       0.6702       0.8589\r\n          99       3.2556       3.3479       3.4150       4.4955       1.8954       2.3580       1.1482       0.2846       1.2054       1.8140\r\n         100      12.8299      14.5197      15.8874      22.7701       6.7298       7.5195       3.4263       0.9975       9.7678      12.2690\r\n\r\nrow completeness for returns\r\nrows with complete data: 4533\r\nfirst complete index: 2007-12-20\r\nlast complete index: 2025-12-26\r\n\r\nreturn autocorrelations\r\n         lag         VTI         SPY         EFA         EEM         EMB         TLT         IEF         SHY         LQD         HYG\r\n           1      -0.085      -0.088      -0.099      -0.111       0.092      -0.027      -0.027      -0.070      -0.001       0.001\r\n           2       0.001      -0.020       0.004      -0.036       0.058      -0.050      -0.042      -0.026      -0.052       0.013\r\n           3       0.009      -0.004       0.006       0.030 
     -0.002      -0.028      -0.004      -0.018       0.005      -0.064\r\n           4      -0.041      -0.026      -0.015      -0.042       0.021       0.005       0.015       0.012      -0.036       0.005\r\n           5      -0.016      -0.013      -0.015      -0.010      -0.017      -0.004      -0.012       0.011       0.018       0.007\r\n\r\nbootstrapped return autocorrelations\r\n         lag         VTI         SPY         EFA         EEM         EMB         TLT         IEF         SHY         LQD         HYG\r\n           1       0.005       0.001       0.002      -0.009       0.012       0.007       0.028       0.029       0.006      -0.008\r\n           2      -0.021      -0.014      -0.021      -0.025      -0.004       0.006       0.006      -0.008       0.007      -0.005\r\n           3      -0.015      -0.023      -0.019      -0.013       0.005       0.007       0.016       0.019       0.003       0.017\r\n           4      -0.002       0.013       0.022       0.022       0.009       0.007      -0.007       0.006       0.009       0.000\r\n           5       0.007      -0.002      -0.010      -0.013       0.019      -0.015      -0.009      -0.003       0.012       0.017\r\n\r\nreturn correlation matrix\r\n      column          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n         VTI        1.000        0.994        0.885        0.826        0.406       -0.314       -0.297       -0.224        0.204        0.689\r\n         SPY        0.994        1.000        0.887        0.826        0.393       -0.316       -0.300       -0.227        0.211        0.691\r\n         EFA        0.885        0.887        1.000        0.878        0.398       -0.300       -0.271       -0.187        0.234        0.671\r\n         EEM        0.826        0.826        0.878        1.000        0.367       -0.281       -0.274       -0.212        0.193        0.621\r\n         EMB        0.406     
   0.393        0.398        0.367        1.000        0.164        0.198        0.175        0.452        0.487\r\n         TLT       -0.314       -0.316       -0.300       -0.281        0.164        1.000        0.912        0.567        0.538       -0.140\r\n         IEF       -0.297       -0.300       -0.271       -0.274        0.198        0.912        1.000        0.758        0.556       -0.099\r\n         SHY       -0.224       -0.227       -0.187       -0.212        0.175        0.567        0.758        1.000        0.363       -0.058\r\n         LQD        0.204        0.211        0.234        0.193        0.452        0.538        0.556        0.363        1.000        0.456\r\n         HYG        0.689        0.691        0.671        0.621        0.487       -0.140       -0.099       -0.058        0.456        1.000\r\n\r\nreturn Spearman correlation\r\n      column          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n         VTI        1.000        0.991        0.826        0.747        0.386       -0.270       -0.274       -0.192        0.072        0.679\r\n         SPY        0.991        1.000        0.825        0.744        0.386       -0.272       -0.275       -0.194        0.071        0.676\r\n         EFA        0.826        0.825        1.000        0.809        0.433       -0.233       -0.217       -0.116        0.111        0.647\r\n         EEM        0.747        0.744        0.809        1.000        0.421       -0.232       -0.228       -0.144        0.080        0.591\r\n         EMB        0.386        0.386        0.433        0.421        1.000        0.229        0.253        0.237        0.471        0.534\r\n         TLT       -0.270       -0.272       -0.233       -0.232        0.229        1.000        0.920        0.597        0.733       -0.076\r\n         IEF       -0.274       -0.275       -0.217       -0.228        0.253        0.920       
 1.000        0.762        0.744       -0.053\r\n         SHY       -0.192       -0.194       -0.116       -0.144        0.237        0.597        0.762        1.000        0.562        0.024\r\n         LQD        0.072        0.071        0.111        0.080        0.471        0.733        0.744        0.562        1.000        0.297\r\n         HYG        0.679        0.676        0.647        0.591        0.534       -0.076       -0.053        0.024        0.297        1.000\r\n\r\nreturn Kendall tau\r\n      column          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n         VTI        1.000        0.909        0.627        0.563        0.272       -0.180       -0.188       -0.140        0.021        0.497\r\n         SPY        0.909        1.000        0.626        0.557        0.271       -0.179       -0.186       -0.139        0.022        0.492\r\n         EFA        0.627        0.626        1.000        0.616        0.306       -0.148       -0.143       -0.085        0.054        0.466\r\n         EEM        0.563        0.557        0.616        1.000        0.297       -0.144       -0.146       -0.098        0.034        0.421\r\n         EMB        0.272        0.271        0.306        0.297        1.000        0.164        0.181        0.166        0.341        0.389\r\n         TLT       -0.180       -0.179       -0.148       -0.144        0.164        1.000        0.761        0.443        0.574       -0.051\r\n         IEF       -0.188       -0.186       -0.143       -0.146        0.181        0.761        1.000        0.591        0.589       -0.035\r\n         SHY       -0.140       -0.139       -0.085       -0.098        0.166        0.443        0.591        1.000        0.429        0.017\r\n         LQD        0.021        0.022        0.054        0.034        0.341        0.574        0.589        0.429        1.000        0.201\r\n         HYG        0.497  
      0.492        0.466        0.421        0.389       -0.051       -0.035        0.017        0.201        1.000\r\n\r\nreturn covariance matrix\r\n      column          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n         VTI        1.611        1.585        1.550        1.830        0.352       -0.387       -0.167       -0.027        0.147        0.611\r\n         SPY        1.585        1.579        1.538        1.812        0.338       -0.386       -0.167       -0.027        0.150        0.606\r\n         EFA        1.550        1.538        1.903        2.115        0.375       -0.402       -0.166       -0.024        0.183        0.646\r\n         EEM        1.830        1.812        2.115        3.047        0.438       -0.477       -0.212       -0.035        0.191        0.757\r\n         EMB        0.352        0.338        0.375        0.438        0.467        0.109        0.060        0.011        0.175        0.232\r\n         TLT       -0.387       -0.386       -0.402       -0.477        0.109        0.945        0.393        0.052        0.297       -0.095\r\n         IEF       -0.167       -0.167       -0.166       -0.212        0.060        0.393        0.196        0.032        0.140       -0.031\r\n         SHY       -0.027       -0.027       -0.024       -0.035        0.011        0.052        0.032    8.902e-03        0.019   -3.850e-03\r\n         LQD        0.147        0.150        0.183        0.191        0.175        0.297        0.140        0.019        0.321        0.180\r\n         HYG        0.611        0.606        0.646        0.757        0.232       -0.095       -0.031   -3.850e-03        0.180        0.487\r\n\r\nreturns (%) first rows\r\n        Date          SPY          EFA\r\n  2000-01-04    -3.910733          nan\r\n  2000-01-05     0.179021          nan\r\n  2000-01-06    -1.607179          nan\r\n  2000-01-07     5.807588          nan\r\n  
2000-01-10     0.343085          nan\r\n\r\nreturns sorted by SPY\r\n        Date          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n  2020-03-16   -11.380825   -10.942361   -10.319491   -12.479065    -4.617264     6.476601     2.641720     0.162222    -1.422204    -5.496592\r\n  2008-10-15    -9.352784    -9.844733   -10.669929   -16.166236    -0.578211     0.956207     0.819137     0.334834     0.000000    -4.363554\r\n  2020-03-12    -9.723928    -9.567668   -10.990059   -10.010762    -4.924777     0.619703     0.050845    -0.023237    -4.771914    -3.995457\r\n  2008-12-01    -8.938975    -8.857721    -8.171475    -9.629850    -2.990074     3.829675     1.353810     0.164502     1.563262    -2.062293\r\n  2008-09-29    -6.588890    -7.836196   -11.163045   -11.679326    -1.800770     2.909952     1.394669     0.575469    -9.111039    -8.097432\r\n\r\nreturns columns sorted by first row\r\n        Date          SPY          VTI          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n  2000-01-04    -3.910733          nan          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-05     0.179021          nan          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-06    -1.607179          nan          nan          nan          nan          nan          nan          nan          nan          nan\r\n\r\ncustom dataframe from vectors\r\n  CustomDate        Alpha         Beta\r\n  2025-01-01     1.000000     2.000000\r\n  2025-01-02     3.000000     4.000000\r\n  2025-01-03     5.000000     6.000000\r\n\r\nstandardized returns (z-scores)\r\n        Date          SPY          EFA\r\n  2000-01-04    -3.232033          nan\r\n  2000-01-05     0.115109          nan\r\n  2000-01-06    -1.346755          nan\r\n  2000-01-07    
 4.721647          nan\r\n  2000-01-10     0.249382          nan\r\n\r\nnormalized returns (last rows)\r\n        Date          SPY          EFA\r\n  2025-12-19     0.465347     0.437627\r\n  2025-12-22     0.454219     0.421969\r\n  2025-12-23     0.447702     0.435466\r\n  2025-12-24     0.443567     0.417282\r\n  2025-12-26     0.429353     0.418810\r\n\r\nreturns 2003-04-15..2003-04-22\r\n        Date          SPY          EFA\r\n  2003-04-15     0.933140     1.371317\r\n  2003-04-16    -1.704200    -0.660585\r\n  2003-04-17     1.484394     1.444811\r\n  2003-04-21     0.100523    -0.348365\r\n  2003-04-22     1.885125     1.449605\r\n\r\nreturns at endpoints\r\n        Date          SPY          TLT\r\n  2000-01-04    -3.910733          nan\r\n  2025-12-26    -0.010139    -0.329433\r\n\r\nlog price preview\r\n        Date          SPY          TLT\r\n  2000-01-03     4.517618          nan\r\n  2000-01-04     4.477725          nan\r\n  2000-01-05     4.479514          nan\r\n\r\nexp(log price) preview\r\n        Date          SPY          TLT\r\n  2000-01-03    91.617100          nan\r\n  2000-01-04    88.034200          nan\r\n  2000-01-05    88.191800          nan\r\n\r\nfirst two price columns\r\n        Date          VTI          SPY\r\n  2000-01-03          nan    91.617100\r\n  2000-01-04          nan    88.034200\r\n  2000-01-05          nan    88.191800\r\n\r\nlast two price columns\r\n        Date          LQD          HYG\r\n  2000-01-03          nan          nan\r\n  2000-01-04          nan          nan\r\n  2000-01-05          nan          nan\r\n\r\nSPY returns sample: first=-3.91073, last=-0.0101393, count=6535\r\n\r\nSPY returns with squared column\r\n        Date          SPY       SPY_sq\r\n  2000-01-04    -3.910733    15.293831\r\n  2000-01-05     0.179021     0.032049\r\n  2000-01-06    -1.607179     2.583024\r\n\r\nrow-major buffer sample: [nan, -3.91073, nan, ...]\r\n\r\nreturns reloaded from binary\r\n        Date          SPY          
EFA\r\n  2000-01-04    -3.910733          nan\r\n  2000-01-05     0.179021          nan\r\n  2000-01-06    -1.607179          nan\r\n\r\nsample datetime-indexed returns\r\n   timestamp   SPY_return\r\n2000-01-04 00:00:00    -3.910730\r\n2000-01-05 01:00:00     0.179021\r\n2000-01-06 02:00:00    -1.607180\r\n2000-01-07 03:00:00     5.807590\r\n2000-01-10 04:00:00     0.343085\r\nfirst row values: SPY=nan, EFA=-3.91073\r\n\r\n5-day rolling mean\r\n        Date          SPY          EFA\r\n  2000-01-10     0.162357          nan\r\n  2000-01-11     0.705186          nan\r\n  2000-01-12     0.470429          nan\r\n\r\n5-day rolling std\r\n        Date          SPY          EFA\r\n  2000-01-10     3.592454          nan\r\n  2000-01-11     2.975157          nan\r\n  2000-01-12     3.071794          nan\r\n\r\n5-day rolling rms\r\n        Date          SPY          EFA\r\n  2000-01-10     3.217288          nan\r\n  2000-01-11     2.752914          nan\r\n  2000-01-12     2.787479          nan\r\n\r\nEMA(alpha=0.1) first rows\r\n        Date          SPY          EFA\r\n  2000-01-04    -3.910733          nan\r\n  2000-01-05    -3.501757          nan\r\n  2000-01-06    -3.312300          nan\r\nrows before NaN removal: 3, after: 0, columns after dropping NaNs: 0\r\n\r\nrandom normal target correlation: 0.7\r\n\r\nrandom normal stats\r\n   statistic        Alpha         Beta        Gamma\r\n           n         1000         1000         1000\r\n      median    -0.028973     0.034076    -0.029050\r\n        mean    -0.019795     0.043795 9.845398e-04\r\n          sd     1.013260     1.007175     1.034801\r\n        skew     0.055334     0.068212     0.148109\r\n ex_kurtosis 9.546673e-03     0.166948     0.061524\r\n         min    -3.364096    -2.849686    -3.249447\r\n         max     3.187033     3.316192     4.018276\r\n\r\nrandom normal correlations\r\n      column        Alpha         Beta        Gamma\r\n       Alpha        1.000        0.837        0.837\r\n        
Beta        0.837        1.000        0.683\r\n       Gamma        0.837        0.683        1.000\r\n\r\nrandom normal covariances\r\n      column        Alpha         Beta        Gamma\r\n       Alpha        1.027        0.854        0.877\r\n        Beta        0.854        1.014        0.712\r\n       Gamma        0.877        0.712        1.071\r\n\r\nrandom uniform sample\r\n       index           U1           U2           U3\r\n           0       0.7225       0.8319       0.7341\r\n           1       0.7010       0.1933       0.9773\r\n           2       0.5777       0.5138       0.8515\r\n           3       0.3811       0.2739       0.6406\r\n           4       0.4936       0.3368       0.6163\r\n\r\nreturns shape: (6535, 10)\r\n\r\nSPY intraday sample (first 5 rows)\r\n    Datetime         Open         High          Low        Close       Volume\r\n2025-10-28 13:20:00   687.059998   687.299988   687.049988   687.265015     0.000000\r\n2025-10-28 13:25:00   687.270020   687.270020   686.969971   687.010010 3.912220e+05\r\n2025-10-28 13:30:00   687.020020   687.140015   686.875000   687.075012 3.632480e+05\r\n2025-10-28 13:35:00   687.059998   687.280029   687.059998   687.125000 3.217930e+05\r\n2025-10-28 13:40:00   687.130005   687.325500   687.114990   687.299988 4.112810e+05\r\n\r\n--- running x_basic ---\r\nprices shape: (6536, 10)\r\ncolumns: VTI SPY EFA EEM EMB TLT IEF SHY LQD HYG\r\n\r\nfirst rows\r\n\r\nprices head\r\n        Date          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n  2000-01-03          nan    91.617100          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-04          nan    88.034200          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-05          nan    88.191800          nan          nan          nan          nan          nan 
         nan          nan          nan\r\n\r\nprices tail\r\n        Date          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n  2025-12-23   338.720000   687.960000    96.290000    54.310000    96.260000    87.500000    96.100000    82.680000   110.220000    80.490000\r\n  2025-12-24   339.880000   690.380000    96.410000    54.420000    96.550000    88.030000    96.350000    82.730000   110.650000    80.640000\r\n  2025-12-26   339.670000   690.310000    96.570000    54.800000    96.540000    87.740000    96.440000    82.790000   110.640000    80.600000\r\n\r\n--- running x_arithmetic ---\r\n\r\noriginal subset\r\n        Date          SPY          EFA\r\n  2000-01-03    91.617100          nan\r\n  2000-01-04    88.034200          nan\r\n  2000-01-05    88.191800          nan\r\n  2000-01-06    86.774400          nan\r\n  2000-01-07    91.813900          nan\r\n\r\n+2\r\n        Date          SPY          EFA\r\n  2000-01-03    93.617100          nan\r\n  2000-01-04    90.034200          nan\r\n  2000-01-05    90.191800          nan\r\n  2000-01-06    88.774400          nan\r\n  2000-01-07    93.813900          nan\r\n\r\n-1\r\n        Date          SPY          EFA\r\n  2000-01-03    90.617100          nan\r\n  2000-01-04    87.034200          nan\r\n  2000-01-05    87.191800          nan\r\n  2000-01-06    85.774400          nan\r\n  2000-01-07    90.813900          nan\r\n\r\n*1.05\r\n        Date          SPY          EFA\r\n  2000-01-03    96.197955          nan\r\n  2000-01-04    92.435910          nan\r\n  2000-01-05    92.601390          nan\r\n  2000-01-06    91.113120          nan\r\n  2000-01-07    96.404595          nan\r\n\r\n/2\r\n        Date          SPY          EFA\r\n  2000-01-03    45.808550          nan\r\n  2000-01-04    44.017100          nan\r\n  2000-01-05    44.095900          nan\r\n  2000-01-06    43.387200          nan\r\n  2000-01-07    45.906950          
nan\r\n\r\nlog subset\r\n        Date          SPY          EFA\r\n  2000-01-03     4.517618          nan\r\n  2000-01-04     4.477725          nan\r\n  2000-01-05     4.479514          nan\r\n  2000-01-06     4.463312          nan\r\n  2000-01-07     4.519764          nan\r\n\r\nexp(log subset)\r\n        Date          SPY          EFA\r\n  2000-01-03    91.617100          nan\r\n  2000-01-04    88.034200          nan\r\n  2000-01-05    88.191800          nan\r\n  2000-01-06    86.774400          nan\r\n  2000-01-07    91.813900          nan\r\n\r\n--- running x_stats ---\r\n\r\nreturn scaling factor: 100\r\n\r\nreturns head\r\n        Date          SPY          EFA\r\n  2000-01-04    -3.910733          nan\r\n  2000-01-05     0.179021          nan\r\n  2000-01-06    -1.607179          nan\r\n  2000-01-07     5.807588          nan\r\n  2000-01-10     0.343085          nan\r\n\r\nsummary stats\r\n   statistic          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n           n         6169         6535         6119         5713         4533         5891         5891         5891         5891         4709\r\n      median       0.0815       0.0693       0.0734       0.0958       0.0356       0.0517       0.0238       0.0000       0.0411       0.0319\r\n        mean       0.0437       0.0384       0.0327       0.0499       0.0206       0.0189       0.0154   7.9134e-03       0.0198       0.0217\r\n          sd       1.2084       1.2219       1.3128       1.7051       0.6834       0.9060       0.4307       0.0960       0.5286       0.6936\r\n        skew      -0.0994       0.0412      -0.0294       0.5021      -1.6047       0.0545       0.0818       0.2979       0.0122       0.7469\r\n\r\ncorrelation matrix\r\n      column          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n         VTI        1.000        0.994    
    0.885        0.826        0.406       -0.314       -0.297       -0.224        0.204        0.689\r\n         SPY        0.994        1.000        0.887        0.826        0.393       -0.316       -0.300       -0.227        0.211        0.691\r\n         EFA        0.885        0.887        1.000        0.878        0.398       -0.300       -0.271       -0.187        0.234        0.671\r\n         EEM        0.826        0.826        0.878        1.000        0.367       -0.281       -0.274       -0.212        0.193        0.621\r\n         EMB        0.406        0.393        0.398        0.367        1.000        0.164        0.198        0.175        0.452        0.487\r\n         TLT       -0.314       -0.316       -0.300       -0.281        0.164        1.000        0.912        0.567        0.538       -0.140\r\n         IEF       -0.297       -0.300       -0.271       -0.274        0.198        0.912        1.000        0.758        0.556       -0.099\r\n         SHY       -0.224       -0.227       -0.187       -0.212        0.175        0.567        0.758        1.000        0.363       -0.058\r\n         LQD        0.204        0.211        0.234        0.193        0.452        0.538        0.556        0.363        1.000        0.456\r\n         HYG        0.689        0.691        0.671        0.621        0.487       -0.140       -0.099       -0.058        0.456        1.000\r\n\r\ncovariance matrix\r\n      column          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n         VTI     1.611295     1.585318     1.550055     1.829655     0.351758    -0.386986    -0.167359    -0.026852     0.146677     0.611001\r\n         SPY     1.585318     1.579159     1.538016     1.812257     0.337551    -0.385572    -0.166881    -0.026874     0.150494     0.606072\r\n         EFA     1.550055     1.538016     1.903266     2.115369     0.375153    -0.401769    -0.165611    -0.024368   
  0.183189     0.646452\r\n         EEM     1.829655     1.812257     2.115369     3.046807     0.438151    -0.477047    -0.211801    -0.034985     0.191193     0.756800\r\n         EMB     0.351758     0.337551     0.375153     0.438151     0.467010     0.109193     0.060052     0.011253     0.175042     0.232344\r\n         TLT    -0.386986    -0.385572    -0.401769    -0.477047     0.109193     0.945435     0.392953     0.051999     0.296596    -0.095113\r\n         IEF    -0.167359    -0.166881    -0.165611    -0.211801     0.060052     0.392953     0.196425     0.031692     0.139667    -0.030547\r\n         SHY    -0.026852    -0.026874    -0.024368    -0.034985     0.011253     0.051999     0.031692 8.902280e-03     0.019416 -3.850090e-03\r\n         LQD     0.146677     0.150494     0.183189     0.191193     0.175042     0.296596     0.139667     0.019416     0.320950     0.180481\r\n         HYG     0.611001     0.606072     0.646452     0.756800     0.232344    -0.095113    -0.030547 -3.850090e-03     0.180481     0.487371\r\n\r\n5-day rolling mean\r\n        Date          SPY          EFA\r\n  2000-01-10     0.162357          nan\r\n  2000-01-11     0.705186          nan\r\n  2000-01-12     0.470429          nan\r\n\r\n--- running x_indexing ---\r\n\r\nreturn scaling factor: 100\r\n\r\nslice 2002-01-02..2002-01-10\r\n        Date          SPY          EFA\r\n  2002-01-02     1.076130     0.553313\r\n  2002-01-03     1.133950     0.900263\r\n  2002-01-04     0.667484     0.660823\r\n  2002-01-07    -0.705654    -1.329560\r\n  2002-01-08    -0.231223    -1.139470\r\n  2002-01-09    -0.815216    -1.052018\r\n  2002-01-10     0.441290     0.238177\r\n\r\nselected rows\r\n        Date          SPY          TLT\r\n  2000-01-04    -3.910733          nan\r\n  2000-01-05     0.179021          nan\r\n  2000-01-06    -1.607179          nan\r\n\r\nsorted by SPY\r\n        Date          SPY          EFA\r\n  2020-03-16   -10.942361   -10.319491\r\n  2008-10-15    
-9.844733   -10.669929\r\n  2020-03-12    -9.567668   -10.990059\r\n  2008-12-01    -8.857721    -8.171475\r\n  2008-09-29    -7.836196   -11.163045\r\n\r\ncolumns sorted by first row\r\n        Date          SPY          VTI          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n  2000-01-04    -3.910733          nan          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-05     0.179021          nan          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-06    -1.607179          nan          nan          nan          nan          nan          nan          nan          nan          nan\r\n\r\n--- running x_io ---\r\n\r\nbinary reload\r\n        Date          VTI          SPY          EFA          EEM          EMB          TLT          IEF          SHY          LQD          HYG\r\n  2000-01-03          nan    91.617100          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-04          nan    88.034200          nan          nan          nan          nan          nan          nan          nan          nan\r\n  2000-01-05          nan    88.191800          nan          nan          nan          nan          nan          nan          nan          nan\r\nrow-major dump: nan 91.6171 nan nan nan nan nan nan nan nan nan 88.0342 nan nan nan nan nan nan nan nan nan 88.1918 nan nan nan nan nan nan nan nan\r\ncolumn-major dump: nan nan nan 91.6171 88.0342 88.1918 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan\r\n\r\n--- running x_construct ---\r\n\r\nfrom_vectors\r\n  CustomDate        Alpha         Beta\r\n  2024-01-01     1.000000     2.000000\r\n  2024-01-02     3.000000     4.000000\r\n  2024-01-03     5.000000     6.000000\r\n\r\nafter add_column\r\n  CustomDate        Alpha         Beta      
  Gamma\r\n  2024-01-01     1.000000     2.000000    10.000000\r\n  2024-01-02     3.000000     4.000000    20.000000\r\n  2024-01-03     5.000000     6.000000    30.000000\r\n\r\n--- running x_intraday ---\r\n\r\nintraday head\r\n    Datetime         Open         High          Low        Close\r\n2025-10-28 13:20:00   687.059998   687.299988   687.049988   687.265015\r\n2025-10-28 13:25:00   687.270020   687.270020   686.969971   687.010010\r\n2025-10-28 13:30:00   687.020020   687.140015   686.875000   687.075012\r\n2025-10-28 13:35:00   687.059998   687.280029   687.059998   687.125000\r\n2025-10-28 13:40:00   687.130005   687.325500   687.114990   687.299988\r\n\r\nsorted by close\r\n    Datetime        Close       Volume\r\n2025-11-21 10:35:00   651.429993 2.344265e+06\r\n2025-11-21 11:00:00   651.890015 8.309780e+05\r\n2025-11-20 15:45:00   652.000000 3.757045e+06\r\n2025-11-21 11:15:00   652.239990 7.683840e+05\r\n2025-11-20 15:55:00   652.530029 6.592158e+06\r\n\r\n3-period rolling mean\r\n    Datetime        Close\r\n2025-10-28 13:30:00   687.116679\r\n2025-10-28 13:35:00   687.070007\r\n2025-10-28 13:40:00   687.166667\r\n```\r\n\r\n\r\n\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvivek-v-rao%2Fcpp-dataframe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvivek-v-rao%2Fcpp-dataframe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvivek-v-rao%2Fcpp-dataframe/lists"}