https://github.com/chrisconlan/fast-python
Source code for Fast Python (2020) by Chris Conlan
- Host: GitHub
- URL: https://github.com/chrisconlan/fast-python
- Owner: chrisconlan
- License: other
- Created: 2020-05-18T18:31:00.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-06-28T16:45:01.000Z (over 4 years ago)
- Last Synced: 2024-08-02T13:24:22.476Z (3 months ago)
- Language: Python
- Size: 3.84 MB
- Stars: 131
- Watchers: 7
- Forks: 26
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- Contributing: contributing.md
- License: license.txt
README
# Fast Python
Source code for Fast Python (2020) by Chris Conlan.

Paperback available for purchase [on Amazon](https://amzn.to/2XmvJ3x).
---------------
#### Code profiles
The following code profiles can be run as stand-alone scripts. Some of them rely on explanations provided in the accompanying book.
+ Binary search: [binary_search.py](src/binary_search.py)
+ Dictionary construction: [build_dict.py](src/build_dict.py)
+ Concatenating strings, string construction: [concatenate_strings.py](src/concatenate_strings.py)
+ Counting the frequency of a value: [count_occurrences.py](src/count_occurrences.py)
+ Computing a cumulative sum: [cumulative_sum.py](src/cumulative_sum.py)
+ The `in` operator and early stopping: [early_stopping.py](src/early_stopping.py)
+ Time series filters/convolutions: [filters.py](src/filters.py)
+ Find largest `k` values in a list: [find_top_k.py](src/find_top_k.py)
+ List construction/declaration/flattening: [flatten_lists.py](src/flatten_lists.py)
+ Counting lines in a file: [line_count.py](src/line_count.py)
+ Set intersection, finding matches in a list: [match_within.py](src/match_within.py)
+ Matrix multiplication: [matrix_multiplication.py](src/matrix_multiplication.py)
+ Computing moving averages: [moving_averages.py](src/moving_averages.py)
+ Counting frequency of a word in text: [occurrences_of.py](src/occurrences_of.py)
+ Looping through `pd.DataFrame` objects: [pandas_loops.py](src/pandas_loops.py)
+ Sorting algorithms: [sorting.py](src/sorting.py)
+ Low-level sorting algorithms: [sorting_v2.py](src/sorting_v2.py)
+ Adding a list of numbers: [sum.py](src/sum.py)

Running them is simple ...
```
cd fast-python/src
python cumulative_sum.py
```

-----------
#### Profiling
All the profiles use a simple profiling module in [src/utils/profiler.py](src/utils/profiler.py). It produces tables and charts like the following.
![](sample_1.png)
![](sample_2.png)
```
np_fast_cusum
n = 56234132 values
t = 201.806 ms
n/t = 278653.8114 values per ms

np_fast_cusum
n = 100000000 values
t = 350.611 ms
n/t = 285216.7553 values per ms

...

                function   n_values  t_milliseconds  values_per_ms
0             slow_cusum          1           0.012        85.0196
1             slow_cusum          3           0.005       640.7530
...
14            slow_cusum       5623        1298.218         4.3313
15            slow_cusum      10000        4140.327         2.4153
...
30   slow_cusum_expanded       5623        1878.419         2.9935
31   slow_cusum_expanded      10000        5767.316         1.7339
...
62     python_fast_cusum   56234132        5727.162      9818.8478
63     python_fast_cusum  100000000       10939.993      9140.7733
...
94     pandas_fast_cusum   56234132         442.652    127039.2437
95     pandas_fast_cusum  100000000         780.461    128129.3962
...
126     numba_fast_cusum   56234132         139.602    402816.3295
127     numba_fast_cusum  100000000         236.445    422930.9936
...
158        np_fast_cusum   56234132         201.806    278653.8114
159        np_fast_cusum  100000000         350.611    285216.7553
```
I use the profiler frequently in my own work. It allows me to analyze the relationship between computational complexity and raw execution time pretty easily.
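
As a rough illustration of the idea, and not the code in `src/utils/profiler.py`, the sketch below times two cumulative-sum functions over a few input sizes and builds rows with the same columns as the table above. The names `slow_cusum`, `fast_cusum`, `time_function` and `profile` are hypothetical stand-ins chosen for this example, and it uses only the standard library.

```python
import time
from itertools import accumulate


def slow_cusum(values):
    # Hypothetical quadratic version: re-sums the prefix for every position.
    return [sum(values[: i + 1]) for i in range(len(values))]


def fast_cusum(values):
    # Hypothetical linear version: carries a running total forward.
    return list(accumulate(values))


def time_function(func, n, repeats=3):
    """Return the best wall-clock time in milliseconds for func on n values."""
    values = list(range(n))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(values)
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return best


def profile(funcs, sizes):
    """Time each function at each input size and return a list of row dicts."""
    rows = []
    for func in funcs:
        for n in sizes:
            t_ms = time_function(func, n)
            rows.append({
                "function": func.__name__,
                "n_values": n,
                "t_milliseconds": t_ms,
                # Guard against a zero reading on very coarse timers.
                "values_per_ms": n / t_ms if t_ms > 0 else float("inf"),
            })
    return rows


if __name__ == "__main__":
    # Small sizes keep the quadratic version quick to run.
    for row in profile([slow_cusum, fast_cusum], [10, 100, 1000]):
        print(f"{row['function']:>12} {row['n_values']:>8} "
              f"{row['t_milliseconds']:>10.3f} {row['values_per_ms']:>14.4f}")
```

With a quadratic function, `values_per_ms` should fall as `n_values` grows, while a linear function should hold roughly steady. That is the pattern visible in the `slow_cusum` and `np_fast_cusum` rows above.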
-----------------
#### Dependencies
I have included a `dependencies.txt`, but you should be fine with a blank Python 3 environment followed by ...
```
pip install numpy pandas numba joblib matplotlib pillow
```

----
![](cover.png)