https://github.com/s-leroux/fin
Set of tools for personal investment
- Host: GitHub
- URL: https://github.com/s-leroux/fin
- Owner: s-leroux
- License: mit
- Created: 2023-02-01T10:47:34.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-13T21:45:57.000Z (9 months ago)
- Last Synced: 2024-04-13T21:52:59.697Z (9 months ago)
- Language: Python
- Size: 646 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 15
Metadata Files:
- Readme: README.md
- License: LICENSE
# fin
This project aims to provide a set of personal investment tools with minimum dependencies. The project does not have a GUI. You interact with the tools by writing Python scripts. The most actively developed part of the project is the `fin.seq` package, which provides functions for manipulating time and data series.

Topic-specific documentation:
* [Simulation and strategy backtesting](./docs/simul.md)
* [Cleaning up Data](./docs/cleaningup.md)

# Getting started
I keep the dependencies to a minimum. Currently, besides Python 3 (≥ 3.6.9) and the standard Python library, you need:

* GNU Make (≥ 4.1)
* Python Requests (≥ 2.18.4)
* Cython3 (≥ 0.26.1)
* Gnuplot (≥ 5.2)

Some work was done on web crawling and data mining using BeautifulSoup, but it is currently outside the main tree.
## Prerequisites
Development is done on Ubuntu Linux (Bionic).

```
apt-get install python3 cython3 python3-requests gnuplot-x11
```

## Installation
Download the project using Git, enter the directory, run `make compile` to build the Cython-generated C files, and run `make tests-all` to run the whole test suite:

```
git clone git@github.com:s-leroux/fin.git
cd fin
make compile
make tests-all
```

# `fin.seq`
This package allows data manipulations using the concept of series.
A serie is a set of columns associated with an index.
The index itself is a column with the special property of being ordered (in ascending order).
Series are implemented in the `fin.seq.serie.Serie` class.

The most straightforward example is a time serie representing stock quotes.
In that case, the _date_ is the index of the serie, and the _open_, _high_, _low_, and _close_ values are stored in the individual data columns of the serie.
For example, the `fin.api` package can retrieve historical quotations from *Yahoo! Finance* and return the result as a serie:

```
from fin.api import yf
from fin.datetime import CalendarDateDelta, CalendarDate

client = yf.Client()
t = client.historical_data("TSLA", CalendarDateDelta(days=5), CalendarDate(2023, 7, 20))
print(t)
```

```
sh$ cat ./docs/snippets/snippet_1_00*.py | python3
Date | Open | High | Low | Close | Adj Cl… | Volume
---------- | ------- | ------- | ------- | ------- | ------- | ---------
2023-07-17 | 286.630 | 292.230 | 283.570 | 290.380 | 290.380 | 131569600
2023-07-18 | 290.150 | 295.260 | 286.010 | 293.340 | 293.340 | 112434700
2023-07-19 | 296.040 | 299.290 | 289.520 | 291.260 | 291.260 | 142355400
2023-07-20 | 279.560 | 280.930 | 261.200 | 262.900 | 262.900 | 175158300
```

You can manually create a serie using the `fin.seq.serie.Serie.create` factory method.
The first parameter defines the index, and the remaining parameters define the data columns.
Each is defined using a LISP-inspired mini-language.

Here is a short example (from `examples/fin/seq/basic.py`):
```
from fin.seq import serie
from fin.seq import fc

from math import pi, sin, cos
"""
Basic usage of the `fin.seq` package

Usage:
    PYTHONPATH="$PWD" python3 examples/fin/seq/basic.py
"""

def deg2rad(deg):
    return 2*pi*deg/360

t = serie.Serie.create(
    # Create a 361-row serie
    (fc.named("ROW NUMBER"), fc.range(361)),
    # Map the first column to the [0, 2π] range
    (fc.named("ANGLE"), fc.map(deg2rad), "ROW NUMBER"),
    # Do the same to map the ANGLE column to sin() and cos()
    (fc.named("SIN"), fc.map(sin), "ANGLE"),
    (fc.named("COS"), fc.map(cos), "ANGLE"),
)

# Print the serie
print(t)
```

Here is the result when you run this script:
```
sh$ python3 < examples/fin/seq/basic.py | head -10
RO… | ANGLE | SIN | COS
--- | -------------------- | ----------------------- | -----------------------
0 | 0.0 | 0.0 | 1.0
1 | 0.017453292519943295 | 0.01745240643728351 | 0.9998476951563913
2 | 0.03490658503988659 | 0.03489949670250097 | 0.9993908270190958
3 | 0.05235987755982988 | 0.05233595624294383 | 0.9986295347545738
4 | 0.06981317007977318 | 0.0697564737441253 | 0.9975640502598242
5 | 0.08726646259971647 | 0.08715574274765817 | 0.9961946980917455
6 | 0.10471975511965977 | 0.10452846326765346 | 0.9945218953682733
7 | 0.12217304763960307 | 0.12186934340514748 | 0.992546151641322
```

You can load that table in your favorite spreadsheet to plot the SIN/COS graph. If you have `gnuplot` installed on your system, you can also plot it directly from Python:
```
# Plot the SIN/COS function:
from fin.seq import plot

mp = plot.Multiplot(t, "SIN", mode="XY")
p = mp.new_plot()
p.draw_line("COS")

plot.gnuplot(mp, size=(800,600))
```

![A basic usage example of `fin.seq` displaying a circle](docs/images/basic.png)
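
To see why plotting SIN against COS in XY mode draws a circle, here is a minimal sketch that rebuilds the same four columns with plain Python lists. It does not use `fin.seq` at all; the variable names `row_number`, `angle`, `sin_col`, and `cos_col` are illustrative stand-ins for the serie's columns, not part of the library.

```python
from math import pi, sin, cos

def deg2rad(deg):
    return 2 * pi * deg / 360

# Plain-Python stand-ins for the four columns of the serie
row_number = list(range(361))                  # 0, 1, ..., 360
angle = [deg2rad(n) for n in row_number]       # mapped to the [0, 2π] range
sin_col = [sin(a) for a in angle]
cos_col = [cos(a) for a in angle]

# Every (SIN, COS) pair satisfies sin² + cos² = 1, so the points lie on a circle
max_error = max(abs(s * s + c * c - 1.0) for s, c in zip(sin_col, cos_col))
print(len(row_number), max_error < 1e-12)
```

Each derived column depends only on a previously defined column, which is the same dependency chain the column expressions in the example express.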
## Joining two series
Series support join operations on the _index_ column.
It is the caller's responsibility to ensure the key columns are *sorted in ascending order*.
Future versions will enforce that requirement.
Until then, joining series with an unordered index should be considered *undefined behavior*.

### Inner join
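
The ordering requirement makes sense if you picture the join as a single forward pass over both indexes. The sketch below is plain Python and is not the library's implementation; the `inner_join` function and the `(key, value)` row tuples are assumptions made for illustration, and keys are assumed to be unique.

```python
def inner_join(left, right):
    """Join two lists of (key, value) rows, both sorted by ascending key."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lkey, lval = left[i]
        rkey, rval = right[j]
        if lkey < rkey:
            i += 1          # key only on the left side: skip it
        elif lkey > rkey:
            j += 1          # key only on the right side: skip it
        else:
            result.append((lkey, lval, rval))
            i += 1
            j += 1
    return result

s1_rows = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)]
s2_rows = [(1, 100.0), (4, 400.0), (5, 500.0)]
joined = inner_join(s1_rows, s2_rows)

# With an unordered index, the same single pass silently misses matching rows
unordered = inner_join([(4, 40.0), (1, 10.0)], [(1, 100.0), (4, 400.0)])
print(joined)
print(unordered)
```

With sorted inputs, only keys 1 and 4 survive. With the unordered input, the row for key 1 is lost even though it exists on both sides, which is the kind of surprise the "undefined behavior" warning is about.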
When performing an _inner join_, the result serie will contain only rows present in both series according to the index.
The _inner join_ is implemented as the `&` (*and*) operator between series:

```
from fin.seq import serie
from fin.seq import fc

s1 = serie.Serie.create(
    (fc.named("X"), fc.sequence((1,2,3,4))),
    (fc.named("Y"), fc.mul, "X", fc.constant(10)),
)

print(s1)
# Display:
# X, Y
# 1, 10.0
# 2, 20.0
# 3, 30.0
# 4, 40.0

s2 = serie.Serie.create(
    (fc.named("X"), fc.sequence((1,4,5))),
    (fc.named("Z"), fc.mul, "X", fc.constant(100)),
)

print(s2)
# Display:
# X, Z
# 1, 100.0
# 4, 400.0
# 5, 500.0

print(s1 & s2)
```

The result of the inner join operation is:
```
sh$ < ./docs/snippets/snippet_2_001.py python3 | sed -n '/s1 & s2/,$p'
s1 & s2 is:
X | Y | Z
- | ---- | -----
1 | 10.0 | 100.0
4 | 40.0 | 400.0
```

### Full outer join
When performing a _full outer join_, the result serie will contain the rows present in either (or both) series in the index order.
The _full outer join_ is implemented as the `|` (*or*) operator between series:

```
# Continuing from the previous example

print(s1 | s2)
```

```
sh$ cat ./docs/snippets/snippet_2_00[12].py | python3 | sed -n '/s1 | s2/,$p'
s1 | s2 is:
X | Y | Z
- | ---- | -----
1 | 10.0 | 100.0
2 | 20.0 | None
3 | 30.0 | None
4 | 40.0 | 400.0
5 | None | 500.0
```

## Loading financial data
You can use the `fin.seq` package like a command-line spreadsheet. However, its primary purpose remains working with financial data.

Currently, the library supports the *Yahoo! Finance* and *eodhistoricaldata.com* data providers for historical quotes.
In the next example, we will load from *Yahoo! Finance* the last 100 end-of-day quotes for *Bank of America* (ticker `BAC`):
```
from fin.api import yf
from fin.seq import fc
from fin.seq import plot

# Use the Yahoo! Finance provider
provider = yf.Client()

t1 = provider.historical_data("BAC", dict(days=100))
```

The provider returns a serie (an instance of `serie.Serie`) with the date, open, high, low, close, adjusted close, and volume columns.
Series are _immutable_.

However, you can create a projection with the `select` member function.
A projection is a serie whose columns are calculated from the original serie.

For example, if you are interested only in the _open_, _high_, _low_, and _close_ values, and the 5-period simple moving average (_sma_) of the _close_ prices, you can write:
```
t2 = t1.select(
    "Open",
    "High",
    "Low",
    "Close",
    (fc.sma(5), "Close"),
)

print(t2)
```

Running from the terminal, you get:
```
sh$ cat ./docs/snippets/snippet_3_00[12].py | python3 | head -10
Date | Open | High | Low | Close | SMA(…
---------- | ----- | ----- | ----- | ----- | -----
2023-12-26 | 33.45 | 33.96 | 33.37 | 33.86 | None
2023-12-27 | 33.80 | 33.95 | 33.66 | 33.84 | None
2023-12-28 | 33.82 | 33.97 | 33.77 | 33.88 | None
2023-12-29 | 33.94 | 33.99 | 33.55 | 33.67 | None
2024-01-02 | 33.39 | 34.07 | 33.27 | 33.90 | 33.83
2024-01-03 | 33.65 | 33.77 | 33.24 | 33.53 | 33.76
2024-01-04 | 33.57 | 34.31 | 33.54 | 33.80 | 33.76
2024-01-05 | 33.80 | 34.69 | 33.71 | 34.43 | 33.87
```

Finally, let's plot the graph:
```
sma = t2.columns[-1]

mp = plot.Multiplot(t2, "Date")
p = mp.new_plot(3)
p.draw_candlestick("Open", "High", "Low", "Close")
p.draw_line(sma.name)

plot.gnuplot(mp, size=(1000,600), font="Sans,8")
```
Et voilà:
![A candlestick plot of the last 100 daily quotations for Bank of America](docs/images/candlesticks.png)

# `fin.model.solvers`
Version 0.2.1 introduced a new multi-variable solver framework in `fin.model.solvers`.
Currently, two solvers have been implemented:

1. The `RandomSolver` simply draws a (potentially large) number of random solutions and returns the best guess.
   This solver is mostly a proof-of-concept for the solver framework.
2. The `ParticleSwarmSolver`, an implementation of the [Particle swarm optimization](https://en.wikipedia.org/wiki/Particle_swarm_optimization) algorithm.

To use those solvers, you must first build a `fin.model.complexmodel.ComplexModel` to describe the problem to solve.
Once done, the `ComplexModel` can export the necessary information to feed the solver.

In the following example, we will find how long a placement must run before we can buy a good, taking inflation into account.
Let's assume I plan to buy a good that costs $1000 today.
I only have $800 in the bank. The yearly inflation is 2%, and I have a placement that yields 4% each year.
How long should I wait before I can buy that good?

The solution to that problem can be found by solving the two constraints below:
```math
\begin{align}
800\times1.04^{\text{duration}} &= \text{buy price} \\
1000\times1.02^{\text{duration}} &= \text{buy price}
\end{align}
```

The solver always tries to minimize the constraints (in absolute value). We have to rewrite our equations to have zero on the right side:
```math
\begin{align}
800\times1.04^{\text{duration}} - \text{buy price} &= 0 \\
1000\times1.02^{\text{duration}} - \text{buy price} &= 0
\end{align}
```

We are now ready to write the code:
```
from fin.model.complexmodel import ComplexModel

model = ComplexModel()
eq1 = model.register(
    lambda duration, buyprice : 800*1.04**duration-buyprice,
    dict(name="duration", description="Placement duration in years"),
    dict(name="buyprice", description="Good's buy price"),
)
eq2 = model.register(
    lambda duration, buyprice : 1000*1.02**duration-buyprice,
    dict(name="duration", description="Placement duration in years"),
    dict(name="buyprice", description="Good's buy price"),
)
```

We used the same name for the corresponding parameters in both equations.
However, the `ComplexModel` logic does not automatically infer that those parameters are the same.
You have to say it explicitly:

```
model.bind(eq1, "duration", eq2, "duration")
model.bind(eq1, "buyprice", eq2, "buyprice")
```

We will also set the domain of possible solutions for the _duration_ parameter between 1 and 100 years,
and for the _buyprice_ between $1 and $10000:

```
model.domain(eq1, "duration", 1, 100)
model.domain(eq1, "buyprice", 1, 10000)
```

**Pitfall:** While not mandatory, providing a domain for the unknown parameters is always better.
This will speed up convergence toward a solution and, most importantly, prevent the solver from remaining stuck in areas producing _infinity_ or _NaN_ ([Not a Number](https://en.wikipedia.org/wiki/NaN)) results.

You are now ready to export the model:
```
params, domains, eqs = model.export()

from pprint import pprint
pprint(params)
pprint(domains)
pprint(eqs)
```

Displaying:
```
[{'description': 'Placement duration in years', 'name': 'duration'},
 {'description': "Good's buy price", 'name': 'buyprice'}]
[(1, 100), (1, 10000)]
[(<function <lambda> at 0x7f8a5bbf7ea0>, [0, 1]),
 (<function <lambda> at 0x7f8a58694bf8>, [0, 1])]
```

It is not very useful _per se_, but we may now feed that into a solver to obtain a solution:
```
from fin.model.solvers import ParticleSwarmSolver
solver = ParticleSwarmSolver()
score, result = solver.solve(domains, eqs)

print(f"Score {score}")
for param, value in zip(params, result):
    print(f"{param['description']:20s}: {value}")
```

```
Score 1.9085888931664045e-09
Placement duration in years: 11.491534021542874
Good's buy price    : 1255.5360323116345
```

The closer the score is to zero, the better the solution is.
Here, with a score of about 2e-09, we have a pretty good solution.

I will have to wait 11½ years, and the buy price will be $1255, assuming, of course, that all parameters remain constant for such a long time.
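
This particular problem also has a closed-form solution, which gives a handy cross-check of the solver's output. Equating the two constraints gives 800 × 1.04^duration = 1000 × 1.02^duration, hence duration = ln(1000/800) / ln(1.04/1.02). The sketch below uses only the standard library; the variable names are illustrative and are not part of `fin.model.solvers`.

```python
from math import log

savings, price_today = 800.0, 1000.0
yield_rate, inflation = 0.04, 0.02

# Solve 800 * 1.04**d == 1000 * 1.02**d for d
duration = log(price_today / savings) / log((1 + yield_rate) / (1 + inflation))
buy_price = price_today * (1 + inflation) ** duration

# Both constraints should evaluate to (almost) zero at the solution
residual_1 = savings * (1 + yield_rate) ** duration - buy_price
residual_2 = price_today * (1 + inflation) ** duration - buy_price

print(f"duration  = {duration:.4f} years")
print(f"buy price = {buy_price:.2f}")
```

The values agree with the particle swarm result to several decimal places. The numerical solver remains useful for models where no such closed form exists.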
# `fin.model`
The solver presented in this section is a legacy solver. It is mostly superseded by the new multi-variable solver implemented in `fin.model.solvers`.
The predefined models haven't been ported to that new framework, though.
Until then, the information given here remains valid.

The ``fin`` package also contains a simple 1-variable solver (implemented in ``fin.math``) designed to work seamlessly with predefined models.
For example, using the [Kelly Criterion](https://en.wikipedia.org/wiki/Kelly_criterion) you can find the optimum allocation for a risky investment:
```
from fin.model import kelly  # assumed import path; the original snippet omits it

WIN=0.20
LOSS=0.20
WIN_PROB=0.60

model = kelly.KellyCriterion(dict(
    p=WIN_PROB,
    a=WIN,
    b=LOSS,
))

f_star = model['f_star']
```

You can solve a model for any variable (within the solver's limitations).
For example, if I'm ready to raise my allocation up to 50% of the available funds, and given a +/- 20% outcome, which probability of winning do I implicitly assume?

```
WIN=0.20
LOSS=0.20
ALLOC=0.50

model = kelly.KellyCriterion(dict(
    a=WIN,
    b=LOSS,
    f_star=ALLOC
))

print("Implied probability to win =", model['p'])
```
```
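
For intuition about what these two calls compute, here is a plain-Python sketch of the generalized Kelly formula from the Wikipedia article linked above: f* = p/loss − (1 − p)/gain. Whether `kelly.KellyCriterion` uses exactly this parameterization is an assumption; the helper names `kelly_fraction` and `implied_probability` are hypothetical and do not exist in the `fin` package.

```python
def kelly_fraction(p, gain, loss):
    """Optimal fraction of funds to allocate, given win probability p."""
    return p / loss - (1 - p) / gain

def implied_probability(f_star, gain, loss):
    """Invert the formula above: the win probability implied by f_star."""
    return (f_star + 1 / gain) / (1 / loss + 1 / gain)

# First example: 60% chance to win, +/- 20% outcome
f_star = kelly_fraction(0.60, gain=0.20, loss=0.20)     # ≈ 1.0

# Second example: allocating 50% of the funds with the same outcomes
p = implied_probability(0.50, gain=0.20, loss=0.20)     # ≈ 0.55

print("f_star =", round(f_star, 6))
print("Implied probability to win =", round(p, 6))
```

Under these assumptions, a 50% allocation with symmetric ±20% outcomes corresponds to believing you win 55% of the time.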