https://github.com/vaaaaanquish/gokart-examples
gokart examples for m3 techbook 2
https://github.com/vaaaaanquish/gokart-examples
Last synced: about 2 months ago
JSON representation
gokart examples for m3 techbook 2
- Host: GitHub
- URL: https://github.com/vaaaaanquish/gokart-examples
- Owner: vaaaaanquish
- Created: 2020-02-13T07:46:53.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-02-15T06:41:09.000Z (about 5 years ago)
- Last Synced: 2025-01-18T03:40:39.677Z (3 months ago)
- Language: Jupyter Notebook
- Size: 11.7 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# gokart-examples
gokart examples for m3 techbook 2for `技術書典8`.
url: https://techbookfest.org/event/tbf08/circle/5759156128579584# 1, Make project by cookiecutter-gokart
We can use project templete by [cookiecutter](https://github.com/cookiecutter/cookiecutter).
```
$ cookiecutter https://github.com/m3dev/cookiecutter-gokart
project_name [project_name]: GokartSample
package_name [package_name]: sample
python_version [3.6]:
author [your name]: vaaaaanquish
package_description [What's this project?]: gokart task example
license [MIT License]:
``````
$ tree .
.
├── GokartSample
│ ├── README.md
│ ├── conf
│ │ ├── logging.ini
│ │ └── param.ini
│ ├── main.py
│ ├── sample
│ │ ├── __init__.py
│ │ └── model
│ │ ├── __init__.py
│ │ └── sample.py
│ ├── setup.py
│ └── test
│ └── unit_test
│ ├── __init__.py
│ └── test_sample.py
└── README.md
```Unittest has already been added :)
```
$ cd GokartSample/
$ python -m unittest discover -s ./test/unit_test
.
----------------------------------------------------------------------
Ran 1 test in 0.001sOK
```
start test-driven development!# 2, Development gokart task
example: GokartSample/sample/model/sample.py
- TaskA: dump dict
- TaskB: update dict
- TaskC: TasA -> TaskB and update dictTDD. It is a correspondence table between task and test.
sample/model/sample.py
test/unit_test/test_sample.py
class TaskA(gokart.TaskOnKart):
sample = luigi.Parameter()def run(self):
results = {'param_a': 1, 'param_b': self.sample}
self.dump(results)class TestTaskA(unittest.TestCase):
def setUp(self):
self.task = TaskA(sample=None)
self.output_data = Nonedef test_run(self):
source = 'hoge'
target = {'param_a': 1, 'param_b': 'hoge'}self.task.sample = source
self.task.dump = MagicMock(side_effect=self._dump)self.task.run()
self.assertDictEqual(self.output_data, target)def _dump(self, data):
self.output_data = dataclass TaskB(gokart.TaskOnKart):
task = gokart.TaskInstanceParameter()def requires(self):
return self.taskdef run(self):
params = self.load()
params.update({'trained': True}) # training model
self.dump(params)class TestTaskB(unittest.TestCase):
def setUp(self):
self.task = TaskB(task=gokart.TaskOnKart())
self.output_data = Nonedef test_run(self):
source = {'test': 'hoge'}
target = {'test': 'hoge', 'trained': True}self.task.load = MagicMock(side_effect=lambda: source)
self.task.dump = MagicMock(side_effect=self._dump)self.task.run()
self.assertDictEqual(self.output_data, target)def _dump(self, data):
self.output_data = dataclass TaskC(gokart.TaskOnKart):
def requires(self):
return TaskB(task=self.clone(TaskA, sample='hoge'))def run(self):
model = self.load()
model.update({'task_name': 'task_c'})
self.dump(model)class TestTaskC(unittest.TestCase):
def setUp(self):
self.task = TaskC()
self.output_data = Nonedef test_run(self):
source = {'test': 'hoge'}
target = {'test': 'hoge', 'task_name': 'task_c'}self.task.load = MagicMock(side_effect=lambda: source)
self.task.dump = MagicMock(side_effect=self._dump)self.task.run()
self.assertDictEqual(self.output_data, target)def _dump(self, data):
self.output_data = data
# 3 running gokart
Let's running gokart task.
```
# add new parameter info GokartSample/conf/param.ini
[sample.TaskA]
sample='sample param'
```run.
```
$ python main.py sample.TaskC --local-scheduler...
===== Luigi Execution Summary =====
Scheduled 3 tasks of which:
* 3 ran successfully:
- 1 sample.TaskA(...)
- 1 sample.TaskB(...)
- 1 sample.TaskC(...)This progress looks :) because there were no failed tasks or missing dependencies
===== Luigi Execution Summary =====
```
↓
check output files
↓
```
$ tree resourceresource/
├── log
│ ├── module_versions
│ │ ├── TaskA_74416d6e12945172d2ae8a4eaa6bc9de.txt
│ │ ├── TaskB_cb786f70051e31e9f4fcdf2aac06b533.txt
│ │ └── TaskC_e49eb8498c2ad327e1ca4ead88cea1ad.txt
│ ├── processing_time
│ │ ├── TaskA_74416d6e12945172d2ae8a4eaa6bc9de.pkl
│ │ ├── TaskB_cb786f70051e31e9f4fcdf2aac06b533.pkl
│ │ └── TaskC_e49eb8498c2ad327e1ca4ead88cea1ad.pkl
│ ├── task_log
│ │ ├── TaskA_74416d6e12945172d2ae8a4eaa6bc9de.pkl
│ │ ├── TaskB_cb786f70051e31e9f4fcdf2aac06b533.pkl
│ │ └── TaskC_e49eb8498c2ad327e1ca4ead88cea1ad.pkl
│ └── task_params
│ ├── TaskA_74416d6e12945172d2ae8a4eaa6bc9de.pkl
│ ├── TaskB_cb786f70051e31e9f4fcdf2aac06b533.pkl
│ └── TaskC_e49eb8498c2ad327e1ca4ead88cea1ad.pkl
└── sample
└── model
└── sample
├── TaskA_74416d6e12945172d2ae8a4eaa6bc9de.pkl
├── TaskB_cb786f70051e31e9f4fcdf2aac06b533.pkl
└── TaskC_e49eb8498c2ad327e1ca4ead88cea1ad.pkl8 directories, 15 files
```# 3, Check data by thunderbolt
Please see `./ExampleThunderbolt.ipynb`
```
from thunderbolt import Thunderbolt
tb = Thunderbolt('./resource')
print(tb.get_data(’TaskC’)) # outout TaskC's result
print(tb.get_task_df()) # get all task parameter's
```# 4, Make Machine Learning Task using redshells
Please see redshells examples.
https://github.com/m3dev/redshells/tree/master/examples
# Thanks.
- luigi: https://github.com/spotify/luigi
- cookiecutter-gokart: https://github.com/m3dev/cookiecutter-gokart
- gokart: https://github.com/m3dev/gokart
- thunderbolt: https://github.com/m3dev/thunderbolt
- redshells: https://github.com/m3dev/redshellsThanks :)