{"id":23975911,"url":"https://github.com/matthieu637/lhpo","last_synced_at":"2025-04-14T00:12:48.181Z","repository":{"id":27399888,"uuid":"30876330","full_name":"matthieu637/lhpo","owner":"matthieu637","description":"Lightweight HyperParameter Optimizer","archived":false,"fork":false,"pushed_at":"2021-01-06T04:52:54.000Z","size":120,"stargazers_count":6,"open_issues_count":1,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-14T00:12:26.064Z","etag":null,"topics":["aws","cluster","grid5000","gridsearch","metaoptimization"],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/matthieu637.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-02-16T16:17:00.000Z","updated_at":"2024-04-25T12:24:13.000Z","dependencies_parsed_at":"2022-09-02T00:51:39.144Z","dependency_job_id":null,"html_url":"https://github.com/matthieu637/lhpo","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matthieu637%2Flhpo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matthieu637%2Flhpo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matthieu637%2Flhpo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matthieu637%2Flhpo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/matthieu637","download_url":"https://codeload.github.com/matthieu637/lhpo/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kin
d":"github","repositories_count":248799956,"owners_count":21163404,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","cluster","grid5000","gridsearch","metaoptimization"],"created_at":"2025-01-07T06:52:37.201Z","updated_at":"2025-04-14T00:12:48.148Z","avatar_url":"https://github.com/matthieu637.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# lhpo\nLightweight HyperParameter Optimizer (gridsearch)\n\nRun experiments with different parameters, to average, ...\n\nIf you find this code useful in your research, please consider citing:\n```\nZimmer, M. (2018). 
Apprentissage par renforcement développemental (Doctoral dissertation).\n\n@phdthesis{zimmer2018phd,\n  title = {Apprentissage par renforcement développemental},\n  author = {Zimmer, Matthieu},\n  school = {University of Lorraine},\n  year = {2018},\n  month = {January}\n}\n```\n\n### Dependencies :\n- python3\n- xmlstarlet\n- python3-joblib\n- octave (optional for statistics/graphs)\n```\n#for Ubuntu, to have access to xml as on ArchLinux\nsudo ln -s /usr/bin/xmlstarlet /usr/local/bin/xml\n#or add this line to your .bashrc\nalias xml='/usr/bin/xmlstarlet'\n```\n\n### Usage :\nCreate a rules.xml file in a dir and run\n```bash\n$ ./parsing_rules.bash dir/\n```\n\nThis will generate different folds in dir.\n\nNow you can call \n```bash\n$ ./optimizer.bash dir/\n```\nfrom every node that can participate.\n\nIf you want to monitor the progress :\n```bash\n$ ./count.bash dir/\n```\n\nIf you want to remove all folds in the dir (be careful, you'll lose your previously generated data):\n```bash\n$ ./clear.bash dir/\n```\n\n### How to use with a computer cluster\n- [Grid5000](https://www.grid5000.fr/)\n- [Amazon Web Services](https://github.com/matthieu637/lhpo/tree/master/aws) setup\n\nlhpo relies on synchronization by NFS. 
There is no need to allocate a specific number of resources.\n\nExample with 3 agents controlling the whole process :\n- [grid.booker] a script checks that there is work remaining (with ./count.bash \\\u003cdir\\\u003e), monitors which nodes are free, then makes a reservation\n- [grid.cleaner] a script checks that each running job indeed has an online node (nodes might be\nkilled before having time to remove the allocation); if not, the job is tagged to-be-done again (with ./count.bash \\\u003cdir\\\u003e --remove-dead-node)\n- [grid.balancer] a script monitors the CPU used by each reserved node; if a free “slot” remains, it tells\nthe node to run another experiment in parallel (because some algorithms can be parallelized on several threads and\nothers not)\n\nScripts for those behaviors are given in the aws/grid5000 directories.\n\n### Example (performing a gridsearch to optimize hyperparameters):\n\nrun.py\n```python\nimport configparser\nimport time\n \n#the main script reads hyperparameters through config.ini\nconfig = configparser.ConfigParser()\nconfig.read('config.ini')\nalgo=config['agent']['algo']\nalpha=float(config['agent']['alpha'])\nnoise=float(config['agent']['noise'])\nstart_time = time.time()\n \n#YOUR MAIN ALGO\n#...\n \n#end of script - inform that the script finished\nwith open('time_elapsed', 'w') as f:\n  f.write('%d' % int((time.time() - start_time)/60))\n#you probably also want to write some file with the result\n```\n\nGet the hyper-optimization tool on your frontend.\n```bash\ncd YOUR_LHPO_PATH\ngit clone https://github.com/matthieu637/lhpo.git\n```\n\nPrepare experiment directory\n```bash\ncd\nmkdir -p exp/continuous_bandit_perfect_critic\ncd exp/continuous_bandit_perfect_critic\n#set up the file to describe your hyper parameters \nvim rules.xml\n```\n\n```xml\n\u003cxml\u003e\n        \u003ccommand value='/home/nfs/mzimmer/python_test/run.py' /\u003e\n        \u003cargs value='' /\u003e\n \n        \u003c!-- my script is already 
multi-threaded --\u003e\n        \u003cmax_cpu value='1' /\u003e\n \n        \u003cini_file value='config.ini' /\u003e\n        \u003cend_file value='time_elapsed' /\u003e\n \n        \u003cfold name='learning_speed' \u003e\n               \u003cparam name='algo' values='SPG,DPG' /\u003e\n               \u003cparam name='alpha' values='0.01,0.001,0.0001' /\u003e\n               \u003cparam name='noise' values='0.01,0.001,0.0001' /\u003e\n        \u003c/fold\u003e\n \n\u003c/xml\u003e\n```\n\n```bash\n#set up the config file (these values will be changed automatically) \nvim config.ini\n```\n\n```ini\n[agent]\nalgo=SPG\nnoise=0.01\nalpha=0.0001\n```\nGenerate the possible hyperparameter combinations with :\n```bash\ncd YOUR_LHPO_PATH\n./parsing_rules.bash ~/exp/continuous_bandit_perfect_critic\n#check how many exp will be performed (2 algo x 3 alpha x 3 noise = 18)\n./count.bash ~/exp/continuous_bandit_perfect_critic\n#-\u003e running : 0\n#-\u003e to do : 18\n#-\u003e done : 0\n```\n\nNow you can launch experiments (in this example, we use OAR with 2 jobs to perform the 18 exp).\n```bash\noarsub -lhost=1/thread=5,walltime=30:00:00 \"cdl ; ./optimizer.bash ~/exp/continuous_bandit_perfect_critic\"\noarsub -lhost=1/thread=10,walltime=30:00:00 \"cdl ; ./optimizer.bash ~/exp/continuous_bandit_perfect_critic\"\n \n#note that my bashrc contains those two lines : \nshopt -s expand_aliases\nalias cdl='cd YOUR_LHPO_PATH'\n```\nIf you don't use OAR but at least share \"~/exp/continuous_bandit_perfect_critic\" over NFS,\nyou must dispatch the optimizer.bash command by hand on each computer or through ssh:\n```bash\ncd YOUR_LHPO_PATH ;  ./optimizer.bash ~/exp/continuous_bandit_perfect_critic\n```\nIf you don't even use NFS, then you can only run optimizer.bash on one computer.\n\nAfter that you can monitor the progress with\n```bash\ncd YOUR_LHPO_PATH\n./count.bash ~/exp/continuous_bandit_perfect_critic/\n#-\u003e running : 2\n#-\u003e to do : 16\n#-\u003e done : 0\n```\n\nOnce it's finished, the results 
look like :\n```bash\n#move to exp dir\ncd ~/exp/continuous_bandit_perfect_critic\n#move to fold\ncd learning_speed\nls\n#this contains one dir per setup of the fold\n#DAC_0.1_0.1  rules.out  SAC_0.001_0.01  SAC_0.01_0.001 ...\nls -l DAC_0.1_0.1\n#-\u003e config.ini : the config.ini for this experiment\n#-\u003e executable.trace : executable used with which args\n#-\u003e full.trace : everything your executable outputs to stderr and stdout\n#-\u003e host : which computer performed the experiment\n#-\u003e host_tmp : in which tmp dir (to reduce NFS read/write)\n#-\u003e perf.data : data written by script\n#-\u003e testing.data : data written by script\n#-\u003e time_elapsed : number of min to perform this exp\n```\n\n### Example of scripts\nIf you want to optimize a Python executable, you might first need to specify a bash script as \"command\" in rules.xml in order to activate the Python virtual environment, etc.\n\nFor [DDRL](https://github.com/matthieu637/ddrl):\n```bash\n#!/bin/bash\n\n#define some environment variables\nexport LANG=en_US.UTF-8\nexport OMP_NUM_THREADS=1\nexport MUJOCO_PY_MJKEY_PATH=~/.mujoco/$(hostname)/mjkey.txt\n\n#activate virtual env\n. /home/nfs/mzimmer/git/aaal/scripts/activate.bash\n#run executable\npython -O /home/nfs/mzimmer/git/ddrl/gym/run.py --goal-based\n\nexit $?\n```\n\nFor [OpenAI baselines](https://github.com/openai/baselines):\n```bash\n#!/bin/bash\n\nexport OMP_NUM_THREADS=1\n\n#activate virtual env\n. /home/nfs/mzimmer/git/aaal/scripts/activate.bash\n\n#convert config.ini to command line args and call baselines\nOPENAI_LOGDIR=. 
OPENAI_LOG_FORMAT=csv python -O -m baselines.run $(cat config.ini | grep '=' | sed 's/^/--/' | sed 's/$/ /' | xargs echo)\n#store status\nr=$?\n\n#for lhpo compatibility\necho '0' \u003e time_elapsed\n\nexit $r\n```\n\nFor [Augmented Random Search](https://github.com/modestyachts/ARS):\n```bash\n#!/bin/bash\n\nexport OMP_NUM_THREADS=1\nexport MKL_NUM_THREADS=1\nexport GOTO_NUM_THREADS=1\nexport RCALL_NUM_CPU=1\n\nmin_number() {\n    printf \"%s\\n\" \"$@\" | sort -g | head -n1\n}\n\n. /home/mzimmer/git/aaal/scripts/activate.bash\nPORT=$RANDOM\nWORKER=$(min_number $(cat config.ini | grep 'n_directions' | cut -d'=' -f2) $(cat /proc/cpuinfo | grep processor | wc -l))\nRAYDIR=$(ray start --head --redis-port=$PORT --num-cpus=$WORKER |\u0026 grep '/tmp/ray/session' | head -1 | sed 's|.* /tmp|/tmp|' | sed 's|/logs.*||')\n\npython -O /home/mzimmer/git/aaal/build/ARS/code/ars.py --port $PORT --seed $RANDOM --dir_path . --n_workers $WORKER $(cat config.ini | grep '=' | grep -v run | sed 's/^/--/' | sed 's/$/ /' | xargs echo) \nr=$?\necho '0' \u003e time_elapsed\n\nif [ $r -ne 0 ] ; then\n    exit $r\nfi\n\ncat log.txt | cut -f4 | grep -v 'timesteps' \u003e x.learning.data\necho \"# { \\\"t_start\\\": 1550549468.1440182, \\\"env_id\\\": \\\"$(cat config.ini | grep env | cut -d '=' -f2)\\\"}\" \u003e 0.1.monitor.csv\ncat log.txt | cut -f1,2,3 | sed -e 's/[\t]/,/g' \u003e\u003e 0.1.monitor.csv\nrm log.txt\n\n#stop ray for specific port\nkill $(cat $RAYDIR/logs/redis.out | grep pid | sed -e 's/^.*pid=\\([0-9]*\\),.*$/\\1/')\nkill $(cat $RAYDIR/logs/redis-shard_0.out | grep pid | sed -e 's/^.*pid=\\([0-9]*\\),.*$/\\1/')\nkill $(lsof -t $RAYDIR/sockets/plasma_store)\nrm -rf $RAYDIR\n\nexit 0\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatthieu637%2Flhpo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmatthieu637%2Flhpo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatthieu637%2Flhpo/lists"}