{"id":19159898,"url":"https://github.com/mbelmadani/motifgp","last_synced_at":"2025-07-07T21:05:58.742Z","repository":{"id":162240872,"uuid":"65271881","full_name":"mbelmadani/motifgp","owner":"mbelmadani","description":"Motif discovery for DNA sequences using multiobjective optimization and genetic programming.","archived":false,"fork":false,"pushed_at":"2018-07-24T21:20:28.000Z","size":209,"stargazers_count":6,"open_issues_count":2,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-05-07T09:14:57.392Z","etag":null,"topics":["bioinformatics","chip-seq","deap","dna","dna-sequences","genetic-programming","jaspar","motif","motif-discovery","multiobjective-optimization","network-expressions","nsga-ii","pareto-front","python","regular-expressions","sequences","strongly-typed","transcription-factor-binding","transcription-factors"],"latest_commit_sha":null,"homepage":"https://mbelmadani.github.io/motifgp/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mbelmadani.png","metadata":{"files":{"readme":"README.txt","changelog":"CHANGELOG.txt","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-08-09T07:10:20.000Z","updated_at":"2023-05-08T03:37:24.000Z","dependencies_parsed_at":null,"dependency_job_id":"2321ba4c-c3e2-4fbb-930f-3dd4b70b1bf2","html_url":"https://github.com/mbelmadani/motifgp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbelmadani%2Fmotifgp","tags_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/mbelmadani%2Fmotifgp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbelmadani%2Fmotifgp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbelmadani%2Fmotifgp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mbelmadani","download_url":"https://codeload.github.com/mbelmadani/motifgp/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252847522,"owners_count":21813458,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","chip-seq","deap","dna","dna-sequences","genetic-programming","jaspar","motif","motif-discovery","multiobjective-optimization","network-expressions","nsga-ii","pareto-front","python","regular-expressions","sequences","strongly-typed","transcription-factor-binding","transcription-factors"],"created_at":"2024-11-09T08:52:42.956Z","updated_at":"2025-05-07T09:15:10.832Z","avatar_url":"https://github.com/mbelmadani.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"===============\n= MotifGP 0.2 =\n===============\nMotifGP is a de novo motif discovery tool for discriminatory network expression identification in ChIP-seq datasets.\n\nOriginal author: Manuel Belmadani\n\tmbelm006@uottawa.ca\n\nThe project is documented by the following publications.\n\nManuel Belmadani and Marcel Turcotte. MotifGP: Using multi-objective evolutionary computing for mining network expressions\nin DNA sequences. 
In IEEE International Conference on Computational Intelligence in Bioinformatics and Computational Biology\n(CIBCB 2016), Chiang Mai, Thailand, October, 5-7, 2016. \nhttps://doi.org/10.1109/CIBCB.2016.7758133\n\nManuel Belmadani. MotifGP: DNA motif discovery using multiobjective evolution. Master of computer science, \nUniversity of Ottawa, School of Electrical Engineering and Computer Science, 2016. \nAvailable from University of Ottawa Research under: http://www.ruor.uottawa.ca/handle/10393/34213\n\nAcknowledgements:\nMotifGP is using source code from these tools:\n-hypergeometric.py from the MEME Suite (License and copyright in source file).\n-altschulEriksonDinuclShuffle.py from Peter Clote - CLOTE Computational Biology LAB, http://clavius.bc.edu/~clotelab/RNAdinucleotideShuffle/\nThis software was also made using the DEAP - Fortin, F.-A., De Rainville, F.-M., Gardner, M.-A. G., \nParizeau, M. \u0026 Gagné, C. DEAP: Evolutionary Algorithms Made Easy. J. Mach. Learn. Res. 13, \n2171–2175 (2012).\n\n=======================================================================================\nLicense: (see LICENSE.txt)\n=======================================================================================\nInstallation: (see INSTALL.txt)\n=======================================================================================\nExamples: (see EXAMPLES.txt)\n=======================================================================================\nUsage: motifgp.py [options]\n\nOptions:\n  -h, --help            show this help message and exit\n  -p TRAINING_PATH, --training=TRAINING_PATH\n                        Fasta file to use for training (input) sequence data\n  -b BACKGROUND_PATH, --background=BACKGROUND_PATH\n                        [Optional] Fasta file to use for background (control)\n                        sequence data. 
If not provided, the generated\n                        control sequences will be written to runtime_tmp/\n  -m MOO, --moo=MOO     Multi-objective optimization [SPEA2, NSGA2, NSGAR,\n                        MOEAD]. NSGAR is the NSGA-II_R (NSGA-II Revised)\n                        algorithm, an improvement of NSGA2.\n  -f FITNESS, --fitness=FITNESS\n                        Objective fitness function. Available objectives:\n                        D=Discrimination,F=Fisher,I=ScipyFisher,O=OddsRatio,\n                        Q=FalseDiscoveryRate,S=Support,R=ScipyOddsRatio. Each\n                        single character in the string represents an\n                        objective. Objectives are mapped by the configuration\n                        file at config/objectives. Default is 'DF' for\n                        [Discrimination,Fisher] (2-objectives).\n  --cxpb=CXPB           Probability [0.0 to 1.0] for a crossover during\n                        variation. Requires --mutpb to be set to (1.0-cxpb).\n                        Default is 0.7.\n  --mutpb=MUTPB         Probability [0.0 to 1.0] for a mutation during\n                        variation. Requires --cxpb to be set to (1.0-mutpb).\n                        Default is 0.3.\n  --short=SHORT         Stops reading in after \u003cSHORT\u003e input sequences.\n  --popsize=POPSIZE     Size of the population.\n  --revcomp             Compile regex with reverse complement\n  --random-seed=RANDOM_SEED\n                        Random seed value to set for execution\n  -n NGEN, --num-gen=NGEN\n                        Generation where runtime stops (even in the case of\n                        resumed checkpoints)\n  --timelimit=TIMELIMIT\n                        Time limit on the GP loop execution.\n  --matcher=MATCHER     Use a different matcher. 
Options: 'grep', 'python'.\n                        'grep' is faster on large datasets, while 'python' is\n                        a pure python version in case the system doesn't\n                        support grep.\n  -o OUTPUT_PATH, --output=OUTPUT_PATH\n                        Output directory. Default is ./OUT/\n  -t TAG, --tag=TAG     A tag for the output subdirectory. Use it to describe\n                        the run; output is saved in the tag's subdirectory of\n                        the output directory. Default is 'default'.\n  -i, --inspector       Don't print any files. Can be useful with python -i\n                        (interactive mode).\n  --hardmask            Replace tandem repeats (lower-case typed nucleotides)\n                        by N\n  -g GRAMMAR, --grammar=GRAMMAR\n                        Grammar for the STGP [min, iupac, full, ne]. Default\n                        is iupac. 'min' only uses nucleotides. 'iupac' is a\n                        network expression grammar. 'full' is a network\n                        expression grammar with additional regular expression\n                        tokens. 'ne' is like iupac, but built with string\n                        primitives instead of booleans.\n  -e ERASE, --erase=ERASE\n                        Input .nef(t) file to delete from the dataset prior to\n                        execution. Used for sequential coverage.\n  --backpad             Pads background sequences with consecutive nucleotides\n                        (ie. AAAAAAAA,CCCCCCCC,GGGGGGGG,TTTTTTTT) of length 8\n                        every set of 4 sequences.\n  --bg-algo=BG_ALGO     Shuffling algorithm for background. Default is\n                        'dinuclShuffle', if no background dataset is provided.\n                        Currently, dinuclShuffle is the only implemented\n                        method.\n  --ncpu=NCPU           Number of CPUs to use when mapping evaluation of\n                        solutions. 
Use an integer, or \"auto\" to automatically\n                        determine the maximum number. Default is no\n                        parallelism.\n  --termination=TERMINATION\n                        Termination algorithm. Set to 'auto' to use the\n                        automatic termination algorithm for MOEAs.\n  --hamming             [Experimental] Generates statistics on the hamming\n                        distance from a template regex and hof candidates.\n  --seeded-population   [Experimental] Use population seeds\n  -c CHECKPOINT_PATH, --checkpoint=CHECKPOINT_PATH\n                        [Temporarily disabled] Load a checkpoint at path.\n  -q, --quiet           [Unimplemented] don't print status messages to stdout\n\n\nAlso consider looking at EXAMPLES.txt for basic examples of MotifGP usage.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmbelmadani%2Fmotifgp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmbelmadani%2Fmotifgp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmbelmadani%2Fmotifgp/lists"}