{"id":28385521,"url":"https://github.com/x-plug/mm_storyagent","last_synced_at":"2025-06-26T06:31:55.629Z","repository":{"id":253083823,"uuid":"842397131","full_name":"X-PLUG/MM_StoryAgent","owner":"X-PLUG","description":null,"archived":false,"fork":false,"pushed_at":"2024-08-23T10:31:42.000Z","size":10730,"stargazers_count":247,"open_issues_count":3,"forks_count":41,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-05-30T12:20:15.375Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/X-PLUG.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-14T09:11:28.000Z","updated_at":"2025-05-22T06:39:02.000Z","dependencies_parsed_at":"2024-08-19T08:35:06.861Z","dependency_job_id":null,"html_url":"https://github.com/X-PLUG/MM_StoryAgent","commit_stats":null,"previous_names":["x-plug/mm_storyagent"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/X-PLUG/MM_StoryAgent","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/X-PLUG%2FMM_StoryAgent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/X-PLUG%2FMM_StoryAgent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/X-PLUG%2FMM_StoryAgent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/X-PLUG%2FMM_StoryAgent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/X-PLUG","download_url":"https://codeload.github.com/X-PLUG/M
M_StoryAgent/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/X-PLUG%2FMM_StoryAgent/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262014697,"owners_count":23245179,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-30T10:40:31.569Z","updated_at":"2025-06-26T06:31:55.609Z","avatar_url":"https://github.com/X-PLUG.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MM-StoryAgent\nThis repo is the official implementation of \"MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio\".\n\n## Introduction\nMM-StoryAgent is a multi-agent framework that employs LLMs and diverse expert tools across several modalities to produce expressive storytelling videos. Its highlights are as follows:\n* MM-StoryAgent designs a reliable and **customizable** workflow. 
Users can define their own expert tools to improve the generation quality of each component.\n* MM-StoryAgent writes **high-quality** stories based on the input story setting, in a multi-agent, multi-stage pipeline.\n* The assets generated by the agents of all modalities (image, speech, sound, music) are composed into an **immersive** storytelling video.\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./assets/framework.png\" alt=\"Framework\" style=\"width: 80%;\"\u003e\n\u003c/div\u003e\n\n\nWe also provide a story topic list and story evaluation criteria for evaluating story writing.\n\n## News\n* Aug 16, 2024: The initial version of MM-StoryAgent was released.\n\n## Demo Video\nA demo video is available:\n\n\u003cdiv align=\"center\"\u003e\n    \u003ca href=\"https://www.youtube.com/watch?v=2HXGrA8mg90\" target=\"_blank\"\u003e\n        \u003cimg src=\"https://res.cloudinary.com/marcomontalbano/image/upload/v1723627863/video_to_markdown/images/youtube--2HXGrA8mg90-c05b58ac6eb4c4700831b2b3070cd403.jpg\" alt=\"MM-StoryAgent demo\" style=\"width: 60%;\"/\u003e\n    \u003c/a\u003e\n\u003c/div\u003e\n\n\n\n## Installation\nInstall the required dependencies and install this repo as a package:\n```bash\npip install -r requirements.txt\npip install -e .\n```\n\n## Quickstart\nMM-StoryAgent is run from a configuration file:\n```bash\npython run.py -c configs/mm_story_agent.yaml\n```\nEach agent is configured in the following format:\n```yaml\nstory_writer: # agent name\n    tool: qa_outline_story_writer # name registered in the definition\n    cfg: # parameters for initializing the agent instance\n        max_conv_turns: 3\n        ...\n    params: # parameters for calling the agent instance\n        story_topic: \"Time Management: A child learning how to manage their time effectively.\"\n        ...\n```\nTo customize a new agent, refer to [music_agent.py](mm_story_agent/modality_agents/music_agent.py#L42). 
The agent class must implement `__init__` and `call` to work properly, as in the following example:\n```python\nfrom typing import Dict\nfrom mm_story_agent.base import register_tool\n\n@register_tool(\"my_speech_agent\")\nclass MySpeechAgent:\n    \n    def __init__(self, cfg: Dict):\n        # For example, the agent needs `attr1` and `attr2` for initialization\n        self.attr1 = cfg[\"attr1\"]\n        self.attr2 = cfg[\"attr2\"]\n        ...\n    \n    def call(self, params: Dict):\n        # For example, calling the agent needs `voice` and `speed` parameters\n        voice = params[\"voice\"]\n        speed = params[\"speed\"]\n        ...\n    \n```\nThe agent can then be used by pointing the configuration at its registered name:\n```yaml\nspeech_generation:\n    tool: my_speech_agent\n    cfg:\n        attr1: val1\n        attr2: val2\n    params:\n        voice: en_female\n        speed: 1.0\n```\n\n## Evaluation Data\nThe evaluation topics are provided in [story_topics.json](story_eval/story_topics.json). 
Evaluation rubrics and prompts are also provided.\n\n### Story Content Evaluation\nWe use GPT-4 to automatically evaluate story quality along several aspects.\nOur story writing agent is compared with directly prompting an LLM to write stories.\nEvaluation scores show the advantage of our multi-agent, multi-stage story writing pipeline.\n\n| Rubric Grading            |              | Attractiveness | Warmth | Education | Average |\n|---------------------------|--------------|----------------|--------|-----------|---------|\n| **Topic 1: Self-growing** | Direct       | 3.68           | 4.42   | 4.84      | 4.31    |\n|                           | Story Agent  | 4.1            | 4.5    | 4.80      | **4.47**|\n| **Topic 2: Family \u0026 Friendship** | Direct   | 3.94           | 5.0    | 4.72      | 4.55    |\n|                           | Story Agent  | 4.36           | 4.8    | 4.92      | **4.69**|\n| **Topic 3: Environments** | Direct       | 4.0            | 4.62   | 4.92      | 4.51    |\n|                           | Story Agent  | 4.44           | 4.68   | 4.86      | **4.66**|\n| **Topic 4: Knowledge Learning** | Direct | 4.46           | 4.14   | 4.86      | 4.49    |\n|                           | Story Agent  | 4.84           | 4.52   | 4.90      | **4.75**|\n| **All**                   | Direct       | 4.02           | 4.55   | 4.84      | 4.47    |\n|                           | Story Agent  | 4.44           | 4.63   | 4.87      | **4.65**|\n\n\n\n## Citation","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fx-plug%2Fmm_storyagent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fx-plug%2Fmm_storyagent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fx-plug%2Fmm_storyagent/lists"}