https://github.com/aaaastark/genai-hub-space
Unlock peak productivity and navigate the GenAI-Hub-Space with confidence. This repository is your central hub for carefully curated AI tools, models, and learning resources designed to help developers, researchers, and professionals work smarter, not harder.
artificial-intelligence audio-processing automation autonomous-agents computer-vision data-science deep-learning generative-ai image-editing image-generation llm machine-learning music-generation natural-language-processing reinforcement-learning research-and-development speech-recognition speech-to-text text-to-speech voice-assistant
- Host: GitHub
- URL: https://github.com/aaaastark/genai-hub-space
- Owner: aaaastark
- License: cc0-1.0
- Created: 2025-04-25T05:52:49.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-10-02T19:13:50.000Z (8 months ago)
- Last Synced: 2025-10-02T21:14:04.877Z (8 months ago)
- Topics: artificial-intelligence, audio-processing, automation, autonomous-agents, computer-vision, data-science, deep-learning, generative-ai, image-editing, image-generation, llm, machine-learning, music-generation, natural-language-processing, reinforcement-learning, research-and-development, speech-recognition, speech-to-text, text-to-speech, voice-assistant
- Language: Python
- Homepage: https://github.com/aaaastark/GenAI-Hub-Space
- Size: 10.9 MB
- Stars: 4
- Watchers: 2
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE-CODE.md
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md

**GenAI-Hub-Space**
***A curated index of impactful AI tools and models that emphasizes technical merit and practical utility, and prioritizes open source.***
**Effective AI use requires understanding capabilities, limitations, and bias mitigation strategies.**
[Documentation license](https://github.com/aaaastark/GenAI-Hub-Space/blob/main/LICENSE.md)
[Code license](https://github.com/aaaastark/GenAI-Hub-Space/blob/main/LICENSE-CODE.md)
Table of contents
- [Introduction](#introduction)
- [AI Tutorials and Learning Resources](#ai-tutorials-and-learning-resources)
- [Tutorials](#tutorials)
- [Learning Resources](#learning-resources)
- [Audio Processing](#audio-processing)
- [Transcription and Summarization](#transcription-and-summarization)
- [Music Generation](#music-generation)
- [Text-to-Speech Synthesis](#text-to-speech-synthesis)
- [Text-to-Speech Models](#text-to-speech-models)
- [Text-to-Speech Providers](#text-to-speech-providers)
- [Speech Recognition](#speech-recognition)
- [Speech-to-Text Models](#speech-to-text-models)
- [Speech-to-Text Providers](#speech-to-text-providers)
- [Voice Assistants](#voice-assistants)
- [Voice Assistants Models](#voice-assistants-models)
- [Voice Assistants Providers](#voice-assistants-providers)
- [Automation](#automation)
- [Autonomous Agents](#autonomous-agents)
- [Automation tools](#automation-tools)
- [Computer Vision](#computer-vision)
- [Image Editing](#image-editing)
- [Image Generation](#image-generation)
- [Image Generation Models](#image-generation-models)
- [Cloud-based Image Generation Providers](#cloud-based-image-generation-providers)
- [Local Image Generation Providers](#local-image-generation-providers)
- [Video Generation](#video-generation)
- [Image-to-Video Models](#image-to-video-models)
- [Text-to-Video Models](#text-to-video-models)
- [Video Generation Providers](#video-generation-providers)
- [3D Model Generation](#3d-model-generation)
- [Text/Image-to-3D Models](#textimage-to-3d-models)
- [Data Analysis](#data-analysis)
- [Foundation Models](#foundation-models)
- [Language Only Large Language Models](#language-only-large-language-models)
- [Advanced Language and Reasoning LLMs](#advanced-language-and-reasoning-llms)
- [Open source Models](#open-source-models)
- [Proprietary Models](#proprietary-models)
- [Finetuned LLMs](#finetuned-llms)
- [Astrophysics](#astrophysics)
- [Coding](#coding)
- [Function calling](#function-calling)
- [Math](#math)
- [Role Play](#role-play)
- [Uncensored](#uncensored)
- [LLM Providers](#llm-providers)
- [Cloud-based LLM Providers](#cloud-based-llm-providers)
- [Local LLM Providers](#local-llm-providers)
- [Coding-focused LLM Providers](#coding-focused-llm-providers)
- [AI-Augmented Integrated Development Environments](#ai-augmented-integrated-development-environments)
- [Multimodal Foundation Models](#multimodal-foundation-models)
- [Vision Language Models](#vision-language-models)
- [Open source VLMs](#open-source-vlms)
- [Proprietary VLMs](#proprietary-vlms)
- [Multimodal Large Language Models](#multimodal-large-language-models)
- [Open-Source MLLMs](#open-source-mllms)
- [Proprietary MLLMs](#proprietary-mllms)
- [Search and Research Tools](#search-and-research-tools)
- [Academic and Scientific Research](#academic-and-scientific-research)
- [AI-Powered Web Browsers](#ai-powered-web-browsers)
- [Deep Research Tools](#deep-research-tools)
- [Search Engines](#search-engines)
- [Other Applications](#other-applications)
- [Language Learning Tools](#language-learning-tools)
- [Meeting Transcription and Summarization](#meeting-transcription-and-summarization)
- [Presentation Slides Generation](#presentation-slides-generation)
- [Versatile Productivity Tools](#versatile-productivity-tools)
- [Website Building Tools](#website-building-tools)
# Introduction
**Unlock peak productivity and navigate the GenAI-Hub-Space with confidence.** This repository is your central hub for carefully curated AI tools, models, and learning resources designed to help developers, researchers, and professionals **work smarter, not harder.**
Inside, discover resources to automate tasks, enhance workflows, and stay cutting-edge:
* **Categorized AI solutions** (Audio, Vision, LLMs, etc.), with a focus on **Open Source** options.
* Discover top model insights with comprehensive rankings and leverage our **LLM Model Evaluation Framework (version 0.3.1) for informed decision-making.** (Version 0.4 is currently in development.)
* **Practical tutorials and guides.**
### Understanding the Tools
Quickly grasp licensing and pricing models with these indicators:
* **Licensing:** [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) vs. [Open Source](https://opensource.com/resources/what-open-source)
* **Pricing:** Free | [Freemium](https://builtin.com/articles/freemium) | Paid
*(Documentation: [CC0 License](link). Code/Framework Contributions: [MIT License](link).)*
### Join the Community & Contribute!
This project thrives on collaboration. Here's how you can get involved:
* **Star** this repository to follow updates.
* **[Contribute](https://github.com/aaaastark/GenAI-Hub-Space/blob/main/CONTRIBUTING.md)** your favorite tools, resources, or improvements (Issues & PRs welcome!).
* Help refine our **AI Model Evaluation Framework**.
* Share your experiences and use cases.
**Let's build the future of intelligent workflows together!**
# AI Tutorials and Learning Resources
### Tutorials
Master AI concepts through **hands-on tutorials and practical implementations.**
***Local tutorials***
* ***[How to run LLMs locally on your machine](https://github.com/aaaastark/GenAI-Hub-Space/blob/main/Tutorials/run-llms-locally-on-your-machine.md)*** - Deploy and operate Large Language Models on local hardware.
* ***[Integrating AI Models into Your Integrated Development Environment (IDE)](https://github.com/aaaastark/GenAI-Hub-Space/blob/main/Tutorials/integrating-ai-models-into-ide.md)*** - Configure development environments for seamless AI model integration.
* ***[Local Image Generation with Fooocus: A Comprehensive Tutorial](https://github.com/aaaastark/GenAI-Hub-Space/blob/main/Tutorials/local-image-generation-with-fooocus.md)*** - Implement local image generation using open-source AI models.
* ***[How to Use AI Privately](https://github.com/aaaastark/GenAI-Hub-Space/blob/main/Tutorials/How-to-Use-AI-Privately.md)*** - Apply privacy-preserving practices when using AI.
***Online tutorials***
* **[AI Prompt Engineering Tutor](https://huggingface.co/spaces/baconnier/PrompTutor)** - Interactive platform for mastering prompt engineering methodologies. Developed by Loic Baconnier.
* **[Prompt Engineering Interactive Tutorial](https://docs.google.com/spreadsheets/d/1jIxjzUWG-6xBVIa2ay6yDpLyeuOh_hR_ZB75a47KX_E/edit?gid=869808629#gid=869808629)** - Systematic guide to prompt optimization techniques. Created by Anthropic.
### Learning Resources
***Beginner***
| Title | Description | Platform |
|:----------------------------------------------------------------------------|--------------------------------------------------------------|:--------------------:|
| [Fundamentals of Generative AI](https://microsoft.github.io/generative-ai-for-beginners/#/01-introduction-to-genai/README?wt.mc_id=academic-105485-koreyst) | Introduction to Generative AI and Large Language Models (LLMs). | [Microsoft](https://www.microsoft.com/) |
| [Fundamentals of Responsible Generative AI](https://microsoft.github.io/generative-ai-for-beginners/#/03-using-generative-ai-responsibly/README?wt.mc_id=academic-105485-koreyst) | Using Generative AI responsibly. | [Microsoft](https://www.microsoft.com/) |
| [Introduction to Generative AI](https://www.cloudskillsboost.google/paths/118/course_templates/536) | An introduction to the capabilities, applications, and distinct characteristics of generative artificial intelligence (AI). | [Google](https://gemini.google.com/) |
| [Introduction to Image Generation](https://www.cloudskillsboost.google/paths/183/course_templates/541) | Introduces diffusion models: a novel approach to machine learning that has generated remarkable results in image creation and manipulation. | [Google](https://gemini.google.com/) |
| [Introduction to Large Language Models](https://www.cloudskillsboost.google/paths/118/course_templates/539) | Introduction to large language models (LLMs) and the opportunities they present for natural language processing: use cases, limitations, and optimization strategies. | [Google](https://gemini.google.com/) |
| [Introduction to Responsible AI](https://www.cloudskillsboost.google/paths/118/course_templates/554) | The case for responsible AI: understanding its significance in ensuring that machine learning systems align with human values and promote social good. | [Google](https://gemini.google.com/) |
| [What are foundation models?](https://research.ibm.com/blog/what-are-foundation-models) | Discover how foundation models are changing AI. | [IBM](https://github.com/ibm-granite/) |
| [What are large language models (LLMs)?](https://www.ibm.com/topics/large-language-models) | Quick introduction to LLMs and their use cases. | [IBM](https://github.com/ibm-granite/) |
| [What is Conversational AI?](https://aws.amazon.com/what-is/conversational-ai/) | Basic understanding of how conversational AI works. | [AWS](https://aws.amazon.com/ai/) |
| [What is Generative AI?](https://aws.amazon.com/what-is/generative-ai/) | Overview of foundational ideas and principles in generative AI. | [AWS](https://aws.amazon.com/ai/) |
| [What is Generative AI?](https://research.ibm.com/blog/what-is-generative-AI) | Introduction to generative AI, its potential, and its applications. | [IBM](https://github.com/ibm-granite/) |
| [What is NLP (natural language processing)?](https://www.ibm.com/topics/natural-language-processing) | Learn how models understand human language. | [IBM](https://github.com/ibm-granite/) |
| [What are vision language models (VLMs)?](https://www.ibm.com/topics/large-language-models) | Quick introduction to VLMs and their use cases. | [IBM](https://github.com/ibm-granite/) |
***Intermediate***
| Title | Description | Platform |
|:----------------------------------------------------------------------------|--------------------------------------------------------------|:--------------------:|
| [Evaluation of generative AI applications](https://microsoft.github.io/generative-ai-for-beginners/#/02-exploring-and-comparing-different-llms/README?wt.mc_id=academic-105485-koreyst) | Exploring and comparing different LLMs. | [Microsoft](https://www.microsoft.com/) |
| [Generative AI Explained](https://learn.nvidia.com/courses/course-detail?course_id=course-v1:DLI+S-FX-07+V1) | Concepts, applications, challenges, and opportunities in Generative AI. | [NVIDIA](https://build.nvidia.com/) |
| [Introduction to prompt engineering](https://microsoft.github.io/generative-ai-for-beginners/#/04-prompt-engineering-fundamentals/README?wt.mc_id=academic-105485-koreyst) | Hands-on best practices for prompt engineering. | [Microsoft](https://www.microsoft.com/) |
| [Vision Language Models Explained](https://huggingface.co/blog/vlms) | An overview of vision language models, their functionality, and usage. | [Hugging Face](https://huggingface.co/) |
| [What are AI hallucinations?](https://www.ibm.com/topics/ai-hallucinations) | Learn why AI systems can generate nonsensical outputs by perceiving non-existent patterns or objects. | [IBM](https://github.com/ibm-granite/) |
| [What is Prompt Engineering?](https://aws.amazon.com/what-is/prompt-engineering/) | A concise guide to the key concepts, considerations, and methodologies behind prompt engineering. | [AWS](https://aws.amazon.com/ai/) |
| [What is prompt-tuning?](https://research.ibm.com/blog/what-is-ai-prompt-tuning) | A lightweight method for fine-tuning AI foundation models on downstream tasks. | [IBM](https://github.com/ibm-granite/) |
***Advanced***
| Title | Description | Platform |
|:----------------------------------------------------------------------------|--------------------------------------------------------------|:--------------------:|
| [Augment your LLM Using Retrieval Augmented Generation](https://learn.nvidia.com/courses/course-detail?course_id=course-v1:DLI+S-FX-16+V1) | High-level overview of Retrieval Augmented Generation and its benefits for Generative AI (GenAI). | [NVIDIA](https://build.nvidia.com/) |
| [Introduction to Quantization](https://huggingface.co/blog/merve/quantization) | An introduction to Quantization, a technique to reduce model size to improve training and inference speed. | [Hugging Face](https://huggingface.co/) |
| [Mixture of Experts Explained](https://huggingface.co/blog/moe) | Overview of MoEs, how they're trained, and the tradeoffs to consider. | [Hugging Face](https://huggingface.co/) |
| [Preference Tuning LLMs with Direct Preference Optimization Methods](https://huggingface.co/blog/pref-tuning) | Exploration of three promising methods to align language models without reinforcement learning (or preference tuning). | [Hugging Face](https://huggingface.co/) |
| [Prompt engineering techniques](https://microsoft.github.io/generative-ai-for-beginners/#/05-advanced-prompts/README?wt.mc_id=academic-105485-koreyst) | Techniques that improve the outcome of your prompts. | [Microsoft](https://www.microsoft.com/) |
| [What is AI inferencing?](https://research.ibm.com/blog/AI-inference-explained) | Introduction to the principles and methods of AI inference. | [IBM](https://github.com/ibm-granite/) |
| [What is instruction tuning?](https://www.ibm.com/topics/instruction-tuning) | Learn how instruction tuning enhances pre-trained LLMs by improving their ability to follow and execute instructions accurately. | [IBM](https://github.com/ibm-granite/) |
| [What is KV Cache Quantization](https://huggingface.co/blog/kv-cache-quantization) | Understanding KV Cache Quantization to reduce memory usage for long-context text generation. | [Hugging Face](https://huggingface.co/) |
| [What's an LLM context window and why is it getting larger?](https://research.ibm.com/blog/larger-context-window) | Understanding the role of LLM context windows in AI. | [IBM](https://github.com/ibm-granite/) |
| [What is LLM orchestration](https://www.ibm.com/think/topics/llm-orchestration) | Understanding LLM orchestration and how it helps prompt, chain, manage, and monitor LLMs. | [IBM](https://github.com/ibm-granite/) |
| [What is Model Context Protocol (MCP)](https://huggingface.co/blog/Kseniase/mcp) | Understanding MCP to connect LLMs to many different sources of context. | [Hugging Face](https://huggingface.co/) |
| [What is reasoning in AI?](https://www.ibm.com/think/topics/ai-reasoning) | Understanding AI reasoning and why it is useful. | [IBM](https://github.com/ibm-granite/) |
| [What is retrieval-augmented generation?](https://research.ibm.com/blog/retrieval-augmented-generation-RAG) | Learn what retrieval-augmented generation (RAG) is and why it is useful. | [IBM](https://github.com/ibm-granite/) |
| [What is reinforcement learning from human feedback (RLHF)?](https://www.ibm.com/topics/rlhf) | Learn what reinforcement learning from human feedback (RLHF) is and why it is useful. | [IBM](https://github.com/ibm-granite/) |
| [What is tool calling?](https://www.ibm.com/think/topics/tool-calling) | Understanding how LLMs interact with external tools. | [IBM](https://github.com/ibm-granite/) |
# Audio Processing
### Transcription and Summarization
AI-powered media processing tools **leverage Natural Language Processing (NLP) and computer vision algorithms to automate transcription and content summarization** from audio-visual sources. These solutions streamline content analysis by **generating text transcripts and key insights from multimedia data.**
| Tool | Description | Licence | Pricing |
|------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------|:-----------:|:---------:|
| [Eightify](https://eightify.app/) | A powerful tool that utilizes YouTube AI technology to summarize videos quickly, providing users with key ideas in seconds. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Exemplary AI](https://exemplary.ai/) | A cloud-based tool that harnesses Artificial Intelligence (AI) and LLMs to offer transcription solutions. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Riverside](https://riverside.fm/) | An online studio that specializes in high-quality podcast and video recording and editing. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [SolidPoint](https://solidpoint.ai/) | A range of tools that leverage AI technology to enhance productivity and efficiency in various tasks. One of its key features is the Summarize tool. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Summarize.tech](https://www.summarize.tech/) | An AI-powered tool that automatically generates summaries of long videos from YouTube. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Summify](https://summify.io/) | A powerful tool that efficiently condenses lengthy videos into concise and informative summaries. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Voxweave](https://voxweave.xyz/) | An innovative AI-powered tool that revolutionizes the interaction with YouTube videos by transforming them into concise summaries. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [WavoAI](https://wavoai.com/) | An AI-powered tool that provides accurate transcriptions and insights from audio recordings. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
### Music Generation
Music generation algorithms utilize deep learning models to **synthesize original compositions, enabling style-specific audio creation and adaptive soundtrack generation.**
| Tool | Description | Licence | Pricing |
|:--------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------:|:----------:|
| [Jukebox](https://jukebox.openai.com/) | A generative AI model developed by OpenAI that can create original music, including rudimentary singing, in a variety of genres and artist styles. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Magenta](https://magenta.tensorflow.org/) | AI project developed by Google that explores the use of machine learning as a tool for creative applications, particularly in music and art. | [Open Source](https://opensource.com/resources/what-open-source) | |
| [Mubert](https://mubert.com/) | A generative AI platform that allows users to create and stream original, AI-generated music and audio. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [MuseNet](https://openai.com/research/musenet) | An AI model developed by OpenAI that can generate original 4-minute musical compositions with up to 10 different instruments. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Stable Audio](https://stability.ai/stable-audio) | A generative AI system developed by Stability AI for creating high-quality audio and music. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Suno](https://suno.com/) | A cutting-edge AI-powered music generator that lets users create custom songs in various genres using text prompts. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
### Text-to-Speech Synthesis
Text-to-speech (TTS) systems employ **neural networks for voice synthesis, converting text input into natural speech output.** These models **support voice customization parameters including timbre, prosody, and linguistic variations.**
#### Text-to-Speech Models
> [!NOTE]
> The models are ranked according to their **Arena Elo score (with higher scores indicating better performance)** from the [Artificial Analysis leaderboard](https://artificialanalysis.ai/speech-to-text).
| Organization | Model Name | Arena Elo | Licence | Pricing |
|:------------------:|:----------------------------------------------------|:--------------------------------------------------------------------------------:|:--------------:|:---------------:|
| [OpenAI](https://openai.com/) | [TTS-1-HD](https://platform.openai.com/docs/models/tts-1-hd) | 1151 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [OpenAI](https://openai.com/) | [TTS-1](https://platform.openai.com/docs/models/tts-1) | 1137 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [ElevenLabs](https://elevenlabs.io/) | [Multilingual v2](https://elevenlabs.io/blog/eleven-multilingual-v2) | 1114 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [ElevenLabs](https://elevenlabs.io/) | [Turbo v2.5](https://elevenlabs.io/blog/introducing-turbo-v2-5) | 1110 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [ElevenLabs](https://elevenlabs.io/) | [Flash v2.5](https://elevenlabs.io/blog/meet-flash) | 1108 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Cartesia](https://www.cartesia.ai/) | [Sonic English](https://docs.cartesia.ai/getting-started/available-models) | 1106 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Hexgrad](https://github.com/hexgrad/kokoro) | [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) | 1091 | [Open Source](https://opensource.com/resources/what-open-source) | |
| [MiniMax](https://www.minimaxi.com/en) | [T2A-01-HD](https://www.minimax.io/news/speech-01-hd-release) | 1081 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [AWS](https://aws.amazon.com/ai/) | [Polly Generative](https://aws.amazon.com/polly/) | 1060 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Microsoft](https://www.microsoft.com/) | [Azure Neural](https://speech.microsoft.com/portal) | 1058 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [AWS](https://aws.amazon.com/ai/) | [Polly Long-form](https://aws.amazon.com/polly/) | 1058 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [MiniMax](https://www.minimaxi.com/en) | [T2A-01-Turbo](https://www.minimax.io/news/speech-01) | 1042 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Google](https://gemini.google.com/) | [TTS Studio](https://cloud.google.com/text-to-speech?hl=en) | 1039 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Fish Audio](https://fish.audio/) | [Fish Speech 1.5](https://fish.audio/) | 1034 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [PlayAI](https://play.ai/) | [Dialog](https://play.ai/play-dialog) | 1014 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Zyphra](https://www.zyphra.com/) | [Zonos v0.1](https://www.zyphra.com/post/beta-release-of-zonos-v0-1) | 1000 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [PlayAI](https://play.ai/) | [3.0 Mini](https://play.ai/play-mini) | 994 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [MyShell](https://research.myshell.ai) | [OpenVoice V2](https://cloud.google.com/text-to-speech?hl=en) | 972 | [Open Source](https://opensource.com/resources/what-open-source) | |
| [Murf](https://murf.ai/) | [Murf Speech Gen 2](https://murf.ai/text-to-speech-gen-2) | 972 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [LMNT](https://www.lmnt.com/) | [LMNT](https://www.lmnt.com/) | 971 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [StepFun](https://platform.stepfun.com/) | [Step TTS Mini](https://www.lmnt.com/) | 959 | [Open Source](https://opensource.com/resources/what-open-source) | |
| [Coqui](https://coqui.ai/) | [XTTS V2](https://huggingface.co/coqui/XTTS-v2) | 898 | [Open Source](https://opensource.com/resources/what-open-source) | |
| [yl4579](https://github.com/yl4579) | [StyleTTS 2](https://huggingface.co/spaces/styletts2/styletts2) | 889 | [Open Source](https://opensource.com/resources/what-open-source) | |
| [MetaVoice](https://studio.themetavoice.xyz/) | [MetaVoice V1](https://github.com/metavoiceio/metavoice-src) | 784 | [Open Source](https://opensource.com/resources/what-open-source) | |
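
Arena-style Elo scores like the ones in the table above are typically derived from pairwise comparisons between models. The sketch below shows the textbook Elo expected-score and update formulas, to give a feel for what a rating gap means. The function names and the K-factor of 32 are assumptions for illustration; the leaderboard's exact rating procedure is not described here and may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    k (the K-factor) is an assumed value, not taken from the leaderboard.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Ratings taken from the first and last rows of the table above.
    print(round(expected_score(1151, 784), 2))
    print(update_elo(1151, 784, score_a=0.0))
```

Under this model, a gap of about 370 points (1151 versus 784) corresponds to the higher-rated model being preferred in roughly 89% of head-to-head comparisons, while two models with equal ratings are expected to split votes evenly.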
#### Text-to-Speech Providers
| Tool | Description | Licence | Pricing |
|:-----------------------------------------------------------|-------------------------------------------------------------------------------------------------------------|:-----------:|:---------:|
| [Audioread](https://audioread.com/) | A transformative tool that converts text into lifelike speech. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Bark](https://github.com/suno-ai/bark) | A groundbreaking text-to-audio model developed by Suno, leveraging GPT-style models. | [Open Source](https://opensource.com/resources/what-open-source) | |
| [Coqui](https://github.com/coqui-ai/TTS) | A pioneering project that focused on advancing generative voice technology. | [Open Source](https://opensource.com/resources/what-open-source) | |
| [Eleven Labs](https://elevenlabs.io/) | An industry-leading proprietary tool for generating speech from text using deep learning. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Listnr](https://listnr.ai/) | A cutting-edge AI voice generator that seamlessly converts text into natural-sounding speech. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [MeloTTS](https://github.com/myshell-ai/MeloTTS) | An open-source text-to-speech tool that uses deep learning to generate high-quality speech synthesis. | [Open Source](https://opensource.com/resources/what-open-source) | |
| [Metavoice](https://github.com/metavoiceio/metavoice-src) | A groundbreaking model that has been developed to create human-like speech with emotional nuances. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Murf](https://murf.ai/) | An innovative voice generator tool that revolutionizes the process of creating voiceovers. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [SpeechT5](https://github.com/microsoft/SpeechT5/) | A cutting-edge model in speech synthesis and natural language processing that offers a unified approach to various speech-related tasks. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Speechki](https://speechki.org/) | An advanced AI Realistic Voice Generator that offers over 1100 voices in more than 80 languages. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Unrealspeech](https://unrealspeech.com/) | A text-to-speech software that stands out for its human-like audio output, providing a superior listening experience. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [VoiceCraft](https://jasonppy.github.io/VoiceCraft_web/) | A state-of-the-art text-to-speech (TTS) model that can perform zero-shot speech editing and TTS on diverse audio data. | [Open Source](https://opensource.com/resources/what-open-source) | |
### Speech Recognition
Speech recognition systems **convert acoustic signals into text through automated speech recognition (ASR) models.** These systems process audio input for text transcription and voice command interpretation.
#### Speech-to-Text Models
> [!NOTE]
> Models are ranked by Word Error Rate (WER): the percentage of words transcribed incorrectly, where lower is better. Scores are from the [Artificial Analysis leaderboard](https://artificialanalysis.ai/speech-to-text).
| Organization | Model Name | Word Error Rate (%) | Licence | Pricing |
|:------------------:|:----------------------------------------------------|:-------------------:|:-----------:|:---------:|
| [ElevenLabs](https://elevenlabs.io/) | [Scribe](https://elevenlabs.io/speech-to-text) | 7.7 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Speechmatics](https://www.speechmatics.com/) | [Enhanced](https://www.speechmatics.com/) | 8.6 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [AssemblyAI](https://www.assemblyai.com/) | [Universal-2](https://www.assemblyai.com/research/universal-1) | 8.6 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [AssemblyAI](https://www.assemblyai.com/) | [Universal-1](https://www.assemblyai.com/research/universal-1) | 8.7 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Google](https://gemini.google.com/) | [Chirp 2](https://cloud.google.com/speech-to-text?hl=en) | 9.8 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [OpenAI](https://openai.com/) | [Whisper Large V3](https://huggingface.co/openai/whisper-large-v3) | 10.3 | [Open source](https://opensource.com/resources/what-open-source) | |
| [OpenAI](https://openai.com/) | [Whisper Large V2](https://huggingface.co/openai/whisper-large-v2) | 10.6 | [Open source](https://opensource.com/resources/what-open-source) | |
| [AWS](https://aws.amazon.com/ai/) | [Transcribe](https://aws.amazon.com/transcribe/) | 11.2 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Google](https://gemini.google.com/) | [Chirp](https://cloud.google.com/speech-to-text?hl=en) | 12.4 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Speechmatics](https://www.speechmatics.com/) | [Standard](https://www.speechmatics.com/) | 12.6 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Deepgram](https://deepgram.com/) | [Nova-3](https://deepgram.com/product/speech-to-text) | 12.8 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Hugging Face](https://huggingface.co/) | [distil-large-v3](https://huggingface.co/distil-whisper/distil-large-v3) | 13.0 | [Open source](https://opensource.com/resources/what-open-source) | |
| [OpenAI](https://openai.com/) | [GPT-4o Transcribe](https://platform.openai.com/docs/models/gpt-4o-transcribe) | 13.2 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Deepgram](https://deepgram.com/) | [Nova-2](https://deepgram.com/learn/nova-2-speech-to-text-api) | 15.1 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Fish Audio](https://fish.audio/) | [Fish Speech to text](https://fish.audio/) | 19.1 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
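
The Word Error Rate used to rank the models above can be sketched in a few lines of Python. This is a minimal illustration, assuming whitespace tokenization and lowercasing as the only text normalization; the leaderboard's own normalization rules are not stated here and may differ. The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / number of reference words."""
    # Assumption: lowercase + whitespace split is the only normalization applied.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
    print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 1))  # 33.3
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.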
#### Speech-to-Text Providers
| Tool | Description | Models | Pricing |
|:-----------------|-----------------------------------------------------------------------------------------------------------------------------|:-----------:|:-----------:|
| [Amazon Web Services (AWS)](https://aws.amazon.com/bedrock/) | A fully managed service provided by Amazon Web Services (AWS) designed to facilitate the development of generative AI applications. | [Amazon Transcribe](https://aws.amazon.com/transcribe/) | |
| [AssemblyAI](https://www.assemblyai.com/products/speech-to-text) | A powerful speech recognition and audio intelligence platform. | [Universal-1](https://www.assemblyai.com/research/universal-1) | |
| [Deepgram](https://deepgram.com/) | A speech recognition platform offering accurate transcription, advanced AI capabilities, and developer-friendly tools. | [Nova-2](https://deepgram.com/learn/nova-2-speech-to-text-api) and [Whisper Large V2](https://huggingface.co/openai/whisper-large-v2) | |
| [DeepInfra](https://deepinfra.com/chat) | A platform that provides scalable and cost-effective infrastructure for deploying machine learning models. | [Whisper Large V3](https://huggingface.co/openai/whisper-large-v3) and [distil-large-v3](https://huggingface.co/distil-whisper/distil-large-v3) | [Freemium](https://builtin.com/articles/freemium) |
| [Fal.ai](https://fal.ai/) | A powerful cloud platform designed for deploying and integrating AI models into applications. | [Whisper Large V3](https://huggingface.co/openai/whisper-large-v3) | |
| [Gladia](https://www.gladia.io/) | An advanced AI platform that specializes in real-time transcription, translation, and audio intelligence. | [Whisper Large V2](https://huggingface.co/openai/whisper-large-v2) | [Freemium](https://builtin.com/articles/freemium) |
| [Google](https://cloud.google.com/speech-to-text?hl=en) | A powerful service offered by Google Cloud that utilizes advanced machine learning techniques to convert spoken language into written text. | [Chirp](https://cloud.google.com/speech-to-text?hl=en) | [Freemium](https://builtin.com/articles/freemium) |
| [Groq](https://groq.com/) | Specializes in high-performance AI inference with custom LPU (Language Processing Unit) hardware, offering models like Meta's Llama 3. | [Whisper Large V3](https://huggingface.co/openai/whisper-large-v3) and [distil-large-v3](https://huggingface.co/distil-whisper/distil-large-v3) | [Freemium](https://builtin.com/articles/freemium) |
| [Microsoft Azure](https://azure.microsoft.com/) | A comprehensive suite of AI services and tools designed to help developers and organizations build, deploy, and manage AI applications at scale. | [Whisper Large V2](https://huggingface.co/openai/whisper-large-v2) | |
| [OpenAI](https://openai.com/index/whisper/) | A state-of-the-art automatic speech recognition (ASR) system developed by OpenAI. | [Whisper Large V2](https://huggingface.co/openai/whisper-large-v2) | |
| [Replicate](https://replicate.com/home) | A cloud platform that allows developers to easily run and deploy open-source machine learning models. | All Whisper family | |
| [Rev AI](https://www.rev.ai/) | A sophisticated speech recognition platform that provides automatic speech-to-text transcription services. | Rev AI | |
| [Speechmatics](https://www.speechmatics.com/) | A powerful AI-driven speech recognition and transcription platform. | [Universal-1](https://www.assemblyai.com/research/universal-1) | |
### Voice Assistants
These systems combine multiple AI technologies to create interactive voice experiences.
#### Voice Assistants Models
| Organization | Model Family | Best Model | Licence | Pricing |
|:------------------:|:----------------------------------------------------|:---------------------------------------------------------------------------------|:---------------:|:---------------:|
| [Kyutai](https://kyutai.org/) | [Moshi](https://moshi.chat/) | [Moshi v0.1](https://huggingface.co/kyutai/moshiko-pytorch-bf16) | [Open source](https://opensource.com/resources/what-open-source) | |
#### Voice Assistants Providers
| Tool | Description | Models | Pricing |
|:-----------------|-----------------------------------------------------------------------------------------------------------------------------|:-----------:|:-----------:|
| [OpenAI](https://openai.com/index/chatgpt-can-now-see-hear-and-speak/) | Premium voice interface for GPT-4, offering natural conversations with high-quality voice synthesis and recognition. Features multiple voice options and seamless integration with ChatGPT. | [GPT-4o](https://openai.com/index/hello-gpt-4o/) | |
| [Gemini](https://gemini.google/assistant/?hl=en) | Google's conversational AI assistant offering natural voice interactions through the Gemini app. Features multilingual support, voice input/output, and integration with Google services. | [Gemini 1.5 Pro](https://gemini.google.com/?hl=en) | [Freemium](https://builtin.com/articles/freemium) |
# Automation
### Autonomous Agents
AI agents are autonomous software systems that execute predefined tasks through decision-making algorithms and environment interaction protocols. These systems **implement adaptive learning mechanisms and inter-agent communication frameworks to achieve specified objectives.**
| Tool | Description | Licence | Pricing |
|:-----------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|:-----------:|:-----------:|
| [AgentGPT](https://agentgpt.reworkd.ai/) | A generative AI tool that lets users create autonomous AI agents capable of performing a variety of tasks. | [Open source](https://opensource.com/resources/what-open-source) | [Freemium](https://builtin.com/articles/freemium) |
| [Cognosys](https://www.cognosys.ai/) | An AI assistant that can help you automate tasks, organize your work, and perform research. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Evo.ninja](https://evo.ninja/) | A generalist agent that can flow between multiple agent personas to solve any task. | [Open source](https://opensource.com/resources/what-open-source) | [Freemium](https://builtin.com/articles/freemium) |
| [Godmode](https://godmode.space/) | A web platform that provides access to innovative AI agents like AutoGPT and BabyAGI, allowing users to harness the power of autonomous AI agents. | [Open source](https://opensource.com/resources/what-open-source) | |
| [GPT-Engineer](https://github.com/gpt-engineer-org/gpt-engineer) | An open-source AI-powered application builder that generates codebases from natural language project descriptions. | [Open source](https://opensource.com/resources/what-open-source) | |
| [Super AGI](https://superagi.com/) | An open-source autonomous AI agent framework that enables developers to build, manage, and run useful autonomous agents efficiently and reliably. | [Open source](https://opensource.com/resources/what-open-source) | |
### Automation Tools
Execute predefined task sequences through algorithmic workflows to **optimize process efficiency and minimize operational variance.**
| Tool | Description | Licence | Pricing |
|:-----------------|-----------------------------------------------------------------------------------------------------------------------------|:-------------:|:-------------:|
| [Bardeen](https://www.bardeen.ai/) | An AI-powered automation platform that enables users to automate repetitive tasks across various applications without writing code. It offers pre-built integrations with popular tools and allows users to create custom workflows. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Cykel](https://www.cykel.ai/) | An AI company focused on developing intelligent automation solutions that can understand natural language and interact with various software and websites to automate complex digital tasks for businesses. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Gumloop](https://www.gumloop.com/) | An AI-native workflow automation platform that allows users to build complex automations by visually connecting modular components on a canvas. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Lindy](https://www.lindy.ai/) | An advanced automation platform designed to create custom AI assistants that streamline various business workflows without requiring coding skills. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [n8n](https://n8n.io/) | A free, fair-code licensed workflow automation tool. It allows users to create workflows using a visual editor and connect various services to automate tasks. n8n can be self-hosted, providing users with more control over their data. | [Open source](https://opensource.com/resources/what-open-source) | |
| [ProFlow](https://useproflow.com/) | An AI-powered workflow automation and optimization platform that helps businesses streamline their sales, marketing, and operations processes. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Taskade](https://www.taskade.com/) | An all-in-one collaboration platform that combines project management, task tracking, and team communication features. It offers real-time syncing, customizable templates, and integrations with popular tools. Taskade also has AI-powered features like smart due dates and natural language processing for better task management. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Zapier](https://zapier.com/) | A popular web-based automation platform that connects various apps and services to automate workflows. It offers a wide range of pre-built integrations and allows users to create custom automation rules called "Zaps" without needing to write code. Zapier's AI capabilities include filtering, formatting, and transforming data between apps. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
# Computer Vision
Computer Vision (CV) frameworks implement **neural architectures for visual data processing, analysis, and synthesis across image and video domains**.
> [!CAUTION]
> Use AI-generated images responsibly: **Always disclose that they were generated by AI.**
> Be mindful of **intellectual property rights.**
> [!TIP]
> **Learn prompt engineering techniques for image generation models to enhance output quality and artistic control.** Follow [@nickfloats](https://x.com/nickfloats) on **X** for valuable insights on **crafting prompts that achieve your desired visual outputs.**
### Image Editing
| Tool | Description | Licence | Pricing |
|:------------------|---------------------------------------------------------------------------------------------------------------------|:-----------:|:-----------:|
| [BRIA AI](https://huggingface.co/spaces/briaai/BRIA-RMBG-1.4) | An AI-powered model to automatically remove backgrounds from images. | [Open source](https://opensource.com/resources/what-open-source) | |
| [Clarity AI](https://github.com/philz1337x/clarity-upscaler) | An AI image upscaler and enhancer; a free and open-source alternative to Magnific. | [Open source](https://opensource.com/resources/what-open-source) | |
| [ImageFX](https://aitestkitchen.withgoogle.com/tools/image-fx) | An AI-powered tool for applying various image effects and filters. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Lensa](https://prisma-ai.com/lensa) | An AI-powered mobile app for editing and enhancing photos, particularly for portrait editing. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Luminar Neo](https://skylum.com/fr/luminar) | An AI-powered photo editing software developed by Skylum. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Magnific AI](https://magnific.ai/) | An AI-powered image upscaler and enhancer designed for professionals and enthusiasts in photography, graphic design, digital art, and illustration. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Pixlr](https://pixlr.com/) | An AI-powered online photo editing tool. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Removebg](https://www.remove.bg/) | An online tool that allows users to automatically remove backgrounds from images. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [ZMO AI](https://remover.zmo.ai/) | Comprehensive online platform offering AI-powered image editing tools. Features include background removal, object erasure, image enhancement, and creative modifications. | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
### Image Generation
#### Image Generation Models
> [!NOTE]
> The models are ranked according to their **Elo scores (with higher scores indicating better performance)** from the [artificialanalysis.ai text to Image Arena](https://artificialanalysis.ai/text-to-image/arena) and [Imgsys.org Ranking](https://imgsys.org/rankings). Please note that **Elo scores are subject to change** based on user votes and will be updated regularly to reflect the latest rankings.
>
> To provide a comprehensive overview of the generative image model landscape, only **pre-trained versions of the listed models are included in this ranking.**
>
> Due to the continuous evolution and **vast number of possible fine-tuned configurations, it is impractical to comprehensively list every variant here.**
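
As background for the Elo scores in this ranking, here is a minimal sketch of the standard Elo update after one head-to-head vote. It assumes the textbook formula with a 400-point scale and a K-factor of 32; the arenas cited in the note above may use different constants or a different fitting procedure entirely. The function names are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A wins the vote, 0.0 if B wins, 0.5 for a tie.
    k is an assumed K-factor; it controls how far one vote moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's change mirrors A's, so the total number of rating points is conserved.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


if __name__ == "__main__":
    # Two equally rated models: the winner gains k/2 points, the loser drops k/2.
    print(update_elo(1000.0, 1000.0, 1.0))  # (1016.0, 984.0)
```

Under this model, a 39-point gap such as 1144 versus 1105 corresponds to the higher-rated model being preferred in roughly 56% of comparisons, so small Elo differences indicate close quality.
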
| Organization | Model | Elo score | Licence | Pricing |
|:--------------------:|:------------------------------------------------------------------------------|:----------:|:-----------:|:---------:|
| [OpenAI](https://openai.com/) | [GPT-4o](https://openai.com/index/hello-gpt-4o/) | 1144 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Recraft](https://www.recraft.ai/) | [Recraft V3](https://www.recraft.ai/blog/recraft-introduces-a-revolutionary-ai-model-that-thinks-in-design-language) | 1105 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [HiDream](https://hidreamai.com/home) | [HiDream-I1-Dev](https://huggingface.co/HiDream-ai/HiDream-I1-Dev) | 1103 | [Open source](https://opensource.com/resources/what-open-source) | |
| [Reve AI](https://preview.reve.art/) | [Reve Image 1.0](https://preview.reve.art/) | 1098 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Google](https://gemini.google.com/) | [Imagen 3](https://deepmind.google/technologies/imagen-3/) | 1095 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Black Forest Labs](https://blackforestlabs.ai) | [Flux1.1 Pro](https://blackforestlabs.ai/) | 1079 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Black Forest Labs](https://blackforestlabs.ai) | [Flux.1 Pro](https://blackforestlabs.ai/) | 1064 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [MiniMax](https://www.minimaxi.com/en) | [MiniMax Image-01](https://replicate.com/minimax/image-01) | 1049 | [Open source](https://opensource.com/resources/what-open-source) | |
| [Midjourney](https://www.midjourney.com/home) | [Midjourney v6.1](https://www.midjourney.com/home) | 1045 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Black Forest Labs](https://blackforestlabs.ai) | [Flux.1 Dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) | 1042 | [Open source](https://opensource.com/resources/what-open-source) | |
| [Ideogram](https://ideogram.ai/login) | [Ideogram v2](https://ideogram.ai/login) | 1041 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Midjourney](https://www.midjourney.com/home) | [Midjourney v7 Alpha](https://www.midjourney.com/home) | 1039 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Midjourney](https://www.midjourney.com/home) | [Midjourney v6](https://www.midjourney.com/home) | 1038 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | |
| [Ideogram](https://ideogram.ai/login) | [Ideogram v2 Turbo](https://ideogram.ai/login) | 1033 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Luma Labs](https://lumalabs.ai/) | [Photon](https://lumalabs.ai/photon) | 1033 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Stability AI](https://stability.ai/) | [Stable Diffusion 3.5 Large Turbo](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo) | 1030 | [Open source](https://opensource.com/resources/what-open-source) | |
| [Stability AI](https://stability.ai/) | [Stable Diffusion 3.5 Large](https://huggingface.co/stabilityai/stable-diffusion-3.5-large) | 1026 | [Open source](https://opensource.com/resources/what-open-source) | |
| [ByteDance](https://www.bytedance.com/en/) | [Infinity 8B](https://foundationvision.github.io/infinity.project/) | 1021 | [Open source](https://opensource.com/resources/what-open-source) | |
| [Ideogram](https://ideogram.ai/login) | [Ideogram v1](https://ideogram.ai/login) | 1021 | [Proprietary](https://www.heavybit.com/library/article/open-source-vs-proprietary) | [Freemium](https://builtin.com/articles/freemium) |
| [Stability AI](https://stability.ai/) | [Stable Diffusion 3 Large](https://stability.ai/stable-image) | 1014 | [Open source](https://opensource.com/resources/what-open-source) | |
| [Black Forest Labs](https://blackforestlabs.ai) | [Flux.1 schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) | 1000 | [Open source](https://opensource.com/resources/what-open-source) | |
| [
](https://playground.com) | [Playground v3 (beta)](https://playground.com/pg-v3) | 997 | [
](https://opensource.com/resources/what-open-sourc