https://github.com/sameerqureshii/jarvis-ai-assistant

A voice-controlled AI assistant built with Python that combines speech recognition, text-to-speech, and Google's Gemini AI API to create a personalized virtual assistant experience.

Host: GitHub
URL: https://github.com/sameerqureshii/jarvis-ai-assistant
Owner: sameerqureshii
License: mit
Created: 2025-08-12T14:55:33.000Z (11 months ago)
Default Branch: main
Last Pushed: 2025-08-12T15:54:46.000Z (11 months ago)
Last Synced: 2025-08-12T17:18:35.066Z (11 months ago)
Size: 15.6 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Jarvis AI Assistant
A voice-controlled AI assistant built with Python that combines speech recognition, text-to-speech, and Google's Gemini AI to create a personalized virtual assistant experience.

# ✨ Features:

🎤 `Voice Recognition`: Wake word detection with "Jarvis"

🗣️ `Text-to-Speech`: Natural voice responses using Google TTS

🧠 `AI Integration`: Powered by Google Gemini for intelligent responses

🌐 `Web Navigation`: Quick access to popular websites

🎵 `Music Playback`: Voice-controlled playback of songs from your music library

📰 `News Updates`: Fetch latest headlines from News API

⚡ `Fast Commands`: Optimized hardcoded commands for common tasks

# 🛠️ Installation:
**Prerequisites:**

1. Python 3.7 or higher

2. A microphone for voice input

3. Valid API keys for NewsAPI and Google Gemini

4. An internet connection for gTTS, NewsAPI, and Gemini AI

# Required Dependencies:

`pip install speechrecognition`

`pip install pyttsx3`

`pip install requests`

`pip install google-generativeai`

`pip install gtts`

`pip install pygame`

`pip install pyaudio`

# API Keys Setup:

• News API: Get your free API key from NewsAPI.

• Google Gemini API: Get your API key from Google AI Studio.

# 📁 Project Structure:

jarvis-ai-assistant/

├── main.py # Main assistant code

├── musicLibrary.py # Predefined song names & YouTube links

└── README.md # Project documentation

# ⚙️ Configuration:

Before running the assistant, update the API keys in `main.py`:

• NEWS_API_KEY = "Your-API"

• GEMINI_API_KEY = "Your-API"

# 🚀 Usage:
**Run the assistant:**

`python main.py`

1. Wait for "Initializing Jarvis..." message
2. Say "Jarvis" to wake the assistant
3. Wait for "Yes, Sameer" response
4. Give your command within 6 seconds
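
The wake-and-command cycle above can be sketched roughly as follows. This is a minimal sketch, not the code in `main.py`: the function `handle_wake_cycle`, its `listen`, `speak`, and `handle` parameters, and the 2-second wake window are hypothetical. Only the "Jarvis" wake word, the "Yes, Sameer" reply, and the 6-second command window come from the steps above.

```python
WAKE_WORD = "jarvis"
COMMAND_TIMEOUT = 6  # seconds, per step 4 above


def handle_wake_cycle(listen, speak, handle):
    """Run one wake/command cycle; return True if a command was handled.

    listen(timeout) -> str or None, speak(text), and handle(command) are
    hypothetical callables supplied by the caller.
    """
    heard = listen(2)  # assumed short window for the wake word
    if not heard or heard.strip().lower() != WAKE_WORD:
        return False
    speak("Yes, Sameer")
    command = listen(COMMAND_TIMEOUT)
    if not command:
        return False
    handle(command)
    return True
```

Passing the three callables in keeps the cycle easy to try without a microphone, for example with `print` standing in for `speak`.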

# 📋 Function Documentation:
**Core Functions:**

`speak(text)`

**Purpose:** Converts text to speech using Google's TTS engine.

**How it works:**

1. Creates a gTTS (Google Text-to-Speech) object
2. Saves audio to temporary MP3 file
3. Uses pygame to play the audio
4. Automatically cleans up temporary files
5. Falls back to error handling if TTS fails

**Parameters:**

• `text` (string): Text to be spoken
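
The five steps listed for `speak(text)` might look roughly like this. It is a sketch under assumptions rather than the project's actual code: the English language setting, the polling interval, and the printed error message are all illustrative choices.

```python
import os
import tempfile
import time


def speak(text):
    """Speak `text` via gTTS, playing the generated MP3 with pygame."""
    # Reserve a temporary .mp3 path; the file is removed in `finally`.
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)
    try:
        # Imported here so a missing package is reported like any TTS error.
        from gtts import gTTS
        import pygame

        gTTS(text=text, lang="en").save(path)  # needs internet access
        pygame.mixer.init()
        pygame.mixer.music.load(path)
        pygame.mixer.music.play()
        while pygame.mixer.music.get_busy():  # block until playback ends
            time.sleep(0.1)
        pygame.mixer.music.unload()  # release the file (pygame 2.0+)
    except Exception as exc:
        print(f"TTS failed: {exc}")  # assumed fallback: just report it
    finally:
        try:
            os.remove(path)
        except OSError:
            pass  # file may still be locked or already gone
```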

`speak_old(text)`

**Purpose:** Alternative text-to-speech using offline pyttsx3 engine.

**How it works:**

• Uses the local pyttsx3 engine for offline TTS

• Works without an internet connection, so it is more reliable, but sounds less natural than Google TTS

**Parameters:**

• `text` (string): Text to be spoken
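
A minimal sketch of the offline variant, assuming the default pyttsx3 driver and voice settings (the README does not say whether `main.py` customizes them):

```python
def speak_old(text):
    """Speak `text` offline with the local pyttsx3 engine."""
    import pyttsx3  # imported lazily; requires `pip install pyttsx3`

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()  # blocks until the phrase has been spoken
```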

`aiProcess(command)`

**Purpose:** Processes natural language commands using Google Gemini AI.

**How it works:**

1. Sends user command to Gemini 1.5 Flash model
2. Uses system instruction to maintain Jarvis personality
3. Returns AI generated response text
4. Includes error handling for API failures

**Parameters:**

• `command` (string): User's voice command

**Returns:** AI generated response text
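
As an illustration, the steps above could be written with the `google-generativeai` package along these lines. The system instruction text and the fallback reply are assumptions made for this sketch; the actual wording lives in `main.py`.

```python
GEMINI_API_KEY = "Your-API"  # placeholder, replace with your own key
FALLBACK_REPLY = "Sorry, I could not reach the AI service."  # assumed wording


def aiProcess(command):
    """Send `command` to Gemini 1.5 Flash and return the reply text."""
    try:
        import google.generativeai as genai  # pip install google-generativeai

        genai.configure(api_key=GEMINI_API_KEY)
        model = genai.GenerativeModel(
            "gemini-1.5-flash",
            # Assumed persona text, for illustration only.
            system_instruction="You are Jarvis, a concise voice assistant.",
        )
        return model.generate_content(command).text
    except Exception as exc:
        # Covers missing packages, invalid keys, and network failures.
        print(f"Gemini request failed: {exc}")
        return FALLBACK_REPLY
```

Returning a fixed reply on failure means the caller can always pass the result straight to `speak`.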

`processCommand(c)`

**Purpose:** Main command processor that handles both hardcoded and AI commands.

**How it works:**

**1. Website Navigation (hardcoded for speed):**

• "open google" → Opens Google

• "open chatgpt" → Opens ChatGPT

• "open github" → Opens the developer's GitHub profile

• "open linkedin" → Opens the developer's LinkedIn profile

**2. Music Playback:**

• Command format: "play [song_name]"

• Searches `musicLibrary.py` for song links

• Opens YouTube links in the default browser

• Provides feedback if the song is not found

**3. News Updates:**

• Triggers on the "news" keyword

• Fetches top US headlines from News API

• Reads the first 3 headlines aloud

• Handles API errors gracefully

**4. AI Fallback:**

• Any unrecognized command goes to Gemini AI

• Provides intelligent responses for general queries
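
The branch order described above can be sketched as a small routing function. The name `route`, the tuple it returns, exact-match handling of the "open ..." phrases, and the Google and ChatGPT URLs are assumptions for illustration; `processCommand` in `main.py` may match and act on commands differently.

```python
SITES = {
    "open google": "https://www.google.com",
    "open chatgpt": "https://chatgpt.com",
    "open github": "https://github.com/sameerqureshii",
    "open linkedin": "https://www.linkedin.com/in/sameer-ahmed-qureshi56",
}


def route(command):
    """Classify a command as an (action, argument) pair.

    Actions: "open" (URL), "play" (song name), "news" (None), "ai" (text).
    """
    c = command.strip().lower()
    if c in SITES:  # hardcoded website shortcuts are checked first
        return ("open", SITES[c])
    if c.startswith("play "):
        return ("play", c[len("play "):].strip())
    if "news" in c:
        return ("news", None)
    return ("ai", command)  # anything unrecognized goes to Gemini
```

A caller could then act on the result, for instance by passing an "open" URL to `webbrowser.open` and an "ai" command to `aiProcess`.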

**Parameters:**

• `c` (string): Voice command from user
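
For the News Updates branch of `processCommand`, a fetch of the first three US headlines might look like the sketch below. The helper names `extract_headlines` and `fetch_headlines`, the 10-second timeout, and the printed error message are hypothetical. The endpoint and the `country` and `apiKey` parameters follow NewsAPI's top-headlines API.

```python
NEWS_API_KEY = "Your-API"  # placeholder, replace with your own key
NEWS_URL = "https://newsapi.org/v2/top-headlines"


def extract_headlines(payload, limit=3):
    """Return up to `limit` non-empty titles from a NewsAPI response dict."""
    articles = payload.get("articles") or []
    titles = [a["title"] for a in articles if a.get("title")]
    return titles[:limit]


def fetch_headlines(limit=3):
    """Fetch top US headlines; return an empty list on any API error."""
    import requests  # pip install requests

    try:
        response = requests.get(
            NEWS_URL,
            params={"country": "us", "apiKey": NEWS_API_KEY},
            timeout=10,
        )
        response.raise_for_status()
        return extract_headlines(response.json(), limit)
    except (requests.RequestException, ValueError) as exc:
        print(f"News request failed: {exc}")
        return []
```

Each returned title can be passed to `speak` in turn.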

# 🎵 Music Library:

The `musicLibrary.py` file contains a dictionary of song names mapped to YouTube URLs:

music = {

"stealth": "YouTube URL",

"march": "YouTube URL",

"skyfall": "YouTube URL",

"wolf": "YouTube URL"

}

**Usage:** Say "play [song_name]" where song_name matches a key in the dictionary.

**Adding Songs:** Add new entries to the music dictionary in the format `"song_name": "youtube_url"`
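
A lookup against this dictionary could be sketched as follows. The helper names `song_from_command` and `find_song_url` and the case-insensitive matching are assumptions for this sketch, not taken from `main.py`.

```python
def song_from_command(command):
    """Return the lowercased song name from 'play <song>', else None."""
    parts = command.strip().split(maxsplit=1)
    if len(parts) == 2 and parts[0].lower() == "play":
        return parts[1].strip().lower()
    return None


def find_song_url(command, library):
    """Look up the requested song in `library`; None if it is not listed."""
    song = song_from_command(command)
    if song is None:
        return None
    return library.get(song)
```

With the real library, the call would be `find_song_url(command, musicLibrary.music)`, and a non-`None` result could be opened with `webbrowser.open`.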

# Modifying Voice Settings:

Adjust recognition parameters in the main loop:

• `timeout`: Maximum wait time for voice input

• `phrase_time_limit`: Maximum length of spoken phrase

• `duration`: Ambient noise adjustment time
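
To show where these three parameters plug in, here is a minimal sketch using the `speech_recognition` package. The function name `listen_once`, the default values, and the use of `recognize_google` are assumptions; the values and recognizer used in `main.py` may differ.

```python
def listen_once(timeout=5, phrase_time_limit=3, duration=0.5):
    """Capture one phrase from the microphone; return its text or None."""
    import speech_recognition as sr  # pip install speechrecognition pyaudio

    recognizer = sr.Recognizer()
    try:
        with sr.Microphone() as source:
            # duration: seconds spent sampling background noise
            recognizer.adjust_for_ambient_noise(source, duration=duration)
            # timeout: max seconds to wait for speech to start
            # phrase_time_limit: max seconds a single phrase may last
            audio = recognizer.listen(
                source, timeout=timeout, phrase_time_limit=phrase_time_limit
            )
        return recognizer.recognize_google(audio)
    except (sr.WaitTimeoutError, sr.UnknownValueError, sr.RequestError):
        return None  # nothing heard, not understood, or service unreachable
```

Raising `timeout` gives the user longer to start speaking, while raising `phrase_time_limit` allows longer commands at the cost of slower responses.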

# Changing AI Personality:

Modify the system instruction in the Gemini model configuration:

`system_instruction="Your custom personality instructions here"`

# 📄 License:

This project is open source and available under the MIT License.

# 🤝 Contributing:

Feel free to fork this project and submit pull requests for improvements!

# 👨‍💻 Developer:

**Sameer Ahmed Qureshi**

`Portfolio`: https://sameer-personall-portfolio.vercel.app

`LinkedIn`: https://www.linkedin.com/in/sameer-ahmed-qureshi56

Built with Python and Google AI
