{"id":50929394,"url":"https://github.com/vitalune/pitt-llamaproject","last_synced_at":"2026-06-17T02:31:11.576Z","repository":{"id":317567855,"uuid":"1067630164","full_name":"vitalune/pitt-llamaproject","owner":"vitalune","description":"Helping students process and interact with their data.","archived":false,"fork":false,"pushed_at":"2025-10-15T20:39:44.000Z","size":13517,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-16T19:14:38.539Z","etag":null,"topics":["llamaindex","streamlit"],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vitalune.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-01T06:33:09.000Z","updated_at":"2025-10-15T20:39:47.000Z","dependencies_parsed_at":"2025-10-02T14:16:27.168Z","dependency_job_id":null,"html_url":"https://github.com/vitalune/pitt-llamaproject","commit_stats":null,"previous_names":["vitalune/pitt-llamaproject"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vitalune/pitt-llamaproject","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vitalune%2Fpitt-llamaproject","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vitalune%2Fpitt-llamaproject/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vitalune%2Fpitt-llamaproject/releases","manifests_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vitalune%2Fpitt-llamaproject/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vitalune","download_url":"https://codeload.github.com/vitalune/pitt-llamaproject/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vitalune%2Fpitt-llamaproject/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34431810,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llamaindex","streamlit"],"created_at":"2026-06-17T02:31:09.673Z","updated_at":"2026-06-17T02:31:11.565Z","avatar_url":"https://github.com/vitalune.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🤖 LlamaIndex RAG Chatbot Template\n\n![final product demo](example/visuals/embeddedui-demo.png)\n\nA Retrieval-Augmented Generation (RAG) chatbot template that answers questions based on your company's documents using LlamaIndex and OpenAI.\n\n\u003e **📘 For Students**: This is a template for your project. 
The main folder contains your workspace, and the `examples/` folder shows a complete working version for reference.\n\n---\n\n## 💼 Why Choose This High-Tech Option?\n\n**This project gives you a competitive edge.** By building an AI-powered chatbot with industry-standard tools, you'll gain hands-on experience with technologies that Fortune 500 companies and cutting-edge startups use daily—from working with OpenAI's API and deploying to professional platforms like Hugging Face, to managing code with Git and GitHub. These aren't just buzzwords: they're resume-ready skills that distinguish you in any career path, whether you're pursuing roles in business, healthcare, law, marketing, or technology. You'll create a live, public portfolio piece that demonstrates technical problem-solving, modern AI fluency, and the ability to build real-world applications—capabilities that employers across industries increasingly value. While the low-tech option is perfectly valid, this path transforms your class project into a genuine professional asset.\n\n---\n\n## 📋 Table of Contents\n\n- [Repository Structure](#-repository-structure)\n- [What This Does](#-what-this-does)\n- [Prerequisites](#-prerequisites)\n- [Setup Instructions](#-setup-instructions)\n- [Adding Your Data](#-adding-your-data)\n- [Development \u0026 Testing Options](#-development--testing-options)\n- [Testing with the Example](#-testing-with-the-example)\n- [Deploying to Hugging Face](#-deploying-to-hugging-face)\n- [Embedding in Your Website](#-embedding-in-your-website)\n- [Publishing to GitHub Pages](#-publishing-your-website-to-github-pages)\n- [Troubleshooting](#-troubleshooting)\n- [Project Integration](#-project-integration-engcmp-0600)\n- [Recommended Workflow](#-recommended-workflow)\n\n---\n\n## 📁 Repository Structure\n\n```\npitt-llama-project/\n├── README.md                    # ← You are here!\n├── .env.example                 # Template for your API key\n├── .gitignore                   # Protects 
sensitive files\n├── requirements.txt             # Python dependencies\n├── app.py                       # YOUR chatbot (work here!)\n├── data/                        # YOUR documents go here (currently empty)\n│   └── README.md\n│\n└── examples/                    # 👀 Reference only\n    ├── llama_test.ipynb         # Learning notebook for Colab\n    ├── index.html         # Full website (HTML, CSS, JS) saved locally w/ embedded chatbot script\n    ├── visuals/         # example images of UI \n    │   ├── embeddedui-demo.png  # example website w/ embedded chatbot\n    │   └── ui-demo.jpeg         # example ui during local streamlit testing\n    ├── data/                    # Example documents\n    │   ├── taylor_swift_biography.html\n    │   └── constitution.pdf\n    └── storage/                 # Pre-built index for example\n```\n\n### 🎯 Where to Work\n\n- **`app.py`** - Your main chatbot code (already complete, no edits needed!)\n- **`data/`** - Put YOUR company documents here\n- **`examples/`** - Look here if you get stuck (don't edit this!)\n\n### 📂 What Gets Created\n\nWhen you run the app, it will automatically create:\n- **`storage/`** - Cached index of your documents (speeds up loading)\n\n---\n\n## 🎯 What This Does\n\nThis chatbot uses **Retrieval-Augmented Generation (RAG)** to answer questions about your documents:\n\n1. **📖 Reads your documents** from the `data/` folder\n2. **🔍 Creates a searchable index** using AI embeddings\n3. **💬 Answers questions** by finding relevant information and generating responses\n4. **🧠 Remembers conversation** context within each chat session\n\n**Example Use Case**: A customer support chatbot that answers questions about your company's products, policies, or services.\n\n---\n\n## ✅ Prerequisites\n\nBefore starting, make sure you have:\n\n1. **A Google account** for Google Colab ([Sign up here](https://accounts.google.com/signup))\n2. 
**Google Colab Pro (FREE for students!)** ([Get it here](https://colab.research.google.com/signup/pricing))\n   - ✨ Faster execution\n   - ⏱️ Longer runtime limits\n   - 💾 More storage\n   - ⚡ Priority access to GPUs\n   - 🎓 **100% FREE with your .edu email** - verification takes ~2 seconds!\n3. **An OpenAI API key**\n   - 🎓 **I, Amir, will provide a shared API key** for the class\n   - No payment required! Use the key provided by me\n   - (Alternative: Use Gemini API within your Google Colab Workspace for free! For more info, see [this link](https://github.com/googlecolab/colabtools/blob/main/notebooks/Getting_started_with_google_colab_ai.ipynb))\n4. **A LlamaCloud API key (Optional but Recommended)**\n   - 🆓 **Free tier available** at [cloud.llamaindex.ai](https://cloud.llamaindex.ai/)\n   - Enables advanced parsing of PDFs with tables, charts, and complex layouts\n   - Get 1,000 free pages per month\n   - Not required but highly recommended for processing complex documents\n5. **A GitHub account** ([Sign up here](https://github.com/signup))\n6. **A Hugging Face account** ([Sign up here](https://huggingface.co/join)) - for deployment\n\n---\n\n## 🚀 Setup Instructions\n\n### Step 1: Create Your Google Colab Account\n\n1. Go to [Google Colab](https://colab.research.google.com/)\n2. Sign in with your Google account\n3. **Get Colab Pro for FREE**:\n   - Go to [Colab Pro pricing page](https://colab.research.google.com/signup/pricing)\n   - Click \"Get Colab Pro\" and verify with your .edu email\n   - Instant approval! No payment required for students 🎉\n   - Enjoy faster runtimes and priority access\n\n### Step 2: Fork This Repository\n\n1. Go to the repository on GitHub\n2. Click the \"Fork\" button in the top right\n3. This creates your own copy of the project\n\n### Step 3: Connect Colab to Your GitHub\n\n1. In Google Colab, click **File → Open notebook**\n2. Select the **GitHub** tab\n3. Enter your repository URL or search for your username\n4. 
Open `examples/llama_test.ipynb` to start learning!\n\n### Step 4: Set Up Your API Keys in Colab\n\n**Option A: Using Colab Secrets (Recommended)**\n\n1. In your Colab notebook, click the 🔑 key icon in the left sidebar\n2. Click \"Add new secret\"\n3. Add `OPENAI_API_KEY`:\n   - Name: `OPENAI_API_KEY`\n   - Value: `sk-proj-xxxxxxxxxxxxxxxxxxxxx` (your actual key)\n   - Toggle \"Notebook access\" to ON\n4. (Optional) Add `LLAMA_CLOUD_API_KEY`:\n   - Click \"Add new secret\" again\n   - Name: `LLAMA_CLOUD_API_KEY`\n   - Value: `llx-xxxxxxxxxxxxxxxxxxxxx` (your LlamaCloud key)\n   - Toggle \"Notebook access\" to ON\n\n**Option B: Using Code (Less Secure)**\n\n```python\nfrom google.colab import userdata\nimport os\n\n# This retrieves your secret keys\nos.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')\n# Optional: Enable advanced document parsing\nos.environ['LLAMA_CLOUD_API_KEY'] = userdata.get('LLAMA_CLOUD_API_KEY')\n```\n\n⚠️ **Important**: Never hardcode your API keys directly in the notebook!\n\n---\n\n## 📄 Adding Your Data\n\n### Supported File Types\n\n#### Complex Documents (Parsed with LlamaParse - requires LLAMA_CLOUD_API_KEY)\n- PDF documents (`.pdf`) - with advanced OCR, table extraction, and chart recognition\n- Word documents (`.docx`, `.doc`)\n- PowerPoint presentations (`.pptx`, `.ppt`)\n- Excel spreadsheets (`.xlsx`, `.xls`)\n\n#### Simple Text Files (Parsed with SimpleDirectoryReader)\n- HTML files (`.html`)\n- Text files (`.txt`)\n- Markdown files (`.md`)\n- CSV files (`.csv`)\n- JSON files (`.json`)\n- XML files (`.xml`)\n\n![LlamaParse demo](example/visuals/llamaparse-demo%20Large.jpeg)\n\n**New Feature**: The app now uses LlamaParse for advanced document parsing! 
LlamaParse provides:\n- High-quality OCR for scanned documents\n- Intelligent table extraction (even from images and charts)\n- Multi-column layout handling\n- Chart and graph text extraction\n- Better handling of complex PDFs with mixed content\n\nIf LLAMA_CLOUD_API_KEY is not set, the app will fall back to SimpleDirectoryReader for all files.\n\n### How to Add Documents to Colab\n\n**Method 1: Upload Directly (Quick Testing)**\n\n1. In your Colab notebook, run:\n   ```python\n   from google.colab import files\n   uploaded = files.upload()\n   ```\n2. Select your documents to upload\n3. Files will be in the current directory\n\n**Method 2: Mount Google Drive (Recommended)**\n\n1. Upload your documents to a folder in Google Drive (e.g., `My Drive/chatbot-data/`)\n2. In your Colab notebook:\n   ```python\n   from google.colab import drive\n   drive.mount('/content/drive')\n   ```\n3. Access files from: `/content/drive/MyDrive/chatbot-data/`\n\n**Method 3: Push to GitHub (For Deployment)**\n\n1. Add your documents to the `data/` folder in your repository\n2. Commit and push to GitHub\n3. Pull the repository in Colab or deploy directly to Hugging Face\n\n### Tips for Better Results\n- ✅ Use clear, well-formatted documents\n- ✅ Include only relevant company information\n- ✅ Break very large documents into smaller, topic-focused files\n- ❌ Don't include sensitive data (passwords, private info)\n- ❌ Avoid image-only PDFs (text must be selectable)\n\n---\n\n## 🧪 Development \u0026 Testing Options\n\nYou have **two options** for developing and testing your chatbot. 
Choose the one that works best for you!\n\n---\n\n### 🌐 Option 1: Google Colab (Recommended for Beginners)\n\n**Pros**: No installation needed, works in browser, free GPU access\n**Cons**: Temporary URLs, session expires after inactivity\n\n**Use this if**: You prefer browser-based development or don't want to install Python locally\n\n---\n\n### 💻 Option 2: Local Development (Recommended for Advanced Users)\n\n**Pros**: Persistent environment, faster development, code editing works offline (indexing and chat still call the OpenAI API)\n**Cons**: Requires Python installation and setup\n\n**Use this if**: You're comfortable with terminal/command line and want full control\n\n---\n\n## 🌐 Option 1: Testing in Google Colab\n\n### Phase 1: Learning with the Example Notebook\n\nThe example notebook (`examples/llama_test.ipynb`) teaches you RAG concepts interactively.\n\n1. **Open the example notebook** in Colab:\n   - Go to your forked repository\n   - Navigate to `examples/llama_test.ipynb`\n   - Click the \"Open in Colab\" badge (or manually open via Colab)\n\n2. **Install dependencies** (First cell - run this first!):\n   ```python\n   # STEP 1: Install all required packages\n   print(\"📦 Installing dependencies...\")\n   \n   !pip install -q streamlit==1.50.0\n   !pip install -q llama-index==0.14.4\n   !pip install -q llama-index-core==0.14.4\n   !pip install -q llama-index-llms-openai==0.6.4\n   !pip install -q llama-index-embeddings-openai==0.5.1\n   !pip install -q openai==1.109.1\n   !pip install -q python-dotenv==1.1.1\n   !pip install -q jedi==0.19.2\n   \n   print(\"✅ All dependencies installed!\")\n   ```\n   ⏱️ This takes 1-2 minutes. Wait for \"✅ All dependencies installed!\" before continuing.\n\n3. **Set up your API key** (Second cell):\n   ```python\n   # STEP 2: Configure OpenAI API Key\n   from google.colab import userdata\n   import os\n   \n   # Get API key from Colab secrets (you must add this first!)\n   os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')\n   print(\"✅ API key loaded\")\n   ```\n\n4. 
**Load and index documents** (Third cell):\n   ```python\n   # STEP 3: Load documents and create index\n   from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n   from llama_index.llms.openai import OpenAI\n   from llama_index.embeddings.openai import OpenAIEmbedding\n   \n   # Configure models\n   llm = OpenAI(model=\"gpt-5-nano-2025-08-07\", temperature=0.1)\n   embed_model = OpenAIEmbedding(model=\"text-embedding-3-small\")\n   \n   # Load documents from data folder\n   documents = SimpleDirectoryReader(\"data\").load_data()\n   print(f\"📄 Loaded {len(documents)} documents\")\n   \n   # Create searchable index (indexing only needs the embedding model)\n   index = VectorStoreIndex.from_documents(\n       documents,\n       embed_model=embed_model\n   )\n   print(\"✅ Index created successfully!\")\n   ```\n\n5. **Query the chatbot** (Fourth cell):\n   ```python\n   # STEP 4: Ask questions!\n   # Pass the configured LLM explicitly so queries use it\n   query_engine = index.as_query_engine(llm=llm)\n   \n   # Try your first question\n   response = query_engine.query(\"Your question here\")\n   print(response)\n   ```\n\n6. **Test with example data** first, then replace with your own documents\n\n### Phase 2: Running Your Streamlit App in Colab\n\nOnce you understand how RAG works from the notebook, transition to testing your actual `app.py` Streamlit application.\n\n#### Why Transition to app.py?\n\n- 📓 **Notebook (`llama_test.ipynb`)**: Learning tool, shows RAG step-by-step\n- 🚀 **Streamlit app (`app.py`)**: Production-ready chatbot with UI, what you'll deploy\n\n#### Step-by-Step: Running app.py in Colab\n\n1. **Create a new Colab notebook** (or add cells to your existing one):\n   - File → New notebook\n   - Or continue in your existing notebook\n\n2. **Install dependencies** (same as before):\n   ```python\n   !pip install -q streamlit==1.50.0 llama-index==0.14.4 llama-index-core==0.14.4 llama-index-llms-openai==0.6.4 llama-index-embeddings-openai==0.5.1 openai==1.109.1 python-dotenv==1.1.1 jedi==0.19.2\n   ```\n\n3. 
**Clone your repository** (if not already in Colab):\n   ```python\n   # Clone your forked repository\n   !git clone https://github.com/YOUR-USERNAME/YOUR-REPO-NAME.git\n   %cd YOUR-REPO-NAME\n   ```\n\n4. **Set up your API key as environment variable**:\n   ```python\n   import os\n   from google.colab import userdata\n   \n   # Set API key for the app to use\n   os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')\n   ```\n\n5. **Upload your documents** (if not already in the repo):\n   ```python\n   # Option A: Upload directly to Colab\n   from google.colab import files\n   uploaded = files.upload()\n   # Move uploaded files to data folder\n   !mkdir -p data\n   !mv *.pdf data/  # Adjust file extensions as needed\n   \n   # Option B: Mount Google Drive\n   from google.colab import drive\n   drive.mount('/content/drive')\n   !cp -r /content/drive/MyDrive/chatbot-data/* data/\n   ```\n\n6. **Install localtunnel to expose Streamlit**:\n   ```python\n   # Install localtunnel for public URL\n   !npm install -g localtunnel\n   ```\n\n7. **Run Streamlit in the background**:\n   ```python\n   # Run Streamlit app in background\n   !streamlit run app.py \u0026\u003e/content/logs.txt \u0026\n   \n   # Wait for Streamlit to start\n   import time\n   time.sleep(5)\n   \n   # Verify it's running\n   !curl http://localhost:8501\n   ```\n\n8. **Expose with localtunnel to get a public URL**:\n   ```python\n   # Get a public URL using localtunnel\n   !npx localtunnel --port 8501 \u0026\n   \n   # Wait a moment for the URL\n   import time\n   time.sleep(3)\n   \n   # The URL will appear in the output above\n   # Look for: \"your url is: https://xxxxx.loca.lt\"\n   ```\n\n9. **Access your chatbot**:\n   - Click the URL from localtunnel output (looks like `https://xxxxx.loca.lt`)\n   - Click \"Click to Continue\" on the localtunnel page\n   - Your Streamlit chatbot interface will appear! 🎉\n\n![chatbot ui demo](example/visuals/ui-demo.jpeg)\n\n10. 
**Test your chatbot**:\n    - Ask questions about your documents\n    - Verify responses are accurate\n    - Test different types of queries\n\n#### Important Notes for Running app.py in Colab:\n\n⚠️ **Limitations**:\n- Localtunnel URLs are temporary (expire when Colab disconnects)\n- Not suitable for permanent hosting\n- Great for testing and development only\n\n✅ **When to Use This**:\n- Testing your app with real documents before deploying\n- Showing your team the chatbot interface during development\n- Debugging issues before Hugging Face deployment\n\n🚀 **For Production**:\n- After testing in Colab, deploy to Hugging Face Spaces (permanent hosting)\n- Colab is for **development and testing**\n- Hugging Face is for **production and embedding**\n\n#### Workflow Summary:\n\n```\nStep 1: Learn RAG concepts\n└─→ Use llama_test.ipynb notebook\n\nStep 2: Test with your data\n└─→ Add your documents to data/\n└─→ Run notebook cells to verify indexing works\n\nStep 3: Test the Streamlit UI\n└─→ Run app.py in Colab with localtunnel\n└─→ Verify chatbot interface works correctly\n\nStep 4: Deploy to production\n└─→ Push to GitHub\n└─→ Deploy to Hugging Face Spaces\n└─→ Embed in your website\n\nStep 5: Publish website\n└─→ Enable GitHub Pages\n└─→ Share your live URL!\n```\n\n### Phase 3: When You're Ready for Production\n\nOnce you've tested everything in Colab and your chatbot works well:\n\n1. ✅ Make sure all your documents are in the `data/` folder\n2. ✅ Push your code to GitHub\n3. ✅ Deploy to Hugging Face Spaces (see next section)\n4. ✅ Embed the permanent Hugging Face URL in your website\n\n---\n\n## 💻 Option 2: Testing Locally on Your Computer\n\nIf you prefer to develop on your local machine, follow these steps.\n\n### Prerequisites\n\n- Python 3.9+ installed\n- Terminal/Command Prompt access\n- Text editor or IDE (VS Code recommended)\n\n### Setup Steps\n\n1. 
**Clone your repository**:\n   ```bash\n   git clone https://github.com/YOUR-USERNAME/YOUR-REPO-NAME.git\n   cd YOUR-REPO-NAME\n   ```\n\n2. **Create a virtual environment**:\n   \n   **On macOS/Linux**:\n   ```bash\n   python3 -m venv venv\n   source venv/bin/activate\n   ```\n   \n   **On Windows**:\n   ```bash\n   python -m venv venv\n   venv\\Scripts\\activate\n   ```\n\n3. **Install dependencies**:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n4. **Set up your API keys**:\n   ```bash\n   # Copy the template\n   cp .env.example .env\n\n   # Edit .env and add your keys\n   # OPENAI_API_KEY=your-provided-key-here\n   # LLAMA_CLOUD_API_KEY=llx-your-key-here (optional but recommended)\n   ```\n\n5. **Add your documents** to the `data/` folder:\n   ```bash\n   # Place your PDF, HTML, TXT files in data/\n   ls data/\n   ```\n\n6. **Run the Streamlit app**:\n   ```bash\n   streamlit run app.py\n   ```\n   \n   The app will open at `http://localhost:8501` 🎉\n\n### Testing Locally\n\n1. **First run**: The app will index your documents (takes 10-30 seconds)\n2. **Subsequent runs**: Loads from cached `storage/` folder (much faster)\n3. **To re-index**: Delete the `storage/` folder and restart\n\n### Local Development Tips\n\n✅ **Advantages**:\n- Faster iteration (no need to reinstall packages each time)\n- Persistent storage (index cache survives between sessions)\n- Code editing works offline once dependencies are installed (indexing and chat still need internet for the OpenAI API)\n- Better debugging experience\n\n⚠️ **Remember**:\n- Keep your virtual environment activated when working\n- Never commit your `.env` file to GitHub\n- Test thoroughly before deploying to Hugging Face\n\n### Workflow for Local Development\n\n```bash\n# 1. Activate environment\nsource venv/bin/activate  # or venv\\Scripts\\activate on Windows\n\n# 2. Make changes to your code or data/\n\n# 3. Test locally\nstreamlit run app.py\n\n# 4. When satisfied, push to GitHub\ngit add .\ngit commit -m \"Update chatbot\"\ngit push\n\n# 5. 
Deploy to Hugging Face (see next section)\n```\n\n---\n\n## 🎯 Which Option Should You Choose?\n\n| Factor | Google Colab | Local Development |\n|--------|--------------|-------------------|\n| **Setup Time** | ⚡ Instant | 🕐 10-15 minutes |\n| **No Installation** | ✅ Yes | ❌ Need Python |\n| **Persistent Environment** | ❌ Sessions expire | ✅ Always available |\n| **Speed** | 🐌 Slower | ⚡ Faster |\n| **Best For** | Beginners, quick tests | Serious development |\n| **Internet Required** | ✅ Always | ⚠️ For OpenAI API calls and deployment |\n\n**Recommendation**: Start with Google Colab to learn, then switch to local development if you want a better experience!\n\n### Understanding the Workflow\n\n```\n📝 Colab Notebook → 🧪 Test RAG Logic → 🚀 Deploy to Hugging Face → 🌐 Embed in Website\n```\n\n- **Colab**: Development and testing environment\n- **Hugging Face**: Production hosting for your Streamlit app\n- **Website**: User-facing integration\n\n---\n\n## 🧪 Testing with the Example\n\n### Option 1: Use the Example Notebook in Colab\n\n1. Open `examples/llama_test.ipynb` in Google Colab\n2. Run all cells to see the chatbot in action\n3. Ask questions like:\n   - \"When did Taylor Swift become a superstar?\"\n   - \"What are the amendments in the Constitution?\"\n\n### Option 2: Copy Example Data for Testing\n\nIf you want to test with the example documents:\n\n1. Copy the example data to your Google Drive\n2. Or download from GitHub and upload to Colab\n3. Point your code to the example data folder\n\n---\n\n## 🌐 Deploying to Hugging Face\n\n### Why Deploy?\n- ✨ Makes your chatbot publicly accessible\n- 🆓 Free hosting for public projects\n- 🔗 Easy to share with your team and embed in websites\n- 🎨 Professional Streamlit interface\n\n### Deployment Steps\n\n1. **Create a new Space** at [huggingface.co/new-space](https://huggingface.co/new-space)\n   - Name: `your-company-chatbot`\n   - License: Apache 2.0\n   - SDK: **Streamlit** ⚠️ Important!\n   - Hardware: CPU Basic (free)\n\n2. 
**Upload your files** from your GitHub repository:\n   - `app.py` ✅\n   - `requirements.txt` ✅\n   - `data/` folder with YOUR documents ✅\n   - `storage/` folder (optional - speeds up first load) ⚠️\n\n3. **Add your API keys as Secrets**:\n   - Go to Space Settings → Repository secrets\n   - Add secret: `OPENAI_API_KEY` = `your-key-here` (required)\n   - Add secret: `LLAMA_CLOUD_API_KEY` = `llx-your-key-here` (optional but recommended for better document parsing)\n\n4. **Wait for build** (2-3 minutes)\n   - Check the \"Logs\" tab for any errors\n   - Look for: \"✅ Index loaded\" or \"✅ Index created\"\n   - Once running, your chatbot is live! 🎉\n\nYour chatbot URL will be: `https://huggingface.co/spaces/YOUR-USERNAME/your-company-chatbot`\n\n### 💡 Pro Tips for Hugging Face Deployment\n- Upload the `storage/` folder to skip indexing on first load (faster startup)\n- Test thoroughly in Colab or locally before deploying\n- Use descriptive Space names (e.g., `acme-support-bot` not `test123`)\n- The chatbot uses `gpt-5-nano-2025-08-07` for responses and `text-embedding-3-small` for indexing (configured in app.py)\n\n---\n\n## 🌍 Embedding in Your Website\n\nOnce deployed to Hugging Face, you can embed your chatbot in your company website HTML page.\n\n### Option 1: Floating Chat Widget (Recommended)\n\n**See it in action**: Check out `visuals/embeddedui-demo.html` for a working example!\n\nAdd this code before the closing `\u003c/body\u003e` tag of your `index.html`:\n\n```html\n\u003c!-- Chatbot Widget Styles --\u003e\n\u003cstyle\u003e\n  .chat-widget-container {\n    position: fixed;\n    bottom: 20px;\n    right: 20px;\n    z-index: 9999;\n    width: 400px;\n    height: 600px;\n    border-radius: 12px;\n    box-shadow: 0 8px 32px rgba(0, 0, 0, 0.2);\n    overflow: hidden;\n    display: none;\n    background: white;\n  }\n\n  .chat-widget-container.open {\n    display: block;\n    animation: slideUp 0.3s ease;\n  }\n\n  @keyframes slideUp {\n    from {\n      opacity: 
0;\n      transform: translateY(20px);\n    }\n    to {\n      opacity: 1;\n      transform: translateY(0);\n    }\n  }\n\n  .chat-widget-button {\n    position: fixed;\n    bottom: 20px;\n    right: 20px;\n    z-index: 10000;\n    width: 60px;\n    height: 60px;\n    border-radius: 50%;\n    background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);\n    border: none;\n    color: white;\n    font-size: 24px;\n    cursor: pointer;\n    box-shadow: 0 4px 15px rgba(0, 0, 0, 0.3);\n    transition: all 0.3s ease;\n  }\n\n  .chat-widget-button:hover {\n    transform: scale(1.1);\n    box-shadow: 0 6px 20px rgba(0, 0, 0, 0.4);\n  }\n\n  @media (max-width: 768px) {\n    .chat-widget-container {\n      width: calc(100vw - 40px);\n      height: calc(100vh - 140px);\n      bottom: 10px;\n      right: 10px;\n    }\n  }\n\u003c/style\u003e\n\n\u003c!-- Chatbot Toggle Button --\u003e\n\u003cbutton class=\"chat-widget-button\" onclick=\"toggleChat()\" aria-label=\"Open chatbot\"\u003e💬\u003c/button\u003e\n\n\u003c!-- Chatbot Container --\u003e\n\u003cdiv class=\"chat-widget-container\" id=\"chatWidget\"\u003e\n  \u003ciframe \n    src=\"https://huggingface.co/spaces/YOUR-USERNAME/your-company-chatbot\"\n    width=\"100%\" \n    height=\"100%\" \n    frameborder=\"0\"\n    title=\"Company Chatbot\"\u003e\n  \u003c/iframe\u003e\n\u003c/div\u003e\n\n\u003c!-- Toggle Script --\u003e\n\u003cscript\u003e\n  function toggleChat() {\n    const widget = document.getElementById('chatWidget');\n    const button = document.querySelector('.chat-widget-button');\n    \n    if (widget.classList.contains('open')) {\n      widget.classList.remove('open');\n      button.textContent = '💬';\n      button.setAttribute('aria-label', 'Open chatbot');\n    } else {\n      widget.classList.add('open');\n      button.textContent = '✕';\n      button.setAttribute('aria-label', 'Close chatbot');\n    }\n  }\n\u003c/script\u003e\n```\n\n### Option 2: Full-Page Embed\n\n```html\n\u003ciframe \n  
src=\"https://YOUR-USERNAME-your-company-chatbot.hf.space\"\n  width=\"100%\" \n  height=\"600px\" \n  frameborder=\"0\"\n  title=\"Company Chatbot\"\u003e\n\u003c/iframe\u003e\n```\n\n**⚠️ Important**: In the iframe `src` (for either option), use your Space's direct URL, of the form `https://YOUR-USERNAME-your-company-chatbot.hf.space` with your own username and Space name. You can copy it from the Space's \"Embed this Space\" menu. The `huggingface.co/spaces/...` page URL generally cannot be displayed inside an iframe.\n\n### Customization\n- Change colors by editing the CSS `background` gradients\n- Adjust size with `width` and `height` properties\n- Move position with `bottom` and `right` values\n- Customize the button emoji (💬, 🤖, 💡, etc.)\n\n---\n\n## 🚀 Publishing Your Website to GitHub Pages\n\nOnce you have your chatbot embedded, publish your complete website live on GitHub Pages!\n\n### Step 1: Prepare Your Repository\n\nMake sure your repository has:\n- ✅ `index.html` (your main website page with embedded chatbot)\n- ✅ `style.css` (your website styles)\n- ✅ `app.py` (your chatbot code)\n- ✅ `data/` folder (your company documents)\n- ✅ `requirements.txt`\n- ✅ `README.md`\n\n### Step 2: Push Everything to GitHub\n\n```bash\n# Add all files\ngit add .\n\n# Commit with a descriptive message\ngit commit -m \"Add company website with AI chatbot\"\n\n# Push to your repository\ngit push origin main\n```\n\n### Step 3: Enable GitHub Pages\n\n1. Go to your repository on GitHub\n2. Click **Settings** → **Pages** (in the left sidebar)\n3. Under \"Source\", select:\n   - Branch: `main`\n   - Folder: `/ (root)`\n4. Click **Save**\n5. 
Wait 1-2 minutes for deployment\n\n### Step 4: Access Your Live Website\n\nYour website will be live at:\n```\nhttps://YOUR-USERNAME.github.io/YOUR-REPO-NAME/\n```\n\n🎉 **Your chatbot is now embedded in a live website!**\n\n### What Gets Published\n\n- ✅ Your `index.html` website\n- ✅ All CSS, JavaScript, and assets\n- ✅ The embedded Hugging Face chatbot iframe\n- ⚠️ Backend files (`app.py`, `data/`) are never executed by GitHub Pages, but when you publish from `/ (root)` they can still be downloaded as static files, so keep sensitive documents out of the repository\n- ℹ️ The chatbot itself runs on Hugging Face, not GitHub Pages\n\n### Updating Your Live Site\n\nEvery time you push to GitHub, your site automatically updates:\n\n```bash\n# Make changes to your HTML/CSS\ngit add index.html style.css\ngit commit -m \"Update website design\"\ngit push origin main\n# Site updates in 1-2 minutes!\n```\n\n### Pro Tips\n- Test your website locally by opening `index.html` in a browser before pushing\n- Make sure your Hugging Face Space URL in the iframe is correct\n- Use relative paths for CSS/JS files (e.g., `./style.css` not `/style.css`)\n- Add a custom domain in GitHub Pages settings if you have one!\n\n---\n\n## 🔧 Troubleshooting\n\n### Google Colab Issues\n\n#### \"OPENAI_API_KEY not found\"\n- ✅ **Colab**: Make sure you added the secret (🔑 icon) and toggled \"Notebook access\" to ON\n- ✅ **Local**: Check that your `.env` file exists and contains the instructor-provided key\n- ✅ **Hugging Face**: Verify the secret is set in Settings → Repository secrets\n- ✅ Make sure the key is exactly as provided by your instructor (no extra spaces)\n\n#### \"Runtime disconnected\"\n- ✅ Colab Pro (FREE for students!) has longer runtimes than the free tier\n- ✅ Save your work frequently to GitHub or Google Drive\n- ✅ Consider running critical tasks in shorter sessions\n\n#### \"Module not found\" error\n- ✅ Run the install cells at the start of your notebook\n- ✅ Use `!pip install` (with !) 
in Colab, not regular `pip install`\n- ✅ Make sure you ran the **entire** installation cell and waited for it to complete\n- ✅ If issues persist, restart runtime (Runtime → Restart runtime) and run install cell again\n\n### Chatbot Issues\n\n#### \"Please add documents to the 'data' directory\"\n- ✅ Make sure you uploaded files to the data folder\n- ✅ Check that files are in supported formats (PDF, HTML, TXT, etc.)\n- 💡 Try the example: upload files from `examples/data/`\n\n#### Chatbot gives wrong answers\n- ✅ Make sure your documents contain the relevant information\n- ✅ Try rephrasing your question more specifically\n- ✅ Check if the document text is readable (not corrupted or image-only PDFs)\n- 💡 Test with the example first to verify it's working\n\n#### Slow response times in Colab\n- ⏱️ First query after starting is always slower (building index)\n- ⚡ Subsequent queries should be faster (using cached index)\n- 🚀 Get Colab Pro for FREE with your .edu email for better performance\n\n### Hugging Face Deployment Issues\n\n#### Space won't start\n- ✅ Check the \"Logs\" tab for error messages\n- ✅ Verify `OPENAI_API_KEY` is set in Repository secrets\n- ✅ Make sure you selected \"Streamlit\" as the SDK\n- ✅ Confirm you uploaded `requirements.txt` and `app.py`\n\n#### Embedded iframe not showing chatbot\n- ✅ Make sure your Hugging Face Space is running (check the Space URL directly)\n- ✅ Try hard refresh: Ctrl+Shift+R (Windows) or Cmd+Shift+R (Mac)\n- ✅ Check browser console for errors (F12 → Console tab)\n- ✅ Verify the iframe src URL is correct\n\n### GitHub Pages Issues\n\n#### Website not loading\n- ✅ Make sure GitHub Pages is enabled in Settings → Pages\n- ✅ Wait 1-2 minutes after enabling for initial deployment\n- ✅ Check that branch is set to `main` and folder is `/ (root)`\n\n#### Chatbot iframe not appearing on live site\n- ✅ Verify your Hugging Face Space URL is correct in the iframe src\n- ✅ Check browser console for CORS or iframe errors\n- ✅ Test the 
Hugging Face Space URL directly in a browser first\n\n#### CSS/JavaScript not loading\n- ✅ Use relative paths: `./style.css` not `/style.css`\n- ✅ Check file names match exactly (case-sensitive on GitHub Pages)\n- ✅ Clear browser cache and hard refresh\n\n---\n\n## 📚 Additional Resources\n\n- [Google Colab Documentation](https://colab.research.google.com/notebooks/welcome.ipynb)\n- [LlamaIndex Documentation](https://docs.llamaindex.ai/)\n- [Streamlit Documentation](https://docs.streamlit.io/)\n- [OpenAI API Documentation](https://platform.openai.com/docs)\n- [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)\n\n---\n\n## 📝 Project Integration (ENGCMP 0600)\n\nThis chatbot template is designed for your company project:\n\n| Project Step | What to Do | Where |\n|--------------|-----------|-------|\n| **Steps 1-5** | Plan your company, identify documents needed | Team planning |\n| **Step 6** | Research and gather company documents | `data/` folder |\n| **Steps 7-9** | Test and refine your chatbot | Google Colab |\n| **Step 8** | Deploy chatbot to production | Hugging Face Spaces |\n| **Step 8** | Build website and embed chatbot | HTML/CSS with iframe |\n| **Step 9** | Push repository and publish website | GitHub → GitHub Pages |\n| **Step 10** | Present your live website with chatbot | Final demo |\n\n### Deliverables Checklist\n- ✅ Working chatbot with your company's documents\n- ✅ Chatbot deployed to Hugging Face Spaces\n- ✅ Company website with embedded chatbot\n- ✅ **Website live on GitHub Pages**\n- ✅ **Complete repository pushed to GitHub**\n- ✅ Documentation (README, etc.)\n- ✅ Google Colab notebook showing development process\n\n---\n\n## 🎓 Recommended Workflow\n\n```\n1. 📘 Learn RAG Concepts\n   └─→ Open examples/llama_test.ipynb in Google Colab\n   └─→ Understand how document indexing and retrieval works\n\n2. 
📝 Plan Your Company\n   └─→ Identify what documents your chatbot needs\n   └─→ Gather company information (products, policies, FAQs)\n\n3. 📄 Prepare Documents\n   └─→ Collect and organize documents in supported formats\n   └─→ Add to data/ folder\n\n4. 🧪 Choose Development Environment\n   └─→ Option A: Google Colab (browser-based, beginner-friendly)\n   └─→ Option B: Local development (faster, more control)\n\n5. 🔧 Test Your Chatbot\n   └─→ Google Colab: Use localtunnel for temporary testing\n   └─→ Local: Run streamlit run app.py for instant feedback\n   └─→ Verify answers are accurate and relevant\n\n6. 🚀 Deploy to Production\n   └─→ Push code to GitHub repository\n   └─→ Deploy to Hugging Face Spaces (permanent hosting)\n   └─→ Get your permanent chatbot URL\n\n7. 🌐 Build Company Website\n   └─→ Create index.html with company branding\n   └─→ Embed Hugging Face chatbot using iframe code\n   └─→ Style with CSS\n\n8. 📤 Publish Website\n   └─→ Push website files to GitHub\n   └─→ Enable GitHub Pages in repository settings\n   └─→ Get your live website URL\n\n9. ✅ Verify Everything Works\n   └─→ Test chatbot on live website\n   └─→ Ask various questions to ensure accuracy\n   └─→ Check responsive design on mobile\n\n10. 🎤 Present Your Project\n    └─→ Demo your live website with working AI chatbot\n    └─→ Explain your company and how the bot helps customers\n    └─→ Share both GitHub and live website URLs\n```\n\n---\n\n## 🤝 Support\n\nIf you run into issues:\n1. ✅ Check the [Troubleshooting](#-troubleshooting) section above\n2. 🧪 Try running `examples/llama_test.ipynb` to verify your setup\n3. 📋 Review your code against this README\n4. 🔍 Check Hugging Face Space logs for error messages\n5. 
💬 Ask your instructor or TA for help\n\n---\n\n## 🎓 Learning Resources\n\n- **`examples/llama_test.ipynb`** - Jupyter notebook explaining RAG concepts (start here!)\n- **`examples/README.md`** - How the example chatbot works\n- **`data/README.md`** - Tips for adding documents\n- **`visuals/`** - UI demos and screenshots for reference\n\n---\n\n**Good luck building your AI-powered chatbot! 🚀**\n\n\u003e Remember: Develop in Google Colab, deploy to Hugging Face, embed in your website, publish on GitHub Pages!","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvitalune%2Fpitt-llamaproject","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvitalune%2Fpitt-llamaproject","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvitalune%2Fpitt-llamaproject/lists"}