https://github.com/david-mwas/vidmindai

VIDMIND is a system designed to automatically summarize, analyze, and extract key information from YouTube video content. By leveraging text embeddings and natural language processing techniques, VIDMIND aims to provide users with concise summaries and key insights, reducing the need for manual video viewing and note-taking.

Host: GitHub
URL: https://github.com/david-mwas/vidmindai
Owner: David-mwas
Created: 2024-03-01T13:48:21.000Z (11 months ago)
Default Branch: main
Last Pushed: 2024-11-18T16:31:14.000Z (2 months ago)
Last Synced: 2024-11-18T17:53:19.914Z (2 months ago)
Topics: embeddings, firebase-auth, mongodb, mongoose, nlp, nodejs, openai, reactjs, tailwindcss, vector, vite, youtube-api
Language: JavaScript
Homepage: https://vidmind.vercel.app
Size: 5.73 MB
Stars: 4
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Security: SECURITY.md

README

# VIDMIND

## Your video AI companion

![/src/assets/image.png](src/assets/image.png)

## Table of Contents

- [Introduction](#introduction)
- [Problem Statement](#problem-statement)
- [Solution Overview](#solution-overview)
- [Aims and Objectives](#aims-and-objectives)
- [Methodology](#methodology)
- [System Requirements](#system-requirements)
- [System Benefits](#system-benefits)
- [References](#references)

## Introduction

VIDMIND was developed to address information overload in video content, particularly on platforms like YouTube. By automating video comprehension and summarization, it helps users extract key information from videos quickly instead of watching them in full.

## Problem Statement

The abundance of video content on platforms like YouTube makes it difficult for users to efficiently extract key information and insights. Manual video viewing and processing are time-consuming and often inefficient. VIDMIND addresses this challenge by automating video summarization and analysis.

## Solution Overview

VIDMIND extracts video transcripts through YouTube's API, generates text embeddings with the OpenAI API or the Gemini API, stores those embeddings in a vector database (such as Astra DB), and applies natural language processing techniques to produce concise summaries. The system then presents the summaries and key insights to users through an intuitive interface.
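
The storage and lookup step of this pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not VIDMIND's actual implementation: `chunkTranscript`, `cosineSimilarity`, and `InMemoryVectorStore` are hypothetical names, the in-memory store only stands in for a vector database such as Astra DB, and the hand-written three-dimensional vectors stand in for embeddings that would come from the OpenAI API or the Gemini API.

```javascript
// Split a transcript into chunks of at most `maxWords` words so that each
// chunk can be embedded separately. (Hypothetical helper.)
function chunkTranscript(transcript, maxWords = 200) {
  const words = transcript.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// In-memory stand-in for the vector database (assumption: not Astra DB's API).
class InMemoryVectorStore {
  constructor() {
    this.records = [];
  }

  // Store a text chunk together with its embedding vector.
  add(text, vector) {
    this.records.push({ text, vector });
  }

  // Return the k stored chunks most similar to queryVector, best match first.
  search(queryVector, k = 3) {
    return this.records
      .map((r) => ({ text: r.text, score: cosineSimilarity(queryVector, r.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k);
  }
}

// Toy usage: hand-written vectors replace real embeddings.
const store = new InMemoryVectorStore();
store.add("intro to vector databases", [1, 0, 0]);
store.add("cooking pasta at home", [0, 1, 0]);
store.add("how embeddings are stored", [0.9, 0.1, 0]);

const hits = store.search([1, 0, 0], 2);
console.log(hits.map((h) => h.text));
```

In a real deployment, each transcript chunk would be embedded with the chosen API, and the query vector would be the embedding of a user question or summary prompt. The retrieved chunks could then be handed to the summarization step.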

## Aims and Objectives

### Aim

To automate the understanding of YouTube video content, providing users with concise summaries, key insights, and extracted key information.

### Objectives

1. Analyze the performance of the OpenAI API and the Gemini API for embedding video transcripts.
2. Design a system architecture for embedding generation, storage, and analysis.
3. Develop a user-friendly interface for interacting with the system.
4. Evaluate the accuracy and effectiveness of generated summaries.

## Methodology

VIDMIND employs a prototyping approach, iteratively refining the system based on user feedback to ensure that the final product aligns with user needs and expectations.

## System Requirements

### Technical Requirements

- Node.js (Backend)
- EJS (templating engine), React (Frontend)
- Astra DB or Redis (vector database)
- OpenAI API or Gemini API
- YouTube Data API (Transcript API)
- Additional libraries for Natural Language Processing (voice and speech recognition)

### Functional Requirements

- Extract video transcripts from YouTube URLs.
- Generate text embeddings from transcripts.
- Store embeddings in, and retrieve them from, a vector database (Astra DB).
- Generate summaries of video content.
- Present summaries and key insights in a user-friendly interface.
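
The first requirement above starts from a YouTube URL, so the video ID has to be pulled out before any transcript can be requested. The sketch below shows one way to do that with the standard `URL` class. The function name `extractVideoId` is hypothetical, and the code assumes the commonly seen 11-character ID format and only the `watch`, `youtu.be`, `embed`, and `shorts` URL shapes; it is not taken from the VIDMIND codebase.

```javascript
// Extract the video ID from common YouTube URL shapes.
// Returns null when the input is not a recognized YouTube URL.
// Assumption: IDs are 11 characters from the set A-Z, a-z, 0-9, "_" and "-".
function extractVideoId(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a parseable absolute URL
  }

  // Normalize "www." and "m." prefixes so the host checks below stay simple.
  const host = url.hostname.replace(/^www\.|^m\./, "");
  let id = null;

  if (host === "youtu.be") {
    // Short links: https://youtu.be/<id>
    id = url.pathname.split("/")[1] || null;
  } else if (host === "youtube.com") {
    if (url.pathname === "/watch") {
      // Standard links: https://www.youtube.com/watch?v=<id>
      id = url.searchParams.get("v");
    } else {
      // Embedded and Shorts links: /embed/<id> or /shorts/<id>
      const match = url.pathname.match(/^\/(?:embed|shorts)\/([^/]+)/);
      id = match ? match[1] : null;
    }
  }

  return id && /^[\w-]{11}$/.test(id) ? id : null;
}

// Example with a placeholder ID (not a real video).
console.log(extractVideoId("https://www.youtube.com/watch?v=abcDEF12_-x"));
```

Returning `null` instead of throwing lets the caller show a validation message in the interface when a user pastes something that is not a YouTube link.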

### Non-Functional Requirements

- User-friendly interface
- High performance and low response time
- Secure storage of data
- Reliability and accessibility

## System Benefits

- Rapid comprehension of video content without manual viewing
- Time savings for users seeking key information
- Improved decision-making based on extracted insights

## References

- [OpenAI API documentation](https://openai.com/)
- [Gemini API documentation](https://deepmind.google/technologies/gemini/#bard/)
- [Astra DB documentation](https://astra.datastax.com/)
- [YouTube Data API documentation](https://console.cloud.google.com/apis/api/youtube.googleapis.com/)
- Research papers on text embeddings and video summarization