Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/arsentievalex/newspulse-databricks-hackathon
NewsPulse is an AI-powered news analytics app for investors
databricks dbrx duckduckgo-search langchain llm openai rag streamlit vector-database yahooquery
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/arsentievalex/newspulse-databricks-hackathon
- Owner: arsentievalex
- Created: 2024-05-06T16:05:34.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-08-27T12:58:02.000Z (5 months ago)
- Last Synced: 2024-08-27T14:20:32.419Z (5 months ago)
- Topics: databricks, dbrx, duckduckgo-search, langchain, llm, openai, rag, streamlit, vector-database, yahooquery
- Language: Jupyter Notebook
- Homepage: https://newspulseai.streamlit.app/
- Size: 10.8 MB
- Stars: 9
- Watchers: 1
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
[![Open in Streamlit](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://newspulseai.streamlit.app/)
# NewsPulse AI: Databricks Generative AI Hackathon [1st place winner in Financial Services]
## What It Does
This application is specifically designed to monitor and analyze the sentiment of the latest news articles regarding significant business events, such as layoffs, mergers and acquisitions, reorganizations, and disputes. These events can profoundly affect stock performance, making it vital for investors to stay informed.
### Key Features
- **Sentiment Analysis:** Analyze sentiment by day and topic, with aggregated results.
- **Stock Price vs Sentiment:** A time series analysis to study the impact of news sentiment on stock performance.
- **Chatbot:** Provides Q&A capabilities using a vector search index and sourced information.
### Data Acquisition Process
- **News Articles:** Uses the DuckDuckGo API to fetch recent news articles about selected companies.
- **Content Scraping:** Uses ScrapeGraphAI and GPT-3.5 Turbo to extract content from URLs.
- **Sentiment Extraction:** Applies DBRX Instruct and LangChain to determine sentiment from articles.
- **RAG System:** Articles are chunked, embedded using DBRX, and loaded into a Databricks vector store.
- **Stock Data:** Uses YahooQuery to gather historical stock price data from Yahoo Finance.

Automated Databricks jobs are intended to run daily or several times a day, continuously updating the database and vector store with new articles.
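
The chunking step in the RAG bullet above can be illustrated with a minimal sketch. This README does not say how articles are split, so the fixed-size character windows with overlap shown here are an assumption; the project may well rely on a LangChain text splitter instead. The helper name `chunk_text` and its default sizes are hypothetical and chosen only for illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows.

    Hypothetical helper: the window size and overlap are illustrative
    defaults, not values taken from the NewsPulse code.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each returned chunk would then be passed to an embedding model and written to the vector store. The overlap keeps a sentence that straddles a window boundary retrievable from at least one chunk, at the cost of some duplicated text in the index.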
## Tech Stack
- [Databricks](https://www.databricks.com/) - Data Processing, Storage, Vector Database
- [Streamlit](https://streamlit.io/) - Frontend
- [OpenAI](https://www.openai.com/) - LLM
- [DBRX](https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm) - LLM
- [LangChain](https://js.langchain.com/docs/) - LLM wrapper
- [DuckDuckGo](https://rapidapi.com/epctex-epctex-default/api/duckduckgo10/) - News API
- [ScrapeGraphAI](https://github.com/VinciGit00/Scrapegraph-ai/tree/main) - Web Scraping
- [Yahooquery](https://yahooquery.dpguthrie.com/) - Yahoo Finance API
- [Embedchain](https://embedchain.ai/) - RAG (used for demo as alternative to Databricks endpoint)