{"id":33301321,"url":"https://github.com/mdarkanurl/startups-from-ai","last_synced_at":"2026-04-08T23:34:03.772Z","repository":{"id":319092158,"uuid":"1073417217","full_name":"mdarkanurl/startups-from-ai","owner":"mdarkanurl","description":"This AI bot goes online, gathers information about AI startups, and posts updates about them on X and Dev.to.","archived":false,"fork":false,"pushed_at":"2026-02-05T10:03:34.000Z","size":275,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-05T21:48:31.846Z","etag":null,"topics":["ai","ai-agent","ai-bot","backend","bot","mongodb","nodejs","playwright","postgresql","puppeteer","typescript","webscraper","webscraping"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mdarkanurl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"contributing.md","funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-10T04:53:22.000Z","updated_at":"2026-02-05T10:03:39.000Z","dependencies_parsed_at":"2026-02-05T12:15:49.297Z","dependency_job_id":null,"html_url":"https://github.com/mdarkanurl/startups-from-ai","commit_stats":null,"previous_names":["mdarkanurl/startup.bot","mdarkanurl/startups-from-ai"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mdarkanurl/startups-from-ai","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdarkanurl%2Fstartups-from-ai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdarkanurl%2Fstartups-from-ai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdarkanurl%2Fstartups-from-ai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdarkanurl%2Fstartups-from-ai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mdarkanurl","download_url":"https://codeload.github.com/mdarkanurl/startups-from-ai/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdarkanurl%2Fstartups-from-ai/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31532165,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T16:28:08.000Z","status":"ssl_error","status_checked_at":"2026-04-07T16:28:06.951Z","response_time":105,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-agent","ai-bot","backend","bot","mongodb","nodejs","playwright","postgresql","puppeteer","typescript","webscraper","webscraping"],"created_at":"2025-11-18T11:00:53.532Z","updated_at":"2026-04-07T22:31:17.048Z","avatar_url":"https://github.com/mdarkanurl.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Startups From AI\n\nAn automated AI-powered system that discovers, analyzes, and creates content about startups from various online sources. The application continuously crawls the web for startup information, generates AI-powered summaries, and automatically posts engaging content to social media platforms.\n\n## Features\n\n- **Automated Web Crawling** - Continuously discovers startup information from various online sources\n- **AI-Powered Analysis** - Uses Google Gemini AI to generate intelligent summaries and insights\n- **Multi-Platform Content Generation** - Automatically creates tweets and blog posts about startups\n- **Social Media Integration** - Posts generated content to Twitter and Dev.to\n- **Data Aggregation** - Collects startup data from Product Hunt, websites, and other sources\n- **Scheduled Operations** - Runs automated workflows with different intervals for various tasks\n- **Structured Logging** - Comprehensive logging with Winston and Better Stack integration\n\n## Tech Stack\n\n- **Runtime**: Node.js with TypeScript\n- **Database**: PostgreSQL with Drizzle ORM + MongoDB for additional storage\n- **AI Integration**: Google Gemini API\n- **Web Crawling**: Crawlee with Playwright\n- **Social APIs**: Twitter API v2, Dev.to API, Product Hunt API\n- **Logging**: Winston with daily rotation and Better Stack integration\n- **Task Scheduling**: Custom timing system with configurable delays\n\n## Project Structure\n\n```\nsrc/\n├── modules/\n│   ├── ai/                     # AI-powered content generation\n│   │   ├── startups/           # Startup analysis and summarization\n│   │   ├── tweet/              # Tweet generation and posting\n│   │   └── blog/               # Blog generation and posting\n│   └── fetch-data-from-online/ # Data collection modules\n│       ├── product-hunt/       # Product Hunt integration\n│       └── website-crawler/    # Web scraping functionality\n├── db/                         # Database configurations\n├── utils/                      # Shared utilities and helpers\n├── connection.ts               # Database connection setup\n├── winston.ts                  # Logging configuration\n└── index.ts                    # Application entry point\n```\n\n## Data Models\n\n### Startup\n```typescript\ninterface Startup {\n  id: string;\n  name?: string;\n  VC_firm: string;\n  website: string;\n  founder_names: string[];\n  foundedAt?: string;\n}\n```\n\n### AI Generated Summary\n```typescript\ninterface AIGeneratedSummary {\n  id: string;\n  summary: string[];\n  startupId: string;\n  isUsedForTweets: boolean;\n  isUsedForBlogs: boolean;\n}\n```\n\n### Tweet\n```typescript\ninterface Tweet {\n  id: string;\n  startupId: string;\n  tweet: string;\n  isUsed: boolean;\n}\n```\n\n### Blog\n```typescript\ninterface Blog {\n  id: string;\n  startupId: string;\n  title: string;\n  blog: string;\n  isUsed: boolean;\n}\n```\n\n### Web Page Data\n```typescript\ninterface WebPageData {\n  id: string;\n  url: string;\n  title: string;\n  description: string;\n  text: string;\n  isUsed: boolean;\n  startupId: string;\n}\n```\n\n## Setup Instructions\n\n### Prerequisites\n- Node.js (v18 or higher)\n- pnpm\n- PostgreSQL\n- MongoDB\n\n### Installation\n\n1. **Install dependencies**\n   ```bash\n   pnpm install --frozen-lockfile\n   ```\n\n2. **Set up environment variables**\n   ```bash\n   cp .env.example .env\n   ```\n   Edit `.env` with your configuration:\n   - `GEMINI_API_KEY`: Your Google Gemini API key\n   - `MONGODB_CONNECT_URL`: MongoDB connection string\n   - `DATABASE_URL`: PostgreSQL connection string\n   - `X_*`: Twitter API credentials\n   - `DEVTO_API_KEY`: Dev.to API key\n   - `BEARER_TOKEN`: Product Hunt API token\n   - `BETTER_STACK_*`: Better Stack logging configuration\n\n3. **Run database migrations**\n   ```bash\n   pnpm db:migrate\n   ```\n\n4. **Start the application**\n   ```bash\n   # Development\n   pnpm run dev\n\n   # Production\n   pnpm run build\n   pnpm run start\n   ```\n\n## How It Works\n\n### Main Workflow Loop\n\nThe application runs in a continuous loop with the following schedule:\n\n1. **Every Loop Iteration**:\n   - Crawl websites for startup data\n   - Generate AI summaries of startups\n   - Generate tweet content\n   - Generate blog content\n\n2. **Every Hour**:\n   - Post generated tweets to Twitter\n\n3. **Every Day**:\n   - Post generated blogs to Dev.to\n   - Fetch fresh data from Product Hunt\n\n### Data Flow\n\n1. **Data Collection**: \n   - Product Hunt API for trending startups\n   - Web crawler for detailed startup information\n   - Website content extraction and analysis\n\n2. **AI Processing**:\n   - Google Gemini analyzes collected data\n   - Generates comprehensive summaries\n   - Creates engaging social media content\n\n3. **Content Distribution**:\n   - Automated posting to Twitter\n   - Blog publication on Dev.to\n   - Tracking of used content to prevent duplicates\n\n## API Integrations\n\n### Product Hunt\n- Fetches daily and trending startup data\n- Requires API bearer token for authentication\n\n### Twitter/X\n- Posts generated tweets automatically\n- Uses OAuth 1.0a authentication\n- Supports media attachments and threading\n\n### Dev.to\n- Publishes comprehensive blog posts\n- API key authentication\n- Markdown formatting support\n\n### Google Gemini\n- Powers content generation and analysis\n- Provides intelligent summaries\n- Creates engaging social media copy\n\n## Configuration\n\n### Environment Variables\n\n| Variable | Description | Required |\n|----------|-------------|----------|\n| `GEMINI_API_KEY` | Google Gemini API key | Yes |\n| `MONGODB_CONNECT_URL` | MongoDB connection string | Yes |\n| `DATABASE_URL` | PostgreSQL connection string | Yes |\n| `X_APP_KEY` | Twitter app key | Yes |\n| `X_APP_SECRET` | Twitter app secret | Yes |\n| `X_ACCESS_TOKEN` | Twitter access token | Yes |\n| `X_ACCESS_TOKEN_SECRET` | Twitter access token secret | Yes |\n| `DEVTO_API_KEY` | Dev.to API key | Yes |\n| `BEARER_TOKEN` | Product Hunt bearer token | Yes |\n| `HEADLESS` | Run browser in headless mode | No |\n| `MAX_REQUESTS` | Maximum requests per crawling session | No |\n| `DELAY_MS` | Delay between API requests | No |\n\n### Logging\n\nThe application uses Winston for structured logging with:\n- Daily log rotation\n- Better Stack integration for centralized monitoring\n- Different log levels for various components\n- Child loggers for better traceability\n\n## Development\n\n### Database Management\n\n```bash\n# Generate new migrations\npnpm db:generate\n\n# Run migrations\npnpm db:migrate\n\n# Open database studio\npnpm studio\n```\n\n### Monitoring\n\n- Check logs in the console or log files\n- Monitor Better Stack dashboard for centralized logging\n- Track API usage and rate limits\n- Monitor database performance and connections\n\n## Contributing\n\nContributions, issues, and feature requests are welcome! Please follow the guidelines outlined in the [contributing.md](contributing.md) file.\n\n## License\n\nMIT License\n\n## Support\n\nFor questions or support, please open an issue on the GitHub repository.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmdarkanurl%2Fstartups-from-ai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmdarkanurl%2Fstartups-from-ai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmdarkanurl%2Fstartups-from-ai/lists"}