https://github.com/orengrinker/jstextfromimage
Get descriptions of images using OpenAI's GPT-4 Vision models, Azure OpenAI, and Anthropic Claude in an easy way.
- Host: GitHub
- URL: https://github.com/orengrinker/jstextfromimage
- Owner: OrenGrinker
- Created: 2024-11-21T17:39:49.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-11-27T17:29:27.000Z (6 months ago)
- Last Synced: 2025-03-25T19:41:34.079Z (2 months ago)
- Topics: anthropic-claude, azure-openai, azure-openai-api, claude-3-5-sonnet, claude-api, image-processing, npm, npm-package, openai-api
- Language: TypeScript
- Homepage:
- Size: 347 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# JSTextFromImage




A powerful TypeScript/JavaScript library for obtaining detailed descriptions of images using various AI models, including OpenAI's GPT-4 Vision, Azure OpenAI, and Anthropic Claude. Supports both image URLs and local files, with batch processing capabilities.
## 🌟 Key Features
- 🤖 **Multiple AI Providers**: Support for OpenAI, Azure OpenAI, and Anthropic Claude
- 🌐 **URL Support**: Process images from URLs
- 📦 **Batch Processing**: Process multiple images concurrently
- 📝 **TypeScript First**: Built with TypeScript for excellent type safety
- 🔄 **Async/Await**: Modern Promise-based API
- 🔑 **Flexible Auth**: Multiple authentication methods including environment variables
- 🛡️ **Error Handling**: Comprehensive error handling

## 📦 Installation
```bash
npm install jstextfromimage
```

## 🚀 Quick Start
You can use the services either with environment variables or with direct initialization.
### Using Environment Variables
```typescript
import { openai, claude, azureOpenai } from 'jstextfromimage';

// Services will automatically use environment variables
const description = await openai.getDescription('https://example.com/image.jpg');
```

### Direct Initialization
```typescript
import { OpenAIService, ClaudeService, AzureOpenAIService } from 'jstextfromimage';

// OpenAI custom instance
const customOpenAI = new OpenAIService('your-openai-api-key');

// Claude custom instance
const customClaude = new ClaudeService('your-claude-api-key');

// Azure OpenAI custom instance
const customAzure = new AzureOpenAIService({
  apiKey: 'your-azure-api-key',
  endpoint: 'your-azure-endpoint',
  deploymentName: 'your-deployment-name'
});
```

### OpenAI Service
```typescript
import { openai } from 'jstextfromimage';

// Single image analysis
const description = await openai.getDescription('https://example.com/image.jpg', {
  prompt: "Describe the main elements of this image",
  maxTokens: 500,
  model: 'gpt-4o'
});

// Batch processing
const imageUrls = [
  'https://example.com/image1.jpg',
  'https://example.com/image2.jpg',
  'https://example.com/image3.jpg'
];

const results = await openai.getDescriptionBatch(imageUrls, {
  prompt: "Analyze this image in detail",
  maxTokens: 300,
  concurrency: 2,
  model: 'gpt-4o'
});

// Process results
results.forEach(result => {
  if (result.error) {
    console.error(`Error processing ${result.imageUrl}: ${result.error}`);
  } else {
    console.log(`Description for ${result.imageUrl}: ${result.description}`);
  }
});
```

### Claude Service
```typescript
import { claude } from 'jstextfromimage';

// Single image analysis
const description = await claude.getDescription('https://example.com/artwork.jpg', {
  prompt: "Analyze this artwork, including style and composition",
  maxTokens: 1000,
  model: 'claude-3-sonnet-20240229'
});

// Batch processing
const artworkUrls = [
  'https://example.com/artwork1.jpg',
  'https://example.com/artwork2.jpg'
];

const analyses = await claude.getDescriptionBatch(artworkUrls, {
  prompt: "Provide a detailed art analysis",
  maxTokens: 800,
  concurrency: 2,
  model: 'claude-3-sonnet-20240229'
});
```

### Azure OpenAI Service
```typescript
import { azureOpenai } from 'jstextfromimage';

// Single image analysis
const description = await azureOpenai.getDescription('https://example.com/scene.jpg', {
  prompt: "Describe this scene in detail",
  maxTokens: 400,
  systemPrompt: "You are an expert in visual analysis."
});

// Batch processing
const sceneUrls = [
  'https://example.com/scene1.jpg',
  'https://example.com/scene2.jpg'
];

const analyses = await azureOpenai.getDescriptionBatch(sceneUrls, {
  prompt: "Analyze the composition and mood",
  maxTokens: 500,
  concurrency: 3,
  systemPrompt: "You are an expert cinematographer."
});
```

## 💡 Configuration
### Default Values
```typescript
// OpenAI defaults
{
  model: 'gpt-4o',
  maxTokens: 300,
  prompt: "What's in this image?",
  concurrency: 3 // for batch processing
}

// Claude defaults
{
  model: 'claude-3-sonnet-20240229',
  maxTokens: 300,
  prompt: "What's in this image?",
  concurrency: 3
}

// Azure OpenAI defaults
{
  maxTokens: 300,
  prompt: "What's in this image?",
  systemPrompt: "You are a helpful assistant.",
  concurrency: 3
}
```

### Local File Support
```typescript
import { openai } from 'jstextfromimage';

// Single local file
const description = await openai.getDescription('/path/to/local/image.jpg', {
  prompt: "Describe this image",
  maxTokens: 300,
  model: 'gpt-4o'
});

// Mix of local files and URLs in batch processing
const images = [
  '/path/to/local/image1.jpg',
  'https://example.com/image2.jpg',
  '/path/to/local/image3.png'
];

const results = await openai.getDescriptionBatch(images, {
  prompt: "Analyze each image",
  maxTokens: 300,
  concurrency: 2
});
```

### Environment Variables
```env
# OpenAI
OPENAI_API_KEY=your-openai-api-key

# Claude
ANTHROPIC_API_KEY=your-claude-api-key

# Azure OpenAI
AZURE_OPENAI_API_KEY=your-azure-api-key
AZURE_OPENAI_ENDPOINT=your-azure-endpoint
AZURE_OPENAI_DEPLOYMENT=your-deployment-name
```

### Options Interfaces
```typescript
// Base options for all services
interface BaseOptions {
  prompt?: string;
  maxTokens?: number;
  concurrency?: number; // For batch processing
}

// OpenAI-specific options
interface OpenAIOptions extends BaseOptions {
  model?: string;
}

// Claude-specific options
interface ClaudeOptions extends BaseOptions {
  model?: string;
}

// Azure OpenAI-specific options
interface AzureOpenAIOptions extends BaseOptions {
  systemPrompt?: string;
}

// Azure OpenAI configuration
interface AzureOpenAIConfig {
  apiKey?: string;
  endpoint?: string;
  deploymentName?: string;
  apiVersion?: string;
}

// Batch processing results
interface BatchResult {
  imageUrl: string;
  description: string;
  error?: string;
}
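
// The `concurrency` option caps how many images are processed in parallel.
// A minimal sketch of one way such a limiter can be implemented (an
// illustration only, not the library's actual internals):
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item
  // Spawn `limit` workers that each pull the next item until none remain
  const workers = Array.from({ length: Math.min(limit, items.length) }, async () => {
    while (next < items.length) {
      const i = next++; // safe: JS is single-threaded, no await between read and increment
      results[i] = await fn(items[i]);
    }
  });
  await Promise.all(workers);
  return results;
}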
```

## 🔍 Error Handling Examples
```typescript
// Single image with error handling
try {
  const description = await openai.getDescription(imageUrl, {
    maxTokens: 300
  });
  console.log(description);
} catch (error) {
  console.error('Failed to process image:', error);
}

// Batch processing with retry
async function processWithRetry(imageUrls: string[], maxRetries = 3) {
  const results = await openai.getDescriptionBatch(imageUrls, {
    maxTokens: 300,
    concurrency: 2
  });

  let failedItems = results.filter(r => r.error);
  let retryCount = 0;

  while (failedItems.length > 0 && retryCount < maxRetries) {
    const retryUrls = failedItems.map(item => item.imageUrl);
    const retryResults = await openai.getDescriptionBatch(retryUrls, {
      maxTokens: 300,
      concurrency: 1 // Lower concurrency for retries
    });

    // Update results with successful retries
    retryResults.forEach(result => {
      if (!result.error) {
        const index = results.findIndex(r => r.imageUrl === result.imageUrl);
        if (index !== -1) {
          results[index] = result;
        }
      }
    });

    // Re-check which items still failed before the next attempt
    failedItems = results.filter(r => r.error);
    retryCount++;
  }

  return results;
}
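
// For transient failures (rate limits, timeouts) it can also help to back off
// between attempts. A generic helper, shown here as a sketch (not part of
// jstextfromimage's API):
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      await new Promise(r => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}

// Example: wrap a single getDescription call
// const description = await withBackoff(() =>
//   openai.getDescription('https://example.com/image.jpg', { maxTokens: 300 })
// );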
```

## 🛠️ Development

```bash
# Install dependencies
npm install

# Run tests
npm test

# Build the project
npm run build

# Run linting
npm run lint
```

## 🤝 Contributing
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -am 'feat: add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📝 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 💬 Support
For support, please [open an issue](https://github.com/OrenGrinker/jstextfromimage/issues/new) on GitHub.