Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ashtishad/log-analyzer
A tool to identify the most loyal customers by analyzing their activity logs, utilizing concurrent processing using file chunks for improved performance.
https://github.com/ashtishad/log-analyzer
concurrent-programming golang structured-logging
Last synced: about 1 month ago
JSON representation
A tool to identify the most loyal customers by analyzing their activity logs, utilizing concurrent processing using file chunks for improved performance.
- Host: GitHub
- URL: https://github.com/ashtishad/log-analyzer
- Owner: ashtishad
- Created: 2024-10-05T13:49:11.000Z (3 months ago)
- Default Branch: master
- Last Pushed: 2024-10-05T22:37:02.000Z (3 months ago)
- Last Synced: 2024-11-16T06:44:47.799Z (about 2 months ago)
- Topics: concurrent-programming, golang, structured-logging
- Language: Go
- Homepage:
- Size: 18.3 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Log Analyzer for Loyal Customer Identification
A tool to identify the most loyal customers by analyzing their activity logs, utilizing concurrent processing using file chunks for improved performance.
## Input Data
- Daily logs of customer activity in a structured JSON format with log/slog.
- Log entry format: `{"userId": int, "pageName": string, "timestamp": ISO8601}`
- **1 million** log entries per day, **1 million** unique customers, 20 website pages/router group## Loyal Customer Criteria
1. Visited on both days under analysis
2. Accessed at least 4 unique pages# Expected Output
```
Loyal Customer Count: 180066
Time elapsed: 418.303584ms// commented in main.go to avoid large output slice
Loyal Customers: [2 8 13 15 19 22 25 35 40 45 ....................... ]```
## Implementation
### 1. Log Generation (seed/generator.go)
- Generate structured slog log files for consecutive days
- Provide a controlled dataset with mixed loyal and non-loyal customer activities### 2. Concurrent Log Processing (processor/processor.go)
- Split files into chunks for parallel processing while preserving log integrity
- NumOfChunks can be set to runtime.NumCPU() to utilize all cores.
- Implement concurrent processing using goroutines and channels
- Optimize file reading with buffered I/O and preallocated slices
- Parse logs using fast byte-level operations, tailored for our JSON structure### 3. Loyal Customer Identification Algorithm (main.go)
- Process log data from both days concurrently
- Track user visits and unique page views using a map for efficient lookups
- Apply loyal customer criteria: visits on both days and minimum unique pages
- Generate a sorted list of loyal customer IDs**Further Reading:** [The One Billion Row Challenge in Go](https://benhoyt.com/writings/go-1brc/)
# Directory structure
```
log-analyzer/
│
├── .github/
│ └── workflows/
│ └── test.yaml # CI workflow for automated testing
│
├── seed/
│ └── generator.go # Generates sample log data for testing
│
├── files/
│ ├── logs_2024-10-01.log # generated 1 million log entries from day 1
│ └── logs_2024-10-02.log # generated 1 million log entries from day 2
│
├── proccessor/
│ └── processor.go # Core log processing logic
│
├── main.go # Main application entry point
├── main_test.go # Unit, Fuzz and Benchmark tests.
├── README.md # Project documentation
└── go.mod # Go module definition
```