https://github.com/hyparam/hylang
A stupidly small and fast programming language detection model
- Host: GitHub
- URL: https://github.com/hyparam/hylang
- Owner: hyparam
- License: mit
- Created: 2024-06-20T17:29:24.000Z (11 months ago)
- Default Branch: master
- Last Pushed: 2024-08-20T15:49:00.000Z (9 months ago)
- Last Synced: 2025-05-09T19:05:25.041Z (13 days ago)
- Topics: detection, language, language-detection, language-detector
- Language: Python
- Homepage:
- Size: 1.39 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# HyLang

A stupidly small and fast programming language detection model.
## Usage
```js
import { detectLanguage } from 'hylang'

const input = `
function square(x) {
  return x * x
}
`
console.log(`Predicted language: ${detectLanguage(input)}`)
```

## Accuracy
The Hylang evaluation set is sampled from the starcoderdata dataset.
The packed model is 30.4 kB and achieves 74.5% accuracy.
## Implementation
The language detector is a bag-of-words model trained on the starcoderdata dataset.
Training is done in Python with PyTorch. Model weights are exported to `params.json` so they can be used from JavaScript.
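
To make the bag-of-words idea concrete, here is a minimal sketch of how inference over exported weights could look. It is not Hylang's actual code: the `params` layout (`labels`, `vocab`, `weights`, `bias`), the tokenizer, and the `predict` helper are all hypothetical, and the real `params.json` may be structured differently.

```js
// Hypothetical parameter layout, standing in for an exported params.json.
// The real file may use different keys, shapes, and many more labels.
const params = {
  labels: ['javascript', 'python'],
  // Maps a token to its column in the weight matrix.
  vocab: { function: 0, def: 1, return: 2, const: 3, import: 4 },
  // weights[k][j] is the score that token j contributes to label k.
  weights: [
    [2.0, -2.0, 0.5, 2.0, 0.3],
    [-2.0, 2.0, 0.5, -2.0, 0.3],
  ],
  bias: [0.0, 0.0],
}

// Assumed tokenizer: runs of ASCII letters and underscores.
function tokenize(text) {
  return text.match(/[A-Za-z_]+/g) || []
}

// Linear bag-of-words scoring: add each known token's weights to the
// per-label scores (word order is ignored), then return the top label.
function predict(text, { labels, vocab, weights, bias }) {
  const scores = bias.slice()
  for (const token of tokenize(text)) {
    // Own-property check so tokens like "constructor" are not looked up
    // on Object.prototype.
    if (!Object.prototype.hasOwnProperty.call(vocab, token)) continue
    const index = vocab[token]
    for (let k = 0; k < labels.length; k++) {
      scores[k] += weights[k][index]
    }
  }
  let best = 0
  for (let k = 1; k < scores.length; k++) {
    if (scores[k] > scores[best]) best = k
  }
  return labels[best]
}

console.log(predict('function square(x) { return x * x }', params)) // javascript
console.log(predict('def square(x):\n    return x * x', params)) // python
```

Because the model in this sketch is a single weight lookup and sum per token, inference needs no ML runtime, which is consistent with shipping the weights as plain JSON.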