https://github.com/mebjas/movie-name-extractor
Movie file names are usually encoded so that they carry as much information as possible. This repo contains code in different languages to parse these names and extract that information.
- Host: GitHub
- URL: https://github.com/mebjas/movie-name-extractor
- Owner: mebjas
- Created: 2013-11-21T21:48:31.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2014-04-21T12:10:18.000Z (about 11 years ago)
- Last Synced: 2025-04-10T07:32:58.742Z (12 days ago)
- Language: PHP
- Homepage:
- Size: 334 KB
- Stars: 16
- Watchers: 4
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
movie information extraction
====================

**Continuous Integration**
[](https://travis-ci.org/mebjas/movie-name-extractor)

**Code coverage:**
[](https://coveralls.io/r/mebjas/movie-name-extractor)

> code to extract movie name out of file name

**step 1:** regex filter 1: remove all data in () and [] after extracting information such as season number / episode number from it. The regular expressions used are:

    "/\[.*?\]|\(.*?\)/"
    "/s(\d{1,2})e(\d{1,3})|(\d{1,2})x(\d{1,3})/";

**step 2:** Now a movie name may look like:

    Iron.Man.3.213.3D.18p

Replace each "." with " " (white space) to get a name like:

    Iron Man 3 213 3D 18p

> Now the movie name is of the form:

    [name] [version] [year] [3D/2D] [resolution]

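Steps 1 and 2 can be sketched as follows. This is a minimal Python sketch, not the repository's own PHP code: the function name `clean_name` is hypothetical, the two patterns are the ones from step 1 with the PHP-style delimiters dropped, and searching the whole file name for the season/episode marker (rather than only the bracketed parts) is an assumption.

```python
import re

# Patterns from step 1, without the surrounding "/" delimiters.
BRACKETS = re.compile(r"\[.*?\]|\(.*?\)")
# Case-sensitive, as written in the README (matches "s01e02" or "1x02").
EPISODE = re.compile(r"s(\d{1,2})e(\d{1,3})|(\d{1,2})x(\d{1,3})")


def clean_name(filename):
    """Return (cleaned_name, season, episode); season/episode are None if absent."""
    season = episode = None
    match = EPISODE.search(filename)
    if match:
        # Either the sXXeYY groups (1, 2) or the XXxYY groups (3, 4) are set.
        season = int(match.group(1) or match.group(3))
        episode = int(match.group(2) or match.group(4))
    # Step 1: drop everything inside [] and ().
    name = BRACKETS.sub("", filename)
    # Step 2: dots become spaces; collapse the whitespace runs left behind.
    name = re.sub(r"\s+", " ", name.replace(".", " ")).strip()
    return name, season, episode
```

For example, `clean_name("Iron.Man.3.213.3D.18p")` gives the name `Iron Man 3 213 3D 18p` with no season or episode, while a bracketed marker such as `[s01e02]` is read first and then removed together with its brackets.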
> So now we need to parse this kind of name to get all the information!

Parsing can be done using different **regex filters**:

Regex 1: **to identify movie part no** like in Iron Man **3**

    /\b\d /i

Regex 2: **to determine the year** like Iron Man **213** or Iron Man **2013**

    /\d{3,4}/i

Regex 3: **to determine dimensions** like Iron Man **3D** or Iron Man **3d**

    /[0-90]{1}d|[0-9]{1}D/i

Regex 4: **to determine Resolution** like Iron Man **18p** or Iron Man **1080p**

    /\d{1,4}p|\d{1,4}P/

And from this we can generate information like:


    Filename: Iron.Man.3.213.3D.18p
    Title: Iron Man 3
    Part: 3
    Year: 213
    Resolution: 18p
    Dimension: 3D

**Dependencies:**
[](https://gemnasium.com/mebjas/movie-name-extractor)
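
The four regex filters described earlier can be combined roughly as below. This is a hedged Python sketch under stated assumptions, not the repository's PHP implementation: `parse_name` is a hypothetical name, the patterns are Python equivalents of Regex 1 to 4, and two choices are my own: resolution and dimension are matched before the year, and the title is taken to be everything before the first metadata token.

```python
import re

# Python equivalents of Regex 1-4 (delimiters dropped, /i as IGNORECASE).
PART = re.compile(r"\b\d ")
YEAR = re.compile(r"\d{3,4}")
DIMENSION = re.compile(r"[0-9]d", re.IGNORECASE)
RESOLUTION = re.compile(r"\d{1,4}p", re.IGNORECASE)


def parse_name(name):
    """Parse a cleaned name such as 'Iron Man 3 213 3D 18p' into a dict."""
    info = {"title": name, "part": None, "year": None,
            "dimension": None, "resolution": None}
    cut = len(name)  # index where the title is assumed to end

    # Assumption: take resolution and dimension first and blank them out,
    # so the digits of e.g. "1080p" are not picked up as a year afterwards.
    rest = name
    for key, pattern in (("resolution", RESOLUTION), ("dimension", DIMENSION)):
        match = pattern.search(rest)
        if match:
            info[key] = match.group(0)
            cut = min(cut, match.start())
            blank = " " * len(match.group(0))
            rest = rest[:match.start()] + blank + rest[match.end():]

    match = YEAR.search(rest)
    if match:
        info["year"] = match.group(0)
        cut = min(cut, match.start())

    # Assumption: the title is everything before the first metadata token.
    info["title"] = name[:cut].strip()
    # Regex 1 needs a trailing space, so add one back after stripping.
    match = PART.search(info["title"] + " ")
    if match:
        info["part"] = match.group(0).strip()
    return info
```

Calling `parse_name("Iron Man 3 213 3D 18p")` yields the same fields as the example output in this README. The patterns are deliberately as loose as the originals, so a digit followed by "d" or "p" inside a title would still be misread.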