webiummedia
Posts: 2
Joined: Sat Dec 27, 2014 12:11 am

video language recognition

Sat Dec 27, 2014 12:22 am

Hey guys! I am in the process of creating a media center to stream all the movies I have to my TV. So far, the project is working great. I would like to add a new feature, but I am unsure where to start. I have about 1,500 movies; half of them are French and half are English. I am looking for a 100% reliable method to retrieve the language of a movie. I tried using the metadata (ID3), but it is wrong or missing in a lot of cases, which makes this an unreliable method. What I would like is a Linux command-line script that I feed the path to the video and that returns the language. I guess the best way would be to extract the audio, transcribe it to plain text, and identify the language from there. I know how to find the language once it's in text; I just need a fast way to do the whole process in less than 3 seconds. Is that possible, and how? My code is in Python, but anything that works from the command line is OK.

davidcoton
Posts: 5026
Joined: Mon Sep 01, 2014 2:37 pm
Location: Cambridge, UK

Re: video language recognition

Sat Dec 27, 2014 11:27 am

I imagine it's possible, but not easy. However, finding and opening a movie, finding and identifying some speech, and extracting enough to work with (as text) will undoubtedly take longer than 3 seconds.

How about keeping a database -- enter the language from metadata to start with, and a confirmed/provisional flag. Every time you call up a movie, if it's unconfirmed, present a dialogue to confirm it or change it. Subsequently accessing the database and retrieving the language should be fast enough.

If you can edit the metadata, that would be a more "object orientated" approach than an external database.
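As a rough illustration of the database idea above, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and function names are all made up for the example, and the same schema should carry over to another database engine with little change.

```python
import sqlite3

def open_db(path=":memory:"):
    """Create the (hypothetical) movies table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies ("
        " path TEXT PRIMARY KEY,"
        " language TEXT,"
        " confirmed INTEGER NOT NULL DEFAULT 0)"
    )
    return conn

def add_provisional(conn, path, language):
    """Store a guess (e.g. from metadata); never overwrite an existing row."""
    conn.execute(
        "INSERT OR IGNORE INTO movies (path, language, confirmed) VALUES (?, ?, 0)",
        (path, language),
    )
    conn.commit()

def confirm(conn, path, language):
    """Record the language the user confirmed or corrected in the dialogue."""
    conn.execute(
        "UPDATE movies SET language = ?, confirmed = 1 WHERE path = ?",
        (language, path),
    )
    conn.commit()

def lookup(conn, path):
    """Return (language, confirmed) for a movie, or None if it is unknown."""
    row = conn.execute(
        "SELECT language, confirmed FROM movies WHERE path = ?", (path,)
    ).fetchone()
    return None if row is None else (row[0], bool(row[1]))
```

The interface would call lookup() when a movie is opened and only show the confirmation dialogue while the confirmed flag is still false.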

ghans
Posts: 7882
Joined: Mon Dec 12, 2011 8:30 pm
Location: Germany

Re: video language recognition

Sat Dec 27, 2014 11:37 am

It will probably take 3 minutes per movie, not three seconds.
A multicore PC is much better suited to this task.

ghans

gkreidl
Posts: 6326
Joined: Thu Jan 26, 2012 1:07 pm
Location: Germany

Re: video language recognition

Sat Dec 27, 2014 12:53 pm

Move the French movies into a "French" sub-folder, English movies into an "English" sub-folder.

If you cannot do this manually (because you don't know and it is not reflected in the file name), create a script that walks through your video files / folders and use the output of "mediainfo" to check the language. That might work with a number of videos, but not all.

For the remaining videos I would create a script that starts them, jumps a bit ahead, plays for 30 seconds and then asks you to which language folder they should be moved.
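The mediainfo step could look roughly like the sketch below. It assumes a mediainfo build that supports JSON output (`--Output=JSON`) and that the report keeps its tracks under `media` / `track`, with `@type` and `Language` fields; check this against the output of your installed version. The function names are only illustrative.

```python
import json
import subprocess

def audio_languages(mediainfo_json):
    """Return the Language field of every audio track in mediainfo's JSON report."""
    report = json.loads(mediainfo_json)
    # "media" may be null if mediainfo could not read the file (assumption).
    tracks = (report.get("media") or {}).get("track", [])
    return [t["Language"] for t in tracks
            if t.get("@type") == "Audio" and t.get("Language")]

def probe(path):
    """Run mediainfo on one file; an empty list means no language tag was found."""
    out = subprocess.run(
        ["mediainfo", "--Output=JSON", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return audio_languages(out)
```

Files for which probe() returns an empty list, or more than one distinct language, are the ones that would go on to the manual "play 30 seconds and ask" step.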

webiummedia
Posts: 2
Joined: Sat Dec 27, 2014 12:11 am

Re: video language recognition

Sat Dec 27, 2014 2:17 pm

Right now I am using a MySQL database to store the movie info. I also fetch some info from TMDB for the movie description and image. I made myself a Netflix-like interface. My problem is that I prefer watching English movies but my wife prefers French. So I wanted to add a filter to my interface to show only the movies in the language chosen by the user.

I tried using the metadata with ID3, but a lot of the movies do not seem to have complete metadata, and sometimes the actual language is not the one given. I am just trying to find a 100% accurate way to find out what language a given movie is in.

As for the sub-folder strategy, that's not possible. I am trying to create a universal, foolproof media center where you just plug in the HDD containing the movies and POOF, it's all added to the interface. I already have the crawler working; I just want to add a new feature to my current code.

I was thinking of using FFmpeg to extract 3 different 30-second audio clips, then send them to a speech-to-text script and find out which language is most probable. If 2 out of 3 point to the same language, then I have a match. I can do this as a subprocess working in the background, so if it really takes 3 minutes, at least the user can use the app while it's sorting out the languages.
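Roughly, the clip extraction and the 2-out-of-3 vote could be sketched like this. The function names are hypothetical, the mono 16 kHz WAV format is just an assumption about what a speech-to-text tool might want, and the speech-to-text and language detection steps themselves are left out.

```python
from collections import Counter

def sample_starts(duration_s, clips=3):
    """Spread clip start times evenly through the movie, away from both ends."""
    step = duration_s / (clips + 1)
    return [int(step * (i + 1)) for i in range(clips)]

def clip_command(video, start_s, out_wav, length_s=30):
    """Build an ffmpeg command that extracts one audio-only clip."""
    return [
        "ffmpeg", "-y",
        "-ss", str(start_s),         # seek before opening the input
        "-t", str(length_s),         # clip duration in seconds
        "-i", video,
        "-vn",                       # drop the video stream
        "-ac", "1", "-ar", "16000",  # mono, 16 kHz (assumed STT-friendly)
        out_wav,
    ]

def majority_language(guesses, needed=2):
    """Return the language named by at least `needed` clips, else None."""
    votes = Counter(g for g in guesses if g)
    if not votes:
        return None
    language, count = votes.most_common(1)[0]
    return language if count >= needed else None
```

Each command list can be passed to subprocess.run() from the background worker. A result of None from majority_language() means the clips disagreed or nothing was detected, so the movie should stay unclassified rather than be guessed.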

Do you think the Raspberry Pi can run FFmpeg and stream a video at the same time?

davidcoton
Posts: 5026
Joined: Mon Sep 01, 2014 2:37 pm
Location: Cambridge, UK

Re: video language recognition

Sat Dec 27, 2014 8:16 pm

webiummedia wrote: Do you think the Raspberry Pi can run FFmpeg and stream a video at the same time?
There's only one way to find out...

If not, just leave the Pi on for a few days, and arrange to pause the crawler while you're watching something.
