The end goal of this project is to help people learn (Spanish) words from TV series subtitles using Quizlet (or another flash card project):
...
phase 1- Python script (create a word list and translate it) [+ create a dictionary per language / per season = sum of all word lists, for future applications / statistics]
phase 2- website: subtitle search, check if it already exists, and create a Quizlet
phase 3- apply the script to many subtitle languages / TV series with many episodes
phase 4- graphs on the website (statistics of word usage per single subtitle file and across multiple subtitle files)
phase 5- integrate learning progress from Quizlet (or other) to create new lists of words per subtitle file
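
As a rough illustration of the word-list part of phase 1, here is a minimal sketch in Python. The function name extract_words and the regular expressions are assumptions made for this example, not part of the project, and translation is left out because it would depend on an external service that is not chosen yet.

```python
import re
from collections import Counter

# Hypothetical helpers for illustration; the project does not define them.
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->")  # srt timing line
TAG = re.compile(r"<[^>]+>")                                   # <i>, <b>, ... markup
WORD = re.compile(r"[^\W\d_]+")                                # runs of letters only


def extract_words(srt_text):
    """Count the words found in the dialogue lines of an .srt subtitle."""
    counts = Counter()
    for line in srt_text.splitlines():
        line = line.strip()
        # Skip blank lines, cue numbers and timestamp lines.
        if not line or line.isdigit() or TIMESTAMP.match(line):
            continue
        line = TAG.sub("", line)
        counts.update(word.lower() for word in WORD.findall(line))
    return counts


sample = """1
00:00:01,000 --> 00:00:03,000
¿Qué tal, amigo?

2
00:00:04,000 --> 00:00:06,000
<i>Bien, ¿qué tal tú?</i>
"""

word_counts = extract_words(sample)
for word, count in word_counts.most_common():
    print(word, count)
```

Summing the Counter objects of several episodes would give the per-season dictionary mentioned in phase 1.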
...
-o <serieS01E01.csv>
-d <description_file.csv>
-e translate an expression as one unit: "Qué tal" and not "Qué" "tal"
--dbo <other output format, database format, to be defined in the future>
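
One possible way to handle -e is to merge known multi-word expressions before translating. The sketch below assumes the expressions come from a user-supplied list; the function name merge_expressions and that list are hypothetical, since the project does not say yet where expressions would come from.

```python
def merge_expressions(tokens, expressions):
    """Join runs of tokens that match a known multi-word expression."""
    # Store each expression as a tuple of lowercase words for comparison.
    known = {tuple(expr.lower().split()) for expr in expressions}
    longest = max((len(key) for key in known), default=1)
    merged = []
    i = 0
    while i < len(tokens):
        # Try the longest window first so longer expressions win.
        for size in range(min(longest, len(tokens) - i), 1, -1):
            window = tuple(tok.lower() for tok in tokens[i:i + size])
            if window in known:
                merged.append(" ".join(tokens[i:i + size]))
                i += size
                break
        else:
            # No expression starts here: keep the single word.
            merged.append(tokens[i])
            i += 1
    return merged


result = merge_expressions(["Qué", "tal", "amigo"], ["qué tal"])
print(result)  # ['Qué tal', 'amigo']
```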
...
--ol <original_language: esp>
--tl <target_language: eng>
--st <second language: romanization or transliteration: Pinyin, ...>
--ra <remove articles: el, la, una, ...>
--bstat <basic statistics: only a summary of word counts>
--fstat <full statistics report: everything from the options below>
--swstat <single-word statistics>
--swnastat <single words, no articles, statistics> (is it needed?)
--swonstat <single words, nouns only, statistics>
--swovstat <single words, verbs only, statistics>
--swotrstat <single words, only the rest, statistics>
--swiestat <single words including expressions (like "Qué tal") statistics>
--mfl <multi-file option: list of files>
--mfd <multi-file directory>
--mfso <multi-file, single output .csv file>
--mfmo <multi-file, multiple output .csv files>
--mstat <multi-file statistics>
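
A subset of the options above could be declared with Python's argparse roughly as follows. This is only a sketch: the positional subtitle argument, the dest names, the help texts, and the esp/eng defaults are assumptions for the example, and which options are flags and which take a value is still open.

```python
import argparse


def build_parser():
    """Declare a subset of the planned options (types and defaults assumed)."""
    parser = argparse.ArgumentParser(
        description="Build a word list from subtitle files.")
    # Hypothetical positional argument: the plan does not say how input is given.
    parser.add_argument("subtitle", nargs="?", help="input .srt file")
    parser.add_argument("-o", dest="output", help="output .csv file")
    parser.add_argument("-d", dest="description", help="description .csv file")
    parser.add_argument("-e", dest="expressions", action="store_true",
                        help="treat expressions such as 'Qué tal' as one unit")
    parser.add_argument("--ol", default="esp", help="original language")
    parser.add_argument("--tl", default="eng", help="target language")
    parser.add_argument("--ra", action="store_true", help="remove articles")
    parser.add_argument("--bstat", action="store_true", help="basic statistics")
    parser.add_argument("--fstat", action="store_true", help="full statistics")
    parser.add_argument("--mfl", nargs="+", default=[], help="list of files")
    parser.add_argument("--mfd", help="directory of subtitle files")
    return parser


args = build_parser().parse_args(
    ["ep1.srt", "-o", "serieS01E01.csv", "-e", "--ra"])
print(args.output, args.expressions, args.ra, args.ol, args.tl)
```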
Future:
1- create a website to search / download subtitle files (srt)
...