This dataset is presented in the paper "Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts", published at LREC 2022.
Hi, thank you for this dataset. I need to use the face-cropped version of this dataset. However, when I ran the "Extract Single-speaker Snippets" commands, only videos from 2006 ended up in the Merkel_Single_Speaker folder, and the processed outputs were then deleted by the "rm" commands. This is caused by the "bash files/to_run.sh" command: the script only lists files from 2006. Could you provide the videos for the other years too? Thank you.
I don't know if this is specific to my setup, but I had to remove the "stderr=subprocess.PIPE" argument from the function call. After that, everything started working again.
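
For anyone hitting the same thing: I have not checked how the repository's function actually uses the pipe, so treat this as a guess. A common cause of this kind of hang is that a child process writes a lot to stderr while nothing reads the pipe, so the child blocks once the pipe buffer is full. Below is a minimal sketch of two ways to avoid that without dropping the argument blindly. The helper names and the `NOISY_CMD` command are made up for illustration and are not part of the repository.

```python
import subprocess
import sys

# Hypothetical stand-in for a chatty external tool: writes 200000 bytes to stderr.
NOISY_CMD = [sys.executable, "-c", "import sys; sys.stderr.write('x' * 200000)"]


def run_discarding_stderr(cmd):
    """Run cmd and throw its stderr away, so no pipe can ever fill up."""
    return subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=False,
    )


def run_capturing_stderr(cmd):
    """Run cmd and keep stderr; subprocess.run drains both pipes while waiting."""
    return subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=False,
    )
```

If the original code uses `subprocess.Popen(...)` followed by `wait()`, switching to `subprocess.run(...)` or calling `communicate()` should have a similar effect, since both read the pipes while the child is running.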