How To: Improve video search by parsing video & text

This is a Google Tech Talk from March 26, 2008, presented by Timothee Cour, Research Scientist.

Movies and TV are a rich source of highly diverse and complex video of people, objects, actions and locales "in the wild". Harvesting automatically labeled sequences of actions from video would enable the creation of large-scale, highly varied datasets. To enable such collection, we focus on the task of recovering scene structure in movies and TV series for object/person tracking and action retrieval.

We present a weakly supervised algorithm that uses the screenplay and closed captions to parse a movie into a hierarchy of shots and scenes. Scene boundaries in the movie are aligned with screenplay scene labels, and shots are reordered into a sequence of long continuous tracks, or threads, which allow for more accurate tracking of people and actions across shot boundaries. Scene segmentation, alignment, and shot threading are formulated as inference in a unified generative model. We introduce a novel hierarchical dynamic programming algorithm that can handle alignment and jump-limited reorderings in linear time.

We present quantitative and qualitative results on movie alignment and parsing, and use the recovered structure for tracking and naming of characters, as well as retrieval of common actions, in several episodes of popular TV series.

If time permits, we will also present our recent results on approximate inference with eigenvalue optimization.
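
To make the alignment idea in the abstract more concrete, here is a minimal sketch of how screenplay dialogue lines might be matched to closed-caption lines with dynamic programming. It is not the hierarchical algorithm from the talk: it only does a plain monotonic (order-preserving) alignment in quadratic time, with no shot reordering. The names `similarity` and `align`, the word-overlap score, and the gap penalty of -0.2 are all assumptions chosen for illustration.

```python
from typing import List, Optional


def similarity(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two lines."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def align(script: List[str], captions: List[str],
          gap: float = -0.2) -> List[Optional[int]]:
    """Monotonically align screenplay lines to caption lines.

    Returns, for each screenplay line, the index of the caption it was
    matched to, or None if it was left unmatched.
    """
    n, m = len(script), len(captions)
    # score[i][j] = best score aligning the first i script lines
    # with the first j caption lines.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = score[i - 1][j - 1] + similarity(script[i - 1], captions[j - 1])
            skip_script = score[i - 1][j] + gap
            skip_caption = score[i][j - 1] + gap
            score[i][j] = max(match, skip_script, skip_caption)

    # Trace back from the bottom-right corner to recover the matches.
    result: List[Optional[int]] = [None] * n
    i, j = n, m
    while i > 0 and j > 0:
        sim = similarity(script[i - 1], captions[j - 1])
        if score[i][j] == score[i - 1][j - 1] + sim:
            if sim > 0:  # do not report pairs with no words in common
                result[i - 1] = j - 1
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return result


script = ["Hello there Bob", "Where is the car", "I parked it outside"]
captions = ["hello there bob", "(door slams)", "where is the car",
            "i parked it outside"]
print(align(script, captions))  # [0, 2, 3]
```

Because closed captions carry timestamps and the screenplay carries speaker names and scene labels, a matching like this would let the labels be attached to time ranges in the video. The approach described in the talk goes further by handling reordered shots and by running in linear time.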

Speaker: Timothee Cour - Research Scientist
Timothee Cour is a fifth-year PhD student in Computer Science at the University of Pennsylvania, Philadelphia. He completed his undergraduate education at the Ecole Polytechnique in France, majoring in Computer Science and Applied Mathematics. His research advisor is Prof. Ben Taskar, and he has also worked closely with Prof. Jianbo Shi.
