Multimedia (cs.MM)

  • PDF
    Network-coding-based peer-to-peer streaming is an effective way to aggregate user capacities and increase system throughput in live multimedia streaming. Nonetheless, such systems are vulnerable to pollution attacks, where a handful of malicious peers can disrupt communication by transmitting just a few bogus packets, which are then recombined and relayed by unaware honest nodes, spreading the pollution further over the network. Whereas previous research focused on malicious-node identification schemes and pollution-resilient coding, in this paper we present pollution countermeasures that make a standard network coding scheme resilient to pollution attacks. Using a simple yet effective analytical model of a reference node collecting packets from malicious and honest neighbors, we demonstrate that i) packets received earlier are less likely to be polluted and ii) short generations increase the likelihood of recovering a clean generation. We therefore propose a recombination scheme in which nodes draw the packets to be recombined according to their age in the input queue, paired with short generations and a decoding scheme able to detect the reception of polluted packets early in the decoding process. The effectiveness of our approach is experimentally evaluated in a real system that we developed and deployed on hundreds to thousands of peers. Experimental evidence shows that, thanks to our simple countermeasures, the effect of a pollution attack is almost canceled and the video quality experienced by the peers is comparable to pre-attack levels.
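The age-based recombination idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how age maps to selection probability, so the geometric `decay` weighting, the function names, and the use of XOR (coding over GF(2)) are all assumptions made for illustration only.

```python
import random

def age_weights(n, decay=0.5):
    """Assumed weighting: position 0 is the oldest packet and gets the largest weight."""
    return [decay ** i for i in range(n)]

def draw_by_age(queue, k, decay=0.5, rng=random):
    """Draw up to k distinct packets from `queue` (ordered oldest first),
    favouring older packets, which the abstract argues are less likely polluted."""
    candidates = list(range(len(queue)))
    weights = age_weights(len(queue), decay)
    chosen = []
    for _ in range(min(k, len(queue))):
        # Weighted draw over the remaining candidates, then remove the pick
        # so that no packet is selected twice.
        pos = rng.choices(range(len(candidates)), weights=weights)[0]
        chosen.append(candidates.pop(pos))
        weights.pop(pos)
    return [queue[i] for i in sorted(chosen)]

def xor_combine(packets):
    """Recombine equal-length payloads by XOR (a hypothetical GF(2) stand-in
    for the coding field, which the abstract does not state)."""
    out = bytearray(len(packets[0]))
    for payload in packets:
        for j, byte in enumerate(payload):
            out[j] ^= byte
    return bytes(out)
```

A node would call `draw_by_age(input_queue, k)` and pass the result to `xor_combine` before relaying. A smaller `decay` biases the draw more strongly toward the oldest packets.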
  • PDF
    The production of sports highlight packages summarizing a game's most exciting moments is an essential task for broadcast media, yet it requires labor-intensive video editing. We propose a novel approach for auto-curating sports highlights and use it to build a real-world system that assists editors in producing golf highlight reels. Our method fuses information from the players' reactions (recognition of actions such as high-fives and fist pumps), the spectators (crowd cheering), and the commentator (tone of voice and word analysis) to determine the most interesting moments of a game. We accurately identify the start and end frames of key shot highlights, together with additional metadata such as the player's name and the hole number, allowing personalized content summarization and retrieval. In addition, we introduce new techniques for training our classifiers with less manual annotation of training data by exploiting the correlation between different modalities. Our work has been demonstrated at a major golf tournament, successfully extracting highlights from live video streams over four consecutive days.
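The multimodal fusion described above can be sketched as a weighted combination of per-segment excitement scores followed by thresholding. This is only an illustrative assumption: the abstract does not state the fusion rule, and the weights, the threshold, and the function names below are hypothetical.

```python
def fuse_scores(player, crowd, commentator, weights=(0.4, 0.3, 0.3)):
    """Combine per-segment scores in [0, 1] from three modalities.
    The linear rule and the weights are assumptions, not the paper's method."""
    w_player, w_crowd, w_comm = weights
    return [
        w_player * p + w_crowd * c + w_comm * m
        for p, c, m in zip(player, crowd, commentator)
    ]

def highlight_segments(scores, threshold=0.6):
    """Return inclusive (start, end) index pairs for runs of consecutive
    segments whose fused score reaches the (hypothetical) threshold."""
    segments, start = [], None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i                      # a highlight run begins
        elif score < threshold and start is not None:
            segments.append((start, i - 1))  # the run ended at the previous index
            start = None
    if start is not None:                  # the run extends to the final segment
        segments.append((start, len(scores) - 1))
    return segments
```

Each returned pair marks candidate start and end positions of a highlight, which an editor could then review alongside metadata such as the player's name and hole number.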