Multimedia (cs.MM)

  • PDF
    In this paper, we present a transfer learning approach for music classification and regression tasks. We propose to use a pretrained convnet feature: a concatenated feature vector built from the activations of feature maps at multiple layers of a trained convolutional network. We show how this convnet feature can serve as a general-purpose music representation. In the experiments, a convnet is trained for music tagging and then transferred to many music-related classification and regression tasks as well as an audio-related classification task. The convnet feature outperforms the baseline MFCC feature in all tasks, as well as many reported approaches that aggregate MFCCs and low- and high-level music features.
  • PDF
    Virtual reality (VR) video provides an immersive 360-degree viewing experience to a user wearing a head-mounted display: as the user rotates their head, correspondingly different fields-of-view (FoV) of the 360-degree video are rendered for observation. Transmitting the entire 360-degree video in high quality over bandwidth-constrained networks from server to client for real-time playback is challenging. In this paper, we propose a multi-stream switching framework for VR video streaming: the server pre-encodes a set of VR video streams covering different view ranges that account for server-client round trip time (RTT) delay, and during streaming the server transmits and switches streams according to the user's detected head rotation angle. For a given RTT, we formulate an optimization that simultaneously seeks multiple VR streams of different view ranges and the head-angle-to-stream mapping function, in order to minimize the expected distortion subject to bandwidth and storage constraints. We propose an alternating algorithm that, at each iteration, computes the optimal streams while keeping the mapping function fixed, and vice versa. Experiments show that, for the same bandwidth, our multi-stream switching scheme outperforms a non-switching single-stream approach by up to 2.9 dB in PSNR.
  • PDF
    This letter addresses a principal weakness of the article published by Li et al. in 2014. That work appears to contain a serious conceptual mistake in the presentation of its theoretical approach. It attempts to design a new attack on a basic watermarking algorithm published by Zhu et al. in 2013, together with an effective solution to that attack; however, we show that Li et al.'s approach does not achieve this aim. To disprove the approach, we give a single numerical example as a counterexample.
  • PDF
    The application of mobile computing is currently altering patterns of our behavior to a greater degree than perhaps any other invention. In combination with the introduction of BLE (Bluetooth Low Energy) and similar technologies that enable context-awareness, designers today find themselves empowered to build experiences and facilitate interactions with our physical surroundings in ways not possible before. The aim of this thesis is to present a research project, currently underway at the University of Cambridge, which deals with the implementation of a BLE system in a museum environment. By assessing the technology, describing the design decisions, and presenting a qualitative evaluation, this thesis seeks to provide insight into some of the challenges and possible solutions connected to the process of developing ubiquitous BLE computing systems for public spaces. The project outcome revealed the potential of BLE to engage entirely new audience groups and led me to argue in favor of a more seamful approach to the design of these systems.
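
The convnet feature described in the transfer learning abstract above is a concatenation of activations taken from several layers of a trained network. A minimal sketch of that idea follows. The abstract does not say how each feature map is reduced to a fixed-length vector, so the global average pooling used here is an assumption, and the layer shapes and the name `convnet_feature` are hypothetical; random arrays stand in for real activations.

```python
import numpy as np

def convnet_feature(feature_maps):
    """Concatenate pooled activations from several layers into one vector.

    Each entry of `feature_maps` is assumed to have shape
    (channels, height, width). Average pooling over the two spatial axes
    is an assumption, not something stated in the abstract.
    """
    pooled = [fm.mean(axis=(1, 2)) for fm in feature_maps]  # one (channels,) vector per layer
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
# Hypothetical activations: 5 layers, 32 channels each, shrinking spatial size.
spatial_sizes = [(48, 64), (24, 32), (12, 16), (6, 8), (3, 4)]
layers = [rng.standard_normal((32, h, w)) for h, w in spatial_sizes]

feature = convnet_feature(layers)
print(feature.shape)  # (160,) = 5 layers x 32 channels
```

The resulting vector has a fixed length regardless of the spatial size of each layer, so it could be passed to any standard classifier or regressor for a downstream task.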
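
The alternating scheme in the VR streaming abstract above (fix the mapping and update the streams, then fix the streams and update the mapping) can be illustrated with a deliberately simplified toy. This is not the authors' formulation: it ignores RTT, bandwidth, and storage constraints, reduces each stream to a single hypothetical view-centre angle, and uses squared angular distance as a stand-in for distortion. The names `alternate` and `expected_distortion` are made up for this sketch.

```python
import numpy as np

def expected_distortion(angles, centers):
    """Mean squared distance from each head angle to its nearest stream centre."""
    nearest = np.abs(angles[:, None] - centers[None, :]).min(axis=1)
    return float(np.mean(nearest ** 2))

def alternate(angles, centers, iters=20):
    """Toy alternating optimisation over stream centres and angle-to-stream mapping."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(iters):
        # Streams fixed: map each head angle to the nearest stream centre.
        mapping = np.abs(angles[:, None] - centers[None, :]).argmin(axis=1)
        # Mapping fixed: move each centre to the mean of the angles mapped to it.
        for k in range(centers.size):
            members = angles[mapping == k]
            if members.size:
                centers[k] = members.mean()
    # Final mapping for the final centres.
    mapping = np.abs(angles[:, None] - centers[None, :]).argmin(axis=1)
    return centers, mapping

rng = np.random.default_rng(0)
angles = rng.uniform(-90.0, 90.0, size=500)   # hypothetical head-rotation samples, degrees
initial = np.array([-10.0, 0.0, 10.0])        # hypothetical starting stream centres

centers, mapping = alternate(angles, initial)
d_init = expected_distortion(angles, initial)
d_final = expected_distortion(angles, centers)
print(d_init, d_final)
```

Because each half-step can only keep or lower the toy objective, the final distortion is never worse than the initial one; the constrained problem in the abstract would need a different update for each half-step.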