Sep 08 2017 cs.CV
We design a compact but effective CNN model for optical flow by exploiting the well-known design principles: pyramid, warping, and cost volume. Cast in a learnable feature pyramid, our network uses the current optical flow estimate to warp the CNN features of the second image. It then uses the warped features and features of the first image to construct the cost volume, which is processed by a CNN network to decode the optical flow. As the cost volume is a more discriminative representation of the search space for the optical flow than raw images, a compact CNN decoder network is sufficient. Our model performs on par with the recent FlowNet2 method on the MPI Sintel and KITTI 2015 benchmarks, while being 17 times smaller in size and 2 times faster in inference. Our model protocol and learned parameters will be publicly available.
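The warp-then-correlate step described above can be sketched in a few lines. This is only an illustrative toy, not the authors' implementation: it assumes integer flow with nearest-neighbour warping (the actual network presumably uses sub-pixel interpolation), and the names `warp`, `cost_volume`, and `radius` are hypothetical. Feature maps are nested lists indexed as `feat[y][x][channel]`.

```python
def warp(feat2, flow):
    """Backward-warp second-image features toward the first image.

    Assumption: flow[y][x] = (u, v) holds integer displacements, so
    nearest-neighbour lookup is enough. Out-of-range pixels stay zero.
    """
    h, w, c = len(feat2), len(feat2[0]), len(feat2[0][0])
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            u, v = flow[y][x]
            xs, ys = x + u, y + v  # where pixel (x, y) lands in image 2
            if 0 <= xs < w and 0 <= ys < h:
                out[y][x] = list(feat2[ys][xs])
    return out


def cost_volume(feat1, feat2_warped, radius):
    """Correlate each feat1 pixel with warped feat2 pixels in a small window.

    Returns (volume, offsets) where volume[y][x][k] is the channel-averaged
    dot product for displacement offsets[k] = (dy, dx).
    """
    h, w, c = len(feat1), len(feat1[0]), len(feat1[0][0])
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    vol = [[[0.0] * len(offsets) for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for k, (dy, dx) in enumerate(offsets):
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < h and 0 <= x2 < w:
                    dot = sum(a * b for a, b in
                              zip(feat1[y][x], feat2_warped[y2][x2]))
                    vol[y][x][k] = dot / c
    return vol, offsets
```

When the current flow estimate is already correct, the strongest response in the cost volume sits at the zero offset, which is why a small search radius and a compact decoder can suffice at each pyramid level.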
Jul 27 2017 cs.CV
Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene. Many existing approaches use superpixels for regularization, but may predict inconsistent shapes and motions inside rigidly moving objects. We instead assume that scenes consist of foreground objects rigidly moving in front of a static background, and use semantic cues to produce pixel-accurate scene flow estimates. Our cascaded classification framework accurately models 3D scenes by iteratively refining semantic segmentation masks, stereo correspondences, 3D rigid motion estimates, and optical flow fields. We evaluate our method on the challenging KITTI autonomous driving benchmark, and show that accounting for the motion of segmented vehicles leads to state-of-the-art performance.
The rise of robotic applications has led to the generation of a huge volume of unstructured data, whereas the current cloud infrastructure was designed to process limited amounts of structured data. To address this problem, we propose a learn-memorize-recall-reduce paradigm for robotic cloud computing. The learning stage converts incoming unstructured data into structured data; the memorization stage provides effective storage for the massive amount of data; the recall stage provides efficient means to retrieve the raw data; while the reduction stage provides means to make sense of this massive amount of unstructured data with limited computing resources.
Apr 13 2017 cs.LG
When you need to enable deep learning on low-cost embedded SoCs, is it better to port an existing deep learning framework or to build one from scratch? In this paper, we share our practical experience of building an embedded inference engine using the ARM Compute Library (ACL). The results show that, contrary to conventional wisdom, for simple models it takes much less development time to build an inference engine from scratch than to port an existing framework. In addition, by utilizing ACL, we managed to build an inference engine that outperforms TensorFlow by 25%. Our conclusion is that, on embedded devices, we will most likely use very simple deep learning models for inference, and with well-developed building blocks such as ACL, it may be better in both performance and development time to build the engine from scratch.
Mar 20 2017 cs.CV
Paleness or pallor is a manifestation of blood loss or low hemoglobin concentration in human blood and can be caused by pathologies such as anemia. This work presents the first automated screening system that takes pallor site images, segments them, and extracts color and intensity-based features for multi-class classification of patients with high pallor due to anemia-like pathologies, normal patients, and patients with other abnormalities. This work analyzes the pallor sites of conjunctiva and tongue for anemia screening purposes. First, for the eye pallor site images, the sclera and conjunctiva regions are automatically segmented as regions of interest. Similarly, for the tongue pallor site images, the inner and outer tongue regions are segmented. Then, color-plane based feature extraction is performed, followed by machine learning algorithms for feature reduction and image-level classification for anemia. In this work, a suite of classification algorithms is applied to produce image-level classifications for normal (class 0), pallor (class 1), and other abnormalities (class 2). The proposed method achieves 86% accuracy, 85% precision and 67% recall in eye pallor site images and 98.2% accuracy and precision with 100% recall in tongue pallor site images for classification of images with pallor. The proposed pallor screening system can be further fine-tuned to detect the severity of anemia-like pathologies using a controlled set of local images that can then be used for future benchmarking purposes.
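For readers less familiar with the reported metrics, the following minimal sketch shows how accuracy, precision, and recall for the pallor class could be computed one-vs-rest from image-level labels. The function name `pallor_metrics` and the toy labels are hypothetical and are not taken from the paper's data.

```python
def pallor_metrics(y_true, y_pred, positive=1):
    """One-vs-rest accuracy, precision, and recall for the `positive` class.

    Labels follow the abstract's convention: 0 = normal, 1 = pallor,
    2 = other abnormalities. Returns a dict of floats.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when the class is never predicted
        # or never present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy example with six images (made-up labels, for illustration only).
metrics = pallor_metrics([1, 1, 1, 0, 2, 0], [1, 1, 0, 0, 1, 0])
```

High precision with lower recall, as reported for the eye images, means that few non-pallor images are flagged as pallor while a larger share of true pallor images is missed.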
Mar 15 2016 cs.CV
Existing optical flow methods make generic, spatially homogeneous assumptions about the spatial structure of the flow. In reality, optical flow varies across an image depending on object class. Simply put, different objects move differently. Here we exploit recent advances in static semantic scene segmentation to segment the image into objects of different types. We define different models of image motion in these regions depending on the type of object. For example, we model the motion on roads with homographies, vegetation with spatially smooth flow, and independently moving objects like cars and planes with affine motion plus deviations. We then pose the flow estimation problem using a novel formulation of localized layers, which addresses limitations of traditional layered models for dealing with complex scene motion. Our semantic flow method achieves the lowest error of any published monocular method in the KITTI-2015 flow benchmark and produces qualitatively better flow and segmentation than recent top methods on a wide range of natural videos.
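To make the class-specific motion models concrete, here is a small sketch of the flow vector that a homography and an affine model each induce at a pixel. The parameterizations are the textbook ones and are assumed here rather than taken from the paper; `homography_flow` and `affine_flow` are hypothetical names, and the "plus deviations" term for independently moving objects is omitted.

```python
def homography_flow(x, y, H):
    """Flow (u, v) at pixel (x, y) induced by a 3x3 homography H.

    H is a row-major nested list; the pixel is mapped in homogeneous
    coordinates and the original position is subtracted.
    """
    xw = H[0][0] * x + H[0][1] * y + H[0][2]
    yw = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xw / w - x, yw / w - y


def affine_flow(x, y, a):
    """Flow (u, v) at pixel (x, y) under a 6-parameter affine model.

    Assumed convention: u = a[0] + a[1]*x + a[2]*y,
                        v = a[3] + a[4]*x + a[5]*y.
    """
    u = a[0] + a[1] * x + a[2] * y
    v = a[3] + a[4] * x + a[5] * y
    return u, v


# A pure-translation homography moves every pixel by the same (3, -2).
H_shift = [[1.0, 0.0, 3.0], [0.0, 1.0, -2.0], [0.0, 0.0, 1.0]]
u, v = homography_flow(5.0, 7.0, H_shift)
```

In a semantic pipeline of this kind, the segmentation label at each pixel would select which of these models constrains the flow in that region.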
In this paper we consider the robust secure beamformer design for MISO wiretap channels. Assuming that the eavesdroppers' channels are only partially available at the transmitter, we seek to maximize the secrecy rate under transmit power and secrecy rate outage probability constraints. The outage probability constraint requires that the secrecy rate exceeds a certain threshold with high probability. Therefore, including such a constraint in the design naturally ensures the desired robustness. Unfortunately, the presence of the probabilistic constraints makes the problem non-convex and hence difficult to solve. In this paper, we investigate the outage probability constrained secrecy rate maximization problem using a novel two-step approach. Under a wide range of uncertainty models, our developed algorithms can obtain high-quality solutions, sometimes even exact global solutions, for the robust secure beamformer design problem. Simulation results are presented to verify the effectiveness and robustness of the proposed algorithms.
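One common way to write an outage-constrained secrecy rate problem of this kind is sketched below. The notation is assumed for illustration and may differ from the paper's: $\mathbf{w}$ is the transmit beamformer, $\mathbf{h}$ the legitimate receiver's channel, $\mathbf{g}_k$ the (uncertain) channel of eavesdropper $k$, $\sigma^2$ and $\sigma_k^2$ the noise powers, $P$ the power budget, and $\epsilon$ the tolerated outage probability.

```latex
\begin{align}
R_s(\mathbf{w}) &= \min_{k}\left[
  \log_2\!\left(1 + \frac{|\mathbf{h}^H \mathbf{w}|^2}{\sigma^2}\right)
  - \log_2\!\left(1 + \frac{|\mathbf{g}_k^H \mathbf{w}|^2}{\sigma_k^2}\right)
\right] \\
\max_{\mathbf{w},\, R} \quad & R \\
\text{s.t.} \quad & \Pr\{\, R_s(\mathbf{w}) \ge R \,\} \ge 1 - \epsilon, \\
& \|\mathbf{w}\|^2 \le P
\end{align}
```

The probability is taken over the uncertainty in the eavesdroppers' channels, and it is this chance constraint that makes the problem non-convex.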
Epidemic outbreaks in human populations are facilitated by the underlying transportation network. We consider strategies for containing a viral spreading process by optimally allocating a limited budget to three types of protection resources: (i) traffic control resources, (ii) preventative resources, and (iii) corrective resources. Traffic control resources are employed to impose restrictions on the traffic flowing across directed edges in the transportation network. Preventative resources are allocated to nodes to reduce the probability of infection at that node (e.g. vaccines), and corrective resources are allocated to nodes to increase the recovery rate at that node (e.g. antidotes). We assume these resources have monetary costs associated with them, from which we formalize an optimal budget allocation problem which maximizes containment of the infection. We present a polynomial-time solution to the optimal budget allocation problem using Geometric Programming (GP) for an arbitrary weighted and directed contact network and a large class of resource cost functions. We illustrate our approach by designing optimal traffic control strategies to contain an epidemic outbreak that propagates through a real-world air transportation network.
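The effect of the three resource types can be illustrated with a toy simulation. The sketch below uses a standard discrete-time mean-field SIS approximation on a weighted directed network, which is an assumption made here and may differ from the spreading model used in the paper; it does not implement the Geometric Programming allocation itself. All names (`sis_step`, `simulate`, `beta`, `delta`, `w`) are hypothetical: `beta[i]` is the infection rate at node `i` (lowered by preventative resources), `delta[i]` its recovery rate (raised by corrective resources), and `w[j][i]` the traffic weight on edge `j -> i` (lowered by traffic control).

```python
def sis_step(p, w, beta, delta):
    """Advance infection probabilities p by one time step.

    p[i] is the probability that node i is infected. A susceptible node
    escapes infection only if it escapes every incoming neighbour.
    """
    n = len(p)
    nxt = []
    for i in range(n):
        escape = 1.0
        for j in range(n):
            escape *= 1.0 - beta[i] * w[j][i] * p[j]
        stay_infected = (1.0 - delta[i]) * p[i]
        newly_infected = (1.0 - p[i]) * (1.0 - escape)
        nxt.append(stay_infected + newly_infected)
    return nxt


def simulate(p0, w, beta, delta, steps):
    """Iterate sis_step `steps` times starting from p0."""
    p = list(p0)
    for _ in range(steps):
        p = sis_step(p, w, beta, delta)
    return p


# Two cities connected in both directions with unit traffic.
w_open = [[0.0, 1.0], [1.0, 0.0]]
endemic = simulate([0.5, 0.5], w_open, [0.9, 0.9], [0.1, 0.1], 50)
contained = simulate([0.5, 0.5], w_open, [0.1, 0.1], [0.9, 0.9], 50)
```

In this toy setting, lowering `beta`, raising `delta`, or shrinking the entries of `w` each pushes the infection toward extinction; the budget allocation problem in the abstract asks how to trade these levers off under a cost constraint.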
For low-rank matrix completion problems, the efficiency of the widely used nuclear norm technique may be challenged under many circumstances, especially when certain basis coefficients are fixed, as in low-rank correlation matrix completion in fields such as finance and low-rank density matrix completion in quantum state tomography. To seek a solution of high recovery quality beyond the reach of the nuclear norm, in this paper we propose a rank-corrected procedure using a nuclear semi-norm to generate a new estimator. For this new estimator, we establish a non-asymptotic recovery error bound. More importantly, we quantify the reduction of the recovery error bound for this rank-corrected procedure. Compared with the one obtained for the nuclear norm penalized least squares estimator, this reduction can be substantial (around 50%). We also provide necessary and sufficient conditions for rank consistency in the sense of Bach (2008). Very interestingly, these conditions are highly related to the concept of constraint nondegeneracy in matrix optimization. As a byproduct, our results provide a theoretical foundation for the majorized penalty method of Gao and Sun (2010) and Gao (2010) for structured low-rank matrix optimization problems. Extensive numerical experiments demonstrate that our proposed rank-corrected procedure can simultaneously achieve a high recovery accuracy and capture the low-rank structure.
Nowadays, many open source communities adopt forums to gather requirements from scattered stakeholders. However, the requirements collection process always suffers from unformatted descriptions and unfocused discussions. In this paper, we establish a framework, ReqForum, to define the metamodel of the requirement elicitation forum. Based on it, we propose a lightweight forum-based requirements elicitation process which includes six steps: template-based requirements creation, opinions collection, requirements collection, requirements management, capability identification and the incentive mechanism. According to the proposed process, the prototype SKLSEForum is built by composing Discuz and its existing plug-ins. The implementation indicates that the process is feasible and its cost is low.