May 16 2018 cs.CR
In this paper, we present an end-to-end view of IoT security and privacy, together with a case study. Our contribution is three-fold. First, we present our end-to-end view of an IoT system; this view can guide risk assessment and design of an IoT system. We identify 10 basic IoT functionalities that are related to security and privacy. Based on this view, we systematically present security and privacy requirements in terms of the IoT system, software, networking, and big data analytics in the cloud. Second, using the end-to-end view of IoT security and privacy, we present a vulnerability analysis of the Edimax IP camera system. We are the first to exploit this system and have identified various attacks that can fully control all the cameras from the manufacturer. Our real-world experiments demonstrate the effectiveness of the discovered attacks and raise the alarm again for IoT manufacturers. Third, the vulnerabilities found in the exploit of Edimax cameras and in our previous exploit of Edimax smartplugs can lead to another wave of Mirai attacks, in the form of either botnets or worms. To systematically understand the damage of the Mirai malware, we model its propagation and use simulations to validate the model. The work in this paper once more urges IoT device manufacturers to better secure their products in order to prevent malware attacks like Mirai.
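As a hedged illustration of the kind of propagation modeling mentioned above (not the paper's exact model), a random-scanning worm such as Mirai can be described with simple susceptible-infected dynamics in which the infection rate depends on the scan rate and the density of still-vulnerable devices in the scanned address space; all parameter values below are illustrative assumptions.

import numpy as np

# Susceptible-infected sketch of random-scanning worm propagation.
N_ADDR = 2**32        # scanned address space (assumes IPv4 random scanning)
V = 500_000           # vulnerable IoT devices (illustrative)
SCAN_RATE = 10_000    # probes per infected device per hour (illustrative)
DT = 0.01             # time step in hours

def simulate(hours=48.0, infected0=100):
    steps = int(hours / DT)
    infected = np.empty(steps)
    i = float(infected0)
    for t in range(steps):
        s = V - i                                    # still-vulnerable devices
        new_inf = SCAN_RATE * i * (s / N_ADDR) * DT  # probes hitting susceptibles
        i = min(V, i + new_inf)
        infected[t] = i
    return infected

curve = simulate()
print(f"infected after 24h: {curve[int(24 / DT) - 1]:.0f}, after 48h: {curve[-1]:.0f}")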
Epidemics are emergent phenomena that depend on the epidemiological characteristics of pathogens and on the interaction and movement of people. Public transit systems provide much important information about the movement of people, but other means of transportation (e.g., bicycles and private cars) are invisible to public transit data. This discrepancy can induce a bias in disease models that leads to mispredictions of epidemic growth (e.g., peak prevalence and peak time). In this study, we aim to analyze and compare the epidemic spreading dynamics estimated from public transit trips against more accurate estimates of population movement obtained from mobile phone traces. We simulate epidemic outbreaks in a cohort of two million mobile phone users. We use a metapopulation model incorporating susceptible-infected-recovered dynamics to analyze and compare the effect on the epidemic process of different effective contact matrices, constructed from the public transit system and from mobile phone traces respectively. We find that simulations based on public transit trips tend to underestimate the epidemic spreading dynamics, reaching weaker epidemic peaks later. This is rooted in the later introduction of infectious people into uninfected locations.
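For concreteness, the following is a minimal sketch, under simplifying assumptions, of a discrete-time metapopulation SIR model driven by an effective contact (mobility) matrix; the construction of such matrices from transit or mobile phone data is not reproduced here, and the function name, matrix, and parameters are illustrative.

import numpy as np

def metapop_sir(M, N, beta=0.3, gamma=0.1, seed_loc=0, days=200):
    """Discrete-time metapopulation SIR.
    M : (L, L) effective contact/mobility matrix; M[i, j] is the fraction of
        location j's population mixing into location i (columns sum to 1).
    N : (L,) population sizes per location.
    """
    L = len(N)
    S, I, R = N.astype(float).copy(), np.zeros(L), np.zeros(L)
    S[seed_loc] -= 10.0
    I[seed_loc] += 10.0                      # seed the outbreak
    prevalence = []
    for _ in range(days):
        lam = beta * (M @ I) / (M @ N)       # force of infection after mixing
        new_inf = lam * S
        new_rec = gamma * I
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        prevalence.append(I.sum())
    return np.array(prevalence)

# toy example with 3 locations whose residents mostly stay local
N = np.array([1e6, 5e5, 2e5])
M = np.array([[0.90, 0.05, 0.10],
              [0.05, 0.90, 0.10],
              [0.05, 0.05, 0.80]])
curve = metapop_sir(M, N)
print("peak prevalence:", int(curve.max()), "peak day:", int(curve.argmax()))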
Apr 26 2018 cs.CV
Most existing methods of semantic segmentation still suffer from two kinds of challenges: intra-class inconsistency and inter-class indistinction. To tackle these two problems, we propose a Discriminative Feature Network (DFN), which contains two sub-networks: a Smooth Network and a Border Network. Specifically, to handle the intra-class inconsistency problem, we specially design the Smooth Network with a Channel Attention Block and global average pooling to select the more discriminative features. Furthermore, we propose the Border Network to make the bilateral features of boundaries distinguishable with deep semantic boundary supervision. Based on our proposed DFN, we achieve state-of-the-art performance of 86.2% mean IoU on PASCAL VOC 2012 and 80.3% mean IoU on the Cityscapes dataset.
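The abstract does not give the exact layer configuration of the Channel Attention Block, so the following is only a minimal sketch of the common channel-attention pattern it names (global average pooling followed by a 1x1 convolution and a sigmoid gate that reweights channels); the layer sizes are assumptions.

import torch
import torch.nn as nn

class ChannelAttentionBlock(nn.Module):
    """Sketch: global average pooling yields a channel descriptor, a 1x1
    convolution plus sigmoid yields per-channel weights, and the input
    features are reweighted channel-wise to emphasize discriminative ones."""
    def __init__(self, channels):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)           # (B, C, H, W) -> (B, C, 1, 1)
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Sigmoid()

    def forward(self, x):
        w = self.gate(self.fc(self.gap(x)))          # per-channel weights in (0, 1)
        return x * w                                  # select/suppress feature channels

feat = torch.randn(2, 256, 64, 64)
print(ChannelAttentionBlock(256)(feat).shape)        # torch.Size([2, 256, 64, 64])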
Considering that the use of the Fully Connected (FC) layer limits the performance of Convolutional Neural Networks (CNNs), this paper develops a method to improve the coupling between the convolution layers and the FC layer by reducing the noise in the Feature Maps (FMs). Our approach is divided into three steps. First, we evenly partition all the FMs into n blocks. Then, the weighted summation of the FMs at the same position in all blocks constitutes a new block of FMs. Finally, we replicate this new block into n copies and concatenate them as the input to the FC layer. This sharing of FMs appreciably reduces the noise in them and averts the impact of any particular FM on specific weights of the hidden layer, hence preventing the network from overfitting to some extent. Using Fermat's lemma, we prove that this method widens the range over which the loss function attains its global minima, which makes it easier for neural networks to converge and accelerates the convergence process. This method does not significantly increase the number of network parameters (only a few coefficients are added), and the experiments demonstrate that it increases the convergence speed and improves the classification performance of neural networks.
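A minimal numpy sketch of the three steps above, assuming the feature maps form an array of shape (batch, C, H, W) with C divisible by n; in the actual network the block weights would be learned, here they are merely illustrative.

import numpy as np

def share_feature_maps(fm, n, weights=None):
    """Step 1: split the C feature maps evenly into n blocks.
    Step 2: form one new block as the weighted sum of the maps at the same
            position across all blocks.
    Step 3: replicate the new block n times and concatenate, giving the
            input to the FC layer."""
    b, c, h, w = fm.shape
    assert c % n == 0, "number of feature maps must be divisible by n"
    blocks = fm.reshape(b, n, c // n, h, w)                       # step 1
    if weights is None:
        weights = np.full(n, 1.0 / n)                             # illustrative weights
    new_block = np.einsum('k,bkchw->bchw', weights, blocks)       # step 2
    shared = np.tile(new_block[:, None], (1, n, 1, 1, 1))         # step 3
    return shared.reshape(b, c, h, w)

fm = np.random.randn(4, 64, 7, 7)
print(share_feature_maps(fm, n=4).shape)                          # (4, 64, 7, 7)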
Apr 18 2018 cs.CV
Data of different modalities generally convey complementary but heterogeneous information, and a more discriminative representation is often preferred, obtained by combining multiple data modalities such as RGB and infrared features. However, in reality, obtaining both data channels is challenging due to many limitations. For example, RGB surveillance cameras are often restricted from private spaces, which conflicts with the need for abnormal activity detection for personal security. As a result, using partial data channels to build a full representation of multiple modalities is clearly desirable. In this paper, we propose novel Partial-modal Generative Adversarial Networks (PM-GANs) that learn a full-modal representation using data from only partial modalities. The full representation is achieved by a generated representation in place of the missing data channel. Extensive experiments are conducted to verify the performance of our proposed method on action recognition, compared with four state-of-the-art methods. Meanwhile, a new Infrared-Visible Dataset for action recognition is introduced, which will be the first publicly available action dataset that contains paired infrared and visible spectrum data.
Mar 12 2018 cs.CV
We present an effective blind image deblurring method based on a data-driven discriminative prior. Our work is motivated by the fact that a good image prior should favor clear images over blurred images. In this work, we formulate the image prior as a binary classifier which can be realized by a deep convolutional neural network (CNN). The learned prior is able to distinguish whether an input image is clear or not. Embedded into the maximum a posteriori (MAP) framework, it helps blind deblurring in various scenarios, including natural, face, text, and low-illumination images. However, it is difficult to optimize the deblurring method with the learned image prior as it involves a non-linear CNN. Therefore, we develop an efficient numerical approach based on the half-quadratic splitting method and the gradient descent algorithm to solve the proposed model. Furthermore, the proposed model can be easily extended to non-uniform deblurring. Both qualitative and quantitative experimental results show that our method performs favorably against state-of-the-art algorithms as well as domain-specific image deblurring approaches.
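A hedged reconstruction, in generic notation, of the MAP energy and its half-quadratic splitting (the exact regularizers and weights are assumptions rather than the paper's precise model): with blurred input $B$, latent image $I$, blur kernel $k$, and the learned classifier $f(\cdot)$ that outputs values near 1 for blurred and near 0 for clear images,
\begin{align*}
\min_{I,\,k}\;\; \|I \otimes k - B\|_2^2 + \gamma\|k\|_2^2 + \lambda f(I) + \mu\|\nabla I\|_0 .
\end{align*}
Half-quadratic splitting introduces auxiliary variables $u \approx I$ and $g \approx \nabla I$,
\begin{align*}
\min_{I,\,u,\,g}\;\; \|I \otimes k - B\|_2^2 + \lambda f(u) + \mu\|g\|_0 + \alpha\|I - u\|_2^2 + \beta\|\nabla I - g\|_2^2 ,
\end{align*}
so that the $I$- and $g$-subproblems have standard solutions, while the non-linear $u$-subproblem involving the CNN is handled by gradient descent.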
Dec 20 2017 cs.CV
In real-world crowd counting applications, crowd densities vary greatly in the spatial and temporal domains. A detection-based counting method estimates crowds accurately in low-density scenes, while its reliability in congested areas is degraded. A regression-based approach, on the other hand, captures the general density information in crowded regions. Without knowing the location of each person, it tends to overestimate the count in low-density areas. Thus, exclusively using either one of them is not sufficient to handle all kinds of scenes with varying densities. To address this issue, a novel end-to-end crowd counting framework, named DecideNet (DEteCtIon and Density Estimation Network), is proposed. It can adaptively decide the appropriate counting mode for different locations in the image based on the local density conditions. DecideNet starts by estimating the crowd density, generating detection- and regression-based density maps separately. To capture the inevitable variation in densities, it incorporates an attention module, meant to adaptively assess the reliability of the two types of estimates. The final crowd counts are obtained with the guidance of the attention module, adopting suitable estimates from the two kinds of density maps. Experimental results show that our method achieves state-of-the-art performance on three challenging crowd counting datasets.
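A minimal sketch of the final fusion step as described: an attention map with values in [0, 1] adaptively weights the detection-based and regression-based density maps at each location; how the network produces the attention map is not shown, and the arrays below are placeholders.

import numpy as np

def fuse_density_maps(det_map, reg_map, attention):
    """Pixel-wise fusion: where attention is high, trust the detection-based
    estimate (reliable in low-density regions); otherwise trust the
    regression-based estimate (reliable in congested regions)."""
    return attention * det_map + (1.0 - attention) * reg_map

h, w = 96, 128
det_map = np.random.rand(h, w) * 0.01    # placeholder detection-based density map
reg_map = np.random.rand(h, w) * 0.02    # placeholder regression-based density map
attention = np.random.rand(h, w)         # placeholder attention map in [0, 1]
fused = fuse_density_maps(det_map, reg_map, attention)
print("estimated count:", fused.sum())   # crowd count = integral of the density map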
This work presents a novel approach for robust PCA with total variation regularization for foreground-background separation and denoising on noisy, moving camera video. Our proposed algorithm registers the raw (possibly corrupted) frames of a video and then jointly processes the registered frames to produce a decomposition of the scene into a low-rank background component that captures the static components of the scene, a smooth foreground component that captures the dynamic components of the scene, and a sparse component that isolates corruptions. Unlike existing methods, our proposed algorithm produces a panoramic low-rank component that spans the entire field of view, automatically stitching together corrupted data from partially overlapping scenes. The low-rank portion of our robust PCA model is based on a recently discovered optimal low-rank matrix estimator (OptShrink) that requires no parameter tuning. We demonstrate the performance of our algorithm on both static and moving camera videos corrupted by noise and outliers.
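A hedged sketch, in generic notation, of the kind of decomposition objective described above (the registration operator and the exact penalties are simplified assumptions): after registering all frames into a common panoramic coordinate system, one seeks
\begin{align*}
\min_{L,\,S,\,E}\;\; \tfrac{1}{2}\,\big\|\mathcal{M}\odot(Y - L - S - E)\big\|_F^2 \;+\; \lambda_L\,\|L\|_{*} \;+\; \lambda_S\,\mathrm{TV}(S) \;+\; \lambda_E\,\|E\|_1 ,
\end{align*}
where $Y$ stacks the registered frames, $\mathcal{M}$ masks the pixels actually observed in each frame, $L$ is the panoramic low-rank background, $S$ is the smooth (total-variation-regularized) foreground, and $E$ is the sparse corruption term. The nuclear norm $\|L\|_{*}$ appears here only as a stand-in; the paper instead handles the low-rank component with the tuning-free OptShrink estimator.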
Nov 20 2017 cs.RO
Location fingerprinting locates devices by pattern-matching signal observations to a pre-defined signal map. This paper introduces a technique to enable fast signal map creation given a dedicated surveyor with a smartphone and a floorplan. Our technique (PFSurvey) uses accelerometer, gyroscope, and magnetometer data to estimate the surveyor's trajectory post-hoc, using Simultaneous Localisation and Mapping and particle filtering to incorporate a building floorplan. We demonstrate that conventional methods can fail to recover the survey path robustly and determine the room unambiguously. To counter this we use a novel loop closure detection method based on magnetic field signals and propose to incorporate the magnetic loop closures and straight-line constraints into the filtering process to ensure robust trajectory recovery. We show this allows room ambiguities to be resolved. An entire building can be surveyed by the proposed system in minutes rather than days. We evaluate the system in a large office space and compare it to state-of-the-art approaches. We achieve trajectories within 1.1 m of the ground truth 90% of the time. The output signal maps well approximate those built from a conventional, laborious manual survey. We also demonstrate that the signal maps built by PFSurvey provide similar or even better positioning performance than the manual signal maps.
Oct 24 2017 cs.SI
In parallel to the rise of various mobile technologies, the mobile social network (MSN) service has brought us into an era of mobile social big data, where people are creating new social data every second and everywhere. It is of vital importance for businesses, governments, and institutes to understand how people's behaviors in the online cyberspace affect the underlying computer network, or their offline behaviors at large. To study this problem, we collect a dataset from WeChat Moments, called WeChatNet, which involves 25,133,330 WeChat users with 246,369,415 records of link reposting on their pages. We revisit three network applications based on data analytics over WeChatNet, i.e., information dissemination in mobile cellular networks, network traffic prediction in backbone networks, and mobile population distribution projection. Meanwhile, we discuss the potential research opportunities for developing new applications using the released dataset.
The lack of transparency often makes black-box models difficult to apply in many practical domains. For this reason, the current work proposes, starting from the input side of the black-box model, to incorporate data-based prior information into the black-box soft-margin SVM model to enhance its transparency. The concept and incorporation mechanism of data-based prior information are successively developed, based on which the transparent or partly transparent SVM optimization model is designed and then solved by conveniently rewriting the optimization problem as a nonlinear quadratic programming problem. An algorithm for mining data-based linear prior information from the data set is also proposed, which generates a linear expression with respect to two appropriate inputs identified from all inputs of the system. Finally, the proposed transparency strategy is applied to eight benchmark examples and two real blast furnace examples to demonstrate its effectiveness.
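The abstract does not spell out the optimization model, so the following is only a generic sketch of how a mined linear prior over two selected inputs $x_p$ and $x_q$ might be attached to a soft-margin SVM; the penalty term $\Omega$, its weight $C_2$, and the form of the prior expression are illustrative assumptions.
\begin{align*}
\min_{w,\,b,\,\xi}\;\; & \tfrac{1}{2}\|w\|^2 + C_1\sum_{i=1}^{m}\xi_i + C_2\,\Omega\big(w, b;\; a_1 x_p + a_2 x_q + a_0\big)\\
\text{s.t.}\;\; & y_i\big(w^\top\phi(x_i) + b\big) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\; i = 1,\dots,m,
\end{align*}
where $\Omega$ penalizes disagreement between the SVM decision function and the mined linear expression, so that this part of the model's behaviour becomes transparent.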
Oct 10 2017 cs.CV
Automatically predicting age group and gender from face images acquired in unconstrained conditions is an important and challenging task in many real-world applications. Nevertheless, conventional methods with manually designed features are unsatisfactory on in-the-wild benchmarks because of their inability to handle the large variations in unconstrained images. This difficulty is alleviated to some degree by Convolutional Neural Networks (CNNs), thanks to their powerful feature representation. In this paper, we propose a new CNN-based method for age group and gender estimation leveraging Residual Networks of Residual Networks (RoR), which exhibits better optimization ability for age group and gender classification than other CNN architectures. Moreover, two modest mechanisms based on observations of the characteristics of age groups are presented to further improve the performance of age estimation. In order to further improve performance and alleviate over-fitting, the RoR model is first pre-trained on ImageNet, then fine-tuned on the IMDB-WIKI-101 data set to further learn the features of face images, and finally fine-tuned on the Adience data set. Our experiments illustrate the effectiveness of the RoR method for age and gender estimation in the wild, where it achieves better performance than other CNN methods. Finally, RoR-152+IMDB-WIKI-101 with the two mechanisms achieves new state-of-the-art results on the Adience benchmark.
We study the problem of testing for community structure in networks using relations between the observed frequencies of small subgraphs. We propose a simple test for the existence of communities based only on the frequencies of three-node subgraphs. The test statistic is shown to be asymptotically normal under a null assumption of no community structure, and to have power approaching one under a composite alternative hypothesis of a degree-corrected stochastic block model. We also derive a version of the test that applies to multivariate Gaussian data. Our approach achieves near-optimal detection rates for the presence of community structure, in regimes where the signal-to-noise ratio is too weak to explicitly estimate the communities themselves, using existing computationally efficient algorithms. We demonstrate how the method can be effective for detecting structure in social networks, citation networks for scientific articles, and correlations of stock returns between companies in the S&P 500.
Oct 03 2017 cs.CV
The Residual Networks of Residual Networks (RoR) architecture exhibits excellent performance on the image classification task, but sharply increasing the number of feature map channels makes the transmission of characteristic information incoherent, which loses some information relevant to classification prediction and limits classification performance. In this paper, a Pyramidal RoR network model is proposed by analysing the performance characteristics of RoR and combining it with PyramidNet. Firstly, based on RoR, the Pyramidal RoR network model with gradually increasing channels is designed. Secondly, we analysed the effect of different residual block structures on performance, and chose the residual block structure that most favoured classification performance. Finally, we add an important principle to further optimize Pyramidal RoR networks: drop-path is used to avoid over-fitting and save training time. In this paper, image classification experiments were performed on the CIFAR-10/100 and SVHN datasets, and we achieved the lowest classification error rates to date: 2.96%, 16.40%, and 1.59%, respectively. Experiments show that the Pyramidal RoR network optimization method can improve network performance for different data sets and effectively suppress the vanishing gradient problem in DCNN training.
This work presents a novel approach for robust PCA with total variation regularization for foreground-background separation and denoising on noisy, moving camera video. Our proposed algorithm registers the raw (possibly corrupted) frames of a video and then jointly processes the registered frames to produce a decomposition of the scene into a low-rank background component that captures the static components of the scene, a smooth foreground component that captures the dynamic components of the scene, and a sparse component that can isolate corruptions and other non-idealities. Unlike existing methods, our proposed algorithm produces a panoramic low-rank component that spans the entire field of view, automatically stitching together corrupted data from partially overlapping scenes. The low-rank portion of our robust PCA model is based on a recently discovered optimal low-rank matrix estimator (OptShrink) that requires no parameter tuning. We demonstrate the performance of our algorithm on both static and moving camera videos corrupted by noise and outliers.
Jun 15 2017 cs.DB
Similarity join, which can find similar objects (e.g., products, names, addresses) across different sources, is powerful in dealing with variety in big data, especially web data. Threshold-driven similarity join, which has been extensively studied in the past, assumes that a user is able to specify a similarity threshold, and then focuses on how to efficiently return the object pairs whose similarities pass the threshold. We argue that the assumption of a well-set similarity threshold may not be valid, for two reasons. First, the optimal thresholds for different similarity join tasks may vary considerably. Moreover, the end-to-end time spent on similarity join is likely to be dominated by a back-and-forth threshold-tuning process. In response, we propose preference-driven similarity join. The key idea is to provide several result-set preferences, rather than a range of thresholds, for a user to choose from. Intuitively, a result-set preference can be considered as an objective function to capture a user's preference on a similarity join result. Once a preference is chosen, we automatically compute the similarity join result optimizing the preference objective. As a proof of concept, we devise two useful preferences and propose a novel preference-driven similarity join framework coupled with effective optimization techniques. Our approaches are evaluated on four real-world web datasets from a diverse range of application scenarios. The experiments show that preference-driven similarity join can achieve high-quality results without a tedious threshold-tuning process.
Face modeling has received much attention in the field of visual computing. There exist many scenarios, including cartoon characters, avatars for social media, 3D face caricatures, as well as face-related art and design, where low-cost interactive face modeling is a popular approach, especially among amateur users. In this paper, we propose a deep learning based sketching system for 3D face and caricature modeling. This system has a labor-efficient sketching interface that allows the user to draw freehand, imprecise yet expressive 2D lines representing the contours of facial features. A novel CNN-based deep regression network is designed for inferring 3D face models from 2D sketches. Our network fuses both CNN and shape-based features of the input sketch, and has two independent branches of fully connected layers generating independent subsets of coefficients for a bilinear face representation. Our system also supports gesture-based interactions for users to further manipulate initial face models. Both user studies and numerical results indicate that our sketching system can help users create face models quickly and effectively. A significantly expanded face database with diverse identities, expressions, and levels of exaggeration is constructed to promote further research and evaluation of face modeling techniques.
Apr 28 2017 cs.SE
Version information plays an important role in spreadsheet understanding, maintenance, and quality improvement. However, end users rarely use version control tools to document spreadsheet version information. Thus, spreadsheet version information is missing, and different versions of a spreadsheet coexist as individual, similar spreadsheets. Existing approaches try to recover spreadsheet version information by clustering these similar spreadsheets based on spreadsheet filenames or related email conversations. However, the applicability and accuracy of existing clustering approaches are limited because the necessary information (e.g., filenames and email conversations) is usually missing. We inspected the versioned spreadsheets in VEnron, which is extracted from the Enron Corporation. In VEnron, the different versions of a spreadsheet are clustered into an evolution group. We observed that the versioned spreadsheets in each evolution group exhibit certain common features (e.g., similar table headers and worksheet names). Based on this observation, we propose an automatic clustering algorithm, SpreadCluster. SpreadCluster learns the criteria of features from the versioned spreadsheets in VEnron, and then automatically clusters spreadsheets with similar features into the same evolution group. We applied SpreadCluster to all spreadsheets in the Enron corpus. The evaluation result shows that SpreadCluster can cluster spreadsheets with higher precision and recall than the filename-based approach used by VEnron. Based on the clustering result of SpreadCluster, we further created a new versioned spreadsheet corpus, VEnron2, which is much bigger than VEnron. We also applied SpreadCluster to two other spreadsheet corpora, FUSE and EUSES. The results show that SpreadCluster can cluster the versioned spreadsheets in these two corpora with high precision.
We study the problem of testing for structure in networks using relations between the observed frequencies of small subgraphs. We consider the statistics
\begin{align*}
T_3 &= (\text{edge frequency})^3 - \text{triangle frequency},\\
T_2 &= 3\,(\text{edge frequency})^2\,(1 - \text{edge frequency}) - \text{V-shape frequency},
\end{align*}
and prove a central limit theorem for $(T_2, T_3)$ under an Erdős-Rényi null model. We then analyze the power of the associated $\chi^2$ test statistic under a general class of alternative models. In particular, when the alternative is a $k$-community stochastic block model, with $k$ unknown, the power of the test approaches one. Moreover, the signal-to-noise ratio required is strictly weaker than that required for community detection. We also study the relation with other statistics over three-node subgraphs, and analyze the error under two natural algorithms for sampling small subgraphs. Together, our results show how global structural characteristics of networks can be inferred from local subgraph frequencies, without requiring the global community structure to be explicitly estimated.
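A minimal sketch of computing the two statistics from an observed graph, using the empirical edge, V-shape, and triangle frequencies with one plausible normalization (counts per unordered node triple, under which both statistics concentrate near zero for an Erdős-Rényi graph); the paper's exact normalization and the variance scaling for the $\chi^2$ test are omitted.

from math import comb
import networkx as nx

def subgraph_statistics(G):
    """Empirical edge, triangle, and V-shape frequencies, and the statistics
    T3 and T2 built from them."""
    n = G.number_of_nodes()
    edge_freq = G.number_of_edges() / comb(n, 2)
    n_triples = comb(n, 3)
    n_triangles = sum(nx.triangles(G).values()) // 3
    # V-shapes: node triples with exactly two edges (2-paths minus triangles)
    n_vshapes = sum(comb(d, 2) for _, d in G.degree()) - 3 * n_triangles
    tri_freq = n_triangles / n_triples
    v_freq = n_vshapes / n_triples
    T3 = edge_freq**3 - tri_freq
    T2 = 3 * edge_freq**2 * (1 - edge_freq) - v_freq
    return T2, T3

G = nx.erdos_renyi_graph(500, 0.05, seed=1)
print(subgraph_statistics(G))    # both close to 0 under the Erdos-Renyi null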
We tightly analyze the sample complexity of CCA and provide a learning algorithm that achieves optimal statistical performance in time linear in the required number of samples (up to log factors), as well as a streaming algorithm with similar guarantees.
Community detection is a central problem of network data analysis. Given a network, the goal of community detection is to partition the network nodes into a small number of clusters, which can often help reveal interesting structures. The present paper studies community detection in Degree-Corrected Block Models (DCBMs). We first derive asymptotic minimax risks of the problem for a misclassification proportion loss under appropriate conditions. The minimax risks are shown to depend on degree-correction parameters, community sizes, and average within- and between-community connectivities in an intuitive and interpretable way. In addition, we propose a polynomial-time algorithm to adaptively perform consistent and even asymptotically optimal community detection in DCBMs.
Jul 14 2016 cs.CY
Ads are an important revenue source for mobile app development, especially for free apps, whose expenses can be compensated by ad revenue. The benefits of ads also come with costs. For example, too many ads can interfere with the user experience, leading to less user retention and ultimately reduced earnings. In this paper, we aim at understanding ad costs from the users' perspective. We utilize app reviews, which are widely recognized as expressions of user perceptions, to identify the ad costs that concern users. Four types of ad costs, i.e., number of ads, memory/CPU overhead, traffic usage, and battery consumption, have been discovered from user reviews. To verify whether different ad integration schemes generate different ad costs, we first obtain the commonly used ad schemes from 104 popular apps, and then design a framework named IntelliAd to automatically measure the ad costs of each scheme. To examine whether these costs indeed influence users' reactions, we finally observe the correlations between the measured ad costs and the user perceptions. We discover that the costs related to memory/CPU overhead and battery consumption concern users more, while traffic usage concerns users less despite its obvious variations among different schemes in the experiments. Our experimental results provide developers with suggestions on better incorporating ads into apps while preserving the user experience.
Mar 08 2016 cs.CV
Detecting complex events in a large video collection crawled from video websites is a challenging task. When directly applying good image-based feature representations, e.g., HOG and SIFT, to videos, we have to face the problem of how to pool multiple frame-level feature representations into one feature representation. In this paper, we propose a novel learning-based frame pooling method. We formulate the pooling weight learning as an optimization problem, and thus our method can automatically learn the best pooling weight configuration for each specific event category. Experimental results on TRECVID MED 2011 reveal that our method outperforms the commonly used average pooling and max pooling strategies on both high-level and low-level 2D image features.
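A minimal sketch of frame pooling with per-position weights, of which average pooling and max pooling are special cases; in the paper the pooling weights are learned from the event-classification objective, which is not reproduced here, and the feature dimensions below are placeholders.

import numpy as np

def pooled_feature(frame_feats, weights):
    """Pool per-frame features of shape (T, D) into one video-level feature of
    shape (D,) using one weight per frame position; uniform weights recover
    average pooling, while a one-hot weight vector picks out a single frame."""
    w = np.exp(weights - weights.max())
    w = w / w.sum()                       # softmax keeps the weights positive and normalized
    return w @ frame_feats

T, D = 30, 128
frame_feats = np.random.randn(T, D)       # e.g., per-frame HOG/SIFT encodings
weights = np.zeros(T)                     # uniform initialization == average pooling
print(np.allclose(pooled_feature(frame_feats, weights), frame_feats.mean(axis=0)))  # True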
Sep 29 2015 cs.NI
With vast amounts of spectrum available in the millimeter wave (mmWave) band, small cells at mmWave frequencies densely deployed underlying the conventional homogeneous macrocell network have gained considerable interest from academia, industry, and standards bodies. Due to high propagation loss at higher frequencies, mmWave communications are inherently directional, and concurrent transmissions (spatial reuse) under low inter-link interference can be enabled to significantly improve network capacity. On the other hand, mmWave links are easily blocked by obstacles such as human bodies and furniture. In this paper, we develop a Multi-Hop Relaying Transmission scheme, termed MHRT, to steer blocked flows around obstacles by establishing multi-hop relay paths. In MHRT, a relay path selection algorithm is proposed to establish relay paths for blocked flows for better use of concurrent transmissions. After relay path selection, we use a multi-hop transmission scheduling algorithm to compute near-optimal schedules by fully exploiting spatial reuse. Through extensive simulations under various traffic patterns and channel conditions, we demonstrate that MHRT achieves superior performance in terms of network throughput and connection robustness compared with other existing protocols, especially under serious blockage conditions. The performance of MHRT with different hop limits is also simulated and analyzed to guide the choice of the maximum hop number in practice.
Sep 29 2015 cs.NI
Heterogeneous cellular networks with small cells densely deployed underlying the conventional homogeneous macrocells are emerging as a promising candidate for the fifth generation (5G) mobile network. When a large number of base stations are deployed, a cost-effective, flexible, and green backhaul solution becomes one of the most urgent and critical challenges. With vast amounts of spectrum available, wireless backhaul in the millimeter wave (mmWave) band is able to provide transmission rates of several Gbps. To overcome the high propagation loss at higher frequencies, mmWave backhaul utilizes beamforming to achieve directional transmission, and concurrent transmissions (spatial reuse) under low inter-link interference can be enabled to significantly improve network capacity. To achieve an energy-efficient solution for the mmWave backhauling of small cells, we first formulate the problem of minimizing energy consumption via concurrent transmission scheduling and power control as a mixed integer nonlinear programming problem. Then we develop an energy-efficient and practical mmWave backhauling scheme, where a maximum-independent-set-based scheduling algorithm and a power control algorithm are proposed to exploit spatial reuse for low energy consumption and high energy efficiency. We also theoretically analyze the conditions under which our scheme reduces energy consumption, and the choice of the interference threshold for energy reduction. Through extensive simulations under various traffic patterns and system parameters, we demonstrate the superior performance of our scheme in terms of energy consumption and energy efficiency, and also analyze the choice of the interference threshold under different traffic loads, BS distributions, and maximum transmission powers.
Sep 25 2015 cs.NI
With huge unlicensed bandwidth available in most parts of the world, millimeter wave (mmWave) communications in the 60 GHz band have been considered one of the most promising candidates to support multi-gigabit wireless services. Due to the high propagation loss of mmWave channels, beamforming is likely to be adopted as an essential technique. Consequently, transmission in the 60 GHz band is inherently directional. Directivity enables concurrent transmissions (spatial reuse), which can be fully exploited to improve network capacity. In this paper, we propose a multiple-path multi-hop scheduling scheme, termed MPMH, for mmWave wireless personal area networks, where the traffic across links of low channel quality is transmitted through multiple paths of multiple hops to unleash the potential of spatial reuse. We formulate the problem of multiple-path multi-hop scheduling as a mixed integer linear program (MILP), which is generally NP-hard. To enable the implementation of multiple-path multi-hop transmission in practice, we propose a heuristic scheme including path selection, traffic distribution, and multiple-path multi-hop scheduling to efficiently solve the formulated problem. Finally, through extensive simulations, we demonstrate that MPMH achieves near-optimal network performance in terms of transmission delay and throughput, and enhances network performance significantly compared with existing protocols.
Sep 25 2015 cs.NI
With the explosive growth of mobile demand, small cells in millimeter wave (mmWave) bands underlying the macrocell networks have attracted intense interest from both academia and industry. MmWave communications in the 60 GHz band are able to utilize the huge unlicensed bandwidth to provide transmission rates of multiple Gbps. In this case, device-to-device (D2D) communications in mmWave bands should be fully exploited, since they cause no interference to the macrocell networks and achieve higher transmission rates. In addition, because directional transmission causes less interference, multiple links, including D2D links, can be scheduled for concurrent transmissions (spatial reuse). With the popularity of content-based mobile applications, popular content downloading in the small cells needs to be optimized to improve network performance and enhance user experience. In this paper, we develop an efficient scheduling scheme for popular content downloading in mmWave small cells, termed PCDS (popular content downloading scheduling), where both D2D communications in close proximity and concurrent transmissions are exploited to improve transmission efficiency. In PCDS, a transmission path selection algorithm is designed to establish multi-hop transmission paths for users, aiming at better utilization of D2D communications and concurrent transmissions. After transmission path selection, a concurrent transmission scheduling algorithm is designed to maximize the spatial reuse gain. Through extensive simulations under various traffic patterns, we demonstrate that PCDS achieves near-optimal performance in terms of delay and throughput, as well as superior performance compared with other existing protocols, especially under heavy load.
Community detection is a fundamental statistical problem in network data analysis. Many algorithms have been proposed to tackle this problem. Most of these algorithms are not guaranteed to achieve the statistical optimality of the problem, while procedures that achieve information-theoretic limits for general parameter spaces are not computationally tractable. In this paper, we present a computationally feasible two-stage method that achieves optimal statistical performance in misclassification proportion for the stochastic block model under weak regularity conditions. Our two-stage procedure consists of an initialization stage, which can use any of a wide range of weakly consistent community detection procedures, followed by a generic refinement stage that outputs a community assignment achieving the optimal misclassification proportion with high probability. The practical effectiveness of the new algorithm is demonstrated by competitive numerical results.
Mar 10 2015 cs.NI
With the explosive growth of mobile data demand, there has been increasing interest in deploying small cells of higher frequency bands underlying the conventional homogeneous macrocell network, usually referred to as heterogeneous cellular networks, to significantly boost the overall network capacity. With vast amounts of spectrum available in the millimeter wave (mmWave) band, small cells at mmWave frequencies are able to provide multi-gigabit access data rates, while wireless backhaul in the mmWave band is emerging as a cost-effective solution to provide high backhaul capacity to connect the access points (APs) of the small cells. In order to operate the mobile network optimally, it is necessary to jointly design the radio access and backhaul networks. Meanwhile, direct transmissions between devices should also be considered to improve system performance and enhance the user experience. In this paper, we propose a joint transmission scheduling scheme for the radio access and backhaul of small cells in the mmWave band, termed D2DMAC, where a path selection criterion is designed to enable device-to-device transmissions for performance improvement. In D2DMAC, a concurrent transmission scheduling algorithm is proposed to fully exploit spatial reuse in mmWave networks. Through extensive simulations under various traffic patterns and user deployments, we demonstrate that D2DMAC achieves near-optimal performance in some cases, and outperforms other protocols significantly in terms of delay and throughput. Furthermore, we also analyze the impact of path selection on the performance improvement of D2DMAC under different selected parameters.
Feb 25 2014 cs.CV
Tracking-by-detection has become an attractive tracking technique, which treats tracking as a category detection problem. However, the task in tracking is to search for a specific object, rather than an object category as in detection. In this paper, we propose a novel tracking framework based on exemplar detectors rather than category detectors. The proposed tracker is an ensemble of exemplar-based linear discriminant analysis (ELDA) detectors. Each detector is quite specific and discriminative, because it is trained on a single object instance and massive negatives. To improve its adaptivity, we update both the object and background models. Experimental results on several challenging video sequences demonstrate the effectiveness and robustness of our tracking algorithm.
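A minimal sketch of an exemplar-LDA detector of the kind described: the linear weight is obtained in closed form from a single positive instance and the mean and covariance of a large negative set; the feature dimensionality and data below are placeholders.

import numpy as np

def elda_weights(x_pos, neg_feats, reg=1e-3):
    """Exemplar LDA: w = Sigma^{-1} (x_pos - mu_neg), with mu_neg and Sigma the
    mean and (regularized) covariance of the massive negative feature set."""
    mu_neg = neg_feats.mean(axis=0)
    sigma = np.cov(neg_feats, rowvar=False) + reg * np.eye(neg_feats.shape[1])
    return np.linalg.solve(sigma, x_pos - mu_neg)

d = 64
neg_feats = np.random.randn(100_000, d)          # placeholder background features
x_pos = np.random.randn(d) + 0.5                 # placeholder single object instance
w = elda_weights(x_pos, neg_feats)
score = lambda x: float(w @ (x - neg_feats.mean(axis=0)))   # detector response
print(score(x_pos) > score(np.random.randn(d)))              # usually True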
The transshipment problem is one of the basic operations research problems. In this paper, we first develop a biologically inspired mathematical model of a dynamical system, which is used to solve the minimum cost flow problem. It has lower computational complexity than the Physarum Solver. Second, we apply the proposed model to solve the traditional transshipment problem. Compared with conventional methods, experimental results show that the proposed model is simple and effective, and handles the problem in a continuous manner.
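The abstract does not give the model's equations, so the following is only a generic sketch of Physarum-type dynamics for a single-source, single-sink minimum cost flow: edge conductivities adapt toward the absolute flows they carry, and the flows follow from solving Kirchhoff's equations for node pressures; the graph, costs, and parameters below are illustrative.

import numpy as np

def physarum_min_cost_flow(n, edges, costs, s, t, demand=1.0, steps=400, dt=0.05):
    """Generic Physarum-type dynamics (a sketch, not the paper's exact model)."""
    D = np.ones(len(edges))                             # edge conductivities
    for _ in range(steps):
        L = np.zeros((n, n))                            # weighted Laplacian
        for k, (i, j) in enumerate(edges):
            g = D[k] / costs[k]
            L[i, i] += g; L[j, j] += g
            L[i, j] -= g; L[j, i] -= g
        b = np.zeros(n); b[s] = demand; b[t] = -demand
        L[t, :] = 0.0; L[t, t] = 1.0; b[t] = 0.0        # fix the sink pressure to 0
        p = np.linalg.solve(L, b)                       # node pressures (Kirchhoff)
        Q = np.array([D[k] / costs[k] * (p[i] - p[j]) for k, (i, j) in enumerate(edges)])
        D += dt * (np.abs(Q) - D)                       # reinforce edges that carry flow
    return Q

# toy network: two parallel two-hop paths from node 0 to node 3, one cheaper
edges = [(0, 1), (1, 3), (0, 2), (2, 3)]
costs = [1.0, 1.0, 2.0, 2.0]
print(physarum_min_cost_flow(4, edges, costs, s=0, t=3))   # flow concentrates on the cheap path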
The UDC (Universal Decimal Classification) is not only a classification language with a long history; it also presents a complex cognitive system worthy of the attention of complexity theory. The elements of the UDC (classes, auxiliaries, and operations) are combined into symbolic strings, which in essence represent a complex network of concepts. This network forms a backbone for the ordering of knowledge and at the same time allows the expression of different perspectives on various products of human knowledge production. In this paper we look at UDC strings derived from the holdings of libraries. In particular, we analyse the subject headings of the holdings of the university library in Leuven, and an extraction of UDC numbers from the OCLC WorldCat. Comparing those sets with the Master Reference File, we look into the length of strings, the occurrence of different auxiliary signs, and the resulting connections between UDC classes. We apply methods and representations from complexity theory. Mapping out basic statistics on UDC classes as used in libraries from the point of view of complexity theory offers several benefits. Laying out their structure could provide users with an overview and basic information about the nature and focus of specific collections. A closer view into combined UDC numbers reveals the complex nature of the UDC as an example of a knowledge ordering system, which deserves future exploration from a complexity-theoretic perspective.
To classify is to put things in meaningful groups, but the criteria for doing so can be problematic. The study of the evolution of classification includes ontogenetic analysis of change in classification over time. We present an empirical analysis of the UDC over the entire period of its development. We demonstrate stability in the main classes, with major change driven by 20th century scientific developments. But we also demonstrate a vast increase in the complexity of auxiliaries. This study illustrates an alternative to Tennis' "scheme-versioning" method.
Wikipedia, as a social phenomenon of collaborative knowledge creation, has been studied extensively from various points of view. The category system of Wikipedia, introduced in 2004, has attracted relatively little attention. In this study, we focus on the documentation of knowledge and the transformation of this documentation over time. We take Wikipedia as a proxy for knowledge in general and its category system as an aspect of the structure of this knowledge. We investigate the evolution of the category structure of the English Wikipedia from its birth in 2004 to 2008. We treat the category system as if it were a hierarchical Knowledge Organization System, capturing the changes in the distributions of the top categories. We investigate how the clustering of articles defined by the category system matches the direct link network between the articles, and show how it changes over time. We find the Wikipedia category network mostly stable, but with occasional reorganizations. We show that the clustering matches the link structure quite well, except in short periods preceding the reorganizations.
This study analyzes the differences between the category structure of the Universal Decimal Classification (UDC) system (one of the widely used library classification systems in Europe) and Wikipedia. In particular, we compare the emerging structure of category links to the structure of classes in the UDC. With this comparison we would like to scrutinize the question of how knowledge maps of the same domain differ when they are created socially (i.e., Wikipedia) as opposed to formally (UDC), using classification theory. As a case study, we focus on the category of "Arts".
We present a CAD framework for CMOL, a hybrid CMOS/molecular circuit architecture. Our framework first transforms any logically synthesized circuit based on AND/OR/NOT gates into a NOR-gate circuit, and then maps the NOR gates to CMOL. We encode the CMOL cell assignment problem as a set of Boolean constraints. The constraints are satisfiable if and only if there is a way to map all the NOR gates to the CMOL cells. We further investigate various types of static defects in the CMOL architecture, and propose a reconfiguration technique that handles these defects through our CAD framework. This is the first automated framework for CMOL cell assignment, and the first to model several different CMOL static defects. Empirical results show that our approach is efficient and scalable.
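A minimal sketch of the kind of Boolean encoding described: a variable x[g][c] states that NOR gate g is placed on CMOL cell c, and clauses require every gate to occupy exactly one non-defective cell while no cell hosts two gates; the connectivity constraints between neighbouring cells, which the real framework also encodes, are omitted, and the defect set is illustrative.

from itertools import combinations

def encode_assignment(num_gates, num_cells, defective_cells):
    """Build CNF clauses (lists of signed integers, DIMACS-style) for the
    gate-to-cell assignment constraints; a SAT solver would then decide them."""
    var = lambda g, c: g * num_cells + c + 1            # 1-based variable index
    clauses = []
    for g in range(num_gates):
        # each gate sits on at least one usable cell ...
        clauses.append([var(g, c) for c in range(num_cells) if c not in defective_cells])
        # ... and on at most one cell
        for c1, c2 in combinations(range(num_cells), 2):
            clauses.append([-var(g, c1), -var(g, c2)])
    # no cell hosts two different gates
    for c in range(num_cells):
        for g1, g2 in combinations(range(num_gates), 2):
            clauses.append([-var(g1, c), -var(g2, c)])
    return clauses

clauses = encode_assignment(num_gates=3, num_cells=4, defective_cells={2})
print(len(clauses), "clauses; satisfiable iff a defect-avoiding placement exists")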