Data driven research on Android has gained a great momentum these years. The abundance of data facilitates knowledge learning, however, also increases the difficulty of data preprocessing. Therefore, it is non-trivial to prepare a demanding and accurate set of data for research. In this work, we put forward AndroVault, a framework for the Android research composing of data collection, knowledge representation and knowledge extraction. It has started with a long-running web crawler for data collection (both apps and description) since 2013, which guarantees the timeliness of data; With static analysis and dynamic analysis of the collected data, we compute a variety of attributes to characterize Android apps. After that, we employ a knowledge graph to connect all these apps by computing their correlation in terms of attributes; Last, we leverage multiple technologies such as logical inference, machine learning, and correlation analysis to extract facts (more accurate and demanding, either high level or not, data) that are beneficial for a specific research problem. With the produced data of high quality, we have successfully conducted many research works including malware detection, code generation, and Android testing. We would like to release our data to the research community in an authenticated manner, and encourage them to conduct productive research.
Neural networks have demonstrated considerable success in a wide variety of real-world problems. However, the presence of adversarial examples - slightly perturbed inputs that are misclassified with high confidence - limits our ability to guarantee performance for these networks in safety-critical applications. We demonstrate that, for networks that are piecewise affine (for example, deep networks with ReLU and maxpool units), proving no adversarial example exists - or finding the closest example if one does exist - can be naturally formulated as solving a mixed integer program. Solves for a fully-connected MNIST classifier with three hidden layers can be completed an order of magnitude faster than those of the best existing approach. To address the concern that adversarial examples are irrelevant because pixel-wise attacks are unlikely to happen in natural images, we search for adversaries over a natural class of perturbations written as convolutions with an adversarial blurring kernel. When searching over blurred images, we find that as opposed to pixelwise attacks, some misclassifications are impossible. Even more interestingly, a small fraction of input images are provably robust to blurs: every blurred version of the input is classified with the same, correct label.
Cloud vendors are increasingly offering machine learning services as part of their platform and services portfolios. These services enable the deployment of machine learning models on the cloud that are offered on a pay-per-query basis to application developers and end users. However recent work has shown that the hosted models are susceptible to extraction attacks. Adversaries may launch queries to steal the model and compromise future query payments or privacy of the training data. In this work, we present a cloud-based extraction monitor that can quantify the extraction status of models by observing the query and response streams of both individual and colluding adversarial users. We present a novel technique that uses information gain to measure the model learning rate by users with increasing number of queries. Additionally, we present an alternate technique that maintains intelligent query summaries to measure the learning rate relative to the coverage of the input feature space in the presence of collusion. Both these approaches have low computational overhead and can easily be offered as services to model owners to warn them of possible extraction attacks from adversaries. We present performance results for these approaches for decision tree models deployed on BigML MLaaS platform, using open source datasets and different adversarial attack strategies.
Nov 21 2017 cs.CR
The AN.ON-Next project aims to integrate privacy-enhancing technologies into the internet's infrastructure and establish them in the consumer mass market. The technologies in focus include a basis protection at internet service provider level, an improved overlay network-based protection and a concept for privacy protection in the emerging 5G mobile network. A crucial success factor will be the viable adjustment and development of standards, business models and pricing strategies for those new technologies.
A large user base relies on software updates provided through package managers. This provides a unique lever for improving the security of the software update process. We propose a transparency system for software updates and implement it for a widely deployed Linux package manager, namely APT. Our system is capable of detecting targeted backdoors without producing overhead for maintainers. In addition, in our system, the availability of source code is ensured, the binding between source and binary code is verified using reproducible builds, and the maintainer responsible for distributing a specific package can be identified. We describe a novel "hidden version" attack against current software transparency systems and propose as well as integrate a suitable defense. To address equivocation attacks by the transparency log server, we introduce tree root cross logging, where the log's Merkle tree root is submitted into a separately operated log server. This significantly relaxes the inter-operator cooperation requirements compared to other systems. Our implementation is evaluated by replaying over 3000 updates of the Debian operating system over the course of two years, demonstrating its viability and identifying numerous irregularities.
Fast development of sharing services becomes a crucial part of the process in constructing a cyber-enabled world, as sharing services reinvent how people exchange and obtain goods or services. However, privacy leakage or disclosure is a key concern which may hinder the development of sharing services. While significant efforts have been undertaken to address various privacy issues in recent years, there is a surprising lack of a review for privacy concerns in the cyber-enabled sharing world. To bridge the gap, in this study, we survey and evaluate existing and emerging privacy issues relating to sharing services from various perspectives. Differing from existing similar works on surveying sharing practices in various fields, our work comprehensively covers six directions of sharing services in the cyber-enabled world and selects solutions mostly from the recent five years. Finally, we conclude the issues and solutions from three perspectives, namely, the user, platform and service provider perspectives.
Nov 21 2017 cs.CR
Most user authentication methods and identity proving systems rely on a centralized database. Such information storage presents a single point of compromise from a security perspective. If this system is compromised it poses a direct threat to users' digital identities. This paper proposes a decentralized authentication method, called the Horcrux protocol, in which there is no such single point of compromise. The protocol relies on decentralized identifiers (DIDs) under development by the W3C Verifiable Claims Community Group and the concept of self-sovereign identity. To accomplish this, we propose specification and implementation of a decentralized biometric credential storage option via blockchains using DIDs and DID documents within the IEEE 2410-2017 Biometric Open Protocol Standard (BOPS).
Nov 21 2017 cs.CR
The widespread use of mobile electronic devices increases the complexities of mobile security. This paper aims to provide a secure communication environment for smartphone users. Some research proves that the one-time pad is one of the securest encryption methods, and the key distribution problem can be solved by using the QKD (quantum key distribution). The objective of this project is to design an Android APP (application) to exchange several random keys between mobile phones. Inspired by QKD, the developed APP uses the quick response (QR) code as a carrier to dispatch large amounts of one-time keys. After evaluating the performance of APP, it allows the mobile phone to capture and decode 1800 bytes of random data in 600ms. The continuous scanning mode of APP is designed to improve the overall transmission performance and user experience, and the maximum transmission rate of this mode is around 2200 bytes/s. The omnidirectional readability and error correction capability of QR code gives it better real-life application, and the features of adequate storage capacity and quick response optimize overall transmission efficiency. The security of this APP is guaranteed since QR code is exchanged face-to-face, eliminating the risk of being eavesdropped. Also, the id of QR code is the only message that would be transmitted through the whole communication. The experimental results show this project can achieve superior transmission performance, and the correlation between the transmission rate of the system and several parameters, such as the QR code size, has been analyzed. In addition, some existing technologies and the main findings in the context of the project are summarized and critically compared in detail.
Nov 21 2017 cs.CR
Irreducible polynomials play an important role till now, in construction of 8-bit S-Boxes in ciphers. The 8-bit S-Box of Advanced Encryption Standard is a list of decimal equivalents of Multiplicative Inverses (MI) of all the elemental polynomials of a monic irreducible polynomial over Galois Field GF(2^8) . In this paper a new method to search monic Irreducible Polynomials (IPs) over Galois fields GF(p^q) has been introduced. Here the decimal equivalents of each monic elemental polynomial (ep), two at a time, are split into the p-nary coefficients of each term, of those two monic elemental polynomials. From those coefficients the p-nary coefficients of the resultant monic basic polynomials (BP) have been obtained. The decimal equivalents of resultant basic polynomials with p-nary coefficients are treated as decimal equivalents of the monic reducible polynomials, since monic reducible polynomials must have two monic elemental polynomials as its factor. The decimal equivalents of polynomials belonging to the list of reducible polynomials are cancelled leaving behind the monic irreducible polynomials. A non-monic irreducible polynomial is computed by multiplying a monic irreducible polynomial by alpha where alpha belongs to GF(p^q) and assumes values from 2 to (p-1).