Machine Learning (ML) models are applied in a variety of tasks such as network intrusion detection or malware classification. Yet, these models are vulnerable to a class of malicious inputs known as adversarial examples. These are slightly perturbed inputs that are classified incorrectly by the ML model. The mitigation of these adversarial inputs remains an open problem. As a step towards a model-agnostic defense against adversarial examples, we show that they are not drawn from the same distribution than the original data, and can thus be detected using statistical tests. As the number of malicious points included in samples presented to the test diminishes, its detection confidence decreases. Hence, we introduce a complimentary approach to identify specific inputs that are adversarial among sets of inputs flagged by the statistical test. Specifically, we augment our ML model with an additional output, in which the model is trained to classify all adversarial inputs. We evaluate our approach on multiple adversarial example crafting methods (including the fast gradient sign and Jacobian-based saliency map methods) with several datasets. The statistical test flags sample sets containing adversarial inputs with confidence above 80%. Furthermore, our augmented model either detects adversarial examples with high accuracy (>80%) or increases the adversary's cost---the perturbation added---by more than 150%. In this way, we show that statistical properties of adversarial examples are essential to their detection.
Feb 22 2017 cs.CR
Recent years have shown that more than ever governments and intelligence agencies try to control and bypass the cryptographic means used for the protection of data. Backdooring encryption algorithms is considered as the best way to enforce cryptographic control. Until now, only implementation backdoors (at the protocol/implementation/management level) are generally considered. In this paper we propose to address the most critical issue of backdoors: mathematical backdoors or by-design backdoors, which are put directly at the mathematical design of the encryption algorithm. While the algorithm may be totally public, proving that there is a backdoor, identifying it and exploiting it, may be an intractable problem. We intend to explain that it is probably possible to design and put such backdoors. Considering a particular family (among all the possible ones), we present BEA-1, a block cipher algorithm which is similar to the AES and which contains a mathematical backdoor enabling an operational and effective cryptanalysis. The BEA-1 algorithm (80-bit block size, 120-bit key, 11 rounds) is designed to resist to linear and differential cryptanalyses. A challenge will be proposed to the cryptography community soon. Its aim is to assess whether our backdoor is easily detectable and exploitable or not.
Critical infrastructure protection (CIP) is envisioned to be one of the most challenging security problems in the coming decade. One key challenge in CIP is the ability to allocate resources, either personnel or cyber, to critical infrastructures with different vulnerability and criticality levels. In this work, a contract-theoretic approach is proposed to solve the problem of resource allocation in critical infrastructure with asymmetric information. A control center (CC) is used to design contracts and offer them to infrastructures' owners. A contract can be seen as an agreement between the CC and infrastructures using which the CC allocates resources and gets rewards in return. Contracts are designed in a way to maximize the CC's benefit and motivate each infrastructure to accept a contract and obtain proper resources for its protection. Infrastructures are defined by both vulnerability levels and criticality levels which are unknown to the CC. Therefore, each infrastructure can claim that it is the most vulnerable or critical to gain more resources. A novel mechanism is developed to handle such an asymmetric information while providing the optimal contract that motivates each infrastructure to reveal its actual type. The necessary and sufficient conditions for such resource allocation contracts under asymmetric information are derived. Simulation results show that the proposed contract-theoretic approach maximizes the CC's utility while ensuring that no infrastructure has an incentive to ask for another contract, despite the lack of exact information at the CC.
Human mobility data has been ubiquitously collected through cellular networks and mobile applications, and publicly released for academic research and commercial purposes for the last decade. Since releasing individual's mobility records usually gives rise to privacy issues, datasets owners tend to only publish aggregated mobility data, such as the number of users covered by a cellular tower at a specific timestamp, which is believed to be sufficient for preserving users' privacy. However, in this paper, we argue and prove that even publishing aggregated mobility data could lead to privacy breach in individuals' trajectories. We develop an attack system that is able to exploit the uniqueness and regularity of human mobility to recover individual's trajectories from the aggregated mobility data without any prior knowledge. By conducting experiments on two real-world datasets collected from both mobile application and cellular network, we reveal that the attack system is able to recover users' trajectories with accuracy about 73%~91% at the scale of tens of thousands to hundreds of thousands users, which indicates severe privacy leakage in such datasets. Through the investigation on aggregated mobility data, our work recognizes a novel privacy problem in publishing statistic data, which appeals for immediate attentions from both academy and industry.
Feb 22 2017 cs.CR
Software is everywhere, from mission critical systems such as industrial power stations, pacemakers and even common household appliances we are surrounded by software with potentially exploitable vulnerabilities. The growing complexity of software, the rise in IoT devices coupled with our dependence on technology has made program analysis more specifically binary analysis an important area of research in computer science. Moreover these needs and dependencies have made it a necessity to explore building automated analysis systems that can operate at scale, speed and efficacy all while performing with the skill of a human expert. Though great progress has been made in this area of research, there remains limitations and open challenges to be addressed. Recognizing this need, DARAP sponsored the Cyber Grand Challenge (CGC), a competition to showcase the current state of the art in systems that perform; automated vulnerability detection, exploit generation and software patching. This paper is a survey of the vulnerability detection and exploit generation techniques, underlying technologies and related works of two of the winning systems Mayhem and Mechanical Phish.
Feb 22 2017 cs.CR
Personal sensory data is used by context-aware mobile applications to provide utility. However, the same data can be used by an adversary to make sensitive inferences about a user thereby violating her privacy. We present DEEProtect, a framework that enables a novel form of access control that we refer to as the inference-based access control, in which mobile apps with access to sensor data are limited (provably) in their ability to make inferences about user's sensitive data and behavior. DEEProtect adopts a two-layered privacy strategy. First, it leverages novel deep learning techniques to perform data minimization and limits the amount of information being shared; the learning network is used to derive a compact representation of sensor data consisting only of features relevant to authorized utility-providing inferences. Second, DEEProtect obfuscates the previously learnt features, thereby providing an additional layer of protection against sensitive inferences; our approach can provide both conventional and relaxed notions of local differential privacy, depending on how sensitive inferences are specified. Through theoretical analysis and extensive experiments using real-world apps and datasets, we demonstrate that when compared to existing approaches DEEProtect provides provable privacy guarantees with up to 8x improvement in utility. Finally, DEEProtect shares obfuscated but raw sensor data reconstructed from the perturbed features, thus requiring no changes to the existing app interfaces.