Wireless connectivity is becoming common in increasingly diverse personal devices, enabling various interoperation- and Internet-based applications and services. More and more interconnected devices are simultaneously operated by a single user with short-lived connections, making usable device authentication methods imperative to ensure both high security and seamless user experience. Unfortunately, current authentication methods that heavily require human involvement, in addition to form factor and mobility constraints, make this balance hard to achieve, often forcing users to choose between security and convenience. In this work, we present a novel over-the-air device authentication scheme named AeroKey that achieves both high security and high usability. With virtually no hardware overhead, AeroKey leverages ubiquitously observable ambient electromagnetic radiation to autonomously generate a spatiotemporally unique secret that can be derived only by devices located close to each other. Devices can use this unique secret to form the basis of a symmetric key, making the authentication procedure practical, secure, and usable with no active human involvement. We propose and implement essential techniques to overcome challenges in realizing AeroKey on low-cost microcontroller units, such as poor time synchronization, lack of a precision analog front-end, and inconsistent sampling rates. Our real-world experiments demonstrate reliable authentication as well as robustness against various realistic adversaries, with low equal-error rates of 3.4% or less and a usable authentication time of as low as 24s.
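The key-agreement idea can be sketched in a few lines. This is a hypothetical simplification, not AeroKey's actual protocol: it assumes two co-located devices observe the same ambient EM signal (simulated here) and quantize window averages into bits, while a distant device sees an uncorrelated signal.

```python
import numpy as np

def em_to_bits(samples, window=8):
    # Hypothetical quantizer: average each window of EM samples, then
    # emit 1 if the window mean exceeds the overall mean, else 0.
    samples = np.asarray(samples, dtype=float)
    n = len(samples) // window * window
    means = samples[:n].reshape(-1, window).mean(axis=1)
    return (means > samples[:n].mean()).astype(int)

def bit_agreement(bits_a, bits_b):
    # Fraction of matching bits between two devices' sequences.
    return float((np.asarray(bits_a) == np.asarray(bits_b)).mean())

rng = np.random.default_rng(0)
ambient = rng.normal(size=512)                    # shared ambient EM signal
near = ambient + 0.05 * rng.normal(size=512)      # co-located device: small measurement noise
far = rng.normal(size=512)                        # distant device: uncorrelated signal

bits_src = em_to_bits(ambient)
print(bit_agreement(bits_src, em_to_bits(near)))  # high: co-located device recovers nearly identical bits
print(bit_agreement(bits_src, em_to_bits(far)))   # near chance level
```

In practice, the remaining bit disagreements between co-located devices would be removed with an information-reconciliation step before deriving the symmetric key.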
Voice assistants are deployed widely and provide useful functionality. However, recent work has shown that commercial systems like Amazon Alexa and Google Home are vulnerable to voice-based confusion attacks that exploit design issues. We propose a systems-oriented defense against this class of attacks and demonstrate its functionality for Amazon Alexa. We ensure that only the skills a user intends to invoke execute in response to voice commands. Our key insight is that we can interpret a user’s intentions by analyzing their activity on counterpart systems of the web and smartphones. For example, the Lyft ride-sharing Alexa skill has an Android app and a website. Our work shows how information from counterpart apps can help reduce ambiguities in the skill invocation process. We build SkillFence, a browser extension that existing voice assistant users can install to ensure that only legitimate skills run in response to their commands. Using real user data from MTurk (𝑁=116) and experimental trials involving synthetic and organic speech, we show that SkillFence provides a balance between usability and security by securing 90.83% of skills that a user will need with a false acceptance rate of 19.83%.
Business Collaboration Platforms like Microsoft Teams and Slack enable teamwork by supporting text chatting and third-party resource integration. A user can access online file storage, make video calls, and manage a code repository, all from within the platform, making these platforms a hub for sensitive communication and resources. The key enabler for these productivity features is a third-party application model. We contribute an experimental security analysis of this model and the third-party apps. Performing this analysis is challenging because commercial platforms and their apps are closed-source systems. Our analysis methodology is to systematically investigate different types of interactions possible between apps and users. We discover that the access control model in these systems violates two fundamental security principles: least privilege and complete mediation. These violations enable a malicious app to exploit the confidentiality and integrity of user messages and third-party resources connected to the platform. We construct proof-of-concept attacks that can: (1) eavesdrop on user messages without having permission to read those messages; (2) launch fake video calls; (3) automatically merge code into repositories without user approval or involvement. Finally, we provide an analysis of countermeasures that systems like Slack and Microsoft Teams can adopt today.
Voice assistants rely on keyword spotting to process vocal commands issued by humans: commands are prepended with a keyword, such as Alexa or Ok Google, which must be spotted to activate the voice assistant. Typically, keyword spotting is two-fold: an on-device model first identifies the keyword, then the resulting voice sample triggers a second on-cloud model which verifies and processes the activation. In this work, we explore the significant privacy and security concerns that this raises under two threat models. First, our experiments demonstrate that accidental activations result in up to a minute of speech recording being uploaded to the cloud. Second, we verify that adversaries can systematically trigger misactivations through adversarial examples, threatening the integrity and availability of services connected to the voice assistant. We propose EKOS, which leverages the semantics of the KWS task to defend against both accidental and adversarial activations. EKOS incorporates spatial redundancy from the audio environment at training and inference time to minimize distribution drifts responsible for accidental activations. It also exploits a physical property of speech—its redundancy at different harmonics—to deploy an ensemble of models trained on different harmonics and provably force the adversary to modify more of the frequency spectrum to obtain adversarial examples. Our evaluation shows that EKOS increases the cost of adversarial attacks while preserving natural accuracy. We validate the performance of EKOS with over-the-air experiments on commercial devices; we find that EKOS improves the precision of KWS detection in non-adversarial settings.
As real-world images come in varying sizes, the machine learning model is typically part of a larger system that includes an upstream image-scaling algorithm. In this paper, we investigate the interplay between vulnerabilities of the image-scaling procedure and machine learning models in the decision-based black-box setting. We propose a novel sampling strategy to make a black-box attack exploit vulnerabilities in scaling algorithms, scaling defenses, and the final machine learning model in an end-to-end manner. Based on this scaling-aware attack, we reveal that most existing scaling defenses are ineffective once the downstream model is taken into account. Moreover, we empirically observe that standard black-box attacks can significantly improve their performance by exploiting the vulnerable scaling procedure. We further demonstrate this problem on a commercial Image Analysis API with transfer-based black-box attacks.
In the post-pandemic era, video conferencing apps (VCAs) have converted previously private spaces — bedrooms, living rooms, and kitchens — into semi-public extensions of the office. And for the most part, users have accepted these apps in their personal space, without much thought about the permission models that govern the use of their personal data during meetings. While access to a device’s video camera is carefully controlled, little has been done to ensure the same level of privacy for accessing the microphone. In this work, we ask the question: what happens to the microphone data when a user clicks the mute button in a VCA? We first conduct a user study to analyze users’ understanding of the permission model of the mute button. Then, using runtime binary analysis tools, we trace raw audio in many popular VCAs as it traverses the app from the audio driver to the network. We find fragmented policies for dealing with microphone data among VCAs — some continuously monitor the microphone input during mute, and others do so periodically. One app transmits statistics of the audio to its telemetry servers while the app is muted. Using network traffic that we intercept en route to the telemetry server, we implement a proof-of-concept background activity classifier and demonstrate the feasibility of inferring the ongoing background activity during a meeting — cooking, cleaning, typing, etc. We achieve 81.9% macro accuracy in identifying six common background activities using intercepted outgoing telemetry packets when a user is muted.
Online websites use cookie notices to elicit consent from users, as required by recent privacy regulations like the GDPR and the CCPA. Prior work has shown that these notices use dark patterns to manipulate users into making website-friendly choices that put users’ privacy at risk. In this work, we develop CookieEnforcer, a new system for automatically discovering cookie notices and deciding on the options that result in disabling all non-essential cookies. In order to achieve this, we first build an automatic cookie notice detector that utilizes the rendering pattern of the HTML elements to identify the cookie notices. Next, CookieEnforcer analyzes the cookie notices and predicts the set of actions required to disable all unnecessary cookies. This is done by modeling the problem as a sequence-to-sequence task, where the input is a machine-readable cookie notice and the output is the set of clicks to make. We demonstrate the efficacy of CookieEnforcer via an end-to-end accuracy evaluation, showing that it can generate the required steps in 91% of the cases. Via a user study, we show that CookieEnforcer can significantly reduce user effort. Finally, we use our system to perform several measurements on the top 5k websites from the Tranco list (as accessed from the US and the UK), drawing comparisons and observations at scale.
Recent years have seen a surge in the popularity of acoustics-enabled personal devices powered by machine learning. Yet, machine learning has proven to be vulnerable to adversarial examples. Many modern systems protect themselves against such attacks by targeting artificiality, i.e., they deploy mechanisms to detect the lack of human involvement in generating adversarial examples. However, these defenses implicitly assume that humans are incapable of producing meaningful and targeted adversarial examples. In this paper, we show that this base assumption is wrong. In particular, we demonstrate that for tasks like speaker identification, a human is capable of producing analog adversarial examples directly, with little cost and supervision: by simply speaking through a tube, an adversary reliably impersonates other speakers in the eyes of ML models for speaker identification. Our findings extend to a range of other acoustic-biometric tasks, such as liveness detection, bringing into question their use in real-life security-critical settings, such as phone banking.
Detecting deepfakes is an important problem, but recent work has shown that DNN-based deepfake detectors are brittle against adversarial deepfakes, in which an adversary adds imperceptible perturbations to a deepfake to evade detection. In this work, we show that replacing a single classifier with a carefully chosen ensemble, in which the input transformation for each model induces pairwise orthogonal gradients, can significantly improve robustness beyond the de facto solution of adversarial training. We present theoretical results showing that such orthogonal gradients can help thwart a first-order adversary by reducing the dimensionality of the input subspace in which adversarial deepfakes lie. We validate these results empirically by instantiating and evaluating a randomized version of such “orthogonal” ensembles for adversarial deepfake detection, and find that these randomized ensembles exhibit significantly higher robustness as deepfake detectors compared to state-of-the-art deepfake detectors, even against adversarial deepfakes created using strong PGD-500 attacks.
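The orthogonal-gradient intuition can be illustrated with a toy example (not the paper's construction): for linear detectors f_i(x) = w_i · x, the input gradient of each score is w_i itself, so orthogonalizing the weight vectors makes the gradients pairwise orthogonal, and a first-order step along one model's gradient leaves the other model's score unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16

# Two toy linear "detectors" f_i(x) = w_i . x; the input gradient of
# each score is simply w_i, independent of x.
w1 = rng.normal(size=d)
w2 = rng.normal(size=d)
w2 -= (w2 @ w1) / (w1 @ w1) * w1  # Gram-Schmidt: make w2 orthogonal to w1

x = rng.normal(size=d)
g1, g2 = w1, w2  # input gradients of the two detectors at x

cos = g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2))
print(abs(cos) < 1e-9)  # True: pairwise orthogonal gradients

# A first-order adversarial step along g1 changes f1 but not f2,
# so the adversary must spend budget on both directions at once.
x_adv = x + 0.1 * g1
print(abs(w2 @ x_adv - w2 @ x) < 1e-9)  # True: f2 is unchanged
```

Real detectors are nonlinear, so gradients vary with the input; the ensemble's input transformations are what enforce (approximate) orthogonality there.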
As social robots become increasingly prevalent in day-to-day environments, they will participate in conversations and must appropriately manage the information shared with them. However, little is known about how robots might appropriately discern the sensitivity of information, which has major implications for human-robot trust. As a first step toward addressing this issue, we designed a privacy controller, CONFIDANT, for conversational social robots, capable of using contextual metadata (e.g., sentiment, relationships, topic) from conversations to model privacy boundaries. Afterwards, we conducted two crowdsourced user studies. The first study (n=174) focused on whether a variety of human-human interaction scenarios were perceived as either private/sensitive or non-private/non-sensitive. The findings from our first study were used to generate association rules. Our second study (n=95) evaluated the effectiveness and accuracy of the privacy controller in human-robot interaction scenarios by comparing a robot that used our privacy controller against a baseline robot with no privacy controls. Our results demonstrate that the robot with the privacy controller outperforms the robot without the privacy controller in privacy-awareness, trustworthiness, and social-awareness. We conclude that the integration of privacy controllers in authentic human-robot conversations can allow for more trustworthy robots. This initial privacy controller will serve as a foundation for more complex solutions.
Vehicle-to-Pedestrian (V2P) communication enables numerous safety benefits, such as real-time collision detection and alerts, but poses new security challenges. An imminent and probable scenario is one where a malicious node claiming to be a legitimate pedestrian within the network broadcasts false observations or phenomena on the roads (e.g., traffic load, road hazards, and false road-crossing alarms) in order to impede traffic flow, erode users’ trust in alert messages, or even cause traffic accidents. Therefore, it is crucial to distinguish legitimate road users from adversaries pretending to be them. In this work, we propose PEDRO, a PEDestRian mObility verification mechanism using commodity hardware, where only legitimate mobile pedestrians can be admitted to the ad hoc network consisting of trustworthy vehicles and pedestrians. We leverage the round-trip time (RTT) of wireless signals between the vehicle’s and pedestrian’s devices, and verify only moving (mobile) devices while rejecting stationary ones, based on the realistic assumption that adversaries are likely to remotely launch attacks through static malicious devices. Through an extensive analysis based on simulation as well as real-world experiments, we show that PEDRO’s verification takes under 8s while achieving an 8.5% Equal Error Rate (EER) under regular road environments.
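The mobility check can be sketched minimally as follows. The details here are assumptions for illustration (thresholding the spread of RTT-derived ranges over a probe window); PEDRO's actual verification is more involved.

```python
import statistics

C = 3e8  # propagation speed of the wireless signal, m/s

def ranges_from_rtts(rtts_s):
    # Convert round-trip times (seconds) into verifier-prover ranges (meters).
    return [C * t / 2 for t in rtts_s]

def is_mobile(rtts_s, min_spread_m=1.0):
    # Hypothetical admission test: accept a prover only if its range to
    # the verifier varies by more than min_spread_m over the probe window.
    return statistics.pstdev(ranges_from_rtts(rtts_s)) > min_spread_m

# Walking pedestrian: range shrinks from 30 m to 25 m across six probes.
walking = [2 * d / C for d in (30.0, 29.0, 28.0, 27.0, 26.0, 25.0)]
# Static attacker device: range pinned at 30 m (real measurements would
# also carry jitter, which the threshold must absorb).
static = [2 * 30.0 / C] * 6

print(is_mobile(walking))  # True
print(is_mobile(static))   # False
```

The threshold trades off false rejects of slow pedestrians against false accepts of attackers who add jitter, which is what the reported EER captures.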
Online privacy settings aim to provide users with control over their data. However, in their current state, they suffer from usability and reachability issues. The recent push towards automatically analyzing privacy notices has not been accompanied by a similar effort for the more critical case of privacy settings. So far, the best efforts have targeted the special case of making opt-out pages more reachable. In this work, we present PriSEC, a Privacy Settings Enforcement Controller that leverages machine learning techniques towards a new paradigm for automatically enforcing web privacy controls. PriSEC goes beyond finding the web pages with privacy settings to discovering fine-grained options, presenting them in a searchable, centralized interface, and – most importantly – enforcing them on demand with minimal user intervention. We overcome the open nature of web development through novel algorithms that leverage the invariant behavior and rendering of web pages. We evaluate the performance of PriSEC and find that it precisely annotates the privacy controls for 94.3% of the control pages in our evaluation set. To demonstrate the usability of PriSEC, we conduct a user study with 148 participants. We show an average 3.75x reduction in the time taken to adjust privacy settings as compared to the baseline system.
Recent advances in sensing and computing technologies have led to the rise of eye-tracking platforms. Ranging from mobiles to high-end mixed reality headsets, a wide spectrum of interactive systems now employs eye-tracking. However, eye gaze data is a rich source of sensitive information that can reveal an individual’s physiological and psychological traits. Prior approaches to protecting eye-tracking data suffer from two major drawbacks: they are either incompatible with the current eye-tracking ecosystem or provide no formal privacy guarantee. In this paper, we propose Kalεido, an eye-tracking data processing system that (1) provides a formal privacy guarantee, (2) integrates seamlessly with existing eye-tracking ecosystems, and (3) operates in real-time. Kalεido acts as an intermediary protection layer in the software stack of eye-tracking systems. We conduct a comprehensive user study and trace-based analysis to evaluate Kalεido. Our user study shows that the users enjoy a satisfactory level of utility from Kalεido. Additionally, we present empirical evidence of Kalεido’s effectiveness in thwarting real-world attacks on eye-tracking data.
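The style of formal guarantee involved can be illustrated with the textbook Laplace mechanism applied to a single normalized gaze point. This is a simplified sketch under assumed parameters, not Kalεido's actual stream-level mechanism.

```python
import numpy as np

def noisy_gaze(x, y, epsilon, sensitivity=1.0, rng=None):
    # Laplace mechanism: perturb each coordinate of a normalized gaze
    # point with Laplace(sensitivity / epsilon) noise. Smaller epsilon
    # means stronger privacy and noisier (less useful) gaze data.
    rng = rng if rng is not None else np.random.default_rng()
    scale = sensitivity / epsilon
    return x + rng.laplace(0, scale), y + rng.laplace(0, scale)

rng = np.random.default_rng(42)
released = [noisy_gaze(0.5, 0.5, epsilon=1.0, rng=rng) for _ in range(1000)]
xs = [p[0] for p in released]

# Each individual release hides the true point, but the noise is
# zero-mean, so aggregate utility (here, the mean) is roughly preserved.
print(abs(sum(xs) / len(xs) - 0.5) < 0.2)  # True
```

A streaming system additionally has to account for the correlation between consecutive gaze samples, which is where the real design effort lies.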
Advances in deep learning have made face recognition technologies pervasive. While useful to social media platforms and users, this technology carries significant privacy threats. Coupled with the abundant information they have about users, service providers can associate users with social interactions, visited places, activities, and preferences, some of which the user may not want to share. Additionally, facial recognition models used by various agencies are trained on data scraped from social media platforms. Existing approaches to mitigating these privacy risks from unwanted face recognition result in an imbalanced privacy-utility trade-off for users. In this paper, we address this trade-off by proposing Face-Off, a privacy-preserving framework that introduces minor perturbations to the user’s face to prevent it from being correctly recognized. To realize Face-Off, we overcome a set of challenges related to the black-box nature of commercial face recognition services and the scarcity of literature on adversarial attacks against metric networks. We implement and evaluate Face-Off and find that it deceives three commercial face recognition services from Microsoft, Amazon, and Face++. Our user study with 423 participants further shows that the perturbations come at an acceptable cost for the users.
The pervasive use of smart speakers has raised numerous privacy concerns. While work to date provides an understanding of user perceptions of these threats, limited research focuses on how we can mitigate these concerns, either through redesigning the smart speaker or through dedicated privacy-preserving interventions. In this paper, we present the design and prototyping of two privacy-preserving interventions: ‘Obfuscator’, targeted at disabling recording at the microphones, and ‘PowerCut’, targeted at disabling power to the smart speaker. We present our findings from a technology probe study involving 24 households that interacted with our prototypes; the primary objective was to gain a better understanding of the design space for technological interventions that might address these concerns. Our data and findings reveal complex trade-offs among utility, privacy, and usability, and stress the importance of multi-functionality, aesthetics, ease of use, and form factor. We discuss the implications of our findings for the development of subsequent interventions and the future design of smart speakers.
Online services utilize privacy settings to provide users with control over their data. However, these privacy settings are often hard to locate, causing the user to rely on provider-chosen default values. In this work, we train privacy-settings-centric encoders and leverage them to create an interface that allows users to search for privacy settings using free-form queries. To achieve this, we create a custom semantic similarity dataset, which consists of real user queries covering various privacy settings. We then use this dataset to fine-tune state-of-the-art encoders. Using these fine-tuned encoders, we perform semantic matching between user queries and privacy settings to retrieve the most relevant setting. Finally, we also use these encoders to generate embeddings of privacy settings from the top 100 websites and perform unsupervised clustering to learn about the types of online privacy settings.
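The retrieval step can be sketched with a stand-in embedding. The real system uses fine-tuned transformer encoders; the bag-of-words vectors and example settings below are illustrative assumptions, showing only the cosine-similarity matching.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in embedding: a bag-of-words count vector. The actual system
    # embeds text with fine-tuned encoders; only the retrieval logic is
    # the same.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical catalog of privacy settings scraped from a service.
settings = [
    "ad personalization based on your activity",
    "location history tracking",
    "face recognition in photos",
]

query = "stop tracking where I go"
best = max(settings, key=lambda s: cosine(embed(query), embed(s)))
print(best)  # location history tracking
```

With learned embeddings, the same max-over-cosine step works even when the query and setting share no literal words, which is the point of fine-tuning the encoders.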
New advances in machine learning have made Automated Speech Recognition (ASR) systems practical and more scalable. These systems, however, pose serious privacy threats, as speech is a rich source of sensitive acoustic and textual information. Although offline, open-source ASR eliminates the privacy risks, its transcription performance is inferior to that of cloud-based ASR systems, especially for real-world use cases. In this paper, we propose Prεεch, an end-to-end speech transcription system that lies at an intermediate point on the privacy-utility spectrum. It protects the acoustic features of the speakers’ voices and the privacy of the textual content, at improved performance relative to offline ASR. Additionally, Prεεch provides several control knobs to allow a customizable utility-usability-privacy trade-off. It relies on cloud-based services to transcribe a speech file after applying a series of privacy-preserving operations on the user’s side. We perform a comprehensive evaluation of Prεεch, using diverse real-world datasets, that demonstrates its effectiveness. Prεεch provides transcription at a 2% to 32.25% (mean 17.34%) relative improvement in word error rate over Deep Speech, while fully obfuscating the speakers’ voice biometrics and allowing only a differentially private view of the textual content.
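Word error rate, the metric quoted above, is the word-level edit distance (substitutions, insertions, deletions) normalized by the reference length. A minimal implementation:

```python
def wer(reference, hypothesis):
    # Word error rate via dynamic-programming edit distance over words.
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                      # deleting all reference words
    for j in range(len(h) + 1):
        d[0][j] = j                      # inserting all hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return d[-1][-1] / len(r)

# One substitution ("the" -> "a") against a six-word reference.
print(round(wer("the cat sat on the mat", "the cat sat on a mat"), 3))  # 0.167
```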
The EU General Data Protection Regulation (GDPR) is one of the most demanding and comprehensive privacy regulations of all time. A year after it went into effect, we study its impact on the landscape of privacy policies online. We conduct the first longitudinal, in-depth, and at-scale assessment of privacy policies before and after the GDPR. We gauge the complete consumption cycle of these policies, from first user impressions to compliance assessment. We create a diverse corpus of two sets of 6,278 unique English-language privacy policies from inside and outside the EU, covering their pre-GDPR and post-GDPR versions. The results of our tests and analyses suggest that the GDPR has been a catalyst for a major overhaul of the privacy policies inside and outside the EU. This overhaul of the policies, manifesting in extensive textual changes, especially for the EU-based websites, brings mixed benefits to users. While the privacy policies have become considerably longer, our user study with 470 participants on Amazon MTurk indicates a significant improvement in the visual representation of privacy policies from the users’ perspective for the EU websites. We further develop a new workflow for the automated assessment of requirements in privacy policies. Using this workflow, we show that privacy policies cover more data practices and are more consistent with seven compliance requirements after the GDPR. We also assess how transparent the organizations are with their privacy practices by performing specificity analysis. In this analysis, we find evidence for positive changes triggered by the GDPR, with the specificity level improving on average. Still, we find the landscape of privacy policies to be in a transitional phase; many policies still do not meet several key GDPR requirements, or their improved coverage comes with reduced specificity.
Recent advances in machine learning (ML) algorithms, especially deep neural networks (DNNs), have demonstrated remarkable success (sometimes exceeding human-level performance) on several tasks, including face and speech recognition. However, ML algorithms are vulnerable to adversarial attacks, such as test-time, training-time, and backdoor attacks. In test-time attacks, an adversary crafts adversarial examples: perturbations, imperceptible to humans, that force a machine learning model to misclassify an input example when added to it. Adversarial examples are a concern when deploying ML algorithms in critical contexts, such as information security and autonomous driving. Researchers have responded with a plethora of defenses. One promising defense is randomized smoothing, in which a classifier’s prediction is smoothed by adding random noise to the input example we wish to classify. In this paper, we theoretically and empirically explore randomized smoothing. We investigate its effect on the feasible hypothesis space, and show that for some noise levels the set of feasible hypotheses shrinks due to smoothing, giving one reason why natural accuracy drops after smoothing. To perform our analysis, we introduce a model for randomized smoothing that abstracts away specifics, such as the exact distribution of the noise. We complement our theoretical results with extensive experiments.
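The smoothing procedure itself is simple to sketch. The toy 1-D base classifier below is an illustrative assumption, not the abstract model used in our analysis:

```python
import numpy as np

def base_classifier(x):
    # Toy 1-D base classifier: class 1 if x > 0, else class 0.
    return int(x > 0)

def smoothed_classify(x, sigma, n=2000, rng=None):
    # Randomized smoothing: classify n Gaussian-noised copies of the
    # input and return the majority vote.
    rng = rng if rng is not None else np.random.default_rng()
    votes = [base_classifier(x + sigma * z) for z in rng.standard_normal(n)]
    return int(np.mean(votes) > 0.5)

a = smoothed_classify(0.5, sigma=0.25, rng=np.random.default_rng(7))
b = smoothed_classify(-0.5, sigma=0.25, rng=np.random.default_rng(8))
print(a, b)  # 1 0: points well away from the boundary keep their label
```

The smoothed classifier's prediction changes only when the noise pushes a large fraction of copies across the decision boundary, which is the source of both the robustness guarantee and the natural-accuracy drop at high noise levels.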
Biometrics have been widely adopted for enhancing user authentication, benefiting usability by exploiting pervasive and collectible unique characteristics from physiological or behavioral human traits. However, successful attacks on ‘static’ biometrics such as fingerprints have been reported, where an adversary acquires users’ biometrics stealthily and compromises nonresilient biometrics. To mitigate the vulnerabilities of static biometrics, we leverage the unique and nonlinear hand-surface vibration response and design a system called Velody to defend against various attacks, including replay and synthesis. The Velody system relies on two major properties in hand-surface vibration responses: uniqueness, contributed by the physiological characteristics of human hands, and nonlinearity, whose complexity prevents attackers from predicting the response to an unseen challenge. Velody employs a challenge-response protocol. By changing the vibration challenge, the system elicits input-dependent nonlinear ‘symptoms’ and unique spectrotemporal features in the vibration response, stopping both replay and synthesis attacks. Also, a large number of disposable challenge-response pairs can be collected passively during enrollment for daily authentication sessions. We build a prototype of Velody with an off-the-shelf vibration speaker and accelerometers to verify its usability and security through a comprehensive user experiment. Our results show that Velody demonstrates both strong security and long-term consistency, with a low equal error rate (EER) of 5.8% against impersonation attacks while correctly rejecting all other attacks, including replay and synthesis attacks, using a very short vibration challenge.
Audio-based sensing enables fine-grained human activity detection, such as sensing hand gestures and contact-free estimation of the breathing rate. A passive adversary, equipped with microphones, can leverage the ongoing sensing to infer private information about individuals. Further, with multiple microphones, a beamforming-capable adversary can defeat previously proposed privacy-protection obfuscation techniques. Such an adversary can isolate the obfuscation signal and cancel it, even when situated behind a wall. AudioSentry is the first system to address the privacy problem in audio sensing by protecting users against a multi-microphone adversary. It utilizes commodity audio-capable devices, already available in the user’s environment, to form a distributed obfuscator array. AudioSentry packs a novel technique to carefully generate obfuscation beams in different directions, preventing the multi-microphone adversary from canceling the obfuscation signal. AudioSentry then employs a dynamic channel estimation scheme to preserve authorized sensing under obfuscation. It offers the advantages of being practical to deploy and effective against an adversary with a large number of microphones. Our extensive evaluations with commodity devices show that AudioSentry protects the user’s privacy against a 16-microphone adversary with only four commodity obfuscators, regardless of the adversary’s position. AudioSentry provides its privacy-preserving features with little overhead on the authorized sensor.
While generalizing well over natural inputs, neural networks are vulnerable to adversarial inputs. Existing defenses against adversarial inputs have largely been detached from the real world, and they also come at a cost to accuracy. Fortunately, an object possesses invariances that constitute its salient features; breaking them necessarily changes the perception of the object. We find that applying invariances to the classification task makes robustness and accuracy feasible together. Two questions follow: how do we extract and model these invariances, and how do we design a classification paradigm that leverages them to improve the robustness-accuracy trade-off? The remainder of the paper discusses solutions to these questions.
For more than two decades since the rise of the World Wide Web, the “Notice and Choice” framework has been the governing practice for the disclosure of online privacy practices. The emergence of new forms of user interaction, such as voice, and the enforcement of new regulations, such as the EU’s recent General Data Protection Regulation (GDPR), promise to change this privacy landscape drastically. This paper discusses the challenges of providing privacy stakeholders with privacy awareness and control in this changing landscape. We also present our recent research on utilizing machine learning to analyze privacy policies and settings.
Voice has become an increasingly popular User Interaction (UI) channel, with voice-activated devices becoming regular fixtures in our homes. The popularity of voice-based assistants (VAs), however, has brought significant privacy and security threats to their users. Recent revelations have indicated that some VAs record users’ private conversations continuously and surreptitiously. With the VAs connected to the Internet, they can leak the recorded content without the user’s authorization. Moreover, these devices often do not include authentication mechanisms to check whether voice commands are issued by authorized users. To address both shortcomings, we propose a framework that imposes a security and privacy perimeter around the user’s VA. Our proposed framework continuously jams the VA to prevent it from covertly recording the user’s speech, unless the user issues a voice command. To prevent unauthorized voice commands, our framework provides a scheme similar to two-factor authentication that grants access only when the authorized user is in the VA’s vicinity. Our proposed framework achieves both objectives through a combination of several techniques to (a) continuously jam one (or many) of a VA’s microphones in a manner inaudible to the user, and (b) provide only authenticated users easy access to VAs.
Voice has become an increasingly popular User Interaction (UI) channel, mainly contributing to the current trend of wearables, smart vehicles, and home automation systems. Voice assistants such as Alexa, Siri, and Google Now have become everyday fixtures, especially when/where touch interfaces are inconvenient or even dangerous to use, such as while driving or exercising. The open nature of the voice channel makes voice assistants difficult to secure, and hence exposed to various threats, as demonstrated by security researchers. To defend against these threats, we present VAuth, the first system that provides continuous authentication for voice assistants. VAuth is designed to fit in widely adopted wearable devices, such as eyeglasses, earphones/buds, and necklaces, where it collects the body-surface vibrations of the user and matches them with the speech signal received by the voice assistant’s microphone. VAuth guarantees that the voice assistant executes only commands that originate from the voice of the owner. We have evaluated VAuth with 18 users and 30 voice commands and find it to achieve 97% detection accuracy and less than 0.1% false positive rate, regardless of VAuth’s position on the body and the user’s language, accent, or mobility. VAuth successfully thwarts various practical attacks, such as replay attacks, mangled voice attacks, and impersonation attacks. It also incurs low energy and latency overheads and is compatible with most voice assistants.
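The matching step can be illustrated with peak normalized cross-correlation between simulated signals. This is a crude stand-in for VAuth's actual matching algorithm, and all signals below are synthetic:

```python
import numpy as np

def peak_norm_corr(a, b):
    # Peak of the normalized cross-correlation between two signals;
    # near 1.0 when b is a scaled copy of a plus small noise.
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    return float(np.max(np.correlate(a, b, mode="full")))

rng = np.random.default_rng(3)
mic_speech = rng.normal(size=400)                           # voice assistant's microphone
body_vibes = 0.7 * mic_speech + 0.1 * rng.normal(size=400)  # wearable's accelerometer, same speaker
replayed = rng.normal(size=400)                             # attacker: a different voice

print(peak_norm_corr(mic_speech, body_vibes) > 0.9)  # True: same-speaker signals align
print(peak_norm_corr(mic_speech, replayed) < 0.5)    # True: attacker's signal does not
```

A replayed or injected command produces microphone audio with no matching body-surface vibration on the wearable, so the correlation stays low and the command is rejected.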
As interconnected devices become embedded in every aspect of our lives, they bring with them many privacy risks. Location privacy is one notable case: consistently recording an individual’s location might lead to his/her tracking, fingerprinting, and profiling. An individual’s location privacy can be compromised when tracked by smartphone apps, in indoor spaces, and/or through Internet of Things (IoT) devices. Recent surveys have indicated that users genuinely value their location privacy and would like to exercise control over who collects and processes their location data. They, however, lack effective and practical tools to protect their location privacy. An effective location privacy protection mechanism requires a real understanding of the underlying threats, and a practical one requires as few changes to the existing ecosystems as possible while ensuring psychological acceptability to the users. This thesis addresses this problem by proposing a suite of effective and practical privacy-preserving mechanisms that address different aspects of real-world location privacy threats.
First, we present LP-Guardian, a comprehensive framework for location privacy protection for Android smartphone users. LP-Guardian overcomes the shortcomings of existing approaches by addressing the tracking, profiling, and fingerprinting threats posed by different mobile apps while maintaining their functionality. LP-Guardian requires modifying the underlying platform of the mobile operating system, but no changes to either the apps or the service providers. We then propose LP-Doctor, a lightweight user-level tool which allows Android users to effectively utilize the OS’s location access controls. As opposed to LP-Guardian, LP-Doctor requires no platform changes. It builds on a two-year data collection campaign in which we analyzed the location privacy threats posed by 1160 apps for 100 users. For the case of indoor location tracking, we present PR-LBS (Privacy vs. Reward for Location-Based Service), a system that balances the users’ privacy concerns and the benefits of sharing location data in indoor location tracking environments. PR-LBS fits within the existing indoor localization ecosystem, whether it is infrastructure-based or device-based. Finally, we target the privacy threats originating from IoT devices that employ the emerging Bluetooth Low Energy (BLE) protocol through BLE-Guardian. BLE-Guardian is a device-agnostic system that prevents user tracking and profiling while securing access to the user’s BLE-powered devices. We evaluate BLE-Guardian in real-world scenarios and demonstrate its effectiveness in protecting the user, along with its low overhead on the user’s devices.
It is cost-effective to process wireless frames on general-purpose processors (GPPs) in place of dedicated hardware. Wireless operators are decoupling signal processing from base stations and implementing it in a cloud of compute resources, also known as a cloud-RAN (C-RAN). A C-RAN must meet the deadlines of processing wireless frames; for example, 3 ms to transport, decode, and respond to an LTE uplink frame. The design of baseband processing on these platforms is thus a major challenge, for which various processing and real-time scheduling techniques have been proposed.
In this paper, we implement a medium-scale C-RAN-type platform and conduct an in-depth analysis of its real-time performance. We find that the commonly used (e.g., partitioned) scheduling techniques for wireless frame processing are inefficient, as they either over-provision resources or suffer from deadline misses. This inefficiency stems from the large variations in processing times due to fluctuations in wireless traffic. We present a new framework, called RT-OPEX, which leverages these variations and offers a flexible approach to scheduling. RT-OPEX dynamically migrates parallelizable tasks to idle compute resources at runtime, reducing processing times and hence deadline misses at no additional cost. We implement and evaluate RT-OPEX on a commodity GPP platform using realistic cellular workload traces. Our results show that RT-OPEX achieves an order-of-magnitude improvement over existing C-RAN schedulers in meeting frame-processing deadlines.
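The gain from opportunistic migration can be illustrated with a toy comparison between static partitioning and a scheduler that lets any idle core pick up the next frame. This is a drastic simplification of RT-OPEX, which migrates parallelizable subtasks of a frame at runtime; the frame model below is an assumption for illustration only:

```python
import heapq

def misses_partitioned(frames, n_cores, deadline):
    """Static partitioning: frame i is pinned to core i % n_cores, so
    variation in processing times causes queueing and deadline misses."""
    finish = [0.0] * n_cores
    missed = 0
    for i, (arrival, work) in enumerate(frames):
        core = i % n_cores
        finish[core] = max(finish[core], arrival) + work
        if finish[core] - arrival > deadline:
            missed += 1
    return missed

def misses_migrating(frames, n_cores, deadline):
    """Opportunistic execution (RT-OPEX-style, simplified): whichever
    core frees up first takes the next frame, soaking up the slack
    left by short processing times."""
    cores = [0.0] * n_cores   # min-heap of core-free times
    heapq.heapify(cores)
    missed = 0
    for arrival, work in frames:
        start = max(heapq.heappop(cores), arrival)
        end = start + work
        heapq.heappush(cores, end)
        if end - arrival > deadline:
            missed += 1
    return missed
```

Under bursty processing times, the partitioned variant queues long frames behind each other on one core while another core sits idle; the migrating variant absorbs the burst.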
Bluetooth Low Energy (BLE) has emerged as an attractive technology to enable Internet of Things (IoT) devices to interact with others in their vicinity. Our study of the behavior of more than 200 types of BLE-equipped devices has led to a surprising discovery: the BLE protocol, despite its privacy provisions, fails to address the most basic threat of all—hiding the device’s presence from curious adversaries. Revealing the device’s presence is the stepping stone toward more serious threats, including user profiling/fingerprinting, behavior tracking, inference of sensitive information, and exploitation of known vulnerabilities on the device. With thousands of manufacturers and developers around the world, it is very challenging, if not impossible, to envision the viability of any privacy or security solution that requires changes to the devices or the BLE protocol. In this paper, we propose a new device-agnostic system, called BLE-Guardian, that protects the privacy of users/environments equipped with BLE devices/IoTs. It lets users and administrators control who may discover, scan, and connect to their devices. We have implemented BLE-Guardian using Ubertooth One, an off-the-shelf open Bluetooth development platform, facilitating its wide deployment. Our evaluation with real devices shows that BLE-Guardian effectively protects users’ privacy while incurring little overhead on the communicating BLE devices.
With the advance of indoor localization technology, indoor location-based services (ILBS) are gaining popularity. They, however, raise privacy concerns: ILBS providers track users’ mobility to learn more about their behavior, and then provide them with improved and personalized services. Our survey of 200 individuals highlighted their concerns about this tracking and the potential leakage of their personal/private traits, but also showed their willingness to accept reduced tracking for improved service. In this paper, we propose PR-LBS (Privacy vs. Reward for Location-Based Service), a system that addresses these seemingly conflicting requirements by balancing the users’ privacy concerns and the benefits of sharing location information in indoor location tracking environments. PR-LBS relies on a novel location-privacy criterion to quantify the privacy risks of sharing indoor location information. It also employs a repeated play model to ensure that the received service is proportionate to the privacy risk. We implement and evaluate PR-LBS extensively with various real-world user mobility traces. Results show that PR-LBS has low overhead, protects the users’ privacy, and strikes a good tradeoff between the quality of service for the users and the utility of shared location data for service providers.
Traditional mechanisms for delivering notice and enabling choice have so far failed to protect users’ privacy. Users are continuously frustrated by complex privacy policies, unreachable privacy settings, and a multitude of emerging standards. The miniaturization trend of smart devices and the emergence of the Internet of Things (IoT) will exacerbate this problem further. In this paper, we propose Conversational Privacy Bots (PriBots) as a new way of delivering notice and choice through a two-way dialogue between the user and a computer agent (a chatbot). PriBots improve on the state of the art by offering users a more intuitive and natural interface to inquire about their privacy settings, thus allowing them to control their privacy. In addition to presenting the potential applications of PriBots, we describe the underlying system needed to support their functionality. We also delve into the challenges associated with delivering privacy as an automated service. PriBots have the potential to enable the use of chatbots in other related fields where users need to be informed or put in control.
Mobile users are becoming increasingly aware of the privacy threats resulting from apps’ access to their location. Few of the solutions proposed thus far to mitigate these threats have been deployed, as they require either app or platform modifications. Mobile operating systems (OSes) also provide users with location access controls. In this paper, we analyze the efficacy of these controls in combating location-privacy threats. For this analysis, we conducted the first location measurement campaign of its kind, analyzing more than 1000 free apps from Google Play and collecting detailed location usage by more than 400 location-aware apps and 70 Advertisement and Analytics (A&A) libraries from more than 100 participants over periods ranging from 1 week to 1 year. Surprisingly, 70% of the apps and A&A libraries pose considerable profiling threats even when they only sporadically access the user’s location. Existing OS controls are found to be ineffective and inefficient in mitigating these threats, calling for finer-grained location access control. To meet this need, we propose LP-Doctor, a lightweight user-level tool that allows Android users to effectively utilize the OS’s location access controls while maintaining the required app functionality, as our user study (with 227 participants) shows.
Usage behaviors of different smartphone apps capture different views of an individual’s life, and are largely independent of each other. However, in the current mobile app ecosystem, a curious party can covertly link and aggregate usage behaviors of the same user across different apps. We refer to this as unregulated aggregation of app usage behaviors. In this paper, we present a fresh perspective on unregulated aggregation, focusing on monitoring, characterizing, and reducing the underlying linkability across apps. The cornerstone of our study is the Dynamic Linkability Graph (DLG), which tracks app-level linkability during runtime. We observed how the DLG evolves on real-world users and identified real-world evidence of apps abusing IPCs and OS-level identifying information to establish linkability. Based on these observations, we propose a linkability-aware extension to current mobile operating systems, called LinkDroid, which provides runtime monitoring and mediation of linkability across different apps. LinkDroid is a client-side solution compatible with the existing smartphone ecosystem. It helps end-users “sense” this emerging threat and provides them with intuitive opt-out options.
This work proposes a replication scheme implemented on top of a previously proposed system for MANETs that caches submitted queries in special nodes, called query directories, and uses them to locate the data (responses) stored in the nodes that first requested them, called caching nodes. The system, named the distributed cache invalidation method (DCIM), includes client-based mechanisms for keeping the cached data consistent with the data source. In this work, we extend DCIM to handle cache replicas inside the MANET. For this purpose, we utilize a push-based approach within the MANET to propagate server updates to the replicas inside the network. The result is a hybrid approach that combines the benefits of pull approaches for client-server communication with those of push approaches inside the network between the replicas. The approach is analyzed analytically to obtain the appropriate number of replicas, concluding that full replication of the indices of data items at the query directory, together with two-partial replication of the data items themselves, makes the most sense. Simulation results based on ns2 demonstrate the ability of the added replication scheme to lower delays and improve the hit ratio at the cost of mild increases in overhead traffic.
Coverage criteria aim at satisfying test requirements and computing metric values that quantify the adequacy of test suites at revealing defects in programs. Typically, a test requirement is a structural program element, and the coverage metric value represents the percentage of elements covered by a test suite. Empirical studies show that existing criteria might characterize a test suite as highly adequate while it does not actually reveal some of the existing defects. In other words, existing structural coverage criteria are not always sensitive to the presence of defects.
This paper presents PBCOV, a Property-Based COVerage criterion, and empirically demonstrates its effectiveness. Given a program with properties specified therein, static analysis techniques, such as model checking, leverage formal properties to find defects. PBCOV is a dynamic analysis technique that also leverages properties, and is characterized by the following: (a) it considers the state space of first-order logic properties as the test requirements to be covered; (b) it uses logic synthesis to compute the state space; and (c) it is practical, i.e., computable, because it considers an over-approximation of the reachable state space using a cut-based abstraction.
We evaluated PBCOV using programs with test suites comprising passing and failing test cases. First, we computed metric values for PBCOV and structural coverage using the full test suites. Second, in order to quantify the sensitivity of the metrics to the absence of failing test cases, we computed the values for all considered metrics using only the passing test cases. In most cases, the structural metrics exhibited little or no decrease in their values, while PBCOV showed a considerable decrease. This suggests that PBCOV is more sensitive to the absence of failing test cases, i.e., it is more effective at characterizing test suite adequacy to detect defects and at revealing deficiencies in test suites.
As smartphones are increasingly used to run apps that provide users with location-based services, the users’ location privacy has become a major concern. Existing solutions to this concern are deficient in terms of practicality, efficiency, and effectiveness. To address this problem, we design, implement, and evaluate LP-Guardian, a novel and comprehensive framework for location privacy protection for Android smartphone users. LP-Guardian overcomes the shortcomings of existing approaches by addressing the tracking, profiling, and identification threats while maintaining app functionality. We have implemented and evaluated LP-Guardian on Android 4.3.1. Our evaluation results show that LP-Guardian effectively thwarts the privacy threats without deteriorating the user’s experience (less than 10% overhead in delay and energy). Also, LP-Guardian’s privacy protection is achieved at a tolerable loss in app functionality.
The Wireless Access in Vehicular Environments (WAVE) protocol stack has recently been defined to enable vehicular communication on the Dedicated Short Range Communication (DSRC) frequencies. Some recent studies have demonstrated that the WAVE technology might not provide sufficient spectrum for reliable exchange of safety information in congested urban scenarios. In this paper, we address this issue and present a novel cognitive network architecture to dynamically extend the Control Channel (CCH) used by vehicles to transmit safety-related information. To this end, we propose a cooperative spectrum sensing scheme, through which vehicles can detect available spectrum resources in the 5.8 GHz ISM band along their path and forward the data to fixed infrastructure known as Road Side Units (RSUs). We design a novel fuzzy-logic-based spectrum allocation algorithm, through which the RSUs infer the actual CCH contention conditions and dynamically extend the CCH bandwidth in network congestion scenarios, using the vacant frequencies detected by the sensing module. The simulation results reveal the effectiveness of our architecture in providing dynamic and scalable allocation of spectrum resources and in increasing the performance of safety-related applications.
This paper proposes the distributed cache invalidation mechanism (DCIM), a client-based cache consistency scheme implemented on top of a previously proposed architecture for caching data items in mobile ad hoc networks (MANETs), namely COACS, where special nodes cache the queries and the addresses of the nodes that store the responses to these queries. We have previously proposed a server-based consistency scheme, named SSUM; in this paper, we introduce DCIM, which is totally client-based. DCIM is a pull-based algorithm that implements adaptive time to live (TTL), piggybacking, and prefetching, and provides near-strong consistency capabilities. Cached data items are assigned adaptive TTL values that correspond to their update rates at the data source: items with expired TTL values are grouped in validation requests to the data source to refresh them, whereas unexpired ones with high request rates are prefetched from the server. In this paper, DCIM is analyzed to assess the delay and bandwidth gains (or costs) when compared to polling-every-time and push-based schemes. DCIM was also implemented using ns2 and compared against client-based and server-based schemes to assess its performance experimentally. The consistency ratio, delay, and overhead traffic are reported versus several variables, where DCIM proved superior to the other systems.
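The adaptive-TTL, grouped-validation, and prefetching logic can be sketched as follows; the scaling factor and the request-rate threshold are illustrative assumptions, not DCIM's actual parameters:

```python
def adaptive_ttl(update_intervals, factor=0.5):
    """TTL proportional to the item's mean inter-update interval at
    the data source (factor is an illustrative stand-in for DCIM's
    adaptation rule)."""
    return factor * sum(update_intervals) / len(update_intervals)

def partition_cache(cache, now, hot_rate=1.0):
    """Split cached items into a grouped validation request (expired
    TTLs) and a prefetch set (fresh but highly requested), mirroring
    DCIM's two actions. hot_rate is an assumed threshold."""
    validate, prefetch = [], []
    for item, (expiry, request_rate) in cache.items():
        if expiry <= now:
            validate.append(item)   # TTL expired: ask source to refresh
        elif request_rate >= hot_rate:
            prefetch.append(item)   # fresh and hot: fetch proactively
    return validate, prefetch
```

Grouping the expired items into one validation request is what keeps the pull-based design light on bandwidth.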
Mobile devices are getting more pervasive, and it is becoming increasingly necessary to integrate web services into applications that run on these devices. We introduce a novel approach for dynamically invoking web service methods from mobile devices with minimal user intervention, involving only the entry of a search phrase and values for the method parameters. The architecture overcomes the technical challenges of consuming discovered services dynamically by introducing a man-in-the-middle (MIM) server that provides a web service whose responsibility is to discover needed services and build the client-side proxies at runtime. The architecture offloads to the MIM server energy-consuming tasks that would otherwise run on the mobile device. Such tasks involve communication with servers over the Internet, XML parsing of files, and on-the-fly compilation of source code. We perform extensive evaluations of the system performance to measure scalability, as it relates to the capacity of the MIM server in handling mobile client requests, and the device battery power savings resulting from delegating the service discovery tasks to the server.
This paper proposes a data replication scheme implemented on top of a cooperative data caching architecture for MANETs that caches submitted queries in special nodes, called query directories (QDs), and uses them to locate the data (responses) stored in the nodes that first requested them, called caching nodes (CNs). The QD entries are replicated according to a cost-minimization model, and the actual data items are placed in nearby CNs. The proposed system is dynamic, as it adapts to topology changes and relocates replicas as necessary. A preliminary prototype of the proposed method is simulated using ns2 to assess its performance experimentally. Enhancements in performance, in terms of lowered access delay and improved hit rates, are reported, while maintaining a cap on overhead traffic.
Mobile Ad hoc Networks (MANETs) have become increasingly popular with the rapid emergence of hand-held devices and advanced communication technologies. As a result, several MANET applications have been proposed, one of which is the data access application. To enhance the performance of this application, cache management systems have been suggested; however, they have been designed without regard to the privacy concerns they raise. We study the cache management system COACS (a COoperative and Adaptive Caching System for MANETs) and its weaknesses in terms of privacy, and propose a privacy-preserving protocol to render such a caching system well protected against all kinds of internal or external privacy breaches. We also provide a mathematical analysis to measure the system’s degree of anonymity.
Port scanning is the most popular reconnaissance technique attackers use to discover services they can break into. Port scan detection has received a lot of attention from researchers. However, a slow port scan attack can deceive most of the existing Intrusion Detection Systems (IDSs). In this paper, we present a new, simple, and efficient method for detecting slow port scans. Our proposed method is composed of two phases: (1) a feature collection phase that analyzes network traffic and extracts the features needed to classify a certain IP as malicious or not; and (2) a classification phase that divides the IPs, based on the collected features, into three groups: normal IPs, suspicious IPs, and scanner IPs. The IPs our approach classifies as suspicious are kept for the next K time windows for further examination, to decide whether they represent scanners or legitimate users. Hence, this approach differs from the traditional approach used by IDSs, which classifies IPs as either legitimate or scanners and thus produces a high number of false positives and false negatives. A small Local Area Network was put together to test our proposed method. The experiments show the effectiveness of our method in correctly identifying malicious scanners when both normal and slow port scans were performed using the three most common TCP port-scanning techniques. Moreover, our method detects malicious scanners that are otherwise not detected by well-known IDSs such as Snort.
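The two-phase idea can be sketched as below. The feature thresholds and the consecutive-window rule are illustrative assumptions, not the paper's exact classifier:

```python
def classify_ip(distinct_ports, fail_ratio,
                port_hi=100, port_lo=10, fail_hi=0.8):
    """Three-way classification in the spirit of the second phase:
    clear scanners are flagged immediately, clearly benign IPs pass,
    and borderline IPs are marked suspicious for re-examination.
    All thresholds here are assumed values for illustration."""
    if distinct_ports >= port_hi and fail_ratio >= fail_hi:
        return "scanner"
    if distinct_ports < port_lo:
        return "normal"
    return "suspicious"

def reexamine(history, k=3):
    """A suspicious IP is promoted to scanner only if it stays
    non-normal for k consecutive windows; this is how a slow scan,
    invisible in any single window, eventually surfaces."""
    recent = history[-k:]
    if len(recent) == k and all(label != "normal" for label in recent):
        return "scanner"
    return "suspicious"
```

Keeping a third "suspicious" bucket across windows is what separates this design from the binary legitimate/scanner decision of traditional IDSs.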
The Wireless Access in Vehicular Environments (WAVE) protocol stack is one of the most important protocols proposed to standardize and allocate spectrum for vehicle-to-vehicle and vehicle-to-infrastructure communication. In a previous work, we proved that WAVE faces a spectrum scarcity problem which hinders the reliable exchange of safety information. To overcome this problem, we proposed a system that applies cognitive-network principles to WAVE to increase the spectrum allocated to the control channel (CCH) by the IEEE 802.11p amendment, where all safety information is transmitted. However, the decision-making process in our previous work did not utilize the extra spectrum efficiently, as it was not allocated according to the contention level experienced by the vehicle. In this paper, we propose a system that employs a fuzzy logic system (FLS) to dynamically assign additional spectrum from the ISM band to the CCH. This system, which we call FCVANET, assigns the minimum additional bandwidth necessary to relieve the contention. The FLS takes two inputs, the message delay and the number of un-transmitted packets, and utilizes a feedback loop. Our simulations show that the proposed system allocates bandwidth more efficiently, in accordance with the contention level faced by the vehicles. The system succeeds in relieving contention by reducing the delay and the number of un-transmitted packets.
The Wireless Access in Vehicular Environments (WAVE) protocol stack is the most important protocol used to allocate spectrum for vehicular communication, yet its ability to provide reliable exchange of safety information is questionable. In a previous work, we suggested a system that employs cognitive-network principles to increase the spectrum allocated to the control channel (CCH) by the IEEE 802.11p amendment, where all safety information is transmitted. However, the decision-making process implemented in that work does not differentiate between contention levels and does not relate the measured contention precisely to the amount of spectrum needed, which leads to inefficient utilization of the white spectrum. In order to assign the minimum additional bandwidth necessary to relieve the contention, we suggest in this paper a new system that quantifies contention into multiple levels of severity based on fuzzy logic and maps additional spectrum correspondingly. Simulations show the effectiveness of the system in allocating the minimum needed bandwidth to relieve contention, without affecting other QoS parameters such as delay and the number of un-transmitted packets.
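The severity-to-spectrum mapping can be sketched with a tiny Mamdani-style fuzzy system over a single contention input; the membership breakpoints and the 0/5/10 MHz grants are illustrative assumptions, not the paper's rule base:

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def extra_bandwidth(delay_ms):
    """Map a measured contention delay to additional CCH bandwidth
    (MHz) by fuzzifying the delay into low/medium/high severity and
    defuzzifying with a weighted average. All numbers are assumed."""
    low = tri(delay_ms, -1.0, 0.0, 40.0)
    med = tri(delay_ms, 20.0, 50.0, 80.0)
    high = tri(delay_ms, 60.0, 100.0, 141.0)
    weight = low + med + high
    if weight == 0.0:
        return 10.0  # beyond all memberships: grant the maximum
    # weighted average of the per-severity bandwidth grants
    return (low * 0.0 + med * 5.0 + high * 10.0) / weight
```

Because the memberships overlap, the grant varies smoothly with the measured contention instead of jumping between fixed allocations.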
The Wireless Access in Vehicular Environments (WAVE) protocol stack is one of the most important protocols used to allocate spectrum for vehicular communication. In a previous work, we proved that WAVE does not provide sufficient spectrum for reliable exchange of safety information; more specifically, the safety message delay is unacceptable and exceeds application requirements. In this paper, we propose a system that provides Data delivery guarantees using Cognitive-network principles in congested Vehicular ad hoc networks, which we refer to as DCV. Our goal is to ensure that all safety packets are generated and transmitted during the same interval. The system monitors the contention delay experienced by cars on the control channel, where all safety packets should be transmitted. If the sensed contention delay exceeds a delay threshold γ, the Road Side Unit (RSU) increases the spectrum allocated to the control channel using cognitive networks. The RSU employs a feedback control design in which additional bandwidth is added to drive the contention delay below the threshold γ, which serves as the reference input for the controller. Analysis and simulations indicate the effectiveness of the system in providing data delivery guarantees in vehicular networks and thus increasing safety on the road.
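The RSU's feedback loop can be sketched as a simple proportional controller; the gain kp and the step cap are illustrative assumptions, not the paper's controller design:

```python
def control_bandwidth(measure_delay, gamma, bw0, kp=0.5, max_steps=20):
    """Grow the CCH bandwidth until the sensed contention delay falls
    to or below the reference gamma. measure_delay(bw) stands in for
    the RSU's sensing of contention at the current allocation."""
    bw = bw0
    for _ in range(max_steps):
        delay = measure_delay(bw)
        error = delay - gamma
        if error <= 0:
            break               # delay at or below the reference: done
        bw += kp * error        # add bandwidth in proportion to the error
    return bw
```

With any plant in which delay decreases as bandwidth grows, the loop keeps adding spectrum until the sensed delay drops below γ.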
Researchers have suggested Vehicular Ad hoc Networks as a way to enable car-to-car communications and to allow for the exchange of safety and other types of information among cars. The Wireless Access in Vehicular Environments (WAVE) protocol stack is standardized by the IEEE, and it allocates spectrum for vehicular communication. In our work, we prove that it does not provide sufficient spectrum for the reliable exchange of safety information. To alleviate this problem, we present a system that employs cognitive-network principles to increase the spectrum allocated to the control channel (CCH) by the WAVE protocols, where all safety information is transmitted. To accomplish this objective, the proposed system relies on sensed data sent by the cars to road side units, which in turn forward the aggregated data to a processing unit. The processing unit infers data contention locations and generates spectrum schedules to dispatch to the passing cars. Analysis and simulation results indicate the effectiveness of the system in improving data delivery in vehicular networks and thus increasing the reliability of safety applications.
This work builds on the LIME (Linda in mobile environment) tuple space framework to implement a system that offers clustering and routing capabilities for mobile ad hoc network (MANET) environments, and provides an agent-like architecture for running distributed and collaborative applications on mobile devices. This paper describes the components added to the LIME system, which were necessary to implement engagement and disengagement of hosts into and out of spaces, and illustrates the developed engagement mechanism and routing protocol with the aid of example scenarios. The paper then discusses the system performance obtained from implementing its functions in the ns-2 network simulation software. The obtained results indicate that the system works reasonably well under different conditions (host transmission range, host mobility, and density of hosts in the network). For instance, the time for a host to join a space is well under one second in sparse spaces and goes up to only two seconds in moderately dense spaces. Moreover, the system offers routing performance that is moderately better than that of ZRP, both in terms of route discovery delay and generated traffic.
In this paper, we present a novel integrated method for testing gate-oxide shorts due to pinhole defects in the gate oxide of CMOS circuits using a wavelet transform-based transient current (iDDT) analysis technique. Wavelet transform has the property of resolving events in both time and frequency domains unlike Fourier transform which decomposes a signal in frequency components only. The proposed method is based on switching the CMOS gate, monitoring the wavelet transform of the transient current and comparing it to the one of the defect-free gate. The MOS transistor is modeled using a two-dimensional non-linear split model. Simulation results on the circuit under test show that wavelet transform has higher fault detection sensitivity than Fourier or peak-current value comparison methods and hence, can be considered very promising for defect oriented testing of gate-oxide shorts.
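The time-frequency localization that distinguishes the wavelet approach from Fourier analysis can be illustrated with a single-level Haar transform over sampled iDDT values; the comparison rule and the tolerance are illustrative assumptions, not the paper's detection procedure:

```python
def haar_level(signal):
    """One level of the Haar wavelet transform: pairwise averages
    (approximation) and pairwise differences (detail). Unlike a
    Fourier magnitude, the detail coefficients stay localized in
    time, so a brief current anomaly remains visible."""
    approx = [(signal[i] + signal[i + 1]) / 2
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / 2
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def gate_oxide_short(iddt, reference, tol=0.1):
    """Flag a defect when any detail coefficient of the measured
    transient current deviates from the defect-free reference by
    more than tol (an assumed tolerance)."""
    _, measured = haar_level(iddt)
    _, expected = haar_level(reference)
    return max(abs(m - e) for m, e in zip(measured, expected)) > tol
```

A short-lived deviation in the transient current shifts only the detail coefficients near its position, which is what makes the comparison sensitive.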
Time-based localization utilizing cellular communication networks has been investigated as a complement to Global Navigation Satellite Systems (GNSS) for critical scenarios, such as indoor or urban-canyon areas. With suitable Hybrid Data Fusion (HDF) algorithms that combine information from GNSS and terrestrial cellular networks, the estimated position accuracy can be improved. However, the wave propagation characteristics for joint GNSS and terrestrial mobile-radio-based localization have not yet been studied. Therefore, a measurement campaign for GNSS at 1.51 GHz and terrestrial radio at 5.2 GHz was performed. In this paper, an analysis of the outdoor-to-indoor channel for joint localization is presented. It turns out that the Time of Arrival (ToA) bias, which is the difference between the geometric distance and the distance traveled by the first incoming wave, depends on the elevation angle of the incoming rays seen from the building to the transmitter. A comparison between the two carrier frequencies is also addressed.
This paper describes a fast HTML web page change detection approach that saves computation time by limiting the similarity computations between two versions of a web page to nodes having the same HTML tag type, and by hashing the web page in order to provide direct access to node information. This efficient approach is suitable as a client application and for implementing server applications that could serve the needs of users in monitoring modifications made to HTML web pages over time, and that allow for reporting and visualizing changes and trends in order to gain insight into the significance and types of such changes. The detection of changes across two versions of a page is accomplished by performing similarity computations after transforming the web page into an XML-like structure in which a node corresponds to an open–close HTML tag pair. Performance and detection reliability results were obtained, showing speed improvements over a previous approach.
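The tag-bucketing shortcut can be sketched as follows; the (tag, text) node representation and the exact-membership test stand in for the paper's similarity computations and are illustrative assumptions:

```python
from collections import defaultdict

def index_by_tag(nodes):
    """Bucket (tag, text) nodes by HTML tag so that comparisons are
    limited to nodes of the same tag type, giving direct access to
    each bucket instead of comparing every node pair."""
    buckets = defaultdict(list)
    for tag, text in nodes:
        buckets[tag].append(text)
    return buckets

def changed_nodes(old_nodes, new_nodes):
    """List nodes of the new version whose text does not appear under
    the same tag in the old version (exact membership is a simplified
    stand-in for a similarity score)."""
    old_index = index_by_tag(old_nodes)
    return [(tag, text) for tag, text in new_nodes
            if text not in old_index.get(tag, [])]
```

Restricting each lookup to a single tag bucket is where the computation saving comes from: a changed `<p>` is never compared against every `<h1>` on the page.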
Vehicular ad hoc networks, also known as VANETs, constitute a major pillar in making the dream of an Intelligent Transportation System (ITS) come true. By enabling vehicles to communicate with each other, it would be possible to have safer and more efficient roads where drivers and concerned authorities are supplied with timely information. Based on short- to medium-range communication systems, VANETs can enable both safety and entertainment types of applications to become reality. Unfortunately, the application layer has not received sufficient attention. Although some ongoing projects have touched on the subject, their work does not seriously cover issues dealing with actual implementations of VANET scenarios. This paper describes some application-layer scenarios which we developed using the network simulator ns2. We describe the limitations of ns2 as concerns VANET simulations and our implemented solution, and then move on to car-braking and lane-changing scenarios in order to demonstrate how such applications may work.
In spite of the existence of several systems that organize and offer information to users (e.g., the Internet, intranets, or private databases), finding the desired data is still a time-consuming task. ASKME solves this problem by providing users with a collaborative learning environment which evolves through direct and indirect user contributions. The system includes a credit/debit scheme to ensure that users participate in providing answers and feedback. The system can also provide users with online material that it has located. The current system is implemented as a web server, and thus the knowledge base (KB) is centralized. Future plans include allowing for distributed data.