Understanding Android Application Security

Explore the security challenges of Android applications, study popular free apps, analyze code for vulnerabilities, and address the misuse of personal data and identifiers. Discover the complexities of Android system architecture and the need for improved security measures.

  • Android Security
  • Smartphone Apps
  • Application Analysis
  • Mobile Security
  • Privacy Concerns


Presentation Transcript


  1. A Study of Android Application Security
      • William Enck, Damien Octeau, Patrick McDaniel, Swarat Chaudhuri
      • Systems and Internet Infrastructure Security Laboratory, The Pennsylvania State University
      • SAKWANNUENG TRAKOOLSHOKE-SATIAN, UNIVERSITY OF SOUTH FLORIDA
      • 9 NOVEMBER 2015

  2. Outline
      • Introduction: understand smartphone application security
      • Background: What is Android? What are the Android system components?
      • The ded decompiler: How does it work?
      • Evaluating Android Security: focuses of analysis
      • Application Analysis Results: program analysis results
      • Limitations and Conclusion: the research's limitations and observations on the results

  3. Introduction: A Study of Android Application Security
      • Seeks to better understand smartphone application security
      • Studies 1,100 popular free Android applications
      • Introduces the ded decompiler
      • Analyzes 21 million lines of recovered code
      • Uncovers pervasive use and misuse of personal and phone identifiers
      • Finds deep penetration of advertising and analytic networks

  4. Introduction
      • Enormous security challenges
          • Rapidly developed and deployed applications, coarse permission systems, privacy-invading behaviors, malware, and limited security models have led to exploitable phones and applications
          • There is no common definition of security, and the volume of applications is large
          • Malicious, questionable, and vulnerable applications will find their way to the market
      • This paper broadly characterizes the security of applications in the Android Market
          • Design and implement a Dalvik decompiler, ded
          • Analyze the code using automated tests and manual inspection
          • Identify the root causes of any discovered vulnerabilities

  5. Background: Android
      • An OS designed for smartphones
      • Provides a sandboxed application execution environment
      • A customized embedded Linux system interacts with the phone hardware
      • The binder middleware and application API run on top of Linux
      • An application's only interface to the phone is through these APIs

  6. Background: Android system architecture
      • Each application executes within a Dalvik Virtual Machine (DVM) under a unique UNIX uid
      • Applications interact with each other and the phone through IPC
      • Intents are typed interprocess messages directed to particular applications or system services

  7. Background: Android system architecture
      • Persistent content provider data stores are queried through SQL-like interfaces
      • Background services provide RPC and callback interfaces that trigger actions or access data
      • UI activities receive named action signals from the system and other applications
      • Access to system resources, data, and IPC is governed by permissions assigned at install time

  8. Background: Android system architecture
      • An application's permissions are defined in its manifest file
      • An application is allowed to access a resource or interface if the required permission allows it
      • The user is presented with a screen listing the permissions an application requires
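The manifest-based permission model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the manifest text and the package name `com.example.demo` are made up, the manifest is assumed to be plain-text XML (packaged applications store it in a binary form), and `may_access` is a hypothetical, heavily simplified stand-in for the platform's real check.

```python
import xml.etree.ElementTree as ET

# Standard Android XML namespace used for attributes such as android:name.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical manifest, made up for this example.
MANIFEST = """\
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
    <uses-permission android:name="android.permission.INTERNET"/>
    <uses-permission android:name="android.permission.READ_PHONE_STATE"/>
</manifest>
"""


def requested_permissions(manifest_xml):
    """Return the permission names listed in <uses-permission> elements."""
    root = ET.fromstring(manifest_xml)
    key = "{%s}name" % ANDROID_NS
    return [el.get(key) for el in root.iter("uses-permission")]


def may_access(required_permission, granted):
    """Simplified check: access is allowed only if the permission was granted."""
    return required_permission in granted


granted = requested_permissions(MANIFEST)
print(granted)
print(may_access("android.permission.ACCESS_FINE_LOCATION", granted))
```

The list returned by `requested_permissions` corresponds to what the install-time screen would show the user in this simplified model.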

  9. Background: Dalvik Virtual Machine (DVM)
      • Android applications are written in Java but run in the DVM
      • The DVM and Java bytecode run-time environments differ substantially
      • Java applications are composed of one or more .class files, one file per class
      • The JVM loads the bytecode for a Java class from the associated .class file as it is referenced at run time
      • A Dalvik application consists of a single .dex file containing all application classes

  10. Background: Dalvik Virtual Machine (DVM)
      • After the Java compiler creates JVM bytecode, the Dalvik dx compiler consumes the .class files, recompiles them to Dalvik bytecode, and writes the resulting application into a single .dex file
      • This process consists of the translation, reconstruction, and interpretation of three basic elements of the application
          • The constant pool describes the constants used by a class; it includes, among other items, references to other classes, method names, and numerical constants
          • The class definition consists of basic information, e.g. access flags and class names
          • The data element contains the method code executed by the target VM, as well as other information related to methods and to class and instance variables

  11. Background: DVM vs. JVM
      • Register architecture
          • DVM: register based; assigns local variables to any of the 2^16 available registers and manipulates registers directly
          • JVM: stack based; assigns local variables to a local variable table and pushes them onto an operand stack
      • Instruction set
          • DVM: 218 opcodes; instructions include the source and destination registers
          • JVM: 200 opcodes; tens of opcodes are dedicated to moving elements between the stack and the local variable table
      • On average, Dalvik code has 30% fewer instructions than Java code but a 35% larger code size
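To make the register-versus-stack contrast concrete, here is a minimal sketch of two toy interpreters that both compute `c = a + b`. The opcode names and instruction encodings are invented for illustration and are not real Dalvik or JVM opcodes.

```python
def run_stack(code, local_vars):
    """Toy stack machine: values move between a local variable table and an operand stack."""
    stack = []
    for op, *args in code:
        if op == "load":       # push a local variable onto the operand stack
            stack.append(local_vars[args[0]])
        elif op == "add":      # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "store":    # pop the result into a local variable
            local_vars[args[0]] = stack.pop()
    return local_vars


def run_register(code, regs):
    """Toy register machine: each instruction names its destination and source registers."""
    for op, dst, src1, src2 in code:
        if op == "add":
            regs[dst] = regs[src1] + regs[src2]
    return regs


# c = a + b, with a, b, c in slots 0, 1, 2 (hypothetical encodings).
stack_code = [("load", 0), ("load", 1), ("add",), ("store", 2)]
register_code = [("add", 2, 0, 1)]

print(run_stack(stack_code, [3, 4, 0]))
print(run_register(register_code, [3, 4, 0]))
```

The register version needs one instruction where the stack version needs four, but that one instruction carries three operands. This is the same trade-off the slide reports: fewer instructions, larger code.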

  12. Background: DVM vs. JVM
      • Constant pool structure
          • DVM: a single pool that all classes simultaneously reference; constant values are inlined into the bytecode
          • JVM: elements are replicated in the constant pools of the multiple .class files
      • Control flow structure
          • DVM: bytecode structure does not mirror the source code
          • JVM: bytecode structure loosely mirrors the source code
      • Ambiguous primitive types
          • DVM: uses the same opcodes for integers and floats
          • JVM: distinguishes between int, float, long, and double

  13. Background: DVM vs. JVM
      • Null references
          • DVM: uses a zero-value constant as null
          • JVM: has a null value
      • Comparison of object references
          • DVM: uses a comparison between two integers and a comparison between an integer and zero
          • JVM: uses typed opcodes for the comparison of object references and for null comparison of objects
      • Storage of primitive types in arrays
          • DVM: uses ambiguous opcodes to store and retrieve elements in arrays of primitive types; the array type must be recovered for correct translation
          • JVM: unambiguous

  14. The ded decompiler
      • Building a decompiler from DEX to Java proved to be surprisingly challenging
          • Java decompilation has been studied since the 1990s
          • Prior to our work, no functional tools existed for Dalvik bytecode
          • The vast differences between the JVM and the DVM made simple modification of existing decompilers impossible
      • The choice was made to decompile to Java source rather than operate on the DEX opcodes directly
          • Leverages existing tools for code analysis
          • Access to source code was required to identify false positives resulting from automated code analysis
          • Allows manual confirmation
      • The decompiler is freely available at http://siis.cse.psu.edu/ded

  15. The ded decompiler
      • ded extraction occurs in three stages:
          • Retargeting
          • Optimization
          • Decompilation

  16. The ded decompiler: Application retargeting
      • Recovering typing information
      • Translating the constant pool
      • Retargeting the bytecode
      • Type inference
          • Identify class and method constants and variables
          • Only the variable width (32 or 64 bits) is known, not the type
          • Dalvik bytecode does not distinguish integer comparison from object reference comparison
          • Unknown types are determined by how they are used in operations with operands of known type

  17. The ded decompiler: Type inference
      • ded adopts the approach of inferring a register's type from how the register is used
      • Dalvik bytecode reuses registers that are no longer in scope
      • ded infers a register's type in three ways
          • Comparison with a value of known type
          • Types associated with instructions
          • Passing a register to a method, or using a return value, exposes the type via the method signature

  18. The ded decompiler: ded type inference algorithm
      • Identify ambiguous register declarations
      • Each branch of the control flow is pushed onto an inference stack
      • When a branch is abandoned, the next branch is popped off the stack and the search continues
      • Type information is forward propagated, modulo register reassignment, through the control flow graph
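The inference-stack idea above can be sketched roughly as follows. This is not ded's implementation; it is a minimal illustration that assumes a hypothetical control flow graph encoding in which each node records the types its instruction exposes (`uses`), the registers it overwrites (`defs`), and its successors.

```python
def infer_type(cfg, start, reg):
    """Search forward from an ambiguous declaration of `reg` at node `start`.

    cfg maps node id -> (uses, defs, successors), where
      uses: dict of register -> type exposed by the instruction at that node
      defs: set of registers the instruction overwrites
    Returns the inferred type, or None if no branch exposes one.
    """
    stack = list(cfg[start][2])   # branches still to explore (the inference stack)
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        uses, defs, succs = cfg[node]
        if reg in uses:           # a typed use exposes the register's type
            return uses[reg]
        if reg in defs:           # register reassigned: abandon this branch
            continue
        stack.extend(succs)       # otherwise keep propagating forward
    return None


# Hypothetical method body: v0 is declared ambiguously at node 0.
CFG = {
    0: ({}, {"v0"}, [1]),                               # const v0 (int or float?)
    1: ({}, set(), [3, 2]),                             # conditional branch
    2: ({}, {"v0"}, [4]),                               # v0 overwritten on this path
    3: ({"v0": "float", "v2": "float"}, {"v1"}, [4]),   # float addition using v0
    4: ({}, set(), []),                                 # return
}

print(infer_type(CFG, 0, "v0"))
```

In this example the branch through node 2 is abandoned because `v0` is reassigned there, and the search continues with the branch through node 3, where a float operation exposes the type.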

  19. The ded decompiler: Constant pool conversion
      • Dalvik maintains a single constant pool for the application; Java maintains one for each class
      • Dalvik bytecode places primitive type constants directly in the bytecode; Java bytecode uses the constant pool for most references
      • The conversion of constant pool information proceeds as follows:
          • Identify which constants are needed for a .class file
          • Once ded identifies the constants required by a class, it adds them to the target .class file
          • For primitive type constants, new entries are created
          • For class, method, and instance variable references, the created Java constant pool entries are based on the Dalvik constant pool entries
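A rough sketch of the conversion steps above, under simplifying assumptions: pool entries are plain tuples, each class's references into the shared pool are already known, and inlined primitive constants are given as a list per class. The function and data layout are hypothetical, not ded's.

```python
def build_class_pools(dex_pool, class_refs, inline_constants):
    """Split one shared pool into per-class pools.

    dex_pool: list of shared pool entries for the whole application
    class_refs: dict of class name -> indices into dex_pool that the class references
    inline_constants: dict of class name -> primitive constants inlined in its bytecode
    """
    pools = {}
    for cls, indices in class_refs.items():
        pool = []
        for idx in indices:
            entry = dex_pool[idx]            # entry based on the shared pool entry
            if entry not in pool:
                pool.append(entry)
        for value in inline_constants.get(cls, []):
            entry = ("const", value)         # new entry for an inlined primitive
            if entry not in pool:
                pool.append(entry)
        pools[cls] = pool
    return pools


# Hypothetical application with two classes sharing one pool.
DEX_POOL = [("class", "java/lang/String"), ("method", "println"), ("class", "Foo")]
CLASS_REFS = {"A": [0, 1], "B": [0, 2]}
INLINE = {"A": [42]}

pools = build_class_pools(DEX_POOL, CLASS_REFS, INLINE)
print(pools["A"])
print(pools["B"])
```

Note that the `java/lang/String` entry ends up in both per-class pools, which mirrors the replication across .class files mentioned in the DVM and JVM comparison.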

  20. The ded decompiler: Method code retargeting
      • Preprocess the bytecode to reorganize structures that cannot be directly retargeted
      • Linearly traverse the DVM bytecode and translate it to JVM bytecode
      • ded reorders and annotates the bytecode with array size and type information for translation
      • Bytecode translation linearly processes each Dalvik instruction:
          • maps each referenced register to a Java local variable table index
          • performs an instruction translation for each encountered Dalvik instruction
          • patches the relative offsets used for branches based on preprocessing annotations
          • defines exception tables that describe try/catch/finally blocks
      • The resulting translated code is combined with the constant pool to create a legal Java .class file
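Branch offsets need patching because one register-based instruction may expand into several stack-based ones, which shifts every later instruction. The following is a minimal sketch of that single step, with assumptions stated plainly: opcode names are invented, offsets are counted in instructions rather than bytes, and a branch is always the last instruction of its expansion.

```python
def retarget(dalvik_code, expansion):
    """Translate instructions linearly and patch relative branch offsets.

    dalvik_code: list of (opcode, relative branch offset in instructions, or None)
    expansion: dict of opcode -> list of target opcodes it translates to
    Returns a list of (target opcode, patched relative offset or None).
    """
    # First pass: record where each source instruction starts in the output.
    starts, pos = [], 0
    for op, _ in dalvik_code:
        starts.append(pos)
        pos += len(expansion[op])

    # Second pass: emit the translation and rewrite branch offsets.
    out = []
    for i, (op, offset) in enumerate(dalvik_code):
        emitted = expansion[op]
        for j, target_op in enumerate(emitted):
            if offset is not None and j == len(emitted) - 1:
                here = starts[i] + j
                out.append((target_op, starts[i + offset] - here))
            else:
                out.append((target_op, None))
    return out


# Hypothetical opcodes: a conditional branch that skips an addition.
EXPANSION = {
    "if-eqz": ["load", "ifeq"],
    "add": ["load", "load", "add", "store"],
    "return": ["return"],
}
CODE = [("if-eqz", 2), ("add", None), ("return", None)]

translated = retarget(CODE, EXPANSION)
print(translated)
```

The source branch jumps 2 instructions ahead, while the translated branch must jump 5 to reach the same `return`.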

  21. The ded decompiler: Optimization and decompilation
      • The retargeted .class files can be decompiled using Fernflower or Soot
      • ded's bytecode translation yields unoptimized Java code
      • Without optimization, the decompiled code is complex and hard to analyze
      • We use Soot to optimize
      • Soot is an optimization tool with the ability to recover source code

  22. The ded decompiler: Source code recovery validation
      • The recovered code was virtually indistinguishable from the original source
      • We recovered the source code for the top 50 free applications from each of the 22 application categories, 1,100 in total
      • The applications were obtained on September 1, 2010
      • Recovery took 497.7 hours, or 20.7 days

  23. The ded decompiler: Categories of failure
      • Retargeting failures (0.59%)
          • Unresolved references
          • Type violations
          • Illegal bytecode
      • Decompilation failures
          • Decompilation limitations

  24. Evaluating Android Security
      • Focus of analysis:
          • Exploring issues uncovered in previous studies and malware advisories
          • Searching for general coding security failures
          • Exploring misuse and security failures in the use of the Android framework
      • Four approaches to evaluating recovered source code:
          • Control flow analysis
          • Data flow analysis
          • Structural analysis
          • Semantic analysis

  25. Evaluating Android Security: Control flow analysis
      • Imposes constraints on the sequences of actions executed by an input program P, classifying some of them as errors
      • A control flow rule is an automaton A whose input words are sequences of actions of P
      • An erroneous action sequence is one that drives A into a predefined error state
      • To statically detect violations specified by A, the program analysis traces each control flow path in the tool's model of P
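A control flow rule of this kind can be sketched as a small automaton that is run over each action sequence. The rule below ("recording must be preceded by a user prompt") and its action names are hypothetical, chosen only to illustrate the mechanism; the study's actual rules and tooling are not reproduced here.

```python
ERROR = "error"

# Hypothetical rule: (state, action) -> next state.
RULE = {
    ("start", "show_prompt"): "prompted",
    ("start", "start_recording"): ERROR,
    ("prompted", "start_recording"): "recording",
}


def violates(rule, actions, initial="start"):
    """Return True if the action sequence drives the automaton into the error state."""
    state = initial
    for action in actions:
        # Actions without a transition leave the state unchanged.
        state = rule.get((state, action), state)
        if state == ERROR:
            return True
    return False


def check_paths(rule, paths):
    """Return the control flow paths (action sequences) that violate the rule."""
    return [path for path in paths if violates(rule, path)]


paths = [
    ["init", "show_prompt", "start_recording"],
    ["init", "start_recording"],
]
print(check_paths(RULE, paths))
```

A static analyzer would enumerate the paths from its model of the program rather than receive them as a list; the list here stands in for that step.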

  26. Evaluating Android Security: Data flow analysis
      • Permits the declarative specification of problematic data flows in the input program
      • An Android phone contains several pieces of private information that should never leave the phone: the user's phone number, IMEI (device ID), IMSI (subscriber ID), and ICC-ID (SIM card serial number)
      • We check that this information is not leaked to the network
      • The specification declaratively labels program statements matching certain syntactic patterns as data flow sources and sinks
      • Data flows between sources and sinks are violations
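The source-and-sink idea can be illustrated with a very small taint tracker over straight-line code. The source names are the Android `TelephonyManager` getters for the four identifiers listed above; the sink name `sendToNetwork` and the statement encoding are hypothetical, and a real analysis would also handle branches, method calls, and aliasing.

```python
# TelephonyManager getters for IMEI, IMSI, ICC-ID, and phone number.
SOURCES = {"getDeviceId", "getSubscriberId", "getSimSerialNumber", "getLine1Number"}
SINKS = {"sendToNetwork"}   # hypothetical network sink


def find_leaks(statements):
    """Report (source, sink) pairs for data that flows from a source to a sink.

    Each statement is (kind, target, func, args):
      ("call", target or None, function name, argument variables)
      ("assign", target, None, variables the value is computed from)
    """
    tainted = {}   # variable -> source it derives from
    leaks = []
    for kind, target, func, args in statements:
        origin = next((tainted[a] for a in args if a in tainted), None)
        if kind == "call" and func in SOURCES:
            origin = func
        if kind == "call" and func in SINKS and origin is not None:
            leaks.append((origin, func))
        if target is not None:
            if origin is not None:
                tainted[target] = origin
            else:
                tainted.pop(target, None)   # overwritten with untainted data
    return leaks


PROGRAM = [
    ("call", "imei", "getDeviceId", []),
    ("assign", "url", None, ["base", "imei"]),
    ("call", None, "sendToNetwork", ["url"]),
    ("call", None, "sendToNetwork", ["base"]),
]
print(find_leaks(PROGRAM))
```

Only the first network call is reported, because `url` was built from the IMEI while `base` was not.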

  27. Evaluating Android Security: Structural analysis
      • Allows for declarative pattern matching on the abstract syntax of the input source code
      • Not concerned with program executions or data flow
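Pattern matching on abstract syntax can be demonstrated with Python's standard `ast` module. This is a swapped-in stand-in: the study analyzed recovered Java source, whereas this sketch matches a pattern in a Python-syntax snippet. The snippet itself is hypothetical and is only parsed, never executed.

```python
import ast


def find_method_calls(source, method_name):
    """Return line numbers of calls of the form <expr>.method_name(...)."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == method_name):
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical snippet written in Python syntax for the sake of the example.
SNIPPET = """\
pm = context.getPackageManager()
apps = pm.getInstalledPackages(0)
print(len(apps))
"""

print(find_method_calls(SNIPPET, "getInstalledPackages"))
```

The match depends only on the shape of the syntax tree, which is what distinguishes structural analysis from the control flow and data flow approaches.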

  28. Evaluating Android Security: Semantic analysis
      • Allows the specification of a limited set of constraints on the values used by the input program
      • The analyzer detects violations of these constraints using constant propagation techniques well known in the program analysis literature
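As an illustration of a value constraint checked by constant propagation, the sketch below flags calls to `sendTextMessage` whose destination resolves to a constant. The statement encoding, the variable names, and the phone number are assumptions made for the example, and only straight-line code is handled.

```python
UNKNOWN = object()   # marker for values that are not compile-time constants


def hard_coded_sms_destinations(statements):
    """Return constant destinations passed to sendTextMessage.

    Statements are either
      ("assign", var, (kind, value)) with kind in {"const", "var", "input"}, or
      ("call", function name, argument variable).
    """
    consts = {}
    found = []
    for stmt in statements:
        if stmt[0] == "assign":
            _, var, (kind, value) = stmt
            if kind == "const":
                consts[var] = value
            elif kind == "var":                  # copy: propagate what is known
                consts[var] = consts.get(value, UNKNOWN)
            else:                                # run-time input: not a constant
                consts[var] = UNKNOWN
        elif stmt[0] == "call":
            _, func, arg = stmt
            value = consts.get(arg, UNKNOWN)
            if func == "sendTextMessage" and value is not UNKNOWN:
                found.append(value)
    return found


PROGRAM = [
    ("assign", "n", ("const", "5550100")),
    ("assign", "dest", ("var", "n")),
    ("call", "sendTextMessage", "dest"),
    ("assign", "dest", ("input", None)),
    ("call", "sendTextMessage", "dest"),
]
print(hard_coded_sms_destinations(PROGRAM))
```

The first call is flagged because the constant reaches `dest` through a copy; the second is not, because `dest` has been overwritten with user input by then.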

  29. Evaluating Android Security: Analysis overview
      • Covers both dangerous functionality and vulnerabilities
      • Selecting properties for study was a significant challenge
      • Properties and their specifications
          • Misuse of Phone Identifiers: phone identifiers leaking to remote network servers. Identify data flows.
          • Exposure of Physical Location: location exposed to advertisement servers. Identify the portion of code.
          • Abuse of Telephony Services: malware sent SMS to premium numbers. Identify hard-coded phone numbers.
          • Eavesdropping on Audio/Video: audio/video eavesdropping. Identify control flows to UI.

  30. Evaluating Android Security: Properties and their specifications
      • Botnet Characteristics (Sockets): non-HTTP ports and protocols. Examine Socket use for suspicious behavior.
      • Harvesting Installed Applications: list of installed applications. Survey use of APIs.
      • Use of Advertisement Libraries: information exposure to ad and analytic networks. Survey inclusion of ad and analytic libraries.
      • Dangerous Developer Libraries: dangerous functionality in applications. Report replication and the implications.
      • Android-specific Vulnerabilities: search for non-secure coding practices.
      • General Java Application Vulnerabilities: Java application vulnerabilities. Misuse of information and methods.

  31. Application Analysis Results: Information misuse
      • Explores how sensitive information is leaked through information sinks
      • Sinks: OutputStream objects from URLConnections, HTTP GET and POST parameters in HttpClient connections, and the strings used for URL objects

  32. Application Analysis Results: Phone identifiers
      • Frequently leaked through plaintext requests
      • Used as device fingerprints
      • Property to a remote server
      • IMEIs are used to track individual users
      • The IMEI is tied to personally identifiable information (PII)
      • Not all phone identifier use leads to exfiltration
      • Phone identifiers are sent to advertisement and analytic servers

  33. Application Analysis Results: Location information
      • The granularity of location reporting may not be obvious to the user
      • Location is sent to advertisement servers

  34. Application Analysis Results: Phone misuse
      • Explores the misuse of smartphone interfaces

  35. Application Analysis Results
      • Telephony services
          • Applications do not use fixed phone number services
          • Applications do not misuse voice services
      • Background audio/video
          • Applications do not misuse video recording
          • Applications do not misuse audio recording

  36. Application Analysis Results
      • Socket API use: external server
          • A few applications include code that uses the Socket class directly
          • No evidence of malicious behavior by applications using Socket directly
      • Installed applications
          • Applications do not harvest information about which applications are installed on the phone

  37. Application Analysis Results: Included libraries
      • Libraries included by applications are easy to identify due to namespace conventions
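Identification by namespace can be sketched as a prefix match on fully qualified class names. The package prefixes and application contents below are made up for illustration; they are not the libraries or data from the study.

```python
from collections import Counter


def library_usage(app_classes, known_prefixes):
    """Count, per library prefix, how many applications contain at least one class from it.

    app_classes: dict of application name -> fully qualified class names
    known_prefixes: package prefixes that identify libraries
    """
    counts = Counter()
    for classes in app_classes.values():
        present = {
            prefix
            for cls in classes
            for prefix in known_prefixes
            if cls.startswith(prefix + ".")
        }
        counts.update(present)   # each application counts once per library
    return counts


# Hypothetical applications and library namespaces.
APPS = {
    "app1": ["com.example.ads.Banner", "com.demo.Main"],
    "app2": ["com.example.ads.Banner", "org.example.analytics.Tracker"],
    "app3": ["com.demo.other.Main"],
}
LIBRARIES = ["com.example.ads", "org.example.analytics"]

usage = library_usage(APPS, LIBRARIES)
for prefix in LIBRARIES:
    print(prefix, usage[prefix], "of", len(APPS), "applications")
```

Dividing each count by the number of applications gives the kind of penetration percentage reported in the conclusions.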

  38. Application Analysis Results: Advertisement and analytic libraries
      • Use of phone identifiers and location is sometimes configurable
      • Reporting frequency is often configurable
      • The libraries probe for permissions

  39. Application Analysis Results: Developer toolkits
      • Many applications use developer toolkits containing common sets of utilities, identifiable through class name or library path
      • Some toolkits replicate dangerous functionality
      • Some toolkits probe for permissions
      • Well-known brands commission developers who include dangerous functionality

  40. Application Analysis Results: Android-specific vulnerabilities
      • Technical report of Android-specific vulnerabilities

  41. Application Analysis Results
      • Leaking information to logs
          • Private information is written to Android's general logging interface
      • Leaking information to IPC
          • Applications broadcast private information in IPC accessible to all applications
      • Unprotected broadcast receivers
          • Some applications are vulnerable to forging attacks on dynamic broadcast receivers

  42. Application Analysis Results
      • Intent injection attacks
          • Some applications define intent addresses based on IPC input
      • Delegating control
          • Few applications unsafely delegate actions
      • Null checks on IPC input
          • Applications frequently do not perform null checks on IPC input
      • Sdcard use
          • Unexpected uses of data read/write
      • JNI use
          • Code reached through the Java Native Interface is not written in Java and has inherent dangers

  43. Limitations
      • This study is limited in three ways
          • The studied applications were selected with a bias towards popularity
          • The program analysis tool cannot compute data and control flows for IPC between components
          • Source code recovery failures interrupt data and control flows

  44. Conclusion
      • ded and the program analysis specifications open a new door for application certification
          • Potential to integrate these tools into an application certification process
          • Challenging logistically and technically
      • Concern for misuse of privacy-sensitive information
          • IMEI, IMSI, ICC-ID
          • Malicious intent
          • How is it misused? It is used for everything from cookie-esque tracking to account numbers

  45. Conclusion
      • Significant penetration of ad and analytic libraries
          • They occur in 51% of the applications studied
          • An application could have up to 8 different libraries
      • Developers fail to take necessary security precautions
          • Many developers fail to securely use Android APIs
          • Insufficient protection of privacy-sensitive information
          • No exploitable vulnerabilities that can lead to malicious control of the phone
      • We found no evidence of telephony misuse, background recording of audio or video, abusive connections, or harvesting lists of installed applications
      • Future study: perform attacks!
