Android Malware: Dynamic, Static and Hybrid Analysis Approaches On Mobile Phones


Executive Summary:

With the growing abundance of mobile phones and mobile operating systems, attackers have a larger platform on which to deploy malicious software. This paper focuses on the Android operating system and on different approaches to fighting its malware. In particular, it compares dynamic and static analysis approaches to detecting Android malware. We will explore both sophisticated and simple techniques and workarounds for detecting Android malware.

/* Draft

Dynamic analysis may be able to detect more sophisticated malware that, for instance, uses obfuscation techniques, but it imposes a serious overhead in processing time, especially on mobile phones with limited processing power. Static analysis, on the other hand, can come in handy because it requires far less processing power, which suits mobile phones. I talked about two approaches to static analysis: predefined detection patterns, as in Stowaway and Kirin, and methods that use ML classification to detect and classify malware. I focused on the strong and weak points of DroidAPIMiner, DREBIN and MaMaDroid. Finally, I talked about open issues and current issues with ML.

Table of contents:

Chapter 1: Introduction

1.1 – Introduction to Android OS and the arising threats

1.2 – Motivation & Goals

Chapter 2: Dynamic Malware Analysis

2.1 – Background & motivation

2.2 – Existing State-of-art approaches

Chapter 3: Static Malware Analysis

3.1 – Background & motivation

3.2 – Predefined approaches in static analysis

3.3 – Introduction & motivation to Machine Learning in static analysis

3.4 – ML State-of-art approaches

Chapter 4: Hybrid Malware Analysis

4.1 – Background & motivation

4.2 – Existing State-of-art approaches

Chapter 5: Discussion & Comparison

5.1 – Dynamic vs Static analysis vs Hybrid

5.2 – Performance analysis

5.3 – Limitations & improvements

Chapter 6: Conclusion

6.1 – Conclusion

6.2 – Areas of further research

1 Introduction

1.1 Introduction to Android OS And Arising Threats

It is easy to say that technology is not the main reason for our survival as a species, but a look at our daily lives shows that nearly all our daily activities depend on technology. One of the most important technology platforms today is the mobile phone, which is used for almost everything in our daily lives. We may therefore ask ourselves: how secure are our mobile phones? Numerous mobile phone platforms exist, the most popular being iOS and Android. In this paper, the focus is on the Android operating system and how it is affected by malware deployment. As shown in Figure 1 (Gartner, 2016), Android has the largest market share and thus the most users worldwide, and in contrast with other mobile platforms, Android is more vulnerable to attacks and malware deployment. According to the AV-Test security report (AVTest, 2016), Android is the second most targeted operating system for malware deployment after Microsoft Windows: the number of Android malware samples rose from 358,881 in January 2013 to 16,514,928 in September 2016. This demonstrates the severity of the situation, given that malicious hackers are getting smarter and more familiar with new defensive techniques and how to bypass them. In addition, Android is targeted more heavily because, unlike other operating systems, it has third-party “application stores” (Google Play Store alternatives) besides the official “Play Store”. These unofficial “stores” may take the form of a website, an application or a torrent. Inexperienced users may therefore be lured into downloading unverified applications from unidentified developers that turn out to be malware. It is worth mentioning that a large share of these attacks succeed because of a lack of security awareness. 
Therefore, a user who is not educated about security pitfalls and about how to tell official applications from fake ones is more susceptible to all kinds of malware. Figure 2 shows a rather funny example of how an uneducated user might be tricked into giving out private information, in this example his/her credit card details.

Figure 1: Statistics of Mobile Phone Platforms(Gartner, 2016)

1.2 Motivation & Goals

Figure 2: Credit Card Security Malware

From the previous section we can deduce that the Android operating system is vulnerable to a wide spectrum of attacks. Users therefore need software tools that shield them from malware. In this paper I will discuss various approaches to tackling Android malware. Researchers and IT security professionals have worked on this topic over the past years, and significant progress has been achieved, perhaps not in remedying the root cause but at least in addressing and mitigating the problem. The focus will be on the analysis of Android applications for malicious behavior. The methods explored in this paper can be categorized into Dynamic (Chapter 2), Static (Chapter 3) and Hybrid (Chapter 4) approaches. I will go into each of these categories, expand on existing state-of-art implementations, and then analyze each implementation with its challenges and limitations. Chapter 5 is allocated to the comparison of the approaches. Finally, the paper ends with a conclusion and a summary of the findings.

2 Dynamic Malware Analysis

2.1 Background & Motivation

The first idea that comes to mind for malware analysis and detection is to monitor how a program behaves and deduce from that whether it is performing malicious activities. This approach is called dynamic analysis. Dynamic analysis (sometimes referred to as behavior analysis) examines the target software at run-time; in other words, it monitors the software while it is executing. As a result, this technique can keep track of code loading and document the software’s behaviors, which proves useful in many cases (Kapratwar, 2016). Dynamic analysis can be done at multiple layers of the application hierarchy, depending on what is to be monitored. For instance, a system can monitor various types of function calls, such as API calls that access the Gmail cloud service, and such monitoring can be done at different layers of abstraction in the operating system (Szydlowski et al., 2012).
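
To make the idea of run-time monitoring concrete, the following is a minimal sketch in Python rather than real Android tooling. The functions send_sms and read_contacts are hypothetical stand-ins for sensitive APIs, and call_log is an assumed in-memory behavior log; the sketch only shows how calls can be recorded while the program executes.

```python
import functools

call_log = []  # hypothetical in-memory behavior log


def monitored(func):
    """Record every call to func (name and arguments) before running it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append((func.__name__, args, kwargs))
        return func(*args, **kwargs)
    return wrapper


@monitored
def send_sms(number, text):
    # Stand-in for a sensitive API; it only returns a status string here.
    return f"sent to {number}"


@monitored
def read_contacts():
    # Stand-in for an API that returns private data.
    return ["alice", "bob"]


def run_sample():
    """Hypothetical application behavior exercised during the analysis."""
    names = read_contacts()
    send_sms("+10000000000", ",".join(names))


run_sample()
observed = [name for name, _, _ in call_log]
```

After run_sample() finishes, call_log holds the ordered sequence of monitored calls with their arguments, which is the kind of behavioral record a dynamic analysis would then inspect.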

2.2 Existing State-of-Art Approaches

In order to examine the success and efficiency of dynamic analysis, we will consider a few state-of-art implementations. The first approach is DroidRanger (Zhou et al., 2012). DroidRanger’s developers initially created a crawler to collect applications from the official and unofficial^[1] (third-party) Android app stores. The collected apps are then run through two detection engines. The first engine is split into two steps: permission-based filtering and behavioral foot-printing. Permission-based filtering examines the AndroidManifest.xml file, which is included in the root directory of every Android application and holds an identifier for every resource used in the application. The manifest therefore also includes the permissions the application declares; for example, if the application requires access to the internet, it must add “<uses-permission android:name="android.permission.INTERNET" />”. We can thus examine the manifest file of each application to check whether it requests permissions that are normally requested by various categories of malware, and if it does not, the scheme filters it out. In this way the scheme can efficiently filter out non-malicious applications. This step is lightweight in comparison to the next one, the behavioral foot-printing scheme, which examines the app manifest, the app byte-code and the structural layout of the app. The app byte-code holds rich semantic information; DroidRanger focuses on Android framework APIs and, more specifically, uses a data flow analysis algorithm to scan for fixed inputs (for example, a saved phone number) (Enck et al., 2011). Furthermore, the structural layout of the application is decompressed to reveal a tree structure that can be used to deduce malware behaviors. 
The steps above are designed to detect known malware, since the software compares an application’s abnormalities to those of real-life malware. To detect unknown malware, DroidRanger implements a heuristics-based filtering scheme, which forms the second detection engine. This scheme aims at uncovering zero-day malware by focusing on Android services and features that can be misused to load new code, which may be native machine code or Java binary code. Afterwards, dynamic execution monitoring is employed to monitor the application at runtime, focusing specifically on the new behaviors that result from the newly loaded code. Despite its advantages, DroidRanger has a few drawbacks, namely the need to maintain malware samples and its limited coverage of markets. On the one hand, a considerable part of DroidRanger’s work is scanning for applications that match certain malware samples, and these samples need to be updated constantly since malware keeps evolving. On the other hand, DroidRanger relies on the crawler from the initial phase to collect applications from the Android Market and several alternative markets. Hence, it is limited in market coverage and may miss important malware categories.
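
The permission-based filtering step can be illustrated with a short Python sketch. It is not DroidRanger’s implementation: the watch list SUSPICIOUS is a made-up example, and the sketch assumes the manifest has already been decoded to text XML (inside an APK it is stored in a binary XML format).

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical watch list; a real filter would derive it from known malware families.
SUSPICIOUS = {"android.permission.SEND_SMS", "android.permission.RECEIVE_SMS"}


def declared_permissions(manifest_xml):
    """Return the set of permission names declared via <uses-permission>."""
    root = ET.fromstring(manifest_xml)
    name_attr = f"{{{ANDROID_NS}}}name"  # namespaced form of android:name
    return {
        el.get(name_attr)
        for el in root.iter("uses-permission")
        if el.get(name_attr)
    }


def needs_further_analysis(manifest_xml):
    """Keep an app only if it requests at least one watched permission."""
    return bool(declared_permissions(manifest_xml) & SUSPICIOUS)


MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.demo">
  <uses-permission android:name="android.permission.INTERNET" />
  <uses-permission android:name="android.permission.SEND_SMS" />
</manifest>"""

flagged = needs_further_analysis(MANIFEST)
```

Applications for which needs_further_analysis returns False are filtered out cheaply, so the more expensive behavioral foot-printing only runs on the remainder.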

The second state-of-art implementation is TaintDroid (Enck et al., 2014). Before we go into the mechanics of TaintDroid, we must first explain what taint analysis is. Dynamic taint analysis is based on the hypothesis that an attacker who wants to alter the way a program runs must cause a value that is normally derived from a trusted, legitimate source to be derived instead from his/her own input. For example, the return address in a program is expected to be provided by the code itself and not by external, illegitimate sources, so an attacker might aim to overwrite it with his/her own input data. Data that is calculated or derived from such untrusted sources is said to be tainted (Newsome et al., 2005). TaintDroid uses dynamic taint analysis to label sensitive private data; in other words, the private data is tainted, and the taint follows it as it propagates through program variables, functions and processes. When the data is sent over the network or otherwise leaves the system, TaintDroid logs the data’s labels, the application responsible for the transmission, and the address or destination to which the data is transmitted. TaintDroid thus gives the user and the respective security services a documented, reasonable understanding of how the data is being used and by which application or process, which may help identify applications that use the data wrongfully or for illegitimate purposes. TaintDroid is certainly an advance in the field of malware analysis, but there are a few ways in which it may be insufficient by itself. Malware is nowadays advanced enough to employ a variety of evasion techniques that help it dodge the systems monitoring it (such as TaintDroid) and ultimately protect its methods from being discovered. 
For instance, W32/Ratos (Threat Encyclopedia) employs time/execution self-checking to find out whether it is under monitoring or analysis. Other categories of malware do not aim to evade the monitoring process; rather, they employ self-modifying techniques that make malware analysis and debugging take considerably more time to complete, rendering them infeasible. Another limitation is that most of today’s Android malware is compressed or encrypted, so examining and analyzing a given malware sample requires executing it. In addition, an attacker can combine all of the above into a single piece of malware, making the process much more complex and problematic (Cavallaro et al., 2007).
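
The label-and-log idea behind taint tracking can be sketched in a few lines of Python. This is a toy model under stated assumptions, not TaintDroid’s mechanism (which works inside the Android runtime): the Tainted class, the concat propagation rule, the network_send sink and all sample values are hypothetical.

```python
class Tainted:
    """A value bundled with the set of taint labels it carries."""

    def __init__(self, value, labels):
        self.value = value
        self.labels = frozenset(labels)


def labels_of(x):
    return x.labels if isinstance(x, Tainted) else frozenset()


def raw(x):
    return x.value if isinstance(x, Tainted) else x


def concat(a, b):
    """Propagation rule: the result carries the union of both operands' labels."""
    return Tainted(raw(a) + raw(b), labels_of(a) | labels_of(b))


sink_log = []  # records (app, destination, labels) for data leaving the device


def network_send(app, destination, data):
    """Taint sink: log labeled data before it leaves; return the byte count."""
    labels = labels_of(data)
    if labels:
        sink_log.append((app, destination, sorted(labels)))
    return len(raw(data))


# Taint sources with made-up values.
imei = Tainted("356938035643809", {"IMEI"})
location = Tainted("52.52,13.40", {"LOCATION"})

payload = concat(concat("id=", imei), concat("&loc=", location))
network_send("com.example.app", "203.0.113.7:80", payload)
```

The entry written to sink_log names the sending application, the destination and the labels of the private data involved, mirroring the three pieces of information that the text says are logged at transmission time.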

The third state-of-art implementation is DroidScope (Yan et al., 2012). Before we dive into how DroidScope works, a brief overview of the Android OS is needed. A visual representation of the Android operating system architecture can be seen in Figure 3^[2]. The Android OS is based on a Linux kernel, which operates at the lowest level and thus has the highest privilege. Services and native and system applications are executed as Linux processes. Each application running on Android has a unique user ID and a group ID, which are used for the allocation of resources. Zygote is the parent process of all Android application processes, which means that each such process is forked from Zygote. Android applications are written in the Java programming language: to create a Java component, the app is written in Java and compiled into Java byte-code. Afterwards, the Java byte-code is converted into Dalvik byte-code, in other words the dex file. Finally, the components, including the dex file, are joined into an apk package, which is the final product. DroidScope follows a rather different approach from those mentioned earlier: it is built on QEMU^[3] and can thus rebuild the OS-level and Java-level semantic views externally. In this way, it can detect information leaks at any level, in both Java and native components. On top of this, DroidScope enables custom analysis by exporting three tiers of APIs that mirror the three layers of an Android device: the hardware, the operating system and the Dalvik virtual machine (DVM). This led to the development of plugins that work on top of DroidScope. It also has the benefit of operating externally, outside the software stack, and can thus analyze kernel-level attacks. However, the biggest drawback is that DroidScope cannot operate in real time, which means that it cannot provide real-time monitoring.
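
Since an apk package is a ZIP archive and a dex file starts with the bytes "dex" followed by a newline and a version string, the packaging described above can be checked with a small Python sketch. The archive built here is a stub created in memory for illustration, with placeholder contents, and is not a valid application.

```python
import io
import zipfile

DEX_MAGIC_PREFIX = b"dex\n"  # a DEX file begins with these bytes, then a version


def build_sample_apk():
    """Create an in-memory archive with the two entries discussed in the text."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("AndroidManifest.xml", b"<manifest/>")  # placeholder
        archive.writestr("classes.dex", b"dex\n035\x00" + b"\x00" * 8)  # header stub
    return buf.getvalue()


def dex_entries(apk_bytes):
    """List archive members that end in .dex and start with the DEX magic."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as archive:
        return [
            name
            for name in archive.namelist()
            if name.endswith(".dex")
            and archive.read(name).startswith(DEX_MAGIC_PREFIX)
        ]


found = dex_entries(build_sample_apk())
```

Locating the dex entries this way is typically the first step before any byte-code level inspection of an application.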

Figure 3: Overview of the Android Operating System(Yan et al.,2012)

The last state-of-art approach to visit is CopperDroid (Tam et al., 2015), a framework that makes use of QEMU. CopperDroid is an automatic, Virtual Machine Introspection (VMI) based dynamic analysis system that reconstructs the behaviors of Android malware. An interesting feature of CopperDroid is its approach to identifying operating-system-level behaviors as well as high-level, Android-specific behaviors. Android applications are executed in their own sandbox, irrespective of whether the target device runs the Dalvik VM or ART^[4]. Running applications in sandboxes has proven advantageous for security, because the sandbox constrains the application and only allows it to use the resources^[5] and permissions^[6] that it needs. However, an application can communicate with other processes/applications through Inter Process Communication (IPC) and Remote Procedure Call (RPC) interactions. Classic system call analysis fails to capture the behavior of Android applications because it misses a wealth of information, such as Android-specific semantics and the reconstruction of IPC and RPC interactions, which has proved crucial for understanding app behaviors. CopperDroid provides a closer inspection of system calls, which it uses to deserialize complicated Android objects. Notably, CopperDroid implements out-of-the-box dynamic analysis built on top of QEMU. In addition, CopperDroid introduced the concept of the unmarshalling Oracle, which its authors describe as follows: “We introduce the concept of the unmarshalling Oracle, which seamlessly recreates complex Android objects to enrich the semantics of the reconstructed OS- and Android-specific behavior” (Tam et al., 2015). Through the Oracle, CopperDroid not only avoids manually written code, but also provides a transparent way to deal with the ever-increasing number of complex Android objects. 
In addition, CopperDroid’s analysis does not need complicated introspection; it only requires the extraction of the system calls made by the processes in the monitored system. Being an out-of-the-box approach, CopperDroid can therefore be applied easily across variations of the Android framework, such as the Dalvik VM, ART and different Android versions. This approach is an advance in Android malware analysis research, and it defines a new level in the dynamic analysis of Android malware. But nothing comes for free: CopperDroid’s approach of capturing binder IPC from ioctl^[7] system calls incurs considerable runtime overhead, which is the outcome of heavy logging and detailed analysis requirements. This is because the process extracts the memory used by the ioctl system calls. In addition, CopperDroid does not always extract the high-level behaviors of APIs. For instance, CopperDroid will not always capture SQLite queries, because SQLite does not cause system calls in all cases: SQLite usually caches its data in memory, and accessing this data does not require the invocation of a system call (Yuan et al., 2017). Furthermore, CopperDroid, like DroidScope, needs QEMU emulation to analyze the target application, and that by itself incurs considerable overhead.
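
The step from raw system calls to coarse behaviors can be sketched as follows. The trace format (a process ID followed by an strace-like call), the sample lines and the three labeling rules are assumptions made for this illustration; CopperDroid itself goes much further by unmarshalling the binder payload, which is not attempted here.

```python
import re

# Assumed line format: "<pid> <syscall>(<args>) = <return value>"
LINE = re.compile(r"^(?P<pid>\d+)\s+(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)")

TRACE = """
1201 openat(AT_FDCWD, "/data/data/com.example.demo/files/cache.db", O_RDWR) = 5
1201 getpid() = 1201
1201 ioctl(7, BINDER_WRITE_READ, 0x7ffc1234) = 0
1201 connect(9, {sa_family=AF_INET, sin_port=htons(80)}, 16) = 0
"""


def parse(trace):
    """Turn trace text into (pid, syscall name, argument string) tuples."""
    events = []
    for line in trace.strip().splitlines():
        match = LINE.match(line.strip())
        if match:
            events.append((int(match["pid"]), match["name"], match["args"]))
    return events


def behaviours(events):
    """Map system calls to coarse behavior labels, grouped per process."""
    result = {}
    for pid, name, args in events:
        if name in ("open", "openat"):
            label = "file_access"
        elif name == "ioctl" and "BINDER_WRITE_READ" in args:
            label = "binder_ipc"  # binder transactions travel through ioctl
        elif name == "connect":
            label = "network_connect"
        else:
            continue  # system calls without a rule are ignored in this sketch
        result.setdefault(pid, []).append(label)
    return result


summary = behaviours(parse(TRACE))
```

The "binder_ipc" label marks exactly the point where plain system call analysis stops being informative, since the meaning of the interaction is hidden in the marshalled arguments of the ioctl call.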

3 Static Malware Analysis

3.1 Background & Motivation

Another approach to Android malware analysis is static analysis. Static analysis became very popular with the advent of mobile phones, since it is lightweight compared to dynamic analysis: it does not execute the application. Static analysis focuses on harvesting information statically. This information is called features, and features can be anything from filenames, checksums and APIs to Java byte-code. Afterwards, judging by the nature of the acquired features, the application being analyzed is deemed malicious or benign. Static analysis is attractive to security researchers for two reasons: Java is a high-level language, and DEX code is easy to analyze. Java can be statically analyzed more easily than programming languages such as C/C++ because it is not compiled directly from high-level code to machine code; rather, there is an intermediate form, Java byte-code, and that is the target of static analysis. In addition, the Dalvik virtual machine’s DEX format provides a fairly high-level instruction set that keeps much of the semantics of Java (Reaves et al., 2016). The static analysis approaches discussed here are divided into two parts. The first part covers predefined, or in other words manually automated, static analysis tools, which follow a set of rules to classify an application. The second part covers static analysis tools that use Machine Learning algorithms to classify applications.
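
As a minimal illustration of harvesting features without executing anything, the Python sketch below computes a checksum and collects filenames from a package, then applies a simple rule. The helper names, the sample archive and the idea of a known-bad checksum set are assumptions for this example only.

```python
import hashlib
import io
import zipfile


def make_archive(files):
    """Build an in-memory ZIP from a {filename: bytes} mapping (sample data)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buf.getvalue()


def static_features(package_bytes):
    """Collect features from the package bytes alone; nothing is executed."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        names = sorted(archive.namelist())
    return {
        "sha256": hashlib.sha256(package_bytes).hexdigest(),
        "filenames": names,
        "size": len(package_bytes),
    }


def classify(features, known_bad_checksums):
    """Rule: a package is flagged if its checksum is in the known-bad set."""
    if features["sha256"] in known_bad_checksums:
        return "malicious"
    return "not flagged"


sample = make_archive({"AndroidManifest.xml": b"<manifest/>", "classes.dex": b"dex\n035\x00"})
features = static_features(sample)
```

A checksum rule of this kind only recognizes samples that have been seen before, which is one reason the approaches in the following sections rely on richer features such as permissions and API calls.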

3.2 Predefined approaches in static analysis

Manually automated approaches, as mentioned before, are done through instrumentation, which cuts both ways. On the one hand, instrumentation is fast and can be tuned efficiently to include the needed checks and exclude redundant ones. On the other hand, instrumentation can be dangerous because it is constant, so it can easily be detected and consequently avoided by malware. The first implementation to be discussed is Stowaway (Felt et al., 2011). The Android store is now home to more than 3 million applications (AppBrain, 2017), and that number does not include the applications that can be downloaded from third-party (alternative) stores. The Android operating system is attractive to developers because of the ease of development and the wide spectrum of APIs that help developers access information, hardware and phone settings. For instance, a developer who wants to use the phone camera has to declare a Camera object, instantiate it, and close/release it when done. However, this is not the end of the process, since the developer also has to declare the permission to use the phone camera in the AndroidManifest.xml file. When the user then downloads that application, he/she is prompted at install-time that the application is requesting the listed privileges (the camera in this case), and has the option either to accept and grant the requested permissions, or to decline them and cancel the installation. This, in short, is how the Android permission system works. One might observe this and say that the user has ultimate control over the permissions, that it is in his/her own hands to accept or reject them, and that if anything goes wrong it is the user’s fault for providing access. 
This argument is flawed because, according to surveys (Felt et al., 2012), more than half of users tend to ignore the permission request prompt and accept without reading, or do not fully understand what permissions are and how they can affect them. In addition, certain malicious developers may request more permissions and privileges than they need, rendering the application overprivileged. Users therefore need a tool that informs them when this occurs and ultimately helps them decide whether to grant or deny permissions/privileges. Stowaway is a static analysis tool for Android that analyzes an application to determine the maximum set of privileges it might need. To do that, Stowaway first builds a map, based on the Android access control policy, that specifies which permissions each method (API call) may require. The map was constructed by forcing the Android verification mechanism to log every permission check as it takes place. Stowaway then examines the DEX files present in every compiled application and uses them as input to examine API calls, Content Providers and Intents. Using the information gained in this step and the constructed permission map, it calculates the set of permissions the application might need. It then checks whether the developers followed the principle of least privilege, and if they did not, the application is deemed overprivileged. Finally, overprivileged applications can be treated as suspicious, since overprivilege can lead to data leaks or be exploited for malicious behavior. Stowaway was the first implementation of a mechanism that tracks privileges to detect overprivileged applications. However, Stowaway has a few limitations, most importantly its difficulty in keeping up with newer Android versions. 
Stowaway is highly dependent on manual input of APIs and sequence arguments, and these APIs are usually updated with each new Android release. For example, if I am running Android 4.2 and my stored Stowaway arguments are for Android 4.1, I need to respecify whatever differs in the version I am using, which is costly and inconvenient (Au et al., 2012). Another limitation, not only of Stowaway but of static analysis in general, arises when developers use permissions through native code or exec, which lie outside the Android API. Stowaway cannot detect such permission uses, which leads to a percentage of false positives.
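
The overprivilege check reduces to a set difference between the permissions an application declares and those its API calls require. The Python sketch below shows this under stated assumptions: PERMISSION_MAP is a tiny hypothetical excerpt written for the example, not Stowaway’s actual map, and the list of API calls is assumed to have been extracted from the DEX files beforehand.

```python
# Hypothetical excerpt of an API-to-permission map (illustration only).
PERMISSION_MAP = {
    "android.hardware.Camera.open": {"android.permission.CAMERA"},
    "android.telephony.SmsManager.sendTextMessage": {"android.permission.SEND_SMS"},
    "android.media.MediaRecorder.setAudioSource": {"android.permission.RECORD_AUDIO"},
}


def required_permissions(api_calls):
    """Union of the permissions needed by the API calls found in the app."""
    needed = set()
    for call in api_calls:
        needed |= PERMISSION_MAP.get(call, set())
    return needed


def unused_permissions(declared, api_calls):
    """Permissions that are declared but not required by any observed call."""
    return set(declared) - required_permissions(api_calls)


def is_overprivileged(declared, api_calls):
    return bool(unused_permissions(declared, api_calls))


declared = {
    "android.permission.CAMERA",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
}
calls = ["android.hardware.Camera.open"]
extra = unused_permissions(declared, calls)
```

The sketch also makes the native-code limitation visible: a permission exercised outside the mapped API calls never enters required_permissions, so it is wrongly reported as unused.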

The second approach to be discussed is Kirin (Enck et al., 2009), a lightweight analysis tool. Kirin provides lightweight security certification during the installation of Android applications. It defines security rules, which are predefined templates engineered to match undesirable properties in an Android application’s security configuration. Kirin is developed as an extension to the Android operating system and runs when the user initiates the installation of an application. During installation, Kirin runs its security service to analyze the application and check whether it passes the certification. If the application fails the certification, Kirin informs the user and gives him/her the option of overriding the result. The effectiveness of Kirin evidently depends on how robust the rules are. To deem an application safe or unsafe, we must check whether it meets the user’s basic security requirements; this is the field of security requirements engineering, a fundamental part of the software engineering process. Security requirements engineering deals with how an application needs to behave in order to provide a secure environment, in other words, an environment that does not harm the user’s assets. It is therefore crucial to define what an asset means on a mobile phone so that security requirements can be defined accordingly. Assets on a mobile phone can be information about the user, such as his/her email, contacts, picture gallery, internet browsing history and personal address, and anything else that is of value to the user. With this in mind, Kirin follows the steps shown in Figure 4 to identify security requirements. Step one is to identify the assets, which is done by extracting features from the Android manifest file in the form of permissions. 
For instance, an application that records video requests android.permission.CAMERA and android.permission.RECORD_AUDIO, so the assets matching these permissions are the camera and the microphone, since the camera can capture video footage of the user and the microphone can record the user’s voice. The second step is to identify functional requirements, which is done by examining the assets and describing how each asset interacts with other processes or hardware. This is arguably the most important step, since most threats lie in how an asset communicates with other processes or third-party applications. The third step is to determine threats and security goals for the identified assets. If we take the security goal of confidentiality and go back to the example in step one, a threat (in this case spyware) would be an application that accesses the video being recorded and sends it to an external destination, thereby breaking confidentiality. Here Kirin will generate a threat description that explicitly covers this case. Generating threat descriptions can be tricky, since the attack surface is very wide and cannot be covered fully; with the help of security experts and researchers, however, a sound description can be produced. The fourth step is to develop the asset’s security requirements based on the threat descriptions. This is a straightforward step in which Kirin combines the information collected above to write a security requirement that contributes to the final security certification. Finally, step five discusses the practical limitations of Kirin’s security mechanism, since Kirin is after all a static analysis approach constrained by the information provided by the manifest file (Enck et al., 2009). 
One of the main limitations of Kirin is that it relies on manually predefined patterns (instrumentation) for detection (Arp et al., 2014). Although these methods are efficient and scalable, they will most likely fail to detect new malware, since the predefined rulesets will not always be up to date with newly released malware.
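
Install-time certification against predefined rules can be sketched as a check for dangerous combinations of requested permissions. The two rules in RULES are illustrative assumptions written for this example and should not be read as Kirin’s actual rule set.

```python
# Each rule is a combination of permissions an app must not hold together.
# These rules are hypothetical examples, not Kirin's real rules.
RULES = [
    frozenset({
        "android.permission.READ_PHONE_STATE",
        "android.permission.RECORD_AUDIO",
        "android.permission.INTERNET",
    }),
    frozenset({
        "android.permission.RECEIVE_SMS",
        "android.permission.WRITE_SMS",
    }),
]


def violated_rules(requested):
    """Return every rule whose permissions are all requested by the app."""
    requested = set(requested)
    return [rule for rule in RULES if rule <= requested]


def certify(requested):
    """An app passes certification only if it violates no rule."""
    return not violated_rules(requested)


recorder_app = {
    "android.permission.READ_PHONE_STATE",
    "android.permission.RECORD_AUDIO",
    "android.permission.INTERNET",
    "android.permission.VIBRATE",
}
flashlight_app = {"android.permission.CAMERA"}

recorder_ok = certify(recorder_app)
flashlight_ok = certify(flashlight_app)
```

Because the rules are fixed sets, a new malware family that uses a combination not listed in RULES passes the check, which is the limitation noted above.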

Figure 4: Kirin steps to identify security requirements(Enck et al., 2009)

3.3 Introduction & Motivation to Machine Learning in Static Analysis

Machine learning is a powerful approach that can help us in almost any application in our lives. An article by Josh Schwartz (Chief of Engineering and Data Science at Chartbeat) explains that Machine Learning is no longer only for experts and professionals, because the tools being released nowadays are user friendly and easy to learn. Two of the most popular tools and libraries for machine learning are Scikit-learn, a library for Python, and TensorFlow, an open-source software library for machine intelligence. Machine learning can also be used in Android malware analysis: features extracted by a static analysis method can be fed into an ML classifier to help identify whether an application is benign or malicious. Some of the most popular Android malware detection methods using machine learning are DroidAPIMiner (Aafer et al., 2013), DREBIN (Arp et al., 2014) and MaMaDroid (Mariconti et al., 2016). Before addressing these approaches, however, we need to define what Machine Learning is and how it works. Machine Learning algorithms are algorithms that are taught to “learn” from previously defined input data: they use the data to learn how to recognize patterns and eventually produce output, whose form depends on the case. With testing and continued training, the output gradually improves to become as accurate as possible. In ML classification, accuracy is usually calculated as the share of correct results (true positives and true negatives) among all results, including false positives and false negatives. Training, however, depends on the training data, which we call datasets. For instance, a large dataset gives the algorithm more data from which to deduce a pattern, while a small but relevant and condensed dataset can lead the algorithm to smarter or more logical outputs. 
Machine Learning algorithms are split into four categories: Supervised, Unsupervised, Semi-supervised and Reinforcement learning (Nesta report 2015). For the sake of this paper, we will focus on supervised learning. A supervised learning algorithm needs a dataset of labeled data, and supervised learning is commonly used to solve classification problems such as those tackled by DroidAPIMiner, DREBIN and MaMaDroid.
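
The accuracy measure mentioned above, together with the false positive rate that is often reported alongside it, can be computed from the four outcome counts. In the sketch below the labels are made-up sample data, with 1 standing for malicious and 0 for benign.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP and FN, with 1 = malicious and 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp, tn):
    """Share of benign samples that were wrongly flagged as malicious."""
    return fp / (fp + tn)


# Made-up ground truth and predictions for eight applications.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, tn, fp, fn)      # (2 + 4) / 8 = 0.75
fpr = false_positive_rate(fp, tn)   # 1 / (1 + 4) = 0.2
```

Reporting the false positive rate next to accuracy matters for malware detection, because benign applications usually far outnumber malicious ones and accuracy alone can hide a high number of false alarms.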

3.4 ML State-of-Art Approaches

The first state-of-art approach to be discussed is DroidAPIMiner (Aafer et al., 2013), whose authors developed a lightweight machine learning classifier to analyze Android applications and deem them malicious or benign. The job of a classifier in machine learning is to identify, based on a given dataset, which sub-population an input belongs to. DroidAPIMiner takes an extensive data mining approach whose results are eventually used to train a classifier for Android apps. Using the extracted features, the classifier should learn to identify more complex Android malware. As shown in Figure 5, DroidAPIMiner follows three steps: feature extraction, feature refinement, and model learning and generation. In the feature extraction phase, samples of benign and malicious applications are given as input and analyzed for traits that distinguish them. DroidAPIMiner focuses on analyzing application byte-code to extract API calls and their package-level details. In addition, the system extracts the permissions requested by the benign and malicious samples to gain insight into which permissions are used and how often. The collected information serves as the feature set. The next phase, feature refinement, filters the features to establish a baseline for distinguishing between benign and malicious samples. More specifically, it refines the dataset to keep the APIs that are used considerably more often by malware than by benign applications. Some features, however, may be frequent in both sets; these are further analyzed using data flow analysis to extract the APIs that are invoked with dangerous parameter values. DroidAPIMiner is built as a Python program that uses AndroGuard^[8] to facilitate the static analysis of Android applications. 
Lastly, in the model learning and generation phase, the refined feature set is fed into the classifier (Aafer et al., 2013). The authors of DroidAPIMiner evaluated ID3 DT (Quinlan et al., 1986), C4.5 DT (Quinlan et al., 1986), KNN (Aha et al., 1991) and linear SVM (Vapnik et al., 2013), and concluded that KNN performed best among these models. KNN achieved an accuracy of up to 99% and a false positive rate as low as 2.2% (Aafer et al., 2013). Although DroidAPIMiner is a considerable improvement for static analysis, it remains vulnerable to evasion methods that attackers might use to trick the classifier into producing incorrect output. Two of the most popular evasion techniques are bytecode encryption and the use of native code. Bytecode encryption is a type of code obfuscation that encrypts the bytecode so that feature extraction either fails or yields meaningless data. Native code evasion occurs when the malware author places malicious payloads in native code, which DroidAPIMiner does not analyze. Another limitation is that DroidAPIMiner needs active retraining to keep performing well. It relies on the frequency of API calls used by known malware, and new Android updates and patches are released regularly, which deprecate some API calls and introduce others. Newly designed malware can therefore go undetected when the classifier is not retrained and kept up to date (Mariconti et al., 2016).
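
To make the classification step and the two reported metrics concrete, the following is a minimal KNN sketch over binary API feature vectors. The vectors and labels are invented, the choice of Hamming distance and `k=3` is an assumption, and the sketch does not reproduce the authors' experimental setup.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions where two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy_and_fpr(predicted, actual):
    """Accuracy, and false positive rate = benign apps flagged / all benign apps."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    benign_total = sum(a == "benign" for a in actual)
    false_pos = sum(p == "malicious" and a == "benign"
                    for p, a in zip(predicted, actual))
    return correct / len(actual), false_pos / benign_total

# Invented training data: one bit per refined API feature.
train = [
    ([1, 1, 1, 0, 0], "malicious"),
    ([1, 1, 0, 1, 0], "malicious"),
    ([1, 0, 1, 1, 0], "malicious"),
    ([0, 0, 0, 0, 1], "benign"),
    ([0, 0, 0, 0, 0], "benign"),
    ([0, 1, 0, 0, 1], "benign"),
]
test_vectors = [[1, 1, 1, 1, 0], [0, 0, 0, 0, 1], [1, 1, 0, 0, 0], [1, 0, 1, 1, 0]]
test_labels = ["malicious", "benign", "benign", "malicious"]

predicted = [knn_predict(train, v) for v in test_vectors]
accuracy, fpr = accuracy_and_fpr(predicted, test_labels)
print(predicted, accuracy, fpr)
```

The third test app is benign but shares two API features with the malware samples, so the sketch flags it. This is the kind of error that the false positive rate measures.
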

Figure 5: Illustration of the approach DroidAPIMiner follows (Aafer et al., 2013)

Another well-known state-of-art approach to static analysis with machine learning classification is DREBIN (Arp et al., 2014). Like DroidAPIMiner, DREBIN relies on statically extracted features, but it gathers them from two sources: the manifest and the disassembled code. The system first examines the AndroidManifest.xml file and extracts the following sets: hardware components, requested permissions, app components and filtered intents. The hardware components set lists the hardware that the application requests, such as the GPS, camera, touchscreen and microphone. These requests are collected because an attacker might want to use the hardware to collect information, such as the user's location from the GPS. The requested permissions set contains the permissions declared in the manifest. Permissions matter because Android malware usually requests more permissions than benign applications, which renders it overprivileged (Sarma et al., 2012). The app components set records which types of components the application uses, which helps associate each malware sample with the components it relies on. The filtered intents set stores the intents that the application listens for; intents are used for inter-process communication. This is important since malware usually waits for the right moment to execute commands or, in more technical terms, listens for callbacks. For example, a malware author who wants to attach a command stealthily to a button click must wait for the onClick() callback.

Moreover, DREBIN collects information from the disassembled DEX code for the following sets: restricted API calls, used permissions, suspicious API calls and network addresses. The restricted API calls set includes the critical API calls that the Android operating system protects and does not make freely accessible to third party apps. If such calls appear in the DEX code without the corresponding permission being requested in the manifest file, the application may be using a root exploit. The used permissions set complements the previous one: it contains the permissions that are both requested and actually used. The set of suspicious API calls includes the API calls that are typically used to gain access to personal or private information as well as critical hardware resources. Lastly, the network addresses set includes IP addresses, TCP connections and URLs. This set is particularly important because malware might want to retrieve commands from a host or send information to it. Finally, these sets are embedded into a vector space.

Figure 6: Descriptive Scheme of steps that DREBIN follows (Arp et al., 2014)

With the vector space ready, the learning model needs to be set up. DREBIN uses a linear SVM, which constructs a hyperplane with a maximum margin to separate malicious and benign applications. The model is not trained on the mobile phone; it is trained on a dedicated offline system and then transferred to the phone, where it is used to classify applications as benign or malicious. DREBIN is highly regarded for its ability to explain its decision to the user when it deems an app malicious. It does this by using the weight assigned to each suspicious feature and turning the highest-weighted features into meaningful sentences. For example, Figure 7 shows an instance where DREBIN classified an application as malicious and explained the reasons for its decision.

Figure 7: Example run of DREBIN (Arp et al., 2014)

DREBIN might look like the perfect solution, but it faces a few limitations. Transformation attacks remain a main constraint for static analysis, for example the bytecode encryption explained earlier for DroidAPIMiner. Reflection attacks can also hinder the output. Reflection in Java is a useful feature, since it allows the developer to invoke a method of another class by name at run time; when used by malware authors, however, it proves troublesome for static analysis techniques. If a malware author combines reflection with encryption, it can become impossible for static analysis techniques to detect the malicious behavior, rendering static analysis inconclusive (Rastogi et al., 2013). Lastly, DREBIN remains dependent on the availability of the learning model, which is supplied by an offline machine.
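
The embedding and explanation steps described for DREBIN can be sketched as follows. This is only an illustration under stated assumptions: the feature names, the weights and the bias are invented rather than learned by an SVM, and the sentence templates DREBIN uses are not reproduced.

```python
def embed(app_features, vocabulary):
    """Map an app's feature set to a binary vector over a fixed vocabulary."""
    return [1 if feature in app_features else 0 for feature in vocabulary]

def decision_score(vector, weights, bias=0.0):
    """Linear decision function; a positive score means 'malicious' in this sketch."""
    return sum(w * x for w, x in zip(weights, vector)) + bias

def explain(app_features, vocabulary, weights, top=3):
    """Features present in the app, ordered by how much they push towards 'malicious'."""
    present = [(w, f) for f, w in zip(vocabulary, weights)
               if f in app_features and w > 0]
    return [f for _, f in sorted(present, reverse=True)[:top]]

# Hypothetical vocabulary and hand-picked weights (a real model would learn them).
vocabulary = [
    "permission::SEND_SMS",
    "api_call::sendTextMessage",
    "hardware::telephony",
    "intent::BOOT_COMPLETED",
    "permission::INTERNET",
]
weights = [1.2, 0.9, 0.3, 0.6, -0.1]
bias = -1.5

app = {"permission::SEND_SMS", "api_call::sendTextMessage", "permission::INTERNET"}
vector = embed(app, vocabulary)
score = decision_score(vector, weights, bias)
reasons = explain(app, vocabulary, weights)
print(vector, score > 0, reasons)
```

Because the model is linear, each feature's contribution to the score is simply its weight, which is what makes this kind of explanation cheap to produce on the phone.
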

The last state-of-art approach I am going to discuss is MaMaDroid (Mariconti et al., 2016), which, like DroidAPIMiner and DREBIN, analyzes Android malware statically but builds its features in a different way. MaMaDroid follows a behavioral approach to extract the features that are later used for classification. Its steps are shown in Figure 8. The first step statically extracts information from the application's apk file, most importantly the call graph. The next step extracts the sequences that the API calls follow. In computer science a graph is a web of connected nodes, so it has no inherent entry or exit point. Step 2 therefore takes the API calls from the call graph, defines entry nodes, and builds the sequences of calls that follow from them. Then, MaMaDroid abstracts the extracted sequences of API calls to a higher level, either their package or their family. In this way, MaMaDroid does not work with raw API calls but with high-level features, which makes it resistant to API changes. MaMaDroid then constructs the feature vectors used for classification, which are based on Markov chains that represent the transitions from one API call to the next. A Markov chain is usually depicted as a set of nodes, where each node represents a distinct state. Nodes are connected by edges, and each edge carries a transition probability, where the probabilities on the edges leaving any one node sum to exactly 1 (Norris, 1997). MaMaDroid depicts the transitions and sequences of the extracted API calls in the form of such a chain, and the feature vectors are the probabilities of transition from one state (an abstracted API call) to another. The last step is classification based on the feature set collected in the previous step.
MaMaDroid has many benefits, most importantly the abstraction of API calls and the resulting lower susceptibility to Android API changes. Because it abstracts raw API calls to their package or family level, the extraction process is more resilient to API changes over time. This makes MaMaDroid more scalable over time and significantly lowers the cost of retraining, which is a serious drawback of using an ML classifier. A trained MaMaDroid model can therefore remain in use for longer before it starts to lose accuracy and produce higher false positive rates. We also need to keep in mind that retraining a classifier on newer malware prepares it for newer samples but can leave it more vulnerable to old malware, because it has now been retrained for newer APIs (Mariconti et al., 2016). For this reason, MaMaDroid applies the opposite analysis, which tests whether the classifier can still detect malware created before the samples it was retrained on. Hence, MaMaDroid protects not only against new malware but also against older malware. Since MaMaDroid is a static analysis method, it inherits the limitations already addressed for DroidAPIMiner and DREBIN, although dynamic analysis can overcome these limitations. Furthermore, MaMaDroid needs a considerable amount of memory for the classification process, which is a limitation when compared with DREBIN.
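
The Markov chain features used by MaMaDroid can be sketched in a few lines. The sketch assumes that call sequences have already been extracted from the call graph, that every call is written as `package.Class.method`, and that package-level abstraction simply drops the last two components; the sequences themselves are invented.

```python
from collections import Counter
from itertools import product

def to_package(api_call):
    """Abstract 'pkg.sub.Class.method' to its package 'pkg.sub' (simplified rule)."""
    return ".".join(api_call.split(".")[:-2])

def transition_probabilities(sequences):
    """Estimate P(next state | current state) from consecutive abstracted calls."""
    counts = Counter()
    for sequence in sequences:
        states = [to_package(call) for call in sequence]
        counts.update(zip(states, states[1:]))
    outgoing = Counter()
    for (source, _), n in counts.items():
        outgoing[source] += n
    return {(s, d): n / outgoing[s] for (s, d), n in counts.items()}

def feature_vector(probabilities, states):
    """One entry per ordered pair of states; unseen transitions get 0."""
    return [probabilities.get(pair, 0.0) for pair in product(states, repeat=2)]

# Invented call sequences, standing in for paths extracted from a call graph.
sequences = [
    ["java.lang.String.valueOf",
     "android.telephony.TelephonyManager.getDeviceId",
     "java.net.URL.openConnection"],
    ["java.lang.String.valueOf",
     "android.telephony.SmsManager.sendTextMessage"],
    ["java.lang.String.valueOf",
     "java.net.URL.openConnection"],
]

probs = transition_probabilities(sequences)
states = sorted({state for pair in probs for state in pair})
vector = feature_vector(probs, states)
print(states)
print(vector)
```

Note that `TelephonyManager.getDeviceId` and `SmsManager.sendTextMessage` collapse into the same state, `android.telephony`. If one of these methods were renamed or replaced in a later API level, the state and its transitions would stay the same, which is the resilience to API changes described above.
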

Figure 8: Steps used in the operation of MaMaDroid (Mariconti et al., 2016)

4 Hybrid Malware Analysis

4.1 Background & Motivation

So far, we have discussed dynamic and static malware analysis approaches, and one may ask why we cannot apply both together in the same implementation. Dynamic and static analysis each come with various limitations, but most importantly their limitations are rarely the same. Therefore, an implementation that uses both static and dynamic analysis can, in theory, remedy a good number of those limitations. For instance, we can use static analysis to perform a lightweight and fast analysis of the Android apk file, and then run dynamic analysis at runtime to check for dynamic code loading (Tam et al., 2017). Developing a hybrid approach can prove tricky, since the developer has to choose wisely what to implement, given that dynamic analysis by itself imposes some runtime overhead. This section is dedicated to the discussion of hybrid malware analysis approaches such as RiskRanker (Grace et al., 2012), MAST (Chakradeo et al. 2013) and MARVIN (Lindorfer et al., 2015).

4.2 Existing Hybrid State-of-Art Approaches

The first hybrid malware analysis approach to be discussed is RiskRanker (Grace et al., 2012), which focuses on checking whether an application exhibits dangerous behavior. Unlike some other approaches, RiskRanker claims that it does not rely entirely on manually crafted rule sets or signatures. It achieves this by introducing the concept of risk to classify applications. Risks are classified into three categories: high, medium and low. A high-level risk threatens the integrity of the phone's software, for example through a root-exploit^[9]. A medium-level risk does not threaten the integrity of the phone's software, but usually takes the form of a threat that performs commands without the user's consent, such as giving away personally identifiable information. Low-level risks are discarded because they only threaten publicly available information, and we can safely say that almost every application carries a low-level risk. In its design, RiskRanker performs first order as well as second order risk analysis. First order risk analysis serves as the static analysis part, where the program statically extracts specific features to analyze. To detect high-level risk, RiskRanker looks for the use of native code and then compares the application against available signatures to verify whether it contains root exploits. To detect medium-level risk, RiskRanker operates on reverse engineered Dalvik bytecode to trace hidden calls to suspicious functions or methods. To connect a button to a function, a listener needs to be in place, in this case the onClick() callback. For instance, if a developer wants to stealthily execute a command without the user noticing, he will link it to the onClick() callback, and it will run when the user clicks the button.
First order analysis will fail if the malware authors employ code obfuscation, so the RiskRanker authors designed a second order analysis to handle this case through dynamic behavior analysis. First, RiskRanker checks whether the application exhibits questionable behavior such as storing a child app within the parent (host) app. A child app can be used to store exploits and code that are not present in the parent app, and it can also dynamically load^[10] external libraries and execute code. Then it checks whether the application runs encrypted code, in other words whether it calls decryption functions frequently. Lastly, RiskRanker checks for unsafe Dalvik code loading, since the static analysis in the first order analysis cannot detect it. One of the biggest achievements of RiskRanker is its ability to detect zero-day Android malware. Zero-day malware has new or undiscovered signatures, which makes it the most dangerous kind of malware. For instance, a zero-day might be an exploit for an undiscovered bug. However, RiskRanker has several limitations. In the first order analysis, the detection of high-level risk depends on signatures of previously seen malware. Another limitation lies in the second order analysis, when RiskRanker attempts to identify applications that use encryption to obfuscate code (Grace et al., 2012). RiskRanker focuses on identifying where the target application uses encryption functions provided by the Java library, such as javax.crypto, but malware authors can implement their own encryption functions or use third party libraries. In addition, the probability of false positives needs to be addressed, since some legitimate applications use dynamic loading for efficiency and some employ their own encryption functions for security.
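
The encryption-related limitation can be made concrete with a small sketch. It is not RiskRanker's implementation: it simply scans disassembled (smali-style) lines for references to the platform classes under `javax.crypto` and to `dalvik.system.DexClassLoader`, and the sample lines and the `com.example.hidden.Xor` class are invented.

```python
def second_order_flags(smali_lines):
    """Return the suspicious indicators whose marker appears in the disassembled code."""
    indicators = {
        "platform_crypto": "Ljavax/crypto/",
        "dynamic_loading": "Ldalvik/system/DexClassLoader;",
    }
    return {name for name, marker in indicators.items()
            if any(marker in line for line in smali_lines)}

# An app that decrypts a payload with the platform API and then loads it.
uses_platform_api = [
    "new-instance v0, Ldalvik/system/DexClassLoader;",
    "invoke-virtual {v1, v2}, Ljavax/crypto/Cipher;->doFinal([B)[B",
]

# An app that hides its payload behind a home-made cipher (hypothetical class).
uses_custom_cipher = [
    "invoke-static {v2}, Lcom/example/hidden/Xor;->decode([B)[B",
]

print(sorted(second_order_flags(uses_platform_api)))
print(sorted(second_order_flags(uses_custom_cipher)))
```

The second sample decrypts data just as the first one does, yet nothing is flagged because it never touches the platform crypto classes. Conversely, a legitimate app that uses `Cipher` and a class loader would be flagged, which illustrates the false positive concern.
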

Another state-of-art approach to hybrid analysis that we are going to discuss is MAST (Chakradeo et al. 2013), the Mobile Application Security Triage. MAST is based on Multiple Correspondence Analysis (MCA), a statistical method for measuring the relationships among numerous categorical (qualitative) data (Salkind et al. 2007). In simple terms, MAST develops a questionnaire for Android applications that searches for strong correlations between indicators of malware functionalities… To be continued

Another well-known approach to hybrid analysis of Android malware is Marvin (Lindorfer et al., 2015), where applications are analyzed using both static and dynamic analysis and are then given a malice score. Since Marvin's analysis is done externally, the authors have also developed an Android application called Andrubis^[11] and released it on the official Google Play Store. Users who download this application can upload any application of up to 9 megabytes to the server, and they will eventually receive a malice score for it. Users can also upload an application directly through the website interface and get the same result. Marvin relies on machine learning techniques to perform classification, similar to the previous static analysis approaches, but the system also includes dynamic analysis to provide run-time information. The steps involved in Marvin are depicted in Figure 9. The most important step is feature extraction, because it combines both static and dynamic analysis techniques. Static analysis provides a wealth of information about the application from available meta-data, such as file names, requested permissions and information about the publishing author. In addition, static analysis is responsible for extracting security-specific API calls as well as the number of actually used permissions versus requested permissions. Static analysis can also show details that dynamic analysis would miss, because dynamic analysis can suffer from limited code coverage. Dynamic analysis, in turn, is useful since it provides insight into how the application behaves at runtime. Most importantly, it detects dynamically loaded code, which can be loaded from websites, child applications and obfuscated executable files that are scrambled at installation and unscrambled during execution.
Another fundamental piece of information about malware is its network behavior. This can only be observed by dynamic analysis, since network activity happens at runtime. Network behavior can convey information such as IP addresses, communicating servers and HTTP communication. The static analysis part therefore focuses on extracting features from the Android package file, such as the used permissions related to the API calls, the use of APIs that perform cryptography, native code that is invoked not only through JNI (Java Native Interface) but also from Dalvik bytecode, and the use of APIs that enable reflection. Afterwards, the application is executed in a safe environment (an emulator) in a Dalvik Virtual Machine, where virtual machine introspection (VMI) takes place. The dynamic analysis focuses on extracting data leaks^[12], dynamically loaded code and network operations. The features extracted by both techniques are then combined to form a large dataset for classification. According to Marvin's evaluation, the system was tested on a collection of more than 135,000 Android applications and classified them with a success rate of 98.24% and a false positive rate of less than 0.04%. Marvin is known for allowing third parties and users to submit applications for analysis. In addition, the result interface shows the malice score as well as points of interest for further manual investigation, making it useful for security professionals and malware analysts. Marvin also employs a constant retraining scheme to avoid classifier degradation in the long term, although this depends on the availability of new malware samples. Since Marvin combines static and dynamic analysis, it inevitably inherits some of their limitations and shortcomings. One of the fundamental limitations is that the application to be analyzed has to be already installed on the user's phone.
Therefore, chances are that the user has already opened the application and the malicious activity has already taken place. Another shortcoming is related to dynamic analysis, and more precisely to the use of an emulator, since some malware is equipped to assess its environment and detect whether it is running in a virtual machine. Static analysis is a huge help here, since it provides information that would not be available from dynamic analysis alone. One theoretical limitation is that a malware author who finds out which features contribute most to the classification process might try to avoid using them, or add enough goodware features to make the classifier uncertain. For this to work, however, the malware author has to keep track of the retraining process, which makes such an attack considerably harder (Lindorfer et al., 2015).
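
The idea of combining static and dynamic features into a single malice score can be sketched as below. This is a simplified stand-in and not Marvin's classifier: the feature names, the weights, the bias, the logistic squashing and the 0 to 10 output range are all assumptions made for the example.

```python
import math

def combine(static_features, dynamic_features):
    """Prefix each feature with its origin so both sets share one feature space."""
    return ({f"static:{f}" for f in static_features}
            | {f"dynamic:{f}" for f in dynamic_features})

def malice_score(features, weights, bias=0.0):
    """Squash a linear score into the range 0..10 with a logistic function."""
    z = bias + sum(weights.get(f, 0.0) for f in features)
    return 10.0 / (1.0 + math.exp(-z))

# Hypothetical features and hand-picked weights (a real system would learn them).
static_features = {"permission:SEND_SMS", "api:reflection"}
dynamic_features = {"dynamic_code_loading", "network:http"}
weights = {
    "static:permission:SEND_SMS": 1.0,
    "static:api:reflection": 0.5,
    "dynamic:dynamic_code_loading": 1.5,
    "dynamic:network:http": 0.2,
}
bias = -2.0

static_only = malice_score(combine(static_features, set()), weights, bias)
hybrid = malice_score(combine(static_features, dynamic_features), weights, bias)
print(round(static_only, 2), round(hybrid, 2))
```

In this toy setting the static features alone leave the app below the midpoint of the scale, while adding the behavior observed at runtime pushes it above, which mirrors the argument that the two analyses complement each other.
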

Figure 9: Schematic overview of Marvin (Lindorfer et al., 2015)

^[1] By unofficial, we mean the third party application stores other than the Official Android Play Store.

^[2] It is worth noting that Figure 3 shows the discontinued architecture of the Android Operating System. The Dalvik Virtual Machine is now replaced by the Android Runtime(ART) environment. However, for the sake of explaining DroidScope we considered the older Android Architecture.

^[3] QEMU is a machine emulator that can emulate multiple CPUs such as x86, ARM and SPARC (Bellard et al., 2005).

^[4] The main difference between the Dalvik VM and ART is that the former uses Just-in-Time (JIT) compilation and the latter uses Ahead-of-Time (AOT) compilation. ART offers better performance at the cost of a one-time translation done at the installation phase. Nevertheless, this topic is outside the scope of this paper.

^[5] For instance, sandboxing prevents an application from taking or allocating memory that belongs to another running application.

^[6] Permissions have to be explicitly defined in the AndroidManifest.xml file, thus the sandbox restricts the application to the defined permissions accordingly.

^[7] The input/output control system call (ioctl) is responsible for sending information from a process to the kernel and returning information to the process (Ubuntu Manual).

^[8] AndroGuard is a static analysis tool available on GitHub(https://github.com/androguard/androguard)

^[9] A root-exploit is an exploit that can get root privileges on a phone, and since Android is built based on a Linux kernel, root privileges can give an application all privileges with no exception thus making it very powerful and malicious.

^[10] Dynamic loading in this paper is when an application loads a library or byte-code into memory at run-time.

^[11] Link: <http://play.google.com/store/apps/details?id=org.iseclab.andrubis&hl=en_GB>

^[12] Data leaks in this case are data sent without the user's consent to unknown destinations via routes such as an SMS message.
