CCCS-CIC-AndMal-2020
Canadian Institute for Cybersecurity (CIC) project in collaboration with Canadian Centre for Cyber Security (CCCS)

The unrivaled threat of Android malware is the root cause of various security problems on the internet. The Android malware industry is becoming increasingly disruptive, with almost 12,000 new Android malware instances every day. Detecting Android malware on smartphones is an essential goal for the cyber community in order to get rid of menacing malware samples.

Android malware is one of the most serious threats on the internet and has seen an unprecedented upsurge in recent years. It is an open challenge for cybersecurity experts. Many machine learning techniques are available to identify and classify Android malware, but deep learning has recently emerged as a prominent classification method for such samples. This research work proposes a new, comprehensive, and large Android malware dataset named CCCS-CIC-AndMal-2020. The dataset includes 200K benign and 200K malware samples, totalling 400K Android apps, with 14 prominent malware categories and 191 eminent malware families.

To generate a representative dataset, we collaborated with CCCS to capture 200K Android malware apps, which are labeled and characterized by their corresponding family. Benign Android apps (200K) are collected from the AndroZoo dataset to balance the dataset. We collected 14 malware categories: adware, backdoor, file infector, no category, Potentially Unwanted Apps (PUA), ransomware, riskware, scareware, trojan, trojan-banker, trojan-dropper, trojan-sms, trojan-spy, and zero-day.

A complete taxonomy of all the malware families of the captured malware apps is created by dividing their behavior into eight categories: sensitive data collection, media, hardware, actions/activities, internet connection, C&C, antivirus, and storage & settings. The taxonomy is presented in the research paper mentioned under license (Section 5).

Capturing data and final dataset
CCCS supported us in capturing real-world Android malware apps for analysis. We used VirusTotal to determine the malware family and labeled the dataset by following a consensus of 70% of antivirus engines, to make the labels reliable. We searched for similar malware samples so that samples with similar characteristics could be categorized together. Table 1 presents the details of the 14 Android malware categories along with the number of respective families and samples in the dataset. The families of each malware category in Table 1, along with the number of captured samples, are presented below:

[Adware family table: only the "Sr. No." column header survives]

For benign Android apps, we used the AndroZoo dataset, which currently contains more than eight million unique Android apps, and the number is still growing. The AndroZoo architecture collects apps from different sources, including the official Android market (Google Play), Anzhi, AppChina, 1mobile, and the Genome project dataset. A weekly updated list containing detailed information about all the apps is published. An HTTP API is provided to allow full download of the unaltered APKs from the AndroZoo dataset.

Static analysis
AndroidManifest.xml contains many features that can be used for static analysis.
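To make the static analysis note on AndroidManifest.xml concrete, here is a minimal sketch of how manifest features such as requested permissions, declared components, and intent actions could be pulled out with the Python standard library. It is not the extraction pipeline used for the dataset. The function name `extract_manifest_features`, the sample manifest, and the choice of features are illustrative assumptions, and the sketch assumes the manifest has already been decoded to plain-text XML.

```python
import xml.etree.ElementTree as ET

# Namespace that Android uses for attributes such as android:name.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME_ATTR = f"{{{ANDROID_NS}}}name"

# Hypothetical, hand-written manifest used only for illustration.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <activity android:name=".MainActivity">
      <intent-filter>
        <action android:name="android.intent.action.MAIN"/>
      </intent-filter>
    </activity>
    <receiver android:name=".BootReceiver">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>"""


def extract_manifest_features(manifest_xml: str) -> dict:
    """Collect a few static features from a decoded (plain-text) manifest."""
    root = ET.fromstring(manifest_xml)

    # Requested permissions, de-duplicated and sorted for stable output.
    permissions = sorted(
        {el.get(NAME_ATTR) for el in root.iter("uses-permission") if el.get(NAME_ATTR)}
    )

    # Names of the four standard Android component types.
    components = {
        tag: [el.get(NAME_ATTR) for el in root.iter(tag)]
        for tag in ("activity", "service", "receiver", "provider")
    }

    # Intent actions declared in any intent filter.
    actions = sorted(
        {el.get(NAME_ATTR) for el in root.iter("action") if el.get(NAME_ATTR)}
    )

    return {
        "package": root.get("package"),
        "permissions": permissions,
        "components": components,
        "intent_actions": actions,
    }


if __name__ == "__main__":
    for key, value in extract_manifest_features(SAMPLE_MANIFEST).items():
        print(key, value)
```

Inside an APK the manifest is stored as binary XML, so it has to be decoded first (for example with a tool such as apktool) before a parser like this can read it. The resulting lists can then be turned into binary or count features for a classifier.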