From throw-away traffic to bots: detecting the rise of DGA-based malware

USENIX Security Symposium, pp.24-24, (2012)


Abstract

Many botnet detection systems employ a blacklist of known command and control (C&C) domains to detect bots and block their traffic. Similar to signature-based virus detection, such a botnet detection approach is static because the blacklist is updated only after running an external (and often manual) process of domain discovery. As a resp...

Introduction
  • Botnets are groups of malware-compromised machines, or bots, that can be remotely controlled by an attacker through a command and control (C&C) communication channel.
  • Most botnets today rely on a centralized C&C server, whereby bots query a predefined C&C domain name that resolves to the IP address of the C&C server from which commands will be received.
  • Such centralized C&C structures suffer from the single point of failure problem because if the C&C domain is identified and taken down, the botmaster loses control over the entire botnet.
Highlights
  • Botnets are groups of malware-compromised machines, or bots, that can be remotely controlled by an attacker through a command and control (C&C) communication channel.
  • Most botnets today rely on a centralized command and control server, whereby bots query a predefined command and control domain name that resolves to the IP address of the command and control server from which commands will be received.
  • In an effort to combine the simplicity of centralized command and control with the robustness of P2P-based structures, attackers have recently developed a number of botnets that locate their command and control server through automatically generated pseudo-random domain names.
  • We use a supervised domain generation algorithm (DGA) Classifier to prune NXDomain clusters that appear to be generated by DGAs that we have previously discovered and modeled, or that contain domain names that are similar to popular legitimate domains.
  • Our goal is to identify which domain names, among the ones generated by the discovered DGA-based bots, resolve into a valid IP address.
  • Over the fifteen months of the operational deployment in a major ISP, Pleiades was able to identify six DGAs that belong to known malware families and six new DGAs never reported before.
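The highlights above rest on the idea that automatically generated pseudo-random domain names look statistically different from human-chosen ones. The following is a minimal sketch of that intuition only, not the feature set used by Pleiades: the function names `char_entropy` and `domain_features`, the choice of features, and the second sample domain are all assumptions made for illustration.

```python
import math
from collections import Counter


def char_entropy(label: str) -> float:
    """Shannon entropy, in bits per character, of the characters in a label."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(domain: str) -> dict:
    """Compute simple statistics over the leftmost label of a domain name.

    These three features are illustrative assumptions, not the paper's features.
    """
    label = domain.lower().rstrip(".").split(".")[0]
    digit_ratio = sum(c.isdigit() for c in label) / len(label) if label else 0.0
    return {
        "length": len(label),
        "entropy": char_entropy(label),
        "digit_ratio": digit_ratio,
    }


if __name__ == "__main__":
    # The second name is made up to resemble a pseudo-random domain.
    for name in ["google.com", "xkqzjwvtplmh.net"]:
        print(name, domain_features(name))
```

A real system would feed many such per-domain statistics into a clustering or classification step; a single feature on a single name is far too weak to separate DGA traffic from legitimate traffic.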
Results
  • In order to set the thresholds θmaj and θσ defined in Section 4.2, the authors spent the first five days of November 2010 labeling the 213 produced clusters as DGA related (Positive) or noisy (Negative).
  • For this experiment, the authors included all produced clusters without filtering out those with θμ = 98% “similarity” to an already known one.
  • All falsely reported clusters had variance very close to 0.001.
Conclusion
  • The results the authors presented in Table 1 show that Pleiades is able to construct a very accurate DGA Classifier module, which produces very few false positives and false negatives for α = 10.
  • Pleiades monitors traffic below the local recursive DNS server and analyzes streams of unsuccessful DNS resolutions, instead of relying on manual reverse engineering of bot malware and their DGAs.
  • Over the fifteen months of the operational deployment in a major ISP, Pleiades was able to identify six DGAs that belong to known malware families and six new DGAs never reported before.
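The conclusion describes analyzing streams of unsuccessful DNS resolutions seen below the recursive resolver. As a rough sketch of what the first step of such an analysis could look like, the code below groups NXDOMAIN responses by querying host. The record layout (`DnsResponse`), the function names, and the `min_unique` cutoff are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set


class DnsResponse(NamedTuple):
    """One observed DNS response (hypothetical record layout)."""
    host: str   # address of the client that issued the query
    qname: str  # queried domain name
    rcode: str  # response code, e.g. "NOERROR" or "NXDOMAIN"


def nxdomains_by_host(responses: Iterable[DnsResponse]) -> Dict[str, Set[str]]:
    """Collect the distinct non-existent domains queried by each host."""
    per_host: Dict[str, Set[str]] = defaultdict(set)
    for r in responses:
        if r.rcode == "NXDOMAIN":
            per_host[r.host].add(r.qname.lower().rstrip("."))
    return dict(per_host)


def hosts_with_many_nxdomains(per_host: Dict[str, Set[str]],
                              min_unique: int = 3) -> List[str]:
    """Return hosts with at least `min_unique` distinct NXDOMAIN names.

    The cutoff is an arbitrary illustrative value, not a tuned threshold.
    """
    return sorted(h for h, names in per_host.items() if len(names) >= min_unique)


if __name__ == "__main__":
    sample = [
        DnsResponse("10.0.0.5", "qwhzkd.example", "NXDOMAIN"),
        DnsResponse("10.0.0.5", "plmnbv.example", "NXDOMAIN"),
        DnsResponse("10.0.0.5", "zzkqtr.example", "NXDOMAIN"),
        DnsResponse("10.0.0.7", "www.example.com", "NOERROR"),
    ]
    print(hosts_with_many_nxdomains(nxdomains_by_host(sample)))
```

Counting alone would flag misconfigured or typo-prone clients as well, which is why the summary above mentions clustering the names and pruning clusters with a classifier rather than thresholding on volume.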
Tables
  • Table 1: Detection results (in %) using 10-fold cross validation for different values of α
  • Table 2: DGAs Detected by Pleiades
  • Table 3: TPs (%) for C&C detection (1,000 training sequences)
  • Table 4: C&C Infrastructure for BankPatch. The actual structure of the domain name used by this DGA can be separated into a four-byte prefix and a suffix string argument. The suffix string arguments we observed were: seapollo.com, tomvader.com, aulmala.com, apontis.com, fnomosk.com, erhogeld.com, erobots.com, ndsontex.com, rtehedel.com, nconnect.com, edsafe.com, berhogeld.com, musallied.com, newnacion.com, susaname.com, tvolveras.com and dminmont.com.
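The Table 4 description says the BankPatch domain names can be separated into a four-byte prefix and one of the observed suffix strings. A minimal sketch of that split follows. It assumes the four bytes are four ASCII characters placed directly in front of the suffix; the function name `split_bankpatch_domain` and the prefix `abcd` in the example are made up for illustration.

```python
from typing import Optional, Tuple

# Suffix string arguments listed in the Table 4 description.
OBSERVED_SUFFIXES = (
    "seapollo.com", "tomvader.com", "aulmala.com", "apontis.com",
    "fnomosk.com", "erhogeld.com", "erobots.com", "ndsontex.com",
    "rtehedel.com", "nconnect.com", "edsafe.com", "berhogeld.com",
    "musallied.com", "newnacion.com", "susaname.com", "tvolveras.com",
    "dminmont.com",
)

PREFIX_LEN = 4  # assumption: "four byte prefix" means four ASCII characters


def split_bankpatch_domain(domain: str) -> Optional[Tuple[str, str]]:
    """Split a name into (prefix, suffix) if it matches the described layout.

    Returns None when no observed suffix leaves exactly PREFIX_LEN characters.
    """
    name = domain.lower().rstrip(".")
    for suffix in OBSERVED_SUFFIXES:
        if name.endswith(suffix) and len(name) - len(suffix) == PREFIX_LEN:
            return name[:PREFIX_LEN], suffix
    return None


if __name__ == "__main__":
    # "abcd" is a made-up prefix used only to exercise the function.
    print(split_bankpatch_domain("abcdseapollo.com"))
    print(split_bankpatch_domain("www.example.com"))
```

Requiring the prefix length to be exactly four also resolves the overlap between `erhogeld.com` and `berhogeld.com`, since the two suffixes differ in length and so cannot both leave a four-character prefix for the same name.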
Related work
  • Dynamic domain generation has been used by malware to evade detection and complicate mitigation, e.g., Bobax, Kraken, Torpig, Srizbi, and Conficker [26]. To uncover the underlying domain generation algorithm (DGA), researchers often need to reverse engineer the bot binary. Such a task can be time consuming and requires advanced reverse engineering skills [18].

    The infamous Conficker worm is one of the most aggressive pieces of malware with respect to domain name generation. The “C” variant of the worm generated 50,000 domains per day. However, Conficker-C only queried 500 of these domains every 24 hours. In older variants of the worm, A and B, the worm cycled through the list of domains every three and two hours, respectively. In Conficker-C, the length of the generated domains was between four and ten characters, and the domains were distributed across 110 TLDs [27].
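The Conficker-C figures above (50,000 generated domains, 500 queried per day) can be turned into a small worked calculation. The sketch below assumes each bot picks its 500 domains uniformly at random without replacement from the day's 50,000; that sampling model and the function names are assumptions made for illustration, not statements from the text.

```python
from math import comb

GENERATED_PER_DAY = 50_000  # domains generated per day by Conficker-C
QUERIED_PER_DAY = 500       # domains actually queried per day


def fraction_queried() -> float:
    """Share of the daily candidate domains that a single bot queries."""
    return QUERIED_PER_DAY / GENERATED_PER_DAY


def prob_bot_hits_registered(registered: int) -> float:
    """Probability that a bot queries at least one of `registered` domains.

    Assumes uniform sampling without replacement (hypergeometric model).
    """
    if not 0 <= registered <= GENERATED_PER_DAY:
        raise ValueError("registered must be between 0 and GENERATED_PER_DAY")
    miss = (comb(GENERATED_PER_DAY - registered, QUERIED_PER_DAY)
            / comb(GENERATED_PER_DAY, QUERIED_PER_DAY))
    return 1.0 - miss


if __name__ == "__main__":
    print(f"fraction queried per day: {fraction_queried():.2%}")
    for m in (1, 10, 100):
        print(m, round(prob_bot_hits_registered(m), 4))
```

Under this model a single registered domain reaches any given bot with probability 500/50,000 on a given day, so whoever wants reliable contact with the bots, or reliable blocking, has to cover many of the candidate names.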
Reference
  • K. Aas and L. Eikvil. Text categorisation: A survey, 1999.
  • abuse.ch. ZeuS Gets More Sophisticated Using P2P Techniques. http://www.abuse.ch/
  • M. Antonakakis, R. Perdisci, D. Dagon, W. Lee, and N. Feamster. Building a dynamic reputation system for DNS. In the Proceedings of 19th
  • M. Antonakakis, R. Perdisci, W. Lee, N. Vasiloglou, and D. Dagon. Detecting malware domains in the upper DNS hierarchy. In the Proceedings of 20th USENIX Security Symposium (USENIX Security ’11), 2011.
  • //www.symantec.com/security_ 2008-081817-1808-99&tabid=2, 2009.
  • L. Bilge, E. Kirda, C. Kruegel, and M. Balduzzi. EXPOSURE: Finding malicious domains using passive DNS analysis. In Proceedings of NDSS, 2011.
  • R. Feldman and J. Sanger. The text mining handbook: advanced approaches in analyzing unstructured data. Cambridge Univ Pr, 200
  • www.microsoft.com/security/portal/ Name=Virus%3AWin32%2FExpiro.Z, 2011.
  • Y. Freund and L. Mason. The alternating decision tree learning algorithm. In Proceedings of the Sixteenth International Conference on Machine Learning, ICML ’99, 1999.
  • G. Gu, P. Porras, V. Yegneswaran, M. Fong, and USENIX Security, 2007.
  • [12] S. Golovanov and I. Soumenkov. TDL4 top bot. http://www.securelist.com/en/analysis/204792180/TDL4_Top_Bot, 2011.
  • [13] G. Gu, R. Perdisci, J. Zhang, and W. Lee. BotMiner: clustering analysis of network traffic for protocol- and structure-independent botnet detection. In USENIX Security, 2008.
  • [14] G. Gu, J. Zhang, and W. Lee. BotSniffer: Detecting botnet command and control channels in network traffic. In Network and Distributed System Security Symposium (NDSS), 2008.
  • wiki.mozilla.org/TLD_List, 2006.
  • [16] S. Krishnan and F. Monrose. DNS prefetching and its privacy implications: when good things go bad. In Proceedings of the 3rd USENIX conference on 10, Berkeley, CA, USA, 2010. USENIX Association.
  • Wiley-Interscience, 2004.
  • [18] M. H. Ligh, S. Adair, B. Hartstein, and M. Richard.
  • [19] P. Mockapetris. Domain names - concepts and facilities. http://www.ietf.org/rfc/rfc1034.txt, 1987.
  • [20] P. Mockapetris. Domain names - implementation and specification. http://www.ietf.org/rfc/rfc1035.txt, 1987.
  • University Press, 2010.
  • [22] A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In Advances In Neural Information Processing Systems, pages 849–856. MIT Press, 2001.
  • [23] P. Porras, H. Saidi, and V. Yegneswaran. An analysis of Conficker’s logic and rendezvous points. http://mtc.sri.com/Conficker/, 2009.
  • [24] D. Pelleg and A. W. Moore. X-means: Extending k-means with efficient estimation of the number of clusters. In Proceedings of the Seventeenth International Conference on Machine Learning, ICML ’00, pages 727–734, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.
  • http://www.cert.pl/news/4711/langswitch_lang/en, 2012.
  • [27] P. Porras, H. Saidi, and V. Yegneswaran. Conficker C analysis. Technical report, SRI International, Menlo Park, CA, April 2009.
  • chapter A tutorial on hidden Markov models and selected applications in speech recognition. 1990.
  • http://www.damballa.com/downloads/r_pubs/KrakenWhitepaper.pdf, 2008.
  • http://blog.threatexpert.com/2008/11/srizbis-domain-calculator.html, 2008.
  • //www.sophos.com/en-us/ detailed-analysis.aspx, 2012.
  • //www.secureworks.com/research/threats/bobax/, 2004.
  • [34] B. Stone-Gross, M. Cova, L. Cavallaro, B. Gilbert, M. Szydlowski, R. Kemmerer, C. Kruegel, and G. Vigna. Your botnet is my botnet: analysis of a botnet takeover. In Proceedings of the 16th ACM Conference on Computer and Communications Security, CCS ’09, pages 635–647, New York, NY, USA, 2009. ACM.
  • [35] S. Stover, D. Dittrich, J. Hernandez, and S. Dietrich. Analysis of the storm and nugache trojans: P2P is here. In USENIX ;login:, vol. 32, no. 6, December 2007.
  • [36] T.-F. Yen and M. K. Reiter. Are your hosts trading or plotting? Telling P2P file-sharing and bots apart. In ICDCS, 2010.
  • [37] R. Villamarin-Salomon and J. Brustoloni. Identifying botnets using anomaly detection techniques applied to DNS traffic. In 5th Consumer Communications and Networking Conference, 2008.
  • [38] Wikipedia. The storm botnet. http://en.wikipedia.org/wiki/Storm_botnet, 2010.
  • [39] J. Williams. What we know (and learned) from the waledac takedown. http://tinyurl.com/
  • http://blog.fireeye.com/research/2008/11/technicaldetails-of-srizbis-domain-generationalgorithm.html, 2008.
  • //www.microsoft.com/security/ aspx?Name=Trojan%3AJava%2FBoonana, 2011.
  • [42] S. Yadav, A. K. K. Reddy, A. N. Reddy, and S. Ranjan. Detecting algorithmically generated malicious domain names. In Proceedings of the 10th annual Conference on Internet Measurement, IMC ’10, pages 48–61, New York, NY, USA, 2010. ACM.
  • [43] S. Yadav and A. N. Reddy. Winning with DNS failures: Strategies for faster botnet detection. In Privacy in Communication Networks, 2011.
  • [44] T.-F. Yen and M. K. Reiter. Traffic aggregation for malware detection. In Proc. International conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA), 2008.
  • http://isc.sans.edu/ requests/10312, 2011.
  • [46] J. Zhang, R. Perdisci, W. Lee, U. Sarfraz, and International Conference on Dependable Systems and Networks - Dependable Computing and Communication Symposium, 2011.