Publications

Spam E-Mail Classification by Utilizing N-Gram Features of Hyperlink Texts

Paper

Spam E-Mail Classification by Utilizing N-Gram Features of Hyperlink Texts

April, 2019

Authors
A. Selman Bozkir
Esra Sahin
Murat Aydos
Ebru Akcapinar Sezer
Fatih Orhan

Utilization and Comparison of Convolutional Neural Networks in Malware Recognition

Paper

Utilization and Comparison of Convolutional Neural Networks in Malware Recognition

March, 2019

Authors
Ahmet Selman Bozkır
Murat Aydos

Experience of Converting an Existing Cloud Application into a Standalone System by Using Docker and Related Technologies

This paper gives basic information about Docker, a technology that is frequently mentioned nowadays, and describes its differences from and advantages over virtualization technologies. It also briefly describes the architecture of Valkyrie, a cloud-based product, and outlines the actions to be taken to meet the requirements, along with the expected problems.

Finally, the paper discusses the experience of converting this product into a standalone package using Docker and related technologies, the way Docker and Docker Swarm were utilized throughout this experience, and the additional benefits obtained from this architecture.

Conference

UYMK 2018 - Ulusal Yazılım Mimarisi Konferansı (National Conference of Software Architecture)

Paper

Experience of Converting an Existing Cloud Application into a Standalone System by Using Docker and Related Technologies

November, 2018

Authors
Serdar Mumcu
Gürkan Karahan
Anıl Doğan

Intractable Problems in Malware Analysis and Practical Solutions

The paper "Intractable Problems in Malware Analysis and Practical Solutions" was prepared by Ali Aydın Selçuk, Fatih Orhan and Berker Batur. The related article can be examined here.

June, 2018

Authors
Prof. Dr. Ali Aydın Selçuk
Fatih Orhan
Berker Batur

Spam Filtering Using Big Data and Deep Learning

Spam e-mails and other fake or falsified e-mails, such as phishing messages, are all considered spam. They aim to collect sensitive personal information about users over the network or to act against authority in an illegal way. Most e-mails on the Internet contain spam content or related spam-like content such as phishing. Since the main purpose of this behavior is to harm Internet users financially or to exploit the community maliciously, it is vital to detect these spam e-mails immediately to prevent unauthorized access to e-mail users' credentials.

To detect spam e-mails, using successful machine learning and classification methods is therefore important for the timely processing of e-mails. Considering the billions of e-mails on the Internet, automatically classifying e-mails as spam or not spam is an important problem. In this thesis, we studied supervised machine learning, and specifically "deep learning" methods, to classify e-mails. Our results indicate that deep learning is very promising, classifying e-mails with an accuracy of up to 96%.

Thesis

Spam Filtering Using Big Data and Deep Learning

February, 2018

Authors
Prof. Dr. Erdoğan Doğdu
Onur Göker

Core Illumination: Traffic Analysis in Cyberspace

The information security discipline devotes immense resources to developing and protecting a core set of protocols that encode and encrypt Internet communications. However, since the dawn of human conflict, simple traffic analysis (TA) has been used to circumvent innumerable security schemes. TA leverages metadata and hard-to-conceal network flow data related to the source, destination, size, frequency, and direction of information, from which eavesdroppers can often deduce a comprehensive intelligence analysis. TA is effective in both the hard and soft sciences, and provides an edge in economic, political, intelligence and military affairs.

Today, modern information technology, including the ubiquity of computers, and the interconnected nature of cyberspace, has made TA a global and universally accessible discipline. Further, due to privacy issues, it is also a global concern. Digital metadata, affordable computer storage, and automated information processing now record and analyze nearly all human activities, and the scrutiny is growing more acute by the day. Corporate, law enforcement, and intelligence agencies have access to strategic datasets from which they can drill down to the tactical level at any moment. This paper discusses the nature of TA, how it has evolved in the Internet era, and demonstrates the power of high-level analysis based on a large cybersecurity dataset.
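As a rough illustration of the idea described above, the following minimal Python sketch aggregates flow metadata (source, destination, size, frequency and direction) without looking at any message content. The record layout, the function name `summarize_flows`, and the sample addresses are assumptions made for this example; they are not taken from the paper or its dataset.

```python
from collections import defaultdict


def summarize_flows(flows):
    """Aggregate flow metadata per (source, destination) pair.

    Each flow is a (source, destination, size_in_bytes) tuple. Only
    metadata is used; no payload content is inspected. Direction is
    captured by the order of the pair.
    """
    summary = defaultdict(lambda: {"count": 0, "bytes": 0})
    for source, destination, size in flows:
        entry = summary[(source, destination)]
        entry["count"] += 1      # frequency of communication
        entry["bytes"] += size   # total volume in this direction
    return dict(summary)


# Hypothetical flow records using documentation-reserved IP addresses.
flows = [
    ("192.0.2.10", "198.51.100.5", 1200),
    ("192.0.2.10", "198.51.100.5", 800),
    ("198.51.100.5", "192.0.2.10", 40000),
]
summary = summarize_flows(flows)
```

Even in this toy case, the asymmetry between the two directions (small requests, one large response) hints at the kind of relationship an observer might infer from metadata alone.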

Conference

NATO Cooperative Cyber Defence Centre of Excellence

December, 2017

Authors
Kenneth Geers

Undecidable Problems in Malware Analysis

Malware analysis is a challenging task in the theory as well as the practice of computer science. Many important problems in malware analysis have been shown to be undecidable. These problems include virus detection, detecting unpacking execution, matching malware samples against a set of given templates, and detecting trigger-based behavior. In this paper, Prof. Dr. Ali Aydın Selçuk, Fatih Orhan and Berker Batur review the undecidability results in malware analysis and discuss what can be done in practice. The related article can be examined here.

Conference

International Conference for Internet Technology and Secured Transactions, ICITST-2017

December, 2017

Authors
Prof. Dr. Ali Aydın Selçuk
Fatih Orhan
Berker Batur

KDD 2017 Best Paper Award and Best Student Paper Award for Applied Data Science Track

The paper "HinDroid: An Intelligent Android Malware Detection System Based on Structured Heterogeneous Information Network" was prepared by Shifu Hou, Yanfang Ye, Yangqiu Song and Melih Abdulhayoglu. To detect Android malware, instead of using Application Programming Interface (API) calls only, the authors further analyze the different relationships between them and create higher-level semantics that require more effort from attackers to evade detection. The paper was selected as the KDD 2017 Best Paper, as well as the KDD 2017 Best Student Paper, for the Applied Data Science track. The related article can be examined here.
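To give a feel for what relating apps through their API calls can look like, here is a minimal Python sketch that counts the API calls two apps have in common. It only illustrates the general idea; the paper's own relation types and model are not reproduced here, and the function names, app data and the short list of API names are assumptions chosen for the example.

```python
def app_api_matrix(app_calls, api_names):
    """Binary matrix A where A[i][j] = 1 if app i calls API j."""
    return [[1 if api in calls else 0 for api in api_names] for calls in app_calls]


def shared_api_counts(matrix):
    """Compute A * A^T: entry [i][k] counts the APIs that apps i and k both call."""
    n = len(matrix)
    return [
        [sum(a * b for a, b in zip(matrix[i], matrix[k])) for k in range(n)]
        for i in range(n)
    ]


# Hypothetical apps and an illustrative handful of API names.
api_names = ["sendTextMessage", "getDeviceId", "openConnection"]
app_calls = [
    {"sendTextMessage", "getDeviceId"},
    {"getDeviceId", "openConnection"},
    {"openConnection"},
]
A = app_api_matrix(app_calls, api_names)
S = shared_api_counts(A)
```

Here `S[0][1]` is 1 because the first two apps share one API call, while `S[0][2]` is 0 because the first and third apps share none. A relationship-based view like this is harder to evade than a check on any single API call, since changing one call alters several entries at once.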

Conference

Conference on Knowledge Discovery and Data Mining, KDD 2017

August, 2017

Authors
Shifu Hou
Yanfang Ye
Yangqiu Song
Melih Abdulhayoglu

Spam Email Detection by Employing Machine Learning Methods over n-Gram Features of Email Hyperlink Texts

As part of an academic cooperation between Comodo and Assist. Prof. Dr. Murat Aydos and his colleagues from Hacettepe University, studies were conducted on spam e-mail detection using machine learning methods. Throughout the study, a novel, large-scale dataset covering 140,000 hyperlink texts belonging to spam and ham e-mails was used for feature extraction and performance evaluation. To generate the required vocabularies, unigram, bigram and trigram models were examined. Next, three different machine learning methods (SVM and Naive Bayes as non-active learners, and SVM-Pegasos as an active learner) were employed to classify each link. According to the results, classification using the trigram-based bag-of-words representation reaches up to 99% accuracy with at most a 1% false-positive rate, outperforming the unigram and bigram models. Apart from its high accuracy, the proposed approach also preserves the privacy of customers, since it does not require any analysis of e-mail body contents.
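The feature extraction step described above can be sketched in a few lines of Python. This is a minimal illustration that assumes word-level n-grams and simple whitespace tokenization; the study's actual tokenization, vocabulary pruning and classifiers are not shown, and the function names and sample link texts are invented for the example.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of text as space-joined, lowercased strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(texts, n):
    """Map every distinct n-gram seen in texts to a column index."""
    vocab = {}
    for text in texts:
        for gram in word_ngrams(text, n):
            vocab.setdefault(gram, len(vocab))
    return vocab


def bag_of_words(text, vocab, n):
    """Count vector of text over vocab; n-grams outside vocab are ignored."""
    counts = Counter(word_ngrams(text, n))
    vector = [0] * len(vocab)
    for gram, count in counts.items():
        if gram in vocab:
            vector[vocab[gram]] = count
    return vector


# Hypothetical hyperlink texts, invented for illustration only.
link_texts = [
    "click here to claim your prize",
    "view your monthly account statement",
]
trigram_vocab = build_vocabulary(link_texts, 3)
features = [bag_of_words(t, trigram_vocab, 3) for t in link_texts]
```

Each row of `features` could then be passed to a classifier such as an SVM or Naive Bayes model. Changing the last argument from 3 to 1 or 2 yields the unigram and bigram variants for comparison.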

Conference

The 11th IEEE International Conference AICT2017

September, 2017

Authors
Ahmet Selman Bozkır
Esra Sahin
Murat Aydos
Fatih Orhan

Mobile Malware Detection Using Deep Neural Network

The study "Mobile Malware Detection Using Deep Neural Network", conducted by Assoc. Prof. Dr. Ali Gökhan Yavuz and his student İrfan Bulut from Yildiz Technical University, is one of the successful studies carried out under Comodemia. The study presents a novel deep learning model for predicting mobile malware without requiring execution in a sandbox environment. Application permissions were used as features. After the weights were optimized with an autoencoder, the applications were classified with a multilayer perceptron with an accuracy of 93.67%.
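Using permissions as features typically means turning each application's permission list into a fixed-length numeric vector. The Python sketch below shows one simple way to do that, assuming a binary (requested or not requested) encoding; the encoding used in the study may differ, and the function name and the short permission list are assumptions for this example. The autoencoder and multilayer perceptron stages are not shown.

```python
def permission_vector(requested, known_permissions):
    """Encode an app's requested permissions as a 0/1 vector.

    Position i is 1 if known_permissions[i] was requested, otherwise 0.
    Permissions that are not in known_permissions are ignored.
    """
    requested_set = set(requested)
    return [1 if p in requested_set else 0 for p in known_permissions]


# A small, illustrative subset of Android permission names.
KNOWN_PERMISSIONS = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.CAMERA",
]

# Hypothetical app that requests two of the known permissions.
app_permissions = ["android.permission.INTERNET", "android.permission.SEND_SMS"]
vector = permission_vector(app_permissions, KNOWN_PERMISSIONS)
```

Because permissions are declared in the application manifest, a vector like this can be built statically, which is consistent with the goal of avoiding sandbox execution.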

Conference

Signal Processing and Communications Applications Conference (SIU)

May, 2017

Authors
Irfan Bulut
A. Gökhan Yavuz

METU Cyber Defense and Security R&D Lab: CyDeS

The CyDeS Laboratory was established under the METU Informatics Institute in mid-2014, with the support of Comodo Group, Inc., one of the leading companies in the world in the field of information technology and security certification. Since then, CyDeS has consolidated a substantial amount of ongoing cybersecurity research being carried out at METU, in addition to guiding and sponsoring new research projects. CyDeS has been hosting the International Symposium on Cyber Defense & Security conference at the METU Informatics Institute. The event brought together distinguished guests from all backgrounds, including public, private and academic circles. In this regard, it was very important for achieving the CyDeS vision of leading cybersecurity research by bridging researchers with the companies that are active in the field.

Zero-day Malware detection by Ensemble based Hybridization of Static and Dynamic Malware Detection Techniques

The presentation "Zero-day Malware detection by Ensemble based Hybridization of Static and Dynamic Malware Detection Techniques" was given by Mesut Kaya, Berker Batur and Tamer Tavaslıoğlu. The related study can be examined here.

Reconfigurable Framework for Remote Monitoring and Management of Computer Systems

As part of the cooperation with academia, the article "Reconfigurable Framework for Remote Monitoring and Management of Computer Systems" was prepared by Assoc. Prof. Dr. Halit Oğuztüzün from the Middle East Technical University (METU) Computer Engineering Department and Gülşah Yalçın from Comodo. Remote Monitoring and Management systems are information technology software tools for organizing and managing client workstations. They are used by many companies that want to minimize their labor costs, collect and measure data from a variety of clients, and administer them from a single point in a reliable and secure way. The main features of remote monitoring systems are dynamic profile deployment, dynamic reconfiguration of monitors in response to changes in clients' profiles, and creating notifications or running a procedure on the fly; these can be fully expressed by the Dynamic Software Product Line approach. The framework aims to provide IT service providers with a dynamically reconfigurable, reusable and easy-to-define smart monitoring and measurement mechanism.

Ham E-Mail Classification using Machine Learning Methods based on Bag of Words Technique

Nowadays, we frequently use e-mail, one of the communication channels of the electronic environment. It plays an important role in our lives for many reasons, such as personal communication, business-focused activities, marketing and advertising. E-mails make life easier by meeting many different types of communication needs. On the other hand, they can make life difficult when they are used outside of their intended purposes. Spam e-mails can be not only annoying for receivers but also dangerous for the receiver's information security. Detecting and preventing spam e-mails has become a separate issue. In this study, the texts of the links in the e-mail body are handled and classified using machine learning methods and the bag-of-words technique. We analyzed the effect of different n-grams on classification performance, and the success of different machine learning techniques in classifying spam e-mail, using the accuracy metric. The related article can be examined here.