Artificial Intelligence and Data Mining Approaches in Security Frameworks - Group of authors - Page 36

2.6 Phishing Website Classification

Phishing is a kind of social engineering attack generally used to steal user data such as login credentials and credit card numbers. Fraudsters typically build forged websites that impersonate legitimate ones. As a result of these phishing activities, users are tricked into losing their money, so protecting online trading is a critical step. The prediction and classification accuracy for a website depends on the quality of the features extracted from it. Most internet users rely on an anti-phishing tool to feel safe against phishing attacks, and such a tool must predict phishing accurately. The content of a phishing website, together with the security indicators shown in the browser, may provide a set of clues. Various methods have been proposed to handle the problem of phishing. Rule-based classification, a data mining technique, is used as an efficient method for predicting phishing attacks. An email that asks its recipients to reveal personal information is an indication of phishing. Phishers rely on a set of common features to create convincing phishing websites. We can therefore distinguish between phishy and non-phishy websites on the basis of the features extracted from the visited website.

Identification of phishing sites can be done with the help of two approaches:

i) Blacklist based: The requested URL is compared with the URLs present in the blacklist.
ii) Heuristic based: Certain features are collected from various websites, and each website is labeled as either phishy or genuine.
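
The two approaches can be illustrated with a minimal Python sketch. It is not taken from the book: the blacklist entry, the function names, and the particular URL features (IP address as host, "@" symbol, URL length threshold, number of subdomains, HTTPS) are assumptions chosen only for illustration.

```python
from urllib.parse import urlparse
import ipaddress

# Hypothetical blacklist; a real deployment would load a maintained feed.
BLACKLIST = {"http://paypa1-login.example.com/verify"}


def normalize(url):
    """Lowercase scheme and host and drop a trailing slash before comparing."""
    parts = urlparse(url.strip())
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path.rstrip('/')}"


def is_blacklisted(url, blacklist=BLACKLIST):
    """Blacklist approach: compare the requested URL with the listed URLs."""
    return normalize(url) in {normalize(entry) for entry in blacklist}


def host_is_ip(host):
    """Return True when the host is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_features(url):
    """Heuristic approach: collect illustrative (assumed) features from a URL."""
    parts = urlparse(url)
    host = parts.hostname or ""
    return {
        "has_ip_host": host_is_ip(host),
        "has_at_symbol": "@" in url,
        "long_url": len(url) > 75,          # threshold is an assumption
        "many_subdomains": host.count(".") > 3,
        "uses_https": parts.scheme.lower() == "https",
    }
```

A blacklist lookup answers only for URLs that are already listed, whereas the feature dictionary can be computed for any URL, including one never seen before, and then passed to a classifier that labels the site as phishy or genuine.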

The main drawback of the blacklist-based approach is that the list can never contain all phishing websites, because new malicious websites are launched continuously, whereas a heuristic-based approach can identify fake websites that are newly created. Heuristic-based methods depend on the features selected and the manner in which they are processed. Data mining is used to discover relations and patterns among the features within a given dataset. Its principal job is to support decisions, because those decisions depend on the patterns and rules derived by data mining algorithms. Although considerable progress has been made in developing prevention techniques, phishing remains a threat because the countermeasures are still based on reactive URL blacklisting (Polychronakis, 2009). Because phishing websites are short-lived, such methods are considered ineffective. A newer approach, associative classification (AC), was found to be more appropriate for these kinds of applications; it combines the association rule and classification techniques of data mining.

There are two stages in associative classification (AC):

i) Training phase: Hidden knowledge (rules) is induced with the help of association rule mining.
ii) Classification phase: A classifier is built after pruning ineffective and redundant rules.
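
The two phases can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the algorithm of any specific AC system: the toy dataset, the feature names, the support and confidence thresholds, and the coverage-based pruning step are all hypothetical choices.

```python
from collections import Counter
from itertools import combinations

# Toy training data (assumed): each row is the set of features seen on a site.
rows = [
    {"ip_host", "long_url"},
    {"ip_host", "at_symbol"},
    {"ip_host", "long_url", "at_symbol"},
    {"https"},
    {"https", "long_url"},
    {"https"},
]
labels = ["phishy", "phishy", "phishy", "genuine", "genuine", "genuine"]


def mine_rules(rows, labels, min_support=0.3, min_confidence=0.8, max_len=2):
    """Training phase: induce rules of the form (itemset -> class label)."""
    itemset_counts, rule_counts = Counter(), Counter()
    for items, label in zip(rows, labels):
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                itemset_counts[combo] += 1
                rule_counts[(combo, label)] += 1
    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / len(rows)
        confidence = count / itemset_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, label, support, confidence))
    return rules


def build_classifier(rules, rows, labels):
    """Classification phase: rank the rules and prune the redundant ones."""
    ranked = sorted(rules, key=lambda r: (-r[3], -r[2], len(r[0]), r[0]))
    uncovered = set(range(len(rows)))
    kept = []
    for combo, label, support, confidence in ranked:
        matched = {i for i in uncovered if set(combo) <= rows[i]}
        # Keep a rule only if it correctly classifies a still-uncovered row.
        if any(labels[i] == label for i in matched):
            kept.append((combo, label, support, confidence))
            uncovered -= matched
    default = Counter(labels).most_common(1)[0][0]
    return kept, default


def predict(classifier, default, items):
    """Apply the first rule whose antecedent is contained in the item set."""
    for combo, label, _, _ in classifier:
        if set(combo) <= items:
            return label
    return default


rules = mine_rules(rows, labels)
classifier, default = build_classifier(rules, rows, labels)
print(predict(classifier, default, {"ip_host", "at_symbol"}))
```

On this toy data the mining step produces several overlapping rules, and the pruning step keeps only those needed to cover the training rows; the final classifier is an ordered rule list plus a default class for sites that match no rule.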

Many research studies have shown that associative classification (AC) generally produces classifiers with a lower error rate than standard classification approaches such as decision trees and rule induction.
