2020 Phishing Trends With PDF Files

  • By : Saber Mohamed

    Between 2019 and 2020, we noticed a dramatic 1,160% increase in malicious PDF files, from 411,800 malicious files to 5,224,056. PDF files are an enticing phishing vector because they are cross-platform and let attackers engage with users, which makes their schemes more believable than a text-based email with just a plain link.

    We have identified the top five schemes that attackers used in 2020 to lure users into clicking on embedded links and buttons in phishing PDF files. We have grouped them as Fake CAPTCHA, Coupon, Play Button, File Sharing and E-commerce.

    Palo Alto Networks customers are protected against attacks from phishing documents through various services, such as Cortex XDR, AutoFocus and Next-Generation Firewalls with security subscriptions including WildFire, Threat Prevention, URL Filtering and DNS Security.

    Data Collection

    To analyze the trends that we observed in 2020, we leveraged the data collected from the Palo Alto Networks WildFire platform. We collected a subset of phishing PDF samples on a weekly basis throughout 2020. We then used heuristic-based processing and manual analysis to identify the top themes in the collected dataset. Once these were identified, we created YARA rules that matched the files in each bucket, and applied those rules across all the malicious PDF files that we observed through WildFire.
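
    The bucketing step can be illustrated with a minimal Python sketch. It is not the authors' pipeline and it does not use YARA: plain regular expressions stand in for the rules, and the bucket names and byte patterns below are hypothetical, chosen only to show the idea of "a sample falls into the first bucket whose patterns all match".

```python
import re
from collections import Counter

# Hypothetical byte patterns per bucket (assumptions for illustration only).
# Real YARA rules would be far more specific than these keywords.
SIGNATURES = {
    "fake_captcha": [re.compile(rb"captcha", re.I)],
    "coupon": [re.compile(rb"coupon", re.I), re.compile(rb"discount", re.I)],
    "file_sharing": [re.compile(rb"shared\s+a\s+file", re.I)],
}


def classify(data):
    """Return the first bucket whose patterns all match, else 'other'."""
    for bucket, patterns in SIGNATURES.items():
        if all(p.search(data) for p in patterns):
            return bucket
    return "other"


def distribution(samples):
    """Count how many samples land in each bucket."""
    return Counter(classify(s) for s in samples)
```

    Calling `distribution()` on a list of raw file contents yields per-bucket counts of the kind a distribution chart summarizes; anything that matches no bucket is counted as "other".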

    Data Overview

    In 2020, we observed more than 5 million malicious PDF files. Table 1 shows the increase in the percentage of malicious PDF files we observed in 2020 compared to 2019.



    Table 1 columns: Total PDF Files Seen | Percentage of PDF Malware | Percentage Increase

    Table 1. Distribution of malicious PDF samples in 2019 and 2020.

    The pie chart in Figure 1 gives an overview of how the top trends and schemes were distributed. The largest number of malicious PDF files that we observed through WildFire belonged to the Fake CAPTCHA category. In the following sections, we go over each scheme in detail. We do not discuss the files that fall into the “Other” category, as they vary too much and do not demonstrate a common theme.

    Figure 1. Malicious PDF trends in 2020.

    Usage of Traffic Redirection

    After studying different malicious PDF campaigns, we found one technique that was common to the majority of them: traffic redirection.

    Before we review the different PDF phishing campaigns, we will discuss the importance of traffic redirection in malicious and phishing PDF files. The links embedded in phishing PDF files often take the user to a gating website, from which they are redirected either to a single malicious website or through several of them in sequence. By embedding a link to the gating website rather than to the final phishing website, which can be subject to frequent takedowns, the attacker can extend the shelf life of the phishing PDF lure and also evade detection. Additionally, the final objective of the lure can be changed as needed (e.g., the attacker could choose to change the final website from a credential-stealing site to a credit card fraud site). Although not specific to PDF files, the technique of traffic redirection for malware-based websites is discussed at length in “Analysis of Redirection Caused by Web-based Malware” by Takata et al.
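
    To make the shelf-life argument concrete, here is a small Python sketch that simulates a redirect chain. It makes no network requests: a dictionary stands in for the HTTP redirects a browser would follow, and every URL in it is a made-up placeholder, not an observed indicator.

```python
# Hypothetical redirect table standing in for real HTTP 3xx responses.
# Only the first URL (the gate) would be embedded in the PDF.
REDIRECTS = {
    "https://gate.example/click": "https://hop1.example/r",
    "https://hop1.example/r": "https://landing.example/login",
}


def follow(url, redirects, max_hops=10):
    """Return the URLs visited, starting at url and ending at the final page."""
    chain = [url]
    while chain[-1] in redirects and len(chain) <= max_hops:
        nxt = redirects[chain[-1]]
        if nxt in chain:  # stop on a redirect loop
            break
        chain.append(nxt)
    return chain
```

    If the entry for `https://hop1.example/r` is later pointed at a different landing page, `follow("https://gate.example/click", REDIRECTS)` ends somewhere new even though the link embedded in the PDF never changed. The same property means that a takedown of the final site does not invalidate the lure.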

    Phishing Trends With PDF Files

    We identified the top five phishing schemes from our dataset and will break them down in the order of their distribution. It is important to keep in mind that phishing PDF files often act as a secondary step and work in conjunction with their carrier (e.g., an email or a web post that contains them).

    1. Fake CAPTCHA

    Fake CAPTCHA PDF files, as the name suggests, demand that users verify themselves through a fake CAPTCHA. CAPTCHAs are challenge-response tests that help determine whether or not a user is human. However, the phishing PDF files we observed do not use a real CAPTCHA; they use an embedded image of a CAPTCHA test instead. As soon as users try to “verify” themselves by clicking on the continue button, they are taken to an attacker-controlled website. Figure 2 shows an example of a PDF file with an embedded fake CAPTCHA, which is just a clickable image. A detailed analysis of the full attack chain for these files is included in the section Fake CAPTCHA Analysis.
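
    A clickable image in a PDF is typically a link annotation carrying a URI action, so one quick triage step is to list the URLs a file points to. The Python sketch below is an illustration under stated assumptions, not the analysis method used in this research: it scans raw bytes for `/URI (...)` entries, so it only sees uncompressed objects and ignores escaped parentheses and hex-encoded strings. The sample URL is a placeholder.

```python
import re

# Matches "/URI (…)" entries in uncompressed PDF objects. Escaped parentheses,
# hex strings and objects inside compressed streams are not handled here.
URI_RE = re.compile(rb"/URI\s*\(([^)]*)\)")


def extract_uris(pdf_bytes):
    """Return the link targets found in the raw bytes of a PDF file."""
    return [m.decode("latin-1") for m in URI_RE.findall(pdf_bytes)]


# Hypothetical annotation of the kind that makes a full-page image clickable.
SAMPLE = (
    b"<< /Type /Annot /Subtype /Link /Rect [0 0 612 792] "
    b"/A << /S /URI /URI (https://gate.example/verify) >> >>"
)
```

    Running `extract_uris(SAMPLE)` returns the single embedded target. For real samples, a full PDF parser is the safer choice, since link targets are often stored inside compressed object streams.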
