
The History Of Ethical Hacking And Penetration Testing

Predicting the future, 60 years ago. Sponsored by Horizon3.ai

David Braue

Melbourne, Australia – Feb. 13, 2025

Penetration testing is on a tear at the moment, with companies pouring money into the fast-evolving sector as surging cyberattacks and increasing regulatory expectations jolt executives into investing in proactive security after what has often been too many years of complacency.

The global penetration testing market is pegged to exceed $5 billion annually by 2031, according to Cybersecurity Ventures, with one recent study finding that 85 percent of U.S. and European companies had increased their penetration testing budgets.

Most are starting from a low baseline: although one recent survey found that 73 percent of enterprises are changing their IT environments at least quarterly, just 40 percent say they are pentesting those environments as often despite committing an average of $164,400 – 13 percent of their annual IT budgets – to pentesting.

The future growth of this market is poised to come from automation of the process – with a growing roster of penetration-testing-as-a-service (PTaaS) companies enabling continuous penetration testing, and generative AI (GenAI) flagged as the latest technology set to transform the way the tests are run.

With so many businesses only getting serious about penetration testing now, you’d be forgiven for thinking that the practice had just emerged over the past few years. The reality, however, is that pen testing as a concept has been around for nearly 60 years – introduced by one forward-thinking computer specialist at April 1967’s Joint Computer Conference in Atlantic City, NJ.



Setting the penetration testing agenda

In a presentation to the more than 15,000 computing professionals gathered for that event, RAND Corporation computer engineer Willis H. Ware shared a seminal paper called Security and Privacy in Computer Systems that would become a manifesto for the cybersecurity industry – and one that recognized the importance of penetration testing from day one.

“One would argue on principle that maximum protection should be given to all information labelled private,” he said, arguing that private-sector companies wouldn’t necessarily be held to the same strict security standards as the government and military organizations that dominated networked computing at the time.

In the absence of military-level controls prohibiting the sharing of classified information, Ware said, there was no guarantee that companies would invest the time or money to secure their data well enough to keep interested outsiders out.

“If privacy of information is not protected by law and authority,” he explained, “we can expect that the owner of sensitive information will require a system designed to guarantee protection only against the threat as he sees it.”

Driven by a lack of imagination and introspection, Ware warned, many companies were likely to underinvest in data security – leaving blind spots that would leave them exposed to attacks from outsiders ready to find a way around whatever security defenses they had put in place.

“Private information will always have some value to an outside party,” he said, “and it must be expected that penetrations will be attempted against computer systems handling such information.”

“The value of private information to an outsider will determine the resources he is willing to expend to acquire it – [and] the value of information to its owner is related to what he is willing to pay to protect it.”

“Perhaps,” Ware postulated, “this game-like situation can be played out to arrive at a rational basis for establishing the level of protection.”

Unchaining the tigers

Ware’s early vision of companies and hackers locked in continuous conflict proved remarkably prescient – laying the groundwork for the Defense Science Board Task Force on Computer Security’s (TFCS) 1970 Ware Report and for decades of evolving tactics, techniques, and procedures (TTPs) on both sides of the conflict.

Heeding his warnings about a state of play that was rapidly emerging as inevitable, RAND and government agencies partnered to form ‘tiger teams’ – borrowing a term from the space and military complex – that would methodically probe a computer system’s design and develop techniques to resolve the network, hardware, and software vulnerabilities they identified.

Computer pioneer James P. Anderson, a Penn State graduate who initially trained as a meteorologist before a career in the Navy led him to cryptography and membership of the TFCS, ultimately authored his own report – 1972’s two-volume Anderson Report, which laid out a framework for cybersecurity’s growth during the 1970s.

The process of penetration testing was a core part of his methodology, which outlined a detection-mitigation loop in which specialists would continuously look for vulnerabilities, design exploits for those vulnerabilities, then look for weaknesses in those exploits that would allow security systems to intercept and disable the threat.
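That detection-mitigation loop can be sketched in a few lines of code. The sketch below is a hypothetical illustration under assumed data structures – the 1972 report describes a process, not an API, and names like `find_vulnerabilities` and `KNOWN_FLAWS` are inventions for this example:

```python
# Hypothetical sketch of an Anderson-style detection-mitigation loop.
# All names and data structures are illustrative assumptions.

KNOWN_FLAWS = {"ftpd-1.0", "telnetd-0.9"}  # assumed vulnerable versions

def find_vulnerabilities(system):
    """Probe a system description for services running known-flawed versions."""
    return [svc for svc in system["services"] if svc["version"] in KNOWN_FLAWS]

def design_exploit(vuln):
    """Pair a discovered weakness with a candidate exploitation technique."""
    return {"target": vuln["name"], "technique": "exploit-" + vuln["version"]}

def derive_mitigation(exploit):
    """Study the exploit to produce a rule that intercepts and disables it."""
    return {"blocks": exploit["technique"], "action": "intercept"}

def detection_mitigation_loop(system):
    """Run one full cycle: detect each weakness, exploit it, then mitigate it."""
    return [derive_mitigation(design_exploit(v))
            for v in find_vulnerabilities(system)]

system = {"services": [
    {"name": "ftp", "version": "ftpd-1.0"},    # flagged as vulnerable
    {"name": "http", "version": "httpd-2.4"},  # passes unchanged
]}
rules = detection_mitigation_loop(system)
```

The essential point of the loop is that each exploit is itself an input to the next defensive step, so every finding ends as a mitigation rule rather than just a report entry.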

This approach was, notably, tested in anger in 1974 when the US Air Force ran one of the first known ‘white hat’ attacks – the term comes from the hats worn in movie Westerns by Lone Ranger-like cowboys as they fought dastardly black-hatted villains – against its Multiplexed Information and Computing Service (Multics), which shaped secure distributed-computing architectures for decades afterwards.

That testing turned up a number of vulnerabilities, allowing Multics engineers to fix the problems before they could be exploited by malicious outsiders such as nation-state actors – a breed of adversary that Ware’s 1967 presentation anticipated in warning that some organizations “for reasons of national interest will someday find the professional cryptoanalytic effort of a foreign government focused on the privacy-protecting measures of a computer network.”



The path to automated testing

Increasing awareness of the immense potential of computers naturally attracted curious hackers like Steve Wozniak and the late Kevin Mitnick, who came of age as part of a new generation for whom computers were less a newfangled development and more a technology that promised to change the world.

The personal computer revolution of the 1980s democratized computing and networking technology, with the predictable complications for data security as Ware’s “game-like situation” was writ large and pioneers like Peter Norton began establishing the security brands that would shape the next 40 years.

A 1983 Department of Defense security manual, known as the Trusted Computer System Evaluation Criteria (TCSEC) or ‘Orange Book’, outlined procedures for penetration testing at a range of security levels and mandated, among other things, at least 20 hours’ work by a team including at least two computer science graduates and, for higher security levels, others with master’s degrees in computer science or equivalent.

And while many hackers found themselves able to avoid prosecution due to grey areas in existing information protection laws that failed to address hacking, the 1986 Computer Fraud and Abuse Act (CFAA) drew a line in the sand – outlining a broad range of data protections with the backing of the U.S. Department of Justice.

As well as potentially punishing hackers for testing the defenses of corporations and government bodies, the CFAA made penetration testing riskier: white-hat hackers could now theoretically be prosecuted for computer trespassing, however good their intentions.

This paved the way for professional penetration testers working under the legal protection of large consulting giants, which saw value in penetration testing as a way of identifying gaps in clients’ network security – and, no doubt, opportunities to upsell them on new security consulting services as the Internet became commonplace in the 1990s.

Whereas penetration testing was largely conducted on a hobby or individual practitioner basis in the 1980s, its elevation to part of the technology consulting pantheon saw the process of detecting vulnerabilities start to be automated – with security researcher Dan Farmer and programmer Wietse Venema taking a major step with the 1995 release of the freely available Security Administrator Tool for Analyzing Networks (SATAN).

SATAN revolutionized the practice of penetration testing, pairing a detailed network scanner – which could also map out a network and details of all connected hosts – with a web browser interface that made it easy to use and presented results in an actionable way.

Although SATAN fell out of use over time, the paradigm it established spawned tools like nmap, Nessus, SARA and SAINT – which, along with its companion SAINTexploit pentesting tool, maps out available network services and throws a barrage of exploits to identify which vulnerabilities exist within a particular network environment.
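The scanning half of that paradigm is simple to sketch. The following is a minimal illustration, using only Python’s standard library, of the kind of TCP service mapping these tools perform – none of this code comes from SATAN or its successors, which add OS fingerprinting, banner grabbing, and exploit modules on top of the basic connect scan:

```python
import socket

def scan_port(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds (port is open)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0

def map_services(host: str, ports) -> list:
    """Map which of the given ports accept connections, scanner-style."""
    return [p for p in ports if scan_port(host, p)]

# Example: probe a handful of well-known service ports on the local machine.
# open_ports = map_services("127.0.0.1", [21, 22, 80, 443])
```

Even this toy version shows why the technique cuts both ways: the same connect scan that lets a defender inventory exposed services lets an attacker do the same reconnaissance. Only scan hosts you are authorized to test.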

Formalizing the pentesting function

The early years of this century saw the steady codification of pentesting as a discipline, with developments such as 2003’s OWASP Web Security Testing Guide laying down a methodological framework for pentesting that is still in use today.

By 2009, formalization of the Penetration Testing Execution Standard (PTES) worked to translate what had been a highly technical practice into the business sphere, both providing technical standards and aiming to help businesses understand the business value of penetration testing through a seven-phase model that spans pre-engagement interactions, intelligence gathering, threat modelling, vulnerability analysis, exploitation, post-exploitation, and reporting.

Intervening years have seen the formalization of a raft of derivative industry standards for pentesting in specific situations, such as the now mandatory pentesting requirements set out in the Payment Card Industry Data Security Standard (PCI DSS) 4.0; the National Institute of Standards and Technology (NIST) Technical Guide to Information Security Testing and Assessment and its adaptation to the requirements of healthcare’s HIPAA governance rules; and the establishment of formal certification schemes such as CHECK, which helps UK businesses identify approved pentesting companies that are considered safe to hire.

Cybersecurity associations now offer a range of pentesting certifications to help security practitioners formalize their capabilities, including CompTIA PenTest+, EC-Council Certified Ethical Hacker (CEH) and Licensed Penetration Tester (LPT), Certified Penetration Tester (CPT), Certified Red Team Operations Professional (CRTOP), and others.

Yet even as the cybersecurity industry has both proceduralized the process of pentesting and built on this professionalism to sell the concept to the businesses that rely on it, automated pentesting tools and frameworks have paradoxically increased the threat to those companies.

This is because all of these TTPs are also readily available to cybercriminals – who have wasted no time using them to probe potential targets for soft spots that can be used to launch DDoS attacks, circumvent firewalls, exploit weaknesses in remote access platforms, and more.

To parry this threat, many businesses – which often lack the broad and deep pentesting skills needed to regularly run meaningful, standards-compliant testing – have warmed to alternative models for detecting threats such as crowdsourcing, in which companies like Bugcrowd, HackerOne, and Synack engage massive online communities of security experts to conduct ethical penetration tests for clients.



Automation vs automation

It may have taken proponents of penetration testing several decades to get the discipline taken seriously and adopted broadly, but today – whether you enlist internal security staff to run penetration tests or outsource the work to PTaaS firms – a penetration testing strategy is essential for any company making a cybersecurity investment.

Just as automated pentesting and PTaaS offerings have allowed companies to test their security more frequently – after every new software build or update, potentially, rather than quarterly or annually as in the past – the emergence of generative AI (GenAI) technology is disrupting the industry once again as both white hat and black hat teams lean on the technology to support their work.

One recent study by Australian and Indian academics, for example, evaluated the use of the ChatGPT 3.5 large language model (LLM) during pentesting and reported “amazing” results, including “better pentesting report[s].”

By adopting GenAI, the authors wrote, “penetration testing becomes more creative, test environments are customized, and continuous learning and adaptation is achieved…. LLMs can quickly analyze large amounts of data and generate test scenarios based on various parameters, streamlining the testing process and saving valuable time for security professionals.”

GenAI, the researchers said, proved adept at analyzing historical records of attack vectors and “mimicking human-like behaviour” – helping security teams “better understand and anticipate the tactics that real attackers may employ…. In a black box pentest where the tester receives zero information on the target, social engineering attacks or a phishing campaign can be launched in no time at all.”

Yet just as GenAI can support well-intentioned pentesting activities, on the other side of the coin it is being co-opted by cybercriminals to craft targeted attacks that are more efficient than ever.

Red teaming and testing “with the hacker mindset… is a big focus,” OpenPolicy co-founder and CEO Dr Amit Elazari observed at this year’s RSA Conference, noting that pentesting has become “common in many [environments].”

“Your organization should already be working with friendly hackers and collaborating on vulnerability disclosure programs, and thinking about boundaries – but that concept is going to get pushed even further with AI.”

And while pentesting continues to require human oversight and interpretation of results, it is not hard to envision an increasingly automated response as corporate networks are prodded by both offensive and defensive vulnerability scanners racing to identify and exploit unpatched vulnerabilities before the other side does.

The toolset may have changed, but many of the dynamics of today’s pentesting environment would come as no surprise to Ware, who passed away in 2013 as cybersecurity was finally and meaningfully moving from the IT department to the boardroom.

Even back in 1967, however, he could see the writing on the wall.

“Private information will always have some value to an outside party,” he wrote, “and it must be expected that penetrations will be attempted against computer systems handling such information…. Deliberate penetrations must be anticipated, if not expected.”

“If one can estimate the nature and extent of the penetration effort expected against an industrial system, perhaps it can be used as a design parameter to establish the level of protection for sensitive information.”

David Braue is Editor-at-Large at Cybercrime Magazine and an award-winning technology writer based in Melbourne, Australia.


Sponsored by Horizon3.ai

Horizon3.ai is a mix of U.S. Special Operations, U.S. National Security, and cybersecurity industry veterans. Our mission is to “turn the map around” – using the attacker’s perspective to help enterprises prioritize defensive efforts.

Our team of nation-state-level, ethical hackers continuously identifies new attack vectors through autonomous pentesting and red team operations, leveraging collective intelligence to improve our products and strengthen our clients’ security. Founded in 2019, Horizon3.ai is headquartered in San Francisco, Calif., and 100 percent made in the USA.