The cybersecurity landscape is defined by massive data repositories used to train detection models and track evolving threats. Recent disclosures from vx-underground and VirusTotal reveal the staggering scale of these archives: vx-underground holds roughly 30 terabytes of malware source code, while VirusTotal manages an immense 31 petabytes of user-contributed samples.
Visualizing Data at Scale
To put these numbers into perspective, we translate each archive into a physical stack of standard 3.5-inch internal hard drives. Each drive is assumed to hold 1 terabyte and to stand one inch tall.
The 30 terabytes held by vx-underground would fill 30 drives, forming a stack 30 inches (2.5 feet) tall. That is modest next to the VirusTotal repository.
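The conversion above can be sketched in a few lines of Python. This is a minimal illustration under the article's stated assumptions (1 terabyte and one inch per drive); the constant and function names are hypothetical, chosen only for this example.

```python
import math

# Assumptions taken from the article: each drive holds 1 TB and is 1 inch tall.
TB_PER_DRIVE = 1
INCHES_PER_DRIVE = 1.0
INCHES_PER_FOOT = 12


def drives_needed(terabytes: float) -> int:
    """Number of whole drives required to hold the given volume."""
    return math.ceil(terabytes / TB_PER_DRIVE)


def stack_height_feet(terabytes: float) -> float:
    """Height in feet of a vertical stack of drives holding the given volume."""
    return drives_needed(terabytes) * INCHES_PER_DRIVE / INCHES_PER_FOOT


print(drives_needed(30))      # 30
print(stack_height_feet(30))  # 2.5
```

Changing `TB_PER_DRIVE` shows how strongly the result depends on the assumed capacity: larger drives shrink the stack proportionally.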

The Skyscraper Comparison
The math behind the VirusTotal archive is significantly more dramatic. At 31 petabytes, with one petabyte counted as 1,024 terabytes, the collection would require 31,744 individual hard drives.
If stacked vertically, these drives would reach approximately 2,645 feet into the air. For context, this is nearly the height of the world’s tallest building, the Burj Khalifa in Dubai, which stands at 2,722 feet. Alternatively, the stack would be about as tall as two and a half Eiffel Towers (1,083 feet each) placed on top of one another.
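The drive count depends on how a petabyte is converted to terabytes: the 31,744 figure corresponds to 1,024 terabytes per petabyte, while the decimal convention of 1,000 gives a slightly shorter stack. The sketch below compares both conventions against the landmark heights quoted in the article. The function and constant names are hypothetical, and the one-inch, 1-terabyte drive is the article's assumption.

```python
INCHES_PER_FOOT = 12

# Landmark heights as quoted in the article, in feet.
BURJ_KHALIFA_FT = 2722
EIFFEL_TOWER_FT = 1083


def virustotal_stack(petabytes: int, tb_per_pb: int) -> tuple[int, float]:
    """Return (drive count, stack height in feet) for 1 TB, 1-inch drives."""
    drives = petabytes * tb_per_pb    # one drive per terabyte
    feet = drives / INCHES_PER_FOOT   # one inch per drive
    return drives, feet


for label, tb_per_pb in (("binary", 1024), ("decimal", 1000)):
    drives, feet = virustotal_stack(31, tb_per_pb)
    print(
        f"{label}: {drives:,} drives, {feet:,.0f} ft, "
        f"{feet / BURJ_KHALIFA_FT:.0%} of the Burj Khalifa, "
        f"{feet / EIFFEL_TOWER_FT:.2f} Eiffel Towers"
    )
```

With the binary convention this yields 31,744 drives and roughly 2,645 feet; with the decimal convention, 31,000 drives and roughly 2,583 feet. Either way the stack falls just short of the Burj Khalifa.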
These figures highlight why cybersecurity firms and AI researchers depend on automated systems. As threat intelligence evolves, the volume of data needed to identify and neutralize malicious code keeps growing, and it already reaches a scale that rivals iconic architectural landmarks.
