Oct. 6, 2017
When it comes to the Dark Web, people tend to think of websites for black market drug sales and even child pornography. A news organization once described the Dark Web in alarming terms as “a vast, secret, cyber underworld” that accounts for “90 percent of the Internet”, a description that has misled many people*1. In fact, it confused the “Dark Web” with the much bigger and generally more benign “Deep Web”.
This article explains what the Dark Web is, how it differs from the Deep Web, how it works, what is happening on it, and how information can be collected from it.
1. What is the Dark Web? How does it differ from the Deep Web?
The Dark Web is not vast, it does not account for 90 percent of the Internet, and it is not even secret. The Dark Web is a collection of websites that exist on overlay networks running on top of the public Internet. Anyone can access a Dark Website, but it is very difficult to determine where it is hosted or who hosts it. Accessing these sites requires specific software. The majority of Dark Websites use Tor*2, and a smaller number use a similar tool called I2P*3.
Then what is the Deep Web?
Although newspapers and mainstream news outlets tend to use “Dark Web” and “Deep Web” interchangeably, the two terms do not mean the same thing.
The “Deep Web” refers to all web pages that search engines cannot find. The “Deep Web” therefore includes the “Dark Web”, but it also includes all user databases, webmail pages, registration-required web forums, and pages behind paywalls. There are huge numbers of such pages, and most exist for mundane reasons. Consider how many pages a single Gmail account creates, and you will appreciate the sheer size of the Deep Web. This scale is why newspapers and mainstream news outlets tell scare stories that “90 percent of the internet” consists of the “Dark Web”, when the term “Deep Web” should have been used.
2. How does the Dark Web work?
Tor implements onion routing, and I2P uses a closely related variant called garlic routing. In onion routing, data is encrypted in layers and relayed through onion routers located all over the world, each of which removes one layer before passing the data on. In this way, the sender’s anonymity is protected from both observers and the recipient.
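The layering idea can be made concrete with a small sketch. The code below is only an illustration under loud assumptions: the relay names, the keys, and the SHA-256 based XOR "cipher" are invented for this example and are not how Tor or I2P actually encrypt traffic. It shows the one property that matters here: each relay can remove only its own layer and learns only the next hop.

```python
import hashlib
from typing import List, Tuple

def keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` bytes with SHA-256 (toy construction, NOT secure)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(out[:length])

def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer: XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def wrap(message: bytes, route: List[Tuple[str, bytes]], destination: str) -> bytes:
    """Sender side: build the onion from the inside out.

    Each layer hides the name of the next hop plus everything inside it.
    """
    payload, next_hop = message, destination
    for name, key in reversed(route):
        payload = xor_layer(key, next_hop.encode() + b"\n" + payload)
        next_hop = name
    return payload

def peel(key: bytes, onion: bytes) -> Tuple[str, bytes]:
    """Relay side: remove one layer and read where to forward the rest."""
    plain = xor_layer(key, onion)
    next_hop, _, inner = plain.partition(b"\n")
    return next_hop.decode(), inner

# Hypothetical three-relay route; names and keys are made up for the example.
ROUTE = [("relay-1", b"key-one"), ("relay-2", b"key-two"), ("relay-3", b"key-three")]

if __name__ == "__main__":
    onion = wrap(b"hello", ROUTE, "recipient")
    for _, key in ROUTE:
        hop, onion = peel(key, onion)
        print("forward to", hop)
    print("delivered:", onion)
```

In this sketch the first relay sees who sent the data but only the name of the second relay, and the last relay sees the destination but not the sender. A real deployment would use vetted ciphers and negotiated keys rather than the toy keystream above.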
Tor provides hidden services as a mechanism for users to anonymously offer services that are accessible to other Tor users. I2P provides an equivalent service called an eepSite. Hidden services provide recipient anonymity, i.e., they hide the IP address of the server hosting the hidden service. Every Tor hidden service has an onion URL that ends in “.onion”. Facebook’s official hidden service, for example, has the onion address “facebookcorewwwi.onion”. There is no central repository that lists all hidden services, so users can only reach those whose onion URLs they know. Some aggregation sites offer lists of onion URLs. The official site of the Tor Project explains the details of hidden services*4.
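As a small illustration of what an onion address looks like, the following sketch checks whether a hostname has the shape of one. The function name `looks_like_onion_host` is hypothetical. The pattern assumes the 16-character base32 form used by the addresses quoted in this article, plus the longer 56-character form of newer addresses. It checks the format only and says nothing about whether a service actually exists at that address.

```python
import re

# Base32 alphabet used by onion addresses: letters a-z and digits 2-7.
# 16 characters: the older format (e.g. facebookcorewwwi.onion).
# 56 characters: the newer, longer format.
ONION_HOST = re.compile(r"(?:[a-z2-7]{16}|[a-z2-7]{56})\.onion")

def looks_like_onion_host(host: str) -> bool:
    """Return True if `host` is shaped like an onion address (no subdomains)."""
    return ONION_HOST.fullmatch(host.strip().lower()) is not None

if __name__ == "__main__":
    for host in ("facebookcorewwwi.onion", "propub3r6espa33w.onion", "example.com"):
        print(host, looks_like_onion_host(host))
```

Actually reaching such a host still requires a running Tor client. An ordinary DNS lookup for a “.onion” name is not expected to resolve.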
3. What is happening on the Dark Web?
Thanks to descriptions of the Dark Web as “a vast, secret, cyber underworld”, people tend to associate it with the sale of drugs, weapons, counterfeit documents and child pornography.
The most infamous examples of Dark Websites include the Silk Road and its successors. The Silk Road was (and perhaps still is) a website for trading in recreational drugs.*5
In August 2015, it was reported that 10GB of data stolen from Ashley Madison, a site designed to enable bored spouses to cheat on their partners, had been dumped onto the Dark Web. The files appear to include account details and log-ins for some 32 million users. Seven years’ worth of credit card and other payment transaction details are also part of the dump*6.
However, not everything on the Dark Web is so “dark”. One of the first high-profile Dark Websites was the Tor hidden service that WikiLeaks created to accept leaks from anonymous sources. People who live in countries with Internet censorship can use the Dark Web to communicate with the outside world.
The non-profit ProPublica has won two Pulitzer Prizes for its investigative journalism and, since it began publishing in 2008, has shed light on controversial topics such as NSA spying, drug cartels, doping in sports, and corporate money in politics. ProPublica is also the first major news organization to launch a .onion website on the Dark Web (propub3r6espa33w.onion)*7, which allows everyone to read ProPublica’s content without being tracked. ProPublica also uses the Dark Web to protect its sources; such protection played a major role in breaking news about the Edward Snowden leaks. It uses SecureDrop*8, a system that runs as a hidden service on Tor and lets any news organization receive anonymous submissions.
4. How to collect information from the Dark Web?
Because of its anonymity, the Dark Web offers extremists and terrorists great potential for coordination, propaganda delivery, and other interactions between extremist groups. Government agencies and law enforcement agencies worldwide therefore monitor online social networks and virtual communities, including those in the Deep Web and the Dark Web, in order to collect information in advance.
Research*9, *10 has been done on extracting topics, trends, content, and even key members from forums, blogs, and social networks on the Dark Web, using methods that combine text mining and social network analysis. This research contributes to the automated and semi-automated gathering and analysis of information.
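To give a rough feel for how text mining and social network analysis can be combined, here is a minimal sketch. It is not the method of the cited papers, which rely on topic models and community detection. It swaps in plain word counting for topic extraction and reply counting for key-member detection. The posts, author names, and stopword list are invented toy data.

```python
import re
from collections import Counter
from typing import List, Optional, Tuple

# Invented toy data: (author, author being replied to or None, post text).
Post = Tuple[str, Optional[str], str]
POSTS: List[Post] = [
    ("alice", None, "new exploit kit for sale, exploit works on old routers"),
    ("bob", "alice", "what is the price of the exploit kit"),
    ("carol", "alice", "does the kit include a loader"),
    ("alice", "bob", "price is negotiable"),
    ("dave", "carol", "loader sold separately"),
]

# Tiny hand-picked stopword list, just enough for the toy posts above.
STOPWORDS = {"the", "is", "of", "a", "on", "for", "what", "does"}

def top_terms(posts: List[Post], n: int = 3) -> List[Tuple[str, int]]:
    """Text mining step: the most frequent non-stopword terms across all posts."""
    counts: Counter = Counter()
    for _, _, text in posts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

def reply_degree(posts: List[Post]) -> Counter:
    """Network step: count reply links touching each author (sent or received)."""
    degree: Counter = Counter()
    for author, replied_to, _ in posts:
        if replied_to is not None:
            degree[author] += 1
            degree[replied_to] += 1
    return degree

if __name__ == "__main__":
    print("frequent terms:", top_terms(POSTS, 2))
    print("most connected:", reply_degree(POSTS).most_common(1))
```

Frequent terms hint at what a community is discussing, and the authors with the most reply links are candidates for key members. Real studies replace both steps with far more robust models.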
Some security vendors provide services for collecting information from password-protected forums and from social media connections with individuals and groups in the Deep Web. Their practice includes cultivating and operating virtual identities in online spaces, which allows their analysts to access social media platforms as an ordinary web user would. Human analysts can then befriend terrorists and crackers on social networks, participate in discussions, and learn about target selection and attack methods. The vendors offer their reports to clients, including government departments, financial institutions, and various corporations.
Preventing cyber-attacks is becoming ever more difficult. The knowledge obtained from cracker groups on the Dark Web, including target selection and other pre-attack intelligence, helps issue preemptive alerts about imminent attacks and thus helps strengthen prevention. Since this information will be necessary for the future information security strategies of governments, financial institutions, and private corporations, we expect that further research will be done in academia and that there will be great demand for the aforementioned services in the near future.
- *1 60 Minutes - The Dark Web, https://www.youtube.com/watch?v=7AonC0BKyJw
- *2 Tor Project: Anonymity Online, https://www.torproject.org (Accessed 6th Oct 2017)
- *3 I2P Anonymous Network, https://geti2p.net (Accessed 6th Oct 2017)
- *4 Tor: Hidden Service Protocol, https://www.torproject.org/docs/hidden-services.html.en (Accessed 6th Oct 2017)
- *5 Deep Web and Cybercrime - It Is Not Just the Silk Road, http://blog.trendmicro.com/trendlabs-security-intelligence/deepweb-and-cybercrime-it-is-not-just-the-silk-road/
- *6 Hackers Finally Post Stolen Ashley Madison Data, http://www.wired.com/2015/08/happened-hackers-posted-stolen-ashley-madison-data/
- *7 Why ProPublica Joined the Dark Web, http://www.propublica.org/podcast/item/why-propublica-joined-the-dark-web/
- *8 SecureDrop, https://securedrop.org/ (Accessed 6th Oct 2017)
- *9 Gaston L’Huillier, Sebastián A. Ríos, Hector Alvarez, and Felipe Aguilera. 2010. Topic-based social network analysis for virtual communities of interests in the Dark Web. In ACM SIGKDD Workshop on Intelligence and Security Informatics (ISI-KDD ’10). Article 9, 9 pages.
- *10 Sebastián A. Ríos and Ricardo Muñoz. 2012. Dark Web portal overlapping community detection based on topic models. In Proceedings of the ACM SIGKDD Workshop on Intelligence and Security Informatics (ISI-KDD ’12). Article 2, 7 pages.
Security Engineering Department
Technology and Innovation General Headquarters
Fei Feng joined NTT DATA in 2015. She was engaged in a wide range of information security R&D, including the evaluation of cyber intelligence services, malware analysis and digital forensics. Currently, she is responsible for the evaluation of security products.