Nicholas, Charles
Eckert, Matthew
2019-10-11
2019-10-11
2015-01-01
11413

This paper explores the HTML characteristics of the deep web by gathering HTML tag frequencies on web pages using three different web crawling techniques. The first technique seeded the web crawler with the most popular websites listed by Alexa and randomly selected a sample of the crawled pages to include in the statistics. The second technique randomly generated shortened URLs and visited the pages those URLs redirected to. The third technique traversed the deep web, going through .onion web sites and domains reached by randomly generating an IP address. The statistics from the three techniques are gathered and compared in this paper.

This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see or contact Special Collections at speccoll(at)umbc.edu

Deep Web
HTML
web crawling

Characteristics of HTML in the Deep Web

Text
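
The abstract describes gathering HTML tag frequencies over sets of crawled pages. A minimal sketch of that counting step is shown here, assuming the pages have already been fetched as strings. It uses only Python's standard library; the names `TagCounter`, `tag_frequencies`, and `aggregate` are hypothetical and are not taken from the paper, whose actual tooling is not described in this record.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts opening tags in one HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase, so <P> and <p> merge.
        # Self-closing tags such as <br/> also reach this method, because the
        # default handle_startendtag calls handle_starttag.
        self.counts[tag] += 1


def tag_frequencies(html_text):
    """Return a Counter of tag name -> occurrences for a single page."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


def aggregate(pages):
    """Sum tag counts over an iterable of HTML strings (one crawl sample)."""
    total = Counter()
    for page in pages:
        total.update(tag_frequencies(page))
    return total
```

Running `aggregate` once per crawl sample would give one frequency table per technique, and those tables could then be compared side by side. How the paper normalizes or compares the counts is not stated here, so that step is left out.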