KOHL, Deborah
BLODGETT, Bridget
WOLF, Richard
BYRD, David Allan
2023-04-17
2023-04-17
2023-05-31
UB_2023_Byrd_D
http://hdl.handle.net/11603/27621
Without 100 percent accurate content, usability and information architecture mean little. Pristine content is the sine qua non of the user experience.
D.S. -- The University of Baltimore, 2023
Dissertation submitted to the Yale Gordon College of Arts and Sciences of The University of Baltimore in partial fulfillment of the requirements for the degree of Doctor of Science in Information and Interaction Design
Recruiting unpaid volunteers through a “crowdsourcing” technique has become a near-ubiquitous tactic of libraries, archives, and other institutions seeking to textually digitize their analog holdings. Which volunteer attributes correlate with higher performance in that task has been little studied, namely 1) certain demographic characteristics of the volunteers, 2) their familiarity with the topic, 3) their motivation, and 4) the process they use. Recovered Memories investigates nine such variables, both individually and in combination. Optical character recognition (OCR) technology is one automated method for converting text, but it has proven unsatisfactory for creating web content or e-books, mining data, creating data for artificial intelligence (AI) and machine learning (ML) software, and even some search functions. This paper theorized that some of the variables studied would correlate with higher performance. This research project examined the efficacy of a custom-built application to gather data (www.airforcehistory.net). One hundred and twelve historic documents from the Air Force Historical Research Agency’s archive were used to measure participants’ performance. Despite a relatively small sample size (n=50) and the lack of control endemic to field research, the participant variables of ‘familiarity with U.S. history in Vietnam (PV5),’ ‘process choice (PV8),’ ‘age (PV2),’ ‘group affiliation (PV7),’ and ‘familiarity with U.S. Air Force operations (PV6)’ were related. Multiple regression showed three factors correlated with better performance: gender, familiarity with U.S. history in Vietnam, and process selection. The author argues that process choice, specifically correcting OCR output rather than copying or transcribing, results in the best performance and that this finding might be generalizable. Given the interest at the federal level in textually digitizing the holdings of military archives, this study has strong implications for policy and practice.
220 leaves
application/pdf
en-US
Attribution-NonCommercial-NoDerivs 3.0 United States
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by The University of Baltimore for non-commercial research and educational purposes.
crowdsourcing
ocr
Archives
digital history
optical character recognition
analog to digital conversion
Recovered Memories: Bringing the Air Force Archive Into the Digital Age
Attributes Correlated with Improved Performance in Textually Digitizing Analog Documents
Text