Statistical Techniques for Language Recognition: An Introduction and Guide for Cryptanalysts

dc.contributor.author: Ganesan, Ravi
dc.contributor.author: Sherman, Alan T.
dc.date.accessioned: 2019-02-21T16:04:23Z
dc.date.available: 2019-02-21T16:04:23Z
dc.date.issued: 2010-06-04
dc.description.abstract: We explain how to apply statistical techniques to solve several language-recognition problems that arise in cryptanalysis and other domains. Language recognition is important in cryptanalysis because, among other applications, an exhaustive key search of any cryptosystem from ciphertext alone requires a test that recognizes valid plaintext. Written for cryptanalysts, this guide should also be helpful to others as an introduction to statistical inference on Markov chains. Modeling language as a finite stationary Markov process, we adapt a statistical model of pattern recognition to language recognition. Within this framework we consider four well-defined language-recognition problems: 1) recognizing a known language, 2) distinguishing a known language from uniform noise, 3) distinguishing unknown 0th-order noise from unknown 1st-order language, and 4) detecting non-uniform unknown language. For the second problem we give a most powerful test based on the Neyman-Pearson Lemma. For the other problems, which typically have no uniformly most powerful tests, we give likelihood ratio tests. We also discuss the chi-squared test statistic X² and the Index of Coincidence IC. In addition, we point out useful works in the statistics and pattern-matching literature for further reading about these fundamental problems and test statistics. [en_US]
dc.description.uri: https://www.tandfonline.com/doi/pdf/10.1080/0161-119391867980 [en_US]
dc.format.extent: 36 pages [en_US]
dc.genre: journal articles postprints [en_US]
dc.identifier: doi:10.13016/m2qc3t-ly3f
dc.identifier.citation: Ravi Ganesan & Alan T. Sherman (1993) STATISTICAL TECHNIQUES FOR LANGUAGE RECOGNITION: AN INTRODUCTION AND GUIDE FOR CRYPTANALYSTS, CRYPTOLOGIA, 17:4, 321-366, DOI: 10.1080/0161-119391867980 [en_US]
dc.identifier.uri: https://doi.org/10.1080/0161-119391867980
dc.identifier.uri: http://hdl.handle.net/11603/12835
dc.language.iso: en_US [en_US]
dc.publisher: Taylor & Francis Online
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Center for Research and Exploration in Space Sciences & Technology II (CRSST II)
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: “This is an Accepted Manuscript of an article published by Taylor & Francis in Cryptologia on 04 Jun 2010, available online: http://www.tandfonline.com/10.1080/0161-119391867980.”
dc.subject: automatic plaintext recognition [en_US]
dc.subject: categorical data [en_US]
dc.subject: chi-squared test statistic [en_US]
dc.subject: computational linguistics [en_US]
dc.subject: contingency tables [en_US]
dc.subject: cryptanalysts [en_US]
dc.subject: cryptography [en_US]
dc.subject: document processing [en_US]
dc.subject: hypothesis testing [en_US]
dc.subject: index of coincidence [en_US]
dc.subject: language recognition [en_US]
dc.subject: likelihood ratio tests [en_US]
dc.subject: markov models of language [en_US]
dc.subject: maximum likelihood estimators [en_US]
dc.subject: natural language processing [en_US]
dc.subject: statistical inference [en_US]
dc.subject: statistical pattern recognition [en_US]
dc.subject: statistics of language [en_US]
dc.subject: weight of evidence [en_US]
dc.title: Statistical Techniques for Language Recognition: An Introduction and Guide for Cryptanalysts [en_US]
dc.type: Text [en_US]

Files

Original bundle

Name: ShermanCryptologia93.pdf
Size: 2.03 MB
Format: Adobe Portable Document Format
