Statistical Techniques for Language Recognition: An Introduction and Guide for Cryptanalysts

Ganesan, Ravi; Sherman, Alan T.

dc.contributor.author	Ganesan, Ravi
dc.contributor.author	Sherman, Alan T.
dc.date.accessioned	2019-02-21T16:04:23Z
dc.date.available	2019-02-21T16:04:23Z
dc.date.issued	2010-06-04
dc.description.abstract	We explain how to apply statistical techniques to solve several language-recognition problems that arise in cryptanalysis and other domains. Language recognition is important in cryptanalysis because, among other applications, an exhaustive key search of any cryptosystem from ciphertext alone requires a test that recognizes valid plaintext. Written for cryptanalysts, this guide should also be helpful to others as an introduction to statistical inference on Markov chains. Modeling language as a finite stationary Markov process, we adapt a statistical model of pattern recognition to language recognition. Within this framework we consider four well-defined language-recognition problems: 1) recognizing a known language, 2) distinguishing a known language from uniform noise, 3) distinguishing unknown 0th-order noise from unknown 1st-order language, and 4) detecting non-uniform unknown language. For the second problem we give a most powerful test based on the Neyman-Pearson Lemma. For the other problems, which typically have no uniformly most powerful tests, we give likelihood ratio tests. We also discuss the chi-squared test statistic X² and the Index of Coincidence IC. In addition, we point out useful works in the statistics and pattern-matching literature for further reading about these fundamental problems and test statistics.	en
dc.description.uri	https://www.tandfonline.com/doi/pdf/10.1080/0161-119391867980	en
dc.format.extent	36 pages	en
dc.genre	journal articles postprints	en
dc.identifier	doi:10.13016/m2qc3t-ly3f
dc.identifier.citation	Ravi Ganesan & Alan T. Sherman (1993) STATISTICAL TECHNIQUES FOR LANGUAGE RECOGNITION: AN INTRODUCTION AND GUIDE FOR CRYPTANALYSTS, CRYPTOLOGIA, 17:4, 321-366, DOI: 10.1080/0161-119391867980	en
dc.identifier.uri	https://doi.org/10.1080/0161-119391867980
dc.identifier.uri	http://hdl.handle.net/11603/12835
dc.language.iso	en	en
dc.publisher	Taylor & Francis Online
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Center for Research and Exploration in Space Sciences & Technology II (CRSST II)
dc.relation.ispartof	UMBC Faculty Collection
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department
dc.rights	This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights	“This is an Accepted Manuscript of an article published by Taylor & Francis in Cryptologia on 04 Jun 2010, available online: http://www.tandfonline.com/10.1080/0161-119391867980.”
dc.subject	automatic plaintext recognition	en
dc.subject	categorical data	en
dc.subject	chi-squared test statistic	en
dc.subject	computational linguistics	en
dc.subject	contingency tables	en
dc.subject	cryptanalysts	en
dc.subject	cryptography	en
dc.subject	document processing	en
dc.subject	hypothesis testing	en
dc.subject	index of coincidence	en
dc.subject	language recognition	en
dc.subject	likelihood ratio tests	en
dc.subject	markov models of language	en
dc.subject	maximum likelihood estimators	en
dc.subject	natural language processing	en
dc.subject	statistical inference	en
dc.subject	statistical pattern recognition	en
dc.subject	statistics of language	en
dc.subject	weight of evidence	en
dc.title	Statistical Techniques for Language Recognition: An Introduction and Guide for Cryptanalysts	en
dc.type	Text	en

Files

Original bundle

Name: ShermanCryptologia93.pdf
Size: 2.03 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed to upon submission

Collections

UMBC Center for Information Security and Assurance (CISA)
UMBC Computer Science and Electrical Engineering Department
UMBC Faculty Collection