Stochastic Complexity, Histograms and Hypothesis Testing of Homogeneity
Keywords:
Histogram density estimation, Minimum description length, Model selection, Quantization, Stochastic complexity, Test of homogeneity.
Abstract
Information contained in a sample of quantitative data may be summarized or described by a nonparametric histogram density function. An interesting question is how to construct such a histogram density so that it expresses the information in the data with minimum stochastic complexity. Stochastic complexity is another name for Rissanen's minimum description length (MDL), which gives the length of the decodable binary code sequence that results from optimally encoding the data using a code-book based on a probability distribution. Here we derive an optimal generalized histogram density estimator that provides both predictive and non-predictive coding descriptions of a data sample. We also obtain uniform and almost sure asymptotic approximations for the lengths of both descriptions. As an application of this result to statistical inference, a new procedure for testing the homogeneity of distributions is proposed and proved to have asymptotic power 1.
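To make the idea of choosing a histogram by description length concrete, the following is a minimal sketch and not the generalized estimator derived in the article. It assumes equal-width bins and a common two-part approximation of the code length: the negative log-likelihood of the histogram density plus a penalty of (k - 1)/2 · log2(n) bits for the k - 1 free bin probabilities. The function names `histogram_code_length` and `select_bins` are hypothetical and chosen only for illustration.

```python
import math


def histogram_code_length(data, k):
    """Approximate two-part code length, in bits, of an equal-width k-bin histogram.

    The value is relative: the constant contributed by the recording precision
    of the data is the same for every k and is omitted, so the result can be
    negative. Only differences between values of k are meaningful.
    """
    n = len(data)
    lo, hi = min(data), max(data)
    if hi <= lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / k

    # Count observations per bin; the maximum is placed in the last bin.
    counts = [0] * k
    for x in data:
        j = min(int((x - lo) / width), k - 1)
        counts[j] += 1

    # Data cost: negative log2-likelihood under the density n_j / (n * width).
    data_bits = -sum(c * math.log2(c / (n * width)) for c in counts if c > 0)
    # Model cost: assumed penalty for k - 1 free bin probabilities.
    model_bits = 0.5 * (k - 1) * math.log2(n)
    return data_bits + model_bits


def select_bins(data, max_bins=50):
    """Return the bin count in 1..max_bins with the smallest approximate code length."""
    return min(range(1, max_bins + 1), key=lambda k: histogram_code_length(data, k))


if __name__ == "__main__":
    # Two well-separated clusters: a single bin describes this sample poorly.
    sample = [i / 1000 for i in range(100)] + [0.9 + i / 1000 for i in range(101)]
    print(select_bins(sample))
```

The penalty term here is an assumption borrowed from the standard asymptotic form of two-part codes; the article's own approximations for predictive and non-predictive descriptions may differ in their constants and in how bin boundaries are chosen.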
License
Upon acceptance of an article by the journal, the author(s) accept(s) the transfer of copyright of the article to European Journal of Pure and Applied Mathematics.
European Journal of Pure and Applied Mathematics will be the copyright holder.