Stochastic Complexity, Histograms and Hypothesis Testing of Homogeneity


  • Guoqi Qian, The University of Melbourne


Keywords: Histogram density estimation, Minimum description length, Model selection, Quantization, Stochastic complexity, Test of homogeneity.


The information contained in a sample of quantitative data may be summarized or described by a nonparametric histogram density function. An interesting question is how to construct such a histogram density so that it expresses the data information with minimum stochastic complexity. Stochastic complexity is another name for Rissanen's minimum description length (MDL), which gives the length of the decipherable binary code sequence that results from optimally encoding the data information with a code book based on a probability distribution. Here we derive an optimal generalized histogram density estimator that provides both predictive and non-predictive coding descriptions of a data sample. We also obtain uniform and almost sure asymptotic approximations for the lengths of both descriptions. As an application of this result to statistical inference, a new procedure for hypothesis testing of distribution homogeneity is proposed and is proved to have an asymptotic power of 1.
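
To make the idea of choosing a histogram by description length concrete, the following is a minimal sketch and not the generalized estimator derived in the paper. It assumes equal-width bins and a common two-part approximation to the code length: the negative log-likelihood of the fitted histogram plus a penalty of (m - 1)/2 · log n for the m - 1 free bin probabilities. The function names `histogram_code_length` and `select_bins` are hypothetical and introduced only for illustration.

```python
import math


def histogram_code_length(data, m):
    """Approximate two-part code length (in nats) of `data` under an
    equal-width histogram with m bins. Assumption: this is the usual
    'fit + (m - 1)/2 * log n' approximation, not the paper's exact result."""
    n = len(data)
    lo, hi = min(data), max(data)
    if hi <= lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / m

    # Count observations per bin; the maximum falls into the last bin.
    counts = [0] * m
    for x in data:
        k = min(int((x - lo) / width), m - 1)
        counts[k] += 1

    # Negative log-likelihood under the density c_k / (n * width) on bin k.
    nll = -sum(c * math.log(c / (n * width)) for c in counts if c > 0)

    # Cost of describing the m - 1 free bin probabilities.
    penalty = 0.5 * (m - 1) * math.log(n)
    return nll + penalty


def select_bins(data, max_bins=50):
    """Return the bin count in 1..max_bins with the shortest code length."""
    return min(range(1, max_bins + 1),
               key=lambda m: histogram_code_length(data, m))


if __name__ == "__main__":
    evenly_spread = [i / 1000 for i in range(1000)]
    two_clusters = ([i * 0.001 for i in range(500)]
                    + [10 + i * 0.001 for i in range(500)])
    print(select_bins(evenly_spread))
    print(select_bins(two_clusters))
```

On evenly spread data the penalty dominates and a single bin is preferred, whereas clearly clustered data justify paying for additional bins. The same comparison of code lengths is the intuition behind an MDL-style homogeneity test: two samples are judged homogeneous when describing them jointly is no longer than describing them separately.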






Mathematical Statistics

Stochastic Complexity, Histograms and Hypothesis Testing of Homogeneity. (2009). European Journal of Pure and Applied Mathematics, 3(1), 51-80.

