Hotelling's T-square distribution
In statistics, Hotelling's T-square statistic,[1] named for Harold Hotelling, is a generalization of Student's t statistic that is used in multivariate hypothesis testing.
Hotelling's T-square statistic is defined as
- <math>t^2=n({\mathbf x}-{\mathbf\mu})'{\mathbf W}^{-1}({\mathbf x}-{\mathbf\mu})</math>
where <math>n</math> is a number of points (see below), <math>{\mathbf x}</math> and <math>{\mathbf\mu}</math> are column vectors of <math>p</math> elements and <math>{\mathbf W}</math> is a <math>p\times p</math> matrix.
If <math>{\mathbf x}\sim N_p(\mu,{\mathbf V}/n)</math> is a random vector with a multivariate Gaussian distribution (as the mean of <math>n</math> independent <math>N_p(\mu,{\mathbf V})</math> points is) and <math>m{\mathbf W}\sim W_p({\mathbf V},m)</math> (independent of <math>{\mathbf x}</math>) has a Wishart distribution with the same non-singular variance matrix <math>\mathbf V</math> and with <math>m=n-1</math> degrees of freedom, then the distribution of <math>t^2</math> is <math>T^2(p,m)</math>, Hotelling's T-square distribution with parameters <math>p</math> and <math>m</math>. It can be shown that
- <math>\frac{m-p+1}{pm} T^2\sim F_{p,m-p+1}</math>
where <math>F</math> is the F-distribution.
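The rescaling to an F statistic is simple arithmetic, and a minimal sketch may make the degrees of freedom easier to read off. The function name `t2_to_f` is hypothetical and chosen only for illustration; it assumes the value passed in follows <math>T^2(p,m)</math>.

```python
def t2_to_f(t2, p, m):
    """Rescale a value from T^2(p, m) to the corresponding F statistic.

    Returns (f_value, dfn, dfd), where f_value follows F(dfn, dfd)
    whenever t2 follows T^2(p, m). Illustrative helper, not a library API.
    """
    dfn = p
    dfd = m - p + 1
    if dfd <= 0:
        raise ValueError("need m >= p so that the second F parameter is positive")
    f_value = dfd / (p * m) * t2
    return f_value, dfn, dfd
```

For <math>p=1</math> the scaling factor is 1, so the value is unchanged and is referred to <math>F_{1,m}</math>, the distribution of the square of a Student's t variable with <math>m</math> degrees of freedom.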
Now suppose that
- <math>{\mathbf x}_1,\dots,{\mathbf x}_n</math>
are p×1 column vectors whose entries are real numbers. Let
- <math>\overline{\mathbf x}=(\mathbf{x}_1+\cdots+\mathbf{x}_n)/n</math>
be their mean. Let the p×p positive-definite matrix
- <math>{\mathbf W}=\sum_{i=1}^n (\mathbf{x}_i-\overline{\mathbf x})(\mathbf{x}_i-\overline{\mathbf x})'/(n-1)</math>
be their sample covariance matrix. (The transpose of any matrix M is denoted above by M′.) Let μ be some known p×1 column vector (in applications, a hypothesized value of a population mean). Then Hotelling's T-square statistic is
- <math>t^2=n(\overline{\mathbf x}-{\mathbf\mu})'{\mathbf W}^{-1}(\overline{\mathbf x}-{\mathbf\mu}).</math>
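Note that <math>t^2</math> is <math>n</math> times the squared Mahalanobis distance between <math>\overline{\mathbf x}</math> and <math>{\mathbf\mu}</math>, computed with <math>{\mathbf W}</math> as the covariance matrix.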
In particular, it can be shown [2] that if <math>{\mathbf x}_1,\dots,{\mathbf x}_n\sim N_p(\mu,{\mathbf V})</math> are independent, and <math>\overline{\mathbf x}</math> and <math>{\mathbf W}</math> are as defined above, then <math>(n-1){\mathbf W}</math> has a Wishart distribution with n − 1 degrees of freedom,
- <math>(n-1)\mathbf{W} \sim W_p({\mathbf V},n-1),</math>
and is independent of <math>\overline{\mathbf x}</math>, and
- <math>\overline{\mathbf x}\sim N_p(\mu,{\mathbf V}/n).</math>
This implies that:
- <math>t^2 = n(\overline{\mathbf x}-{\mathbf\mu})'{\mathbf W}^{-1}(\overline{\mathbf x}-{\mathbf\mu}) \sim T^2(p, n-1).</math>
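As an illustration of the one-sample statistic, the following sketch computes <math>t^2</math> from an array of observations and rescales it to the F statistic using the relationship given earlier. It assumes NumPy is available and that the data are stored with one observation per row; the function name `hotelling_one_sample` is hypothetical.

```python
import numpy as np


def hotelling_one_sample(X, mu):
    """One-sample Hotelling T^2 for an (n, p) array X and hypothesized mean mu.

    Returns (t2, f_value, (dfn, dfd)). Illustrative sketch only.
    """
    X = np.asarray(X, dtype=float)
    mu = np.asarray(mu, dtype=float)
    n, p = X.shape
    if n <= p:
        raise ValueError("need n > p so that W can be inverted")

    xbar = X.mean(axis=0)
    # Sample covariance matrix W with divisor n - 1 (rows are observations).
    W = np.atleast_2d(np.cov(X, rowvar=False, ddof=1))

    diff = xbar - mu
    # Solve W z = diff rather than forming the inverse of W explicitly.
    t2 = float(n * (diff @ np.linalg.solve(W, diff)))

    m = n - 1
    f_value = (m - p + 1) / (p * m) * t2
    return t2, f_value, (p, m - p + 1)
```

A p-value for the hypothesis that the population mean equals μ can then be taken from the upper tail of the F-distribution with the returned parameters, for example with `scipy.stats.f.sf(f_value, dfn, dfd)` if SciPy is installed.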
Hotelling's two-sample T-square statistic
If <math>{\mathbf x}_1,\dots,{\mathbf x}_{n_x}\sim N_p(\mu,{\mathbf V})</math> and <math>{\mathbf y}_1,\dots,{\mathbf y}_{n_y}\sim N_p(\mu,{\mathbf V})</math> are two samples drawn independently from multivariate normal distributions with the same mean and covariance, and we define
- <math>\overline{\mathbf x}=\frac{1}{n_x}\sum_{i=1}^{n_x} \mathbf{x}_i \qquad \overline{\mathbf y}=\frac{1}{n_y}\sum_{i=1}^{n_y} \mathbf{y}_i</math>
as the sample means, and
- <math>{\mathbf W}= \frac{\sum_{i=1}^{n_x}(\mathbf{x}_i-\overline{\mathbf x})(\mathbf{x}_i-\overline{\mathbf x})'+\sum_{i=1}^{n_y}(\mathbf{y}_i-\overline{\mathbf y})(\mathbf{y}_i-\overline{\mathbf y})'}{n_x+n_y-2}</math>
as the unbiased pooled covariance matrix estimate, then Hotelling's two-sample T-square statistic is
- <math>t^2 = \frac{n_x n_y}{n_x+n_y}(\overline{\mathbf x}-\overline{\mathbf y})'{\mathbf W}^{-1}(\overline{\mathbf x}-\overline{\mathbf y})\sim T^2(p, n_x+n_y-2)</math>
and it can be related to the F-distribution by
- <math>\frac{n_x+n_y-p-1}{(n_x+n_y-2)p}t^2 \sim F(p,n_x+n_y-1-p).</math>[2]
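The two-sample statistic can be sketched in the same way. The code below assumes NumPy and one observation per row in each sample; the function name `hotelling_two_sample` is hypothetical and is not taken from any library.

```python
import numpy as np


def hotelling_two_sample(X, Y):
    """Two-sample Hotelling T^2 for arrays X (n_x, p) and Y (n_y, p).

    Returns (t2, f_value, (dfn, dfd)). Illustrative sketch only.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    nx, p = X.shape
    ny, p_y = Y.shape
    if p != p_y:
        raise ValueError("both samples must have the same number of columns")
    m = nx + ny - 2
    if m < p:
        raise ValueError("need n_x + n_y - 2 >= p so that W can be inverted")

    xbar = X.mean(axis=0)
    ybar = Y.mean(axis=0)
    dx = X - xbar
    dy = Y - ybar
    # Unbiased pooled covariance estimate: summed scatter matrices over n_x + n_y - 2.
    W = (dx.T @ dx + dy.T @ dy) / m

    diff = xbar - ybar
    scale = nx * ny / (nx + ny)
    t2 = float(scale * (diff @ np.linalg.solve(W, diff)))

    # F statistic with parameters (p, n_x + n_y - 1 - p).
    f_value = (m - p + 1) / (p * m) * t2
    return t2, f_value, (p, m - p + 1)
```

Large values of the returned F statistic are evidence against the hypothesis that the two populations share the same mean.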
See also
- Student's t-distribution (the univariate equivalent)
- F-distribution (commonly tabulated or available in software libraries, and hence used for testing the T-square statistic using the relationship given above)
- Wilks' lambda distribution (in multivariate statistics Wilks' <math>\Lambda</math> is to Hotelling's <math>T^2</math> as Snedecor's <math>F</math> is to Student's <math>t</math> in univariate statistics).
References
- [1] H. Hotelling (1931). The generalization of Student's ratio. Ann. Math. Statist., Vol. 2, pp. 360–378.
- [2] K.V. Mardia, J.T. Kent, and J.M. Bibby (1979). Multivariate Analysis. Academic Press.