Skip to contents

Test for the difference of two cluster means after hierarchical clustering, for matrix normal model with arbitrary scale matrices. Supported linkages (as in Gao et al. 2022) are "single", "average", "centroid", "ward.D", "median", "mcquitty" and "complete".

Usage

test.clusters.hc(
  X,
  U = NULL,
  Sigma = NULL,
  Y = NULL,
  UY = NULL,
  precUY = NULL,
  NC,
  clusters,
  linkage = "average",
  ndraws = 2000
)

Arguments

X

A \(n \times p\) matrix drawn from a \(n \times p\) matrix normal distribution \(\mathcal{MN}(\)M, U, Sigma\()\). X must have \(n\) rows and \(p\) columns.

U

A \(n \times n\) positive-definite matrix describing the dependence structure between the rows in X. If NULL, observations are considered independent and U is set to the \(n \times n\) identity matrix.

Sigma

A \(p \times p\) positive-definite matrix describing the dependence structure between the columns in X. If NULL, Sigma is over-estimated (in the sense of the Loewner partial order).

Y

If Sigma is NULL, an i.i.d. copy of X allowing its estimation. Y must have the same number of columns as X.

UY

If Sigma is NULL, a positive-definite matrix describing the dependence structure between the rows in Y. If NULL and its inverse is not provided, set to the identity matrix by default.

precUY

The inverse matrix of UY, that can be provided to increase computational efficiency. If UY is not NULL and precUY is NULL, precUY is obtained by inverting UY.

NC

The number of clusters to choose.

clusters

A vector of two integers from \(1\) to NC indicating the pair of clusters whose means have to be compared.

linkage

The type of linkage for hierarchical clustering. Must be either single, average, centroid, ward.D, median, mcquitty or complete.

ndraws

If linkage is complete, the number of Monte Carlo iterations.

Value

  • pvalue - The p-value for the difference of cluster means.

  • stat - The test statistic.

  • stdrr - If linkage is complete, the Monte Carlo standard error.

  • hcl - The partition of the n observations retrieved by the clustering algorithm.

References

[1] L. L. Gao, J. Bien, and D. Witten. Selective inference for hierarchical clustering. Journal of the American Statistical Association, 0(0):1–11, 2022.

Examples


n <- 50
p <- 20
M <- Matrix::Matrix(0, nrow = n , ncol = p) # Mean matrix
Sigma <- stats::toeplitz(seq(1, 0.1, length = p)) # Sigma: dependence between features
U <- matrixNormal::I(n) # U: dependence between observations
X <- matrixNormal::rmatnorm(s = 1, M, U, Sigma)
Y <- matrixNormal::rmatnorm(s = 1, M, U, Sigma) # i.i.d. copy of X

# HAC with average linkage under the global null hypothesis
test.clusters.hc(X, U, Sigma, NC = 3, clusters = sample(1:3, 2), linkage = "average")
#> $pvalue
#> [1] 0.918364
#> 
#> $stat
#> [1] 5.544
#> 
#> $hcl
#>  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
#>  1  1  2  1  1  3  2  1  2  2  1  2  1  1  2  1  1  1  1  1  1  3  1  1  1  1 
#> 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 
#>  2  1  2  2  1  1  1  1  2  1  2  1  1  1  2  1  1  3  2  1  1  1  1  1 
#> 
#> $Sigma
#> 20 x 20 Matrix of class "dsyMatrix"
#>            [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
#>  [1,] 1.0000000 0.9526316 0.9052632 0.8578947 0.8105263 0.7631579 0.7157895
#>  [2,] 0.9526316 1.0000000 0.9526316 0.9052632 0.8578947 0.8105263 0.7631579
#>  [3,] 0.9052632 0.9526316 1.0000000 0.9526316 0.9052632 0.8578947 0.8105263
#>  [4,] 0.8578947 0.9052632 0.9526316 1.0000000 0.9526316 0.9052632 0.8578947
#>  [5,] 0.8105263 0.8578947 0.9052632 0.9526316 1.0000000 0.9526316 0.9052632
#>  [6,] 0.7631579 0.8105263 0.8578947 0.9052632 0.9526316 1.0000000 0.9526316
#>  [7,] 0.7157895 0.7631579 0.8105263 0.8578947 0.9052632 0.9526316 1.0000000
#>  [8,] 0.6684211 0.7157895 0.7631579 0.8105263 0.8578947 0.9052632 0.9526316
#>  [9,] 0.6210526 0.6684211 0.7157895 0.7631579 0.8105263 0.8578947 0.9052632
#> [10,] 0.5736842 0.6210526 0.6684211 0.7157895 0.7631579 0.8105263 0.8578947
#> [11,] 0.5263158 0.5736842 0.6210526 0.6684211 0.7157895 0.7631579 0.8105263
#> [12,] 0.4789474 0.5263158 0.5736842 0.6210526 0.6684211 0.7157895 0.7631579
#> [13,] 0.4315789 0.4789474 0.5263158 0.5736842 0.6210526 0.6684211 0.7157895
#> [14,] 0.3842105 0.4315789 0.4789474 0.5263158 0.5736842 0.6210526 0.6684211
#> [15,] 0.3368421 0.3842105 0.4315789 0.4789474 0.5263158 0.5736842 0.6210526
#> [16,] 0.2894737 0.3368421 0.3842105 0.4315789 0.4789474 0.5263158 0.5736842
#> [17,] 0.2421053 0.2894737 0.3368421 0.3842105 0.4315789 0.4789474 0.5263158
#> [18,] 0.1947368 0.2421053 0.2894737 0.3368421 0.3842105 0.4315789 0.4789474
#> [19,] 0.1473684 0.1947368 0.2421053 0.2894737 0.3368421 0.3842105 0.4315789
#> [20,] 0.1000000 0.1473684 0.1947368 0.2421053 0.2894737 0.3368421 0.3842105
#>            [,8]      [,9]     [,10]     [,11]     [,12]     [,13]     [,14]
#>  [1,] 0.6684211 0.6210526 0.5736842 0.5263158 0.4789474 0.4315789 0.3842105
#>  [2,] 0.7157895 0.6684211 0.6210526 0.5736842 0.5263158 0.4789474 0.4315789
#>  [3,] 0.7631579 0.7157895 0.6684211 0.6210526 0.5736842 0.5263158 0.4789474
#>  [4,] 0.8105263 0.7631579 0.7157895 0.6684211 0.6210526 0.5736842 0.5263158
#>  [5,] 0.8578947 0.8105263 0.7631579 0.7157895 0.6684211 0.6210526 0.5736842
#>  [6,] 0.9052632 0.8578947 0.8105263 0.7631579 0.7157895 0.6684211 0.6210526
#>  [7,] 0.9526316 0.9052632 0.8578947 0.8105263 0.7631579 0.7157895 0.6684211
#>  [8,] 1.0000000 0.9526316 0.9052632 0.8578947 0.8105263 0.7631579 0.7157895
#>  [9,] 0.9526316 1.0000000 0.9526316 0.9052632 0.8578947 0.8105263 0.7631579
#> [10,] 0.9052632 0.9526316 1.0000000 0.9526316 0.9052632 0.8578947 0.8105263
#> [11,] 0.8578947 0.9052632 0.9526316 1.0000000 0.9526316 0.9052632 0.8578947
#> [12,] 0.8105263 0.8578947 0.9052632 0.9526316 1.0000000 0.9526316 0.9052632
#> [13,] 0.7631579 0.8105263 0.8578947 0.9052632 0.9526316 1.0000000 0.9526316
#> [14,] 0.7157895 0.7631579 0.8105263 0.8578947 0.9052632 0.9526316 1.0000000
#> [15,] 0.6684211 0.7157895 0.7631579 0.8105263 0.8578947 0.9052632 0.9526316
#> [16,] 0.6210526 0.6684211 0.7157895 0.7631579 0.8105263 0.8578947 0.9052632
#> [17,] 0.5736842 0.6210526 0.6684211 0.7157895 0.7631579 0.8105263 0.8578947
#> [18,] 0.5263158 0.5736842 0.6210526 0.6684211 0.7157895 0.7631579 0.8105263
#> [19,] 0.4789474 0.5263158 0.5736842 0.6210526 0.6684211 0.7157895 0.7631579
#> [20,] 0.4315789 0.4789474 0.5263158 0.5736842 0.6210526 0.6684211 0.7157895
#>           [,15]     [,16]     [,17]     [,18]     [,19]     [,20]
#>  [1,] 0.3368421 0.2894737 0.2421053 0.1947368 0.1473684 0.1000000
#>  [2,] 0.3842105 0.3368421 0.2894737 0.2421053 0.1947368 0.1473684
#>  [3,] 0.4315789 0.3842105 0.3368421 0.2894737 0.2421053 0.1947368
#>  [4,] 0.4789474 0.4315789 0.3842105 0.3368421 0.2894737 0.2421053
#>  [5,] 0.5263158 0.4789474 0.4315789 0.3842105 0.3368421 0.2894737
#>  [6,] 0.5736842 0.5263158 0.4789474 0.4315789 0.3842105 0.3368421
#>  [7,] 0.6210526 0.5736842 0.5263158 0.4789474 0.4315789 0.3842105
#>  [8,] 0.6684211 0.6210526 0.5736842 0.5263158 0.4789474 0.4315789
#>  [9,] 0.7157895 0.6684211 0.6210526 0.5736842 0.5263158 0.4789474
#> [10,] 0.7631579 0.7157895 0.6684211 0.6210526 0.5736842 0.5263158
#> [11,] 0.8105263 0.7631579 0.7157895 0.6684211 0.6210526 0.5736842
#> [12,] 0.8578947 0.8105263 0.7631579 0.7157895 0.6684211 0.6210526
#> [13,] 0.9052632 0.8578947 0.8105263 0.7631579 0.7157895 0.6684211
#> [14,] 0.9526316 0.9052632 0.8578947 0.8105263 0.7631579 0.7157895
#> [15,] 1.0000000 0.9526316 0.9052632 0.8578947 0.8105263 0.7631579
#> [16,] 0.9526316 1.0000000 0.9526316 0.9052632 0.8578947 0.8105263
#> [17,] 0.9052632 0.9526316 1.0000000 0.9526316 0.9052632 0.8578947
#> [18,] 0.8578947 0.9052632 0.9526316 1.0000000 0.9526316 0.9052632
#> [19,] 0.8105263 0.8578947 0.9052632 0.9526316 1.0000000 0.9526316
#> [20,] 0.7631579 0.8105263 0.8578947 0.9052632 0.9526316 1.0000000
#> 
# HAC with complete linkage under the global null hypothesis and over-estimation of Sigma
test.clusters.hc(X, U, Sigma = NULL, Y = Y, NC = 3, clusters = sample(1:3, 2), linkage = "complete")
#> Sigma not provided: plugging an over-estimate.
#> Clustering with complete linkage. Monte-Carlo approximation of the p-value.
#> $pvalue
#> [1] 0.6634771
#> 
#> $stat
#> [1] 7.640268
#> 
#> $stderr
#> [1] 0.03215498
#> 
#> $hcl
#>  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
#>  1  1  1  2  2  2  1  1  1  1  2  1  2  1  1  1  2  2  3  3  3  2  3  3  1  2 
#> 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 
#>  1  2  1  1  2  2  2  2  1  1  1  3  2  2  1  3  1  2  1  1  3  3  2  3 
#> 
#> $Sigma
#> 20 x 20 Matrix of class "dgeMatrix"
#>            [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
#>  [1,] 0.9934213 0.9412765 0.8697526 0.8552993 0.8259835 0.7838249 0.7979018
#>  [2,] 0.9412765 0.9405129 0.8709366 0.8601355 0.8286571 0.8159359 0.8141823
#>  [3,] 0.8697526 0.8709366 0.8876258 0.8707565 0.8383922 0.8189422 0.8356849
#>  [4,] 0.8552993 0.8601355 0.8707565 0.9514981 0.9301368 0.9123677 0.9281812
#>  [5,] 0.8259835 0.8286571 0.8383922 0.9301368 1.0048344 0.9954221 1.0226804
#>  [6,] 0.7838249 0.8159359 0.8189422 0.9123677 0.9954221 1.0882776 1.0926182
#>  [7,] 0.7979018 0.8141823 0.8356849 0.9281812 1.0226804 1.0926182 1.2068685
#>  [8,] 0.7226062 0.7428362 0.7523365 0.8115031 0.9035562 0.9650521 1.1178485
#>  [9,] 0.6969976 0.7064104 0.7025652 0.7691753 0.8822040 0.9250813 1.1117954
#> [10,] 0.6779214 0.6807706 0.6552074 0.7095845 0.8363435 0.8687770 1.0588346
#> [11,] 0.6357109 0.6407483 0.6308166 0.7110466 0.8350992 0.8592065 1.0486240
#> [12,] 0.5266937 0.5343857 0.5317296 0.6146070 0.7332503 0.7670403 0.9673564
#> [13,] 0.4039172 0.4209932 0.4105402 0.5143113 0.6457768 0.6911051 0.8858570
#> [14,] 0.3481853 0.3660747 0.3537463 0.4541326 0.5836342 0.6236176 0.8059830
#> [15,] 0.3303652 0.3523184 0.3487745 0.4525221 0.5862648 0.6041937 0.7800542
#> [16,] 0.2593149 0.3065588 0.2930740 0.3918182 0.5262199 0.5729183 0.7135145
#> [17,] 0.2467653 0.2857182 0.2749515 0.3772725 0.5023951 0.5265035 0.6668765
#> [18,] 0.2021692 0.2408124 0.2341818 0.3431688 0.4783744 0.4971726 0.6315234
#> [19,] 0.2229824 0.2567316 0.2441596 0.3386911 0.4712419 0.4784668 0.6019922
#> [20,] 0.1637809 0.2086601 0.1888656 0.2873559 0.4100864 0.4011158 0.5379165
#>            [,8]      [,9]     [,10]     [,11]     [,12]     [,13]     [,14]
#>  [1,] 0.7226062 0.6969976 0.6779214 0.6357109 0.5266937 0.4039172 0.3481853
#>  [2,] 0.7428362 0.7064104 0.6807706 0.6407483 0.5343857 0.4209932 0.3660747
#>  [3,] 0.7523365 0.7025652 0.6552074 0.6308166 0.5317296 0.4105402 0.3537463
#>  [4,] 0.8115031 0.7691753 0.7095845 0.7110466 0.6146070 0.5143113 0.4541326
#>  [5,] 0.9035562 0.8822040 0.8363435 0.8350992 0.7332503 0.6457768 0.5836342
#>  [6,] 0.9650521 0.9250813 0.8687770 0.8592065 0.7670403 0.6911051 0.6236176
#>  [7,] 1.1178485 1.1117954 1.0588346 1.0486240 0.9673564 0.8858570 0.8059830
#>  [8,] 1.1341846 1.1450159 1.0980316 1.0761843 0.9897778 0.9128556 0.8248934
#>  [9,] 1.1450159 1.2432823 1.2187003 1.2132068 1.1244924 1.0496562 0.9833677
#> [10,] 1.0980316 1.2187003 1.2837815 1.2736865 1.1831671 1.1195376 1.0438118
#> [11,] 1.0761843 1.2132068 1.2736865 1.3231966 1.2361497 1.1740809 1.1000951
#> [12,] 0.9897778 1.1244924 1.1831671 1.2361497 1.2513605 1.1808794 1.1161705
#> [13,] 0.9128556 1.0496562 1.1195376 1.1740809 1.1808794 1.2101008 1.1495807
#> [14,] 0.8248934 0.9833677 1.0438118 1.1000951 1.1161705 1.1495807 1.1660719
#> [15,] 0.8193285 0.9648609 1.0333755 1.1096130 1.0971881 1.1401078 1.1200134
#> [16,] 0.7415589 0.8774592 0.9455226 1.0249184 1.0009184 1.0587402 1.0486902
#> [17,] 0.6937170 0.8272373 0.9005405 0.9806566 0.9629096 1.0022151 0.9872356
#> [18,] 0.6483935 0.7872713 0.8728003 0.9679740 0.9666513 0.9937312 0.9568148
#> [19,] 0.6327566 0.7556018 0.8300742 0.8954888 0.8806907 0.9093673 0.8760557
#> [20,] 0.5914168 0.7090877 0.7873343 0.8439621 0.8306993 0.8760705 0.8460808
#>           [,15]     [,16]     [,17]     [,18]     [,19]     [,20]
#>  [1,] 0.3303652 0.2593149 0.2467653 0.2021692 0.2229824 0.1637809
#>  [2,] 0.3523184 0.3065588 0.2857182 0.2408124 0.2567316 0.2086601
#>  [3,] 0.3487745 0.2930740 0.2749515 0.2341818 0.2441596 0.1888656
#>  [4,] 0.4525221 0.3918182 0.3772725 0.3431688 0.3386911 0.2873559
#>  [5,] 0.5862648 0.5262199 0.5023951 0.4783744 0.4712419 0.4100864
#>  [6,] 0.6041937 0.5729183 0.5265035 0.4971726 0.4784668 0.4011158
#>  [7,] 0.7800542 0.7135145 0.6668765 0.6315234 0.6019922 0.5379165
#>  [8,] 0.8193285 0.7415589 0.6937170 0.6483935 0.6327566 0.5914168
#>  [9,] 0.9648609 0.8774592 0.8272373 0.7872713 0.7556018 0.7090877
#> [10,] 1.0333755 0.9455226 0.9005405 0.8728003 0.8300742 0.7873343
#> [11,] 1.1096130 1.0249184 0.9806566 0.9679740 0.8954888 0.8439621
#> [12,] 1.0971881 1.0009184 0.9629096 0.9666513 0.8806907 0.8306993
#> [13,] 1.1401078 1.0587402 1.0022151 0.9937312 0.9093673 0.8760705
#> [14,] 1.1200134 1.0486902 0.9872356 0.9568148 0.8760557 0.8460808
#> [15,] 1.1812872 1.1030765 1.0476923 1.0332970 0.9469639 0.9334082
#> [16,] 1.1030765 1.1567597 1.0729985 1.0525420 0.9778756 0.9440812
#> [17,] 1.0476923 1.0729985 1.0591590 1.0313244 0.9407344 0.9081039
#> [18,] 1.0332970 1.0525420 1.0313244 1.0835081 0.9937413 0.9583137
#> [19,] 0.9469639 0.9778756 0.9407344 0.9937413 0.9897893 0.9646212
#> [20,] 0.9334082 0.9440812 0.9081039 0.9583137 0.9646212 1.0270134
#>