Statistics and Plotting in R: How Do You Run a t-Test?

vlambda
2020-02-26


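Preface
Normality tests

Q-Q plot method
Shapiro-Wilk W test

Homogeneity of variance
t-tests

One-sample t-test
Paired-samples t-test
Independent-samples t-test

Main text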

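Preface

For hypothesis tests on quantitative (measurement) data, the t-test is the simplest and most commonly used method.

The common variants are the one-sample t-test, the paired-samples t-test, and the independent-samples t-test.

The independent-samples t-test generally requires that the data be normally distributed with equal variances. Before running it, test for homogeneity of variance first: if the variances are equal, use the ordinary t-test; if they are not, use the approximate t-test (Welch's test, which is what R runs when var.equal=FALSE).

The paired-samples t-test instead requires that the population of within-pair differences be normally distributed.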
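As a minimal sketch of checking that last assumption, the within-pair differences (not the two original samples) can be passed to shapiro.test(). The before and after vectors below are made-up illustrative values, not data from this post.

```r
# Hypothetical paired measurements (illustrative values only)
before <- c(12.1, 11.4, 13.0, 12.7, 10.9, 11.8, 12.3, 13.4, 11.1, 12.0)
after  <- c(11.6, 11.5, 12.1, 12.0, 10.2, 11.9, 11.4, 12.8, 10.7, 11.2)

# The paired t-test's normality assumption concerns the differences,
# not the two original samples
d <- after - before

res <- shapiro.test(d)
res$p.value  # a value above 0.05 gives no evidence against normality
```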

Normality tests

Two methods are covered here: the Q-Q plot and the Shapiro-Wilk W test.

Q-Q plot method

qqnorm(y, ylim,
       main = "Normal Q-Q Plot",
       xlab = "Theoretical Quantiles",
       ylab = "Sample Quantiles",
       plot.it = TRUE, datax = FALSE, ...)

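Use qqnorm(x); qqline(x) to check visually whether the data follow a normal distribution.

options(digits = 3)  # number of significant digits to print
x <- rnorm(20, mean=4, sd=4); x   # draw a sample from a normal distribution
qqnorm(x); qqline(x)

If the points fall roughly on a straight line, the data can be regarded as normally distributed.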
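"Roughly on a straight line" can also be put into a number. The sketch below is one possible way to do that, not part of the original workflow: it asks qqnorm() for the plot coordinates without drawing and computes their correlation. The seed is arbitrary.

```r
set.seed(1)                       # fixed seed, chosen arbitrarily
x <- rnorm(20, mean = 4, sd = 4)

# plot.it = FALSE returns the Q-Q coordinates without drawing anything
qq <- qqnorm(x, plot.it = FALSE)

# Correlation between theoretical (qq$x) and sample (qq$y) quantiles;
# values close to 1 mean the points lie near a straight line
r <- cor(qq$x, qq$y)
r
```

This is only a rough numeric companion to the plot; the plot itself still shows where any departure from the line occurs.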

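Shapiro-Wilk W test

H0: the data follow a normal distribution;
H1: the data do not follow a normal distribution.
x <- rnorm(20, mean=4, sd=4); x  # draw a sample from a normal distribution
shapiro.test(x)  # x is a numeric vector of data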

Output:

Shapiro-Wilk normality test
data:  x
W = 1, p-value = 0.8

Since p-value = 0.8 > 0.05, H0 is not rejected and the data can be treated as normally distributed.
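For contrast, here is a small sketch of what rejection looks like, using strongly right-skewed data. The exponential sample, its size, and the seed are illustrative choices, not from the original post.

```r
set.seed(42)                   # arbitrary seed for reproducibility
skewed <- rexp(200, rate = 1)  # exponential data: strongly right-skewed

res <- shapiro.test(skewed)
res$statistic  # W noticeably below 1 signals departure from normality
res$p.value    # a small p-value (< 0.05) leads to rejecting H0
```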

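Homogeneity of variance

Levene's test is the usual choice for testing homogeneity of variance.

It is provided by the leveneTest() function in the car package.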

install.packages("car")
library(car)

leveneTest() takes data in long format: one column holds the observed values y, and another holds the group labels group.

Usage:

leveneTest(y, group, center=median, ...)  # center can be mean or median (the default).

Example:

options(digits = 3)
x <- rnorm(20, mean=4, sd=4); x  # generate data
y <- rnorm(20, mean=5, sd=4); y
d <- data.frame(x,y)  # build a data frame
library(reshape2)
d1 <- melt(d, measure.vars = c("x","y"),  # reshape wide data to long
              variable.name = "group",
              value.name = "groupvalue");d1

leveneTest(groupvalue ~ group, center = mean, data = d1) # test homogeneity of variance

Output:

Levene's Test for Homogeneity of Variance (center = mean)
      Df F value Pr(>F)
group  1    0.27   0.61
      38

Since p-value = 0.61 > 0.05 (with F = 0.27), the variances can be regarded as equal.
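To see what the test computes, here is a base-R sketch: Levene's test is a one-way ANOVA on the absolute deviations of each value from its group center. levene_by_hand is a hypothetical helper name used for illustration, and the data are made up; this is not the car package's own code.

```r
# Hypothetical helper: Levene's test as a one-way ANOVA on
# absolute deviations from each group's center
levene_by_hand <- function(value, group, center = median) {
  group <- as.factor(group)
  # ave() applies `center` within each group and returns a vector
  # aligned with `value`
  dev <- abs(value - ave(value, group, FUN = center))
  fit <- anova(lm(dev ~ group))
  c(F = fit[["F value"]][1], p = fit[["Pr(>F)"]][1])
}

# Illustrative data: group "b" is far more spread out than group "a"
value <- c(1, 2, 3, 4, 5, 10, 20, 30, 40, 50)
group <- rep(c("a", "b"), each = 5)
res <- levene_by_hand(value, group)
res  # a small p-value suggests the variances differ
```

Passing center = mean gives the mean-centered variant used in the example above; the median default is the more robust choice.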

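t-tests

t.test(x, y = NULL,  # x and y are numeric vectors of data (supply only x for a one-sample test of a normal population mean)
       alternative=c("two.sided","less","greater"),  # alternative hypothesis: two-sided (default), one-sided (μ1 < μ2), or one-sided (μ1 > μ2)
       mu=0,  # hypothesized value μ0 under the null
       paired=FALSE,  # unpaired; TRUE for a paired test
       var.equal=FALSE,  # variances assumed unequal by default; TRUE assumes equal variances
       conf.level=0.95,...)  # confidence level

One-sample t-test

x <- rnorm(20, mean=5, sd=4); x

t.test(x, alternative="greater", mu=5) # equivalently, t.test(x - n, alternative="greater"), where n is the hypothesized mean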

Output:

One Sample t-test
data:  x
t = -2, df = 19, p-value = 1
alternative hypothesis: true mean is greater than 5
95 percent confidence interval:
 0.487   Inf
sample estimates:
mean of x
     2.38
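The statistic reported by t.test() can be reproduced by hand, which makes the output easier to read. The sketch below uses made-up values (not the random sample above) and computes t = (mean - μ0) / (s / sqrt(n)) together with the one-sided p-value.

```r
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1)  # illustrative values
mu0 <- 5                                          # hypothesized mean

n <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# H1: mean > mu0, so the p-value is the upper tail of t with n - 1 df
p_greater <- pt(t_stat, df = n - 1, lower.tail = FALSE)

res <- t.test(x, alternative = "greater", mu = mu0)
all.equal(unname(res$statistic), t_stat)  # same t statistic
all.equal(res$p.value, p_greater)         # same p-value
```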

Paired-samples t-test

x <- rnorm(20, mean=5, sd=4); x
y <- rnorm(20, mean=4, sd=4); y

t.test(x-y, alternative="two.sided")  # same test as t.test(x, y, paired=TRUE)

Output:

One Sample t-test
data:  x - y
t = 1, df = 19, p-value = 0.2
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -0.621  3.585
sample estimates:
mean of x
     1.48
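The output is labelled "One Sample t-test" because a paired test is a one-sample test on the differences. A short sketch with made-up paired values confirms that both call styles agree:

```r
# Illustrative paired values (same subjects measured twice)
x <- c(5.2, 6.1, 4.8, 7.0, 5.5, 6.3, 4.9, 5.8)
y <- c(4.9, 5.6, 5.0, 6.1, 5.1, 5.9, 4.2, 5.9)

by_diff <- t.test(x - y, alternative = "two.sided")
by_pair <- t.test(x, y, paired = TRUE, alternative = "two.sided")

# Both calls give the same statistic, degrees of freedom, and p-value
all.equal(unname(by_diff$statistic), unname(by_pair$statistic))
all.equal(unname(by_diff$parameter), unname(by_pair$parameter))
all.equal(by_diff$p.value, by_pair$p.value)
```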

Independent-samples t-test

# generate two groups of data
x <- rnorm(20, mean=5, sd=4); x
y <- rnorm(20, mean=4, sd=4); y

t.test(x,y, var.equal=TRUE, alternative="two.sided") # independent-samples t-test

Output:

Two Sample t-test
data:  x and y
t = 1, df = 38, p-value = 0.2
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.562  3.526
sample estimates:
mean of x mean of y
     4.21      2.73
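The var.equal argument decides which formula is used. As a sketch on made-up data, the pooled-variance (equal variances) statistic and the approximate (Welch) statistic with its Welch-Satterthwaite degrees of freedom can both be computed by hand and compared with t.test():

```r
# Illustrative data for two independent groups
x <- c(5.2, 6.1, 4.8, 7.0, 5.5, 6.3, 4.9, 5.8)
y <- c(3.9, 4.6, 5.0, 4.1, 3.5, 4.9, 4.2, 4.4, 3.8, 4.7)
n1 <- length(x); n2 <- length(y)

# Equal variances: pooled variance, df = n1 + n2 - 2
sp2 <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)
t_pooled <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Unequal variances: separate variances, Welch-Satterthwaite df
se2 <- var(x) / n1 + var(y) / n2
t_welch <- (mean(x) - mean(y)) / sqrt(se2)
df_welch <- se2^2 /
  ((var(x) / n1)^2 / (n1 - 1) + (var(y) / n2)^2 / (n2 - 1))

student <- t.test(x, y, var.equal = TRUE)
welch   <- t.test(x, y, var.equal = FALSE)
all.equal(unname(student$statistic), t_pooled)
all.equal(unname(welch$statistic), t_welch)
all.equal(unname(welch$parameter), df_welch)
```

The Welch degrees of freedom are generally not an integer, which is an easy way to tell the two variants apart in printed output.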

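If the data do not satisfy the assumptions of the t-test (for example, they are clearly not normal), use the Wilcoxon rank-sum test, also called the Mann-Whitney test. Unequal variances alone are already handled by the approximate t-test (var.equal=FALSE).

wilcox.test(y ~ x, data)  # y is a numeric variable, x is a two-level grouping variable, data is a matrix or data frame; or
wilcox.test(y1, y2) # y1 and y2 are the numeric vectors for each group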

wilcox.test(x, y = NULL,
            alternative = c("two.sided", "less", "greater"),
            mu = 0,
            paired = FALSE,
            exact = NULL,  # logical: whether to compute an exact p-value
            correct = TRUE,
            conf.int = FALSE, conf.level = 0.95, ...)
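A short sketch of what the test measures: the W statistic reported by wilcox.test() is the sum of the ranks of the first sample in the combined data, minus n1(n1 + 1)/2. The values below are made up and contain no ties, so the exact test is used.

```r
# Illustrative data with no ties
x <- c(1.1, 2.3, 3.8, 4.4, 6.0)
y <- c(2.9, 5.2, 7.1, 8.5, 9.3, 10.4)
n1 <- length(x)

# Rank all observations together, then sum the ranks belonging to x
r <- rank(c(x, y))
W_hand <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

res <- wilcox.test(x, y)
all.equal(unname(res$statistic), W_hand)  # matches the reported W
res$p.value
```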

End

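References

1. 《医学统计学》 (Medical Statistics), 4th edition

2. 《统计建模与R软件》 (Statistical Modeling and R Software)

3. 《R语言实战》 (R in Action), 2nd edition

These notes were compiled for personal study only and will be removed on request if they infringe any rights. Thanks!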
