阅读 192

生物节律之cosinor分析——终极R包circacompare(5)

前面我们已经介绍了cosinor差异分析最常用的两个包:cosinorcosinor2包,这两个包其实已经做得非常好了,就是还有一点点缺陷,下面一起来看看它们的缺陷:

1.cosinor包

示例来自官网 vignettes

library(cosinor)
head(vitamind)
#   X        Y      time
# 1 0 16.12091 11.439525
# 2 0 29.90624  5.807104
# 3 1 39.17572  1.045492
# 4 1 35.15403  4.082983
# 5 1 43.67065 10.606247
# 6 0 31.20360  5.126054
fit <- cosinor.lm(Y ~ time(time) + X + amp.acro(X), data = vitamind, period = 12)
summary(fit)
# Raw model coefficients:
#             estimate standard.error lower.CI upper.CI p.value
# (Intercept)  29.6898         0.4654  28.7776  30.6020  0.0000
# X             1.9019         0.8041   0.3258   3.4779  0.0180
# rrr           0.9308         0.6357  -0.3151   2.1767  0.1431
# sss           6.2010         0.6805   4.8673   7.5347  0.0000
# X:rrr         5.5795         1.1386   3.3479   7.8111  0.0000
# X:sss        -1.3825         1.1364  -3.6097   0.8447  0.2237
# 
# ***********************
# 
# Transformed coefficients:
#             estimate standard.error lower.CI upper.CI p.value
# (Intercept)  29.6898         0.4654  28.7776  30.6020   0.000
# [X = 1]       1.9019         0.8041   0.3258   3.4779   0.018
# amp           6.2705         0.6799   4.9378   7.6031   0.000
# [X = 1]:amp   8.0995         0.9095   6.3170   9.8820   0.000
# acr           1.4218         0.1015   1.2229   1.6207   0.000
# [X = 1]:acr   0.6372         0.1167   0.4084   0.8659   0.000
test_cosinor(fit, "X", param = "amp")#param 命名要测试参数的字符串,“amp”表示振幅,“acr”表示顶相(峰值相位)
# Global test: 
# Statistic: 
# [1] 2.59
# 
#
#  P-value: 
# [1] 0.1072
#
#  Individual tests: 
# Statistic: 
# [1] 1.61
# 
# 
#  P-value: 
# [1] 0.1072
# 
#  Estimate and confidence interval[1] "1.83 (-0.4 to 4.05)"

发现没,其实这个包的核心是检测相位和振幅的差异,但是我们就无法推测基线(均值)差异了吗?其实不然,由summary(fit)我们其实已经得到了均值差异的结果

summary(fit)$transformed.table

1处代表默认状态下的均值水平,2处代表因子X=1时与默认状态下的均值差异,所以3处代表的就是均值差异的p值
因此,cosinor包中的比较是分散的,整理结果时需要花一番手脚

2.cosinor2

#构造测试数据集
set.seed(100)
mat1 <- data.frame(row.names = paste0("subject",1:10),
                   x1=1+rnorm(10,0,0.15),
                   x2=2+rnorm(10,0,0.2),
                   x3=3+rnorm(10,0,0.3),
                   x4=4+rnorm(10,0,0.4),
                   x5=3+rnorm(10,0,0.35),
                   x6=2+rnorm(10,0,0.25))
mat2 <- data.frame(row.names = paste0("subject",1:12),
                   x1=8+rnorm(12,0,0.85),
                   x2=6+rnorm(12,0,0.6),
                   x3=4+rnorm(12,0,0.45),
                   x4=2+rnorm(12,0,0.25),
                   x5=4+rnorm(12,0,0.4),
                   x6=6+rnorm(12,0,0.65))
time=seq(1,21,4)

library(cosinor2)
fit.pa_ext.cosinor <- population.cosinor.lm(data = mat1 , time = time, period = 24)
#     MESOR Amplitude Acrophase
# 1 2.520422  1.364075 -3.399919
fit.pa_int.cosinor <- population.cosinor.lm(data = mat2, time = time, period = 24)
#     MESOR Amplitude  Acrophase
# 1 4.954428  2.617844 -0.2602902
cosinor.poptests(fit.pa_ext.cosinor, fit.pa_int.cosinor)
#                   F df1 df2            p 1st population 2nd population
# MESOR     921.87276   1  20 3.315962e-18       2.520422      4.9544275
# Amplitude  14.83446   1  20 9.952569e-04       1.364075      2.6178436
# Acrophase 124.81985   1  20 4.755292e-10      -3.399919     -0.2602902

# Warning message:
# In cosinor.poptests(fit.pa_ext.cosinor, fit.pa_int.cosinor) :
#   Results of population amplitude difference test are not reliable due to different acrophases.
看到最后三行没,有一个大大的警告:如果两个种群的峰值相位显著不同,则振幅差异检验的结果不可靠,不可靠,不可靠,重要的结果说三遍

因此,这两个包都有一点点缺陷,不影响正常使用,但使用起来总觉得有点不舒服

3.终极R包circacompare

circacompare is an R package that allows for the statistical analyses and comparison of two circadian rhythms. This work is published here and can be cited as:
Rex Parsons, Richard Parsons, Nicholas Garner, Henrik Oster, Oliver Rawashdeh, CircaCompare: A method to estimate and statistically support differences in mesor, amplitude, and phase, between circadian rhythms, Bioinformatics, https://doi.org/10.1093/bioinformatics/btz730

示例1:检测单个基因是否振荡

library(circacompare)
# create an example data frame with a phase shift between groups of 6 hours.
df <- make_data(hours_diff = 6)
df <- df[df$group == "g1",]
head(df)
#   time measure group
# 1    1    0.35    g1
# 2    2   11.00    g1
# 3    3    7.60    g1
# 4    4   11.00    g1
# 5    5    6.90    g1
# 6    6   15.00    g1

out <- circa_single(x = df, col_time = "time", col_outcome = "measure")# call circacompare.
# view the graph.
out[[3]]
# view the results.
out[[2]]
#       mesor amplitude  amplitude_p phase_radians peak_time_hours
# 1 0.2229167  10.39622 1.266734e-09      1.514854        5.786316
summary(out[[1]])
# Formula: measure ~ k + alpha * cos(time_r - phi)
# 
# Parameters:
#         Estimate Std. Error t value Pr(>|t|)    
# k      0.22292    0.71757   0.311    0.759    
# alpha 10.39622    1.01480  10.245 1.27e-09 ***
# phi    1.51485    0.09761  15.519 5.57e-13 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 3.515 on 21 degrees of freedom
# 
# Number of iterations to convergence: 5 
# Achieved convergence tolerance: 2.926e-07
nlstools::confint2(out[[1]],level = 0.95)
#             2.5 %    97.5 %
# k     -1.269350  1.715183
# alpha  8.285839 12.506606
# phi    1.311859  1.717849

解读:out[[3]]是ggplot2对象,可以直接输出,也可以修改各种基于ggplot2的参数,图形左上角表明基因振荡是否显著


out[[2]]是拟合主要统计值,有1行5列,分别为均值(基线),振幅,振幅p值(也就是振荡显著性的p值),峰值相位弧度值,峰值相位小时值

out[[1]]为拟合结果,summary查看总结输出更详细的统计资料,
nlstools::confint2获得置信区间

示例2 检测单个基因在两组间是否振荡:核心内容

df <- make_data(hours_diff = 6)
table(df$group)# 2 groups
# g1 g2 
# 24 24 

out <- circacompare(x = df, col_time = "time", col_group = "group", col_outcome = "measure")# call circacompare.
# view the graph.
out[[1]]

# view the results.
out[[2]]$value <- round(out[[2]]$value,4)
out[[2]]
#                                   parameter   value
# 1                 Both groups were rhythmic  1.0000
# 2  Presence of rhythmicity (p-value) for g1  0.0000
# 3  Presence of rhythmicity (p-value) for g2  0.0000
# 4                         g1 mesor estimate  0.2229
# 5                         g2 mesor estimate  2.8946
# 6                 Mesor difference estimate  2.6717
# 7              P-value for mesor difference  0.0027
# 8                     g1 amplitude estimate 10.3962
# 9                     g2 amplitude estimate 14.2856
# 10            Amplitude difference estimate  3.8894
# 11         P-value for amplitude difference  0.0021
# 12                             g1 peak time  5.7863
# 13                             g2 peak time 23.9125
# 14                Phase difference estimate -5.8738
# 15          P-value for difference in phase  0.0000

summary(out[[3]])
nlstools::confint2(out[[3]])
解读:out[[1]]同样返回的为ggplot2对象,如图:


out[[2]]为拟合主要统计值,有15行2列:

1.两组都振荡的p值
2.g1组振荡显著性的p值
3.g2组振荡显著性的p值
4~7.g1组均值,g2组均值,两组均值差异,均值差异显著性
8-11.g1组振幅,g2组振幅,两组振幅差异,振幅差异显著性
12-15.g1组峰值相位,g2组峰值相位,两组峰值相位差异,峰值相位差异显著性
out[[3]]为拟合结果,summary查看总结输出更详细的统计资料,
nlstools::confint2获得置信区间

注意:单个时out[[3]]为图像,out[[1]]为拟合结果
比较时out[[1]]为图像,out[[3]]为拟合结果

ps:怎么样,是不是心动了,还有比这更好使的做cosinor工具吗?只恨没有早点发现啊,赶紧devtools::install_github("RWParsons/circacompare")安装起来吧

不喜欢代码的,也可以试试Shiny版本哦,链接在此:https://rwparsons.shinyapps.io/circacompare/
注意:
您上传的数据应为csv格式,同时您上传的文件应包含以下内容的列:

  • 1.时间点(应为数字,以小时为单位)
  • 2.分组变量(可以是任何格式,但只能有两个可能的值)
  • 3.测量值(应为数字)
    上传您的csv文件,然后从下拉菜单中选择相应的列,单击“运行”进行比较。
示例df数据

作者:leoxiaobei

原文链接:https://www.jianshu.com/p/27cb5a9893c6

文章分类
后端
版权声明:本站是系统测试站点,无实际运营。本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容, 请发送邮件至 XXXXXXo@163.com 举报,一经查实,本站将立刻删除。
相关推荐