{"title": "R Common Functions Reference", "update_time": "2015-10-08 09:54:02", "tags": "common functions", "pid": "331", "icon": "default.png"}
rep: generates a repeated sequence

```
> rep(c(1,3),3)
[1] 1 3 1 3 1 3
```

colMeans, rowMeans: compute the mean of each column or each row

```
> x1 <- matrix(rep(c(1,3),4),ncol=2)
> x1
     [,1] [,2]
[1,]    1    1
[2,]    3    3
[3,]    1    1
[4,]    3    3
> colMeans(x1)  # mean of each column
[1] 2 2
> rowMeans(x1)  # mean of each row
[1] 1 3 1 3
```

apply: applies a function to each row (MARGIN = 1) or each column (MARGIN = 2) of a matrix

```
> x1
     [,1] [,2]
[1,]    1    1
[2,]    3    3
[3,]    1    1
[4,]    3    3
> apply(x1,1,sum)  # sum of each row
[1] 2 6 2 6
> apply(x1,2,sum)  # sum of each column
[1] 8 8
```

sapply: applies a function to each column of a data.frame

```
> sapply(iris[-5],mean)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
    5.843333     3.057333     3.758000     1.199333 
```

tapply: groups values by a factor and applies a function to each group, similar to GROUP BY in SQL

```
> tapply(iris[,'Petal.Width'],iris[,'Species'],mean)
    setosa versicolor  virginica 
     0.246      1.326      2.026 
```

nrow, ncol, dim: report the number of rows and columns of a data.frame

```
> nrow(iris)
[1] 150
> ncol(iris)
[1] 5
> dim(iris)
[1] 150   5
```

cut and table: bin values into intervals and count how many fall into each bin

```
> x <- c(1,4,2,3,7,10,22,30,18)
> table(cut(x,breaks=c(0,10,20,30)))

 (0,10] (10,20] (20,30] 
      6       1       2 
```
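
Two behaviours of the functions above are easy to miss, so here is a minimal sketch using only base R. It shows how `rep` differs between its `times` and `each` arguments, and how `cut` treats a value equal to the lowest break. The vector `x` and the variable names are illustrative and are not part of the original notes.

```r
# rep(): `times` repeats the whole vector, `each` repeats every element in place.
r_times <- rep(c(1, 3), times = 3)  # 1 3 1 3 1 3
r_each  <- rep(c(1, 3), each = 3)   # 1 1 1 3 3 3

# cut(): intervals are right-closed by default, e.g. (0,10], so a value equal
# to the lowest break (0 here) falls outside every bin and becomes NA.
x <- c(0, 1, 10, 18, 30)
bins_default <- table(cut(x, breaks = c(0, 10, 20, 30)), useNA = "ifany")

# include.lowest = TRUE closes the first interval on the left as well: [0,10].
bins_lowest <- table(cut(x, breaks = c(0, 10, 20, 30), include.lowest = TRUE))

print(r_times)
print(r_each)
print(bins_default)
print(bins_lowest)
```

When the data can contain the minimum break value itself, passing `include.lowest = TRUE` (or choosing a first break below the minimum) avoids silently dropping those observations from the counts.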