In this article you will learn what standard deviation is and how to calculate it in R, both with the sd() function and step by step.
Standard deviation is the square root of the variance.
The sd() function calculates the standard deviation of a numeric vector, or of an R object coercible to one by as.double().
sd(v, na.rm = FALSE)
sd | name of function |
v | vector or R object |
na.rm | whether missing values should be removed before the calculation: TRUE removes them, FALSE (the default) keeps them |
sd() calculates the sample standard deviation, i.e. the denominator is N - 1. To get the population standard deviation, divide by N instead.
For vectors of length 0 or 1 this function returns NA.
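As a minimal sketch of how the na.rm argument and the short-vector rule behave, consider the following. The vector x is made up for illustration and is not part of the original example.

```r
# Illustrative vector with one missing value (made-up data)
x <- c(2, 4, NA, 8)

sd(x)                # NA, because the missing value propagates
sd(x, na.rm = TRUE)  # standard deviation of 2, 4 and 8 only

# Vectors of length 0 or 1 have no sample standard deviation
sd(numeric(0))       # NA
sd(5)                # NA
```

With na.rm = TRUE the result is the same as calling sd() on the vector with the NA removed by hand.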
> v <- c(1:10)
> sd(v)
[1] 3.02765
Here v is a numeric vector of the values 1 to 10. To find the standard deviation of these numbers, we simply pass the vector as an argument to sd(), which returns 3.02765. In the second example, sd() returns NA because it is given a vector of length 1.
> sd(5)
[1] NA
To understand the concept of standard deviation and how it is calculated, let's go through it step by step in R.
Standard deviation is a measure of the dispersion of data: a higher value means the data is more spread out, and a lower value means the data is concentrated around the mean.
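To make the idea of dispersion concrete, here is a small sketch comparing two vectors that share the same mean but differ in spread. The names tight and wide, and their values, are invented for this illustration.

```r
# Two made-up vectors with the same mean (5.5) but different spread
tight <- c(5, 5, 5, 6, 6, 6)
wide  <- c(1, 2, 3, 8, 9, 10)

mean(tight); mean(wide)  # both 5.5
sd(tight)                # smaller: values sit close to the mean
sd(wide)                 # larger: values lie far from the mean
```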
Suppose you have 10 values in your data set:
1,2,3,4,5,6,7,8,9,10
and you want to calculate the standard deviation of this data.
The first step is to find the mean of all values. The mean is calculated by adding all the values and dividing the sum by the total number of values.
mean = (1+2+3+4+5+6+7+8+9+10)/10 = 5.5
In R you can find the mean of a vector with the mean() function.
> v <- c(1:10); v
 [1]  1  2  3  4  5  6  7  8  9 10
> m <- mean(v); m
[1] 5.5
Of course, this is a very simple step.
Next, subtract the mean from each value. This gives the difference from the mean for each item.
> d <- v - m ; d
 [1] -4.5 -3.5 -2.5 -1.5 -0.5  0.5  1.5  2.5  3.5  4.5
Some of the differences are negative and some are positive. Let's square them so that they are all non-negative and do not cancel each other out when summed.
> ds <- d^2; ds
 [1] 20.25 12.25  6.25  2.25  0.25  0.25  2.25  6.25 12.25 20.25
Sum the squared differences. This gives the sum of squared differences from the mean. After that there are two scenarios: to calculate the sample variance, divide the sum by N - 1, where N is the total count of values; to calculate the population variance, divide the sum by N.
Here we are calculating for a sample.
> var <- sum(ds)/(length(v)-1); var
[1] 9.166667
The sample variance is approximately 9.17.
For N, we use the function length(v), which returns the total number of values; we then subtract 1 from it.
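One way to sanity-check this step is to compare the hand-computed value with R's built-in var() function, which also uses the N - 1 denominator. The sketch below uses the variable name s2, chosen here for illustration so that it does not reuse the name of the built-in function.

```r
v  <- 1:10
ds <- (v - mean(v))^2             # squared differences from the mean
s2 <- sum(ds) / (length(v) - 1)   # sample variance, denominator N - 1

sum(ds)                           # 82.5
all.equal(s2, var(v))             # TRUE: matches the built-in sample variance
```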
The last step is to take the square root of the variance, which gives the standard deviation of the vector.
> sqrt(var)
[1] 3.02765
The value is 3.02765, the same value returned by the sd() function. You have now seen how this function works step by step. You can also find the population standard deviation by dividing by N instead of N - 1 in the variance step.
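For the population case, a minimal sketch might look like the following. The helper sd_pop is hypothetical (base R does not provide a function by that name); it simply repeats the steps above with N as the denominator.

```r
# Hypothetical helper, not part of base R: population standard deviation
sd_pop <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]     # optionally drop missing values
  n <- length(x)
  sqrt(sum((x - mean(x))^2) / n)   # divide by N instead of N - 1
}

v <- 1:10
sd_pop(v)                                   # population standard deviation
sd(v) * sqrt((length(v) - 1) / length(v))   # same result by rescaling sd()
```

For the vector 1 to 10 the population variance is 82.5/10 = 8.25, so both lines give its square root, about 2.87, slightly smaller than the sample value of 3.02765.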