# Normal distribution and histogram in R

I recently spent a lot of time looking for a tool that would let me easily draw a histogram with a normal distribution curve on the same diagram. I could create the histogram in OOCalc by using the `FREQUENCY()` function and creating a column chart, but I found no way to add a curve, so I gave up and started searching for something more powerful than OpenOffice. Of course, no Windows applications were allowed.

I googled my problem before trying Maxima or something similar, and I found R. I hadn’t heard of the R project before, but I decided to give it a try, and it was worth it. Even if I had managed to do the same in Octave or Maxima, I don’t think it would have been any easier.

### Installation

In my case it was just:

```
emerge dev-lang/R
```

You should probably look for the package specific to your distribution (as far as I know, R is available in the Ubuntu repository), or download it from the official site.

After installation, type

```
R
```

in the terminal, to launch the R console.

### Where the magic begins

We’ll do everything in just a few lines of code. Let’s start by preparing the data. All we need is a comma-separated list of numbers (probably much longer than in this example), which we’ll store in a vector:

```
x = c(4.14, 4.14, 4.16, 4.15, 4.19, 4.13, 4.16, 4.17)
```
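If the numbers already exist as one long comma-separated string (for example, pasted from a spreadsheet), retyping them inside `c()` is not necessary. The following is a minimal sketch, assuming the values are held in a string; the variable name `raw` is hypothetical and not part of the original example:

```r
# Hypothetical input: the measurements as a single comma-separated string
raw <- "4.14, 4.14, 4.16, 4.15, 4.19, 4.13, 4.16, 4.17"

# scan() parses the string into a numeric vector;
# quiet = TRUE suppresses the "Read 8 items" message
x <- scan(text = raw, sep = ",", quiet = TRUE)

length(x)  # number of values that were read
```

The result should be the same kind of numeric vector as the one built with `c()`, so the rest of the steps apply unchanged.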

Creating the histogram is as simple as this:

```
hist(x)
```

but you may want to use something like:

```
hist(x, col="#d3d3d3", xlim=c(4.10, 4.22), ylim=c(0, 20), probability=TRUE)
```

where `col="#d3d3d3"` is the histogram color, `xlim` and `ylim` define the range of x and y values, and `probability=TRUE` gives you probability density on the y axis.
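To see what "probability density on the y axis" means in practice, it can help to inspect the numbers behind the plot. This is a small sketch using `hist()` with `plot = FALSE`, which returns the bin data without drawing anything; the variable name `h` is just an illustration:

```r
x <- c(4.14, 4.14, 4.16, 4.15, 4.19, 4.13, 4.16, 4.17)

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(x, plot = FALSE)

h$breaks   # bin boundaries chosen by R
h$counts   # number of values in each bin
h$density  # bar heights used when probability = TRUE

# With densities, the bar areas (height * width) add up to 1,
# which puts the histogram on the same scale as a density curve
total_area <- sum(h$density * diff(h$breaks))
total_area
```

Because the total bar area is 1, a density curve drawn on top of the histogram is on a comparable scale, which would not be the case with raw counts.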

Then, keeping in mind that `sd(x)` returns the standard deviation and `mean(x)` returns the mean of the values in `x`, and that `dnorm()` gives the density of the normal distribution, we can add a normal distribution curve to our diagram:

```
s = sd(x)
m = mean(x)
curve(dnorm(x, mean=m, sd=s), add=TRUE)
```
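Two details of this step are easy to miss: `sd()` computes the sample standard deviation (dividing by n - 1), and the `x` inside `curve()` is the curve's own plotting variable rather than our data vector. The sketch below checks both functions against their textbook formulas; the names `grid`, `manual_sd`, and `manual_density` are made up for illustration:

```r
x <- c(4.14, 4.14, 4.16, 4.15, 4.19, 4.13, 4.16, 4.17)
m <- mean(x)
s <- sd(x)

# sd() is the sample standard deviation: it divides by n - 1
manual_sd <- sqrt(sum((x - m)^2) / (length(x) - 1))

# A few points along the x axis at which to evaluate the density
grid <- seq(4.10, 4.22, by = 0.02)

# Normal density written out by hand:
# 1 / (s * sqrt(2 * pi)) * exp(-(x - m)^2 / (2 * s^2))
manual_density <- 1 / (s * sqrt(2 * pi)) * exp(-(grid - m)^2 / (2 * s^2))

all.equal(manual_sd, s)
all.equal(manual_density, dnorm(grid, mean = m, sd = s))
```

Both comparisons should report agreement, which confirms that `curve()` simply evaluates this density over the visible x range and draws the result.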

The `add=TRUE` option tells R to add the curve to the existing diagram instead of replacing it. The `col=` option for changing the curve’s color is also available.
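Putting the steps together, here is a sketch of the whole thing as one script, with a colored curve. Writing the result to a PDF with `pdf()` and `dev.off()` is my own addition for running the script outside the interactive console, and the file name `histogram.pdf` is hypothetical:

```r
# Same sample values as in the example above
x <- c(4.14, 4.14, 4.16, 4.15, 4.19, 4.13, 4.16, 4.17)
m <- mean(x)
s <- sd(x)

# Assumption: save the plot to a PDF file (hypothetical file name)
out_file <- "histogram.pdf"
pdf(out_file)

# Histogram with density on the y axis
hist(x, col = "#d3d3d3", xlim = c(4.10, 4.22), ylim = c(0, 20), probability = TRUE)

# Normal curve on top of the histogram, drawn in red
curve(dnorm(x, mean = m, sd = s), add = TRUE, col = "red")

# Close the device so the file is written to disk
invisible(dev.off())
```

In the interactive R console the `pdf()` and `dev.off()` lines can be left out, and the plot appears in a window as before.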

And that’s all. Quite simple, isn’t it?