The Binomial Distribution

Use the binomial probability when the outcome of experiments is success or failure, when the experiments are independent of one another, and when the probability of success doesn't change in successive trials. A coin toss is a binomial experiment.

**Example: **let success be a flip of heads. In two coin tosses (trials),

- 0 heads .... pdf( N=2, p=.5, X=0 ) = .25
- 1 head ..... pdf( N=2, p=.5, X=1 ) = .50
- 2 heads .... pdf( N=2, p=.5, X=2 ) = .25
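The same three probabilities can be reproduced with Python's `scipy.stats` (an illustrative cross-check; the `pdf`/`cdf` notation above belongs to this document's own software):

```python
from scipy.stats import binom

# P(k heads in n=2 fair tosses), matching pdf( N=2, p=.5, X=k )
print(binom.pmf(0, n=2, p=0.5))  # 0.25
print(binom.pmf(1, n=2, p=0.5))  # 0.50
print(binom.pmf(2, n=2, p=0.5))  # 0.25
```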

**Example: **you flip a coin twenty times and get more than fourteen heads. Is it biased?

cdf( N=20, p=.5, x=14) = 0.9793053

is the probability of **no more **than fourteen heads, so the probability of fifteen or more is `1 - 0.9793053 = 0.0206947`, slightly more than two per cent. The event is rare, but not unheard of. You would say the result is significant at a 95% confidence level but not at 99%. But ask for a new coin anyway.
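As a sketch of the same tail computation in scipy.stats, note that `sf` (the survival function) returns the probability of *strictly more* than the given count, which is exactly the "fifteen or more" event here:

```python
from scipy.stats import binom

p_at_most_14 = binom.cdf(14, n=20, p=0.5)  # probability of no more than 14 heads
p_15_or_more = binom.sf(14, n=20, p=0.5)   # 1 - cdf: probability of 15 or more
```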

**Example: **using a random sample, a polling organization asks 50 voters if they favor Candidate A for reelection. Given that 55% of the city's voters favor Candidate A, this formula returns the probability that exactly 33 people from the sample will favor her:

pdf( N=50, p=.55, X=33 ) = 0.0338768

The Beta Distribution

The beta probability is useful for studying a variable, such as a percentage or probability, which may only take on values within some restricted range [x_{0}, x_{1}].

If a < 1 and b < 1, the graph is U-shaped, tending to infinity at the limits. If a > 1 and b > 1, the graph is bell-shaped. For a = b = 1, we get the uniform density as a special case. You might want to experiment with a > 1 and b < 1.

B(a,b) in the formula above is defined as G(a) G(b) / G(a+b), where G denotes the gamma function. The mean of the beta distribution is a/(a+b), and its variance is ab / ((a+b)^{2} (a+b+1)).
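A quick numerical check of these moment formulas, sketched with scipy.stats and arbitrary parameters a = 2, b = 5:

```python
from scipy.stats import beta

a, b = 2.0, 5.0  # arbitrary illustrative parameters

# mean = a/(a+b), variance = ab / ((a+b)^2 (a+b+1))
assert abs(beta.mean(a, b) - a / (a + b)) < 1e-12
assert abs(beta.var(a, b) - a * b / ((a + b) ** 2 * (a + b + 1))) < 1e-12
```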

The Cauchy Distribution

The Cauchy distribution, also called Lorentzian, is the distribution of the quotient of two independent standard normal random variables. Its tails decrease only polynomially, so they are asymptotically much heavier than those of any normal distribution, whose tails decrease exponentially.

It is interesting for a number of theoretical reasons. Although the distribution is symmetric about zero, so its median is zero, the expectation, variance, higher moments, and moment generating function do not exist.

For independent identically distributed Cauchy variables X_{1}, ..., X_{n}, the *average* (X_{1} + ... + X_{n}) / n has, surprisingly, the same density as each X_{j}.
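The nonexistent moments can be observed directly in scipy.stats, which reports them as NaN, while the median remains perfectly well defined (a numerical illustration, not a proof):

```python
import math

from scipy.stats import cauchy

# The expectation and variance do not exist; scipy signals this with NaN.
assert math.isnan(cauchy.mean())
assert math.isnan(cauchy.var())

# The center of symmetry (the median) is well defined.
assert cauchy.median() == 0.0
```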

The Chi-Square Distribution

If Z is a standard normal random variable, then the distribution of Z^{2} is chi-square with one degree of freedom.

If U_{1}, U_{2}, ..., U_{n} are independent chi-square variables with one degree of freedom, then the distribution of U_{1} + U_{2} + ... + U_{n} is chi-square with n degrees of freedom.
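The first fact gives an exact identity that is easy to verify numerically: P(Z^{2} <= x) = P(-sqrt(x) <= Z <= sqrt(x)) = 2 Phi(sqrt(x)) - 1, so the chi-square cdf with one degree of freedom must agree with it. A sketch using scipy.stats:

```python
import math

from scipy.stats import chi2, norm

x = 2.5  # arbitrary test point

lhs = chi2.cdf(x, df=1)                 # P(Z^2 <= x)
rhs = 2 * norm.cdf(math.sqrt(x)) - 1    # P(-sqrt(x) <= Z <= sqrt(x))
assert abs(lhs - rhs) < 1e-12
```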

The distribution is used in chi-square tests, which allow you to compare the differences between observed and expected counts.

The Exponential Distribution

The exponential distribution, also known as the waiting-time distribution, describes the amount of time or distance between occurrences of random events (such as the time between major earthquakes or the time between no-hitters pitched in major league baseball).

Use this distribution in connection with estimating the length of material life, or the length of time a process might take.
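A minimal waiting-time sketch with scipy.stats, assuming (purely for illustration) events at a rate of 0.5 per year, so the mean wait is 2 years; note that scipy parameterizes the exponential by scale = 1/lambda:

```python
import math

from scipy.stats import expon

lam = 0.5                    # assumed rate: 0.5 events per year
wait = expon(scale=1 / lam)  # scipy uses scale = 1/lambda

# P(waiting time exceeds 3 years) = exp(-lambda * 3)
p_longer_than_3 = wait.sf(3.0)
assert abs(p_longer_than_3 - math.exp(-lam * 3.0)) < 1e-12
```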

The F-Distribution

F stands for Sir Ronald Fisher, the English geneticist and statistician. The distribution is used in the analysis of variance; it is the distribution of the ratio of two independent chi-square random variables, each divided by its number of degrees of freedom.
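The construction can be checked by a seeded simulation: the ratio of two independent chi-square variables, each divided by its degrees of freedom, should behave like an F variable. A sketch with numpy and scipy, comparing the simulated mean with the known F mean d2/(d2 - 2) for d2 > 2:

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)
d1, d2 = 5, 10  # arbitrary degrees of freedom

u = rng.chisquare(d1, size=200_000)
v = rng.chisquare(d2, size=200_000)
ratio = (u / d1) / (v / d2)  # should be approximately F(d1, d2)

# F(d1, d2) has mean d2/(d2 - 2) when d2 > 2 (= 1.25 here).
assert abs(f.mean(d1, d2) - d2 / (d2 - 2)) < 1e-12
assert abs(ratio.mean() - 1.25) < 0.05
```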

The Gamma Distribution

Gamma densities provide a fairly flexible class for modeling nonnegative random variables. The **chi square** distribution is an instance of the gamma, and when alpha = 1, the gamma distribution is an **exponential** distribution.

The gamma distribution can be used to describe the waiting time until alpha events have occurred, given an event rate Lambda. Contrast it with the exponential distribution, which describes the waiting time until *one* event occurs.

Verify that the formula is the one you expected to use. Certain texts define the distribution as Gamma( x; a, b), where b is the reciprocal of Lambda.
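Both remarks can be checked numerically with scipy.stats, which happens to use the reciprocal convention mentioned above (shape a and scale b = 1/Lambda), and whose gamma with alpha = 1 collapses to the exponential; Lambda = 0.5 here is an arbitrary illustrative rate:

```python
from scipy.stats import expon, gamma

lam, x = 0.5, 2.0  # arbitrary rate and test point

# scipy parameterizes the gamma by shape a and scale b = 1/lambda.
g = gamma.pdf(x, a=1.0, scale=1 / lam)
e = expon.pdf(x, scale=1 / lam)

# With a = 1 the gamma density is exactly the exponential density.
assert abs(g - e) < 1e-12
```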

The Geometric Distribution

The geometric distribution describes a set of Bernoulli trials with a fixed Bernoulli probability p. It gives the probability distribution of N, the trial number of the **first success.**

The number of trials is unknown. The experiment continues until the first success occurs. In contrast with the binomial distribution, N is the random variable.

**Example:** let success in a trial be defined as a dice roll of a **six**.
The probability of success on the first trial will be one-sixth.

- pdf( p=1/6, N=1 ) = 1/6

On the second trial, the probability will be one-sixth of the remaining chances. The `pdf( p=1/6, N=2 )` will be `5/6 x 1/6`, or 5/36.
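The same two values can be reproduced with `scipy.stats.geom`, whose pmf(N, p) = (1 - p)^(N-1) p counts the trial number of the first success:

```python
from scipy.stats import geom

p = 1 / 6  # probability of rolling a six

assert abs(geom.pmf(1, p) - 1 / 6) < 1e-12   # success on the first roll
assert abs(geom.pmf(2, p) - 5 / 36) < 1e-12  # first success on the second roll
```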

The Gumbel Distribution

The Gumbel distribution, a special case of the Fisher-Tippett distribution, is particularly convenient for extreme-value analysis, and it may be used as an alternative to the normal distribution in the case of skewed empirical data.

The Gumbel distribution is used in flood frequency analysis, to determine the probability that a given flow will occur within a given time interval.
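A small flood-frequency sketch with `scipy.stats.gumbel_r`; the location and scale values here are hypothetical, not fitted to any dataset. One exact fact that makes a handy check is that the Gumbel cdf evaluated at its location parameter equals exp(-1):

```python
import math

from scipy.stats import gumbel_r

loc, scale = 10.0, 2.0  # hypothetical parameters for annual peak flow
flow = gumbel_r(loc=loc, scale=scale)

# Probability that a given year's peak flow exceeds 14 (illustrative only).
p_exceed = flow.sf(14.0)

# Exact check: F(loc) = exp(-exp(0)) = 1/e.
assert abs(flow.cdf(loc) - math.exp(-1)) < 1e-12
```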

The Hypergeometric Distribution

The distribution describes random sampling **without replacement **when information about the population as a whole is known. The major practical use of the hypergeometric distribution is in **survey sampling**, although textbook examples mostly concern urns filled with black and red balls.

**Example: **five cards are drawn from a deck of 52 playing cards. This formula calculates the probability that exactly one of the five cards drawn is an ace (assuming there are only four aces in the deck):

- pdf( N=52, n=5, T=4, X=1 ) = 0.299474

The probability that no more than one of the cards is an ace is

- cdf( N=52, n=5, T=4, X=1 ) = 0.958316
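Both card-deck values can be reproduced with `scipy.stats.hypergeom`, whose arguments are M (the population size), n (the number of successes in the population), and N (the sample size):

```python
from scipy.stats import hypergeom

# 5 cards drawn from a 52-card deck containing 4 aces
deck = hypergeom(M=52, n=4, N=5)

assert abs(deck.pmf(1) - 0.299474) < 1e-6  # exactly one ace
assert abs(deck.cdf(1) - 0.958316) < 1e-6  # no more than one ace
```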

SampleSuccess must be at least as large as the greater of 0 and (SampleSize - PopSize + PopSuccess).

The Lognormal Distribution

A random variable **X** is said to follow a lognormal distribution if
the random variable Y = log( **X** ) is normally distributed.

The distribution of particle sizes in crushing, when there have been repeated impacts, is often skewed, with a slowly decreasing right tail. The lognormal is sometimes used to fit such a distribution.
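The defining relation can be verified exactly with scipy.stats, where `lognorm` takes s = sigma and scale = exp(mu):

```python
import math

from scipy.stats import lognorm, norm

mu, sigma, x = 1.0, 0.5, 3.0  # arbitrary parameters and test point

# If Y = log(X) ~ Normal(mu, sigma), then
# P(X <= x) = P(Y <= log x) = Phi((log x - mu) / sigma).
lhs = lognorm.cdf(x, s=sigma, scale=math.exp(mu))
rhs = norm.cdf((math.log(x) - mu) / sigma)
assert abs(lhs - rhs) < 1e-12
```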

The Negative Binomial Distribution

The negative binomial distribution describes the number of additional trials N, given a success rate p, needed to achieve r successes. Note that the total number of trials is r+N.

**Example: ** a door-to-door encyclopedia salesperson is required to make five in-home visits each day. Suppose he has a 30% chance of being invited into any given home. If he selects, ahead of time, the addresses of 30 households upon which to call, what is the probability that A) he requires fewer than eight addresses to make five calls? B) he requires more than twenty-five addresses to make five calls?

- A) cdf( r=5, p=.3, N=2 ) = 0.0287955, or about one chance in 35.

- B) 1 - cdf( r=5, p=.3, N=20 ) = .090471919

**Conventions: ** the statistics textbooks are evenly divided between this random variable N = number of trials in excess, and the variable X = total trials. The conversion formula is X = N + r, and so the above results could also be written as

- A) cdf( r=5, p=.3, X=7 )= 0.0287955.

- B) 1 - cdf( r=5, p=.3, X=25 ) = .090471919
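Both answers can be reproduced with `scipy.stats.nbinom`, which, like the N convention here, counts the failures before the r-th success, so a total of X trials corresponds to N = X - r:

```python
from scipy.stats import nbinom

r, p = 5, 0.3

# A) at most 7 addresses in total, i.e. at most N = 2 unsuccessful visits
a = nbinom.cdf(2, r, p)
assert abs(a - 0.0287955) < 1e-7

# B) 1 - cdf( X=25 ) above, i.e. more than N = 20 unsuccessful visits
b = nbinom.sf(20, r, p)
assert abs(b - 0.090471919) < 1e-6
```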

The Normal Distribution

The graph of this distribution is the bell-shaped curve called the normal, or Gaussian, probability curve. The graph can illustrate the idea of a percentile. Click *Display Graph* to see that one standard deviation above the mean is equivalent to roughly the 84th percentile.
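The 84th-percentile remark is just Phi(1), easy to confirm with scipy.stats:

```python
from scipy.stats import norm

# One standard deviation above the mean sits near the 84th percentile.
assert abs(norm.cdf(1.0) - 0.8413) < 1e-4
```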

Many sets of measurements have been found to have this frequency distribution. For example, let x_{i} be the number of 6's cast in the N respective runs of n tosses of a die, and assume N to be moderately large. Let y_{j} be the weights, correct to the nearest 1/100 g, of N lima beans chosen haphazardly from a 100-kg bag of lima beans. Let z_{k} be the barometric pressures recorded to the nearest 1/1000 cm by N students in succession, reading the same barometer. It will be observed that the x's, y's, and z's have an amazingly similar frequency pattern.

The Central Limit Theorem states that, as a sample becomes large, the distribution of its mean (from a population with finite variance) approaches the normal distribution.
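A seeded simulation sketch of the Central Limit Theorem: means of samples drawn from a decidedly non-normal uniform population cluster around the population mean with the predicted standard error sqrt(variance / n):

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 50, 4000

# 4000 samples of size 50 from Uniform(0, 1), reduced to their means.
means = rng.uniform(0, 1, size=(reps, n)).mean(axis=1)

# Uniform(0,1) has mean 0.5 and variance 1/12, so the standard
# error of the mean is sqrt(1 / (12 * n)).
se = (1 / (12 * n)) ** 0.5
assert abs(means.mean() - 0.5) < 0.005
assert abs(means.std() - se) < 0.005
```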

The Poisson Distribution

Poisson probability is the probability that exactly X events will occur in a given space or over a given time period.

**Example: **on average, Company ABC receives 30 customer service phone calls per hour. What is the probability that Company ABC will receive exactly 35 calls in one hour?

- pdf( Lambda=30, X=35 ) = 0.0453082

The probability that Company ABC will receive 35 **or fewer **calls in one hour is

- cdf( Lambda=30, X=35 ) = 0.8426165
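Both call-center values can be reproduced with `scipy.stats.poisson`:

```python
from scipy.stats import poisson

calls = poisson(mu=30)  # an average of 30 calls per hour

assert abs(calls.pmf(35) - 0.0453082) < 1e-6  # exactly 35 calls
assert abs(calls.cdf(35) - 0.8426165) < 1e-6  # 35 or fewer calls
```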

The Rayleigh Distribution

The Rayleigh distribution was developed to describe the scattering of radiation.

It has a theoretical importance in the transformation of a standard bivariate normal distribution to polar coordinates, which can be used to construct an algorithm for generating standard normal random variables.
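The polar-coordinates connection gives an exact identity worth testing: if (Z_{1}, Z_{2}) are standard normal, the radius R = sqrt(Z_{1}^{2} + Z_{2}^{2}) is Rayleigh, and R^{2} is chi-square with 2 degrees of freedom, so the two cdfs must agree. A sketch with scipy.stats:

```python
from scipy.stats import chi2, rayleigh

x = 1.3  # arbitrary test point

# P(R <= x) must equal P(R^2 <= x^2) for R^2 ~ chi-square(2).
assert abs(rayleigh.cdf(x) - chi2.cdf(x * x, df=2)) < 1e-12
```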

The Student T Distribution

The Student's t-distribution is used to assess the significance of a difference of means of small samples when the two distributions are thought to have the same variance, but possibly different means.

It is the distribution of a random variable X = u / sqrt( v^{2} / Df ), where u and v^{2} are themselves independent random variables, u has a normal distribution with mean 0 and a standard deviation of 1, and v^{2} has a chi-square distribution with Df degrees of freedom.
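The construction can be checked by a seeded simulation, here with Df = 10, comparing the simulated moments with the exact t moments (mean 0 and variance Df / (Df - 2)):

```python
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(7)
df, size = 10, 200_000

u = rng.standard_normal(size)     # u ~ Normal(0, 1)
v2 = rng.chisquare(df, size)      # v^2 ~ chi-square(df)
x = u / np.sqrt(v2 / df)          # the construction above

# Exact t(10) variance is 10/8 = 1.25; the simulation should be close.
assert abs(t.var(df) - df / (df - 2)) < 1e-12
assert abs(x.mean()) < 0.02
assert abs(x.var() - 1.25) < 0.1
```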

The Weibull Distribution

The Weibull distribution is used to calculate the mean time to failure of a device. Industrial applications of survival analysis often involve testing components to destruction after subjecting them to a stress which is assumed to speed up the aging process. The name of this technique is *accelerated life testing.*

If the shape parameter beta = 1, the Weibull is the same as the exponential distribution with lambda = 1/alpha, the reciprocal of the scale parameter.
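This special case can be verified with scipy.stats, where `weibull_min` takes shape c and a scale parameter; with c = 1 its density matches the exponential with the same scale (rate 1/alpha):

```python
from scipy.stats import expon, weibull_min

alpha, x = 2.0, 1.5  # arbitrary scale parameter and test point

w = weibull_min.pdf(x, c=1.0, scale=alpha)  # Weibull with shape beta = 1
e = expon.pdf(x, scale=alpha)               # exponential with lambda = 1/alpha
assert abs(w - e) < 1e-12
```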