**Asslam-o-Aliqum !**

**Probability distribution** (Binomial, Poisson, Uniform, Normal, Exponential and Hypergeometric)

One approach to determining probability is to gather a large sample of data, organize the data into a relative frequency distribution, and rely on the relative frequencies to determine the probabilities of the various outcomes. This approach is time-consuming and expensive. In some cases, where specific circumstances and assumptions are met, it is possible to bypass this process and determine probabilities directly from a known probability distribution.

There are a variety of probability distributions that describe random variables and they fall into two categories: discrete probability distributions and continuous probability distributions.

**Discrete probability distributions** involve a random variable that can only take on integer values. For example, the number of runs that the Chicago Cubs score in a baseball game will be a number such as 0, 1, 2, 3, 4, …. The number of runs can only be an integer and cannot be a number such as 3.159.

**Continuous probability distributions** involve a random variable that can take on any real number value. For example, the weight of a 12 oz. can of Pepsi may actually be 12.02 oz., or 11.97 oz., or 12.1133412 oz., etc. That is, any real number value is possible, limited only by our ability or interest in measuring precisely.

There are a few major types or families of distributions that are particularly interesting because of their wide range of application. The more important discrete distributions are the binomial distribution and the Poisson distribution. The hypergeometric distribution and the discrete uniform distribution are also noteworthy. The important continuous distributions are the normal distribution and the exponential distribution. Likewise, it is important to be familiar with the continuous uniform distribution.

__Discrete Probability Distributions__

**Binomial Distribution**

The binomial distribution is fundamentally a Bernoulli process. It deals with consecutive independent trials, each of which has two possible outcomes, identified in general terms such as success versus failure, or yes versus no, with the same probability of success in every trial. The discrete random variable x is the number of successes that occur in n trials. Therefore, the binomial distribution has two parameters: the probability that the preferred outcome occurs in a trial (**p**), and the number of trials (**n**). The probability of exactly x successes in n trials can be calculated using the binomial formula:

$$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \qquad \binom{n}{x} = \frac{n!}{x!\,(n - x)!}$$

The mean and variance of the binomial probability distribution are:

E(X) = np

σ²(X) = np(1 − p)
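As a minimal sketch, the binomial formula P(X = x) = C(n, x) · p^x · (1 − p)^(n − x) can be evaluated directly in Python. The function names `binomial_pmf` and `binomial_mean_var` are illustrative choices, not part of the original notes; the values n = 5 and p = 0.10 are the ones quoted in the text.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    if not 0 <= x <= n:
        return 0.0
    # comb(n, x) counts the ways to place x successes among n trials.
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_mean_var(n: int, p: float) -> tuple[float, float]:
    """Return (E(X), variance) = (np, np(1 - p))."""
    return n * p, n * p * (1 - p)


n, p = 5, 0.10
print(round(binomial_pmf(3, n, p), 4))  # 0.0081
print(binomial_mean_var(n, p))
```

Summing `binomial_pmf(x, n, p)` over x = 0, …, n should give 1, which is a quick sanity check on any hand calculation.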

Examples: Binomial distribution

**Using Tables:**

The binomial probabilities for specific values of n and p are available in tables found in almost any statistics textbook. For the Realgro Nurseries example, where n = 5 and p = 0.10, and P(X = 3) was calculated, the same result could have been obtained from a binomial table. Below is the table corresponding to this situation, with the result highlighted.

**n=5**

| **p** | **0.1** | **0.2** | **0.3** | **0.4** | **0.5** | **0.6** | **0.7** | **0.8** | **0.9** |
|---|---|---|---|---|---|---|---|---|---|
| **x=0** | 0.5905 | 0.3277 | 0.1681 | 0.0778 | 0.0313 | 0.0102 | 0.0024 | 0.0003 | 0.0000 |
| **x=1** | 0.3281 | 0.4096 | 0.3601 | 0.2592 | 0.1562 | 0.0768 | 0.0284 | 0.0064 | 0.0005 |
| **x=2** | 0.0729 | 0.2048 | 0.3087 | 0.3456 | 0.3125 | 0.2304 | 0.1323 | 0.0512 | 0.0081 |
| **x=3** | 0.0081 | 0.0512 | 0.1323 | 0.2304 | 0.3125 | 0.3456 | 0.3087 | 0.2048 | **0.0729** |
| **x=4** | 0.0005 | 0.0064 | 0.0284 | 0.0768 | 0.1562 | 0.2592 | 0.3601 | 0.4096 | 0.3281 |
| **x=5** | 0.0000 | 0.0003 | 0.0024 | 0.0102 | 0.0313 | 0.0778 | 0.1681 | 0.3277 | 0.5905 |
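A table like the one for n = 5 can be regenerated rather than looked up. The sketch below assumes the table entries are binomial probabilities rounded to four decimals; the helper name `binomial_table` is hypothetical. The last printed digit can differ from a published table where a value falls exactly on a rounding boundary.

```python
from math import comb


def binomial_table(n: int, probs: list[float]) -> dict[int, list[float]]:
    """Map each x in 0..n to the row [P(X = x) for each p], rounded to 4 places."""
    return {
        x: [round(comb(n, x) * p**x * (1 - p) ** (n - x), 4) for p in probs]
        for x in range(n + 1)
    }


probs = [i / 10 for i in range(1, 10)]  # p = 0.1, 0.2, ..., 0.9
table = binomial_table(5, probs)
for x, row in table.items():
    print(f"x={x}", " ".join(f"{v:.4f}" for v in row))
```

Each column of the result should sum to approximately 1, up to rounding in the fourth decimal.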

**Hypergeometric Distribution:** This family of distributions is closely related to the family of binomial distributions. Similar to the binomial, the focus is on the number of successes in a given number of consecutive trials. The key distinctions are that the consecutive trials are not independent and the probability of success changes from one trial to the next. One way to recognize the distinction between the binomial and the hypergeometric distribution is that the hypergeometric involves sampling without replacement: because a sampled item is not returned to the population, the probability of success in the next trial changes (dramatically if the population size is small). The probability of exactly x successes in n trials that are not independent is:

$$P(X = x) = \frac{\binom{s}{x}\binom{N - s}{n - x}}{\binom{N}{n}}$$

where

**N** = size of the population

**n** = size of the sample

**s** = the number of successes in the population

**x** = the number of successes in the sample.

(The expressions in parentheses are combinations, for example $\binom{N}{n} = \frac{N!}{n!\,(N - n)!}$.)

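The mean and variance of the hypergeometric are:

E(X) = ns/N

σ²(X) = [ns(N − s)/N²] · [(N − n)/(N − 1)]

where (N − n)/(N − 1) is the finite-population correction for sampling without replacement.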
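The hypergeometric probability, C(s, x) · C(N − s, n − x) / C(N, n), is easy to compute with integer combinations. The following is a sketch only: the function names and the example numbers (a population of 20 with 5 successes, sampled 4 at a time) are made up for illustration. The variance used here includes the finite-population factor (N − n)/(N − 1) that applies when sampling without replacement.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, n: int, s: int) -> float:
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N that contains s successes."""
    # x cannot exceed the sample size or the successes available, and the
    # sample cannot hold more failures than the population has.
    if x < max(0, n - (N - s)) or x > min(n, s):
        return 0.0
    return comb(s, x) * comb(N - s, n - x) / comb(N, n)


def hypergeom_mean_var(N: int, n: int, s: int) -> tuple[float, float]:
    """Return (E(X), variance), with the finite-population correction."""
    mean = n * s / N
    var = n * s * (N - s) * (N - n) / (N**2 * (N - 1))
    return mean, var


# Hypothetical example: N = 20 items, s = 5 successes, sample of n = 4.
print(round(hypergeom_pmf(1, 20, 4, 5), 4))  # 0.4696
print(hypergeom_mean_var(20, 4, 5))
```

Computing the mean and variance directly from the probabilities, as a weighted sum over x, should reproduce the closed-form values.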

Example: Hypergeometric Distribution

**The Poisson Distribution:** This family of distributions is applied to events for which the probability of occurrence over a given span of time, space, or distance is small. The random variable x denotes the number of occurrences over a given span; that is, the phenomenon measured is a rate. Although it is closely related to the binomial distribution family, it has some unique characteristics for measuring rates. There is only one parameter, **λ**, which is the average rate of occurrence. The probability that an event will occur exactly x times over a given span is:

$$P(X = x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

where λ is the average rate of occurrence and e is the mathematical constant 2.71828 (the base of natural logs).

Probabilities can be calculated or found in tables.

E(X) = λ

σ²(X) = λ
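A short sketch of the Poisson calculation, P(X = x) = λ^x · e^(−λ) / x!, follows. The name `poisson_pmf` and the rate λ = 2 are assumptions chosen only for illustration.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences when the average rate is lam."""
    if x < 0:
        return 0.0
    return lam**x * exp(-lam) / factorial(x)


lam = 2.0  # hypothetical average of 2 occurrences per span
print(round(poisson_pmf(0, lam), 4))  # 0.1353
print(round(sum(poisson_pmf(x, lam) for x in range(3)), 4))  # P(X <= 2)
```

Because the mean and variance are both λ, a sample whose variance is far from its mean is a hint that the Poisson model may not fit.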

Examples: Poisson distribution

**Discrete Uniform Distribution**. This family of distributions is used to describe situations where the possible outcomes are all equally likely to occur. It is sometimes used to describe a sense of fairness in a particular situation.

For n possible outcomes the probability of any one of them is:

P(X) = 1/n

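The mean and variance of the discrete uniform distribution are:

E(X) = (a + b)/2

σ²(X) = (n² − 1)/12

where a is the smallest possible value of X and b is the largest. These formulas assume the outcomes are the n consecutive integers from a to b, so that n = b − a + 1.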
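As an illustration, a fair six-sided die is a discrete uniform distribution on the integers 1 to 6. The sketch below assumes consecutive integer outcomes from a to b (so n = b − a + 1) and uses exact fractions; the function name `discrete_uniform_stats` is hypothetical.

```python
from fractions import Fraction


def discrete_uniform_stats(a: int, b: int) -> tuple[Fraction, Fraction, Fraction]:
    """Return (P(X = k), E(X), variance) for the integers a, a+1, ..., b."""
    n = b - a + 1                 # number of equally likely outcomes
    p = Fraction(1, n)            # P(X) = 1/n
    mean = Fraction(a + b, 2)     # E(X) = (a + b)/2
    var = Fraction(n**2 - 1, 12)  # variance = (n^2 - 1)/12
    return p, mean, var


p, mean, var = discrete_uniform_stats(1, 6)
print(p, mean, var)  # 1/6 7/2 35/12
```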

Example: Discrete Uniform Distribution

__Continuous Probability Distributions__

**The Normal Distribution** is the most important continuous probability distribution for several reasons: (1) there are many phenomena in business, economics, and nature that can be approximated by the normal distribution; (2) it can be used to approximate the binomial distribution; and (3) according to a powerful statistical theorem (the central limit theorem), under appropriate conditions the distribution of means and proportions taken from samples is approximately normal.

The key characteristics of the normal distribution are central tendency and symmetry. By central tendency is meant that most of the observations or data tend to be near the center or mean and the further from the mean, the fewer the observations and hence the smaller the probability. By symmetry is meant that if the distribution were bisected, the distribution of points greater than the mean would be the mirror image of the distribution of points less than the mean.

The key properties of the normal distribution are (1) the mean, median and mode are all the same; (2) the distribution is bell shaped and asymptotic (approaching the horizontal axis at both ends but not intersecting); and (3) the distribution is characterized by two parameters: the mean μ and the standard deviation σ.

The probability density function for the normal is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}$$

Although probabilities could be calculated directly from the probability density function by integrating the function over a specified interval, there is an easier approach using the standard normal distribution table.

The **standard normal** is that particular member of the normal distribution family where the mean **μ = 0** and the standard deviation **σ = 1**. The importance of the standard normal is that areas under the standard normal (hence probabilities) have been calculated and put in tables that are readily accessible. It is possible then to convert any normal distribution into the standard normal for purposes of calculating probabilities using this simple conversion formula:

$$z = \frac{x - \mu}{\sigma}$$
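The table lookup can also be reproduced numerically. The sketch below converts a value to a standard score, z = (x − μ)/σ, and evaluates the area under the standard normal curve with the error function from Python's standard library. The function names and the example parameters (μ = 100, σ = 15) are assumptions for illustration only.

```python
from math import erf, sqrt


def z_score(x: float, mu: float, sigma: float) -> float:
    """Convert x from a normal(mu, sigma) scale to the standard normal scale."""
    return (x - mu) / sigma


def standard_normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_prob_between(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a <= X <= b) for a normal variable with mean mu and std. dev. sigma."""
    return standard_normal_cdf(z_score(b, mu, sigma)) - standard_normal_cdf(
        z_score(a, mu, sigma)
    )


# Hypothetical example: probability of falling within one standard deviation.
print(round(normal_prob_between(85, 115, 100, 15), 4))  # 0.6827
```

The result depends only on the z values, which is why a single standard normal table serves every normal distribution.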

Example: Standard Normal Distribution

The **exponential distribution** is useful in describing the interval between occurrences of events. Examples: the time to load a truck, the time between arrivals at an emissions testing center, or the distance between major defects in a highway. The exponential distribution is characterized by one parameter, μ, the average time, distance, or other interval between events. The exponential is closely associated with the Poisson; in fact, there is a reciprocal relationship between the two families of distributions: if events occur at an average rate λ per unit of span, the average interval between events is μ = 1/λ.

The probability density function for the exponential is:

$$f(x) = \frac{1}{\mu}\, e^{-x/\mu}, \qquad x \ge 0$$

Probabilities can be calculated using the probability density function, or found in tables for the exponential distribution, or by using the following calculation formula:

$$P(X \le x) = 1 - e^{-x/\mu}$$
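A minimal sketch of the exponential calculation, assuming the distribution is parameterized by the mean interval μ so that P(X ≤ x) = 1 − e^(−x/μ). The function names and the 10-minute mean are hypothetical.

```python
from math import exp


def exponential_pdf(x: float, mu: float) -> float:
    """Density of an exponential variable with mean interval mu."""
    return exp(-x / mu) / mu if x >= 0 else 0.0


def exponential_cdf(x: float, mu: float) -> float:
    """P(X <= x): probability the interval between events is at most x."""
    return 1.0 - exp(-x / mu) if x >= 0 else 0.0


mu = 10.0  # hypothetical mean of 10 minutes between arrivals
print(round(exponential_cdf(5.0, mu), 4))        # 0.3935
print(round(1.0 - exponential_cdf(20.0, mu), 4))  # P(X > 20)
```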

Example: Exponential Distribution

The **Continuous Uniform Distribution** is similar to the discrete uniform distribution; however, the random variable X, in this case, can be any real number rather than just an integer. The shape of the frequency distribution is that of a rectangle with an area equal to one.

The probability density function is:

$$f(x) = \frac{1}{h - l}, \qquad l \le x \le h$$

where l is the lowest possible value for X and h is the highest possible value. (The numerator is the number 1.)

The mean and variance of the continuous uniform distribution are:

μ = (l + h) / 2

σ² = (h − l)² / 12

In the continuous uniform distribution, with l ≤ x ≤ h, the probability that x will be no more than b and no less than a is

P( a ≤ x ≤ b ) = ( b - a ) / ( h - l )
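The interval probability for the continuous uniform distribution is just a ratio of lengths, as this sketch shows. The names `uniform_interval_prob` and `uniform_mean_var`, the parameter names `low` and `high` (standing for l and h), and the example range 0 to 10 are assumptions for illustration; clipping the interval to [low, high] is an added convenience not stated in the text.

```python
def uniform_interval_prob(a: float, b: float, low: float, high: float) -> float:
    """P(a <= X <= b) for X uniform on [low, high]."""
    # Only the part of [a, b] that lies inside [low, high] carries probability.
    a, b = max(a, low), min(b, high)
    return max(0.0, b - a) / (high - low)


def uniform_mean_var(low: float, high: float) -> tuple[float, float]:
    """Return (mean, variance) = ((low + high)/2, (high - low)^2 / 12)."""
    return (low + high) / 2, (high - low) ** 2 / 12


print(uniform_interval_prob(2, 5, 0, 10))  # 0.3
print(uniform_mean_var(0, 10))
```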