Application of Statistics in Physics

From Physics Book
Revision as of 23:40, 22 April 2022 by Esolis6 (talk | contribs)

Claimed by Edwin Solis (April 16th, Spring 2022)

With the development of Quantum Mechanics and Statistical Mechanics, statistics has become essential for understanding the foundations of these physical theories.

Mathematical Foundation

Probability

Probability is a numerical description of how likely an event is to occur, written as a value between 0 and 1. An event is an outcome (or a set of outcomes) of an experiment, and the sample space is the set of all possible outcomes of that experiment. When all outcomes are equally likely, the probability of an event [math]\displaystyle{ A }[/math] is written mathematically as

[math]\displaystyle{ P(A)=\frac{\text{Number of outcomes in which } A \text{ occurs}}{\text{Total number of outcomes}} }[/math]

For example, for a fair six-sided die, we have the sample space [math]\displaystyle{ S=\{1, 2, 3, 4, 5, 6\} }[/math], and the probability of obtaining a 3 from a single throw is [math]\displaystyle{ P(X=3)=\frac{1}{6} }[/math]. Here the throw of the die is the experiment, while the event is the number that comes up.

Note that the probability of the whole sample space, [math]\displaystyle{ P(S) = 1 }[/math], is taken as an axiom. It is consistent with the definition above, since some outcome in [math]\displaystyle{ S }[/math] always occurs.
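As a minimal sketch of the counting definition above, the following Python snippet enumerates the sample space of a fair die and counts favorable outcomes. The helper name `probability` and the use of exact fractions are illustrative choices, not part of the original text.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (all outcomes equally likely).
sample_space = [1, 2, 3, 4, 5, 6]

def probability(event, space):
    """P(A) = (outcomes in A) / (total outcomes); valid only for equally likely outcomes."""
    favorable = sum(1 for outcome in space if event(outcome))
    return Fraction(favorable, len(space))

p_three = probability(lambda x: x == 3, sample_space)      # event "a 3 is thrown"
p_even = probability(lambda x: x % 2 == 0, sample_space)   # event "an even number is thrown"
p_any = probability(lambda x: True, sample_space)          # the whole sample space S

print(p_three, p_even, p_any)
```

The last value illustrates the axiom [math]\displaystyle{ P(S) = 1 }[/math]: every outcome belongs to the sample space.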

More generally, we can describe the probabilities of an experiment with discrete outcomes by a function called the probability mass function (pmf) [math]\displaystyle{ p(x) }[/math]. In the dice example, the pmf would be written as

[math]\displaystyle{ \begin{align} P(X=x) = p(x)=\begin{cases} \frac{1}{6} &\;|\;x=1,2,3,4,5,6\\ 0 &\;|\;\text{otherwise} \end{cases} \end{align} }[/math]

Now, consider an experiment such as throwing a dart at a dartboard or measuring the length of a piece of rope. In both cases, there are infinitely many points, arbitrarily close to each other, where the dart can land, and infinitely many values of length that the rope can have. In this case, the sample space is a set of continuous values. If each of these uncountably many points were assigned a non-zero probability, the total [math]\displaystyle{ P(S) }[/math] would diverge, which contradicts [math]\displaystyle{ P(S) = 1 }[/math].

Therefore, for these continuous cases, we define a different function, the probability density function (pdf) [math]\displaystyle{ f(x) }[/math]. Instead of assigning a probability to each specific value of the outcome, the pdf assigns a probability to an interval of values in which the outcome may fall.

For example, if there is a uniform probability of a bus arriving between [math]\displaystyle{ 10:00\text{ am} \text{ and } 11:00 \text{ am} }[/math], then the pdf of it arriving [math]\displaystyle{ x }[/math] minutes after [math]\displaystyle{ 10:00\text{ am} }[/math] would be written as:

[math]\displaystyle{ f(x) = \begin{cases} \frac{1}{60} &\;|\;0\leq x \leq 60\\ 0 &\;|\; \text{otherwise} \end{cases} }[/math]

To obtain an actual probability, we integrate the pdf over the interval of interest. So, for the probability of the bus arriving during the first 10 minutes we would have:

[math]\displaystyle{ P(0\leq X \leq 10) = \int^{10}_0{f(x)\text{d}x} = \frac{1}{6} }[/math]
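The integral above can be checked numerically. The sketch below uses a simple midpoint rule; the function names `f` and `integrate` and the step count are assumptions made for illustration, and any numerical integration routine would serve equally well.

```python
def f(x):
    """Uniform pdf for the bus arrival time, in minutes after 10:00 am."""
    return 1 / 60 if 0 <= x <= 60 else 0.0

def integrate(func, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width

p_first_10 = integrate(f, 0, 10)   # P(0 <= X <= 10), approximately 1/6
p_total = integrate(f, -30, 90)    # integral over a range covering the support, approximately 1

print(p_first_10, p_total)
```

The second integral covers the whole interval where the pdf is non-zero, so it approximates the total probability of 1.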

Therefore, we have the following properties:

Discrete Sample Space

For the pmf [math]\displaystyle{ p(x) }[/math] with [math]\displaystyle{ n }[/math] possible events,

  • [math]\displaystyle{ p(x)\geq 0 }[/math] for [math]\displaystyle{ x\in S }[/math], and [math]\displaystyle{ p(x) = 0 \text{ for } x\notin S }[/math]
  • [math]\displaystyle{ P(X=x) = p(x) }[/math]
  • [math]\displaystyle{ \sum_{x_i\in S}P(X=x_i) = 1 }[/math] with [math]\displaystyle{ S=\{x_1, x_2,...\} }[/math]

Continuous Sample Space

For the pdf [math]\displaystyle{ f(x) }[/math],

  • [math]\displaystyle{ f(x)\geq0 }[/math]
  • [math]\displaystyle{ P(a\lt X\lt b)=\int^b_a{f(x)\,\text{d}x} }[/math], from which it follows that the probability of a specific value is [math]\displaystyle{ P(X=c)=\int^c_c{f(x)\,\text{d}x}=0 }[/math] if [math]\displaystyle{ f(c) }[/math] is finite.
  • [math]\displaystyle{ P(-\infty\lt X\lt \infty)=\int^{\infty}_{-\infty}{f(x)\,\text{d}x}=1 }[/math]

Independence and Exclusiveness

Two events are independent when the occurrence of one does not affect the probability of the other. This is written mathematically as [math]\displaystyle{ P(A\cap B)=P(A)P(B) }[/math].

Mutual exclusiveness, on the other hand, means that the two events cannot occur simultaneously: either one or the other may occur, but not both. This is written as [math]\displaystyle{ P(A\cap B)= 0 }[/math].

Finally, the inclusion-exclusion principle applies to probabilities as well by the relation: [math]\displaystyle{ P(A\cup B)=P(A) + P(B) - P(A\cap B) }[/math].
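These three relations can be verified by direct enumeration. The sketch below assumes two fair dice and three events chosen purely for illustration (`A`, `B`, and `C` are hypothetical examples, not taken from the text above).

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs from two fair dice, all 36 equally likely.
space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event by counting equally likely outcomes."""
    return Fraction(sum(1 for s in space if event(s)), len(space))

A = lambda s: s[0] == 6        # first die shows 6
B = lambda s: s[1] % 2 == 0    # second die is even
C = lambda s: s[0] == 1        # first die shows 1

p_a_and_b = prob(lambda s: A(s) and B(s))
p_a_or_b = prob(lambda s: A(s) or B(s))

independent = p_a_and_b == prob(A) * prob(B)            # A and B are independent
exclusive = prob(lambda s: A(s) and C(s)) == 0          # A and C are mutually exclusive
incl_excl = p_a_or_b == prob(A) + prob(B) - p_a_and_b   # inclusion-exclusion holds

print(independent, exclusive, incl_excl)
```

Note that A and C are exclusive but not independent: knowing that the first die shows 6 changes the probability of it showing 1 from 1/6 to 0.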

Random Variables and Distributions

From the base definition of probability, we can go a step further and treat the outcome of an experiment as a variable in its own right. A numerical quantity determined by the outcome of a random experiment is called a Random Variable (r.v.). This variable does not have a definite value per se; rather, it possesses properties linked to the underlying sample space of the experiment. The possible outcomes in the sample space correspond to the specific values the random variable can attain.

Using the dice example, the experiment of throwing the die results in the outcome [math]\displaystyle{ X }[/math], which is the random variable for the number shown by the die. [math]\displaystyle{ X }[/math] can take the values [math]\displaystyle{ 1,2,3,4,5,6 }[/math]. We can also define a random variable [math]\displaystyle{ Y }[/math] to be the sum of two consecutive throws, so [math]\displaystyle{ Y }[/math] can take the values [math]\displaystyle{ 2 \text{ to } 12 }[/math]. The properties of a random variable depend on the probability function we use for the sample space. This means there are two types of random variable: discrete, which is described by the probability mass function, and continuous, which is described by the probability density function (and sometimes by the cumulative distribution function, or cdf).
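To make the random variable for the sum of two throws concrete, the following sketch builds its pmf by enumerating all 36 equally likely pairs. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 ordered pairs give each possible sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# pmf of Y = sum of two fair dice.
pmf_y = {y: Fraction(n, 36) for y, n in sorted(counts.items())}

for y, p in pmf_y.items():
    print(y, p)
```

The smallest attainable value is 2 and the largest is 12, and unlike a single die the values are not equiprobable: sums near 7 can be formed in more ways than sums near the extremes.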

The set of mathematical descriptions for the sample space and probability space of a random variable is called a distribution. From the respective type of random variables, we have discrete and continuous distributions.

Discrete

Using the probability mass function we have direct probabilities for specific values that the discrete random variable can take. This is useful for countable sets of events or measurements that can occur. Discrete distributions can be represented using a line graph by mapping every value [math]\displaystyle{ x\in S }[/math] to its corresponding probability [math]\displaystyle{ P(X=x)=p(x) }[/math].

Common discrete distributions are the Discrete Uniform Distribution, Bernoulli Distribution, Binomial Distribution, Poisson Distribution, and Hypergeometric Distribution.

Continuous

For the continuous case, we find a probability by integrating the probability density function over the interval of interest. These distributions are mainly used for quantities that can take infinitely many values; however, they are also useful for approximating discrete distributions composed of a large population of values. Continuous distributions can be represented by graphing the pdf, where the area under the curve over an interval represents the probability. They can also be graphed using the cdf, which shows how the accumulated probability changes between the two end points of an interval.

The most commonly used continuous distribution is the Normal Distribution. Other common ones are the cousins of the normal distribution, the [math]\displaystyle{ \chi^2 }[/math] Distribution and the t-Distribution, as well as the Continuous Uniform Distribution, Exponential Distribution, and Boltzmann Distribution.

Multivariable

In addition to distributions of a single random variable, it is possible to have distributions of more than one random variable. These multivariate distributions are called Joint Probability Distributions of Random Variables.

For these distributions, it is necessary to define a probability for each combination of values of the random variables. In the discrete case, this probability function is called a joint probability mass function. For example, in a deck of cards, for a selection of 4 cards, we could let [math]\displaystyle{ X }[/math] be the number of red cards drawn and [math]\displaystyle{ Y }[/math] be the number of cards greater than 7. For this case of two random variables, the probability would be written as [math]\displaystyle{ P(X=x,Y=y) = p(x,y) }[/math] where [math]\displaystyle{ p(x,y) }[/math] is the joint pmf. Similarly, for the continuous case, we define the joint probability density function. To obtain a probability from a joint pdf, we integrate over the rectangular region of interest:

[math]\displaystyle{ P(a\leq X \leq b, c \leq Y \leq d) = \int^{d}_{c}\int^{b}_{a}f(x,y)\;\text{d}x\,\text{d}y }[/math]

or, for a more general region [math]\displaystyle{ R }[/math] defined by some constraints:

[math]\displaystyle{ P((X, Y) \in R) = \iint_R f(x,y)\;\text{d}x\,\text{d}y }[/math]

As in the single-variable case, the total probability must equal 1. For the special case of just two random variables [math]\displaystyle{ X \text{ and } Y }[/math], the discrete and continuous cases give:

[math]\displaystyle{ \sum_{x\in S_x}\sum_{y \in S_y} p(x,y) = 1 }[/math]

[math]\displaystyle{ \int^{\infty}_{-\infty}\int^{\infty}_{-\infty}f(x,y)\;\text{d}x\,\text{d}y=1 }[/math]
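A numerical sketch of these double integrals follows. The joint pdf used here, [math]\displaystyle{ f(x,y)=x+y }[/math] on the unit square and zero elsewhere, is a hypothetical example chosen because it integrates to 1; it does not come from the text above, and the midpoint rule is just one possible integration scheme.

```python
def joint_pdf(x, y):
    """Hypothetical joint pdf: f(x, y) = x + y on the unit square, 0 elsewhere."""
    return x + y if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

def integrate_2d(func, a, b, c, d, steps=200):
    """Midpoint-rule approximation of the double integral over [a, b] x [c, d]."""
    dx, dy = (b - a) / steps, (d - c) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx
        for j in range(steps):
            y = c + (j + 0.5) * dy
            total += func(x, y)
    return total * dx * dy

p_total = integrate_2d(joint_pdf, 0, 1, 0, 1)       # normalization, approximately 1
p_rect = integrate_2d(joint_pdf, 0, 0.5, 0, 0.5)    # P(0 <= X <= 0.5, 0 <= Y <= 0.5), approximately 0.125

print(p_total, p_rect)
```

For this particular pdf the rectangle probability can also be computed by hand as 1/8, which is a useful check on the numerical result.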

In general, for [math]\displaystyle{ n }[/math] variables we must have that

[math]\displaystyle{ \sum_{x_1 \in S_1}\dots\sum_{x_n\in S_n}p(x_1,\dots,x_n) = 1 }[/math], and

[math]\displaystyle{ \int^\infty_{-\infty}\dots\int^\infty_{-\infty}{f(x_1,\dots,x_n)\;\text{d}x_1\dots\text{d}x_n} = 1 }[/math]

Expectation

The expected value of a random variable (also known as the mean, average, or expectation) is the weighted average of the possible values the random variable can take, weighted by their probabilities. The average is commonly defined as [math]\displaystyle{ \bar{x}=\frac{\sum^n_{i=1}x_i}{n} }[/math], yet this is only the special case in which all the possible values are equiprobable. In general, we define the expectation of a random variable [math]\displaystyle{ X }[/math], written [math]\displaystyle{ E[X] }[/math], as

[math]\displaystyle{ E[X] = \sum_{x\in S}xp(x) }[/math] for the discrete case, and

[math]\displaystyle{ E[X] = \int^\infty_{-\infty}{xf(x)\;\text{d}x} }[/math] for the continuous case.

In physics, especially in Quantum Mechanics, it is more common to see the expectation of a quantity [math]\displaystyle{ X }[/math] written as [math]\displaystyle{ \langle X \rangle }[/math].

Note that we can generalize this concept to the expectation of a function [math]\displaystyle{ g(X) }[/math] of [math]\displaystyle{ X }[/math] as

[math]\displaystyle{ E[g(X)] = \sum_{x\in S}g(x)p(x) }[/math], and

[math]\displaystyle{ E[g(X)] = \int^\infty_{-\infty}{g(x)f(x)\;\text{d}x} }[/math]
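The discrete formulas can be sketched in a few lines of Python. The example below reuses the fair die; the helper `expectation` and its default argument are illustrative assumptions.

```python
from fractions import Fraction

# pmf of a fair six-sided die: p(x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x) * p(x) over the sample space (g defaults to the identity)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(pmf)                          # E[X]
mean_square = expectation(pmf, lambda x: x**2)   # E[X^2], using g(x) = x^2
naive_average = Fraction(sum(pmf), len(pmf))     # plain average of the values 1..6

print(mean, mean_square, naive_average)
```

Because the die is fair, the weighted average coincides with the plain average of the values; for a non-uniform pmf the two would differ.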

Similarly, for the multivariate case, the dependence on the many variables leads to the expressions:

[math]\displaystyle{ E[g(X_1, \dots, X_n)] = \sum_{x_1 \in S_1}\dots\sum_{x_n\in S_n}g(x_1,\dots,x_n)p(x_1,\dots,x_n) }[/math], and

[math]\displaystyle{ E[g(X_1, \dots, X_n)] = \int^\infty_{-\infty}\dots\int^\infty_{-\infty}{g(x_1,\dots,x_n)f(x_1,\dots,x_n)\;\text{d}x_1\dots\text{d}x_n} }[/math]

Variance and Standard Deviation

Variance is defined to be the expectation of the squared distance of a random variable from its average value. Mathematically, that is [math]\displaystyle{ Var(X)=E[(X-E[X])^2]=E[X^2]-(E[X])^2 }[/math]. The standard deviation is the square root of the variance, usually represented as [math]\displaystyle{ \sigma }[/math] such that [math]\displaystyle{ Var(X) = \sigma^2 }[/math]. The standard deviation can be thought of as a typical distance from the mean value (more precisely, the root-mean-square distance); however, it is not a measure of error. Rather, it quantifies the dispersion of a distribution about its mean. Only when statistical analysis is applied to a population of which just a sample is known can the standard deviation be combined with the sample size to define a measure of error called the standard error.

The expression [math]\displaystyle{ \Delta X }[/math] is used in Quantum Mechanics to denote the standard deviation of the random variable [math]\displaystyle{ X }[/math].
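As a short worked check, the sketch below computes the variance of a fair die in both ways given by the identity [math]\displaystyle{ E[(X-E[X])^2]=E[X^2]-(E[X])^2 }[/math]. The helper `expectation` is an illustrative assumption.

```python
from fractions import Fraction
from math import sqrt

# pmf of a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete random variable described by pmf."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expectation(pmf)
var_definition = expectation(pmf, lambda x: (x - mu) ** 2)   # E[(X - E[X])^2]
var_shortcut = expectation(pmf, lambda x: x**2) - mu**2      # E[X^2] - (E[X])^2
sigma = sqrt(var_definition)                                  # standard deviation

print(var_definition, var_shortcut, sigma)
```

Both expressions give the same exact fraction, and the standard deviation is its square root.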

Statistical population and samples

In the real world, most observed distributions are obscured by a lack of knowledge of the parameters, or even of the type of distribution, that a random variable follows. We may not know the exact shape of the population (i.e. the pmf or pdf), or even its exact properties (e.g. the average or standard deviation).

Statistics defines the population as the complete set of existing data, which determines the exact type of distribution and its properties, and defines a sample as a specific subset chosen from the population and used to gather some of that data.

An example would be a deck of cards. The whole population is the deck of cards, while a sample would be a few cards chosen from it. The difference between this example and the real world, however, is that in most cases the population may be composed of millions, billions, or possibly infinitely many subjects.

It is easy to draw conclusions about sample data from population data, but not the reverse. If the pdf or pmf of a random variable is known, it is easy to determine its expectation and standard deviation for a number of trials; inferring the pdf or pmf from a finite number of trials is much harder.

To distinguish the properties of a sample from those of the population, we write the sample average as [math]\displaystyle{ \bar{x} }[/math] and the sample standard deviation as [math]\displaystyle{ s }[/math], while for the population we write the population average as [math]\displaystyle{ \mu }[/math] and the population standard deviation as [math]\displaystyle{ \sigma }[/math].
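The distinction between population and sample quantities can be illustrated with a small simulation. In the sketch below, the population is assumed to be a fair die, the sample size of 1000 and the random seed are arbitrary choices, and the standard error is taken as [math]\displaystyle{ s/\sqrt{n} }[/math], the usual standard error of the mean.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the sketch is reproducible

# Population: a fair six-sided die, whose parameters are known exactly.
population_values = range(1, 7)
mu = statistics.mean(population_values)        # population average
sigma = statistics.pstdev(population_values)   # population standard deviation

# Sample: n simulated throws drawn from that population.
n = 1000
sample = random.choices(population_values, k=n)
x_bar = statistics.mean(sample)                # sample average
s = statistics.stdev(sample)                   # sample standard deviation (n - 1 in the denominator)
standard_error = s / sqrt(n)                   # standard error of the mean

print(mu, sigma)
print(x_bar, s, standard_error)
```

The sample values only estimate the population values, and the standard error indicates roughly how far the sample average is expected to lie from the population average.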

Uses

Statistical Mechanics

Quantum Physics

Physics Simulations

Experiments and hypothesis testing

References

Additional Resources