$$
  
and the distribution of $y$ values is hypothesised to follow the $\chi^2_1$ distribution, a.k.a. the Porter-Thomas distribution. In the following figure we see an example of $B$ values plotted as a histogram, scaled to the height of the PT distribution to show the resemblance.

{{ :science:phd-notes:v50_porter_thomas_j_e1_m1.png |}}
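
For the record, a figure like this can be cooked up along the following lines. Here ''B'' is just fake $\chi^2_1$ draws standing in for real transition strengths, and the histogram is normalised to unit area rather than scaled to the PT peak, which makes for the same visual comparison:

<code>
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import chi2

B = chi2.rvs(df=1, size=5000)   # stand-in for the real B values
y = B/np.mean(B)                # scale to unit mean

x = np.linspace(0.01, 8, 500)
plt.hist(y, bins=100, density=True, label="scaled B values")
plt.plot(x, chi2.pdf(x, df=1), label="PT distribution")
plt.legend()
plt.show()
</code>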

==== Porter-Thomas fluctuations ====
... is really just a fancy way of saying how much we expect the $y$ values to vary. The PDF of the PT distribution is given by

$$
g(x) = \dfrac{1}{\sqrt{2 \pi x}}e^{-x/2}, \quad x > 0,
$$

with a mean of 1 and a variance of 2. Just check [[https://en.wikipedia.org/wiki/Chi-squared_distribution|the Wikipedia page]] if you don't believe me.
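
Or, if you would rather not take Wikipedia's word for it either, SciPy can report the first two moments directly:

<code>
>>> from scipy.stats import chi2
>>> mean, var = chi2.stats(df=1, moments='mv')
>>> float(mean), float(var)
(1.0, 2.0)
</code>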

Let us now invoke the almighty Central Limit Theorem (CLT)! Draw $n$ values from the PT distribution and name them $X_1, \dots, X_n$. Suppose we want to know the sample average

$$
\bar{X}_n = \dfrac{X_1 + \dots + X_n}{n}.
$$

The law of large numbers tells us that the sample average will converge to the expected value $\mu$ as $n$ goes to infinity. The CLT states that as $n$ gets larger, the distribution of $\bar{X}_n$ gets arbitrarily close to the normal distribution with a mean of 1 and a variance of $2/n$ (recall that the PT distribution itself has a mean of 1 and a variance of 2).

Let us quickly check that this is true! Let's say that $n = 1000$, and with some quick Python magic:

<code>
>>> from scipy.stats import chi2
>>> n = 1000
>>> sum(chi2.rvs(df=1, size=n))/n
1.013582747288161
</code>

Pretty close to 1, that is. The CLT also says the variance of $\bar{X}_n$ should be $2/n$, so let us check that while we are at it:

<code>
>>> import numpy as np
>>> draws = [sum(chi2.rvs(df=1, size=n))/n for _ in range(100000)]
>>> np.mean(draws), np.var(draws), 2/n
(0.9999389145605803, 0.002000149052396594, 0.002)
</code>
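
If you would rather see the bell curve than read off moments, a histogram of the sample means can be compared to the predicted normal density. A quick matplotlib sketch (with fewer repetitions than above, to keep it snappy):

<code>
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import chi2, norm

n = 1000
draws = [np.mean(chi2.rvs(df=1, size=n)) for _ in range(10000)]

x = np.linspace(0.8, 1.2, 400)
plt.hist(draws, bins=60, density=True, label="sample means")
plt.plot(x, norm.pdf(x, loc=1, scale=np.sqrt(2/n)), label="CLT prediction")
plt.legend()
plt.show()
</code>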

Mic drop?

Now! How can we use this information to determine how much $y$ should vary? And what does //vary// even mean here? Vary-ance, maybe. If $y$ is PT-distributed, then $y$ has a variance of 2. The variance is a measure of dispersion: a measure of how far a set of numbers is spread out from their average value. In mathematical terms, the variance of a random variable $X$ is the expected value of the squared deviation from its mean $\mu$:

$$
\text{Var}(X) = E[(X - \mu)^2].
$$
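
To tie the definition back to the code above: ''np.var'' computes exactly this sample quantity. A quick sketch, with $X$ once again played by $\chi^2_1$ draws:

<code>
import numpy as np
from scipy.stats import chi2

x = chi2.rvs(df=1, size=1_000_000)
mu = np.mean(x)
var_by_definition = np.mean((x - mu)**2)   # E[(X - mu)^2], estimated from samples
print(var_by_definition, np.var(x))        # the two agree, and both land close to 2
</code>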

So maybe what we want is to check that the variance of the $B$ distribution is (close to) 2? We can also draw a bunch of values from the distribution and check that the variance of the means of many such $n$-value draws is indeed close to $2/n$, as the CLT predicts.
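
The first check might look something like this (the array ''B'' is hypothetical here; the real, suitably normalised transition strengths would go in its place):

<code>
import numpy as np
from scipy.stats import chi2

# Hypothetical stand-in for the real B values.
B = chi2.rvs(df=1, size=2000)
y = B/np.mean(B)        # scale to unit mean

print(np.var(y))        # PT predicts a value in the vicinity of 2
</code>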