Differences

This shows you the differences between two versions of the page.

--- science:phd-notes:2025-05-13-porter-thomas-fluctuations [2025/05/13 12:59] – Add notes on the Central Limit Theorem jon-dokuwiki
+++ science:phd-notes:2025-05-13-porter-thomas-fluctuations [2025/05/26 12:43] (current) – jon-dokuwiki
@@ Line 32: / Line 32: @@
 $$
-with a mean of 1 and a variance of 2. Just check [[https://en.wikipedia.org/wiki/Chi-squared_distribution|the Wikipedia page]] if you don't believe me. Let us now invoke the almighty Central Limit Theorem (CLT)! Let us now draw $N$ values from the PT distribution and let's name this value $X_1$. Suppose we want to know the sample average
+with a mean of 1 and a variance of 2. Just check [[https://en.wikipedia.org/wiki/Chi-squared_distribution|the Wikipedia page]] if you don't believe me. Let us now invoke the almighty Central Limit Theorem (CLT)! Draw a value from the PT distribution and call it $X_1$. Suppose we repeat this $n$ times and want to know the sample average
 $$
@@ Line 38: / Line 38: @@
 $$
-The law of large numbers tells us that the sample average will converge to the expected value $\mu = 1$ as $n$ goes to infinity. The CLT states that as $n$ gets larger, the distribution of $\bar{X}_n$ gets arbitrarily close to the normal distribution with a mean of 1 and a variance of $2/n$.
+The law of large numbers tells us that the sample average converges to the expected value $\mu$ as $n$ goes to infinity. The CLT states that as $n$ gets larger, the distribution of $\bar{X}_n$ gets arbitrarily close to a normal distribution with a mean of 1 and a variance of $2/n$ (recall that the PT distribution has a mean of $\mu = 1$ and a variance of 2).
+Let us quickly check that this is true! Take $n = 1000$ and apply some quick Python magic:
+<code>
+>>> from scipy.stats import chi2
+>>> n = 1000
+>>> sum(chi2.rvs(df=1, size=n))/n
+1.013582747288161
+</code>
+Pretty close to 1 that is.
+<code>
+>>> import numpy as np
+>>> draws = [sum(chi2.rvs(df=1, size=n))/n for _ in range(100000)]
+>>> np.mean(draws), np.var(draws), 2/n
+(0.9999389145605803, 0.002000149052396594, 0.002)
+</code>
+Mic drop?
+Now! How can we use this information to determine how much $y$ should vary? And what does //vary// even mean here? Vary-ance maybe. If $y$ is PT-distributed, then $y$ has a variance of 2. The variance is a measure of dispersion; a measure of how far a set of numbers is spread out from their average value. In mathematical terms, the variance of a random variable $X$ is the expected value of the squared deviation from the mean of $X$:
+$$
+\text{Var}(X) = E[(X - \mu)^2].
+$$
+So maybe what we want is to check that the variance of the $B$ distribution is (close to) 2? We can also draw a bunch of values from the distribution, average them in groups of $n$, and check that the variance of those sample means is indeed (close to) $2/n$, as the CLT predicts.
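The two checks just described can be sketched in a few lines. This is only a minimal sketch under stated assumptions: the draws come straight from a chi-squared distribution with one degree of freedom (standing in for whatever data $y$ or $B$ really is), the function name ''pt_variance_check'' and its parameters are made up for illustration, and NumPy's own generator is used instead of ''scipy.stats.chi2'' so that the run is seeded and reproducible.

```python
import numpy as np


def pt_variance_check(n=1000, n_means=5000, seed=0):
    """Return (variance of single draws, mean of sample means,
    variance of sample means) for chi-squared draws with df=1.

    Hypothetical helper: the name and the defaults are illustrative only.
    """
    rng = np.random.default_rng(seed)
    # n_means rows, each holding n independent chi-squared (df=1) draws
    draws = rng.chisquare(df=1, size=(n_means, n))
    # One sample average per row
    sample_means = draws.mean(axis=1)
    return draws.var(), sample_means.mean(), sample_means.var()


if __name__ == "__main__":
    n = 1000
    var_draws, mean_of_means, var_of_means = pt_variance_check(n=n)
    print("variance of single draws:", var_draws, "(expected about 2)")
    print("mean of sample means:    ", mean_of_means, "(expected about 1)")
    print("variance of sample means:", var_of_means, "(expected about", 2 / n, ")")
```

To test real data instead, the ''draws'' array would be replaced by the measured values scaled to unit mean; whether that scaling is appropriate depends on how $y$ is defined earlier in these notes.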