I know from previous studies that

$Var(A+B) = Var(A) + Var(B) + 2 Cov (A,B)$

However, I don't understand why that is. I can see that the effect will be to 'push up' the variance when A and B covary highly. It makes sense that when you create a composite from two highly correlated variables you will tend to be adding the high observations from A to the high observations from B, and the low observations from A to the low observations from B. This will tend to create extreme high and low values in the composite variable, increasing the variance of the composite.

But why does it work to multiply the covariance by *exactly* 2?

byouness 05/15/2018.

**Simple answer:**

The variance involves a square: $$Var(X) = E[(X - E[X])^2]$$

So, your question boils down to the factor 2 in the square identity:

$$(a+b)^2 = a^2 + b^2 + 2ab$$

This can be understood visually: a square of side $(a+b)$ decomposes into the two smaller squares of sides $a$ and $b$, plus **two** rectangles of sides $a$ and $b$.
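To make the link to the variance explicit, one can apply that identity to the centered variables. The symbols $a = A - E[A]$ and $b = B - E[B]$ are introduced here only for illustration; since $E[A+B] = E[A] + E[B]$, the centered sum is $a + b$:

$$\begin{aligned} Var(A+B) &= E[(a+b)^2] \\ &= E[a^2] + E[b^2] + 2E[ab] \\ &= Var(A) + Var(B) + 2 Cov(A,B) \end{aligned}$$

The last step uses $Cov(A,B) = E[(A-E[A])(B-E[B])] = E[ab]$.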

**More involved answer:**

If you want a mathematically more involved answer: the covariance is a bilinear form, meaning that it is linear in both its first and second arguments. This leads to:

$$\begin{aligned} Var(A+B) &= Cov(A+B, A+B) \\ &= Cov(A, A+B) + Cov(B, A+B) \\ &= Cov(A,A) + Cov(A,B) + Cov(B,A) + Cov(B,B) \\ &= Var(A) + 2 Cov(A,B) + Var(B) \end{aligned}$$

In the last line, I used the fact that the covariance is symmetrical: $$Cov(A,B) = Cov(B,A)$$

**To sum up:**

It is two because you have to account for both $cov(A,B)$ and $cov(B,A)$.
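As a rough numerical check of the derivation, the sketch below evaluates each of its four lines on simulated data. The helper names (`mean`, `cov`, `var`), the seed, and the way `b` is constructed are assumptions made for illustration, not part of the answer; population (divide-by-n) moments are used so the lines agree up to floating-point rounding.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    # Population covariance: average product of deviations from the means.
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def var(xs):
    # Variance is the covariance of a variable with itself.
    return cov(xs, xs)

rng = random.Random(0)  # fixed seed, chosen arbitrarily
a = [rng.gauss(0, 1) for _ in range(10_000)]
b = [0.6 * x + rng.gauss(0, 1) for x in a]  # built to covary with a
s = [x + y for x, y in zip(a, b)]           # the composite A + B

line1 = cov(s, s)                                      # Cov(A+B, A+B)
line2 = cov(a, s) + cov(b, s)                          # linear in 1st argument
line3 = cov(a, a) + cov(a, b) + cov(b, a) + cov(b, b)  # linear in 2nd argument
line4 = var(a) + 2 * cov(a, b) + var(b)                # symmetry merges the cross terms
print(line1, line2, line3, line4)
```

The four printed values should coincide up to rounding, and the two cross terms `cov(a, b)` and `cov(b, a)` are the ones that combine into the factor 2.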

Acccumulation 05/15/2018.

The set of random variables is a vector space, and many of the properties of Euclidean space have analogues there. The standard deviation acts much like a length, and the variance like a length squared. Independence (more precisely, zero correlation) corresponds to being orthogonal, while perfect correlation corresponds to scalar multiplication. Thus, the variances of independent variables follow the Pythagorean Theorem:

$var(A+B) = var(A)+var(B)$.

If they are perfectly positively correlated, then

$std(A+B) = std(A)+std(B)$

Note that this is equivalent to

$var(A+B) = var(A)+var(B)+2\sqrt{var(A)var(B)}$
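A tiny example may make the perfectly correlated case concrete. The data below are made up for illustration: `b` is an exact increasing linear function of `a`, so the two have correlation 1.

```python
import math
import statistics

a = [1.0, 2.0, 3.0, 4.0]
b = [2.0 * x + 1.0 for x in a]     # exact positive linear function of a
s = [x + y for x, y in zip(a, b)]  # the composite A + B

# Population (divide-by-n) standard deviations.
sd_a = statistics.pstdev(a)
sd_b = statistics.pstdev(b)
sd_sum = statistics.pstdev(s)

# Standard deviations add like lengths along the same direction ...
print(sd_sum, sd_a + sd_b)

# ... which is the same statement as the variance formula with the
# 2*sqrt(var(A)*var(B)) cross term.
var_sum = statistics.pvariance(s)
expanded = sd_a**2 + sd_b**2 + 2 * math.sqrt(sd_a**2 * sd_b**2)
print(var_sum, expanded)
```

Each printed pair should agree up to floating-point rounding.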

If they are not independent, then they follow a law analogous to the law of cosines:

$var(A+B) = var(A)+var(B)+2cov(A,B)$

Note that the general case lies between complete independence and perfect correlation. If $A$ and $B$ are independent, then $cov(A,B)$ is zero. So in general $var(A+B)$ always has a $var(A)$ term and a $var(B)$ term, plus some fraction of the $2\sqrt{var(A)var(B)}$ term; the more positively correlated the variables are, the larger this third term will be. And this is precisely what $2cov(A,B)$ is: it's $2\sqrt{var(A)var(B)}$ times the correlation coefficient $r$ of $A$ and $B$.

$var(A+B) = var(A)+var(B)+MeasureOfCorrelation*PerfectCorrelationTerm$

where $MeasureOfCorrelation = r$ and $PerfectCorrelationTerm=2\sqrt{var(A)var(B)}$

Put in other terms, if $r = correl(A,B)$, then

$\sigma_{A+B}^2 = \sigma_A^2+\sigma_B^2+ 2r\sigma_A\sigma_B$

Thus, $r$ is analogous to the $\cos$ in the Law of Cosines.
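The geometric picture can be checked numerically. In the sketch below, each data column is treated as a vector after subtracting its mean; the Pearson correlation $r$ is then the cosine of the angle between the two centered vectors, and $var(A+B) = \sigma_A^2 + \sigma_B^2 + 2r\sigma_A\sigma_B$ is the vector-addition form of the law of cosines. The helper names (`centered`, `dot`), the seed, and the simulated data are assumptions for illustration only.

```python
import math
import random

rng = random.Random(1)  # fixed seed, chosen arbitrarily
n = 5_000
a = [rng.gauss(0, 2) for _ in range(n)]
b = [0.5 * x + rng.gauss(0, 1) for x in a]  # built to covary with a

def centered(xs):
    # Subtract the mean so the list can be treated as a vector from the origin.
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

u, v = centered(a), centered(b)
w = centered([x + y for x, y in zip(a, b)])  # centered version of A + B

sd_a = math.sqrt(dot(u, u) / n)  # "length" of A, up to the 1/sqrt(n) scale
sd_b = math.sqrt(dot(v, v) / n)  # "length" of B

# Cosine of the angle between the centered vectors, i.e. Pearson's r.
r = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

var_sum = dot(w, w) / n
law_of_cosines = sd_a**2 + sd_b**2 + 2 * r * sd_a * sd_b
print(r, var_sum, law_of_cosines)
```

The last two printed values should match up to rounding, with `r` (not its square) playing the role of the cosine.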

Bananin 05/16/2018.

I would add that what you cited is not the *definition* of $Var(A+B)$, but rather a *consequence* of the definitions of $Var$ and $Cov$. So the answer to why that equation holds is the calculation carried out by **byouness**. Your question may really be why that makes sense; informally:

How much $A+B$ will "vary" depends on four factors:

- How much $A$ would vary on its own.
- How much $B$ would vary on its own.
- How much $A$ will vary as $B$ moves around (or varies).
- How much $B$ will vary as $A$ moves around.

Which brings us to $$Var(A+B)=Var(A)+Var(B)+Cov(A,B)+Cov(B,A)$$ $$=Var(A)+Var(B)+2Cov(A,B)$$ because $Cov$ is a symmetric operator.
