Estimating the Center of a Collection of Values

The mean is only one way of indicating the center of a collection of values. Other so-called measures of central tendency include the median and the midrange. In this lecture, we illustrate one way in which the mean is an “optimal” choice for indicating the center of a set of observations.

We will begin our discussion by assuming that the collection of values represents a population, but everything we do in the lecture applies to samples as well as populations.

Consider a population consisting of \(N\) individuals. Let \(X\) be a random variable defined on this population, and let \(x_1, x_2, ..., x_N\) denote the realizations of \(X\) for the individuals within the population. We wish to estimate the location of the “center” of \(X\) for this population using a single number \(m\).

Sum of Squared Errors

There are many reasonable values that could be suggested for \(m\). We need a way of selecting the “best” possible value. To do this, we will introduce a method of scoring potential values of \(m\). This score is called the “Sum of Squared Errors”. It is a function of \(m\), denoted by \(SSE(m)\).

Assume that a value of \(m\) has been suggested. We can define the error in this estimate with respect to the \(i\)th observation \(x_i\) as follows:

\[e_i = x_i - m\]

Notice that \(e_i\) could be positive, negative, or zero. To prevent large negative errors from cancelling out large positive errors (thus resulting in a small overall error), we will square each of these error terms. We will then sum the resulting (non-negative) squared errors across the entire population. In summary, we define \(SSE(m)\) as follows:

\[SSE(m) = \sum_{i=1}^N e_i^2 = \sum_{i=1}^N (x_i - m)^2\]
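The definition above can be sketched in a few lines of Python. The function name `sse` and the three data values are illustrative assumptions, not part of the lecture. The example shows why the errors are squared: the raw errors cancel to zero, while the squared errors do not.

```python
def sse(values, m):
    """Sum of squared errors of the candidate center m for the given values."""
    return sum((x - m) ** 2 for x in values)

# Hypothetical data and candidate center, chosen only for illustration.
values = [1.0, 2.0, 6.0]
m = 3.0

raw_errors = [x - m for x in values]   # [-2.0, -1.0, 3.0]
print(sum(raw_errors))                 # raw errors cancel: 0.0
print(sse(values, m))                  # squared errors do not: 14.0
```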

Minimizing SSE

The function \(SSE(m)\) provides a measure of how much the observations in our population vary from a proposed center \(m\). Our goal is to select the value of \(m\) that minimizes the function \(SSE(m)\). The next theorem states that this function is minimized by \(m = \mu\), where \(\mu\) is the population mean.


Theorem 1. Let \(x_1, x_2, ..., x_N\) be the values of a random variable \(X\) defined on a population of size \(N\). Let \(SSE(m) = \sum_{i=1}^N (x_i - m)^2\). The function \(SSE(m)\) has a unique minimum at \(m = \mu\), where \(\mu = \frac{1}{N}\sum_{i=1}^N x_i\).


This result still holds true if our observations \(x_i\) constitute a sample rather than an entire population. This is stated in the following theorem.


Theorem 2. Let \(X\) denote a random variable, and let \(x_1, x_2, ..., x_n\) denote a sample of \(n\) observations of \(X\). Let \(SSE(m) = \sum_{i=1}^n (x_i - m)^2\). The function \(SSE(m)\) has a unique minimum at \(m = \bar x\), where \(\bar x = \frac{1}{n}\sum_{i=1}^n x_i\).


We will prove Theorem 1 soon, but before doing so, let’s see an example illustrating Theorem 2 with a small sample.

Example: Minimizing SSE for a Sample

Consider the sample given by \(x_1=2, x_2=3, x_3=5, x_4=8\).

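We wish to estimate the center of this data with a number \(m\) such that the sum of squared errors is minimized. The steps required to find such an \(m\) are shown below.

\[SSE(m) = (2-m)^2 + (3-m)^2 + (5-m)^2 + (8-m)^2\]

\[SSE(m) = (4 - 4m + m^2) + (9 - 6m + m^2) + (25 - 10m + m^2) + (64 - 16m + m^2)\]

\[SSE(m) = 4m^2 - 36m + 102\]

\[SSE'(m) = 8m - 36\]

Setting \(SSE'(m) = 0\) gives \(8m = 36\), so \(m = 4.5\).

Note that \(\bar x = \frac{1}{4}(2 + 3 + 5 + 8) = 4.5\), which agrees with the minimizer found above.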
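As a rough numerical check of this example, the sketch below evaluates \(SSE(m)\) for the sample on a grid of candidate centers and compares the best grid point with the sample mean. The helper `sse` and the grid spacing of 0.1 are assumptions made for illustration. The median and midrange mentioned at the start of the lecture are scored as well.

```python
from statistics import median

def sse(values, m):
    """Sum of squared errors of the candidate center m for the given values."""
    return sum((x - m) ** 2 for x in values)

sample = [2, 3, 5, 8]
x_bar = sum(sample) / len(sample)

# Candidate centers 0.0, 0.1, ..., 10.0 (the grid is an arbitrary choice).
candidates = [k / 10 for k in range(101)]
best = min(candidates, key=lambda m: sse(sample, m))

print(x_bar, best)           # 4.5 4.5
print(sse(sample, x_bar))    # 21.0

# Other measures of center give a larger SSE for this sample.
midrange = (min(sample) + max(sample)) / 2
print(sse(sample, median(sample)))   # 22.0
print(sse(sample, midrange))         # 22.0
```

A grid search only inspects the listed candidates, so it illustrates Theorem 2 for this sample rather than proving it.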



We will now prove Theorem 1. The proof of Theorem 2 is practically identical.

Theorem 1. Let \(x_1, x_2, ..., x_N\) be the values of a random variable \(X\) defined on a population of size \(N\). Let \(SSE(m) = \sum_{i=1}^N (x_i - m)^2\). The function \(SSE(m)\) has a unique minimum at \(m = \mu\), where \(\mu = \frac{1}{N}\sum_{i=1}^N x_i\).

Proof. We begin by rewriting the expression for \(SSE(m)\).

\[SSE(m) = \sum_{i=1}^N (x_i^2 - 2 x_i m + m^2)\]

\[SSE(m) = \sum_{i=1}^N (m^2 - 2 x_i m + x_i^2)\]

\[SSE(m) = \sum_{i=1}^N (m^2) + \sum_{i=1}^N( - 2 x_i m) + \sum_{i=1}^N (x_i^2)\]

\[SSE(m) = N m^2 -2 m \sum_{i=1}^N x_i + \sum_{i=1}^N x_i^2\]

Notice that for a given population, \(N\), \(-2\sum_{i=1}^N x_i\), and \(\sum_{i=1}^N x_i^2\) are all known constants. As such, \(SSE(m)\) is a second-degree polynomial in \(m\) of the form \(a m^2 + b m + c\), with \(a = N\), \(b = -2\sum_{i=1}^N x_i\), and \(c = \sum_{i=1}^N x_i^2\). Since the leading coefficient (\(N\)) is positive, the graph of \(SSE(m)\) is a parabola opening upward, and thus has a unique minimum, located at its vertex \(m = -\frac{b}{2a}\). Let \(m^*\) be the value of \(m\) at which this minimum occurs. Then:

\[m^* = -\frac{-2\sum_{i=1}^N x_i}{2 N} = \frac{1}{N}\sum_{i=1}^N x_i = \mu\]
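The algebra in the proof can be spot-checked numerically. The sketch below is one possible check, not part of the original lecture: it builds the quadratic coefficients, computes the vertex \(-b/(2a)\), and compares it with the mean. The names `a`, `b`, `c`, and `sse` are illustrative assumptions, and the data reuse the sample from the earlier example.

```python
def sse(values, m):
    """Sum of squared errors of the candidate center m for the given values."""
    return sum((x - m) ** 2 for x in values)

values = [2, 3, 5, 8]   # sample from the earlier example; any data would do
N = len(values)

# Coefficients of SSE(m) = a*m^2 + b*m + c, as derived in the proof.
a = N
b = -2 * sum(values)
c = sum(x ** 2 for x in values)

vertex = -b / (2 * a)    # location of the parabola's minimum
mu = sum(values) / N     # mean of the values

print(a, b, c)           # 4 -36 102
print(vertex, mu)        # 4.5 4.5

# The expanded polynomial should agree with the direct definition of SSE.
for m in [0, 1.5, 4.5, 10]:
    assert abs((a * m ** 2 + b * m + c) - sse(values, m)) < 1e-9
```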
