Question Number 5117 by FilupSmith last updated on 14/Apr/16
I have a question. I am unsure how this is done because I have never learnt it.

How do you determine the line of best fit?
Commented by Yozzii last updated on 14/Apr/16
The least squares method is one way.

Given a set of $n$ points $(x_i, y_i)$, suppose that the line of best fit has the form $y = mx + c$. Then, for each $x_i$, the line gives the value $\hat{y}_i = mx_i + c$. The least squares method requires that, for the regression line of $y$ on $x$, the quantity $Q$ be minimised, where

$$Q=\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^{2}=\sum_{i=1}^{n}\left(y_i-mx_i-c\right)^{2}$$

This means that we aim to minimise the sum of the squared vertical distances between each given $y_i$ and the point $\hat{y}_i$ on the line corresponding to $x_i$. $Q$ is a function of the two variables $m$ and $c$, since we assume that the $(x_i, y_i)$ are known. We can then employ techniques of multivariable calculus to find $m$ and $c$. Since $Q$ is quadratic in $m$ and $c$ and is a sum of squares, its stationary value is a minimum; in 3D space, the locus of points $(m, c, Q)$ is a bowl-shaped surface with $Q \geqslant 0$.

$$\Rightarrow \frac{\partial Q}{\partial m}=\sum_{i=1}^{n}\left(-2x_i\right)\left(y_i-mx_i-c\right)$$

$$\frac{\partial Q}{\partial m}=2m\left(\sum_{i=1}^{n}x_i^{2}\right)+2c\left(\sum_{i=1}^{n}x_i\right)-2\left(\sum_{i=1}^{n}x_iy_i\right)$$

and

$$\frac{\partial Q}{\partial c}=\sum_{i=1}^{n}\left(-2\right)\left(y_i-mx_i-c\right)$$

$$\frac{\partial Q}{\partial c}=2m\left(\sum_{i=1}^{n}x_i\right)+2c\left(\sum_{i=1}^{n}1\right)-2\sum_{i=1}^{n}y_i$$

or

$$\frac{\partial Q}{\partial c}=2m\left(\sum_{i=1}^{n}x_i\right)+2cn-2\left(\sum_{i=1}^{n}y_i\right)$$

At the stationary point, $\frac{\partial Q}{\partial m}=0$ and $\frac{\partial Q}{\partial c}=0$. Eliminating $c$ from these two equations gives the following result for $m$ (provided the $x_i$ are not all equal, so that the denominator is non-zero):

$$m=\frac{n\sum_{i=1}^{n}x_iy_i-\left(\sum_{i=1}^{n}x_i\right)\left(\sum_{i=1}^{n}y_i\right)}{n\sum_{i=1}^{n}x_i^{2}-\left(\sum_{i=1}^{n}x_i\right)^{2}}$$

It can be shown that

$$\left(\bar{x},\bar{y}\right)=\left(\frac{\sum_{i=1}^{n}x_i}{n},\frac{\sum_{i=1}^{n}y_i}{n}\right)$$

lies on the line of best fit: dividing $\frac{\partial Q}{\partial c}=0$ by $2n$ gives $\bar{y}=m\bar{x}+c$. So $c$ can be found from $c=\bar{y}-m\bar{x}$. The resulting line $y=mx+c$ is the least squares line of best fit.

$\left(\bar{x},\bar{y}\right)$ is the centroid of all the points $(x_i, y_i)$. $m$ has another form:

$$m=\frac{\sum_{i=1}^{n}x_iy_i-n\bar{x}\bar{y}}{\sum_{i=1}^{n}x_i^{2}-n\left(\bar{x}\right)^{2}}$$
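If you want to check the formulas for m and c numerically, here is a minimal Python sketch. The function name `least_squares_line` and the sample points are my own, made up for illustration; it assumes at least two points whose x values are not all equal, since otherwise the denominator of m is zero.

```python
def least_squares_line(xs, ys):
    """Return (m, c) for the least squares line y = m*x + c.

    Hypothetical helper for illustration. Uses the closed-form
    results for m and c obtained by setting dQ/dm = dQ/dc = 0.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    # Denominator of m: n*sum(x^2) - (sum x)^2, zero when all x are equal.
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal; slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denom
    # The centroid (x_bar, y_bar) lies on the line, so c = y_bar - m*x_bar.
    c = sum_y / n - m * (sum_x / n)
    return m, c


# Points lying exactly on y = 2x + 1 (made-up data).
m, c = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, c)  # 2.0 1.0
```

For points that lie exactly on a line, the fit recovers that line; for scattered data it returns the m and c that minimise Q.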