Excel should always be used with caution. It annoys me that for graphs I often end up using Excel (I keep promising myself to learn an R graphics package well enough not to). I do not use it for serious statistical analysis, and I do not use it for handling large data sets. Here is why.
There is a series of papers on using Excel for statistics, and to give you some idea of the tone, the paper on Excel 2007 says:
"No statistical procedure in Excel should be used until Microsoft documents that the procedure is correct; it is not safe to assume that Microsoft Excel’s statistical procedures give the correct answer. Persons who wish to conduct statistical analyses should use some other package."
And that is in the abstract. Now things are not quite as bad as they seem, as the 2010 paper found an improvement. It is just that I know of other errors which have not been reported because Microsoft has been so tardy in fixing the reported ones. Note that you need to use the new procedures to get the benefit.
However, I equally do not like to use Excel for data handling:
- It tries to load every single value into memory, and yes, that means the memory is working overtime. It is also a sign to old-school programmers that it is using inefficient techniques for calculating values. That is, the mean is calculated by summing all the values and then dividing by the number of cases. Most statistical software uses efficient routines such as this one by C C Spicer. Yes, that is from 1972, but it is more accurate as well as more efficient. If you want a twenty-minute lecture on machine accuracy I can give it to you. The short answer is that it avoids working with very small and very large numbers.
- Secondly, if you alter a value in the data sheet, it then revises all the calculations in the sheet whether or not they depend on that value. If you want to test this out, put the formula =rand() in a cell and see how often it changes.
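To make the first point about the mean a little more concrete, here is a minimal Python sketch. It is not Spicer's published routine, only a one-pass updating mean in the same spirit, and the function names are my own. The naive version builds up one big total before dividing; the updating version nudges a running estimate by each new value, so it never holds a number larger than the data themselves and never needs the whole column in memory at once.

```python
def naive_mean(values):
    """Sum everything, then divide by the number of cases."""
    total = 0.0
    n = 0
    for x in values:
        total += x  # the running total can grow far beyond any single value
        n += 1
    return total / n


def running_mean(values):
    """One-pass updating mean: adjust the estimate by each new value."""
    mean = 0.0
    for n, x in enumerate(values, start=1):
        mean += (x - mean) / n  # the estimate stays on the scale of the data
    return mean


if __name__ == "__main__":
    huge = [1e308] * 4  # values near the top of the double-precision range
    print(naive_mean(huge))    # the intermediate total overflows
    print(running_mean(huge))  # the running estimate does not
```

With four values of 1e308 the naive total overflows to infinity, while the updating version returns 1e308. Because it only ever looks at one value at a time, the same function works on a generator reading a file line by line.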
Look, as statisticians go I have very little pride; my main package is SPSS. There are many statisticians out there who would not sully their hands with that. However, if you said the only alternative was Excel, then suddenly SPSS looks infinitely preferable.