Linguistics is rather different from many other areas where statistics is applied. The normal approaches do not necessarily apply. First off, let me list the types of variables:
- counts - normally of words in various ways
- presence/absence variables, the sort you answer yes or no to, plus some other categorical variables
- marks out of a total, either a straight test score or the number of times one form is chosen in preference to another
- Likert-type preference scales
Second, there are differing levels of variation, and there is the question of the unit: what does a member of your population of interest look like? It often takes a lot longer to work this out with linguists than with other people. A member of your population may be an individual, a conversation, a text, a sentence or an exchange. If it is, say, a conversation, what about the variation that is related to the people conversing? In other words, you have a huge amount to think about before you get close to being able to enter the data.
Third, linguistics scholars use R! Yes, I know, you have these very arts-based people using a programming language. However, R is free, therefore they use R.
So I want a basic textbook. I go to Amazon and start checking the books for linguistics, and order one that looks hopeful and includes SPSS. The first section, on research methodology, appears vaguely all right, with nothing completely wrong, but then I get to the statistics. It does cover mean, median and mode, but then it gets to measures of dispersion and goes straight in with the standard deviation and on to a t-test. Hang on: a t-test is based on a Gaussian distribution assumption, but these people more often than not have variables that are not Gaussian distributed, even when scalar. Actually, my alarm bells started ringing when the book said that the difference between interval and ratio does not matter for linguistics. It does matter: it matters for low counts, and it matters for test scores when marks close to zero or full marks are achieved. You cannot just assume Gaussian here. Sometimes your categorical variable will be your outcome.
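To make the point about full marks concrete, here is a minimal sketch. It is in Python rather than R purely so that it runs with nothing but the standard library, and the marks are made up for illustration, not taken from any real study. It takes a small class that mostly scored at or near the ceiling on a test out of 10 and asks what range a Gaussian description (mean plus or minus 1.96 standard deviations) would claim for the scores.

```python
import statistics

# Hypothetical marks out of 10 for twelve learners on an easy test.
# These numbers are invented for illustration only.
MAX_MARK = 10
marks = [10, 10, 9, 10, 8, 10, 10, 9, 10, 7, 10, 10]

mean = statistics.mean(marks)
median = statistics.median(marks)
sd = statistics.stdev(marks)  # sample standard deviation

# If the marks were Gaussian, roughly 95% of them would fall in this range.
lower = mean - 1.96 * sd
upper = mean + 1.96 * sd

# The Gaussian range runs past the highest mark anyone can actually get.
exceeds_ceiling = upper > MAX_MARK

print(f"mean = {mean:.2f}, median = {median}, sd = {sd:.2f}")
print(f"Gaussian 95% range: {lower:.2f} to {upper:.2f}")
print(f"upper limit above full marks? {exceeds_ceiling}")
```

With marks piled up at the ceiling, the mean sits below the median and the "plausible" range extends beyond full marks, which is a quick sign that the mean and standard deviation are not describing these scores well.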
Unfortunately, this heavy emphasis on Gaussian-based tests seems to be common in the textbooks. It is back to the drawing board, I think. It looks like I will be learning R properly.