Essential Statistics for Bioscientists - Mohammed Meah
Introduction
“All life is an experiment. The more experiments you make, the better.”
Ralph Waldo Emerson (1803–1882) - American lecturer, philosopher and poet
The word statistics is derived from the Latin word ‘status’, meaning a political state or government. Statistics deals with the collection, organization, presentation, analysis and interpretation of data to obtain meaningful and useful information. It can be split into two major areas: descriptive and inferential. Descriptive statistics involves collecting, summarizing, and presenting data. Inferential statistics involves analysing sample data to draw conclusions about the population from which the sample was taken.
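The distinction between the two areas can be illustrated with a minimal sketch. It is written in Python, which is not one of the packages covered in this book, and the heart rate values are hypothetical numbers invented for illustration. The first function only summarizes the sample (descriptive), while the second uses the sample to estimate a range for the population mean (inferential). The critical value of t is assumed to be read from a standard table for 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical resting heart rates (beats per minute) from a sample of 10 people.
heart_rates = [62, 70, 68, 74, 66, 72, 64, 69, 71, 67]


def describe(sample):
    """Descriptive statistics: summarize the sample itself."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "sd": statistics.stdev(sample),  # sample SD (n - 1 in the denominator)
    }


def confidence_interval_95(sample, t_critical):
    """Inferential statistics: estimate a range for the population mean."""
    summary = describe(sample)
    sem = summary["sd"] / math.sqrt(summary["n"])  # standard error of the mean
    margin = t_critical * sem
    return summary["mean"] - margin, summary["mean"] + margin


# Two-tailed 5% critical value of t for 9 degrees of freedom (from a t table).
T_CRITICAL_DF9 = 2.262

summary = describe(heart_rates)
low, high = confidence_interval_95(heart_rates, T_CRITICAL_DF9)
print(f"mean = {summary['mean']:.1f}, SD = {summary['sd']:.2f}")
print(f"95% CI for the population mean: {low:.1f} to {high:.1f}")
```

The critical value is passed in rather than computed because the Python standard library has no t distribution. A different sample size would need the table value for its own degrees of freedom.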
Statistics is often much reduced in the curriculum of undergraduate bioscience degree courses, and tends to be linked to research modules. Lecturers often assume that students have a strong grasp of mathematical and statistical concepts, including data analysis. However, the reality is that most students are ‘rusty’ in these areas, particularly in statistics. The most urgent need for statistics is usually for the research project, which is typically in the final year of the undergraduate degree (level 6). It is unclear how much statistics should be taught during undergraduate studies, and when. In addition, a variety of software packages can be used to perform statistical analysis and display data, but not all of them can be accessed or used competently by students. Indeed, it would be fair to say that existing software can produce extensive statistical analysis, but choosing an appropriate test and interpreting the output can be challenging. It is rare to have the luxury of consulting a resident statistician in the Bioscience Department.
Statistical software packages vary in how difficult they are to use and in which tests they can perform. An additional bonus is the ability to plot mean and individual data graphically. The most popular software packages currently used to perform statistics and present data in graphical form are Excel (Microsoft), Prism (GraphPad) and SPSS (IBM). Microsoft Excel is a popular spreadsheet package which is easily available, easy to use for data analysis (although the types of analysis are limited), and useful for plotting data (although the graphs are limited in detail). Prism is good for statistical analysis and excellent for plotting data (the graphs produced are of professional standard). SPSS is the most complex but most comprehensive statistical package. It allows a very detailed analysis of data using a wide range of tests. However, it gives little help in interpreting the statistical analysis, and its graphs are limited in detail.
A core module that most students take is a research project. This requires them to put forward a research proposal, in which they design experiments and formulate hypotheses, and then to collect data, analyse the data, and write a research report. From my many years of supervising undergraduate and postgraduate projects, I have observed that two things cause the most anxiety in students: firstly, narrowing a project down to a specific aim, and secondly, applying statistical analysis to the data obtained.
Having taught bioscience students for more than 25 years, I am clear that more help, guidance and resources should be made available to students in using statistics and displaying data. This book is intended for all undergraduate students at levels four (year 1), five (year 2) and six (year 3) studying the biological sciences (biomedical science, medical physiology, pharmacology, pharmaceutical science, human biology, biochemistry, microbiology, and biotechnology). Although most examples are drawn from the biological sciences, the statistical methods and tests covered in the book are also applicable and useful for (i) students in other medical and health disciplines, including medicine, physiotherapy, podiatry, nursing, pharmacy, dentistry, and sports science, (ii) postgraduate researchers, and (iii) those who are rusty on statistics and statistical software and want a quick refresher.
The book starts from a basic level and builds in complexity, allowing readers to dip into the area they are more familiar with. It does not assume any prior knowledge of the area. The book layout is as follows:
Chapter 1 | introduces common terms used in statistics |
Chapter 2 | shows an overview of how to display data |
Chapter 3 | considers statistical significance and choosing inferential tests |
Chapter 4 | gives background to some common parametric tests |
Chapter 5 | gives background to some common nonparametric tests |
Chapter 6 | explains how to use Microsoft Excel with examples |
Chapter 7 | explains how to use GraphPad Prism with examples |
Chapter 8 | explains how to use IBM SPSS with examples |
Chapter 9 | briefly considers misinterpretations and errors in statistical analysis |
The appendices have sections on common formulas and symbols, deciding on sample size, historical milestones in statistics, background to Prism, answers to sample problems, and reference tables of critical values for statistical tests.
This book is not comprehensive in its coverage (the focus is on the most commonly used statistical tests in biomedical science) as that would have increased the size and complexity of the book. I have tried to keep the mathematical input to a minimum; however, there are areas such as analysis of variance where this was unavoidable. For those who want more depth and detail in maths and statistics, suggestions for further reading are provided. This book does not cover qualitative analysis (e.g. interviewee responses, social context, interactions with people).
This book:
1 Introduces statistical terms and analysis from the basics to a more advanced level.
2 Shows clear, step-by-step use of three common software packages for analysing data and producing graphs.
3 Uses examples of common statistical tests.
4 Does not describe areas such as enzyme and substrate reactions (e.g. Scatchard plots), non-linear curve fitting, or multiple regression.
5 Helps in deciding the factors to consider for study designs.
6 Helps in choosing appropriate tests to analyse data and to display data.
The reader should be able to answer the following questions from the use of this book.
What:
Is your study design?
Sample size is appropriate?
Are the descriptive statistics appropriate for the sample data?
Is the difference between standard deviation and standard error of the mean?
Is a confidence interval?
Is a normal distribution?
Is significance and how do you test for it?
Is the interquartile range?
Is the difference between parametric and non-parametric tests?
How do you:
Graphically describe your data?
Plot a frequency distribution plot?
Check if the sample data is normally distributed?
Decide on which statistical test to use?
Do a paired (related groups) t test?
Do an unpaired (independent groups) t test?
Do a non-parametric test (Wilcoxon)?
Do a non-parametric test (Mann–Whitney)?
Do a 1-way ANOVA test?
Do a repeated measures 1-way ANOVA?
Do a 2-way ANOVA test?
Do a correlation test?
Use Microsoft Excel software to do statistical analysis?
Use SPSS software to do statistical analysis?
Use Prism software to do statistical analysis?
I hope the readers and users of this book will (a) get a better understanding of statistical concepts and (b) be able to use the software packages with more confidence, and that this will aid them in their degree studies.
Mohammed Meah BSc MSc PhD FHEA
Senior Lecturer in Physiology
Course Leader for Medical Physiology and Human Biology
University of East London,
London
“It would be so nice if something made sense for a change.”
Lewis Carroll (1832–1898), English novelist. From the book Alice in Wonderland