 # Biostatistics - I

Course Type: Foundation Core

Course Code: SGA2PH405

No. of Credits: 2

Semester and Year Offered: 1st Semester, 1st Year

Course Coordinator and Team: Samik Chowdhury

Email of course coordinator: samik@aud.ac.in

Pre-requisites: None

Aim: This applied statistics course aims to build an understanding of statistical principles, concepts and techniques and their applications in public health, medicine and biology. Given the multidisciplinary backgrounds of prospective students, the course will be pitched at a basic to intermediate level, with a primary focus on intuitive understanding of the relevant concepts. On completion of the course, students will have acquired skills in, or familiarity with, presenting statistical information, performing and interpreting basic statistical analyses, and drawing inferences and conclusions from such analyses. Topics covered include the role of biostatistics in public health; variables and descriptive statistics; exploratory data presentation; uncertainty, probability and sampling; and the rationale and technique behind estimation, hypothesis testing and statistical significance. The practical component of the course will familiarize students with MS-Excel and Stata/R (depending on interest and aptitude). Wherever required, discussions, examples, illustrations and exercises will be drawn from a public health perspective or from the public health literature.

Course Outcomes:

• Recognize the diversity of data in public health and clinical studies.
• Apply data summarization and visualization techniques to raw primary or secondary data on public health and health care.
• Understand the principles, theoretical foundations and applications of probability, distributions, sampling and inference in public health and health care.
• Apply the statistical techniques learnt on future research projects or similar endeavours.
• Locate, interpret and communicate the results of statistical analysis to a public health audience.

Brief description of modules/ Main modules:

• Introduction and overview: This introductory module will familiarise students with basic concepts in the philosophy of science, such as objectivism, deduction, induction, testability, corroboration and falsifiability. This will be followed by a discussion of the importance of evidence and measurement in public health and the role of statistics therein. The definition and scope of biostatistics and its importance for public health professionals will form the final component of this module.
• Descriptive statistics and data presentation: This module introduces students to the nature of data, descriptive statistics and basic data visualization. It will impart a conceptual understanding of (1) variables and their types, (2) accuracy and precision of data, (3) measures of central tendency (mean, median, mode), (4) measures of dispersion (range, interquartile range, variance and standard deviation, coefficient of variation), (5) grouped data, (6) construction of tables (frequency distribution, relative frequency, cumulative frequency and percentiles) and (7) construction of graphs (bar charts, histograms, frequency polygons, pie charts, one-way scatter plots, box plots, two-way scatter plots, line graphs, age pyramids, radar plots).
• Probability theory and probability distribution: The first part of this module begins with (1) an overview of concepts such as uncertainty, certainty, randomness, events, outcomes, sample space, mutual exclusivity and independence, which constitute the foundations of probability theory. It then moves on to (2) the basics of Venn diagrams, set theory and notation. Finally, it explains (3) basic probability theory, (4) the multiplication and addition laws, (5) conditional probability, (6) Bayes' theorem, (7) relative risk and odds ratios and (8) receiver operating characteristic (ROC) curves. The second part of this module deals with (9) discrete probability distributions (Binomial, Poisson), (10) continuous probability distributions (Normal distribution: derivation, properties, applications, central limit theorem), (11) the normal approximation to the Binomial and Poisson distributions, (12) the probability distribution function and (13) the cumulative probability distribution function.
• Population, sample and sampling distribution: This module will encourage students to apply their conceptual understanding of the fundamentals of probability to sampling theory. This will include (1) the idea of a population, a sample and their relationship, (2) why and how to collect a sample, (3) probabilistic sampling techniques (simple random, systematic, stratified, cluster etc.), (4) non-probabilistic sampling techniques (quota, judgmental, snowball, expert etc.), (5) sampling bias, (6) sampling distributions (t, chi-square and F) and sampling error and (7) the distribution of the sample mean, the difference in sample means, the sample proportion and the difference between two sample proportions.
• Estimation, hypothesis testing and statistical significance: This module deals with the estimation of population parameters from sample statistics. The topics that will be covered are (1) point and interval estimation, (2) distribution and variance of means and other statistics, (3) confidence limits and intervals, (4) hypothesis testing: null and alternative hypotheses, (5) interpreting statistical significance, (6) type-I and type-II errors, (7) one-sample tests, two-sample tests, Z test, t test, one-tailed and two-tailed tests, paired test and (8) p-values.
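
As an illustration of the kind of exercise the descriptive statistics and estimation modules point towards, the following is a minimal Python sketch, not course material. The blood pressure readings, the function names and the null value of 120 mmHg are all hypothetical and chosen only for this example. It uses the normal approximation so that it runs on the Python standard library alone; with a sample this small a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, median, stdev

# Hypothetical systolic blood pressure readings (mmHg), invented for illustration.
readings = [118, 122, 130, 125, 140, 135, 128, 120, 132, 126]


def summarize(values):
    """Return basic measures of central tendency and dispersion."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "n": len(values),
        "mean": m,
        "median": median(values),
        "sd": s,
        "range": max(values) - min(values),
        "cv_percent": 100 * s / m,  # coefficient of variation
    }


def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    m = mean(values)
    se = stdev(values) / sqrt(len(values))  # standard error of the mean
    return m - z * se, m + z * se


def two_sided_z_test(values, null_mean):
    """z statistic and two-sided p-value for H0: population mean == null_mean."""
    se = stdev(values) / sqrt(len(values))
    z = (mean(values) - null_mean) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


summary = summarize(readings)
low, high = mean_confidence_interval(readings)
z_stat, p_value = two_sided_z_test(readings, null_mean=120)

print(summary)
print(f"95% CI for the mean: ({low:.1f}, {high:.1f})")
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.4f}")
```

The same steps carry over to MS-Excel, Stata or R, the tools named for the practical component; Python is used here only to keep the sketch self-contained.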

Assessment Details with weights:

• Problem solving (30%)
• Lab assignments (40%)
• Quizzes (30%)