Statistics : Quick-revision friendly : ABM 25 Nov, 2025

Here are 100 solved previous-year style questions for CAIIB — Advanced Bank Management (Module A: Statistics) in the Quick-revision friendly format you chose.

Each item has:

  • Short Question,

  • 2–4 line explanation, and

  • Final answer (concise).

I distributed questions across the module topics (Basics, Sampling, Measures, Correlation/Regression, Time Series, Probability & Distributions, Estimation, LP & Simulation).


  1. Q: What is the difference between population and sample?
    Explanation: Population is the entire set of units of interest; a sample is a subset used to infer about the population. Samples are used because census is often impractical.
    Answer: Population = Universe; Sample = Subset.

  2. Q: Define primary and secondary data.
    Explanation: Primary data are collected firsthand for the purpose at hand (surveys, experiments). Secondary data are existing data (reports, publications). Use primary when specific, secondary for background.
    Answer: Primary = Firsthand; Secondary = Existing.

  3. Q: What is a frequency distribution?
    Explanation: A tabular summary showing classes (or values) and their frequencies — helps visualize concentration and prepare for measures like mean/SD.
    Answer: Table of frequencies.

  4. Q: Why classify and tabulate data?
    Explanation: Classification groups similar observations; tabulation arranges them in a compact form to reveal patterns and support analysis.
    Answer: For organization and clarity.

  5. Q: What is sampling error?
    Explanation: The difference between sample statistic and true population parameter owing to chance variation in selected sample. It reduces with larger sample sizes.
    Answer: Chance error.

  6. Q: State Central Limit Theorem (CLT) in one line.
    Explanation: For large n, sampling distribution of sample mean ≈ normal with mean μ and variance σ²/n, regardless of population shape (if finite variance).
    Answer: Sample mean tends to normal.
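
A small simulation can make the CLT concrete. The sketch below assumes an exponential population with mean 1 and SD 1 (an illustrative choice, not part of the question) and checks that means of samples of size n = 40 center on μ with spread close to σ/√n. The seed, n, and the number of repetitions are arbitrary.

```python
import random
import statistics as st
from math import sqrt

rng = random.Random(42)          # fixed seed so the run is repeatable
n, reps = 40, 5000               # sample size and number of samples (assumed)

# Exponential(1) population: strongly skewed, with mu = 1 and sigma = 1
sample_means = [
    st.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

print(round(st.fmean(sample_means), 3))   # should be close to mu = 1
print(round(st.stdev(sample_means), 3))   # should be close to sigma / sqrt(n)
print(round(1 / sqrt(n), 3))              # theoretical SD of the sample mean
```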

  7. Q: What is stratified sampling and when is it used?
    Explanation: Population split into homogeneous strata; then samples drawn from each stratum—used to ensure representation of subgroups and reduce variance.
    Answer: Stratified = subgroup sampling.

  8. Q: Define systematic sampling.
    Explanation: Select every k-th element from a sampling frame after a random start; simple to implement but can be biased with periodicities.
    Answer: k-th selection.
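
As a rough illustration of the k-th selection rule, the sketch below draws a systematic sample from a hypothetical frame of 100 account IDs. The function name, the frame, and the seed are assumptions made for the example.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick every k-th unit from `frame` after a random start among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # random start index in 0..k-1
    return frame[start::k]

accounts = list(range(1, 101))    # hypothetical sampling frame: IDs 1..100
sample = systematic_sample(accounts, k=10, seed=1)
print(sample)                     # 10 IDs, each 10 apart
```

If the frame has a cycle whose length matches k (for example, every 10th account is a branch manager's), the sample inherits that pattern, which is the periodicity bias mentioned above.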

  9. Q: What is cluster sampling?
    Explanation: Population divided into clusters; a random sample of clusters is selected and all (or some) units within chosen clusters are observed. Useful when no list of individual units (element-level sampling frame) is available.
    Answer: Group sampling.

  10. Q: When is purposive (judgmental) sampling appropriate?
    Explanation: When expert judgment selects units most relevant to purpose—useful for qualitative insight but not for statistical inference.
    Answer: Expert selection.

  11. Q: How do you compute arithmetic mean? (formula)
    Explanation: Sum of observations divided by number of observations: (\bar{x} = \frac{\sum x_i}{n}). Use for interval/ratio scales without extreme skew.
    Answer: (\bar{x}=\sum x_i/n).

  12. Q: How is median found in odd and even n?
    Explanation: For odd n, median is middle ordered value; for even n, median is average of two middle values after ordering.
    Answer: Middle (or average of two).

  13. Q: Define mode.
    Explanation: The value(s) with highest frequency in a distribution; useful for categorical data.
    Answer: Most frequent value.

  14. Q: Compute geometric mean for data 2,8.
    Explanation: GM = ((\prod x_i)^{1/n} = (2×8)^{1/2} = (16)^{1/2} = 4.)
    Answer: 4.

  15. Q: What is harmonic mean and when used?
    Explanation: HM = (n / \sum (1/x_i)); appropriate for averages of rates (e.g., speed).
    Answer: Harmonic mean for rates.

  16. Q: Define range and its limitation.
    Explanation: Range = max − min; very sensitive to extremes and doesn’t reflect dispersion of middle observations.
    Answer: Max−Min; sensitive.

  17. Q: What is variance and its formula (sample)?
    Explanation: Sample variance (s^2 = \frac{\sum (x_i-\bar{x})^2}{n-1}) measures average squared deviation from sample mean; use n−1 for unbiasedness.
    Answer: (s^2=\sum (x_i-\bar{x})^2/(n-1)).

  18. Q: Convert variance to standard deviation.
    Explanation: SD is square root of variance: (s = \sqrt{s^2}). SD has same units as data.
    Answer: Square root of variance.

  19. Q: What is coefficient of variation (CV)?
    Explanation: CV = SD/mean; expresses dispersion relative to mean, useful to compare variability across scales.
    Answer: SD divided by mean.
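
The measures in items 11 to 19 can be checked quickly with Python's standard `statistics` module. This is only a sketch: the data values are illustrative, and the speeds 40 and 60 in the harmonic-mean line are an assumed example of averaging rates over equal distances.

```python
import statistics as st

data = [2, 4, 6, 8, 10]               # illustrative observations

mean = st.mean(data)                  # arithmetic mean
median = st.median(data)              # middle value of the ordered data
s2 = st.variance(data)                # sample variance (divisor n - 1)
s = st.stdev(data)                    # sample SD = sqrt(s2)
cv = s / mean                         # coefficient of variation
gm = st.geometric_mean([2, 8])        # (2 * 8) ** 0.5
hm = st.harmonic_mean([40, 60])       # n / sum(1/x), suited to rates
data_range = max(data) - min(data)    # range = max - min

print(mean, median, s2, round(s, 4), round(cv, 4))
print(round(gm, 4), round(hm, 4), data_range)
```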

  20. Q: Explain positive skew.
    Explanation: Right tail longer; typically mean > median > mode, because a few large values pull the mean above the center of the data.
    Answer: Right-tailed.

  21. Q: Explain kurtosis in brief.
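    Explanation: Kurtosis measures tail heaviness (often described as peakedness) relative to the normal distribution; high kurtosis (leptokurtic) = heavy tails and sharp peak; low kurtosis (platykurtic) = light tails and flat top.
    Answer: Tail heaviness / peakedness.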

  22. Q: What is scatter diagram used for?
    Explanation: Plot of paired observations (x,y) to visualize relationship direction (positive/negative) and form (linear/nonlinear).
    Answer: Visual relationship.

  23. Q: Define Pearson correlation coefficient r (interpretation).
    Explanation: r measures linear association between −1 and +1; magnitude shows strength, sign shows direction.
    Answer: Linear relation (−1 to +1).

  24. Q: What is coefficient of determination (R²)?
    Explanation: R² = proportion of variance in dependent variable explained by independent variable(s); equals square of correlation in simple regression.
    Answer: Explained variance.

  25. Q: State least squares principle.
    Explanation: Choose regression line parameters to minimize sum of squared residuals (differences between observed and predicted y).
    Answer: Minimize squared errors.

  26. Q: Define residual.
    Explanation: Residual = observed y − predicted y; measure of prediction error for each observation.
    Answer: Prediction error.

  27. Q: Give formula for slope (b) in simple linear regression (Y on X).
    Explanation: (b = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sum (x_i-\bar{x})^2}). It’s covariance/variance of X.
    Answer: Covariance/Var(X).

  28. Q: If correlation r = 0.8, what is R²?
    Explanation: R² = r² = 0.8² = 0.64; 64% variance explained.
    Answer: 0.64 (64%).
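
To tie items 23 to 28 together, the sketch below computes the least-squares slope, the intercept (a = ȳ − b·x̄, a standard result not stated above), r, and R² from first principles. The five (x, y) pairs are made-up illustrative data.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]               # illustrative data
y = [2, 4, 5, 4, 5]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)

b = sxy / sxx                     # slope = covariance / variance of X
a = ybar - b * xbar               # intercept
r = sxy / sqrt(sxx * syy)         # Pearson correlation

# Residual = observed y - predicted y; SSE is the sum of squared residuals
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
r2 = 1 - sse / syy                # coefficient of determination

print(round(b, 4), round(a, 4), round(r, 4), round(r2, 4))
```

For this data the last two numbers confirm that R² equals r² in simple regression.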

  29. Q: What is trend in time series?
    Explanation: Long-term systematic movement (up/down) over time due to structural changes (growth, technology).
    Answer: Long-term movement.

  30. Q: Define seasonal variation.
    Explanation: Regular, short-term pattern repeating within a fixed period (months/quarters) due to seasons or calendar effects.
    Answer: Periodic pattern.

  31. Q: What is cyclical variation?
    Explanation: Long-term wave-like movement related to business cycles, not of fixed period like seasonality.
    Answer: Business cycle.

  32. Q: How does moving average smooth data?
    Explanation: Replace each value by average of neighboring values to reduce short-term fluctuations and reveal trend.
    Answer: Smooths fluctuations.

  33. Q: Define index number (brief).
    Explanation: A measure showing relative change in a variable or group over time, expressed relative to a base period.
    Answer: Relative change.

  34. Q: What is deseasonalization?
    Explanation: Removing seasonal component from series (divide by seasonal index or subtract seasonal effect) to analyze underlying trend.
    Answer: Remove seasonality.

  35. Q: Define probability in classical sense.
    Explanation: Probability = number of favorable outcomes / total number of outcomes, assuming all outcomes in the sample space are equally likely.
    Answer: Favorable/total.

  36. Q: What is conditional probability P(A|B)?
    Explanation: Probability of A given B has occurred: (P(A|B) = P(A \cap B)/P(B)) if P(B)>0.
    Answer: (P(A\cap B)/P(B)).

  37. Q: Define discrete vs continuous random variable.
    Explanation: Discrete takes countable values; continuous takes any value in interval. Discrete uses PMF, continuous uses PDF.
    Answer: Countable vs uncountable.

  38. Q: Formula for expectation (discrete).
    Explanation: (E[X] = \sum x_i p_i); weighted average of possible values by probabilities.
    Answer: (\sum x_i p_i).

  39. Q: Variance formula (random variable).
    Explanation: Var(X) = E[(X − μ)²] = E[X²] − (E[X])².
    Answer: (E[X^2]−(E[X])^2).
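
As a worked check of the expectation and variance formulas, the sketch below uses a fair six-sided die (an assumed example) and exact fractions so that no rounding is involved.

```python
from fractions import Fraction

values = [1, 2, 3, 4, 5, 6]           # faces of a fair die (assumed example)
probs = [Fraction(1, 6)] * 6          # each face has probability 1/6

ex = sum(x * p for x, p in zip(values, probs))        # E[X]
ex2 = sum(x * x * p for x, p in zip(values, probs))   # E[X^2]
var = ex2 - ex ** 2                                   # Var(X) = E[X^2] - (E[X])^2

print(ex, ex2, var)
```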

  40. Q: Binomial distribution parameters?
    Explanation: Parameters n (trials) and p (success prob); P(X=k)=C(n,k)p^k(1−p)^{n−k}.
    Answer: n and p.

  41. Q: For large n and p not extreme, which distribution approximates binomial?
    Explanation: Normal approx to binomial when np and n(1−p) are ≥ ~5 (rule of thumb).
    Answer: Normal.

  42. Q: When is Poisson approximation to binomial used?
    Explanation: For large n and small p with λ = np moderate; the Poisson approximation models counts of rare events.
    Answer: Large n, small p.

  43. Q: Mean and variance of Poisson(λ).
    Explanation: Both mean and variance equal λ.
    Answer: Mean=Variance=λ.
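
The quality of the Poisson approximation can be seen by putting the two probability functions side by side. In this sketch n = 1000 and p = 0.003 are assumed values chosen so that λ = np = 3; the helper names are hypothetical.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.003                # assumed: many trials, rare event
lam = n * p                       # lambda = np
for k in range(5):
    print(k, round(binom_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
```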

  44. Q: Standard normal: mean and SD?
    Explanation: Z ~ N(0,1): mean 0, standard deviation 1.
    Answer: Mean 0, SD 1.

  45. Q: What is Z-score?
    Explanation: Standardized value: (z=(x-\mu)/\sigma); expresses how many SDs x is from mean.
    Answer: Standardized deviation.

  46. Q: Define Value at Risk (VaR) briefly.
    Explanation: VaR at α% is the maximum loss not exceeded with confidence α over a specified horizon (quantile of loss distribution).
    Answer: Quantile loss.

  47. Q: Key idea of Black-Scholes model (one line).
    Explanation: Option pricing via dynamic hedging assuming lognormal asset price, no arbitrage, continuous trading, constant volatility.
    Answer: Risk-neutral pricing.

  48. Q: Define point estimate vs interval estimate.
    Explanation: Point estimate gives single best value (e.g., sample mean); interval estimate gives a range with confidence level containing the parameter.
    Answer: Single value vs range.

  49. Q: 95% confidence interval for large sample mean formula (known σ).
    Explanation: (\bar{x} \pm z_{0.975}\cdot \sigma/\sqrt{n}). Use z=1.96 for 95% when σ known or n large.
    Answer: (\bar{x}\pm1.96\sigma/\sqrt{n}).

  50. Q: What is type I error?
    Explanation: Rejecting true null hypothesis (false positive). Probability is α (significance level).
    Answer: False rejection.

  51. Q: What is type II error?
    Explanation: Failing to reject false null (false negative). Denoted β; power = 1−β.
    Answer: False acceptance.

  52. Q: Define power of a test.
    Explanation: Probability of correctly rejecting false null = 1−β; increases with sample size and effect size.
    Answer: 1−β.

  53. Q: In linear programming, what is feasible region?
    Explanation: Set of all points satisfying every constraint simultaneously; each feasible solution lies in this region, boundary included.
    Answer: Constraint set.

  54. Q: Where does the optimal solution of an LP occur (for a convex feasible region)?
    Explanation: If an optimum exists, it is attained at a corner (extreme) point of the feasible region (fundamental theorem of LP).
    Answer: Corner point.
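
The corner-point idea can be illustrated on a tiny, hypothetical LP: maximize 3x + 2y subject to x + y ≤ 4, x ≤ 3, x ≥ 0, y ≥ 0. The sketch simply evaluates the objective at each vertex of the feasible region (worked out by hand for this example); it is not a general solver.

```python
# Hypothetical LP: maximize 3x + 2y
# subject to x + y <= 4, x <= 3, x >= 0, y >= 0
corners = [(0, 0), (3, 0), (3, 1), (0, 4)]   # vertices of the feasible region

def objective(point):
    x, y = point
    return 3 * x + 2 * y

best = max(corners, key=objective)           # best corner by objective value
print(best, objective(best))
```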

  55. Q: What is simplex method purpose?
    Explanation: Iterative algorithm to move from one basic feasible solution to a better one until optimality is reached.
    Answer: Optimize LP.

  56. Q: Define simulation in one line.
    Explanation: Using random numbers and models to imitate real processes to estimate performance metrics.
    Answer: Model imitation.

  57. Q: Give one advantage of simulation.
    Explanation: Can model complex systems where analytical solutions are intractable; flexible and intuitive.
    Answer: Flexibility.

  58. Q: What is Monte Carlo simulation?
    Explanation: Simulation using repeated random sampling to approximate probabilistic outcomes or integrals.
    Answer: Random sampling.

  59. Q: For sample of 50, SD=10, what is standard error of mean?
    Explanation: SE = SD/√n = 10/√50. √50 = 7.0711 → 10/7.0711 = 1.4142 (approx).
    Answer: ≈1.414.

  60. Q: If sample mean=100, n=36, σ=12, 95% CI for μ?
    Explanation: SE=12/6=2; z=1.96 → margin 1.96×2=3.92. CI = 100 ± 3.92 = (96.08, 103.92).
    Answer: (96.08, 103.92).
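
The interval above can be reproduced with a short helper. This is a sketch assuming known σ; `mean_ci` is a hypothetical name, and the z value comes from the standard library's `NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(xbar, sigma, n, conf=0.95):
    """z-based confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # about 1.96 for 95%
    margin = z * sigma / sqrt(n)                   # z times standard error
    return xbar - margin, xbar + margin

lo, hi = mean_ci(100, 12, 36)
print(round(lo, 2), round(hi, 2))
```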

  61. Q: You observe sample proportion p̂ = 0.6, n=200. SE?
    Explanation: SE = √[p̂(1−p̂)/n] = √[0.6×0.4/200] = √[0.24/200]=√0.0012=0.03464.
    Answer: ≈0.0346.

  62. Q: Test H0: μ=50 vs H1: μ≠50, with (\bar{x}=53), σ=6, n=36; z?
    Explanation: SE=6/6=1; z=(53−50)/1=3 → p-value <0.01; reject H0.
    Answer: z=3 (reject).
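
The same test can be sketched in code, which also gives the two-sided p-value instead of only a bound. The function name is hypothetical and σ is assumed known.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_sided(xbar, mu0, sigma, n):
    """Return the z statistic and two-sided p-value for H0: mu = mu0."""
    se = sigma / sqrt(n)                           # standard error of the mean
    z = (xbar - mu0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # area in both tails
    return z, p_value

z, p = z_test_two_sided(53, 50, 6, 36)
print(round(z, 3), round(p, 4))
```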

  63. Q: Interpret p-value = 0.03 for α=0.05.
    Explanation: p<α so reject H0; evidence significant at 5% level.
    Answer: Reject H0.

  64. Q: In sample of 100, 40 successes, 95% CI for proportion?
    Explanation: p̂=0.4; SE=√(0.4×0.6/100)=√0.0024=0.049; margin=1.96×0.049≈0.096; CI = (0.304,0.496).
    Answer: (0.304, 0.496).
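
A matching sketch for proportions, using the normal (Wald) approximation that the working above relies on. `proportion_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, conf=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)             # standard error of p-hat
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return p_hat - z * se, p_hat + z * se

lo, hi = proportion_ci(40, 100)
print(round(lo, 3), round(hi, 3))
```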

  65. Q: What is unbiased estimator?
    Explanation: Estimator whose expected value equals true parameter (e.g., sample mean unbiased for μ).
    Answer: Expectation matches.

  66. Q: For sample variance s² with denominator n−1 — why n−1?
    Explanation: n−1 gives unbiased estimator of population variance by accounting for estimation of mean.
    Answer: Unbiasedness.

  67. Q: What is law of large numbers?
    Explanation: As sample size increases, sample average converges to population mean (in probability).
    Answer: Convergence of mean.

  68. Q: Suppose X~N(100,16). What is P(X>108)?
    Explanation: σ=4; z=(108−100)/4=2; P(Z>2)=0.0228 (approx).
    Answer: ≈0.0228.
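
Tail probabilities like this one can be checked without a z-table. The sketch below uses `statistics.NormalDist`; note that N(100, 16) is read as variance 16, so the SD passed in is 4.

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=4)   # variance 16 means sigma = 4
z = (108 - X.mean) / X.stdev      # z-score of 108
tail = 1 - X.cdf(108)             # P(X > 108)

print(z, round(tail, 4))
```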

  69. Q: If Poisson λ=3, P(X=2)?
    Explanation: P= e^{-3} 3^2 /2! = e^{-3} ×9/2 = 4.5 e^{-3}. e^{-3}=0.049787 → 4.5×0.049787=0.22404.
    Answer: ≈0.2240.

  70. Q: If binomial n=10, p=0.2, P(X=0)?
    Explanation: (1−p)^n = 0.8^{10}. 0.8^2=0.64; 0.8^4=0.4096; 0.8^8=0.16777216; multiply by 0.8^2 = 0.16777216×0.64=0.107374. So ≈0.1074.
    Answer: ≈0.1074.

  71. Q: Define expected shortfall (ES) in one line.
    Explanation: ES (CVaR) is the average loss in the worst (1−α)% tail, i.e., beyond VaR at confidence level α; a coherent risk measure capturing tail severity.
    Answer: Tail average loss.
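
To make VaR and ES concrete, here is a minimal empirical sketch. It assumes one simple convention (VaR is the largest loss outside the worst tail, ES is the average of the losses inside it); other quantile conventions exist and give slightly different numbers. The loss data 1 to 100 are hypothetical.

```python
import statistics as st

def var_es(losses, conf=0.95):
    """Empirical VaR and ES under one simple convention (an assumption)."""
    ordered = sorted(losses)
    n_tail = round((1 - conf) * len(ordered))    # number of worst-case losses
    if not 1 <= n_tail < len(ordered):
        raise ValueError("tail must contain some, but not all, observations")
    var = ordered[-n_tail - 1]                   # largest loss outside the tail
    es = st.mean(ordered[-n_tail:])              # average loss inside the tail
    return var, es

losses = list(range(1, 101))     # hypothetical losses 1..100
print(var_es(losses))            # 95% VaR and ES
```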

  72. Q: What is the finite population correction (FPC)?
    Explanation: FPC = √[(N−n)/(N−1)] reduces SE when sampling without replacement from finite population.
    Answer: √((N−n)/(N−1)).

  73. Q: For population N=1000, n=200, FPC?
    Explanation: (1000−200)/(1000−1)=800/999=0.8008008; sqrt ≈ 0.8949.
    Answer: ≈0.8949.
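
A one-line helper reproduces this value and shows how the correction enters the standard error. The function names are hypothetical, and the SD of 10 in the usage line is an assumed figure.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction for sampling without replacement."""
    return sqrt((N - n) / (N - 1))

def se_mean_finite(sd, N, n):
    """Standard error of the mean with the FPC applied."""
    return sd / sqrt(n) * fpc(N, n)

print(round(fpc(1000, 200), 4))
print(round(se_mean_finite(10, 1000, 200), 4))   # sd = 10 is an assumed value
```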

  74. Q: A data set: 2,4,6,8,10 → median?
    Explanation: n=5 (odd), ordered median is 3rd value = 6.
    Answer: 6.

  75. Q: Data set 2,4,6,8 (the previous set without 10) → mean and median?
    Explanation: mean=(2+4+6+8)/4=20/4=5; median=(4+6)/2=5.
    Answer: Mean=5; Median=5.

  76. Q: If sample mean increases, what happens to CV?
    Explanation: CV = SD/mean; if mean increases while SD unchanged, CV decreases (relative variability falls).
    Answer: CV decreases.

  77. Q: Compute combined mean of two groups: n1=50, mean1=60; n2=70, mean2=55.
    Explanation: Combined mean = (50×60 + 70×55) / (120) = (3000+3850)/120=6850/120=57.0833.
    Answer: ≈57.083.
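
The combined mean is just a size-weighted average, which generalizes to any number of groups. A minimal sketch, with `combined_mean` as a hypothetical helper:

```python
def combined_mean(groups):
    """groups: list of (size, mean) pairs; returns the pooled mean."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n

print(round(combined_mean([(50, 60), (70, 55)]), 3))
```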

  78. Q: If correlation between X and Y is 0, does it imply independence?
    Explanation: Zero correlation implies no linear relation but variables can be dependent (nonlinear dependence possible).
    Answer: Not necessarily.

  79. Q: If regression of Y on X has intercept 10 and slope 2, predicted y for x=5?
    Explanation: ŷ = 10 + 2×5 = 20.
    Answer: 20.

  80. Q: For simple regression, SSE + SSR = SST — what are these?
    Explanation: SST = total sum squares; SSR = regression sum squares (explained); SSE = error sum squares (residual). SST = SSR + SSE.
    Answer: Total = Explained + Error.

  81. Q: If R² = 0.81, what percentage of variation explained?
    Explanation: 81% of variability in dependent variable explained by model.
    Answer: 81%.

  82. Q: Define multicollinearity in brief.
    Explanation: High correlation among independent variables in multiple regression causing unstable coefficient estimates and inflated variances.
    Answer: Predictor intercorrelation.

  83. Q: What is Durbin-Watson statistic used for?
    Explanation: Test for autocorrelation in residuals of regression (value near 2 implies no autocorrelation).
    Answer: Autocorrelation test.

  84. Q: Time series 2018=100, 2019=110 → percent change?
    Explanation: (110−100)/100 ×100 = 10%.
    Answer: 10%.

  85. Q: Given monthly sales: Jan 100, Feb 120, Mar 80 — moving average (3-month centered) for Feb?
    Explanation: 3-month MA at Feb uses Jan+Feb+Mar /3 = (100+120+80)/3=300/3=100.
    Answer: 100.
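
For longer series the same centered 3-month average can be computed in a loop. The sketch below uses only the three months given in the question, so it returns a single value (for Feb); the function name is hypothetical.

```python
def centered_ma3(series):
    """Centered 3-term moving average; the first and last points get no value."""
    return [sum(series[i - 1:i + 2]) / 3 for i in range(1, len(series) - 1)]

sales = [100, 120, 80]           # Jan, Feb, Mar
print(centered_ma3(sales))       # one value, centered on Feb
```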

  86. Q: If annualized volatility is 16%, what is the daily SD? (assume 252 trading days)
    Explanation: Daily SD = annual SD / √252, with √252 ≈ 15.8745, so 0.16/15.8745 ≈ 0.01008 (1.008%).
    Answer: ≈1.01% per day.

  87. Q: What is the relation between PD (probability of default) and expected loss?
    Explanation: Expected loss = PD × LGD × EAD (probability × loss given default × exposure at default).
    Answer: EL = PD×LGD×EAD.
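
A quick numeric illustration of the formula, using made-up figures (PD 2%, LGD 45%, EAD 1,000,000) that are assumptions for the example only:

```python
def expected_loss(pd_, lgd, ead):
    """Expected loss = PD x LGD x EAD."""
    return pd_ * lgd * ead

# Hypothetical loan: 2% default probability, 45% loss given default
el = expected_loss(0.02, 0.45, 1_000_000)
print(round(el, 2))
```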

  88. Q: What assumption underlies Black-Scholes volatility?
    Explanation: Constant volatility and lognormal returns; continuous trading and no arbitrage.
    Answer: Constant volatility.

  89. Q: A fair die rolled 60 times; expected number of sixes?
    Explanation: Expected = n×p = 60×1/6 = 10.
    Answer: 10.

  90. Q: If α=5% two-sided, critical z?
    Explanation: z_{0.975} = 1.96.
    Answer: 1.96.

  91. Q: For hypothesis test with sample mean 52, H0 μ=50, SE=1.5; statistic?
    Explanation: z=(52−50)/1.5=2/1.5=1.3333 → p≈0.182 (two-tailed >0.05 so fail to reject).
    Answer: z≈1.333 (fail to reject).

  92. Q: In sampling without replacement, why SE smaller?
    Explanation: Because observations are negatively correlated; FPC reduces SE relative to with-replacement case.
    Answer: Because of FPC.

  93. Q: If p̂=0.25 and n=400, 99% CI for p? (z=2.576)
    Explanation: SE=√(0.25×0.75/400)=√(0.1875/400)=√0.00046875=0.02165; margin=2.576×0.02165=0.05577. CI=(0.1942,0.3058).
    Answer: ≈(0.194,0.306).

  94. Q: Define law of total probability (brief).
    Explanation: Probability of event A = sum over B_i of P(A|B_i)P(B_i) when {B_i} partition sample space.
    Answer: Partition sum rule.

  95. Q: For exponential distribution with mean 5, λ?
    Explanation: Mean = 1/λ → λ = 1/5 = 0.2.
    Answer: 0.2.

  96. Q: If daily returns have mean 0.05% and SD 1%, what are the approximate annual mean and SD? (assume 252 trading days)
    Explanation: Approx annual mean ≈ daily mean ×252 = 0.0005×252 = 0.126 = 12.6% (approx). SD annual ≈ SD×√252 = 0.01×15.8745=0.1587 = 15.87%.
    Answer: Mean≈12.6%, SD≈15.87%.
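
The scaling rules used here and in item 86 (mean scales with time, SD with the square root of time) can be sketched as follows. This assumes independent, identically distributed daily returns and 252 trading days.

```python
from math import sqrt

TRADING_DAYS = 252                # assumed number of trading days per year

daily_sd_from_annual = 0.16 / sqrt(TRADING_DAYS)   # item 86
annual_mean = 0.0005 * TRADING_DAYS                # item 96: mean scales with T
annual_sd = 0.01 * sqrt(TRADING_DAYS)              # item 96: SD scales with sqrt(T)

print(round(daily_sd_from_annual, 5), round(annual_mean, 4), round(annual_sd, 4))
```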

  97. Q: What is the purpose of bootstrapping?
    Explanation: Resampling with replacement from sample to estimate sampling distributions, SEs, and CIs without parametric assumptions.
    Answer: Resampling inference.
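
A minimal bootstrap sketch for the standard error of the mean is shown below. The 20 observations, the number of resamples, and the seed are all assumptions for illustration; the textbook formula s/√n is printed alongside for comparison.

```python
import random
import statistics as st
from math import sqrt

def bootstrap_se(sample, n_boot=2000, seed=0):
    """Bootstrap standard error of the sample mean."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample n values with replacement, record the mean, repeat n_boot times
    boot_means = [st.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]
    return st.stdev(boot_means)

# Hypothetical sample of 20 observations
sample = [12, 15, 9, 20, 14, 11, 18, 16, 13, 17,
          10, 19, 15, 14, 12, 16, 21, 13, 17, 15]

se_boot = bootstrap_se(sample)
se_formula = st.stdev(sample) / sqrt(len(sample))
print(round(se_boot, 3), round(se_formula, 3))   # the two should be close
```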

  98. Q: In simplex method, what indicates optimality?
    Explanation: For a maximization, all reduced costs (c_j − z_j) are ≤ 0 (sign conventions vary by textbook), so no entering variable can improve the objective.
    Answer: No positive reduced cost.

  99. Q: Simulation gives sample mean 200 from 1000 runs; true mean unknown — how improve precision?
    Explanation: Increase number of simulation runs (law of large numbers) or reduce variance by variance reduction techniques (antithetic variates, control variates).
    Answer: Increase runs / variance reduction.
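
Antithetic variates can be illustrated on a toy problem. The sketch below estimates E[e^U] for U uniform on (0, 1), whose true value is e − 1; the target function, seed, and run counts are assumptions chosen only to show the variance reduction.

```python
import random
import statistics as st
from math import exp, e

rng = random.Random(7)            # fixed seed for repeatability
pairs = 10_000

# Plain Monte Carlo: 2 * pairs independent draws
plain = [exp(rng.random()) for _ in range(2 * pairs)]

# Antithetic variates: pair each U with 1 - U and average within the pair
anti = []
for _ in range(pairs):
    u = rng.random()
    anti.append((exp(u) + exp(1 - u)) / 2)

print(round(st.fmean(plain), 4), round(st.fmean(anti), 4), round(e - 1, 4))
print(st.variance(anti) < st.variance(plain))   # pair averages vary far less
```

Both estimators use the same number of function evaluations, so the lower variance of the pair averages translates directly into a more precise estimate.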

  100. Q: What’s the shortcut to compute covariance from summary stats?
    Explanation: Cov(X,Y) = [ΣXY − n·(\bar{x})(\bar{y})]/(n−1) for sample covariance using sums and means.
    Answer: ((\sum XY - n\bar{x}\bar{y})/(n-1)).
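
The shortcut can be verified against the definitional formula on a small data set. The five (x, y) pairs below are illustrative values, not from the source.

```python
x = [1, 2, 3, 4, 5]               # illustrative data
y = [2, 4, 5, 4, 5]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Definition: average product of deviations (divisor n - 1)
cov_definition = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)

# Shortcut: uses only the sum of products and the two means
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
cov_shortcut = (sum_xy - n * xbar * ybar) / (n - 1)

print(cov_definition, cov_shortcut)
```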