Hypothesis Testing
17 statistical tests for formulating, testing, and validating hypotheses (using Python)
Hypothesis testing is the process used to evaluate the strength of evidence from a sample, and it provides a framework for making determinations about the population; i.e., it provides a method for understanding how reliably one can extrapolate findings observed in a study sample to the larger population from which the sample was drawn.
Importance of Hypothesis Testing
The most significant benefit of hypothesis testing is that it lets you evaluate the strength of your claim or assumption before acting on it in your data set. It also gives you a principled way of judging whether something “is or is not” the case, rather than relying on intuition alone. Other benefits include:
- Hypothesis testing provides a reliable framework for making any data decisions for your population of interest.
- It helps the researcher to extrapolate data from the sample to the larger population successfully.
- Hypothesis testing allows the researcher to determine whether the data from the sample is statistically significant.
- Hypothesis testing is one of the most important processes for measuring the validity and reliability of outcomes in any systematic investigation.
- It helps to provide links to the underlying theory and specific research questions.
17 Statistical Tests:
- Normality Tests
  - Shapiro-Wilk Test
  - D’Agostino’s K^2 Test
  - Anderson-Darling Test
- Correlation Tests
  - Pearson’s Correlation Coefficient
  - Spearman’s Rank Correlation
  - Kendall’s Rank Correlation
  - Chi-Squared Test
- Stationarity Tests
  - Augmented Dickey-Fuller
  - Kwiatkowski-Phillips-Schmidt-Shin
- Parametric Statistical Hypothesis Tests
  - Student’s t-test
  - Paired Student’s t-test
  - Analysis of Variance Test (ANOVA)
  - Repeated Measures ANOVA Test (no SciPy routine; see the statsmodels sketch after this list)
- Nonparametric Statistical Hypothesis Tests
  - Mann-Whitney U Test
  - Wilcoxon Signed-Rank Test
  - Kruskal-Wallis H Test
  - Friedman Test
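The list above notes that repeated measures ANOVA has no ready-made SciPy routine. statsmodels does provide one, AnovaRM; here is a minimal sketch with made-up scores (the subject/condition/score layout is purely illustrative):

```python
# Repeated measures ANOVA via statsmodels (illustrative data, not from the e-commerce set).
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Four subjects, each measured under the same three conditions.
df = pd.DataFrame({
    'subject':   [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    'condition': ['a', 'b', 'c'] * 4,
    'score':     [5.1, 6.2, 7.0, 4.8, 5.9, 6.8, 5.5, 6.1, 7.2, 5.0, 6.0, 6.9],
})

# H0: the condition means are equal across the repeated measurements.
result = AnovaRM(df, depvar='score', subject='subject', within=['condition']).fit()
print(result)
```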
Load the dataset
```python
import pandas as pd

# Each table of the e-commerce star schema lives on its own sheet of the workbook.
path = r'C:/Users/Kamruzzman Shuvo/Downloads/BusinessIntelligence/BI/dataset/e-commerce-data.xlsx'

fact_table   = pd.read_excel(path, sheet_name='Fact_table', engine='openpyxl')
item_dim     = pd.read_excel(path, sheet_name='Item_dim', engine='openpyxl')
customer_dim = pd.read_excel(path, sheet_name='Coustomer_dim', engine='openpyxl')  # sheet name as spelled in the workbook
time_dim     = pd.read_excel(path, sheet_name='Time_dim', engine='openpyxl')
store_dim    = pd.read_excel(path, sheet_name='Store_dim', engine='openpyxl')
trans_dim    = pd.read_excel(path, sheet_name='Trans_dim', engine='openpyxl')

print("Successfully Loaded the Dataset!")
```
Normality Tests
This section lists statistical tests that you can use to check if your data has a Gaussian distribution.
1. Shapiro-Wilk Test
Tests whether a data sample has a Gaussian distribution. The Shapiro-Wilk test evaluates the null hypothesis that the data were drawn from a normal distribution.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Interpretation
H0: the sample has a Gaussian distribution.
H1: the sample does not have a Gaussian distribution.
```python
# Example of the Shapiro-Wilk Normality Test
from scipy.stats import shapiro

data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
stat, p = shapiro(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')
```
2. D’Agostino’s K^2 Test
Tests whether a data sample has a Gaussian distribution.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Interpretation
H0: the sample has a Gaussian distribution.
H1: the sample does not have a Gaussian distribution.
```python
# Example of the D'Agostino's K^2 Normality Test
from scipy.stats import normaltest

data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
stat, p = normaltest(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')
```
3. Anderson-Darling Test
Tests whether a data sample has a Gaussian distribution.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Interpretation
H0: the sample has a Gaussian distribution.
H1: the sample does not have a Gaussian distribution.
Return values of anderson:
- statistic (float): the Anderson-Darling test statistic.
- critical_values (list): the critical values for this distribution.
- significance_level (list): the significance levels for the corresponding critical values, in percent. The function returns critical values for a different set of significance levels depending on the distribution being tested against:
  - normal/exponential: 15%, 10%, 5%, 2.5%, 1%
  - logistic: 25%, 10%, 5%, 2.5%, 1%, 0.5%
  - Gumbel: 25%, 10%, 5%, 2.5%, 1%

If the returned statistic is larger than one of these critical values, then at the corresponding significance level the null hypothesis that the data come from the chosen distribution can be rejected.
```python
# Example of the Anderson-Darling Normality Test
from scipy.stats import anderson

data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
result = anderson(data)
print('stat=%.3f' % (result.statistic))
for i in range(len(result.critical_values)):
    sl, cv = result.significance_level[i], result.critical_values[i]
    if result.statistic < cv:
        print('Probably Gaussian at the %.1f%% level' % (sl))
    else:
        print('Probably not Gaussian at the %.1f%% level' % (sl))
```
Practice problem 7.1
Take some normalized quantity samples from the fact table and check whether they follow a Gaussian distribution.
Hints:
- First, normalize the quantity data in the fact table.
- Then take 10-15 quantity samples from the normalized fact table.
- Load the test function from scipy.stats.
- Use any of the normality-test algorithms above for the hypothesis test.
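One possible solution sketch, assuming the fact table has a quantity column (the column name may differ in your workbook) and using min-max normalization:

```python
# Sketch for practice problem 7.1; 'quantity' is an assumed column name.
from scipy.stats import shapiro

q = fact_table['quantity']
q_norm = (q - q.min()) / (q.max() - q.min())   # min-max normalization to [0, 1]
sample = q_norm.sample(15, random_state=1)     # 10-15 samples, fixed seed for reproducibility

stat, p = shapiro(sample)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')
```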
Correlation Tests
This section lists statistical tests that you can use to check if two samples are related.
4. Pearson’s Correlation Coefficient
Tests whether two samples have a linear relationship.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample are normally distributed. Observations in each sample have the same variance.
Interpretation
H0: the two samples are independent.
H1: there is a dependency between the samples.
```python
# Example of the Pearson's Correlation Test
from scipy.stats import pearsonr

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]
stat, p = pearsonr(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
```
Practice problem 7.2
Take some normalized samples of quantity and unit price from the fact table and test the hypothesis that they are independent or dependent [use any correlation-test algorithm].
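A possible sketch, assuming quantity and unit_price columns in the fact table (adjust the names to your sheet). Note that Pearson's correlation is invariant to min-max scaling of each variable, so the normalization step changes presentation only, not the test result:

```python
# Sketch for practice problem 7.2; 'quantity' and 'unit_price' are assumed column names.
from scipy.stats import pearsonr

def minmax(s):
    return (s - s.min()) / (s.max() - s.min())

sample = fact_table.sample(15, random_state=1)
stat, p = pearsonr(minmax(sample['quantity']), minmax(sample['unit_price']))
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
```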
5. Spearman’s Rank Correlation
Tests whether two samples have a monotonic relationship.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample can be ranked.
Interpretation
H0: the two samples are independent.
H1: there is a dependency between the samples.
```python
# Example of the Spearman's Rank Correlation Test
from scipy.stats import spearmanr

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]
stat, p = spearmanr(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
```
Practice problem 7.3
Take some normalized samples of quantity and total_price from the fact table and test the hypothesis that they are independent or dependent [use the Spearman’s Rank Correlation algorithm].
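A possible sketch, assuming quantity and total_price columns; since Spearman's test ranks each variable separately, min-max normalization does not change the result and can be skipped:

```python
# Sketch for practice problem 7.3; 'quantity' and 'total_price' are assumed column names.
from scipy.stats import spearmanr

sample = fact_table.sample(15, random_state=1)
stat, p = spearmanr(sample['quantity'], sample['total_price'])
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
```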
6. Kendall’s Rank Correlation
Tests whether two samples have a monotonic relationship.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample can be ranked.
Interpretation
H0: the two samples are independent.
H1: there is a dependency between the samples.
```python
# Example of the Kendall's Rank Correlation Test
from scipy.stats import kendalltau

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]
stat, p = kendalltau(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
```
7. Chi-Squared Test
Tests whether two categorical variables are related or independent.
Assumptions
Observations used in the calculation of the contingency table are independent. Each cell of the contingency table should contain 25 or more examples.
Interpretation
H0: the two samples are independent.
H1: there is a dependency between the samples.
Return values of chi2_contingency:
- chi2 (float): the test statistic.
- p (float): the p-value of the test.
- dof (int): degrees of freedom.
- expected (ndarray, same shape as observed): the expected frequencies, based on the marginal sums of the table.
```python
# Example of the Chi-Squared Test
from scipy.stats import chi2_contingency

table = [[10, 20, 30],
         [6, 9, 17]]
stat, p, dof, expected = chi2_contingency(table)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
```
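In practice the contingency table is usually built from two categorical columns rather than typed by hand; pandas’ crosstab does this directly. A sketch assuming two hypothetical categorical columns (division, district) in the store dimension:

```python
# Chi-squared test on a contingency table built with pd.crosstab.
# 'division' and 'district' are hypothetical column names in store_dim.
import pandas as pd
from scipy.stats import chi2_contingency

table = pd.crosstab(store_dim['division'], store_dim['district'])
stat, p, dof, expected = chi2_contingency(table)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
```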
Stationarity Tests
This section lists statistical tests that you can use to check if a time series is stationary or not.
8. Augmented Dickey-Fuller Unit Root Test
Tests whether a time series has a unit root, i.e., has a trend or, more generally, is autoregressive.
Assumptions
Observations are temporally ordered.
Interpretation
H0: a unit root is present (series is non-stationary).
H1: a unit root is not present (series is stationary).
Return values of adfuller:
- adf (float): the test statistic.
- pvalue (float): MacKinnon’s approximate p-value based on MacKinnon (1994, 2010).
- usedlag (int): the number of lags used.
- nobs (int): the number of observations used for the ADF regression and the calculation of the critical values.
- critical values (dict): critical values for the test statistic at the 1%, 5%, and 10% levels, based on MacKinnon (2010).
- icbest (float): the maximized information criterion if autolag is not None.
- resstore (ResultStore, optional): a dummy class with results attached as attributes.
```python
# Example of the Augmented Dickey-Fuller unit root test
from statsmodels.tsa.stattools import adfuller

data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
stat, p, lags, obs, crit, t = adfuller(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably not Stationary')
else:
    print('Probably Stationary')
```
Practice problem 7.4
Take 10 samples of quantity from the fact table and test the hypothesis that the series is stationary or non-stationary [use the Augmented Dickey-Fuller Unit Root Test algorithm].
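A possible sketch, assuming a quantity column; adfuller is applied to the raw series in row order:

```python
# Sketch for practice problem 7.4; 'quantity' is an assumed column name.
from statsmodels.tsa.stattools import adfuller

series = fact_table['quantity'].head(10)
stat, p, lags, obs, crit, t = adfuller(series)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably not Stationary')
else:
    print('Probably Stationary')
```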
9. Kwiatkowski-Phillips-Schmidt-Shin
Tests whether a time series is trend stationary or not.
Assumptions
Observations are temporally ordered.
Interpretation
H0: the time series is trend-stationary.
H1: the time series is not trend-stationary.
```python
# Example of the Kwiatkowski-Phillips-Schmidt-Shin test
from statsmodels.tsa.stattools import kpss

data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
stat, p, lags, crit = kpss(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Stationary')
else:
    print('Probably not Stationary')
```
Practice problem 7.5
Take 10 samples of year from the time dimension and test the hypothesis that the series is stationary or non-stationary [use the Kwiatkowski-Phillips-Schmidt-Shin algorithm].
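A possible sketch, assuming a year column in the time dimension. Be aware that kpss warns when the p-value falls outside its lookup table, and a near-constant sample (e.g., ten rows from the same year) can make the statistic ill-defined:

```python
# Sketch for practice problem 7.5; 'year' is an assumed column name.
from statsmodels.tsa.stattools import kpss

series = time_dim['year'].head(10)
stat, p, lags, crit = kpss(series)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Stationary')
else:
    print('Probably not Stationary')
```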
Parametric Statistical Hypothesis Tests
This section lists statistical tests that you can use to compare data samples.
10. Student’s t-test
Tests whether the means of two independent samples are significantly different.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample are normally distributed. Observations in each sample have the same variance.
Interpretation
H0: the means of the samples are equal.
H1: the means of the samples are unequal.
```python
# Example of the Student's t-test
from scipy.stats import ttest_ind

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = ttest_ind(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```
11. Paired Student’s t-test
Tests whether the means of two paired samples are significantly different.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample are normally distributed. Observations in each sample have the same variance. Observations across each sample are paired.
Interpretation
H0: the means of the samples are equal.
H1: the means of the samples are unequal.
```python
# Example of the Paired Student's t-test
from scipy.stats import ttest_rel

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = ttest_rel(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```
Practice problem 7.6
Take the first 10 normalized samples of unit price and total price from the fact table and test the hypothesis that they come from the same or different distributions [use the Paired Student’s t-test algorithm].
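A possible sketch, assuming unit_price and total_price columns, with each series min-max normalized separately:

```python
# Sketch for practice problem 7.6; 'unit_price' and 'total_price' are assumed column names.
from scipy.stats import ttest_rel

def minmax(s):
    return (s - s.min()) / (s.max() - s.min())

sample = fact_table.head(10)
stat, p = ttest_rel(minmax(sample['unit_price']), minmax(sample['total_price']))
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```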
12. Analysis of Variance Test (ANOVA)
Tests whether the means of two or more independent samples are significantly different.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample are normally distributed. Observations in each sample have the same variance.
Interpretation
H0: the means of the samples are equal.
H1: one or more of the means of the samples are unequal.
```python
# Example of the Analysis of Variance Test
from scipy.stats import f_oneway

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]
stat, p = f_oneway(data1, data2, data3)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```
Nonparametric Statistical Hypothesis Tests
This section lists statistical tests that you can use to compare data samples without assuming a particular distribution.
13. Mann-Whitney U Test
Tests whether the distributions of two independent samples are equal or not.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample can be ranked.
Interpretation
H0: the distributions of both samples are equal.
H1: the distributions of both samples are not equal.
```python
# Example of the Mann-Whitney U Test
from scipy.stats import mannwhitneyu

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = mannwhitneyu(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```
14. Wilcoxon Signed-Rank Test
Tests whether the distributions of two paired samples are equal or not.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample can be ranked. Observations across each sample are paired.
Interpretation
H0: the distributions of both samples are equal.
H1: the distributions of both samples are not equal.
```python
# Example of the Wilcoxon Signed-Rank Test
from scipy.stats import wilcoxon

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = wilcoxon(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```
15. Kruskal-Wallis H Test
Tests whether the distributions of two or more independent samples are equal or not.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample can be ranked.
Interpretation
H0: the distributions of all samples are equal.
H1: the distributions of one or more samples are not equal.
```python
# Example of the Kruskal-Wallis H Test
from scipy.stats import kruskal

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
stat, p = kruskal(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```
Practice problem 7.7
Take the first 10 normalized samples of quantity and unit_price from the fact table and test the hypothesis that they come from the same or different distributions [use the Kruskal-Wallis H Test algorithm].
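A possible sketch, assuming quantity and unit_price columns. Unlike the rank-correlation tests, here the two samples are pooled before ranking, so normalizing each series separately does affect the result:

```python
# Sketch for practice problem 7.7; 'quantity' and 'unit_price' are assumed column names.
from scipy.stats import kruskal

def minmax(s):
    return (s - s.min()) / (s.max() - s.min())

sample = fact_table.head(10)
stat, p = kruskal(minmax(sample['quantity']), minmax(sample['unit_price']))
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```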
16. Friedman Test
Tests whether the distributions of two or more paired samples are equal or not.
Assumptions
Observations in each sample are independent and identically distributed (iid). Observations in each sample can be ranked. Observations across each sample are paired.
Interpretation
H0: the distributions of all samples are equal.
H1: the distributions of one or more samples are not equal.
```python
# Example of the Friedman Test
from scipy.stats import friedmanchisquare

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]
stat, p = friedmanchisquare(data1, data2, data3)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
```