Chi-square test for testing hypothesis of association

What is a test of statistical significance?

Allows analysts to estimate confidence in the generalizability of results from a study.
Indicates risk of concluding a relationship in the population when there is no such relationship.
Statistically significant findings are not intrinsically significant or important.
The term 'significance' implies importance, but it is solely concerned with the researcher's confidence in their findings.

What is the level of statistical significance?

The level of statistical significance is the accepted risk of inferring a relationship between two variables in the population from which the sample was taken when no such relationship exists.
The maximum level of risk conventionally accepted in business and managerial research is p < 0.05, meaning there are up to 5 chances in 100 of falsely concluding that a relationship exists.
A significance level of p < 0.1 accepts the possibility that 10 out of 100 samples might show a relationship where none exists in the population.
The risk of falsely inferring a relationship is greater when the risk is 1 in 10 (10 out of 100 when p < 0.1) than when the risk is 1 in 20 (5 out of 100 when p < 0.05).
For a more stringent test, the p < 0.01 level is chosen, allowing for a probability of only 1 in 100 that the results could have arisen by chance (sampling error).
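To make the "5 chances in 100" idea concrete, the following Python sketch simulates many samples from a population in which two yes/no variables are truly unrelated, and counts how often a chi-square test (the test described later in this post) would still flag a relationship at p < 0.05. The function names, sample size, number of trials, and seed are illustrative assumptions, not part of the original text; 3.84 is the 0.05 critical value for one degree of freedom.

```python
import random


def chi_square_2x2(table):
    """Chi-square statistic for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # guard against an empty row or column
                statistic += (table[i][j] - expected) ** 2 / expected
    return statistic


def false_positive_rate(trials=2000, sample_size=200,
                        critical_value=3.84, seed=0):
    """Fraction of samples that 'find' a relationship although none exists.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        table = [[0, 0], [0, 0]]
        for _ in range(sample_size):
            # The two variables are drawn independently, so H0 is true.
            a = int(rng.random() < 0.5)
            b = int(rng.random() < 0.5)
            table[a][b] += 1
        if chi_square_2x2(table) > critical_value:
            rejections += 1
    return rejections / trials


print(false_positive_rate())
```

The printed fraction should land in the neighborhood of 0.05, although the exact figure depends on the seed, the sample size, and the number of trials.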

We can summarize the process of hypothesis testing as follows. This procedure is intended only to illustrate the logic of hypothesis testing:

1. Formulate the null hypothesis, H₀, and the alternative hypothesis, H_a, for the problem.

2. Select a significance level, α, that defines the measure of a statistically significant deviation. (Typical values for α are 0.01 and 0.05.)

3. Choose an appropriate test statistic for evaluating the hypothesis.

4. Calculate the critical value, c, according to the test statistic and the significance level, α.

5. Compute the test statistic from the data and determine whether it exceeds the critical value, c. If so, then the null hypothesis should be rejected; if not, then the null hypothesis should not be rejected.
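Steps 4 and 5 can be sketched in Python for the chi-square statistic, which is the test statistic used in the rest of this post. This is a minimal sketch under stated assumptions: the helper names `critical_value` and `reject_null` are hypothetical, and the closed-form critical values below hold only for 1 or 2 degrees of freedom (other cases normally come from a chi-square table or a statistics library).

```python
import math
from statistics import NormalDist


def critical_value(alpha, df):
    """Upper-tail chi-square critical value c, for df = 1 or df = 2 only."""
    if df == 1:
        # With 1 degree of freedom the statistic is a squared standard normal.
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return z * z
    if df == 2:
        # With 2 degrees of freedom the upper tail is exp(-c / 2).
        return -2 * math.log(alpha)
    raise ValueError("this sketch only covers df = 1 or df = 2")


def reject_null(statistic, alpha, df):
    """Step 5: reject H0 only if the statistic exceeds the critical value."""
    return statistic > critical_value(alpha, df)


print(round(critical_value(0.05, 1), 2))
print(round(critical_value(0.05, 2), 2))
print(reject_null(0.5, 0.05, 1))
```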

Chi-square test:

The chi-square test of independence, also known as the chi-square test of association, is used to determine whether two categorical variables are associated. It is a non-parametric test and is mostly used to test statistical independence.

The chi-square test of independence is not appropriate when the categorical variables represent the pre-test and post-test observations. For this test, the data must meet the following requirements:

Two categorical variables
Relatively large sample size
Categories of variables (two or more)
Independence of observations

Formula

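The chi-square test checks whether the observed values differ from the expected values by more than chance would explain. The formula for chi-square can be written as:

χ² = ∑(O_i – E_i)²/E_i

where O_i is the observed value and E_i is the expected value for category i.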
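In code, the statistic is the sum of (observed – expected)² / expected over all categories. A minimal Python sketch follows; the function name and the counts are made up purely for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts, purely for illustration.
observed = [10, 20]
expected = [15, 15]
print(chi_square_statistic(observed, expected))
```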

Examples:

1. A survey conducted in 2011 found that 60% of car owners have only one car, 28% have two cars, and 12% have three or more. Suppose you conduct your own survey and find that, out of 129 car owners, 73 have one car and 38 have two cars (so the remaining 18 have three or more). Determine whether your data support the results of the earlier study.

Use a significance level of 0.05.

Solution:

Let us state the null and alternative hypotheses.

H₀: The proportions of car owners with one, two, and three or more cars are 0.60, 0.28 and 0.12 respectively.

H₁: The proportions of car owners with one, two, and three or more cars do not match the proposed model.

A Chi-Square goodness of fit test is appropriate because we are examining the distribution of a single categorical variable.

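Let’s tabulate the given information and calculate the required values. Each expected count is E_i = 129 × the proposed proportion.

One car: O = 73, E = 129 × 0.60 = 77.40, (O – E)²/E = 0.2501
Two cars: O = 38, E = 129 × 0.28 = 36.12, (O – E)²/E = 0.0979
Three or more cars: O = 129 – 73 – 38 = 18, E = 129 × 0.12 = 15.48, (O – E)²/E = 0.4102

Therefore, χ² = ∑(O_i – E_i)²/E_i = 0.2501 + 0.0979 + 0.4102 = 0.7582

Let’s compare it to the critical chi-square value for the significance level 0.05.

The degrees of freedom = 3 – 1 = 2

Using the table, the critical value for a 0.05 significance level with df = 2 is 5.99.

That means that if the null hypothesis is true, 95 samples out of 100 will produce a χ² value of 5.99 or less.

The chi-square statistic is only 0.7582, well below 5.99, so we fail to reject the null hypothesis: the data are consistent with the results of the study.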
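The goodness-of-fit calculation for this example can be checked with a short Python sketch. The counts and proportions come from the problem statement, and 5.99 is the table value quoted above; the variable names are illustrative.

```python
# Observed counts: 73 with one car, 38 with two, the rest with three or more.
observed = [73, 38, 129 - 73 - 38]
proportions = [0.60, 0.28, 0.12]  # proportions proposed under H0

total = sum(observed)
expected = [p * total for p in proportions]

# Chi-square statistic: sum of (O - E)^2 / E over the three categories.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1   # degrees of freedom = categories - 1
critical_value = 5.99    # table value for alpha = 0.05, df = 2

reject_h0 = statistic > critical_value
print(f"chi-square = {statistic:.4f}, df = {df}, reject H0: {reject_h0}")
```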

2. A certain casino game involves numbers between 1 and 32 that each have an associated color (red or black). The cross tabulation for the data is shown below.

Determine if color has any relation to evenness/oddness.

Solution: We can use hypothesis testing to determine whether such a relationship exists. Let's assume, as our null hypothesis, that color and evenness/oddness are not related, and we'll use a significance level of α = 0.05. Note that the number of degrees of freedom, n, in this case is (number of rows – 1) × (number of columns – 1) = (2 – 1) × (2 – 1) = 1.

Let's calculate the expected frequencies and place them below the observed values in the table. The expected frequency in each case is the product of the corresponding row and column totals divided by the grand total (32).
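The rule "row total × column total ÷ grand total" can be sketched in Python as follows. The cross tabulation from the original example is not reproduced in this text, so the counts below are hypothetical, chosen only so that the grand total is 32; the function name is also an assumption.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts (NOT the original table).
# Rows: red, black. Columns: even, odd.
table = [[9, 7],
         [7, 9]]
print(expected_frequencies(table))
```

With these hypothetical counts every row and column total is 16, so each expected frequency is 16 × 16 / 32 = 8.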

Now, let's calculate the values for adding into the chi-square statistic. These component values are the squared differences between the observed and expected values divided by the expected values.

We can now calculate our chi-square statistic.

From the chi-square table, we find that the critical value for one degree of freedom and 1 – α = 0.95 is 3.84. Thus, since 0.5 < 3.84 (or χ² < c), we do not reject the null hypothesis: the data give no evidence of a relationship between color and evenness/oddness.
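The whole test of independence can be put together in a short Python sketch. Because the original cross tabulation is not reproduced in this text, the counts below are hypothetical; they were chosen to sum to 32 and happen to give a statistic of 0.5, matching the value used above. The function name is an assumption, and 3.84 is the table value quoted in the text.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical counts (NOT the original table).
# Rows: red, black. Columns: even, odd.
table = [[9, 7],
         [7, 9]]

statistic, df = chi_square_independence(table)
critical_value = 3.84  # table value for alpha = 0.05, df = 1
reject_h0 = statistic > critical_value
print(f"chi-square = {statistic}, df = {df}, reject H0: {reject_h0}")
```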
