
How to compute Chi-square value and degrees of freedom in Excel?


I am trying to understand the chi-square test in Excel. I have already run the test, but I still don't know how to compute the χ² value or what it means, how to compute the degrees of freedom, or how to write a proper APA-style results section. Can anyone help me and explain this in a simple way?

I have a population of n = 76: 'Left' = 25 and 'Right' = 51. The hypothesis is that you would expect a normal distribution ('Left' = 38 and 'Right' = 38). So the question is, 'Is this distribution a coincidence?' The result is p < .029. Seeing this outcome, I concluded that it is no coincidence that the distribution is not normal, and that it must be due to some variable.

Is it true that in this case the degrees of freedom is just n-1?


This is really more of a statistical question (except perhaps the bit about APA style). As such, it probably belongs on stats.stackexchange.com.

A binary variable does not have a "normal distribution". A normal distribution is bell shaped and is relevant to continuous data.

Your null hypothesis is that the population proportions for left and right handers are equal. Thus, if your chi-square value is sufficiently large, then you might reject the null hypothesis and conclude that the population proportions are unequal.

If $k$ is the number of categories (you have two categories), then degrees of freedom for the one sample chi-square test is $k-1$ (i.e., $2-1 = 1$).

To calculate chi-square check out the example here.

The following formula in Excel should give you a p-value. For example, if your chi-square value were 22 and your degrees of freedom were 3:

=1-CHISQ.DIST(22, 3, TRUE)
  • The first argument is the chi-square value.
  • The second argument is the degrees of freedom.
  • The third argument indicates that the cumulative distribution function (CDF) is desired.
  • Thus, by taking 1 minus the value of the CDF, you get the probability of obtaining a chi-square value as large as or larger than the observed value.
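To make the calculation concrete outside of Excel, here is a minimal Python sketch applied to the counts from the question (25 'Left', 51 'Right', with 38 expected in each). It is only an illustration: the variable names are made up for this example, and the p-value line relies on the closed form for the chi-square upper tail that holds for exactly 1 degree of freedom.

```python
import math

observed = [25, 51]        # 'Left' and 'Right' counts from the question
expected = [38.0, 38.0]    # equal split of n = 76 under the null hypothesis

# Chi-square statistic: sum of (O - E)^2 / E over the categories
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# One-sample test: degrees of freedom = number of categories minus one
df = len(observed) - 1

# For df = 1 only, P(X >= x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.4f}")
```

With more than two categories the `erfc` shortcut no longer applies, and a general chi-square tail function (such as the Excel formula above) is needed instead.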

You can check your formula by looking at existing calculated tables. http://home.comcast.net/~sharov/PopEcol/tables/chisq.html



In this article, we move on to combine the two to analyze cross tabulations. This article focuses on the chi-square statistic as a way to quantify the relationship between two variables in a cross tabulation.

o Understand how to calculate the chi-square statistic for a cross tabulation

o Use the chi-square statistic to test hypotheses regarding cross tabulations

o A more in-depth discussion of cross tabulations and the chi-square statistic is available in a PDF document at http://eclectic.ss.uci.edu/

o A table of critical values for the chi-square statistic is available at http://www.itl.nist.gov/div898/handbook/eda/section3/eda3674.htm

Now that we have gained practice creating and understanding cross tabulations and have reviewed statistical hypothesis testing, we can analyze cross tabulations using a statistical approach. In this article, we consider several possible methods for determining whether the two variables in a bivariate cross tabulation are related.

To avoid making this discussion too vague, we will use an example cross tabulation to illustrate our procedure. As with any such procedure, the reader must be careful to differentiate between the general principles and the specifics of the example. We will use the following cross tabulation as our example; these data reflect the gender and handedness of a number of survey participants.

We might be able to guess, simply on the basis of inspection, that these data indicate some relationship between gender and handedness. Nevertheless, we want a statistical method for showing that such a conjecture is warranted (that is, statistically significant). To this end, we introduce the chi-square statistic.

Our first step, following the hypothesis testing procedure, is to formulate a null hypothesis, which we will call H0. For our example, we'll say that H0 is "gender is not related to handedness."

The alternative hypothesis is then simply "gender is related to handedness." The second step of the hypothesis testing procedure is to choose a significance level; let's simply select α = 0.05, which is a common value. We are now ready to calculate a test statistic; in this case, we'll use the chi-square statistic. The procedure for calculating this statistic is outlined as follows.

First, we must calculate the expected frequencies, which are the counts we would expect in each data cell if the two variables were unrelated, given the values in the total cells. Consider the case of left-handed males: out of 1,236 participants in the survey, 628 were male, and 341 were left handed. The fraction of males, $r_m$, is

$r_m = 628 / 1236 \approx 0.508$

Thus, we would expect that this ratio multiplied by the number of left-handed participants (341) should yield the number of left-handed males, or $f_{lm}$:

$f_{lm} = r_m \times 341 \approx 173.3$

Note that the same logic works if we reverse the order of multiplication and first calculate the ratio of left-handed people to the total number of participants and then multiply by the total number of males. In either case, the expected frequency for a given data cell is the product of its corresponding row total and its corresponding column total divided by the grand total. Let's then calculate all the expected frequencies, placing them just below the actual values in each data cell.
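As a rough illustration of the row-total times column-total rule, the following Python sketch computes all four expected frequencies from the totals quoted in the text. The female and right-handed totals are not stated explicitly; they are derived here by subtraction from the grand total, which is an assumption of this sketch, and the dictionary names are purely illustrative.

```python
# Totals quoted in the text; the complementary totals are derived by subtraction.
grand_total = 1236
row_totals = {"male": 628, "female": grand_total - 628}
col_totals = {"left": 341, "right": grand_total - 341}

# Expected frequency = (row total * column total) / grand total
expected = {
    (gender, hand): row_totals[gender] * col_totals[hand] / grand_total
    for gender in row_totals
    for hand in col_totals
}

for cell, value in expected.items():
    print(cell, round(value, 2))
```

The expected frequencies always add up to the grand total, which is a quick sanity check on the arithmetic.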

Now, we must decide how we can use these expected frequencies to calculate a statistic that helps us determine if a relationship between gender and handedness exists. Such a statistic might involve the differences between the "observed" values (the actual data) and the "expected" values (which we calculated above). But because the sign of the difference is not important, we will square this difference. Furthermore, let's divide each squared difference by its corresponding expected value; this creates something like a proportion rather than a full difference value. Thus, we now create a new table containing these newly calculated values. For left-handed males, with $O_{lm}$ denoting the observed count, we calculate the following:

$(O_{lm} - f_{lm})^2 / f_{lm}$

If we add all of these values, we have something of an aggregate measure of how the observed data values deviate from the expected values; this is the chi-square statistic, which we label χ².

We now have a test statistic and its corresponding value for this data set. Our final task is to determine the critical value for this statistic and to determine whether our test statistic value exceeds this critical value. First, recall that we chose 0.05 for our α value. This is a measure of what constitutes a statistically significant deviation. Specifically, α is the probability that the test statistic exceeds the critical value when the null hypothesis is true; thus, the smaller the α value that we choose, the less likely we are to reject a null hypothesis that is actually true. Using basic probability theory, we can then construct the following equation:

$P(X > c) = \alpha$

This simply states that the probability that our test statistic X exceeds the critical value c is α. Also,

$P(X \le c) = 1 - \alpha$

This equation is typically what is used to construct tables of values (for the chi-square statistic, for instance). Thus, we use the value 1 – α = 0.95. To find the critical value, the best approach is usually to consult a table of values. Such tables are often available in standard statistics texts as well as online.

To use the table, we must also know the number of degrees of freedom of our data (often represented using the variable n). The number of degrees of freedom is the number of cell values that must be specified before the remainder are determined by the row and column totals (which we used to calculate expected frequencies, for instance). This number is equal to the product of the number of variable rows minus one and the number of variable columns minus one. In our example, each variable has two possible values, leading to two variable rows and two variable columns. Subtracting one from each and calculating the product, we get (2 – 1)(2 – 1) = 1. This is the number of degrees of freedom.

We can now consult the table to determine the critical value for the example data. We find from the table that c = 3.84. Note that the value of our test statistic, X = χ² = 4.85, exceeds c. Thus, at the α = 0.05 significance level (a confidence level of 100% times 1 – α, or 95%), we can reject the null hypothesis and conclude that according to our data, handedness is related to gender. Note that the null hypothesis was carefully chosen: the assumption was that no relationship between the variables existed. In other words, the expected values were assumed to be close to (or equal to) the observed values, so that if the squared differences became large, our test statistic would exceed the critical value and cause us to reject our initial assumption.

The following practice problem provides the opportunity to practice calculating the chi-square statistic.

Practice Problem: A certain casino game involves numbers between 1 and 32 that each have an associated color (red or black). The cross tabulation for the data is shown below.




Calculating The Critical Chi Square Value By Hand

While a critical value calculator can give you this number directly, we believe that it is important to know how to determine the value by hand. All you need to do is follow a few steps.

Let’s imagine that you want to run an experiment at an agricultural firm. Say that the company wants to know if there is a link between cross strains of plants (hybrids) and the unwanted or unexpected plants (number of deviations) that can show up.

This specific firm is crossing two types of corn: yellow and blue. Most biologists tend to agree that deviations that would arise by chance with a probability of more than 5% are not statistically significant.

So, to solve this problem, we will need to determine the critical chi square value.

Step #1: Determine the number of degrees of freedom

The first thing that you need to do to determine the critical chi square value is to determine the number of degrees of freedom. When the answer isn’t in the question that was provided then the degrees of freedom will be equal to the number of classes or categories minus 1. If you remember, the company crosses yellow and blue corn. So, this means that you have 2 categories. This means that:

Degrees Of Freedom = 2 – 1 = 1

Step #2: Determine the probability that the situation you are investigating would happen by chance

Now, in this step, you will need to know the probability, which is usually stated in the question as well. Going back to our example, you will immediately see that the probability is 5% or 0.05.

Step #3: Look up the degrees of freedom and the probability in the chi square table

All you need to do is find the value for 1 degree of freedom and a probability of 0.05 in the chi square table. This number is 3.84. So, this is your critical value.
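If no table is at hand, the lookup can be approximated numerically. The sketch below is one possible approach, not a standard routine: `chi2_sf` and `chi2_critical` are hypothetical helper names, the tail probability uses the closed-form finite sums that exist for whole-number degrees of freedom, and the critical value is found by simple bisection.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus terms (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (k + 0.5)
    return total

def chi2_critical(alpha, df, lo=0.0, hi=200.0):
    """Find c with P(X > c) = alpha by bisection (the tail is decreasing in c)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Corn example: 1 degree of freedom, probability 0.05
print(round(chi2_critical(0.05, 1), 2))
```

The search interval (0 to 200) is an arbitrary choice that comfortably covers common table entries; it would need widening for very large degrees of freedom.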


How to Calculate a Chi-square

The chi-square value is determined using the formula below, summed over all categories:

χ² = Σ (observed value − expected value)² / expected value

Returning to our example, before the test, you had anticipated that 25% of the students in the class would achieve a score of 5. As such, you expected that 25 of the 100 students would achieve a score of 5. However, in reality, 30 students achieved a score of 5. As such, the contribution of this category to the chi-square is as follows:

(30 − 25)² / 25 = (5)² / 25 = 25 / 25 = 1

The remaining categories contribute their own terms in the same way, and the chi-square value is the total.



A $\chi^2$ test with 3 degrees of freedom has significance level .10. Find the critical value.

A researcher wants to know whether responses to a statement (strongly agree, agree, no opinion, disagree, strongly disagree) are dependent on the gender of the interviewer. Which test should we use? Find the null hypothesis and the critical value at $\alpha=.01$.

An 8-sided die is rolled 200 times in order to test whether the die is fair. Which test should we use? Find the null hypothesis and check the assumptions for the test. Find the critical value at $\alpha=.05$.

Dr. Penta claims to have designed a five-sided die that is equally likely to land on sides 1 through 4, but lands the fifth side $40\%$ of the time.

  1. What kind of test should be used to test the claim? Write the null hypothesis.
  2. How large a sample is needed for the assumptions of the appropriate hypothesis test to be met?

You and a friend are munching on a bag of Harvest Blend M&M's, when your friend says, "There seems to be more yellow and brown candies than red and maroon candies. In fact, I claim there are $30\%$ yellow, $30\%$ brown, and only $20\%$ red and $20\%$ maroon." Together you count the remaining M&M's in the bag with the results below. Use the critical value method with significance level 0.05 to test your friend's claim.

$\begin{array}{c|c|c|c|c|c} &\hbox{Yellow}&\hbox{Brown}&\hbox{Red}&\hbox{Maroon}&\hbox{Total}\\ \hline &58&61&55&46&220 \end{array}$

Test statistic: $\chi^2=4.189$
Critical value: $7.815$

Conclusion: Fail to reject the null hypothesis because the test statistic is not in the rejection region.

Inference: There is not enough evidence to reject the claim that there are $30\%$ yellow, $30\%$ brown, $20\%$ red, and $20\%$ maroon M&M's.
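The goodness-of-fit arithmetic in this solution can be checked with a few lines of Python. This is only a sketch: the counts are taken in the order yellow, brown, red, maroon (an assumption about the table's column order that does reproduce the stated test statistic), and the critical value is copied from the solution rather than computed.

```python
observed = [58, 61, 55, 46]            # yellow, brown, red, maroon (assumed order)
claimed = [0.30, 0.30, 0.20, 0.20]     # proportions in the friend's claim

n = sum(observed)                      # 220 candies in total
expected = [p * n for p in claimed]    # expected counts under the claim

# Chi-square statistic: sum of (O - E)^2 / E over the four colors
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 7.815                 # from the solution: df = 3, alpha = 0.05
reject = chi_square > critical_value

print(f"chi-square = {chi_square:.3f}, reject H0: {reject}")
```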

A sample of coin flips is collected from three different coins. The results are below. Use one hypothesis test to test the claim that all three coins have the same probability of landing heads. Use the critical value method with significance level 0.10.

$\begin{array}{c|c|c|c} &\hbox{Coin A}&\hbox{Coin B}&\hbox{Coin C}\\ \hline &88&93&110 \\ \hline &112&107&90 \end{array}$

$H_0: p_A=p_B=p_C $ (or all three coins have the same probability of landing heads.)

Test statistic: $\chi^2=5.325$
Critical value: $4.605$

Conclusion: Reject the null hypothesis because the test statistic is in the rejection region.

Inference: There is enough evidence to reject the claim that all three coins have the same probability of landing heads.
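For a table with several rows and columns, the same statistic can be computed with a small helper. The function below is a hypothetical sketch (its name and structure are not from the source); it derives expected counts from the row and column totals and returns the statistic together with the degrees of freedom.

```python
def chi_square_table(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count = row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Coin data from the exercise: two outcome rows, three coin columns
coins = [[88, 93, 110],
         [112, 107, 90]]
statistic, df = chi_square_table(coins)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

Swapping the two rows does not change the result, so the calculation does not depend on which row holds heads and which holds tails.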

An advertising agency conducted a random survey of adults asking their primary source of news and educational level.

The advertising company wants to test whether there is a relationship between the 3 educational levels and the 3 primary news sources. Find the null hypothesis and degrees of freedom for the test. Show that the assumptions for the test are met for the category: "Newspapers/Not High School Graduate".

Test the claim that among college graduates, their primary news source is equally divided among newspapers, television, and the internet. Use the critical value method with significance level 0.05.

$H_0:$ The primary news source and educational level are independent.

Test statistic: $\chi^2=7.557$
Critical value: $5.991$

Conclusion: Reject the null hypothesis because the test statistic is in the rejection region.

Inference: There is enough evidence to reject the claim that among college graduates, their primary news source is equally divided among newspapers, television, and the internet.

A school nurse wants to determine whether age is a factor in whether children choose a healthy snack after school. She conducts a survey of 300 middle school students, with the results below. Test at $\alpha=.05$ the claim that the proportion who choose a healthy snack differs by grade level. Use the critical value method.

$\begin{array}{c|c|c|c} &\hbox{6th grade}&\hbox{7th grade}&\hbox{8th grade}\cr\hline &31 &43 &51 \cr\hline &69 & 57 & 49 \end{array}$

Assumptions: $41.7, 58.3\geq 5$

$H_0: p_6=p_7=p_8 $ (equivalently: proportions who choose a healthy snack are the same for all three grade levels.)

Test statistic: $\chi^2=8.337$
Critical value: $ 5.991$

Conclusion: Reject the null hypothesis because the test statistic is in the rejection region.

Inference: There is enough evidence to support the claim that the proportion who choose a healthy snack differs by grade level.

A survey asked adults nationwide if they thought that the federal government should continue to fund unmanned missions to Mars. Fifty-six percent said they should continue, $40\%$ said they should not continue, and $4\%$ had no opinion. A random sample of 200 college students resulted in the numbers below. At significance level 0.05, test the claim that the opinions of college students on this issue differ from those of the nation as a whole.

$\begin{array}{c|c|c} \hbox{Continue}&\hbox{Not continue}&\hbox{No opinion}\cr\hline 126&65&9 \end{array}$

Assumptions: $112, 80, 8\geq 5$

$H_0: p_{\text{continue}}=.56, p_{\text{not continue}}=.40, p_{\text{no opinion}}=.04$ (equivalently: the opinions of college students are the same as the nation as a whole.)

Test statistic: $\chi^2=4.688$
Critical value: $5.991$

Conclusion: Fail to reject the null hypothesis because the test statistic is not in the rejection region.

Inference: There is not enough evidence to support the claim that the opinions of college students on this issue differ from those of the nation as a whole.

To test the claim that snack choices are related to the gender of the consumer, a survey at a ball park shows this selection of snacks purchased. Write the null hypothesis and check the assumptions. Do not do the rest of the hypothesis test.

$\begin{array}{c|c|c|c} &\hbox{Hotdog} & & \cr\hline \hbox{Male}&6&12&9\cr\hline \hbox{Female}&5&5&8 \end{array}$

$H_0: $ Snack choice and gender are independent.

Assumptions: $6.6, 10.2, 6.8 \geq 5$. But for the category Hotdog/Female, $E=4.4 < 5$, so the assumptions are not met.

$H_0:$ Student's drinking habits and the number of classes missed are independent.

d.f. $=2$, C.V. $=5.991$
The test statistic is $\chi^2=2672$.

We reject $H_0$. There is incredibly significant evidence that the proportion of missed classes is related to one's drinking habit.

$\widehat{p}=\dfrac{446}{11160}\approx 0.04 \qquad \widehat{q}=0.96$

$\alpha/2=0.005 \Longrightarrow z_{\alpha/2}=2.575$

Confidence interval: $(0.0352,0.0448)$

We are $99\%$ confident that the proportion of non-binger students who missed classes is between 0.0352 and 0.0448.

$\widehat{p}=0.1834 \qquad \widehat{q}=0.8166$ $\alpha/2=0.01$ $n=\widehat{p}\,\widehat{q}\left({z_{\alpha/2} \over E}\right)^2=0.1834 \cdot 0.8166 \left({2.33 \over 0.05}\right)^2\approx 326$
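The confidence interval and sample-size figures in this solution can be reproduced with the short Python sketch below. It assumes the usual large-sample formulas for a proportion (margin of error z·sqrt(p̂q̂/n), and n = p̂q̂(z/E)² rounded up); the variable names are illustrative, and the rounded estimate 0.04 is used because that is what the stated interval corresponds to.

```python
import math

# 99% confidence interval for a proportion, using the rounded estimate 0.04
p_hat = 0.04
n = 11160
z = 2.575                                   # z value for alpha/2 = 0.005
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
interval = (p_hat - margin, p_hat + margin)
print(tuple(round(b, 4) for b in interval))

# Sample size needed for margin of error E, rounded up to a whole subject
p2 = 0.1834
z2 = 2.33                                   # z value for alpha/2 = 0.01
E = 0.05
n_needed = math.ceil(p2 * (1 - p2) * (z2 / E) ** 2)
print(n_needed)
```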

A game where colored marbles are drawn out of a bag with replacement has three possible outcomes: red, green, and blue. The game is played 100 times with the results shown below. Using $\alpha= 0.05$, test the claim that the probabilities for each outcome are as follows: P(red) = .40, P(green) = .35, and P(blue) = .25.

$\begin{array}{c|c|c|c} &\hbox{Red} &\hbox{Green} &\hbox{Blue}\cr\hline & 32& 45& 23 \end{array}$

Assumptions: $40, 35, 25 \geq 5$

Test statistic: $\chi^2=4.617$
Critical value: $5.991$

Conclusion: Fail to reject the null hypothesis because the test statistic is not in the rejection region.

Inference: There is not enough evidence to reject the claim that the probabilities for each outcome are P(red) = .40, P(green) = .35, and P(blue) = .25.

Using the data below, test the claim that there is no difference in the color preferences of men and women. Use $\alpha = .05$.

$\begin{array}{c|c|c|c} & & & \cr \hline \hbox{Men}& 21&34& 45\cr \hbox{Women}& 36 &33&31 \end{array}$

Assumptions: $28.5, 33.5, 38 \geq 5$

$H_0:$ There is no difference in the color preferences of men and women.

Test statistic: $\chi^2=6.54$
Critical value: $5.991$

Conclusion: Reject the null hypothesis because the test statistic is in the rejection region.

Inference: There is enough evidence to reject the claim that there is no difference in the color preferences of men and women.

A researcher wishes to see if the five ways (drinking caffeinated beverages, taking a nap, going for a walk, eating a sugary snack, other) people use to combat midday drowsiness are equally distributed among office workers. A sample of 60 office workers is selected, and the following data are obtained. At .10 significance level can it be concluded that there is no preference?

$\begin{array}{c|c|c|c|c|c} & \textrm{Caffeinated beverage} & \textrm{Nap} & \textrm{Walk} & \textrm{Sugary snack} & \textrm{Other}\\ \hline & 21 & 16 & 10 & 8 & 5 \end{array}$

If there is no preference, then all are equally likely. As there are 5 categories, the expectation is that they all occur with probability $.20$.

$H_0$: There is no preference for a way to combat midday drowsiness

Test statistic: $\chi^2 = 13.83$
Critical value: $7.779$

Conclusion: Reject the null hypothesis as the test statistic is in the rejection region.

Inference: There is significant evidence that the 5 methods to combat midday drowsiness are not all equally likely.

Nationwide, the shares of carbon emissions for the year 2000 are transportation, 33%; industry, 30%; residential, 20%; and commercial, 17%. A state hazardous materials official wants to see if her state is the same. Her study of 300 emissions sources finds transportation, 36%; industry, 31%; residential, 17%; and commercial, 16%. At a 0.05 significance level, can she claim the percentages are the same?

$H_0$: The percentages are the same

Assumptions met as all calculated expected counts are $\ge 5$:

$\displaystyle{\begin{array}{rl} \textrm{Transportation} & (0.33)(300) = 99 \ge 5\\ \textrm{Industry} & (0.30)(300) = 90 \ge 5\\ \textrm{Residential} & (0.20)(300) = 60 \ge 5\\ \textrm{Commercial} & (0.17)(300) = 51 \ge 5 \end{array}}$

We must similarly calculate the observed counts to find the test statistic:

Conclusion: Fail to reject the null hypothesis as the test statistic was not in the rejection region.

Inference: There is no significant evidence that the state percentages are not the same as the national percentages.
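The observed counts and test statistic for this problem can be worked out along the following lines. This is a sketch under stated assumptions: the observed counts are obtained by multiplying the state percentages by the 300 sources, and the critical value 7.815 is the standard table entry for 3 degrees of freedom at the 0.05 level rather than a number quoted in this solution.

```python
n = 300
national = [0.33, 0.30, 0.20, 0.17]   # transportation, industry, residential, commercial
state = [0.36, 0.31, 0.17, 0.16]      # shares found in the state study

expected = [p * n for p in national]          # 99, 90, 60, 51
observed = [round(p * n) for p in state]      # 108, 93, 51, 48

# Chi-square statistic: sum of (O - E)^2 / E over the four sectors
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1
critical_value = 7.815                # table value for df = 3, alpha = 0.05
reject = chi_square > critical_value

print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject}")
```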

A study is conducted as to whether there is a relationship between joggers and the frequency of consumption of nutritional supplements. A random sample of 210 subjects is selected, and they are classified as shown. At a 0.05 significance level, test the claim that jogging and the consumption of supplements are not related.

$\begin{array}{c|c|c|c} & & & \\ \hline & 34 & 52 & 23 \\ & 18 & 65 & 18 \end{array}$

$H_0$: jogging and the consumption of supplements are not related.

Assumptions are met. (all $E \ge 5$)

Test statistic: $\chi^2 = 6.68$
Critical value: 5.991

Conclusion: Reject the null hypothesis as the test statistic is in the rejection region

Inference: There is significant evidence that jogging and the consumption of supplements are related.

An advertising firm has decided to ask 92 customers at each of three local shopping malls if they are willing to take part in a market research survey. According to previous studies, 38% of Americans refuse to take part in such surveys. The results are shown here. At a 0.01 significance level, test the claim that the proportions of those who are willing to participate are equal.

$\begin{array}{c|c|c|c} & & & \\ \hline & 52 & 45 & 36 \\ & 40 & 47 & 56 \end{array}$

$H_0$: the proportions of those who are willing to participate are equal among the 3 malls

We need to first calculate the expected counts using the marginal totals:

$\begin{array}{c|c|c|c|c} & & & & \textrm{Total}\\ \hline & 52 & 45 & 36 & 133 \\ & 40 & 47 & 56 & 143\\ \hline \textrm{Total} & 92 & 92 & 92 & 276 \end{array}$

Then the expected counts are given by:

Assumptions are met (all $E \ge 5$)

Conclusion: Fail to reject the null hypothesis as the test statistic is not in the rejection region.

Inference: There is no significant evidence that the proportions who participate are not the same in all three locations.
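The expected counts and test statistic for the mall data can be sketched in Python as follows. The layout (two response rows, three mall columns) follows the table of marginal totals; the critical value 9.210 is the standard table entry for 2 degrees of freedom at the 0.01 level and is an addition of this sketch, not a figure from the solution.

```python
observed = [[52, 45, 36],
            [40, 47, 56]]

row_totals = [sum(row) for row in observed]          # 133 and 143
col_totals = [sum(col) for col in zip(*observed)]    # 92 for each mall
grand_total = sum(row_totals)                        # 276

# Expected count for each cell = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 9.210            # table value for df = 2, alpha = 0.01
reject = chi_square > critical_value

for row in expected:
    print([round(e, 2) for e in row])
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject}")
```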

A researcher wishes to see if the proportions of workers for each type of job have changed during the last 10 years. A sample of 100 workers is selected, and the results are shown. At a 0.05 significance level, test the claim that the proportions have not changed.

$\begin{array}{c|c|c|c|c} & & & & \textrm{Other}\\ \hline \textrm{10 years ago}& 33 & 13 & 11 & 3 \\ & 18 & 12 & 8 & 2 \end{array}$

$H_0$: the proportions have not changed

Assumptions are not met. The expected count in the "Other" category is $3 \not\ge 5$.

One should not proceed with a chi-square goodness of fit test.

Test the claim that births are uniformly distributed among the months (i.e., one twelfth of the number of births occur on average in any one month), using the following data collected over the course of one year.

$\begin{array}{c|c||c|c} & 34 & & 36 \\ & 31 & & 38 \\ & 35 & & 37 \\ & 32 & & 36 \\ & 35 & & 35 \\ & 35 & & 35 \end{array}$

$H_0$: births are uniformly distributed among the months

$419$ births uniformly distributed would create an expectation of 34.916 births in each month.

Assumptions met: $34.916 \ge 5$.

Test statistic: $\chi^2 = 1.1718$
Critical value: 19.675

Conclusion: Fail to reject the null hypothesis as the test statistic is not in the rejection region.

Inference: There is no significant evidence that the births are not uniformly distributed among the months.

Based on the following data from the doomed voyage of the Titanic, decide if the chances that a randomly selected passenger survived was independent of their status.

$\begin{array}{c|c|c|c|c|c} & \textrm{Crew} & \textrm{1st Class} & \textrm{2nd Class} & \textrm{3rd Class} & \textrm{Total}\\ \hline \textrm{Survived} & 212 & 202 & 118 & 178 & 710 \\ \textrm{Did not survive} & 673 & 123 & 167 & 528 & 1491\\ \hline \textrm{Total} & 885 & 325 & 285 & 706 & 2201 \end{array}$

$H_0$: The chances that a randomly selected passenger survived was independent of their status

Assumptions met as calculated expectations below are all $\ge 5$:

Test statistic: $\chi^2 = 187.79$
Critical Value: degrees of freedom $(4-1)(2-1) = 3$ and $\alpha = 0.05$ (default) tells us the critical value is 7.815.

Conclusion: Reject the null hypothesis as the test statistic is in the rejection region.

Inference: There is evidence that passenger's survival is related to their status.

Decide if the proportions of Democrats, Republicans, and Independents are the same for both men and women, based on the following sample data.

$\begin{array}{c|c|c|c} & \textrm{Democrats} & \textrm{Republicans} & \textrm{Independents}\\ \hline \textrm{Men} & 36 & 45 & 24 \\ \textrm{Women} & 48 & 33 & 16 \end{array}$

$H_0$: The proportions of Democrats, Republicans, and Independents are the same for both men and women

Assumptions met as calculated expectations below are all $\ge 5$:

Test statistic: $\chi^2 = 4.8512$
Critical Value: $5.991$ (at default $\alpha = 0.05$)

Conclusion: Fail to reject the null hypothesis, as the test statistic is not in the rejection region.

Inference: There is no significant evidence that the proportions of Democrats, Republican, and Independents are different for men and women.

It is a common belief that more fatal car crashes occur on certain days of the week, such as Friday or Saturday. A sample of motor vehicle deaths is randomly selected for a recent year. The number of fatalities for the different days of the week are listed below. At the $.05$ significance level, test the claim that accidents occur with equal frequency on the different days. State the null hypothesis, test statistic, critical value, your conclusion and interpretation.

$\begin{array}{c|c|c|c|c|c|c|c} & & & & & & & \\ \hline & 31 & 20 & 20 & 22 & 22 & 29 & 26\\ \hline \end{array}$

In a study of drug abuse in a local high school, the school board selected 100 eighth graders, 100 sophomores and 100 seniors randomly from their respective rolls for each grade. Each student was then asked if they used a particular drug frequently, seldom or never. The data are summarized in the table given below. Is there evidence to suggest that the frequency of drug use is the same across the three different grades? State the null hypothesis, give the test statistic, test criterion, conclusion, and interpretation.

$\begin{array}{c|c|c|c} & \textrm{Frequently} & \textrm{Seldom} & \textrm{Never}\\ \hline \textrm{8th Grade}& 15 & 30 & 55\\ \hline \textrm{Sophomores} & 20 & 35 & 45\\ \hline \textrm{Seniors} & 25 & 35 & 40\\ \hline \end{array}$

In an experiment on extrasensory perception, subjects were asked to identify the month showing on a calendar in the next room. If the results were as shown, test the claim that months were selected with equal frequencies. Assume a significance level of $.05$. If it appears that the months were not selected with equal frequencies, is the claim that the subjects have extrasensory perception supported?

$\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|} \textrm{Jan} & \textrm{Feb} & \textrm{Mar} & \textrm{Apr} & \textrm{May} & \textrm{Jun} & \textrm{Jul} & \textrm{Aug} & \textrm{Sep} & \textrm{Oct} & \textrm{Nov} & \textrm{Dec}\\ \hline 23 & 21 & 35 & 31 & 22 & 41 & 12 & 14 & 10 & 26 & 30 & 24\\ \hline \end{array}$

You suspect that a die is unfair. You roll it 60 times and get the following results:

$\begin{array}{c|c|c|c|c|c|c} & 1 & 2 & 3 & 4 & 5 & 6\\ \hline & 10 & 12 & 14 & 8 & 12 & 4\\ \hline \end{array}$

Determine if the above distribution is significantly different from the expected distribution assuming that the die is fair.

Students at Oxford were asked to indicate their agreement with the following statement: "I find mathematics challenging but I am able to make a good grade." Is there a difference in the distributions of responses between males and females? Students responded as follows:

$\begin{array}{c|c|c|c|c} & & & & \textrm{Total}\\ \hline \textrm{Males} & 75 & 10 & 85 & 170\\ \hline \textrm{Females} & 121 & 8 & 51 & 180\\ \hline \end{array}$

Give the null hypothesis, test statistic, critical value at an appropriate alpha level, conclusion, and interpretation.

Students were asked to respond to the following statement: "Participating in study groups is an effective way to study for some courses." Is there a significant difference in the responses of freshmen and sophomores? Show appropriate hypothesis testing responses.

$\begin{array}{c|c|c|c} & & & \\ \hline \textrm{Freshmen} & 34 & 21 & 35\\ \hline \textrm{Sophomores} & 54 & 12 & 29\\ \hline \end{array}$

A pair of dice was rolled 500 times. The sums that occurred were as recorded in the following table. Test whether the dice seem fair based on this data. For example, $P(2, 3, \textrm{ or } 4) = 1/6$ and the sums $2$, $3$, and $4$ occurred a total of $74$ times. Since the dice were rolled $500$ times, one would expect $83.3$ ($500 \times 1/6 \approx 83.3$) occurrences of rolling a $2$, $3$, or $4$, so $83.3$ is the expected value.

$\begin{array}{c|c|c|c|c|c} & 2,3,4 & 5,6 & 7 & 8,9 & 10,11,12\\ \hline & 74 & 120 & 83 & 135 & 88\\ \hline \end{array}$

Now rework this problem using the actual observed values for each sum:

$\begin{array}{c|c|c|c|c|c|c|c|c|c|c|c} & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12\\ \hline & 12 & 26 & 36 & 58 & 62 & 83 & 102 & 33 & 20 & 9 & 59\\ \hline \end{array}$

Did you find that testing the die this way was significant? Which way would be the best for determining if a die were fair?


How Do You Calculate Degrees of Freedom for Chi Square Tests?

To calculate the degrees of freedom for a chi-square test of independence, first create a contingency table and count its rows and columns. Take the number of rows minus one and multiply that number by the number of columns minus one. The resulting figure is the degrees of freedom for the test.

Create a contingency table with two categorical variables: one represented in the rows and the other represented in the columns. A researcher who wants to compare counts across two categorical variables creates a contingency table in which one variable represents the columns and the other represents the rows. Categorical variables take labels rather than numeric measurements. Gender, categorized as male or female, is an example of a categorical variable.

Count the total number of rows in the contingency table. For example, if gender is a variable, there are two rows: one for male and one for female. Count the number of columns in the contingency table. Subtract one from the number of rows and one from the number of columns.

Multiply the two numbers that you generated in the previous step. The result of this operation is the number of degrees of freedom.
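
The rule above can be sketched in a few lines of Python. The function names are illustrative and do not come from any library. The second helper covers the one-sample case with a single categorical variable, where the degrees of freedom are the number of categories minus one.

```python
def contingency_df(n_rows: int, n_cols: int) -> int:
    """Degrees of freedom for a chi-square test on an r x c contingency table.

    Count only the category rows and columns, not any 'Total' row or column.
    """
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)


def goodness_of_fit_df(n_categories: int) -> int:
    """Degrees of freedom for a one-sample chi-square test with k categories."""
    if n_categories < 2:
        raise ValueError("need at least 2 categories")
    return n_categories - 1


print(contingency_df(2, 2))    # 2 x 2 table
print(contingency_df(2, 4))    # 2 x 4 table
print(goodness_of_fit_df(2))   # one variable with two categories
```

A 2 x 2 table therefore has 1 degree of freedom and a 2 x 4 table has 3.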


Each F ratio is computed by dividing the MS value by another MS value. The MS value for the denominator depends on the experimental design.

  • For two-way ANOVA with no repeated measures: The denominator MS value is always the MSresidual.
  • For two-way ANOVA with repeated measures in one factor (p 596 of Maxwell and Delaney):
    • For interaction, the denominator MS is MSresidual
    • For the factor that is not repeated measures, the denominator MS is MSsubjects
    • For the factor that is repeated measures, the denominator MS is MSresidual
    • For Row Factor, the denominator MS is for Interaction of Row factor x Subjects
    • For Column Factor, the denominator MS is for Interaction of Column factor x Subjects
    • For the Interaction:Row Factor x Column Factor, the denominator MS is for Residuals (also called the interaction of Row x Column x Subjects)

    Assumptions of the Chi-square

    As with parametric tests, the non-parametric tests, including the χ², assume the data were obtained through random selection. However, it is not uncommon to find inferential statistics used when data are from convenience samples rather than random samples. (To have confidence in the results when the random sampling assumption is violated, several replication studies should be performed with essentially the same result obtained). Each non-parametric test has its own specific assumptions as well. The assumptions of the Chi-square include:

    1. The data in the cells should be frequencies, or counts of cases rather than percentages or some other transformation of the data.

    2. The levels (or categories) of the variables are mutually exclusive. That is, a particular subject fits into one and only one level of each of the variables.

    3. Each subject may contribute data to one and only one cell in the χ². If, for example, the same subjects are tested over time such that the comparisons are of the same subjects at Time 1, Time 2, Time 3, etc., then χ² may not be used.

    4. The study groups must be independent. This means that a different test must be used if the two groups are related. For example, a different test must be used if the researcher’s data consists of paired samples, such as in studies in which a parent is paired with his or her child.

    5. There are 2 variables, and both are measured as categories, usually at the nominal level. However, data may be ordinal data. Interval or ratio data that have been collapsed into ordinal categories may also be used. While Chi-square has no rule about limiting the number of cells (by limiting the number of categories for each variable), a very large number of cells (over 20) can make it difficult to meet assumption #6 below, and to interpret the meaning of the results.

    6. The value of the cell expecteds should be 5 or more in at least 80% of the cells, and no cell should have an expected of less than one (3). This assumption is most likely to be met if the sample size equals at least the number of cells multiplied by 5. Essentially, this assumption specifies the number of cases (sample size) needed to use the χ² for any number of cells in that χ². This requirement will be fully explained in the example of the calculation of the statistic in the case study example.


    Degrees of freedom for Chi-squared test

    I am facing the following dilemma. I am aware of how to handle the one-sided Chi-squared distribution, but I am falling victim to how to handle degrees of freedom. Let me clarify with an example what I mean.

    I have the following observed and expected values:

    My question is: Since this is a one-sided chi-square test, are the degrees of freedom counted by the formula (columns-1)(rows-1), in which case I would have $(6-1)(2-1) = 5$?

    Or is it really just country1, country2, and country3 that matter, so that d.f. would be 3-1=2?

    Or, because d.f. is usually defined from the number of terms in the chi-squared sum (6 here), should we subtract 1 from that?

    Please help me out with this one.


    How to Calculate a Chi-square

    The chi-square value is determined using the formula below:

    χ² = Σ (observed value − expected value)² / expected value

    where the sum runs over all categories. Returning to our example, before the test, you had anticipated that 25% of the students in the class would achieve a score of 5. As such, you expected 25 of the 100 students would achieve a grade 5. However, in reality, 30 students achieved a score of 5. As such, the term that this category contributes to the chi-square value is:

    (30 − 25)² / 25 = (5)² / 25 = 25 / 25 = 1

    The terms for the remaining categories are computed the same way and added to this one.
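
A minimal Python sketch of this formula follows. The function name is illustrative. The second category (the 70 students who did not score 5, against an expected 75) is an assumption added here only so the sum has more than one term; it does not appear in the original example.

```python
def chi_square_terms(observed, expected):
    """Return the per-category terms (O - E)^2 / E of the chi-square statistic."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# First entry: students scoring 5 (from the text).
# Second entry: all other students, an assumed complement of the 100 students.
observed = [30, 70]
expected = [25, 75]

terms = chi_square_terms(observed, expected)
statistic = sum(terms)

print(terms[0])    # contribution of the "score of 5" category
print(statistic)   # full statistic under the assumed two-category split
```

The first term reproduces the value 1 worked out above. Under the assumed two-category split, the second term is 25 / 75, so the sum is about 1.33.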




      In this article, we move on to combine the two to analyze cross tabulations. This article focuses on the chi-square statistic as a way to quantify the relationship between two variables in a cross tabulation.

      o Understand how to calculate the chi-square statistic for a cross tabulation

      o Use the chi-square statistic to test hypotheses regarding cross tabulations

      o A more in-depth discussion of cross tabulations and the chi-square statistic is available in a PDF document at http://eclectic.ss.uci.edu/

      o A table of critical values for the chi-square statistic is available at http://www.itl.nist.gov/div898/handbook/eda/section3/eda3674.htm

      Having gained practice creating and understanding cross tabulations and having reviewed statistical hypothesis testing, we can now analyze cross tabulations using a statistical approach. In this article, we consider several possible methods for determining whether the two variables in a bivariate cross tabulation are related.

      To avoid making this discussion too vague, we will use an example cross tabulation to illustrate our procedure. As with any such procedure, the reader must be careful to differentiate between the general principles and the specifics of the example. We will use the following cross tabulation as our example; these data reflect the gender and handedness of a number of survey participants.

      We might be able to guess, simply on the basis of inspection, that these data indicate some relationship between gender and handedness. Nevertheless, we want a statistical method for showing that such a conjecture is warranted (that is, statistically significant). To this end, we introduce the chi-square statistic.

      Our first step, following the hypothesis testing procedure, is to formulate a null hypothesis, which we will call H0. For our example, we'll say that

      H0: gender is not related to handedness.

      The alternative hypothesis is then simply "gender is related to handedness." The second step of the hypothesis testing procedure is to choose a significance level; let's simply select α = 0.05, which is a common value. We are now ready to calculate a test statistic; in this case, we'll use the chi-square statistic. The procedure for calculating this statistic is outlined as follows.

      First, we must calculate the expected frequencies, which are the counts we would expect in each data cell if the null hypothesis were true, given the values in the total cells. Consider the case of left-handed males: out of 1,236 participants in the survey, 628 were male, and 341 were left handed. The fraction of males, rm, is

      rm = 628 / 1,236 ≈ 0.508

      Thus, we would expect that this ratio multiplied by the number of left-handed participants (341) should yield the number of left-handed males, or flm.

      flm = rm × 341 ≈ 173.3

      Note that the same logic works if we reverse the order of multiplication and first calculate the ratio of left-handed people to the total number of participants and then multiply by the total number of males. In either case, the expected frequency for a given data cell is the product of its corresponding row total and its corresponding column total divided by the grand total. Let's then calculate all the expected frequencies, placing them just below the actual values in each data cell.
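
The row-total times column-total rule can be sketched as follows. Only the totals quoted in the text (1,236 participants, 628 males, 341 left-handed) are used. The female and right-handed totals are obtained by subtraction, and the layout with gender in the rows and handedness in the columns is an assumption made for illustration.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / grand total."""
    grand_total = sum(row_totals)
    if grand_total != sum(col_totals):
        raise ValueError("row totals and column totals must add up to the same grand total")
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


grand_total = 1236
row_totals = [628, grand_total - 628]   # males, females (females by subtraction)
col_totals = [341, grand_total - 341]   # left-handed, right-handed (by subtraction)

expected = expected_counts(row_totals, col_totals)

for label, row in zip(["male", "female"], expected):
    print(label, [round(value, 2) for value in row])
```

The first entry of the first row is the expected number of left-handed males, about 173.3, whichever order of multiplication is used.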

      Now, we must decide how we can use these expected frequencies to calculate a statistic that helps us determine if a relationship between gender and handedness exists. Such a statistic might involve the differences between the "observed" values (the actual data) and the "expected" values (which we calculated above). But because the sign of the difference is not important, we will square this difference. Furthermore, let's divide each squared difference by its corresponding expected value; this creates something like a proportion rather than a full difference value. Thus, we now create a new table containing these newly calculated values. For left-handed males, we calculate the following:

      If we add all of these values, we have something of an aggregate measure of how the observed data values deviate from the expected values; this is the chi-square statistic, which we label χ².

      We now have a test statistic and its corresponding value for this data set. Our final task is to determine the critical value for this statistic and to determine whether our test statistic value exceeds this critical value. First, recall that we chose 0.05 for our α value. This is a measure of what constitutes a statistically significant deviation. Specifically, α is the probability that the test statistic exceeds the critical value when the null hypothesis is true; thus, the smaller the α value that we choose, the less likely we are to reject a null hypothesis that is actually true. Using basic probability theory, we can then construct the following equation:

      P(X > c) = α

      This simply states that the probability that our test statistic X exceeds the critical value c is α. Also,

      P(X ≤ c) = 1 – α

      This equation is typically what is used to construct tables of values (for the chi-square statistic, for instance). Thus, we use the value 1 – α = 0.95. To find the critical value, the best approach is usually to consult a table of values. Such tables are often available in standard statistics texts as well as online. To use the table, we must also know the number of degrees of freedom of our data (often denoted df or ν). The number of degrees of freedom is the number of cell values that must be specified before the remainder are determined by the row and column totals (which we used to calculate expected frequencies, for instance). This number is equal to the product of the number of variable rows minus one and the number of variable columns minus one. In our example, each variable has two possible values, leading to two variable rows and two variable columns. Subtracting one from each and calculating the product, we get (2 – 1) × (2 – 1) = 1. This is the number of degrees of freedom.

      We can now consult the table to determine the critical value for the example data. We find from the table that c = 3.84. Note that the value of our test statistic, X = χ² = 4.85, exceeds c. Thus, at the 5% significance level (α = 0.05), we can reject the null hypothesis and conclude that, according to our data, handedness is related to gender. Note that the null hypothesis was carefully chosen: the assumption was that no relationship between the variables existed. In other words, the observed values were assumed to be close to (or equal to) the expected values, so that if the squared differences became large, our test statistic would exceed the critical value and cause us to reject our initial assumption.
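
The same decision can be checked numerically without a table. The sketch below is not part of the original procedure. It relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability can be written with the complementary error function from Python's standard library. The function name is illustrative, and the shortcut is valid only for 1 degree of freedom.

```python
import math


def chi_square_p_value_df1(statistic: float) -> float:
    """Upper-tail probability P(X > statistic) for a chi-square variable with 1 df.

    Uses P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


alpha = 0.05
statistic = 4.85        # test statistic from the example
critical_value = 3.84   # table value for 1 df and alpha = 0.05

p_value = chi_square_p_value_df1(statistic)
reject = statistic > critical_value

print(round(p_value, 4))
print(reject, p_value < alpha)   # both comparisons lead to the same decision
```

Comparing the statistic with the critical value and comparing the p-value with α are two views of the same test. In Excel, the expression `=1-CHISQ.DIST(4.85, 1, TRUE)` would play the role of the helper function here.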

      The following practice problem provides the opportunity to practice calculating the chi-square statistic.

      Practice Problem: A certain casino game involves numbers between 1 and 32 that each have an associated color (red or black). The cross tabulation for the data is shown below.



      A $\chi^2$ test with 3 degrees of freedom has significance level .10. Find the critical value.

      A researcher wants to know whether responses to a statement (strongly agree, agree, no opinion, disagree, strongly disagree) are dependent on the gender of the interviewer. Which test should we use? Find the null hypothesis and the critical value at $\alpha=.01$.

      An 8-sided die is rolled 200 times in order to test whether the die is fair. Which test should we use? Find the null hypothesis and check the assumptions for the test. Find the critical value at $\alpha=.05$.

      Dr. Penta claims to have designed a five-sided die that is equally likely to land on sides 1 through 4, but lands on the fifth side $40\%$ of the time.

      1. What kind of test should be used to test the claim? Write the null hypothesis.
      2. How large a sample is needed for the assumptions of the appropriate hypothesis test to be met?

      You and a friend are munching on a bag of Harvest Blend M&M's, when your friend says, "There seems to be more yellow and brown candies than red and maroon candies. In fact, I claim there are $30\%$ yellow, $30\%$ brown, and only $20\%$ red and $20\%$ maroon." Together you count the remaining M&M's in the bag with the results below. Use the critical value method with significance level 0.05 to test your friend's claim.

      $\begin{array}{c|c|c|c|c|c} \hbox{Color}&\hbox{Yellow}&\hbox{Brown}&\hbox{Red}&\hbox{Maroon}&\hbox{Total}\\ \hline \hbox{Count}&58&61&55&46&220 \end{array}$

      Test statistic: $\chi^2=4.189$
      Critical value: $7.815$

      Conclusion: Fail to reject the null hypothesis because the test statistic is not in the rejection region.

      Inference: There is not enough evidence to reject the claim that there are $30\%$ yellow, $30\%$ brown, $20\%$ red, and $20\%$ maroon M&M's.
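
The test statistic above can be reproduced with a short Python sketch. The function name is illustrative. The counts are assumed to be in the order yellow, brown, red, maroon, which is the order that reproduces the stated statistic. The critical value is copied from the solution rather than computed.

```python
def goodness_of_fit(observed, claimed_proportions):
    """Return (expected counts, chi-square statistic) for a goodness-of-fit test."""
    if len(observed) != len(claimed_proportions):
        raise ValueError("need one claimed proportion per observed count")
    total = sum(observed)
    expected = [total * p for p in claimed_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, statistic


observed = [58, 61, 55, 46]            # yellow, brown, red, maroon
claimed = [0.30, 0.30, 0.20, 0.20]     # the friend's claim

expected, statistic = goodness_of_fit(observed, claimed)
critical_value = 7.815                 # from the solution: df = 3, alpha = 0.05

print([round(e, 1) for e in expected])
print(round(statistic, 3))
print("reject H0" if statistic > critical_value else "fail to reject H0")
```

The expected counts are 66, 66, 44 and 44, all at least 5, and the statistic stays below the critical value.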

      A sample of coin flips is collected from three different coins. The results are below. Use one hypothesis test to test the claim that all three coins have the same probability of landing heads. Use the critical value method with significance level 0.10.

      $\begin{array}{c|c|c|c} &\hbox{Coin A}&\hbox{Coin B}&\hbox{Coin C}\\ \hline \hbox{}&88&93&110 \\ \hline \hbox{} &112&107&90 \end{array}$

      $H_0: p_A=p_B=p_C $ (or all three coins have the same probability of landing heads.)

      Test statistic: $\chi^2=5.325$
      Critical value: $4.605$

      Conclusion: Reject the null hypothesis because the test statistic is in the rejection region.

      Inference: There is enough evidence to reject the claim that all three coins have the same probability of landing heads.
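
As a cross-check of the coin example, the sketch below computes the statistic for a two-way table directly from the observed counts. The function name is illustrative. The p-value line uses the closed form exp(-x/2), which holds only for 2 degrees of freedom; it is a shortcut chosen here, not the method used in the solution.

```python
import math


def chi_square_two_way(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Two outcome rows by three coins, as in the table above.
coins = [
    [88, 93, 110],
    [112, 107, 90],
]

statistic, df = chi_square_two_way(coins)
p_value = math.exp(-statistic / 2.0)   # upper-tail probability, valid for df = 2 only

print(round(statistic, 3), df)
print(round(p_value, 4), p_value < 0.10)
```

The statistic matches the value in the solution, and the p-value falls below the 0.10 significance level, which agrees with rejecting the null hypothesis.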

      An advertising agency conducted a random survey of adults asking their primary source of news and educational level.

      The advertising company wants to test whether there is a relationship between the 3 educational levels and the 3 primary news sources. Find the null hypothesis and degrees of freedom for the test. Show that the assumptions for the test are met for the category: "Newspapers/Not High School Graduate".

      Test the claim that among college graduates, their primary news source is equally divided among newspapers, television, and the internet. Use the critical value method with significance level 0.05.

      $H_0:$ The primary news source and educational level are independent.

      Test statistic: $\chi^2=7.557$
      Critical value: $5.991$

      Conclusion: Reject the null hypothesis because the test statistic is in the rejection region.

      Inference: There is enough evidence to reject the claim that among college graduates, their primary news source is equally divided among newspapers, television, and the internet.

      A school nurse wants to determine whether age is a factor in whether children choose a healthy snack after school. She conducts a survey of 300 middle school students, with the results below. Test at $\alpha=.05$ the claim that the proportion who choose a healthy snack differs by grade level. Use the critical value method.

      $\begin{array}{c|c|c|c} \hbox{} &\hbox{6th grade}&\hbox{7th grade}&\hbox{8th grade}\cr\hline \hbox{} &31 &43 &51 \cr\hline \hbox{} &69 & 57 & 49 \end{array}$

      Assumptions: $41.7, 58.3\geq 5$

      $H_0: p_6=p_7=p_8 $ (equivalently: proportions who choose a healthy snack are the same for all three grade levels.)

      Test statistic: $\chi^2=8.337$
      Critical value: $5.991$

      Conclusion: Reject the null hypothesis because the test statistic is in the rejection region.

      Inference: There is enough evidence to support the claim that the proportion who choose a healthy snack differs by grade level.

      A survey asked adults nationwide if they thought that the federal government should continue to fund unmanned missions to Mars. Fifty-six percent said they should continue, $40\%$ said they should not continue, and $4\%$ had no opinion. A random sample of 200 college students resulted in the numbers below. At significance level 0.05, test the claim that the opinions of college students on this issue differ from those of the nation as a whole. $\begin{array}{c|c|c} \hbox{Continue}&\hbox{Not continue} &\hbox{No opinion}\cr\hline 126&65&9 \end{array}$

      Assumptions: $112, 80, 8\geq 5$

      $H_0: p_{\text{continue}}=.56, p_{\text{not continue}}=.40, p_{\text{no opinion}}=.04$ (equivalently: the opinions of college students are the same as the nation as a whole.)

      Test statistic: $\chi^2=4.688$
      Critical value: $5.991$

      Conclusion: Fail to reject the null hypothesis because the test statistic is not in the rejection region.

      Inference: There is not enough evidence to support the claim that the opinions of college students on this issue differ from those of the nation as a whole.

      To test the claim that snack choices are related to the gender of the consumer, a survey at a ball park shows this selection of snacks purchased. Write the null hypothesis and check the assumptions. Do not do the rest of the hypothesis test. $\begin{array}{c|c|c|c} &\hbox{Hotdog} &\hbox{} &\hbox{}\cr\hline \hbox{Male}&6&12&9\cr\hline \hbox{Female}&5&5&8 \end{array}$

      $H_0: $ Snack choice and gender are independent.

      Assumptions: $6.6, 10.2, 6.8 \geq 5$. But for the category Hotdog/Female, $E=4.4<5$, so not every expected count is at least 5.

      $H_0:$ Student's drinking habits and the number of classes missed are independent.

      d.f. $=2$ C.V. $=5.991$
      The test statistic is $\chi^2=2672$.

      We reject $H_0$. There is incredibly significant evidence that the proportion of missed classes is related to one's drinking habit.

      $\widehat{p}=\dfrac{446}{11160}\approx 0.04 \qquad \widehat{q}=0.96$

      $\alpha/2=0.005 \Longrightarrow z_{\alpha/2}=2.575$

      Confidence interval: $(0.0352,0.0448)$

      We are $99\%$ confident that the proportion of non-binger students who missed classes is between 0.0352 and 0.0448.

      $\widehat{p}=0.1834 \qquad \widehat{q}=0.8166$ $\alpha/2=0.01$ $n=\widehat{p}\,\widehat{q}\left({z_{\alpha/2} \over E}\right)^2=0.1834 \cdot 0.8166 \left({2.33 \over 0.05}\right)^2\approx 326$
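
The two proportion calculations above can be checked with a small Python sketch. The function names are illustrative. The z values 2.575 and 2.33 are taken from the solution, not computed.

```python
import math


def proportion_confidence_interval(successes, n, z):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size_for_proportion(p_hat, z, margin_of_error):
    """Smallest whole n satisfying n >= p_hat * q_hat * (z / E)^2."""
    return math.ceil(p_hat * (1 - p_hat) * (z / margin_of_error) ** 2)


low, high = proportion_confidence_interval(446, 11160, 2.575)
n_needed = sample_size_for_proportion(0.1834, 2.33, 0.05)

print(round(low, 4), round(high, 4))
print(n_needed)
```

The interval agrees with the one in the text to about three decimal places. The solution rounds the sample proportion to 0.04 before computing the margin, so its upper endpoint can differ in the fourth decimal place from the unrounded calculation.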

      A game where colored marbles are drawn out of a bag with replacement has three possible outcomes: red, green, and blue. The game is played 100 times with the results shown below. Using $\alpha= 0.05$, test the claim that the probabilities for each outcome are as follows: P(red) = .40, P(green) = .35, and P(blue) = .25. $\begin{array}{c|c|c|c} \hbox{Outcome} &\hbox{Red} &\hbox{Green} &\hbox{Blue}\cr\hline \hbox{Count}& 32& 45& 23 \end{array}$

      Assumptions: $40, 35, 25 \geq 5$

      Test statistic: $\chi^2=4.617$
      Critical value: $5.991$

      Conclusion: Fail to reject the null hypothesis because the test statistic is not in the rejection region.

      Inference: There is not enough evidence to reject the claim that the probabilities for each outcome are P(red) = .40, P(green) = .35, and P(blue) = .25.

      Using the data below, test the claim that there is no difference in the color preferences of men and women. Use $\alpha = .05$. $\begin{array}{c|c|c|c} \hbox{} &\hbox{} &\hbox{} &\hbox{}\cr \hline \hbox{}& 21&34& 45\cr \hbox{}& 36 &33&31 \end{array}$

      Assumptions: $28.5, 33.5, 38 \geq 5$

      $H_0:$ There is no difference in the color preferences of men and women.

      Test statistic: $\chi^2=6.54$
      Critical value: $5.991$

      Conclusion: Reject the null hypothesis because the test statistic is in the rejection region.

      Inference: There is enough evidence to reject the claim that there is no difference in the color preferences of men and women.

      A researcher wishes to see if the five ways (drinking caffeinated beverages, taking a nap, going for a walk, eating a sugary snack, other) people use to combat midday drowsiness are equally distributed among office workers. A sample of 60 office workers is selected, and the following data are obtained. At .10 significance level can it be concluded that there is no preference? $\begin{array}{c|c|c|c|c|c} \textrm{} & \textrm{} & \textrm{} & \textrm{} & \textrm{} & \textrm{} \\ \hline \textrm{} & 21 & 16 & 10 & 8 & 5 \end{array}$

      If there is no preference, then all are equally likely. As there are 5 categories, the expectation is that they all occur with probability $.20$.

      $H_0$: There is no preference for a way to combat midday drowsiness

      Test statistic: $\chi^2 = 13.83$
      Critical value: $7.779$

      Conclusion: Reject the null hypothesis as the test statistic is in the rejection region.

      Inference: There is significant evidence that the 5 methods to combat midday drowsiness are not all equally likely.

      Nationwide the shares of carbon emissions for the year 2000 are transportation, 33%; industry, 30%; residential, 20%; and commercial, 17%. A state hazardous materials official wants to see if her state is the same. Her study of 300 emissions sources finds transportation, 36%; industry, 31%; residential, 17%; and commercial, 16%. At a 0.05 significance level, can she claim the percentages are the same?

      $H_0$: The percentages are the same

      Assumptions met as all calculated expected counts are $\ge 5$:

      $\displaystyle{ \begin{array}{rl} \textrm{Transportation:} & (0.33)(300) = 99 \ge 5 \\ \textrm{Industry:} & (0.30)(300) = 90 \ge 5 \\ \textrm{Residential:} & (0.20)(300) = 60 \ge 5 \\ \textrm{Commercial:} & (0.17)(300) = 51 \ge 5 \end{array}}$

      We must similarly calculate the observed counts to find the test statistic:

      Conclusion: Fail to reject the null hypothesis as the test statistic was not in the rejection region.

      Inference: There is no significant evidence that the state percentages are not the same as the national percentages.
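
The observed counts and the test statistic for this problem can be worked out from the stated percentages. The sketch below is a reconstruction made under that assumption, not the original worked solution. The critical value 7.815 is the table value for 3 degrees of freedom at the 0.05 level, as used in other solutions in this set.

```python
n = 300
national_shares = [0.33, 0.30, 0.20, 0.17]   # transportation, industry, residential, commercial
state_shares = [0.36, 0.31, 0.17, 0.16]      # same order, from the state study

# Expected counts come from the national shares; observed counts from the state study.
expected = [share * n for share in national_shares]
observed = [round(share * n) for share in state_shares]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical_value = 7.815   # df = 4 - 1 = 3, alpha = 0.05

print(observed)
print(round(statistic, 3))
print("reject H0" if statistic > critical_value else "fail to reject H0")
```

Under these assumptions the statistic is well below the critical value, which is consistent with the conclusion stated above.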

      A study is conducted as to whether there is a relationship between joggers and the frequency of consumption of nutritional supplements. A random sample of 210 subjects is selected, and they are classified as shown. At a 0.05 significance level, test the claim that jogging and the consumption of supplements are not related. $\begin{array}{c|c|c|c} & \textrm{} & \textrm{} & \textrm{} \\ \hline \textrm{} & 34 & 52 & 23 \\ \textrm{} & 18 & 65 & 18 \end{array}$

      $H_0$: jogging and the consumption of supplements are not related.

      Assumptions are met. (all $E \ge 5$)

      Test statistic: $\chi^2 = 6.68$
      Critical value: 5.991

      Conclusion: Reject the null hypothesis as the test statistic is in the rejection region

      Inference: There is significant evidence that jogging and the consumption of supplements are related.

      An advertising firm has decided to ask 92 customers at each of three local shopping malls if they are willing to take part in a market research survey. According to previous studies, 38% of Americans refuse to take part in such surveys. The results are shown here. At a 0.01 significance level, test the claim that the proportions of those who are willing to participate are equal.

      $\begin{array}{c|c|c|c} & \textrm{} & \textrm{} & \textrm{} \\ \hline \textrm{} & 52 & 45 & 36 \\ \textrm{} & 40 & 47 & 56 \end{array}$

      $H_0$: the proportions of those who are willing to participate are equal among the 3 malls

      We need to first calculate the expected counts using the marginal totals:

      $\begin{array}{c|c|c|c|c} & \textrm{} & \textrm{} & \textrm{} & \textrm{Total} \\ \hline \textrm{} & 52 & 45 & 36 & 133 \\ \textrm{} & 40 & 47 & 56 & 143 \\ \hline \textrm{Total} & 92 & 92 & 92 & 276 \end{array}$

      Then the expected counts are given by:

      Assumptions are met (all $E \ge 5$)

      Conclusion: Fail to reject the null hypothesis as the test statistic is not in the rejection region.

      Inference: There is no significant evidence that the proportions who participate are not the same in all three locations.
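
The expected counts and the statistic for the shopping-mall table can be recomputed from the marginal totals. The sketch below is a reconstruction from the counts shown, not the original worked solution. For 2 degrees of freedom the critical value has the closed form -2 ln(α), which is used here in place of a table lookup.

```python
import math

# Two response rows by three malls, as in the table above.
observed = [
    [52, 45, 36],
    [40, 47, 56],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

statistic = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(observed, expected)
    for o, e in zip(observed_row, expected_row)
)

alpha = 0.01
critical_value = -2.0 * math.log(alpha)   # exact for df = (2 - 1) * (3 - 1) = 2 only

print([[round(e, 2) for e in row] for row in expected])
print(round(statistic, 3), round(critical_value, 3))
print("reject H0" if statistic > critical_value else "fail to reject H0")
```

Each expected count is either 133/3 or 143/3, so all are well above 5. The statistic stays below the critical value, in line with the stated conclusion.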

      A researcher wishes to see if the proportions of workers for each type of job have changed during the last 10 years. A sample of 100 workers is selected, and the results are shown. At a 0.05 significance level, test the claim that the proportions have not changed.

      $\begin{array}{c|c|c|c|c} & \textrm{} & \textrm{} & \textrm{} & \textrm{Other} \\ \hline \textrm{10 years ago} & 33 & 13 & 11 & 3 \\ \textrm{} & 18 & 12 & 8 & 2 \end{array}$

      $H_0$: the proportions have not changed

      Assumptions are not met. The expected count in the "Other" category is $3 \not\ge 5$.

      One should not proceed with a chi-square goodness of fit test.

      Test the claim that births are uniformly distributed among the months (i.e., one twelfth of the number of births occur on average in any one month), using the following data collected over the course of one year.

      $\begin{array}{c|c|c|c} \textrm{} & 34 & \textrm{} & 36 \\ \textrm{} & 31 & \textrm{} & 38 \\ \textrm{} & 35 & \textrm{} & 37 \\ \textrm{} & 32 & \textrm{} & 36 \\ \textrm{} & 35 & \textrm{} & 35 \\ \textrm{} & 35 & \textrm{} & 35 \end{array}$

      $H_0$: births are uniformly distributed among the months

      $419$ births uniformly distributed would create an expectation of 34.917 births in each month.

      Assumptions met: $34.917 \ge 5$.

      Test statistic: $\chi^2 = 1.1718$
      Critical value: 19.675

      Conclusion: Fail to reject the null hypothesis as the test statistic is not in the rejection region.

      Inference: There is no significant evidence that the births are not uniformly distributed among the months.

      Based on the following data from the doomed voyage of the Titanic, decide if the chance that a randomly selected passenger survived was independent of their status.

      $\begin{array}{c|c|c|c|c|c} & \textrm{Crew} & \textrm{1st Class} & \textrm{2nd Class} & \textrm{3rd Class} & \textrm{Total} \\ \hline \textrm{Survived} & 212 & 202 & 118 & 178 & 710 \\ \textrm{Did not survive} & 673 & 123 & 167 & 528 & 1491 \\ \hline \textrm{Total} & 885 & 325 & 285 & 706 & 2201 \end{array}$

      $H_0$: The chance that a randomly selected passenger survived was independent of their status

      Assumptions met as calculated expectations below are all $\ge 5$:

      Test statistic: $\chi^2 = 187.79$
      Critical Value: degrees of freedom $(4-1)(2-1) = 3$ and $\alpha = 0.05$ (default) tells us the critical value is 7.815.

      Conclusion: Reject the null hypothesis as the test statistic is in the rejection region.

      Inference: There is evidence that a passenger's survival is related to their status.

      Decide if the proportions of Democrats, Republicans, and Independents are the same for both men and women, based on the following sample data. $egin & extrm & extrm & extrmhline extrm & 36 & 45 & 24 extrm & 48 & 33 & 16 end$

      $H_0$: The proportions of Democrats, Republicans, and Independents are the same for both men and women

      Assumptions met, as the calculated expected counts are all $\ge 5$ (the smallest is about $19.2$).

      Test statistic: $\chi^2 = 4.8512$
      Critical value: $5.991$ (degrees of freedom $(3-1)(2-1) = 2$, at the default $\alpha = 0.05$)

      Conclusion: Fail to reject the null hypothesis, as the test statistic is not in the rejection region.

      Inference: There is no significant evidence that the proportions of Democrats, Republicans, and Independents are different for men and women.
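
With 2 degrees of freedom, the chi-square upper-tail probability has the closed form $P(X > x) = e^{-x/2}$, so both the critical value and the p-value for this example can be checked without a table. The following is a small sketch under that assumption; the function names are illustrative, and the formulas apply only when the degrees of freedom equal 2.

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)

def chi2_critical_df2(alpha):
    """Critical value for 2 degrees of freedom: solve exp(-x / 2) = alpha for x."""
    return -2 * math.log(alpha)

stat = 4.8512   # test statistic from the example
alpha = 0.05    # default significance level used in the example

p_value = chi2_sf_df2(stat)
critical_value = chi2_critical_df2(alpha)

print(f"critical value = {critical_value:.3f}")
print(f"p-value = {p_value:.4f}")
print("reject H0" if stat > critical_value else "fail to reject H0")
```

The p-value exceeds 0.05 exactly when the statistic falls below the critical value, so the two decision rules agree.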

      It is a common belief that more fatal car crashes occur on certain days of the week, such as Friday or Saturday. A sample of motor vehicle deaths is randomly selected for a recent year. The numbers of fatalities for the different days of the week are listed below. At the $0.05$ significance level, test the claim that accidents occur with equal frequency on the different days. State the null hypothesis, test statistic, critical value, your conclusion and interpretation.

      $\begin{array}{l|ccccccc} \textrm{Day} & \textrm{Sun} & \textrm{Mon} & \textrm{Tue} & \textrm{Wed} & \textrm{Thu} & \textrm{Fri} & \textrm{Sat} \\ \hline \textrm{Number} & 31 & 20 & 20 & 22 & 22 & 29 & 26 \\ \hline \end{array}$

      In a study of drug abuse in a local high school, the school board selected 100 eighth graders, 100 sophomores and 100 seniors randomly from their respective rolls for each grade. Each student was then asked if they used a particular drug frequently, seldom or never. The data are summarized in the table given below. Is there evidence to suggest that the frequency of drug use is the same across the three different grades? State the null hypothesis, give the test statistic, test criterion, conclusion, and interpretation.

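      $\begin{array}{l|ccc} \textrm{Grade} & \textrm{Frequently} & \textrm{Seldom} & \textrm{Never} \\ \hline \textrm{8th Grade} & 15 & 30 & 55 \\ \hline \textrm{Sophomores} & 20 & 35 & 45 \\ \hline \textrm{Seniors} & 25 & 35 & 40 \\ \hline \end{array}$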

      In an experiment on extrasensory perception, subjects were asked to identify the month showing on a calendar in the next room. If the results were as shown, test the claim that months were selected with equal frequencies. Assume a significance level of $0.05$. If it appears that the months were not selected with equal frequencies, is the claim that the subjects have extrasensory perception supported?

      $\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|} \textrm{Jan} & \textrm{Feb} & \textrm{Mar} & \textrm{Apr} & \textrm{May} & \textrm{Jun} & \textrm{Jul} & \textrm{Aug} & \textrm{Sep} & \textrm{Oct} & \textrm{Nov} & \textrm{Dec} \\ \hline 23 & 21 & 35 & 31 & 22 & 41 & 12 & 14 & 10 & 26 & 30 & 24 \\ \hline \end{array}$

      You suspect that a die is unfair. You roll it 60 times and get the following results:

      $\begin{array}{l|cccccc} \textrm{Outcome} & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline \textrm{Frequency} & 10 & 12 & 14 & 8 & 12 & 4 \\ \hline \end{array}$

      Determine if the above distribution is significantly different from the expected distribution assuming that the die is fair.

      Students at Oxford were asked to indicate their agreement with the following statement: "I find mathematics challenging but I am able to make a good grade." Is there a difference in the distributions of responses between males and females? Students responded as follows:

      $\begin{array}{l|ccc|c} & \textrm{Response 1} & \textrm{Response 2} & \textrm{Response 3} & \textrm{Total} \\ \hline \textrm{Males} & 75 & 10 & 85 & 170 \\ \hline \textrm{Females} & 121 & 8 & 51 & 180 \\ \hline \end{array}$

      Give the null hypothesis, test statistic, critical value at an appropriate alpha level, conclusion, and interpretation.

      Students were asked to respond to the following statement: "Participating in study groups is an effective way to study for some courses." Is there a significant difference in the responses of freshmen and sophomores? Show appropriate hypothesis testing responses.

      $\begin{array}{l|ccc} & \textrm{Response 1} & \textrm{Response 2} & \textrm{Response 3} \\ \hline \textrm{Freshmen} & 34 & 21 & 35 \\ \hline \textrm{Sophomores} & 54 & 12 & 29 \\ \hline \end{array}$

      A pair of dice was rolled 500 times. The sums that occurred were as recorded in the following table. Test whether the dice seem fair based on this data. For example, $P(2, 3, \textrm{ or } 4) = 1/6$ and the sums $2$, $3$, and $4$ occurred a total of $74$ times. Since the dice were rolled $500$ times, one would expect $83.3$ ($500 \times 1/6 \approx 83.3$) occurrences of rolling a $2$, $3$, or $4$, so $83.3$ is the expected value.

      $\begin{array}{l|ccccc} \textrm{Sum} & \{2,3,4\} & \{5,6\} & \{7\} & \{8,9\} & \{10,11,12\} \\ \hline \textrm{Observed} & 74 & 120 & 83 & 135 & 88 \\ \hline \end{array}$

      Now rework this problem using the actual observed values for each sum:

      $\begin{array}{l|ccccccccccc} \textrm{Sum} & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\ \hline \textrm{Observed} & 12 & 26 & 36 & 58 & 62 & 83 & 102 & 33 & 20 & 9 & 59 \\ \hline \end{array}$

      Did you find that testing the dice this way gave a significant result? Which way would be best for determining whether the dice are fair?
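
The expected value of 83.3 quoted in the problem comes from counting how many of the 36 equally likely outcomes of two fair dice produce each sum. A minimal sketch of that counting step is shown below, using exact fractions; it covers the expected counts for the pooled categories only and leaves the test itself to the reader. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = 500
groups = [(2, 3, 4), (5, 6), (7,), (8, 9), (10, 11, 12)]

# Count how many of the 36 equally likely (die 1, die 2) outcomes give each sum.
ways = {}
for a, b in product(range(1, 7), repeat=2):
    ways[a + b] = ways.get(a + b, 0) + 1

# Expected count for a group = number of rolls * probability of the group.
expected = {
    group: rolls * Fraction(sum(ways[s] for s in group), 36)
    for group in groups
}

for group, count in expected.items():
    print(group, round(float(count), 1))
```

Keeping the counts as `Fraction` values avoids rounding until the final print, which makes it easy to confirm that the expected counts add up to exactly 500.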


      Calculating The Critical Chi Square Value By Hand

      An online calculator can return the critical chi-square value directly, but it is worth knowing how to determine this value by hand. The process takes only a few steps.

      Imagine that you want to run an experiment at an agricultural firm. The company wants to know whether there is a link between crossing strains of plants (hybrids) and the number of unwanted or unexpected plants (deviations) that show up.

      The firm is crossing two types of corn: yellow and blue. Most biologists agree that deviations with more than a 5% probability of occurring by chance are not statistically significant.

      To solve this problem, we need to determine the critical chi-square value.

      Step #1: Determine the number of degrees of freedom

      The first step is to determine the number of degrees of freedom. When the degrees of freedom are not given in the question, they equal the number of classes or categories minus 1. Here the company crosses yellow and blue corn, so there are 2 categories. This means that:

      Degrees Of Freedom = 2 – 1 = 1

      Step #2: Determine the probability that the situation you are investigating would happen by chance

      In this step, you need the probability (the significance level), which is usually stated in the question. In our example, the probability is 5%, or 0.05.

      Step #3: Look up the degrees of freedom and the probability in the chi square table

      Find the entry in the chi-square table for 1 degree of freedom and a probability of 0.05. This number is 3.84, so this is your critical value.
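
The table lookup for 1 degree of freedom can be cross-checked numerically. A chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail critical value at level $\alpha$ is the square of the normal quantile at $1 - \alpha/2$. The sketch below relies on that identity and on Python's standard `statistics.NormalDist`; the function name is illustrative, and the shortcut does not extend to other degrees of freedom.

```python
from statistics import NormalDist

def chi2_critical_df1(alpha):
    """Critical value for 1 degree of freedom via the squared normal quantile."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided standard normal quantile
    return z ** 2

critical_value = chi2_critical_df1(0.05)
print(f"critical value = {critical_value:.2f}")
```

For $\alpha = 0.05$ the normal quantile is about 1.96, and squaring it gives the 3.84 found in the table.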

