Categorical Data Statistics Example 4
Follow the full solution, then compare it with the other examples linked below.
Example 4 hard
A researcher records the following for 100 cars: colour (red, blue, white, black, other), fuel type (petrol, diesel, electric, hybrid), and fuel efficiency (km/L). (a) Identify all categorical variables. (b) Can the researcher find a correlation between colour and fuel type? Explain.
Solution
- Step 1: Categorical variables: colour and fuel type. Numerical variable: fuel efficiency.
- Step 2: Correlation (like Pearson's r) measures linear association between two numerical variables. Since both colour and fuel type are categorical, a standard correlation coefficient cannot be computed. Instead, a two-way table (contingency table) and a chi-squared test could be used to check for association.
Answer
(a) Colour and fuel type are categorical. (b) No, standard correlation cannot be computed between two categorical variables; a two-way table or chi-squared test should be used instead.
Standard correlation measures require numerical data. For two categorical variables, association is assessed using contingency tables and tests like chi-squared. Choosing the wrong analysis method for the data type leads to meaningless results.
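As a rough illustration of the two-way table and chi-squared approach, the sketch below builds a contingency table and computes the chi-squared statistic by hand using only the Python standard library. The `cars` list is a small made-up sample, not the researcher's 100 cars, and the helper names `two_way_table` and `chi_squared_statistic` are hypothetical. With counts this small the test itself would not be reliable; the point is only to show the mechanics.

```python
from collections import Counter

# Hypothetical (colour, fuel type) records, invented for illustration only.
cars = [
    ("red", "petrol"), ("red", "petrol"), ("red", "diesel"),
    ("blue", "petrol"), ("blue", "electric"), ("blue", "electric"),
    ("white", "diesel"), ("white", "diesel"), ("white", "petrol"),
    ("black", "electric"), ("black", "petrol"), ("black", "diesel"),
]

def two_way_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

def chi_squared_statistic(table):
    """Return the chi-squared statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if colour and fuel type were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

rows, cols, table = two_way_table(cars)
stat, dof = chi_squared_statistic(table)

print("columns:", cols)
for name, row in zip(rows, table):
    print(name, row)
print(f"chi-squared = {stat:.3f}, degrees of freedom = {dof}")
```

The statistic is then compared with a chi-squared distribution with the stated degrees of freedom to judge whether the observed association is larger than chance alone would suggest.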
About Categorical Data
Categorical data is data that can be sorted into groups or categories, like colours, types, or names, rather than measured with numbers. You can count how many items fall into each category, but you cannot meaningfully add, subtract, or average the category labels themselves.
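A minimal sketch of this idea in Python, assuming a short made-up list of car colours: counting categories and finding the most common one works, while trying to add the labels fails.

```python
from collections import Counter
from statistics import mode

# Hypothetical colour observations, invented for illustration only.
colours = ["red", "blue", "white", "red", "black", "red", "blue", "other"]

# Counting how many items fall into each category is meaningful.
counts = Counter(colours)
proportions = {colour: n / len(colours) for colour, n in counts.items()}

# The mode (most frequent category) is a valid summary for categorical data.
mode_colour = mode(colours)

# Adding the labels themselves is not meaningful, and Python refuses to do it.
try:
    sum(colours)
    labels_can_be_added = True
except TypeError:
    labels_can_be_added = False

print(counts)
print("most common colour:", mode_colour)
print("labels can be added:", labels_can_be_added)
```

Frequencies, proportions, and the mode are the natural summaries here; a mean or a sum of the labels has no interpretation.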
Learn more about Categorical Data →
More Categorical Data Examples
Example 1 easy
Classify each of the following as categorical or numerical data: (a) Favourite pizza topping, (b) He
Example 2 medium
A survey asks: 'Rate your satisfaction: Very Unsatisfied, Unsatisfied, Neutral, Satisfied, Very Sati
Example 3 medium
A student collects data on 20 classmates: shoe size, favourite subject, number of pets, and birth mo