Five hundred residents of a city are polled to obtain information on voting intentions in an upcoming city election. The five hundred residents in this study are an example of a(n)
a. population. |
b. observation. |
c. sample. |
d. census. |
Data measured on an ordinal scale exhibit all the properties of data measured on
a. ratio scale. |
b. nominal scale. |
c. nominal and interval scales. |
d. interval scale. |
The height of a building, measured in feet, is an example of
a. categorical data. |
b. quantitative data. |
c. either categorical or quantitative data. |
d. feet data. |
The process of capturing, storing, and maintaining data is known as
a. data mining. |
b. data manipulation. |
c. data warehousing. |
d. big data. |
Dr. Kurt Thearling, a leading practitioner in the field, defines data mining as “the _________ extraction of _________ information from databases”.
a. thorough, insightful |
b. automated, predictive |
c. intentional, useful |
d. timely, accurate |
The major applications of data mining have been made by companies with a strong _______ focus.
a. manufacturing |
b. consumer |
c. research and development |
d. wholesale |
Statistical studies in which researchers control variables of interest are
a. observational studies. |
b. control observational studies. |
c. experimental studies. |
d. non-experimental studies. |
Facts and figures that are collected, analyzed and summarized for presentation and interpretation are
a. variables. |
b. time series data. |
c. data. |
d. elements. |
In a sample of 800 students in a university, 240 or 30% are Business majors. The 30% is an example of
a. a population. |
b. statistical inference. |
c. descriptive statistics. |
d. a sample. |
The most common type of observational study is
a. a statistical inference. |
b. a debate. |
c. a survey. |
d. an experiment. |
The set of measurements collected for a particular element are called
a. variables. |
b. observations. |
c. populations. |
d. samples. |
Census refers to
a. a survey to collect data on a sample. |
b. an experimental study to collect data on the entire population. |
c. an experimental study to collect data on a sample. |
d. a survey to collect data on the entire population. |
The entities on which data are collected are
a. populations. |
b. samples. |
c. observations. |
d. elements. |
In a post office, the mailboxes are numbered from 1 to 4,500. These numbers represent
a. either categorical or quantitative data. |
b. since the numbers are sequential, the data is quantitative. |
c. categorical data. |
d. quantitative data. |
Arithmetic operations are inappropriate for
a. large data sets. |
b. both categorical and quantitative data. |
c. categorical data. |
d. quantitative data. |
In a sample of 1,600 registered voters, 912 or 57% approve of the way the President is doing his job. The 57% approval is an example of
a. a sample. |
b. statistical inference. |
c. descriptive statistics. |
d. a population. |
Data collected at the same, or approximately the same point in time are
a. approximate data. |
b. cross-sectional data. |
c. time series data. |
d. approximate time series data. |
The number of observations in a complete data set having 10 elements and 5 variables is
a. 5. |
b. 50. |
c. 10. |
d. 25. |
The collection of all elements of interest in a particular study is
a. the population. |
b. descriptive statistics. |
c. statistical inference. |
d. the sample. |
A portion of the population selected to represent the population is called
a. statistical inference. |
b. a census. |
c. a sample. |
d. descriptive statistics. |
Ordinary arithmetic operations are meaningful
a. only with quantitative data. |
b. only with categorical data. |
c. with neither quantitative nor categorical data. |
d. either with quantitative or categorical data. |
Local residents were surveyed for their satisfaction with the local government. The responses were recorded as follows: 0-not satisfied, 1-somewhat dissatisfied, 2-satisfied, 3-very satisfied. The variable recorded is an example of what type of variable?
a. Variable measured on the ratio scale |
b. Variable measured on the interval scale |
c. Quantitative variable |
d. Variable measured on the ordinal scale |
Arithmetic operations are inappropriate for
a. both categorical and quantitative data. |
b. large data sets. |
c. quantitative data. |
d. categorical data. |
Temperature is an example of a variable that uses
a. the ordinal scale. |
b. the interval scale. |
c. either the ratio or the ordinal scale. |
d. the ratio scale. |
Dr. Kurt Thearling, a leading practitioner in the field, defines data mining as “the _________ extraction of _________ information from databases”.
a. timely, accurate |
b. automated, predictive |
c. intentional, useful |
d. thorough, insightful |
In a sample of 1,600 registered voters, 912 or 57% approve of the way the President is doing his job. A political pollster estimates: “Fifty-seven percent of all voters approve of the President.” This statement is an example of
a. a sample. |
b. statistical inference. |
c. descriptive statistics. |
d. a population. |
Income is an example of
a. nominal data. |
b. categorical data. |
c. quantitative data. |
d. either categorical or quantitative data. |
On a street, the houses are numbered from 300 to 450. The house numbers are examples of
a. quantitative data. |
b. categorical data. |
c. both quantitative and categorical data. |
d. neither quantitative nor categorical data. |
The summaries of data, which may be tabular, graphical, or numerical, are referred to as
a. statistical inference. |
b. data analytics. |
c. descriptive statistics. |
d. inferential statistics. |
For ease of data entry into a university database, 1 denotes that the student is an undergraduate and 2 indicates that the student is a graduate student. In this case data are
a. categorical. |
b. either categorical or quantitative. |
c. neither categorical nor quantitative. |
d. quantitative. |
Ordinary arithmetic operations are meaningful
a. only with quantitative data. |
b. only with categorical data. |
c. with neither quantitative nor categorical data. |
d. either with quantitative or categorical data. |
The measurement scale suitable for quantitative data is
a. either interval or ratio scale. |
b. nominal scale. |
c. only interval scale. |
d. ordinal scale. |
Statistical studies in which researchers do not control variables of interest are
a. not of any value. |
b. uncontrolled experimental studies. |
c. observational studies. |
d. experimental studies. |
A data dashboard is an analytical technique that falls in the category of
a. descriptive analytics. |
b. prescriptive analytics. |
c. diagnostic analytics. |
d. predictive analytics. |
Data collected over several time periods are
a. cross-sectional data. |
b. categorical data. |
c. time controlled data. |
d. time series data. |
Since a sample is a subset of the population, the sample mean
a. varies around the mean of the population. |
b. is always smaller than the mean of the population. |
c. must be equal to the mean of the population. |
d. is always larger than the mean of the population. |
Social security numbers consist of numeric values. Therefore, social security number is an example of
a. a categorical variable. |
b. an exchange variable. |
c. a quantitative variable. |
d. either a quantitative or a categorical variable. |
Which of the following defines the term “statistics”?
a. Statistics are rarely useful and informative. |
b. Statistics are used only in sports to calculate “stats” for teams and players such as average rushing yards. |
c. Statistics refers only to the calculation of numbers, such as a mean. |
d. Statistics is the art and science of collecting, analyzing, presenting, and interpreting data. |
The set of analytical techniques that yield a best course of action is
a. prescriptive analytics. |
b. descriptive analytics. |
c. diagnostic analytics. |
d. predictive analytics. |
Simulation, which is the use of probability and statistical computer models to better understand risk, falls under the category of
a. descriptive analytics. |
b. prescriptive analytics. |
c. predictive analytics. |
d. diagnostic analytics. |
Which of the following is a scale of measurement?
a. Remedial |
b. Primal |
c. Divisional |
d. Ratio |
The number of observations will always be the same as the
a. population size. |
b. number of variables. |
c. sample size. |
d. number of elements. |
Statistical inference
a. refers to the process of drawing inferences about the sample based on the characteristics of the population. |
b. is the same as descriptive statistics. |
c. is the process of drawing inferences about the population based on the information taken from the sample. |
d. is the same as a census. |
Which of the following variables is quantitative?
a. Phone number |
b. Zip code |
c. Weight of a package |
d. All of these variables are quantitative. |
Which of the following defines the term “statistics”?
a. Statistics refers only to the calculation of numbers, such as a mean. |
b. Statistics is the art and science of collecting, analyzing, presenting, and interpreting data. |
c. Statistics are rarely useful and informative. |
d. Statistics are used only in sports to calculate “stats” for teams and players such as average rushing yards. |
The number of observations will always be the same as the
a. number of elements. |
b. number of variables. |
c. sample size. |
d. population size. |
The owner of a factory regularly requests a graphical summary of all employees’ salaries. The graphical summary of salaries is an example of
a. statistical inference. |
b. an experiment. |
c. descriptive statistics. |
d. a sample. |
In experimental studies, the variable of interest
a. must be numerical. |
b. is not controlled. |
c. cannot be numerical. |
d. is controlled. |
Different methods of developing useful information from large databases are dealt with under
a. big data. |
b. data warehousing. |
c. data mining. |
d. data manipulation. |
In a questionnaire, respondents are asked to mark their gender as male or female. The scale of measurement for gender is
a. ratio scale. |
b. nominal scale. |
c. ordinal scale. |
d. interval scale. |
In experimental studies, the variable of interest
a. cannot be numerical. |
b. must be numerical. |
c. is controlled. |
d. is not controlled. |
The set of analytical techniques that yield a best course of action is
a. diagnostic analytics. |
b. predictive analytics. |
c. descriptive analytics. |
d. prescriptive analytics. |
A sample of 100 individuals in a town was asked how much they paid in property tax per year. On the basis of this information, the reporter states that the average property tax bill of all residents of the town is $1,500. This is an example of _____.
a. descriptive statistics |
b. a census |
c. an experiment |
d. statistical inference |
Optimization models, which generate solutions that maximize or minimize some objective subject to a set of constraints, fall into the category of
a. diagnostic analytics. |
b. prescriptive analytics. |
c. descriptive analytics. |
d. predictive analytics. |
Which of the following is a categorical variable?
a. Your age on your last birthday |
b. Your cell phone area code |
c. Your accounting class start time |
d. Your high school graduation year |
Quiz 2
The sum of the relative frequencies for all classes will always equal
a. any value larger than one. |
b. the sample size. |
c. the number of classes. |
d. one. |
Data that provide labels or names for categories of like items are known as
a. category data. |
b. label data. |
c. quantitative data. |
d. categorical data. |
The sum of the percent frequencies for all classes will always equal
a. one. |
b. the number of classes. |
c. 100. |
d. the number of items in the study. |
A graphical tool typically associated with the display of key performance indicators is a
a. data dashboard. |
b. stem-and-leaf display. |
c. side-by-side bar chart. |
d. stacked bar chart. |
A cumulative relative frequency distribution shows
a. the percentage of data items with values less than or equal to the upper limit of each class. |
b. the percentage of data items with values less than or equal to the lower limit of each class. |
c. the proportion of data items with values less than or equal to the upper limit of each class. |
d. the proportion of data items with values less than or equal to the lower limit of each class. |
Before drawing any conclusions about the relationship between two variables shown in a crosstabulation, you should
a. construct a scatter diagram and find the trendline. |
b. investigate whether any hidden variables could affect the conclusions. |
c. construct a dot plot and look for significant gaps. |
d. develop a relative frequency distribution. |
The difference between the lower class limits of adjacent classes provides the
a. class limits. |
b. number of classes. |
c. class midpoint. |
d. class width. |
A survey of 800 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.
Undergraduate Major | ||||
Graduate School | Business | Engineering | Others | Total |
Yes | 70 | 84 | 126 | 280 |
No | 182 | 208 | 130 | 520 |
Total | 252 | 292 | 256 | 800 |
Of those students who are planning on going to graduate school, what percentage are majoring in engineering?
a. 28.8 |
b. 30.0 |
c. 10.5 |
d. 40.4 |
In a cumulative relative frequency distribution, the last class will have a cumulative relative frequency equal to
a. the total number of elements in the data set. |
b. the total of classes in the data set. |
c. zero. |
d. one. |
The total number of data items with a value less than the upper limit for the class is given by the
a. cumulative frequency distribution. |
b. frequency distribution. |
c. cumulative relative frequency distribution. |
d. relative frequency distribution. |
A survey of 800 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.
Undergraduate Major | ||||
Graduate School | Business | Engineering | Others | Total |
Yes | 70 | 84 | 126 | 280 |
No | 182 | 208 | 130 | 520 |
Total | 252 | 292 | 256 | 800 |
Of those students who are majoring in business, what percentage plans to go to graduate school?
a. 70.00 |
b. 27.78 |
c. 72.22 |
d. 8.75 |
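The two crosstabulation questions above condition on different totals: one on a row total (students planning on graduate school), the other on a column total (business majors). A minimal Python sketch of that distinction follows; the function names `row_percent` and `column_percent` are illustrative choices, not part of the quiz.

```python
# Counts copied from the crosstabulation above (rows: graduate school plans).
counts = {
    "Yes": {"Business": 70, "Engineering": 84, "Others": 126},
    "No": {"Business": 182, "Engineering": 208, "Others": 130},
}


def row_percent(table, row, column):
    """Percent of one row's total that falls in the given column."""
    row_total = sum(table[row].values())
    return 100 * table[row][column] / row_total


def column_percent(table, row, column):
    """Percent of one column's total that falls in the given row."""
    column_total = sum(cells[column] for cells in table.values())
    return 100 * table[row][column] / column_total


# Among students planning on graduate school, the share majoring in engineering.
engineering_given_yes = row_percent(counts, "Yes", "Engineering")
# Among business majors, the share planning on graduate school.
yes_given_business = column_percent(counts, "Yes", "Business")

print(round(engineering_given_yes, 2), round(yes_given_business, 2))
```

Reading the question for the phrase "of those who ..." identifies which total belongs in the denominator.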
A histogram is
a. a graphical presentation of a frequency or relative frequency distribution. |
b. a graphical method of presenting a cumulative frequency or a cumulative relative frequency distribution. |
c. the history of data elements. |
d. the same as a pie chart. |
The sum of frequencies for all classes will always equal
a. the number of elements in a data set. |
b. 1. |
c. the number of classes. |
d. a value between 0 and 1. |
In a cumulative percent frequency distribution, the last class will have a cumulative percent frequency equal to
a. the total number of elements in the data set. |
b. 100. |
c. one. |
d. None of these alternatives are correct. |
The numbers of hours worked (per week) by 400 statistics students are shown below.
Number of hours | Frequency |
0 – 9 | 20 |
10 – 19 | 80 |
20 – 29 | 200 |
30 – 39 | 100 |
The relative frequency of students working 10 – 19 hours per week is
a. .20 |
b. .80 |
c. .25 |
d. .40 |
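As an illustration of how frequency, relative frequency, percent frequency, and cumulative frequency relate, the hours-worked table above can be processed with a short Python sketch. The variable names are illustrative assumptions; the counts come from the table.

```python
# Class frequencies copied from the hours-worked table above.
frequencies = {"0-9": 20, "10-19": 80, "20-29": 200, "30-39": 100}

n = sum(frequencies.values())  # total number of students

# Relative frequency = class frequency / n; percent frequency = 100 times that.
relative = {label: f / n for label, f in frequencies.items()}
percent = {label: 100 * f / n for label, f in frequencies.items()}

# Cumulative frequency: running total of the class frequencies.
cumulative = {}
running = 0
for label, f in frequencies.items():
    running += f
    cumulative[label] = running

for label in frequencies:
    print(label, relative[label], percent[label], cumulative[label])
```

The relative frequencies sum to one, the percent frequencies sum to 100, and the last cumulative frequency equals n.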
A survey of 800 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.
Undergraduate Major | ||||
Graduate School | Business | Engineering | Others | Total |
Yes | 70 | 84 | 126 | 280 |
No | 182 | 208 | 130 | 520 |
Total | 252 | 292 | 256 | 800 |
The above crosstabulation shows
a. column percentages. |
b. frequencies. |
c. row percentages. |
d. overall percentages. |
A frequency distribution is
a. a tabular summary of a set of data showing the relative frequency. |
b. a graphical device for presenting categorical data. |
c. a tabular summary of a set of data showing the frequency of items in each of several nonoverlapping classes. |
d. a graphical form of representing data. |
In a cumulative frequency distribution, the last class will always have a cumulative frequency equal to
a. 10. |
b. 100%. |
c. one. |
d. the total number of elements in the data set. |
Which of the following is not a recommended guideline for creating an effective graphical display?
a. Label each axis and show the units of measure |
b. If colors are used to distinguish categories, use a legend to define them |
c. Use three dimensions whenever possible, to give the display depth |
d. Give the display a clear and concise title |
A researcher is gathering data from four geographical areas designated: South = 1; North = 2; East = 3; West = 4. The designated geographical regions represent
a. quantitative data. |
b. categorical data. |
c. crosstabular data. |
d. either categorical or quantitative data. |
If a negative relationship exists between two variables, x and y, which of the following statements is true?
a. As x decreases, y decreases. |
b. As x increases, y increases. |
c. As x decreases, y stays the same. |
d. As x increases, y decreases. |
In a cumulative percent frequency distribution, the last class will have a cumulative percent frequency equal to
a. the total number of elements in the data set. |
b. one. |
c. 100. |
d. None of these alternatives are correct. |
The percent frequency of a class is computed by
a. dividing the relative frequency by 100. |
b. adding 100 to the relative frequency. |
c. multiplying the relative frequency by 100. |
d. multiplying the relative frequency by 10. |
Which of the following is a graphical summary of a set of data in which each data value is represented by a dot above the axis?
a. Crosstabulation |
b. Histogram |
c. Box plot |
d. Dot plot |
Data that provide labels or names for categories of like items are known as
a. label data. |
b. category data. |
c. categorical data. |
d. quantitative data. |
In quality control applications, bar charts are used to identify the most important causes of problems. When the bars are arranged in descending order of height from left to right with the most frequently occurring cause appearing first, the bar chart is called a
a. Simpson's chart. |
b. Pareto diagram. |
c. Stacked bar chart. |
d. Cause-and-effect diagram. |
A frequency distribution is a tabular summary of data showing the
a. percentage of items in several classes. |
b. relative percentage of items in several classes. |
c. number of items in several classes. |
d. fraction of items in several classes. |
Consider the following graphical summary.

This is an example of a _____.
a. percent frequency distribution |
b. relative frequency distribution |
c. bar chart |
d. pie chart |
A survey of 800 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.
Undergraduate Major | ||||
Graduate School | Business | Engineering | Others | Total |
Yes | 70 | 84 | 126 | 280 |
No | 182 | 208 | 130 | 520 |
Total | 252 | 292 | 256 | 800 |
Of those students who are planning on going to graduate school, what percentage are majoring in engineering?
a. 10.5 |
b. 30.0 |
c. 40.4 |
d. 28.8 |
For stem-and-leaf displays where the leaf unit is not stated, the leaf unit is assumed to equal
a. 1. |
b. 10. |
c. 0. |
d. -1. |
A sample of 15 children shows their favorite restaurants:
McDonalds | Luppi’s | Mellow Mushroom |
Friday’s | McDonalds | McDonalds |
Pizza Hut | Taco Bell | McDonalds |
Mellow Mushroom | Luppi’s | Pizza Hut |
McDonalds | Friday’s | McDonalds |
Which of the following is the correct frequency distribution?
a. McDonalds 4, Friday’s 3, Pizza Hut 1, Mellow Mushroom 4, Luppi’s 3, Taco Bell 1 |
b. McDonalds 6, Friday’s 2, Pizza Hut 2, Mellow Mushroom 2, Luppi’s 2, Taco Bell 1 |
c. McDonalds 6, Friday’s 1, Pizza Hut 3, Mellow Mushroom 1, Luppi’s 2, Taco Bell 2 |
d. None of these alternatives are correct. |
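A frequency distribution for categorical responses like these can be tallied mechanically. The sketch below uses Python's `collections.Counter` on the 15 responses listed above; the list name `responses` is an illustrative choice.

```python
from collections import Counter

# The 15 responses, copied row by row from the sample above.
responses = [
    "McDonalds", "Luppi's", "Mellow Mushroom",
    "Friday's", "McDonalds", "McDonalds",
    "Pizza Hut", "Taco Bell", "McDonalds",
    "Mellow Mushroom", "Luppi's", "Pizza Hut",
    "McDonalds", "Friday's", "McDonalds",
]

# Counter maps each distinct category to the number of times it occurs.
frequency = Counter(responses)

for restaurant, count in frequency.most_common():
    print(restaurant, count)
```

A quick check on any hand-built frequency distribution is that the counts add up to the sample size, here 15.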
In a cumulative percent frequency distribution, the last class will have a cumulative percent frequency equal to
a. the total number of elements in the data set. |
b. 100. |
c. one. |
d. None of these alternatives are correct. |
A graphical method that can be used to show both the rank order and shape of a distribution of data simultaneously is a
a. stem-and-leaf display. |
b. dot plot. |
c. relative frequency distribution. |
d. pie chart. |
The numbers of hours worked (per week) by 400 statistics students are shown below.
Number of hours | Frequency |
0 – 9 | 20 |
10 – 19 | 80 |
20 – 29 | 200 |
30 – 39 | 100 |
The relative frequency of students working 10 – 19 hours per week is
a. .25 |
b. .80 |
c. .40 |
d. .20 |
The difference between the lower class limits of adjacent classes provides the
a. class width. |
b. number of classes. |
c. class limits. |
d. class midpoint. |
Which of the following is a graphical summary of a set of data in which each data value is represented by a dot above the axis?
a. Crosstabulation |
b. Dot plot |
c. Histogram |
d. Box plot |
The sum of frequencies for all classes will always equal
a. 1. |
b. the number of classes. |
c. the number of elements in a data set. |
d. a value between 0 and 1. |
Information on the number of new teachers hired in a school district for each of four years is given in the table below.

The percent frequency of new hires in 2019 is _____.
a. 80% |
b. 40% |
c. 10% |
d. 25% |
What types of variables can be displayed by a scatter diagram?
a. Two quantitative variables |
b. Two qualitative variables |
c. One quantitative and one qualitative variable |
d. Only two discrete quantitative variables |
The relative frequency of a class is computed by
a. dividing the frequency of the class by the number of classes. |
b. dividing n by cumulative frequency of the class. |
c. dividing the cumulative frequency of the class by n. |
d. dividing the frequency of the class by n. |
Which of the following graphical methods shows the relationship between two variables?
a. Dot plot |
b. Crosstabulation |
c. Histogram |
d. Pie chart |
A researcher is gathering data from four geographical areas designated: South = 1; North = 2; East = 3; West = 4. The designated geographical regions represent
a. crosstabular data. |
b. categorical data. |
c. quantitative data. |
d. either categorical or quantitative data. |
The number of miles from their residence to their place of work for 120 employees is shown below.

The relative frequency of employees who drive 10 miles or less to work is _____.
a. 0.85 |
b. 0.85 |
c. 0.25 |
d. 0.71 |
The most common graphical presentation of quantitative data is a
a. pie chart. |
b. stem-and-leaf display. |
c. bar chart. |
d. histogram. |
The difference between the lower class limits of adjacent classes provides the
a. class limits. |
b. class midpoint. |
c. number of classes. |
d. class width. |
A graphical method that can be used to show both the rank order and shape of a distribution of data simultaneously is a
a. dot plot. |
b. stem-and-leaf display. |
c. relative frequency distribution. |
d. pie chart. |
A survey of 400 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.

What percentage of the undergraduates surveyed are majoring in Engineering?
a. 400% |
b. 42% |
c. 151% |
d. 38% |
Information on the number of new teachers hired in a school district for each of four years is given in the table below.

The percent frequency of new hires in 2019 is _____.
a. 10% |
b. 80% |
c. 40% |
d. 25% |
Quiz 3
When n – 1 is used in the denominator to compute variance,
a. the data set is a population. |
b. the data set could be either a sample or a population. |
c. the data set is from a census. |
d. the data set is a sample. |
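To make the two denominators concrete, here is a small Python sketch using the standard `statistics` module, where `variance` divides by n – 1 and `pvariance` divides by N. The data values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import statistics

# Hypothetical data: mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Sample variance: sum of squared deviations divided by n - 1 (here 32 / 7).
sample_variance = statistics.variance(data)

# Population variance: sum of squared deviations divided by N (here 32 / 8).
population_variance = statistics.pvariance(data)

print(sample_variance, population_variance)
```

For the same values, the n – 1 version is always at least as large as the N version.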
Which of the following symbols represents the variance of the population?
a. σ |
b. x̄ |
c. σ² |
d. μ |
If a data set has an even number of observations, the median
a. cannot be determined. |
b. is the average value of the two middle items. |
c. is the average value of the two middle items when all items are arranged in ascending order. |
d. must be equal to the mean. |
The median of a sample will always equal the
a. (smallest value + largest value)/2. |
b. (Q1 + Q3)/2. |
c. 50th percentile. |
d. Q4/2. |
Which of the following is a measure of variability?
a. Percentiles |
b. Quartiles |
c. Interquartile range |
d. Geometric mean |
The coefficient of variation indicates how large the standard deviation is relative to the
a. median. |
b. mean. |
c. range. |
d. variance. |
The measure of dispersion which is not measured in the same units as the original data is the
a. coefficient of determination. |
b. median. |
c. variance. |
d. standard deviation. |
The symbol σ is used to represent
a. the standard deviation of the population. |
b. the standard deviation of the sample. |
c. the variance of the sample. |
d. the variance of the population. |
Which of the following symbols represents the size of the population?
a. σ² |
b. σ |
c. μ |
d. N |
If two groups of numbers have the same mean, then
a. other measures of location need not be the same. |
b. their medians must also be equal. |
c. their modes must also be equal. |
d. their standard deviations must also be equal. |
The most frequently occurring value of a data set is called the
a. mean. |
b. median. |
c. mode. |
d. range. |
The hourly wages of a sample of 130 system analysts are given below.
mean = 60 | range = 20 |
mode = 73 | variance = 324 |
median = 74 |
The coefficient of variation equals
a. 0.30%. |
b. 54%. |
c. 30%. |
d. 5.4%. |
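The coefficient of variation is the standard deviation expressed as a percentage of the mean, so the variance given above has to be converted to a standard deviation first. A minimal Python sketch of that arithmetic, using the summary statistics from the question:

```python
import math

# Summary statistics copied from the hourly-wage question above.
mean = 60
variance = 324

# The standard deviation is the positive square root of the variance.
standard_deviation = math.sqrt(variance)

# Coefficient of variation, expressed as a percentage of the mean.
coefficient_of_variation = 100 * standard_deviation / mean

print(standard_deviation, coefficient_of_variation)
```

The median, mode, and range listed in the question are not needed for this calculation.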
Which of the following symbols represents the mean of the sample?
a. σ |
b. μ |
c. x̄ |
d. σ² |
A researcher has collected the following sample data.
5 | 12 | 6 | 8 | 5 |
6 | 7 | 5 | 12 | 4 |
The 75th percentile is
a. 8. |
b. 7.5. |
c. 9. |
d. 7. |
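Percentile values depend on the convention used, and textbooks and software differ. The sketch below assumes one common textbook rule: compute the position i = (p/100)n on the sorted data; if i is not an integer, round up and take that value; if i is an integer, average the values in positions i and i + 1. The function name `percentile` and this choice of rule are assumptions, and interpolation-based conventions can give a different result for the same data.

```python
def percentile(values, p):
    """p-th percentile (integer p, 0 < p < 100) using the round-up position rule."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    # Integer arithmetic avoids floating-point error in i = (p / 100) * n.
    position, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        # i is an integer: average the values in positions i and i + 1 (1-based).
        return (ordered[position - 1] + ordered[position]) / 2
    # i is not an integer: round up; 1-based position ceil(i) is 0-based index `position`.
    return ordered[position]


# Sample data copied from the question above.
sample = [5, 12, 6, 8, 5, 6, 7, 5, 12, 4]

print(percentile(sample, 75))
```

Here n = 10 and p = 75 give i = 7.5, so the rule selects the value in the 8th position of the sorted data.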
The coefficient of variation is
a. the square of the standard deviation. |
b. the mean divided by the standard deviation. |
c. the standard deviation divided by the mean times 100. |
d. the same as the variance. |
The geometric mean of 1, 1, 8 is
a. 10.0. |
b. 3.33. |
c. 2.0. |
d. 3.0. |
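The geometric mean of n values is the n-th root of their product, which generally differs from the arithmetic mean. A short Python sketch comparing the two for the values in the question, using the standard `statistics` module (`geometric_mean` requires Python 3.8 or later):

```python
import statistics

values = [1, 1, 8]

# Geometric mean: cube root of the product 1 * 1 * 8 = 8.
geometric = statistics.geometric_mean(values)

# Arithmetic mean: (1 + 1 + 8) / 3, shown for comparison.
arithmetic = statistics.mean(values)

print(round(geometric, 2), round(arithmetic, 2))
```

For positive data that are not all equal, the geometric mean is smaller than the arithmetic mean.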
The median is a measure of
a. central location. |
b. relative location. |
c. relative dispersion. |
d. absolute dispersion. |
A numerical measure of linear association between two variables is the
a. coefficient of variation. |
b. standard deviation. |
c. covariance. |
d. variance. |
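As an illustration of covariance as a measure of linear association, the sketch below computes the sample covariance directly from its definition: the sum of the products of deviations from the two means, divided by n – 1. The data and the function name `sample_covariance` are hypothetical.

```python
def sample_covariance(x, y):
    """Sample covariance: sum of (x_i - x_bar)(y_i - y_bar) divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical paired data: y rises with x, so the covariance is positive.
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 6, 8, 10]
# Reversing y makes it fall as x rises, so the covariance is negative.
y_falling = list(reversed(y_rising))

print(sample_covariance(x, y_rising), sample_covariance(x, y_falling))
```

The sign of the covariance indicates the direction of the linear relationship; its magnitude depends on the units of measurement.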
Which of the following is not a measure of variability of a single variable?
a. Interquartile range |
b. Range |
c. Covariance |
d. Standard deviation |
The difference between the largest and the smallest data values is the
a. interquartile range. |
b. range. |
c. coefficient of variation. |
d. variance. |
The variance of the sample
a. cannot be zero. |
b. can never be negative. |
c. can be negative. |
d. cannot be less than one. |
Statements about the proportion of data values that must be within a specified number of standard deviations of the mean can be made using
a. Chebyshev’s theorem. |
b. A five-number summary. |
c. Percentiles. |
d. The empirical rule. |
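Chebyshev's theorem gives a lower bound that holds for any data set: at least 1 – 1/z² of the values lie within z standard deviations of the mean, for any z greater than 1. A minimal Python sketch of that bound, with an illustrative function name:

```python
def chebyshev_min_proportion(z):
    """Minimum proportion of values within z standard deviations of the mean (z > 1)."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2


for z in (2, 3, 4):
    print(z, round(chebyshev_min_proportion(z), 4))
```

Because the bound makes no assumption about the shape of the distribution, it is usually more conservative than the empirical rule for bell-shaped data.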
Which of the following symbols represents the variance of the population?
a. μ |
b. σ² |
c. x̄ |
d. σ |
The geometric mean of 2, 4, 8 is
a. 4.67. |
b. 16. |
c. 5.0. |
d. 4.0. |
Suppose a sample of 45 measurements gave a data set with a range of –8 to –22. The standard deviation of the measurements
a. is negative since all the numbers are negative. |
b. cannot be computed since all the numbers are negative. |
c. can be either negative or positive. |
d. must be at least zero. |
Using the following data set of monthly rainfall amounts recorded for 10 randomly selected months in a two-year period, what is the five-number summary?
Sample data (in inches): 2, 8, 5, 0, 1, 5, 7, 5, 2, .5
a. 2, 5, 5, 5, .5 |
b. .5, 2, 5, 7, 8 |
c. 0, 1, 5, 5, 8 |
d. 0, 1, 3.5, 5, 8 |
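A five-number summary lists the smallest value, first quartile, median, third quartile, and largest value. Quartile conventions vary between textbooks and software; the sketch below assumes one simple convention, taking Q1 and Q3 as the medians of the lower and upper halves of the sorted data. The function name and that choice of convention are assumptions.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), with quartiles as medians of the halves."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    # For an odd n, the middle value is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )


# Rainfall data (in inches) copied from the question above.
rainfall = [2, 8, 5, 0, 1, 5, 7, 5, 2, 0.5]

print(five_number_summary(rainfall))
```

Sorting first matters: the summary is read off the ordered values, not the order in which the months were recorded.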
The difference between the largest and the smallest data values is the
a. range. |
b. interquartile range. |
c. variance. |
d. coefficient of variation. |
In computing the mean of a sample, the value of ∑xi is divided by
a. n + 1. |
b. n – 2. |
c. n. |
d. n – 1. |
The 75th percentile is referred to as the
a. third quartile. |
b. second quartile. |
c. first quartile. |
d. fourth quartile. |
The measure of variability easiest to compute, but seldom used as the only measure, is the
a. range. |
b. standard deviation. |
c. variance. |
d. interquartile range. |
The numerical value of the variance
a. is always smaller than the numerical value of the standard deviation. |
b. is negative if the mean is negative. |
c. is always larger than the numerical value of the standard deviation. |
d. can be larger or smaller than the numerical value of the standard deviation. |
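Since the variance is the square of the standard deviation, which of the two is numerically larger depends on whether the standard deviation is below or above 1. A short Python sketch with hypothetical standard deviations:

```python
# Hypothetical standard deviations on either side of 1.
comparisons = {}
for standard_deviation in (0.5, 1.0, 2.0):
    variance = standard_deviation**2
    if variance < standard_deviation:
        comparisons[standard_deviation] = "variance is smaller"
    elif variance > standard_deviation:
        comparisons[standard_deviation] = "variance is larger"
    else:
        comparisons[standard_deviation] = "variance equals standard deviation"

for standard_deviation, result in comparisons.items():
    print(standard_deviation, result)
```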
The __________ can be interpreted as the number of standard deviations a data value is from the mean of all the data values.
a. correlation coefficient |
b. skewness |
c. z-score |
d. coefficient of variation |
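The idea of measuring a value's distance from the mean in units of standard deviations can be sketched in a few lines of Python. The numbers and the function name `standardize` are hypothetical.

```python
def standardize(x, mean, standard_deviation):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / standard_deviation


# Hypothetical data set with mean 5 and standard deviation 2.
print(standardize(9, 5, 2))  # value above the mean
print(standardize(2, 5, 2))  # value below the mean
```

A positive result means the value is above the mean and a negative result means it is below; a value equal to the mean gives zero.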
The most frequently occurring value of a data set is called the
a. median. |
b. range. |
c. mode. |
d. mean. |
If a data set has an even number of observations, the median
a. cannot be determined. |
b. is the average value of the two middle items when all items are arranged in ascending order. |
c. must be equal to the mean. |
d. is the average value of the two middle items. |
The heights (in inches) of 25 individuals were recorded and the following statistics were calculated
mean = 70 | range = 20 |
mode = 73 | variance = 784 |
median = 74 |
The coefficient of variation equals
a. 1120%. |
b. 0.4%. |
c. 11.2%. |
d. 40%. |
From a population of size 1,000, a random sample of 100 items is selected. The mean of the sample
a. must be 10 times larger than the mean of the population. |
b. must be 10 times smaller than the mean of the population. |
c. can be larger, smaller or equal to the mean of the population. |
d. must be equal to the mean of the population, if the sample is truly random. |
Which of the following provides a measure of central location for the data?
a. Range |
b. Variance |
c. Standard deviation |
d. Mean |
The percentage of data values that must be within one, two, and three standard deviations of the mean for data having a bell-shaped distribution can be determined using
a. Chebyshev’s theorem. |
b. Percentiles. |
c. A five-number summary. |
d. The empirical rule. |
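The percentages usually quoted for bell-shaped data (approximately 68%, 95%, and almost all values within one, two, and three standard deviations) can be checked against the normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8 or later) and treats "bell-shaped" as exactly normal, which is an idealizing assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * proportion_within(k), 1))
```

Unlike Chebyshev's theorem, these figures apply only when the data are approximately bell-shaped.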
During a cold winter, the temperature stayed below zero for ten days (ranging from -20 to -5). The variance of the temperatures of the ten-day period
a. can be either negative or positive. |
b. cannot be computed since all the numbers are negative. |
c. must be at least zero. |
d. is negative since all the numbers are negative. |
The weights (in pounds) of a sample of 36 individuals were recorded and the following statistics were calculated.
mean = 160 | range = 60 |
mode = 165 | variance = 324 |
median = 170 |
The coefficient of variation equals
a. 0.20312%. |
b. 0.1125%. |
c. 11.25%. |
d. 203.12%. |
The measure of dispersion which is not measured in the same units as the original data is the
a. median. |
b. coefficient of determination. |
c. variance. |
d. standard deviation. |
Which of the following is not a measure of variability of a single variable?
a. Standard deviation |
b. Covariance |
c. Range |
d. Interquartile range |
The sample variance
a. could be smaller, equal to, or larger than the true value of the population variance. |
b. is always smaller than the true value of the population variance. |
c. can never be zero. |
d. is always larger than the true value of the population variance. |
The geometric mean of five observations is the
a. fifth root of the product of the 5 observations. |
b. same as their mean. |
c. square root of the product of the 5 observations. |
d. same as their weighted mean. |
The geometric mean of 1, 1, 8 is
a. 2.0. |
b. 3.0. |
c. 3.33. |
d. 10.0. |
When the data are skewed to the right, the measure of skewness will be
a. positive. |
b. one. |
c. negative. |
d. zero. |
Generally, which one of the following is the least appropriate measure of central tendency for a data set that contains outliers?
a. Median |
b. Mean |
c. 50th percentile |
d. 2nd quartile |
Consider the following data summary.
This is an example of a _____.
a. histogram |
b. frequency table |
c. line graph |
d. box plot |
A survey to collect data on the entire population is
a. a population. |
b. a sample. |
c. an inference. |
d. a census. |
The subject of data mining deals with
a. keeping data secure so that unauthorized individuals cannot access the data. |
b. computational procedure for data analysis. |
c. computing the average for data. |
d. methods for developing useful decision-making information from large databases. |
Which of the following is not an example of descriptive statistics?
a. The proportion of mailed-out questionnaires that were returned |
b. A histogram depicting the age distribution for 30 randomly selected students |
c. A table summarizing the data collected in a sample of new-car buyers |
d. An estimate of the number of Alaska residents who have visited Canada |
Which of the following disciplines has contributed the least to the development of data mining procedures?
a. Mathematics |
b. Statistics |
c. Psychology |
d. Computer science |
Which of the following is a scale of measurement?
a. Remedial |
b. Ratio |
c. Primal |
d. Divisional |
In a sample of 1,600 registered voters, 912 or 57% approve of the way the President is doing his job. A political pollster estimates: “Fifty-seven percent of all voters approve of the President.” This statement is an example of
a. descriptive statistics. |
b. a sample. |
c. a population. |
d. statistical inference. |
Optimization models, which generate solutions that maximize or minimize some objective subject to a set of constraints, fall into the category of
a. diagnostic analytics. |
b. predictive analytics. |
c. prescriptive analytics. |
d. descriptive analytics. |
The process of analyzing sample data in order to draw conclusions about the characteristics of a population is called
a. descriptive statistics. |
b. data analysis. |
c. data summarization. |
d. statistical inference. |
The number of sick days taken (per month) by 150 factory workers is summarized below.
The cumulative frequency for the class 11–15 is _____.
a. 15 |
b. .10 |
c. .97 |
d. 145 |
Histograms based on data on housing prices and salaries typically are
a. skewed to the left. |
b. skewed to the right. |
c. stacked. |
d. symmetric. |
A sample of 15 children shows their favorite restaurants:
McDonalds | Luppi’s | Mellow Mushroom |
Friday’s | McDonalds | McDonalds |
Pizza Hut | Taco Bell | McDonalds |
Mellow Mushroom | Luppi’s | Pizza Hut |
McDonalds | Friday’s | McDonalds |
Which of the following is the correct relative frequency for McDonalds?
a. .27 |
b. .6 |
c. .4 |
d. .5 |
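One way to tabulate the restaurant data is sketched below, assuming the relative frequency of a category is its count divided by the sample size. The list simply transcribes the 15 responses shown above.

```python
from collections import Counter

# The 15 responses, transcribed row by row from the table in the question.
favorites = [
    "McDonalds", "Luppi's", "Mellow Mushroom",
    "Friday's", "McDonalds", "McDonalds",
    "Pizza Hut", "Taco Bell", "McDonalds",
    "Mellow Mushroom", "Luppi's", "Pizza Hut",
    "McDonalds", "Friday's", "McDonalds",
]

counts = Counter(favorites)   # frequency of each restaurant
n = len(favorites)            # sample size

# Relative frequency = frequency / sample size.
relative = {name: count / n for name, count in counts.items()}

for name, count in counts.most_common():
    print(f"{name:16s} {count:2d}  {relative[name]:.2f}")
```

Multiplying each relative frequency by 100 gives the percent frequency.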
A sample of 15 children shows their favorite restaurants:
McDonalds | Luppi’s | Mellow Mushroom |
Friday’s | McDonalds | McDonalds |
Pizza Hut | Taco Bell | McDonalds |
Mellow Mushroom | Luppi’s | Pizza Hut |
McDonalds | Friday’s | McDonalds |
Which of the following distributions would be inappropriate for this data?
a. Relative frequency |
b. Frequency |
c. Percent frequency |
d. Cumulative frequency |
A survey of 400 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.
What percentage of the undergraduates surveyed are majoring in Engineering?
a. 151% |
b. 42% |
c. 38% |
d. 400% |
In quality control applications, bar charts are used to identify the most important causes of problems. When the bars are arranged in descending order of height from left to right with the most frequently occurring cause appearing first, the bar chart is called a
a. Pareto diagram. |
b. Simpson’s chart. |
c. Cause-and-effect diagram. |
d. Stacked bar chart. |
A graphical presentation of the relationship between two quantitative variables is
a. histogram. |
b. stem-and-leaf display. |
c. scatter diagram. |
d. dot plot. |
The proper way to construct a stem-and-leaf display for the data set {62, 67, 68, 73, 73, 79, 91, 94, 95, 97} is to
a. include a stem labeled ‘(8)’ and enter no leaves on the stem. |
b. include a stem labeled ‘8’ and enter one leaf value of ‘0’ on the stem. |
c. include a stem labeled ‘8’ and enter no leaves on the stem. |
d. exclude a stem labeled ‘8’. |
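The display for this data set can be sketched in a few lines. The sketch assumes two-digit non-negative integers, with the tens digit as the stem and the units digit as the leaf. The function name stem_and_leaf is hypothetical.

```python
def stem_and_leaf(values):
    """Return display lines, one per stem, including stems with no leaves."""
    values = sorted(values)
    # Every stem from the smallest to the largest is listed, even if empty.
    stems = range(values[0] // 10, values[-1] // 10 + 1)
    lines = []
    for stem in stems:
        leaves = [str(v % 10) for v in values if v // 10 == stem]
        lines.append(f"{stem} | {' '.join(leaves)}".rstrip())
    return lines

data = [62, 67, 68, 73, 73, 79, 91, 94, 95, 97]
display = stem_and_leaf(data)
for line in display:
    print(line)
```

Keeping the empty stem in the output preserves the visible gap between the 70s and the 90s.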
Data that indicate how much or how many are known as
a. quantitative data. |
b. categorical data. |
c. cumulative data. |
d. relative data. |
The approximate class width for a frequency distribution involving quantitative data can be determined using the expression
a. desired number of classes/class midpoint. |
b. mean frequency/total frequency. |
c. range/desired number of classes. |
d. total frequency/class midpoint. |
The numbers of hours worked (per week) by 400 statistics students are shown below.
Number of hours | Frequency |
0 – 9 | 20 |
10 – 19 | 80 |
20 – 29 | 200 |
30 – 39 | 100 |
The cumulative percent frequency for students working less than 20 hours per week is
a. 20%. |
b. 80%. |
c. 100%. |
d. 25%. |
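The cumulative percent frequencies for this table can be worked out as follows. This is a small sketch in which the class labels are plain strings and the variable names are illustrative.

```python
from itertools import accumulate

classes = ["0-9", "10-19", "20-29", "30-39"]
frequencies = [20, 80, 200, 100]

total = sum(frequencies)

# Running totals: the number of students at or below each class.
cumulative = list(accumulate(frequencies))

# Cumulative percent frequency = cumulative frequency / total * 100.
cumulative_percent = [100 * c / total for c in cumulative]

for label, c, pct in zip(classes, cumulative, cumulative_percent):
    print(f"{label:6s} {c:4d} {pct:6.1f}%")
```

Students working less than 20 hours are those in the first two classes, so the value of interest is the second entry of cumulative_percent.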
A frequency distribution is
a. a graphical form of representing data. |
b. a tabular summary of a set of data showing the relative frequency. |
c. a graphical device for presenting categorical data. |
d. a tabular summary of a set of data showing the frequency of items in each of several nonoverlapping classes. |
Suppose the number of personal days an employee uses per year has a mean of 8 and a standard deviation of 2. What percent of the data values will be within two standard deviations of the mean if the distribution is bell-shaped?
a. At least 95% |
b. At least 75% |
c. At least 30% |
d. The percentage cannot be computed with the information given. |
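The sketch below contrasts the two rules that questions like this one turn on. Chebyshev's theorem gives a lower bound that holds for any distribution, while the empirical rule describes bell-shaped data. Treating "bell-shaped" as exactly normal is an assumption made here for illustration, and both function names are hypothetical.

```python
import math

def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations (any distribution, k > 1)."""
    return 1 - 1 / k**2

def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

mean, std_dev, k = 8, 2, 2
low, high = mean - k * std_dev, mean + k * std_dev

print(f"interval: {low} to {high}")
print(f"Chebyshev bound: at least {chebyshev_lower_bound(k):.0%}")
print(f"normal coverage: about {normal_coverage(k):.1%}")
```

The Chebyshev figure is a guaranteed minimum, whereas the normal figure is an approximation that applies only when the data are bell-shaped.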
Since the median is the middle value of a data set it
a. must always be smaller than the mode. |
b. must always be larger than the mode. |
c. must always be smaller than the mean. |
d. None of these alternatives are correct. |
When data are positively skewed, the mean will usually be
a. greater than the median. |
b. smaller than the median. |
c. equal to the median. |
d. positive. |
Which of the following symbols represents the standard deviation of the population?
a. σ² |
b. x̄ |
c. σ |
d. μ |
The descriptive measure of variability that is based on the concept of a deviation about the mean is
a. the standard deviation. |
b. the absolute value of the range. |
c. the range. |
d. the interquartile range. |
The following is the frequency distribution for the speed of a sample of automobiles traveling on an interstate highway.
Speed (mph) | Frequency |
50 – 54 | 2 |
55 – 59 | 4 |
60 – 64 | 5 |
65 – 69 | 10 |
70 – 74 | 9 |
75 – 79 | 5 |
Total | 35 |
The mean is
a. 35. |
b. 67. |
c. 670. |
d. 10. |
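For grouped data, the mean is usually approximated by weighting each class midpoint by its frequency. A minimal sketch of that calculation for the speed table follows, with each class written as (lower limit, upper limit, frequency).

```python
# Each tuple is (lower limit, upper limit, frequency) from the table above.
classes = [
    (50, 54, 2),
    (55, 59, 4),
    (60, 64, 5),
    (65, 69, 10),
    (70, 74, 9),
    (75, 79, 5),
]

n = sum(freq for _, _, freq in classes)

# Sum of (class midpoint * frequency) over all classes.
weighted_sum = sum((lo + hi) / 2 * freq for lo, hi, freq in classes)

grouped_mean = weighted_sum / n
print(n, weighted_sum, grouped_mean)
```

The result is an approximation, since the individual speeds within each class are not known.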
When the data are skewed to the right, the measure of skewness will be
a. one. |
b. zero. |
c. negative. |
d. positive. |
A box plot is a graphical representation of data that is based on
a. the empirical rule. |
b. a histogram. |
c. a five number summary. |
d. z-scores. |
The standard deviation of a sample was reported to be 20. The report indicated that Σ(x − x̄)² = 7200. What is the sample size?
a. 17 |
b. 16 |
c. 18 |
d. 19 |
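The sample size can be recovered by rearranging the sample variance formula s² = Σ(x − x̄)² / (n − 1). The sketch below assumes that n − 1 divisor, which is the standard one for a sample. The variable names are illustrative.

```python
s = 20.0               # reported sample standard deviation
sum_sq_dev = 7200.0    # reported sum of squared deviations about the mean

# s**2 = sum_sq_dev / (n - 1)  =>  n - 1 = sum_sq_dev / s**2
degrees_of_freedom = sum_sq_dev / s**2
n = round(degrees_of_freedom) + 1

print(degrees_of_freedom, n)
```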
Data that indicate how much or how many are known as
a. cumulative data. |
b. relative data. |
c. quantitative data. |
d. categorical data. |
In a stem-and-leaf display,
a. a single digit is used to define each stem, and one or more digits are used to define each leaf. |
b. one or more digits are used to define each stem, and a single digit is used to define each leaf. |
c. a single digit is used to define each stem, and a single digit is used to define each leaf. |
d. one or more digits are used to define each stem, and one or more digits are used to define each leaf. |
The numbers of hours worked (per week) by 400 statistics students are shown below.
Number of hours | Frequency |
0 – 9 | 20 |
10 – 19 | 80 |
20 – 29 | 200 |
30 – 39 | 100 |
The percentage of students who work at least 10 hours per week is
a. 5%. |
b. 50%. |
c. 100%. |
d. 95%. |
A survey of 400 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.
What percentage of the undergraduates surveyed are majoring in Engineering?
a. 38% |
b. 42% |
c. 400% |
d. 151% |
A survey of 800 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.
Undergraduate Major | ||||
Graduate School | Business | Engineering | Others | Total |
Yes | 70 | 84 | 126 | 280 |
No | 182 | 208 | 130 | 520 |
Total | 252 | 292 | 256 | 800 |
The above crosstabulation shows
a. frequencies. |
b. column percentages. |
c. overall percentages. |
d. row percentages. |
The numbers of hours worked (per week) by 400 statistics students are shown below.
Number of hours | Frequency |
0 – 9 | 20 |
10 – 19 | 80 |
20 – 29 | 200 |
30 – 39 | 100 |
The midpoint of the last class is
a. 34. |
b. 35.5. |
c. 35. |
d. 34.5. |
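A class midpoint is conventionally taken as the average of the lower and upper class limits. A short sketch for the hours-worked table follows. The helper name class_midpoint is hypothetical.

```python
def class_midpoint(lower, upper):
    """Midpoint of a class: the average of its lower and upper limits."""
    return (lower + upper) / 2

# (lower limit, upper limit) for each class in the table above.
classes = [(0, 9), (10, 19), (20, 29), (30, 39)]

midpoints = [class_midpoint(lo, hi) for lo, hi in classes]
print(midpoints)
```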
A survey of 800 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.
Undergraduate Major | ||||
Graduate School | Business | Engineering | Others | Total |
Yes | 70 | 84 | 126 | 280 |
No | 182 | 208 | 130 | 520 |
Total | 252 | 292 | 256 | 800 |
Of those students who are majoring in business, what percentage plans to go to graduate school?
a. 27.78 |
b. 8.75 |
c. 70.00 |
d. 72.22 |
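Questions of the form "of those who are X, what percentage are Y" call for a column percentage: the cell count divided by the column total. A minimal sketch using the counts from the crosstabulation above follows. The dictionary layout and the function name column_percent are illustrative choices.

```python
# Counts from the crosstabulation, organized by undergraduate major (column).
crosstab = {
    "Business":    {"Yes": 70,  "No": 182},
    "Engineering": {"Yes": 84,  "No": 208},
    "Others":      {"Yes": 126, "No": 130},
}

def column_percent(major, answer):
    """Percentage of students in the given major who gave the given answer."""
    column = crosstab[major]
    return 100 * column[answer] / sum(column.values())

business_yes = column_percent("Business", "Yes")
print(f"{business_yes:.2f}")
```

Dividing by the row total (280) or the grand total (800) instead would give a row percentage or an overall percentage, which answer different questions.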
A graphical device for depicting categorical data that have been summarized in a frequency distribution, relative frequency distribution, or percent frequency distribution is a
a. histogram. |
b. dot plot. |
c. bar chart. |
d. stem-and-leaf display. |
Categorical data
a. must be nonnumeric. |
b. are labels used to identify attributes of elements. |
c. indicate either how much or how many. |
d. cannot be numeric. |
A characteristic of interest for the elements is called a(n) _____.
a. data set |
b. sample |
c. variable |
d. observation |
On a street, the houses are numbered from 300 to 450. The house numbers are examples of
a. categorical data. |
b. quantitative data. |
c. neither quantitative nor categorical data. |
d. both quantitative and categorical data. |
The sample size
a. can be larger or smaller than the population size. |
b. is always equal to the size of the population. |
c. is always smaller than the population size. |
d. can be larger than the population size. |
Income is an example of
a. quantitative data. |
b. categorical data. |
c. nominal data. |
d. either categorical or quantitative data. |
Data mining is the process of uncovering hidden information that can be used to _____.
a. justify |
b. control |
c. predict |
d. explain |
Temperature is an example of a variable that uses
a. the ordinal scale. |
b. the ratio scale. |
c. the interval scale. |
d. either the ratio or the ordinal scale. |
In a sample of 400 students in a university, 80 or 20% are Business majors. Based on the above information, the school’s paper reported that “20% of all the students at the university are Business majors.” This report is an example of
a. statistical inference. |
b. descriptive statistics. |
c. a population. |
d. a sample. |
Which of the following is not an example of a firm that sells or leases business database services to clients?
a. Dow Jones & Co. |
b. Census Bureau |
c. Bloomberg |
d. Dun & Bradstreet |
Which of the following scales of measurement are appropriate for quantitative data?
a. Interval and ratio |
b. Interval and ordinal |
c. Nominal and ordinal |
d. Ratio and ordinal |
Which of the following is not a recommended guideline for creating an effective graphical display?
a. Use three dimensions whenever possible, to give the display depth |
b. Give the display a clear and concise title |
c. Label each axis and show the units of measure |
d. If colors are used to distinguish categories, use a legend to define them |
Consider the scatter diagram below.
What type of relationship is shown for the number of students and their average score?
a. A quadratic relationship |
b. A positive relationship |
c. A negative relationship |
d. No apparent relationship |
The relative frequency of a class is computed by
a. dividing the sample size by the frequency of the class. |
b. dividing the frequency of the class by the sample size. |
c. dividing the midpoint of the class by the sample size. |
d. dividing the frequency of the class by the midpoint. |
A survey of 800 college seniors resulted in the following crosstabulation regarding their undergraduate major and whether or not they plan to go to graduate school.
Undergraduate Major | ||||
Graduate School | Business | Engineering | Others | Total |
Yes | 70 | 84 | 126 | 280 |
No | 182 | 208 | 130 | 520 |
Total | 252 | 292 | 256 | 800 |
Of those students who are majoring in business, what percentage plans to go to graduate school?
a. 27.78 |
b. 8.75 |
c. 72.22 |
d. 70.00 |
A tabular summary of a set of data showing the fraction of the total number of items in several classes is a
a. relative frequency distribution. |
b. cumulative relative frequency distribution. |
c. cumulative frequency distribution. |
d. frequency distribution. |
A sample of 15 children shows their favorite restaurants:
McDonalds | Luppi’s | Mellow Mushroom |
Friday’s | McDonalds | McDonalds |
Pizza Hut | Taco Bell | McDonalds |
Mellow Mushroom | Luppi’s | Pizza Hut |
McDonalds | Friday’s | McDonalds |
Which of the following is the correct percent frequency for McDonalds?
a. 27% |
b. 10% |
c. 40% |
d. 2% |
The numbers of hours worked (per week) by 400 statistics students are shown below.
Number of hours | Frequency |
0 – 9 | 20 |
10 – 19 | 80 |
20 – 29 | 200 |
30 – 39 | 100 |
The class width used in this frequency distribution is
a. 9. |
b. 10. |
c. 39. |
d. 4.5. |
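One common convention takes the class width as the difference between consecutive lower class limits. The sketch below applies that convention to the table above. Other texts may define the width slightly differently, so treat this as an assumption.

```python
# Lower class limits from the hours-worked table.
lower_limits = [0, 10, 20, 30]

# Width of each class = next lower limit minus this lower limit.
widths = [nxt - cur for cur, nxt in zip(lower_limits, lower_limits[1:])]

print(widths)
```

All differences agree, so the distribution uses a single class width throughout.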
Which of the following statements is true with respect to bar charts?
a. The width of the bars depends on the number of observations in the category. |
b. A circle is drawn to represent the entire data set. |
c. A bar chart is used only with quantitative data sets. |
d. A bar chart can be used for a data set with a relatively small number of possible categories. |
A frequency distribution is a tabular summary of data showing the
a. percentage of items in several classes. |
b. relative percentage of items in several classes. |
c. number of items in several classes. |
d. fraction of items in several classes. |
The relative frequency of a class is
a. equal to the frequency of the class. |
b. always equal to 1%. |
c. equal to the frequency of the class multiplied by 100%. |
d. equal to the frequency of the class divided by the total number of observations, n. |
The entities on which data are collected are
a. populations. |
b. observations. |
c. samples. |
d. elements. |
On a street, the houses are numbered from 300 to 450. The house numbers are examples of
a. both quantitative and categorical data. |
b. neither quantitative nor categorical data. |
c. categorical data. |
d. quantitative data. |
The data measured on an ordinal scale exhibit all the properties of data measured on
a. nominal scale. |
b. interval scale. |
c. ratio scale. |
d. nominal and interval scales. |
The most common type of observational study is
a. an experiment. |
b. a survey. |
c. a statistical inference. |
d. a debate. |
Quantitative data
a. are always non-numeric. |
b. are always numeric. |
c. are never numeric. |
d. may be either numeric or non-numeric. |
In a post office, the mailboxes are numbered from 1 to 4,500. These numbers represent
a. since the numbers are sequential, the data is quantitative. |
b. either categorical or quantitative data. |
c. quantitative data. |
d. categorical data. |
Data
a. are the raw material of statistics. |
b. are always non-numeric. |
c. are always numeric. |
d. are always categorical. |
For ease of data entry into a university database, 1 denotes that the student is an undergraduate and 2 indicates that the student is a graduate student. In this case, the data are
a. quantitative. |
b. neither categorical nor quantitative. |
c. categorical. |
d. either categorical or quantitative. |
The average age in a sample of 190 students at City College is 22. As a result of this sample, it can be concluded that the average age of all the students at City College
a. could not be 22. |
b. must be more than 22, since the population is always larger than the sample. |
c. must be less than 22, since the sample is only a part of the population. |
d. is around 22. |
An interviewer has made an error in recording the data. This type of error is known as
a. a data acquisition error. |
b. a non-experimental error. |
c. a conglomerate error. |
d. an experimental error. |