1. Given are five observations collected in a regression study on two variables.
x | 2 | 6 | 9 | 13 | 20 |
y | 7 | 18 | 9 | 26 | 23 |
a. Which of the following scatter diagrams accurately represents the data?
C
b. Develop the estimated regression equation for these data (to decimal).
b1 = 0.9
b0 = 7.6
ŷ = 7.6 + 0.9x
c. Use the estimated regression equation to predict the value of when (to decimal).
13
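The coefficients in part (b) can be checked with the least squares formulas b1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and b0 = ȳ − b1·x̄. The following is a minimal Python sketch; the helper name `least_squares` is chosen here for illustration and is not part of the original exercise.

```python
def least_squares(x, y):
    """Return (b0, b1) for the simple linear regression y-hat = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squared deviations of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# The five observations from problem 1
x = [2, 6, 9, 13, 20]
y = [7, 18, 9, 26, 23]

b0, b1 = least_squares(x, y)
print(round(b0, 1), round(b1, 1))  # 7.6 0.9
```

A prediction for any x then follows by evaluating `b0 + b1 * x`.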
2. Brawdy Plastics, Inc., produces plastic seat belt retainers for General Motors at their plant in Buffalo, New York. After final assembly and painting, the parts are placed on a conveyor belt that moves the parts past a final inspection station. How fast the parts move past the final inspection station depends upon the line speed of the conveyor belt (feet per minute). Although faster line speeds are desirable, management is concerned that increasing the line speed too much may not provide enough time for inspectors to identify which parts are actually defective. To test this theory, Brawdy Plastics conducted an experiment in which the same batch of parts, with a known number of defective parts, was inspected using a variety of line speeds. The following data were collected.
Line Speed (ft/min) | Number of Defective Parts Found |
20 | 23 |
20 | 21 |
30 | 19 |
30 | 16 |
40 | 15 |
40 | 17 |
50 | 14 |
50 | 11 |
If required, enter negative values as negative numbers.
a. Select a scatter diagram with the line speed as the independent variable.
D
b. What does the scatter diagram developed in part (a) indicate about the relationship between the two variables?
Negative relationship
c. Use the least squares method to develop the estimated regression equation (to 1 decimal).
ŷ = 27.5 - 0.3x
d. Predict the number of defective parts found for a line speed of feet per minute.
20
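The arithmetic behind part (c) can be written out from the eight observations in the table. This is a worked sketch using the standard least squares formulas, with x as line speed and y as the number of defective parts found:

```latex
\begin{aligned}
\bar{x} &= \frac{280}{8} = 35, \qquad \bar{y} = \frac{136}{8} = 17,\\
\sum (x_i-\bar{x})^2 &= 1000, \qquad \sum (x_i-\bar{x})(y_i-\bar{y}) = -300,\\
b_1 &= \frac{-300}{1000} = -0.3,\\
b_0 &= \bar{y} - b_1\bar{x} = 17 - (-0.3)(35) = 27.5 .
\end{aligned}
```

The negative slope agrees with the negative relationship seen in the scatter diagram.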
3. David’s Landscaping has collected data on home values (in thousands of $) and expenditures (in thousands of $) on landscaping with the hope of developing a predictive model to help marketing to potential new clients. Data for households may be found in the file Landscape. Click on the datafile logo to reference the data.
a. Select a scatter diagram with home value as the independent variable.
Graph B
b. What does the scatter plot developed in part (a) indicate about the relationship between the two variables?
The scatter diagram indicates a positive linear relationship between the two variables.
c. Use the least squares method to develop the estimated regression equation (to decimals)
ŷ = 6.30910 + 0.02142x
d. For every additional in home value, estimate how much additional will be spent on landscaping (to decimals).
21.42
e. Use the equation estimated in part (c) to predict the landscaping expenditures for a home valued at (to the nearest whole number).
18626
4. A large city hospital conducted a study to investigate the relationship between the number of unauthorized days that employees are absent per year and the distance (miles) between home and work for the employees. A sample of employees was selected and the following data were collected.
Distance to Work (miles) | Number of Days Absent |
1 | 8 |
3 | 5 |
4 | 8 |
6 | 7 |
8 | 6 |
10 | 3 |
12 | 5 |
14 | 2 |
14 | 4 |
18 | 2 |
If required, enter negative values as negative numbers.
a. Select the correct scatter diagram for these data.
Graph A
Does a linear relationship appear reasonable?
Yes, a negative linear relationship appears reasonable.
b. Develop the least squares estimated regression equation that relates the distance to work to the number of days absent (to decimals).
ŷ = 8.0978 - 0.3442x
c. Predict the number of days absent for an employee that lives miles from the hospital (to nearest whole number).
6 days
5. Given are five observations for two variables, x and y.
Excel File: data14-17.xlsx
The estimated regression equation for these data is .
Compute SSE, SST, and SSR (to decimal).
SSE = 127.3
SST = 281.2
SSR = 153.9
What percentage of the total sum of squares can be accounted for by the estimated regression equation (to decimal)?
54.7
What is the value of the sample correlation coefficient (to decimals)?
0.740
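The sums of squares above can be reproduced in a few lines of Python. This sketch rests on an assumption: the contents of data14-17.xlsx are not shown here, so the code uses the five observations and the fitted equation from problem 1, which happen to give the same SSE, SST, and SSR.

```python
# Assumption: data14-17.xlsx holds the five observations listed in problem 1.
x = [2, 6, 9, 13, 20]
y = [7, 18, 9, 26, 23]
b0, b1 = 7.6, 0.9  # estimated regression equation from problem 1

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sst - sse                                        # explained variation

r_squared = ssr / sst
# The sample correlation coefficient takes the sign of the slope b1.
r = r_squared ** 0.5 if b1 >= 0 else -(r_squared ** 0.5)

print(round(sse, 1), round(sst, 1), round(ssr, 1))
print(round(100 * r_squared, 1), round(r, 3))
```

The identity SST = SSR + SSE is what lets SSR be obtained by subtraction.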
6. An important application of regression analysis in accounting is in the estimation of cost. By collecting data on volume and cost and using the least squares method to develop an estimated regression equation relating volume and cost, an accountant can estimate the cost associated with a particular manufacturing volume. Consider the following sample of production volumes and total cost data for a manufacturing operation.
a. Use these data to develop an estimated regression equation that could be used to predict the total cost for a given production volume. Do not round intermediate calculations.
Compute b1 and b0 (to decimal).
b1 = 7.6
b0 = 1246.7
Complete the estimated regression equation (to decimal). Do not round intermediate calculations.
ŷ = 1246.7 + 7.6x
b. What is the variable cost per unit produced (to decimal)? Do not round intermediate calculations
7.60
c. Compute the coefficient of determination (to decimals). Do not round intermediate calculations. Note: report between and .
0.9587
What percentage of the variation in total cost can be explained by the production volume (to decimal)? Do not round intermediate calculations
95.9
d. The company’s production schedule shows units must be produced next month. Predict the total cost for this operation (to the nearest whole number). Do not round intermediate calculations
5047
7. Given are five observations for two variables, x and y.
Excel File: data14-25.xlsx
The estimated regression equation is .
a. What is the value of the standard error of the estimate (to decimals)?
6.5141
b. Test for a significant relationship by using the t test. Use .
What is the value of the t test statistic (to decimals)?
1.90
What is the -value? Use Table 2 of Appendix B.
between 0.10 and 0.20
What is your conclusion ()?
cannot conclude a significant relationship exists between x and y
c. Use the test to test for a significant relationship. Use .
Compute the value of the test statistic (to decimals).
3.63
What is the -value? Use Table 4 of Appendix B.
greater than 0.10
What is your conclusion?
cannot conclude a significant relationship exists between x and y
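The three test quantities in this problem are linked by standard formulas. The worked sketch below assumes, since the data file is not shown, that n = 5 and that the sums of squares are those reported in problem 5 (SSE = 127.3, SSR = 153.9); under that assumption the reported answers are reproduced.

```latex
\begin{aligned}
s &= \sqrt{\frac{\mathrm{SSE}}{n-2}} = \sqrt{\frac{127.3}{3}} \approx 6.5141,\\
F &= \frac{\mathrm{MSR}}{\mathrm{MSE}} = \frac{153.9/1}{127.3/3} \approx 3.63,\\
|t| &= \sqrt{F} \approx 1.90 \quad \text{(in simple linear regression } t^2 = F\text{)}.
\end{aligned}
```

Because the t test and the F test are equivalent with one independent variable, both lead to the same conclusion.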
8. Consider the following data on production volume and total cost for a particular manufacturing operation.
Production Volume (units) | Total Cost ($) |
400 | 4,000 |
450 | 5,000 |
550 | 5,400 |
600 | 5,900 |
700 | 6,400 |
750 | 7,000 |
The estimated regression equation is .
Use to test whether the production volume is significantly related to the total cost.
Complete the ANOVA table. Enter all values with nearest whole number, except the test statistic (to decimals) and the -value (to decimals).
Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square | F | p-value |
Regression | 1 | 5415000 | 5415000 | 92.83 | 0.0006 |
Error | 4 | 233333 | 58333 | | |
Total | 5 | 5648333 | | | |
What is your conclusion?
conclude production volume and total cost are related
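The sums of squares, mean squares, and F statistic in the ANOVA table can be recomputed from the six observations. The following Python sketch uses illustrative variable names and leaves out the p-value, which needs an F distribution table or a statistics library.

```python
volume = [400, 450, 550, 600, 700, 750]        # production volume (units)
cost = [4000, 5000, 5400, 5900, 6400, 7000]    # total cost ($)

n = len(volume)
x_bar = sum(volume) / n
y_bar = sum(cost) / n

sxx = sum((x - x_bar) ** 2 for x in volume)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(volume, cost))
b1 = sxy / sxx                 # slope: variable cost per unit
b0 = y_bar - b1 * x_bar        # intercept

sst = sum((y - y_bar) ** 2 for y in cost)   # total sum of squares
ssr = b1 * sxy                              # regression sum of squares
sse = sst - ssr                             # error sum of squares

msr = ssr / 1                  # 1 regression degree of freedom
mse = sse / (n - 2)            # n - 2 error degrees of freedom
f_stat = msr / mse

print(round(ssr), round(sse), round(sst), round(mse), round(f_stat, 2))
```

The ratio `ssr / sst` from the same quantities gives the coefficient of determination for these data.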
9. Consider the data set below. Use Table 2 of Appendix B.
Excel File: data14-33.xlsx
a. Estimate the standard deviation of when (to decimals).
4.378
b. Develop a confidence interval for the expected value of when (to decimals).
30.07 to 57.93
c. Estimate the standard deviation of an individual value of when (to decimals).
9.79
d. Develop a prediction interval for when (to decimals).
12.85 to 75.15
10. Data given below are on the adjusted gross income and the amount of itemized deductions taken by taxpayers. Data were reported in thousands of dollars. With the estimated regression equation , the point estimate of a reasonable level of total itemized deductions for a taxpayer with an adjusted gross income of is . Click on the datafile to reference the data.
Adjusted Gross Income ($1000s) | Reasonable Amount of Itemized Deductions ($1000s) |
22 | 9.6 |
27 | 9.6 |
32 | 10.1 |
48 | 11.1 |
65 | 13.5 |
85 | 17.7 |
120 | 25.5 |
Use the estimated regression coefficients rounded to decimals in your calculations.
a. Develop a confidence interval for the mean amount of total itemized deductions for all taxpayers with an adjusted gross income of (to decimals).
11.74 to 14.42
b. Develop a prediction interval estimate for the amount of total itemized deductions for a particular taxpayer with an adjusted gross income of (to decimals).
9.31 to 16.85
c. If the particular taxpayer referred to in part (b) claimed total itemized deductions of , would the IRS agent’s request for an audit appear to be justified?
Yes, it is larger than anticipated.
d. Use your answer to part (b) to give the IRS agent a guideline as to the amount of total itemized deductions a taxpayer with an adjusted gross income of should claim before an audit is recommended (to the nearest whole number).
Any deductions exceeding the $16,850 upper limit could suggest an audit.
11. The Wall Street Journal asked Concur Technologies, Inc., an expense management company, to examine data from million expense reports to provide insights regarding business travel expenses. Their analysis of the data showed that New York was the most expensive city. The following table shows the average daily hotel room rate () and the average amount spent on entertainment () for a random sample of of the most-visited U.S. cities. These data lead to the estimated regression equation . For these data . Click on the datafile logo to reference the data. Use Table 1 of Appendix B.
City | Room Rate ($) | Entertainment ($) |
Boston | 148 | 161 |
Denver | 96 | 105 |
Nashville | 91 | 101 |
New Orleans | 110 | 142 |
Phoenix | 90 | 100 |
San Diego | 102 | 120 |
San Francisco | 136 | 167 |
San Jose | 90 | 140 |
Tampa | 82 | 98 |
a. Predict the amount spent on entertainment for a particular city that has a daily room rate of (to decimals).
109.46
b. Develop a confidence interval for the mean amount spent on entertainment for all cities that have a daily room rate of (to decimals).
94.85 to 124.08
c. The average room rate in Chicago is . Develop a prediction interval for the amount spent on entertainment in Chicago (to decimals).
110.69 to 188.84
12. Following is a portion of the regression output for an application relating maintenance expense (dollars per month) to usage (hours per week) for a particular brand of computer terminal. Excel File: data14-41.xlsx
Term | Coefficients | Standard Error | t Stat | P-value |
Intercept | 6.1092 | 0.9361 | | |
Usage | 0.8951 | 0.149 | | |
If your answer is zero, enter “”.
a. Write the estimated regression equation (to decimals).
ŷ = 6.1092 + 0.8951x
b. Use a test to determine whether monthly maintenance expense is related to usage at the level of significance (to decimals). Use Table 2 of Appendix B.
t = 6.01
p-value = 0
Reject the null hypothesis. Monthly maintenance expense is related to usage.
c. Did the estimated regression equation provide a good fit? Explain. Hint: If is greater than , the estimated regression equation provides a good fit.
Yes because the value of is greater
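The t statistic in part (b) comes directly from the Usage row of the regression output: the slope coefficient divided by its standard error.

```latex
t = \frac{b_1}{s_{b_1}} = \frac{0.8951}{0.149} \approx 6.01
```

A value this large relative to typical t table entries is why the p-value is reported as approximately zero.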
13. Given are the data for two variables, x and y. Do not round your intermediate calculations.
a. Develop an estimated regression equation for these data by computing and (to decimals). Enter negative values as negative numbers.
b1 = 1.587
b0 = -7.022
ŷ = -7.022 + 1.587x
b. Compute the residuals (to decimals). Enter negative values as negative numbers.
3.50
-2.44
-4.79
-1.55
5.28
c. Consider the following three scatter diagrams of the residuals against the independent variable. Which of the following accurately represents the data?
Scatter diagram 3
Do the assumptions about the error terms seem to be satisfied?
No
d. Compute the standardized residuals. Enter negative values as negative numbers.
Standardized residual |
1.3275 |
-0.5857 |
-1.1031 |
-0.3873 |
1.5087 |
e. Select a correct plot of the standardized residuals against .
Scatter diagram #1
What conclusions can you draw from this plot?
The assumptions about the error term may not be satisfied.
14. Consider the following data for two variables, x and y. Excel File: data14-51.xlsx
a. Consider the three scatter diagrams below.
Scatter diagram #1
Does the scatter diagram indicate any influential observations?
Yes
b. Compute the standardized residuals for these data (to decimals, if necessary). Enter negative values as negative numbers.
Observation 1 | -1.00 |
Observation 2 | -0.40 |
Observation 3 | 0.01 |
Observation 4 | -0.48 |
Observation 5 | 0.25 |
Observation 6 | 0.65 |
Observation 7 | 2.00 |
Observation 8 | -2.16 |
Do the data include any outliers?
Yes, there appears to be an outlier
c. Compute the leverage values for these data (to decimals). Enter negative values as negative numbers.
Observation 1 | 0.28 |
Observation 2 | 0.24 |
Observation 3 | 0.16 |
Observation 4 | 0.14 |
Observation 5 | 0.13 |
Observation 6 | 0.14 |
Observation 7 | 0.14 |
Observation 8 | 0.76 |
Does there appear to be any influential observations in these data?
Yes, observation 8 is an influential observation
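The outlier and leverage checks in parts (b) and (c) can be expressed as simple threshold tests. The sketch below assumes two common textbook rules of thumb that are not stated in the problem itself: a standardized residual below -2 or above +2 flags a possible outlier, and a leverage above 6/n flags a high-leverage (potentially influential) observation.

```python
# Values copied from the answers to parts (b) and (c)
std_resid = [-1.00, -0.40, 0.01, -0.48, 0.25, 0.65, 2.00, -2.16]
leverage = [0.28, 0.24, 0.16, 0.14, 0.13, 0.14, 0.14, 0.76]

n = len(leverage)
leverage_cutoff = 6 / n  # assumed rule of thumb: h_i > 6/n

# Observation numbers are 1-based to match the tables above.
outliers = [i + 1 for i, z in enumerate(std_resid) if abs(z) > 2]
high_leverage = [i + 1 for i, h in enumerate(leverage) if h > leverage_cutoff]

print(outliers)       # [8]
print(high_leverage)  # [8]
```

Observation 7 has a standardized residual of exactly 2.00, so it sits on the boundary and is not flagged by the strict inequality used here.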
15. Retail chain Kroger has more than locations and is the largest supermarket in the United States based on revenue. Kroger has invested heavily in data, technology, and analytics. Feeding predictive models with data from an infrared sensor system called QueVision to anticipate when shoppers will reach the checkout counters, Kroger is able to alert workers to open more checkout lines as needed. This has allowed Kroger to lower its average checkout time from four minutes to less than seconds (Retail Touchpoints).
Consider the data in the file Checkout. The file contains observations. Each observation gives the arrival time (measured in minutes before p.m.) and the shopping time (measured in minutes).
a. Select the correct scatter diagram for arrival time as the independent variable.
Graph A
b. What does the scatter diagram developed in part (a) indicate about the relationship between the two variables?
There appears to be a positive relationship between the two variables.
Does there appear to be any outliers and/or influential observations?
Observation 32 (111, 24) appears to be an observation with high leverage and may be very influential in terms of fitting a linear model to the data.
c. Using the entire data set, develop the estimated regression equation that can be used to predict the shopping time given the arrival time (to decimals).
ŷ = 14.27654095 + 0.254328505x
d. Use residual analysis to determine whether any outliers or influential observations are present.
Observation 32 (111, 24) has a standardized residual less than -2.
e. After looking at the scatter diagram in part (a), suppose you were able to visually identify what appears to be an influential observation. Drop this observation from the data set and fit an estimated regression equation to the remaining data (to decimals).
ŷ = 13.4525 + 0.2829x
Compare the estimated slope for the new estimated regression equation to the estimated slope obtained in part (c). Does this approach confirm the conclusion you reached in part (d)? Explain.
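Slope without observation 32: 0.2829. Slope with the full data set: 0.2543.
Dropping the observation has changed the estimated slope, so we would say the observation is influential.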
16. The Toyota Camry is one of the best-selling cars in North America. The cost of a previously owned Camry depends upon many factors, including the model year, mileage, and condition. To investigate the relationship between the car’s mileage and the sales price for a model year Camry, the following data show the mileage and sale price for sales (PriceHub website). Click on the datafile logo to reference the data.
Miles (1000s) | Price ($1000s) |
22 | 16.2 |
29 | 16.0 |
36 | 13.8 |
47 | 11.5 |
63 | 12.5 |
77 | 12.9 |
73 | 11.2 |
87 | 13.0 |
92 | 11.8 |
101 | 10.8 |
110 | 8.3 |
28 | 12.5 |
59 | 11.1 |
68 | 15.0 |
68 | 12.2 |
91 | 13.0 |
42 | 15.6 |
65 | 12.7 |
110 | 8.3 |
a. Select a scatter diagram with the car mileage on the horizontal axis and the price on the vertical axis.
Scatter diagram #1
b. What does the scatter diagram developed in part (a) indicate about the relationship between the two variables?
Negative relationship
c. Develop the estimated regression equation that could be used to predict the price ($1000s) (to decimals).
ŷ = 16.46975503 - 0.058773932x
d. Test for a significant relationship at the level of significance (to decimals).
p-value ≈ 0.0003, which is less than the level of significance
e. Did the estimated regression equation provide a good fit?
yes
f. Provide an interpretation for the slope of the estimated regression equation (to decimals but dollar value to the nearest cent). Enter negative values as negative numbers.
The slope of the estimated regression equation is -0.0588. Therefore, every additional miles on the car’s odometer will result in a $58.80 decrease in the predicted price.
g. Suppose that you are considering purchasing a previously owned Camry that has been driven miles. Using the estimated regression equation developed in part (c), predict the price for this car (round to nearest dollar).
12943
Is this the price you would offer the seller?
No
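The coefficients in part (c) and the goodness of fit in part (e) can be checked against the 19 sales in the table. This is a minimal Python sketch with illustrative variable names; it fits the line by least squares and reports the coefficient of determination.

```python
# Data from the table: mileage in 1000s of miles, price in $1000s
miles = [22, 29, 36, 47, 63, 77, 73, 87, 92, 101,
         110, 28, 59, 68, 68, 91, 42, 65, 110]
price = [16.2, 16.0, 13.8, 11.5, 12.5, 12.9, 11.2, 13.0, 11.8, 10.8,
         8.3, 12.5, 11.1, 15.0, 12.2, 13.0, 15.6, 12.7, 8.3]

n = len(miles)
x_bar = sum(miles) / n
y_bar = sum(price) / n

sxx = sum((x - x_bar) ** 2 for x in miles)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(miles, price))

b1 = sxy / sxx             # slope: change in price ($1000s) per 1000 miles
b0 = y_bar - b1 * x_bar    # intercept ($1000s)

sst = sum((y - y_bar) ** 2 for y in price)
r_squared = (b1 * sxy) / sst   # share of price variation explained by mileage

print(round(b0, 4), round(b1, 4))  # 16.4698 -0.0588
print(r_squared)
```

The slope is negative, which matches the negative relationship in part (b) and the interpretation in part (f).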