Answer:
The outlier is (19.37, 337): it has the lowest isotope percentage but the highest silicon value, so it lies far from the other points and strongly influences the correlation.
What is the correlation with this point? (Round your answer to two decimal places.)
> cor(x,y)
[1] 0.34
What is the correlation without this point? (Round your answer to two decimal places.)
> cor(x1,y1)
[1] 0.79
Calculate the regression line with the outlier.
y = 35.20x - 523.9
Calculate the regression line without the outlier.
y = 77.85x - 1422.3
Step-by-step explanation:
We have the following data:
Isotope % (X): 19.90, 20.71, 21.63, 19.84, 20.80, 21.63, 19.46, 20.86, 21.19, 20.20, 21.28, 19.37
Silicon (Y): 85, 152, 226, 106, 263, 233, 114, 265, 186, 139, 298, 337
Find the single outlier in the data. This point strongly influences the correlation. What is the correlation with this point? (Round your answer to two decimal places.)
A scatter plot makes any potential outlier easy to spot. We can draw it with the following R code:
> x<-c(19.90, 20.71, 21.63, 19.84, 20.80, 21.63, 19.46, 20.86, 21.19,20.20, 21.28, 19.37)
> y<-c(85,152,226,106,263,233,114,265,186,139,298,337)
> plot(x,y, main="Scatter plot Silicon vs Isotope")
The resulting plot is shown in the attached figure.
The outlier is (19.37, 337): it has the lowest isotope percentage but the highest silicon value, so it lies far from the other points and strongly influences the correlation.
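If the figure is not available, the same point can be located numerically. The following is only a quick sketch of that check, not part of the original exercise; the variable names i_min_x and i_max_y are made up for illustration.

```r
# Data from the problem statement
x <- c(19.90, 20.71, 21.63, 19.84, 20.80, 21.63, 19.46, 20.86, 21.19, 20.20, 21.28, 19.37)
y <- c(85, 152, 226, 106, 263, 233, 114, 265, 186, 139, 298, 337)

# Index of the smallest isotope value and of the largest silicon value
# (i_min_x and i_max_y are illustrative names, not from the original answer)
i_min_x <- which.min(x)
i_max_y <- which.max(y)

# Both indices point at observation 12, i.e. the point (19.37, 337)
c(i_min_x, i_max_y)
c(x[i_min_x], y[i_max_y])
```

A point that is extreme in both coordinates but runs against the overall upward trend is exactly the kind of point a scatter plot would flag.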
What is the correlation with this point? (Round your answer to two decimal places.)
> cor(x,y)
[1] 0.34
What is the correlation without this point? (Round your answer to two decimal places.)
> x1<-x[-12]
> x1
[1] 19.90 20.71 21.63 19.84 20.80 21.63 19.46 20.86 21.19 20.20 21.28
> y1<-y[-12]
> y1
[1] 85 152 226 106 263 233 114 265 186 139 298
> cor(x1,y1)
[1] 0.79
The correlation rises from 0.34 to 0.79 once the outlier is removed, so this single point changes it substantially.
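To see that no other observation has a comparable effect, one option is to drop each point in turn and recompute the correlation. This is a minimal leave-one-out sketch under the assumption that a simple loop over indices is enough here; loo_cor is a hypothetical name.

```r
# Data from the problem statement
x <- c(19.90, 20.71, 21.63, 19.84, 20.80, 21.63, 19.46, 20.86, 21.19, 20.20, 21.28, 19.37)
y <- c(85, 152, 226, 106, 263, 233, 114, 265, 186, 139, 298, 337)

# Correlation after removing observation i, for every i
loo_cor <- sapply(seq_along(x), function(i) cor(x[-i], y[-i]))
round(loo_cor, 2)

# Observation whose removal moves the correlation furthest from the full-data value
most_influential <- which.max(abs(loo_cor - cor(x, y)))
most_influential
```

The entry for observation 12 reproduces the 0.79 found above, and it is the one that moves the correlation the most.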
c) Is the outlier also strongly influential for the regression line? Calculate the regression line with the outlier. (Round your slope to two decimal places, round your y-intercept to one decimal place.)
We can calculate the regression line with the following R code:
> linearmod1<-lm(y~ x)
> linearmod1
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x
     -523.9         35.2
So our equation would be: y = 35.20x - 523.9
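As a cross-check, the least-squares coefficients can also be obtained by hand from the correlation, since slope = r * sd(y) / sd(x) and intercept = mean(y) - slope * mean(x). The sketch below is only a verification of the lm() result, and the names r, slope and intercept are illustrative.

```r
# Data from the problem statement
x <- c(19.90, 20.71, 21.63, 19.84, 20.80, 21.63, 19.46, 20.86, 21.19, 20.20, 21.28, 19.37)
y <- c(85, 152, 226, 106, 263, 233, 114, 265, 186, 139, 298, 337)

# Slope and intercept of the least-squares line from summary statistics
r <- cor(x, y)
slope <- r * sd(y) / sd(x)
intercept <- mean(y) - slope * mean(x)

# Compare with the coefficients reported by lm()
c(intercept = intercept, slope = slope)
coef(lm(y ~ x))
```

This also shows why the outlier matters for the line: it lowers r, and the slope is proportional to r.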
Calculate the regression line without the outlier. (Round your slope to two decimal places, round your y-intercept to one decimal place.)
> linearmod2<-lm(y1~x1)
> linearmod2
Call:
lm(formula = y1 ~ x1)
Coefficients:
(Intercept)           x1
   -1422.26        77.85
The new equation would be y = 77.85x - 1422.3
So the outlier also substantially changes the estimated slope and intercept of the linear model: the slope goes from 35.20 to 77.85 and the intercept from -523.9 to -1422.3.
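One common way to quantify this kind of influence is Cook's distance, which base R provides through cooks.distance(). The exercise itself does not ask for it, so treat the following as an optional sketch; fit, cd and most_influential are illustrative names.

```r
# Data from the problem statement
x <- c(19.90, 20.71, 21.63, 19.84, 20.80, 21.63, 19.46, 20.86, 21.19, 20.20, 21.28, 19.37)
y <- c(85, 152, 226, 106, 263, 233, 114, 265, 186, 139, 298, 337)

# Fit the line on all 12 points and compute Cook's distance for each one
fit <- lm(y ~ x)
cd <- cooks.distance(fit)
round(cd, 2)

# Observation with the largest Cook's distance
most_influential <- unname(which.max(cd))
most_influential
```

Observation 12 has the largest Cook's distance, which agrees with the comparison of the two fitted lines.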
