Enn543 Data Analysis And Optimisation Assessment Answers
winequality-red.csv
winequality-white.csv
Using these data
Answer:
Given that,
Random variable X follows a normal distribution with mean µ = 5 and standard deviation σ = 10.
Hence, Prob(X > 10) = 1 - Prob(X <= 10) = 1 - Prob(Z <= (10-5)/10) = 1 - Prob(Z <= 0.5) = 1 - 0.6915 = 0.3085. (Probabilities for Z values are obtained from the standard normal table.)
Prob(−20 < X < 15) = Prob(X <= 15) – Prob(X <= -20) = P(Z <= (15-5)/10) – P(Z <= (-20-5)/10) = P(Z <= 1) – P(Z <= -2.5) = 0.8413 – 0.0062 = 0.8351. (Probability values are obtained from the standard normal table.)
Now, P(X > x) = 0.95 => P(X <= x) = 1 - 0.95 = 0.05 (as the total probability is 1).
Now, from the standard normal table, the area to the left of Z = -1.65 is approximately 0.05 (the more precise value is Z = -1.645).
Hence, (x-5)/10 = -1.65 => x = -16.5 + 5 = -11.5
Hence, for x = -11.5, the area to the right of x under the normal curve is 0.95.
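The three normal-distribution results above can be checked numerically. The following is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is available (for normcdf and norminv); the variable names p1, p2 and x95 are illustrative and not part of the original answer.

```matlab
% Check the normal-distribution results for X ~ N(mu = 5, sigma = 10).
mu = 5;
sigma = 10;

% P(X > 10) as the complement of the CDF
p1 = 1 - normcdf(10, mu, sigma);

% P(-20 < X < 15) as a difference of two CDF values
p2 = normcdf(15, mu, sigma) - normcdf(-20, mu, sigma);

% x such that P(X > x) = 0.95, i.e. the 5th percentile of X
x95 = norminv(0.05, mu, sigma);

fprintf('P(X > 10)       = %.4f\n', p1);
fprintf('P(-20 < X < 15) = %.4f\n', p2);
fprintf('x with P(X > x) = 0.95: %.2f\n', x95);
```

Because norminv works with z ≈ -1.645 rather than the rounded table value -1.65, the computed x is about -11.45 instead of -11.5.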
Given that,
Random variable N follows a Poisson distribution with mean µ = 10000.
Now, a Poisson random variable with a large mean µ can be approximated by a normal distribution with the same mean µ and with variance µ. This follows from the central limit theorem, because a Poisson(µ) variable can be written as a sum of many independent Poisson variables with smaller means.
In this case the approximation is N ~ normal(mean = 10000, standard deviation = sqrt(10000) = 100).
Hence, by the approximation, P(N > 10,200) = 1 – P(N <= 10200)
= 1 – P(Z <= (10200-10000)/100) = 1 – P(Z <= 2) = 1 – 0.9772 = 0.0228.
Now, computing P(N > 10,200) = 1 – P(N <= 10200) in MATLAB from the exact Poisson CDF gives the result 0.0227.
MATLAB code:
P = 1 - cdf('Poisson',10200,10000);
disp(P)
Output:
0.0227
Hence, the error in approximating the Poisson distribution by the normal distribution is |0.0228 – 0.0227| = 0.0001, which is very small.
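The comparison between the exact Poisson tail and its normal approximation can be put side by side in one script. This is a sketch only, assuming the Statistics and Machine Learning Toolbox (poisscdf, normcdf). The continuity-corrected variant pCorr is an optional extra that the original answer does not use.

```matlab
% Compare the exact Poisson tail P(N > 10200) with its normal approximation.
mu = 10000;          % Poisson mean
sigma = sqrt(mu);    % standard deviation of the approximating normal
k = 10200;

pExact  = 1 - poisscdf(k, mu);              % exact Poisson tail
pNormal = 1 - normcdf(k, mu, sigma);        % plain normal approximation
pCorr   = 1 - normcdf(k + 0.5, mu, sigma);  % optional: continuity correction

fprintf('Exact Poisson:          %.4f\n', pExact);
fprintf('Normal approximation:   %.4f\n', pNormal);
fprintf('With continuity corr.:  %.4f\n', pCorr);
fprintf('Absolute error (plain): %.2e\n', abs(pExact - pNormal));
```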
Question 6:
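The MLE (maximum likelihood estimate) of the parameter of the above function (written here as $\theta$) is the value of $\theta$ that maximizes the likelihood function $L(\theta) = f(x_1, x_2, x_3, \ldots \mid \theta)$. Here, f = probability density function.
So, for independent observations, $L(\theta) = f(x_1 \mid \theta)\, f(x_2 \mid \theta)\, f(x_3 \mid \theta) \cdots$ and so on.
Now, taking log on the above equation,
$$\log L(\theta) = \sum_{i=1}^{n} \log f(x_i \mid \theta)$$
Now, the maximizing value is found by setting the derivative to zero. At max $L(\theta)$,
$$\frac{d}{d\theta} \log L(\theta) = 0$$
Hence, solving this equation for $\theta$ for the given probability density function gives the maximum likelihood estimate.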
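It is stated that the sample of data x = x1, x2, …, xn follows a Poisson distribution with mean λ and that λ follows an exponential distribution with parameter θ.
So, the likelihood is
$$P(x \mid \lambda) = \prod_{i=1}^{n} \frac{e^{-\lambda} \lambda^{x_i}}{x_i!} = \frac{e^{-n\lambda}\, \lambda^{\sum_{i=1}^{n} x_i}}{\prod_{i=1}^{n} x_i!}$$
and the prior is
$$P(\lambda) = \theta e^{-\theta\lambda}, \quad \lambda > 0$$
Hence, the posterior probability is proportional to (likelihood) × (prior probability):
$$P(\lambda \mid x) \propto e^{-n\lambda}\, \lambda^{\sum_{i=1}^{n} x_i} \cdot \theta e^{-\theta\lambda} \propto \lambda^{\sum_{i=1}^{n} x_i}\, e^{-(\theta + n)\lambda}$$
Now, the above is the kernel $\lambda^{\alpha - 1} e^{-\beta\lambda}$ of a Gamma distribution with rate and shape parameters
β = θ + n, α = x1 + x2 + … + xn + 1 (Proved)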
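The Gamma form of the posterior can be sanity-checked numerically by multiplying the Poisson likelihood by the exponential prior on a grid and comparing with a Gamma density of shape α = Σxi + 1 and rate β = θ + n. The sketch below assumes the Statistics and Machine Learning Toolbox (gampdf); the sample x and the prior rate theta are hypothetical values chosen only for illustration.

```matlab
% Hypothetical inputs for illustration only (not from the assignment)
x = [3 5 2 4 6];     % assumed Poisson counts
theta = 0.5;         % assumed rate of the exponential prior
n = numel(x);

alpha = sum(x) + 1;  % Gamma shape
beta  = theta + n;   % Gamma rate

% Unnormalised log-posterior on a grid: log-likelihood + log-prior
% (the constant term -sum(log(factorial(x))) is dropped; it cancels
% when the curve is normalised)
lambda = linspace(0.01, 15, 2000);
logPost = sum(x) .* log(lambda) - n .* lambda + log(theta) - theta .* lambda;
post = exp(logPost - max(logPost));
post = post ./ trapz(lambda, post);   % normalise numerically

% gampdf takes shape and SCALE, so scale = 1/rate
gammaPdf = gampdf(lambda, alpha, 1 / beta);

maxDiff = max(abs(post - gammaPdf));
fprintf('Largest pointwise difference: %.2e\n', maxDiff);
```

A small difference indicates that the grid-based posterior and the closed-form Gamma density agree up to numerical integration error.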
Question 8:
The variables of the yacht.dat file are the following in order.
V1 Longitudinal position of the center of buoyancy, adimensional
V2 Prismatic coefficient, adimensional
V3 Length-displacement ratio, adimensional
V4 Beam-draught ratio, adimensional
V5 Length-beam ratio, adimensional
V6 Froude number, adimensional
X7 Residuary resistance per unit weight of displacement, adimensional
Now, using the fitlm command in MATLAB, the dependent variable X7 is fitted with respect to the independent variables V1 to V6.
MATLAB command:
% the yacht.dat is loaded by selecting it from folder
lrm = fitlm(yacht,'X7~V1+V2+V3+V4+V5+V6');
disp(lrm)
Linear regression model:
X7 ~ 1 + V1 + V2 + V3 + V4 + V5 + V6
Estimated Coefficients:
               Estimate    SE         tStat       pValue
               ________    _______    ________    __________
(Intercept)    154.51      32.359     4.775       2.8055e-06
V1             0.018076    0.44595    0.040534    0.96769
V2             -301.54     52.185     -5.7783     1.8779e-08
V3             -9.8484     18.656     -0.52791    0.59795
V4             7.0168      7.2464     0.96832     0.33366
V5             7.6548      18.712     0.40908     0.68277
V6             73.168      5.1483     14.212      1.8803e-35
Number of observations: 309, Error degrees of freedom: 302
Root Mean Squared Error: 11.8
R-squared: 0.402, Adjusted R-Squared 0.39
F-statistic vs. constant model: 33.9, p-value = 3.44e-31
Hence, the linear regression model is
X7 = 154.51 + 0.018V1 - 301.54V2 - 9.848V3 + 7.017V4 + 7.655V5 + 73.168V6.
Now, this linear regression model can be used as a function of the independent variables: for given values of the independent variables, the estimate of the response can be evaluated using the ‘feval’ function in MATLAB. The accuracy of the regression equation can be verified by dividing the total dataset in two, namely the training dataset (80% of the data) and the validation dataset (20% of the data). The MATLAB command fitlm is run on the training dataset, and the regression equation obtained is then evaluated with the feval function on the validation set.
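The 80/20 validation procedure described above can be sketched as follows. This is an illustrative outline, not the original author's code: it assumes the Statistics and Machine Learning Toolbox, uses cvpartition for the split, and uses predict (which serves the same purpose as feval here) on the held-out rows. The table built at the top is synthetic placeholder data standing in for yacht.dat so that the script runs on its own.

```matlab
% Synthetic stand-in for the yacht table (placeholder data, NOT the real file);
% replace this block with the table loaded from yacht.dat.
rng(0);                                   % reproducible example
nObs = 300;
P = rand(nObs, 6);                        % six placeholder predictors
y = 2 + P * [1; -3; 0.5; 0; 2; 10] + randn(nObs, 1);
yacht = array2table([P y], 'VariableNames', ...
    {'V1','V2','V3','V4','V5','V6','X7'});

% 80% training / 20% validation hold-out split
c = cvpartition(height(yacht), 'HoldOut', 0.2);
trainTbl = yacht(training(c), :);
validTbl = yacht(test(c), :);

% Fit the linear model on the training set only
lrm = fitlm(trainTbl, 'X7~V1+V2+V3+V4+V5+V6');

% Predict the held-out responses and measure the prediction error
yHat = predict(lrm, validTbl);
rmseValid = sqrt(mean((validTbl.X7 - yHat).^2));
fprintf('Validation RMSE: %.3f (training RMSE: %.3f)\n', rmseValid, lrm.RMSE);
```

A validation RMSE close to the training RMSE suggests the fitted equation generalises to unseen rows; a much larger validation RMSE points to overfitting or a poorly specified model.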
Question 9:
In this question a generalized linear regression model is fitted to the ‘quality’ variable of the red wine data and to the ‘quality’ variable of the white wine data, assuming a Poisson distribution for the response.
- Model fitting for red wine and white wine model:
MATLAB code with output:
% winequalitywhite.csv and winequalityred.csv are manually loaded from folder
model = 'quality~fixedacidity +volatileacidity + citricacid + residualsugar + chlorides + freesulfurdioxide + totalsulfurdioxide + density + pH + sulphates + alcohol';
lrm1 = fitglm(winequalityred,model,'Distribution','poisson');
disp(lrm1)
lrm2 = fitglm(winequalitywhite,model,'Distribution','poisson');
disp(lrm2)
Output:
lrm1 =
Generalized linear regression model:
quality ~ [Linear formula with 12 terms in 11 predictors]
Distribution = Poisson
Estimated Coefficients:
                      Estimate       SE            tStat       pValue
                      ___________    __________    ________    _________
(Intercept)           3.6538         13.67         0.26728     0.78925
fixedacidity          0.0036583      0.016633      0.21994     0.82592
volatileacidity       -0.1977        0.08039       -2.4593     0.013921
citricacid            -0.035923      0.096141      -0.37365    0.70866
residualsugar         0.0026177      0.009736      0.26887     0.78803
chlorides             -0.33176       0.27688       -1.1982     0.23084
freesulfurdioxide     0.00082523     0.0014126     0.58418     0.5591
totalsulfurdioxide    -0.00061063    0.00047979    -1.2727     0.20312
density               -2.1729        13.953        -0.15573    0.87624
pH                    -0.074826      0.12406       -0.60317    0.5464
sulphates             0.15912        0.072618      2.1912      0.028434
alcohol               0.04815        0.016999      2.8325      0.0046188
1599 observations, 1587 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 66.1, p-value = 6.81e-10
lrm2 =
Generalized linear regression model:
quality ~ [Linear formula with 12 terms in 11 predictors]
Distribution = Poisson
Estimated Coefficients:
                      Estimate       SE            tStat       pValue
                      ___________    __________    ________    __________
(Intercept)           28.094         11.144        2.5211      0.011698
fixedacidity          0.012809       0.011881      1.0781      0.281
volatileacidity       -0.33456       0.064234      -5.2085     1.9041e-07
citricacid            0.0025292      0.053278      0.047471    0.96214
residualsugar         0.014557       0.0043653     3.3347      0.00085393
chlorides             -0.062667      0.31275       -0.20037    0.84119
freesulfurdioxide     0.00062244     0.00046312    1.344       0.17894
totalsulfurdioxide    -3.6945e-05    0.00021042    -0.17558    0.86063
density               -27.359        11.298        -2.4215     0.015457
pH                    0.1235         0.059026      2.0922      0.036417
sulphates             0.10875        0.054501      1.9953      0.046011
alcohol               0.03036        0.014207      2.137       0.032594
4898 observations, 4886 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 185, p-value = 1.04e-33
- As the overall p value of the white wine model (1.04e-33) is less than the considered significance level of 0.05, the model fits significantly better than a constant model. The independent variables which are significant are volatileacidity, residualsugar, density, pH, sulphates and alcohol, as the p values of these variables are less than the considered significance level of 0.05.
Similarly, the red wine model is significant, as its overall p value is 6.81e-10, which is less than the considered level of significance of 0.05.
In this model the independent variables which are significant are volatileacidity, sulphates and alcohol, as their p values are less than 0.05.
So, the white wine model has more significant independent predictor variables than the red wine model. The similarities between these two models are:
- a) both models are significant
- b) volatileacidity, sulphates and alcohol are significant independent variables in both.
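Reading significant predictors off the coefficient table by eye is error-prone, so the p-value filter can be applied programmatically. The following is a minimal sketch, assuming the Statistics and Machine Learning Toolbox. The table tbl and its variables a and b are synthetic placeholders (not the wine data), used only so that the script runs on its own.

```matlab
% Placeholder data (NOT the wine files): Poisson response driven by 'a' only
rng(1);
tbl = table(randn(500,1), randn(500,1), 'VariableNames', {'a','b'});
tbl.quality = poissrnd(exp(1.5 + 0.3 * tbl.a));

% Poisson GLM, fitted the same way as the wine models
mdl = fitglm(tbl, 'quality~a+b', 'Distribution', 'poisson');

% Coefficient table with columns Estimate, SE, tStat, pValue
coefTbl = mdl.Coefficients;
isSig = coefTbl.pValue < 0.05;                  % 5% significance level
sigNames = coefTbl.Properties.RowNames(isSig);  % names of significant terms
sigNames = setdiff(sigNames, {'(Intercept)'});  % drop the intercept
disp(sigNames)
```

Applied to the two fitted wine models in turn, the resulting name lists could be compared with intersect to find the predictors that are significant in both.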