Question: Here is code so far, but not getting correct visual. Dotted red lines should not be straight. Showed image of what correct visual should look

Here is code so far, but not getting correct visual. Dotted red lines should not be straight. Showed image of what correct visual should look like.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
fig,axes=plt.subplots(3,2, figsize=(15,10))
for i, course in enumerate(courses):
data=df[course].dropna()
mean,std=data.mean(), data.std()
ax=axes[i //2, i %2]
stats.probplot(data, dist="norm", plot=ax)
ax.set_title(f"{course.replace('_grade', '')}, n={len(data)}")
ax.legend([f"{course.replace('_grade', '')}, n={len(data)}"])
ax.axhline(mean +2*std, color='red', linestyle='--')
ax.axhline(mean -2*std, color='red', linestyle='--')
ax.set_xlabel("Theoretical Quantiles")
ax.set_ylabel("Ordered Values")
plt.tight_layout()
plt.show()
Question 2: Grade Distribution Normality Check (35\%)
Seeing the student grade distributions of the 6 large residential courses, the team is tempted to draft recommendations for instructors and report to them what particular aspects could be addressed to improve students' academic learning outcome. However, before they launch statistical tests, they need to verify if the student grades
data approximately follows normal distribution, a sufficient condition rendering the design of statistical models valid for those courses. You suggest that a QQ-plot is a great method to determine how similar a distribution is to another. Great idea!
- Make a 3*2 figure (again,6 subplots) so that for each course you have a QQ plot using the student grade samples versus the normal distribution with the same mean and standard deviation
- You need to use a legend on each plot to specify the corresponding course name and number of students involved. For example, you can draw a legend and specify "STATS 250, n=5000" to indicate that you are analyzing STATS 250 course with 5000 enrolled students records being used for analysis
- For each QQ-plot, add 2 lines representing +/-2 standard deviations outside from the QQ-line (a straight line showing the theoretical values for different quantiles under normal distribution). Use an additional annotation inside each subplot to highlight the outliers that sit outside of these lines. I.e. data points that lie outside the 2 standard deviations on either side. Briefly describe the figure discussing the courses and whether they seem to be normally distributed.
Hint: You may find using fig = plt.figure() and fig.add_subplot() functions helpful to create subplots. You don't have to use these functions though.
ENGLISH125, n=14196
English125 N =14196
Here is code so far, but not getting correct

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!