Context The comp - activ database comprises activity measures of computer systems. Data was gathered from a
Fantastic news! We've Found the answer you've been seeking!
Question:
Context
The compactiv database comprises activity measures of computer systems. Data was gathered from a Sun Sparcstation with Mbytes of memory, operating in a multiuser university department. Users engaged in diverse tasks, such as internet access, file editing, and CPUintensive programs.
Being an aspiring data scientist, you aim to establish a linear equation for predicting 'usr' the percentage of time CPUs operate in user mode Your goal is to analyze various system attributes to understand their influence on the system's 'usr' mode. Problem Define the problem and perform exploratory Data Analysis
Problem definition Check shape, Data types, statistical summary Univariate analysis Multivariate analysis Use appropriate visualizations to identify the patterns and insights Key meaningful observations on individual variables and the relationship between variables
Problem Data Preprocessing
Prepare the data for modelling: Missing Value Treatment if needed Outlier Detection treat if needed Feature Engineering Encode the data Traintest split
Problem Model Building Linear regression
Apply linear Regression using Sklearn Using Statsmodels Perform checks for significant variables using the appropriate method Create multiple models and check the performance of Predictions on Train and Test sets using Rsquare, RMSE & Adj Rsquare.
Problem Business Insights & Recommendations
Comment on the Linear Regression
Related Book For
Applied Regression Analysis and Other Multivariable Methods
ISBN: 978-1285051086
5th edition
Authors: David G. Kleinbaum, Lawrence L. Kupper, Azhar Nizam, Eli S. Rosenberg
Posted Date: