

INSTRUCTIONS

Please note that you are expected to answer the questions clearly in this document. Use the included template where relevant. Give the R outputs, comments, and discussion clearly and logically. Attach all the R commands in the Appendix. Write the resulting model equation for the relevant questions. Once completed, submit the answer script as a PDF via the Turnitin link within the vUWS site.

Please note that 10 Marks are allocated for organization, reasoning, logical flow, and the inclusion of all correct R code and outputs in the Appendix for both Part A and Part B.

SCENARIO

Recent public health data indicate a troubling increase in kidney disease rates within specific suburban areas, attracting significant attention from public health practitioners. To uncover the root causes and identify actionable risk factors, the public health team has embarked on a comprehensive study. They have collected patient records and relevant information on medical factors and water quality, as provided in the dataset.
Data Description:

Variable               Description
PatientID              Unique identifier of each patient
Age                    Age of the individual
Gender                 Gender of the individual
BloodPressure          Systolic blood pressure in mmHg
BloodSugar             Fasting blood sugar level in mg/dL
Cholesterol            Total cholesterol level in mg/dL
BodyMassIndex          BMI, a measure of body fat based on height and weight
SmokingStatus          Smoking status of the individual [Never/Former/Current]
ElectricConductivity   Water’s ability to conduct electricity in μS/cm, which can indicate contamination
pH                     pH level of the water
DissolvedOxygen        Amount of oxygen dissolved in water in mg/L
Turbidity              Measure of water clarity in NTU
TotalDissolvedSolids   Measure of dissolved substances in water in mg/L
NitriteLevel           Nitrite concentration in water in mg/L
NitrateLevel           Nitrate concentration in water in mg/L
LeadConcentration      Lead concentration in water in mg/L
ArsenicConcentration   Arsenic concentration in water in mg/L
Humidity               Ambient humidity level in %
KidneyDisease          Presence or absence of kidney disease
                       0 – Absence of kidney disease
                       1 – Presence of kidney disease

* Please note that this is simulated data, generated to resemble real-world data for the purpose of this assignment.

Consider the scenario described and the data set provided [KidneyData.csv] to answer the following questions.

Identify the target variable and clearly specify the research question. (3 Marks)
Target variable:
Research Question:

Understand the data and perform the necessary data pre-processing. Clearly explain the steps taken. [Hint: data cleaning, dividing the data into training and test sets, etc.] (6 Marks)
[Write the steps taken here.]
Print the structure of the data before cleaning and pre-processing here. [Hint: use the str() function]
Print the structure of the training data after cleaning and pre-processing here.

Perform a thorough data exploration using the provided dataset. You may use various visualization techniques (such as histograms, scatter plots, box plots, correlation matrices, etc.) to uncover significant patterns and insights. Interpret your outputs and discuss key findings. [Hint: You may use as many plots as necessary; make sure to interpret them.] (10 Marks)

Use logistic regression to answer the research question. Clearly explain the process and all the steps involved. [Hint: model building, model improvement, evaluation] (8 Marks)

Give your resultant model. (3 Marks)

End of questions for Part A. Part B will be available soon.

APPENDIX
[Attach all your R code and outputs here.]
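The workflow named in the hints (str() before and after cleaning, a training/test split, then building, improving, and evaluating a logistic regression model) could be sketched in base R roughly as follows. This is only an illustration under stated assumptions. KidneyData.csv is not reproduced here, so the data frame is a small simulated stand-in that reuses a few of the variable names from the data description. The simulated values, the outcome mechanism, the 70/30 split, the backward AIC selection, and the 0.5 classification cutoff are all assumptions made for this sketch, not requirements of the assignment.

```r
# Simulated stand-in for KidneyData.csv (assumption: all values are made up).
set.seed(42)
n <- 500
kidney <- data.frame(
  PatientID         = seq_len(n),
  Age               = round(runif(n, 20, 80)),
  BloodPressure     = rnorm(n, mean = 130, sd = 15),
  SmokingStatus     = sample(c("Never", "Former", "Current"), n, replace = TRUE),
  LeadConcentration = rexp(n, rate = 100),
  stringsAsFactors  = FALSE
)
# Hypothetical outcome mechanism, used only so the model has a signal to fit.
lin <- -6 + 0.04 * kidney$Age + 0.02 * kidney$BloodPressure
kidney$KidneyDisease <- rbinom(n, size = 1, prob = plogis(lin))

str(kidney)  # structure before cleaning and pre-processing

# Pre-processing: drop the identifier, encode the categorical predictor and
# the target as factors, and remove rows with missing values.
kidney$PatientID     <- NULL
kidney$SmokingStatus <- factor(kidney$SmokingStatus,
                               levels = c("Never", "Former", "Current"))
kidney$KidneyDisease <- factor(kidney$KidneyDisease, levels = c(0, 1))
kidney <- na.omit(kidney)

# Training/test split (assumption: 70% training, 30% test).
train_idx <- sample(seq_len(nrow(kidney)), size = floor(0.7 * nrow(kidney)))
train <- kidney[train_idx, ]
test  <- kidney[-train_idx, ]

str(train)  # structure of the training data after pre-processing

# Model building: logistic regression on all remaining predictors.
full_model <- glm(KidneyDisease ~ ., data = train, family = binomial)

# Model improvement: backward selection by AIC (one possible strategy).
reduced_model <- step(full_model, direction = "backward", trace = 0)
summary(reduced_model)

# Evaluation on the test set with a 0.5 cutoff (assumption).
test_prob <- predict(reduced_model, newdata = test, type = "response")
test_pred <- factor(ifelse(test_prob > 0.5, 1, 0), levels = c(0, 1))
conf_mat  <- table(Predicted = test_pred, Actual = test$KidneyDisease)
accuracy  <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

With the real data, the simulated block would be replaced by reading the file, for example with read.csv("KidneyData.csv"), and the remaining predictors would enter the model through the same formula. The estimates returned by coef(reduced_model) are on the log-odds scale, so the resulting model equation takes the form log(p / (1 - p)) = b0 + b1*x1 + ... + bk*xk, where p is the probability that KidneyDisease = 1.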

 