ICT702 Investigate Child Mortality Levels: Assessment Answers
Your task is to investigate mortality levels of children, infants and babies around the world. Which countries have a high rate of child deaths and which countries have low rates? Are there connections between child mortality rates and the income levels of the country, or the region where they are situated? Are the child mortality rates improving in recent years or getting worse? Reduction of child mortality was Goal 4 of the United Nations' Millennium Development Goals (2000-2015).
The data you will need is available from the WHO (World Health Organisation). The two files you need to download are available from these web pages (or can be downloaded from the ICT702 Blackboard page):
- child mortality data (Download COMPLETE Data as CSV Dataset).
- country metadata. (To get all the data you need, you must download the JSON version of this file, or the CSV XMART version).
Learning Objectives
In this task you will learn how to:
- Apply relevant Python programming concepts to a data analysis challenge
- Read data from real sources and wrangle it into the form you need.
- Develop creative approaches to solving the wrangling/analysis problems.
- Adhere to the recommended Python programming styles
- Write programs that produce correct and useful output
- Organise and present a data analysis report
- Give an insightful analysis of the given problem.
Task 2 is broken into two parts (each worth 20% of the course).
- Part 1 (due Week 9). Use Python to read and analyse the child mortality data and generate various useful graphs that give insight into the trends.
- Part 2 (due Week 12). Use Python to combine the child mortality data and the country metadata, to give higher-level analyses of child mortality in relation to income grouping and regions of the world.
1 Child Mortality
In this first part of Task 2, you should write a Python script that reads and analyses the child mortality data file (WHOSIS_MDG_000003.csv) and produces at least FIVE useful graphs that give insight into the data trends.
You should produce the following graphs:
- graph the under-five mortality rates of all countries in 1990, sorted from lowest rate to highest rate. (If the graph is too crowded, you can show just each 10th country, or show the first 10 and last 10 countries).
- graph the under-five mortality rates as above, but for 2015.
- choose three representative countries in different areas of the world, then create a line graph showing the trends in under-five mortality in those countries over the years from 1990 to 2015.
- for those same three countries, graph the reduction in under-five mortality rates over the 1990/2015 period - that is, the 1990 rate divided by the 2015 rate.
- an under-five mortality graph of your choice... (explain why you chose that graph, and what conclusions you draw from the graph).
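The data preparation behind the first, second and fourth graphs can be sketched in plain Python. This is only a minimal sketch: the function names and the sample rates are hypothetical, and the rates are assumed to be already parsed into a dictionary that maps each country to a number.

```python
def sort_by_rate(rates):
    """Return (country, rate) pairs sorted from lowest to highest rate."""
    return sorted(rates.items(), key=lambda pair: pair[1])


def first_and_last(pairs, n=10):
    """Keep the n lowest and n highest entries of an already sorted list."""
    if len(pairs) <= 2 * n:
        return list(pairs)
    return pairs[:n] + pairs[-n:]


def reduction_factor(rate_1990, rate_2015):
    """1990 rate divided by the 2015 rate, as the task defines the reduction."""
    return rate_1990 / rate_2015


# Hypothetical rates (deaths per 1000 live births), for illustration only.
sample_1990 = {"A": 90.0, "B": 12.0, "C": 180.0, "D": 45.0}
sample_2015 = {"A": 30.0, "B": 4.0, "C": 90.0, "D": 15.0}

ranked = sort_by_rate(sample_1990)
thinned = first_and_last(ranked, n=1)
factors = {c: reduction_factor(sample_1990[c], sample_2015[c]) for c in sample_1990}
```

The country names and rates in `ranked` or `thinned` can then be unpacked into two lists and passed to a bar-chart call such as `plt.bar`. A factor above 1 means the rate fell between 1990 and 2015.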
Hints:
- Some of the columns contain multiple values (a mortality rate, plus a confidence interval), so you will need to split these up into separate columns.
- You can either use standard Python data structures to store and manipulate the data, or use the Pandas library if you prefer.
- Use markup and headings to break your Jupyter notebook into sections and give commentary about what you are doing, and discussion of your results. This Jupyter notebook will be what you submit.
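The first hint, about columns that hold a rate plus a confidence interval, could be handled along these lines. The sketch assumes each cell looks like `91.1 [83.1-99.9]`; that exact layout and the function name `split_estimate` are assumptions, so check them against the downloaded file before relying on this.

```python
def split_estimate(cell):
    """Split 'value [low-high]' into three floats; missing parts become None.

    Assumes non-negative numbers, so the '-' inside the brackets can only
    be the separator between the lower and upper bound.
    """
    value_part, _, interval_part = cell.partition("[")
    value = float(value_part) if value_part.strip() else None
    low = high = None
    interval_part = interval_part.strip().rstrip("]")
    if interval_part:
        low_text, _, high_text = interval_part.partition("-")
        low, high = float(low_text), float(high_text)
    return value, low, high


full = split_estimate("91.1 [83.1-99.9]")
bare = split_estimate("12.5")
empty = split_estimate("")
```

Returning a tuple keeps the three numbers together, so each can be appended to its own list or column.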
2 Child Mortality and Country Types
In this second part of Task 2, you should write another Python script that reads and analyses the country metadata (COUNTRY.json) and merges it with the child mortality data from Part 1, to allow you to do some higher-level analysis of under-five mortality trends.
Your report should include at least two graphs that display or compare under-five mortality in different regions of the world (using the 'WHO_REGION' string to group the countries).
Your report should include at least two graphs that compare under-five mortality across different income groupings (using the 'WORLD_BANK_INCOME_GROUP' string to classify the countries).
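One way to prepare the data for such group comparisons is to average the country rates within each region or income group. The sketch below assumes two dictionaries keyed by country name; the function name and sample values are hypothetical, and it computes a simple unweighted mean of country rates, not a population-weighted rate.

```python
from collections import defaultdict


def mean_rate_by_group(rates, groups):
    """Average the rate over the countries in each group.

    Countries that have no entry in `groups` are skipped.
    """
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of rates, count]
    for country, rate in rates.items():
        group = groups.get(country)
        if group is None:
            continue
        totals[group][0] += rate
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Hypothetical values, for illustration only.
rates_2015 = {"A": 30.0, "B": 4.0, "C": 90.0, "D": 10.0, "E": 50.0}
income_group = {"A": "Low income", "B": "High income",
                "C": "Low income", "D": "High income"}

averages = mean_rate_by_group(rates_2015, income_group)
```

The same function works for regions by passing a dictionary that maps each country to its 'WHO_REGION' value.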
Hints:
- You can use the 'json' library to read the .json file. The resulting object is quite deeply nested, so you will need to explore which substructures contain the data that you want, and then extract that substructure into a dictionary or list that is easier to use. Or write a function that extracts the data that you need.
- You can either use standard Python data structures to store and manipulate the data, or use the Pandas library if you prefer.
- Use markup and headings to break your Jupyter notebook into sections and give commentary about what you are doing, and discussion of your results. This Jupyter notebook will be what you submit.
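The first hint suggests writing a function that pulls the needed substructure out of the nested JSON object. A minimal sketch follows, assuming a layout of dimension, then code, then attr entries with `category` and `value` keys. That layout is an assumption, so explore the real COUNTRY.json first; the sample text and the function name are hypothetical.

```python
import json


def extract_attributes(document, wanted):
    """Map each country name to {category: value} for the wanted categories."""
    result = {}
    for dimension in document.get("dimension", []):
        for code in dimension.get("code", []):
            attrs = {
                attr["category"]: attr["value"]
                for attr in code.get("attr", [])
                if attr.get("category") in wanted  # exact match on the name
            }
            result[code.get("display")] = attrs
    return result


# Hypothetical document with the assumed layout, for illustration only.
sample_text = """
{"dimension": [{"label": "COUNTRY", "code": [
  {"label": "AAA", "display": "Country A", "attr": [
     {"category": "WHO_REGION", "value": "Region 1"},
     {"category": "WHO_REGION_CODE", "value": "R1"},
     {"category": "WORLD_BANK_INCOME_GROUP", "value": "Low income"}]}
]}]}
"""

metadata = extract_attributes(
    json.loads(sample_text), {"WHO_REGION", "WORLD_BANK_INCOME_GROUP"}
)
```

For a real file, `json.load` on the opened file object would replace `json.loads(sample_text)`. The result is a flat dictionary per country, which is easier to merge with the mortality data than the nested original.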
3 Marking Criteria
Your submission for each part of this task will be assessed according to the following criteria:
- Presentation and organisation of your report [30%]
- Data analysis and program output [30%]
  - Correctness of the program output and graphs
  - Insightful analysis and discussion of the data
- Programming style [40%]
  - Creative approaches to solving the problem
  - Good use of relevant programming concepts, especially good use of functions
  - Good use of appropriate Python data structures, such as lists and dictionaries
Answer:
import csv
from json import loads, dumps
import matplotlib.pyplot as plt
import numpy as np
# Placeholder grid and accumulators
w, h = 5, 5240
matinil = [[0 for x in range(w)] for y in range(h)]
variable1990, variableco1, variableco2, variableco3 = [], [], [], []
myred1, myred2, myred3 = [], [], []
variable2015, val1, val2, val3 = [], 0, 0, 0
j, k = 0, 0


def point_estimate(cell):
    """Return the text before the '[' that opens the confidence interval."""
    return cell.split('[')[0].strip()


# Start CSV read: one list per column of interest
datastore = {}
WHOSIS1 = []  # country
WHOSIS2 = []  # year
WHOSIS3 = []  # infant mortality rate
WHOSIS4 = []  # neonatal mortality rate
WHOSIS5 = []  # under-five mortality rate
# The brief calls this file WHOSIS_MDG_000003.csv; adjust the name to match your download.
with open('WHOSIS_000003.csv', newline='') as inputcsv:
    reader = csv.DictReader(inputcsv)
    # Do the processing
    for whorow in reader:
        datastore = dict(whorow)
        WHOSIS1.append(datastore['Country'])
        WHOSIS2.append(datastore['Year'])
        WHOSIS3.append(point_estimate(datastore['Infant mortality rate (probability of dying between birth and age 1 per 1000 live births)']))
        WHOSIS4.append(point_estimate(datastore['Neonatal mortality rate (per 1000 live births)']))
        WHOSIS5.append(point_estimate(datastore['Under-five mortality rate (probability of dying by age 5 per 1000 live births)']))

w, h = 10, 4442
Matrix = [[0 for x in range(w)] for y in range(h)]

# Read the country metadata from its CSV export
myvals = {}
COUNTRYcomb1, COUNTRYcomb2, COUNTRYcomb3, COUNTRYcomb4, COUNTRYcomb5 = [], [], [], [], []
COUNTRYcomb6, COUNTRYcomb7, COUNTRYcomb8, COUNTRYcomb9, COUNTRYcomb10 = [], [], [], [], []
with open('COUNTRY.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        myvals = dict(row)
        COUNTRYcomb1.append(myvals['attribute__label'])
        COUNTRYcomb2.append(myvals['attribute__display'])
        COUNTRYcomb3.append(myvals['dimension__code__display'])
        COUNTRYcomb4.append(myvals['dimension__code__attr__category'])
        COUNTRYcomb5.append(myvals['dimension__code__attr__value'])

# Group the metadata rows by attribute category
regionmat1, regionmat2, regionmat3, regionmat4, regionmat5 = [], [], [], [], []
incomemat1, incomemat2, incomemat3, incomemat4, incomemat5 = [], [], [], [], []
for i in range(len(COUNTRYcomb4)):
    category = COUNTRYcomb4[i]
    # Compare with == rather than 'in': 'WHO_REGION' is a substring of
    # 'WHO_REGION_CODE' (and likewise for the income group names), so a
    # substring test would mix the categories together.
    if category == 'WORLD_BANK_INCOME_GROUP':
        incomemat2.append(COUNTRYcomb2[i])
        incomemat3.append(COUNTRYcomb3[i])
        incomemat4.append(COUNTRYcomb4[i])
        incomemat5.append(COUNTRYcomb5[i])
    if category == 'WORLD_BANK_INCOME_GROUP_RELEASE_DATE':
        incomemat1.append(COUNTRYcomb5[i])
    if category == 'WHO_REGION_CODE':
        regionmat1.append(COUNTRYcomb5[i])
    if category == 'WHO_REGION':
        regionmat2.append(COUNTRYcomb2[i])
        regionmat3.append(COUNTRYcomb3[i])
        regionmat4.append(COUNTRYcomb4[i])
        regionmat5.append(COUNTRYcomb5[i])
def plot_values(values, title):
    """Plot one metadata column against its row position."""
    plt.figure(figsize=(20, 10))
    plt.plot(values, marker='o', markerfacecolor='red')
    plt.xlabel("dimension__code__attr__category")
    plt.ylabel("dimension__code__attr__value")
    plt.title(title)
    # The 'b' keyword of grid() was renamed to 'visible' in Matplotlib 3.5,
    # so the flag is passed positionally here.
    plt.grid(True, which='major', axis='both')


plot_values(incomemat5, "WORLD_BANK_INCOME_GROUP value")
plot_values(incomemat1, "WORLD_BANK_INCOME_GROUP_RELEASE_DATE value")
plot_values(regionmat1, "WHO_REGION_CODE Value")
plt.show()