Cis8008 Business Intelligence - Decision Assessment Answers

Assignment 3 consists of three main tasks and a number of sub tasks.

Task 1

Critically review and discuss the My Health Record system (www.myhealthrecord.gov.au) in terms of current privacy provisions for patients' electronic health records, drawing on the Australian Digital Health Agency’s privacy policy (https://www.myhealthrecord.gov.au/about/privacy-policy) and recent changes to the My Health Record Act which will be brought into line with the existing Australian Digital Health Agency policy.

Your review and discussion of My Health Record system and its privacy provisions for patients should be guided by the following:

Australian Privacy Principles (APPs) in the Privacy Act (https://www.digitalhealth.gov.au/policies/privacy),
Requirements of the My Health Records Act (https://www.legislation.gov.au/Details/C2017C00313) and
Healthcare Identifiers Act (https://www.legislation.gov.au/Details/C2017C00239)

Task 2

The goal of Task 2 is to predict the likelihood of a customer becoming delinquent and defaulting on a loan for ACME Bank (see Table 1 Data Dictionary for loan-delinq.csv data set below). It is important you understand this data set in order to complete Task 2 and its four sub tasks.

Task 2.1 Conduct an exploratory data analysis of the training data set loan-delinq.csv using RapidMiner Studio data mining tool.

Provide the following for Task 2.1:

A screen capture of your final EDA process and briefly describe your final EDA process
Summarise the key results of your exploratory data analysis in a table named Table 2.1 Results of Exploratory Data Analysis for loan-delinq.csv

Discuss the key results of your exploratory data analysis and provide a rationale for selecting your top 5 variables for predicting loan delinquency as the outcome, based on the results of your exploratory data analysis and a review of the relevant literature on key factors contributing to a loan delinquency

Note: Table 2.1 should include the key characteristics of each variable in the loan-delinq-train.csv data set such as maximum, minimum values, average, standard deviation, most frequent values (mode), missing values and invalid values etc.

Hint: The Statistics Tab and the Chart Tab in RapidMiner provide a lot of descriptive statistical information and the ability to create useful charts like Barcharts, Scatterplots etc for the EDA analysis. You might also like to look at running some correlations or chi sq tests whichever is appropriate for the loan-delinq.csv data set to indicate which variables are the top 5 key variables and contribute most to predicting a loan delinquency as an outcome.

Task 2.2 Build a Decision Tree model for predicting loan delinquency based on the data set loan-delinq.csv using RapidMiner and an appropriate set of data mining operators and a reduced set of variables from loan-delinq.csv determined by your exploratory data analysis in Task 2.1. Provide the following for Task 2.2:

(1) Final Decision Tree Model process, (2) Final Decision Tree diagram, and (3) Decision Tree Rules
Briefly explain your final Decision Tree Model Process, and discuss the results of the Final Decision Tree Model drawing on the key outputs (Decision Tree Diagram, Decision Tree Rules) for predicting loan delinquency. This discussion should be based on the contribution of each of the top five variables to the Final Decision Tree Model and relevant supporting literature on the interpretation of decision trees.

Table 1 Data dictionary: loan-delinq.csv data set

Variable Name	Description	Type
SeriousDlqin2yrs	Person experienced 90 days past due delinquency or worse	Y/N
RevolvingUtilizationOfUnsecuredLines	Total balance on credit cards and personal lines of credit except real estate and no installment debt like car loans divided by sum of credit limits	percentage
age	Age of borrower in years	integer
NumberOfTime30-59DaysPastDueNotWorse	Number of times borrower 30-59 days past due but no worse in last 2 years.	integer
DebtRatio	Monthly debt payments, alimony, living costs divided by monthly gross income	percentage
MonthlyIncome	Monthly income	real
NumberOfOpenCreditLinesAndLoans	Number of Open loans (installment like car loan or mortgage) and Lines of credit (e.g. credit cards)	integer
NumberOfTimes90DaysLate	Number of times borrower has been 90 days or more past due.	integer
NumberRealEstateLoansOrLines	Number of mortgage and real estate loans including home equity lines of credit	integer
NumberOfTime60-89DaysPastDueNotWorse	Number of times borrower has been 60-89 days past due but no worse in last 2 years.	integer
NumberOfDependents	Number of dependents in family excluding themselves (spouse, children etc.)	integer

Task 2.3 Build a Logistic Regression model for predicting loan delinquency based on the loan-delinq.csv data set using RapidMiner and an appropriate set of data mining operators and a reduced set of variables determined by your exploratory data analysis in Task 2.1. Provide the following for Task 2.3:

(1) Final Logistic Regression Model process, (2) Coefficients, and (3) Odds Ratios. Hint: you can use the RapidMiner Studio Logistic Regression operator, or you can install the Weka Extension in RapidMiner Studio and use its Logistic Regression operator for Task 2.3.
Briefly explain your final Logistic Regression Model Process and discuss the results of the Final Logistic Regression Model drawing on the key outputs (Coefficients, Odds Ratios) for predicting loan delinquency. This discussion should be based on the contribution of each of the top five variables to the Final Logistic Regression Model and relevant supporting literature on the interpretation of logistic regression models

Task 2.4 Conduct a comparative performance evaluation of your Final Decision Tree Model with your Final Logistic Regression Model for predicting loan delinquency. Note you will need to use the Cross Validation Operator, Apply Model Operator and Performance (Binominal Classification) Operator in your final data mining process models (Decision Tree, Logistic Regression) to generate the required model performance metrics (Accuracy, Misclassification Rate, True Positive Rate, False Positive Rate, Area under ROC Curve (AUC), Precision, Recall, Lift, Sensitivity, F Measure) required for Task 2.4.

Provide the following for Task 2.4:

A screen snapshot of the Confusion Matrix and AUC for each Final Model (Decision Tree, Logistic Regression)
A table named Table 2.2 Results of Model Performance Evaluation (Decision Tree, Logistic Regression) that compares the key results of the performance evaluation for the Final Decision Tree Model and Final Logistic Regression Model in terms of Model Accuracy, Misclassification Rate, True Positive Rate, False Positive Rate, Precision, Recall, Lift, Sensitivity, F Measure
Discuss and compare the key results of your performance evaluation of the two final models (Decision Tree, Logistic Regression) presented in parts i and ii of Task 2.4, indicate which model is better and explain why.

The important outputs from data mining analyses conducted using RapidMiner for Task 2 should be included in your Assignment 3 report to provide support for conclusions reached regarding each analysis conducted for Task 2.1, Task 2.2, Task 2.3 and Task 2.4.

Note export the important outputs from RapidMiner as jpg image files and include these screenshots in the relevant Task 2 sections and/or appendices of your Assignment 3 Report.

Note you will find the Sharda et al. 2018 and North Text books useful references for the data mining process activities conducted in Task 2 in relation to the exploratory data analysis, decision tree analysis, logistic regression analysis and evaluation of the comparative performance of the Final Decision Tree model and the Final Logistic Regression model.

Task 3

Australian Weather dataset (see Data Dictionary Table 3.1) contains over 145,000 daily observations from January 2008 through to June 2017 from 49 Australian weather station locations for rainfall and evaporation recorded. Note for some weather station locations such as Uluru, the data set is incomplete. The daily observations are available from http://www.bom.gov.au/climate/data Bureau of Meteorology. Variable definitions adapted from http://www.bom.gov.au/climate/dwo/IDCJDW0000.shtml.

Table 3.1 Data dictionary for Australian Weather Data set variables

Variable Name	Data Type	Description
Date	Date	Date of weather observation
Location	Text	Common name of location of weather station.
Rainfall	Real	Amount of rainfall recorded for day in mm.
Evaporation	Real	So-called Class A pan evaporation (mm) in 24 hours to 9am.

Task 3 requires you build a Tableau Dashboard Australian Weather by Location (AWL) which includes four different views of the weatherAus.csv data set as specified in sub Tasks 3.1, 3.2, 3.3 and 3.4. An additional data set weatherAus-locations.csv is provided which will need to be joined with weatherAus.csv on the common variable/field location in order to provide location specific data views in the AWL dashboard.

See first record of weatherAUS-locations.csv data set

stnID	Location	stnNum	latitude	longitude	postcode	state
2002	Albury	72160	-36.069	146.9509	2640	nsw

It’s a simple operation in Tableau to join two different files on a common variable/field name; for Assignment 3 it is the Location variable/field.
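
The join described above can also be sketched outside Tableau. The following is a minimal illustration in Python, assuming an inner join on the Location field. The location record is the first record of weatherAUS-locations.csv shown above, while the weather rows and their dates are hypothetical values invented for the example.

```python
import csv
import io

# Hypothetical weather observations (illustrative values only).
weather_csv = """Date,Location,Rainfall,Evaporation
2012-06-01,Albury,0.6,1.2
2012-06-02,Albury,0.0,2.0
2012-06-01,ExampleTown,3.4,1.0
"""

# First record of the locations file, as shown in the assignment text.
locations_csv = """stnID,Location,stnNum,latitude,longitude,postcode,state
2002,Albury,72160,-36.069,146.9509,2640,nsw
"""

# Index the location records by the common field.
locations = {
    row["Location"]: row
    for row in csv.DictReader(io.StringIO(locations_csv))
}

# Inner join: keep only observations whose Location has a matching station.
joined = []
for observation in csv.DictReader(io.StringIO(weather_csv)):
    station = locations.get(observation["Location"])
    if station is None:
        continue  # no matching location record, so the row is dropped
    joined.append({**observation, **station})
```

Each joined row carries the state, latitude and longitude alongside the daily measurements, which is what the state filters and the geomap view rely on.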

Task 3.1 Create a Tableau View of rainfall by day for each location and a specific state and related locations. Provide a screen capture of and describe the Tableau view you have created, and comment on the rainfall over one month across state locations and whether this differs much between states.

Task 3.2 Create a Tableau View of total rainfall by year for each location and a specific state. Provide a screen capture of and describe the Tableau view you have created and comment on the variation of total rain across locations for a specific state.

Task 3.3 Create a Tableau View that compares locations by total evaporation for a specific state over months. Provide a screen capture of and comment on the levels of evaporation for different locations and states.

Task 3.4 Create a Tableau GeoMap View of all Australian weather stations that displays the provided latitude and longitude and total rainfall for a selected year. Provide a screen capture of and describe the Tableau Geomap view you have created and comment on one selected location for a state.

Task 3.5 Provide screen snapshot of your AWL Dashboard and an accompanying rationale (drawing on the relevant literature for good dashboard design) for the graphic design and functionality that is provided by your AWL Dashboard for the four specified Tableau views for sub Tasks 3.1, 3.2, 3.3 and 3.4

Note Stephen Few is considered to be the Guru for good Dashboard Design and has written a number of books on this topic. It is worth having a look at his website https://www.perceptualedge.com/about.php and in particular his examples of poorly designed dashboard views and his suggestions for better dashboard views.

Answer

Business Intelligence

Task 1

Introduction

Privacy in health research has become an issue that attracts considerable attention. Designing relevant health care services requires research into health-related issues, and the data collected for that purpose is a vital asset of any organization. Within that data, confidential information needs special attention, and the organization should focus on maintaining its confidentiality. Protecting confidential information by means of software and hardware is known as information security. It implies a combined internal and external system of operation in which collected information and data are kept protected. Recorded data and information play several important roles in the functioning of an organization, and information security serves several functions. Its primary responsibility is to maintain the privacy of the collected data. A secure information system also protects the organization's capacity to perform its assigned functions. Another important aspect is securing the technology the organization uses. Organizations give special attention to protecting information because unauthorized access to confidential information harms people directly or indirectly connected to the organization.

All health-related data in Australia are documented under the supervision of the Australian Digital Health Agency. With the increasing prevalence of various health issues, the volume of gathered information is growing rapidly. The agency gathers and maintains all of this health-related information. Growing concern about various diseases encourages more people to seek health care services, and the number of people seeking services far exceeds the number of available service providers (Digitalhealth.gov.au. 2018). Maintaining detailed information about each individual recipient has therefore become extremely important. Information is collected on personal details and health status. Technological advances in the equipment used by health care units increase the information available about each recipient. Data related to prenatal testing is the most easily accessible. Analysis of health-related risk factors is an important aspect in determining the continuation of required services. Properly maintained data also provides protection against unjustified allegations or claims. In addition to direct health care, various other aspects fall under primary health care services. It is the responsibility of service providers to document observations and instructions. Third parties are often involved in the system, where relevant information is used to pay for the services used.

Importance of electronic information recording system

Health care organizations today pay great attention to securing information. Given the large volume of health care data, security systems are designed to maintain the confidentiality of personal information. Without proper security, it would be very easy to access and misuse this information. Information is now recorded and shared electronically instead of through the earlier paper-based method of documentation (Dinev et al. 2016). The paper-based method of photocopying important documents is laborious and requires more time than storing data electronically.

The relevant data is gathered from various sources and is combined and connected to other profiles. With an electronic process it is therefore easier to explore the database within the established network and extract data from remote locations. Nevertheless, the system increases the chance of third-party access to the stored data. Without adequate security, the ease of access leaves the organization's data unprotected, and an individual can access the data without leaving any trace. At the same time, the system allows service providers to understand trends in the data, which indicate the health conditions of the population, and gives them easy access to and understanding of the information held in the database (Van Cauteren et al. 2016). With advances in technology, this information is used to support electronic health care records for individuals, and mobile technology is also used to capture the required data. The major significance of the electronic health record (EHR) is to support the activities of the industry. The continuous evolution and improvement of the system thus increases the quality of the health care service delivered.

Keeping information about the status of stored data electronically reduces processing errors. This in turn reduces the risk of malpractice in meeting reimbursement claims. However, the current legal framework governing health care records has drawbacks. The obligations of health care providers still rest on both electronic and paper-based methods, so confidentiality conditions may vary with the information holder.

Evolution of electronically stored health records

The rapid growth in the volume of health records has encouraged the use of electronic recording systems, which directly support service providers. Large variation across the nation has driven strong support for the electronic health record. The electronic health record plays a significant role in major hospitals, as it allows the authorities to understand the history of patients. The national center has indicated that 75% of service providers are able to enhance the quality of patient care with the use of electronic data. The electronic recording system allows individuals to access information about patients and make appropriate decisions during critical hours (Kim et al. 2017). The system also provides alerts about new medications and physicians that patients are considering for their health issues. Digital health technology has thus changed substantially in recent years to support hospitals with patient information. The structure of the health record system has also played a role in distributing information among different hospitals.

The personal information and other personal data of Australian citizens are stored by the Australian Digital Health Agency. Because this data comprises citizens' personal and crucial documents, it needs to be stored and governed under regulatory legislation such as the Privacy Act 1988. The health record system manages all personal information in a classified way. The personal data collected and stored is useful and can be drawn on when it is needed for communication and management purposes. The “My Health Record” system is widely used by the organization and its operators to reclassify and arrange the data. Crucial information gathered about the health care products and services rendered is stored in a structured way and with privacy. Protection of the personal data gathered is a privacy matter for the organization and should be regulated by the regulatory bodies through rules and guidelines. The regulatory body can take steps such as imposing penalties and fines and bringing regulatory and criminal proceedings against those involved in breaching secured and private data (Zingg et al. 2015).

Data flows into the organization in several ways, such as records of telephone conversations, emails and general letters, which may contain private data. Organizations also collect various kinds of employment-related data, which should likewise be managed and stored effectively. Data collection and processing are well managed in terms of relations with the organization's employees (Watanabe 2015). Management should assess the situations in which the data collected may be used for effective decision making. Such situations arise when the organization reviews its data management and the data is used in processes such as contracts, workforce management, meeting the obligations and rules of regulatory bodies, and relating goods to the market information available to management. The Department of Human Services has several requirements for providing health data records and information, which the regulatory body enforces to ensure better data management and processing. Registrations are also taken from those interested in registering for the digital health care system, for which security is an important factor. Several steps need to be taken to protect the data of organizations (Booth et al. 2018).

Access to the data should be given to individuals only after careful verification of their identity. Parental responsibility should also be taken into account, and an authorized representative should be over the age of 18.

Task 2:

Task 2.1:

In order to explore the loan delinquency data of ACME Bank, the first step is to import the data into RapidMiner. Exploration of the data file showed that the first variable is an identifier, which was removed using the “Select Attributes” operator. The Correlation Matrix operator was then added, completing the main process. The procedure is illustrated in Figure 2.1. Executing the process produced the exploratory data analysis of the data set together with the correlation matrix.

Figure 2.1: Figure illustrating data analysis and Correlation Matrix Procedure

The table below provides the exploratory data analysis relating to loan delinquency.

Table 2.1: Results of Exploratory Data Analysis for loan-delinq.csv

The analysis reviews the loan delinquency information that has been collected. The data covers 150,000 customers of the bank. Values are present for all attributes except monthly income and number of dependents: monthly income is missing for 29,731 customers, and the number of dependents is missing for 3,924 customers.

Loan delinquency is measured by the dichotomous variable SeriousDlqin2yrs, which records whether a person experienced a delinquency of 90 days past due or worse. About 6.7% of the customers experienced such a delinquency, while the remaining 93.3% did not, so the two classes are heavily imbalanced.

“RevolvingUtilizationOfUnsecuredLines” is a continuous variable. Its minimum value is 0 and its maximum is 50708, with an average of 6.048. The majority of customers make use of unsecured lines.

The variable “age” is treated as a continuous variable. The minimum and maximum ages are 0 and 109 respectively, and the average age is 52.9295. The minimum of 0 is not a plausible borrower age and is likely an invalid value. The histogram indicates that customer age is approximately normally distributed.

The variables “NumberOfTime30-59DaysPastDueNotWorse”, “NumberOfTimes90DaysLate” and “NumberOfTime60-89DaysPastDueNotWorse” each have a minimum of 0 and a maximum of 98. “DebtRatio” has a minimum of 0 and a maximum of 329664, with an average of 353.005.

For “NumberOfOpenCreditLinesAndLoans” the minimum and maximum values are 0 and 58, and the average is 8.453.

For “NumberOfDependents” the minimum and maximum values are 0 and 20 respectively, and the average number of dependents is 0.757. According to the analysis, the most frequent value is 1 dependent, and comparatively few customers have more.
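
The per-variable statistics reported above (minimum, maximum, average, most frequent value and missing values) can be reproduced for any column with a few lines of code. The following is a minimal sketch in Python rather than RapidMiner. The income values are hypothetical and `summarise` is an illustrative helper, not part of the original process.

```python
import statistics

def summarise(values):
    """Return descriptive statistics for one column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "mode": statistics.mode(present),    # most frequent value
        "missing": len(values) - len(present),
    }

# Hypothetical MonthlyIncome column with two missing entries.
monthly_income = [5400.0, None, 3200.0, 5400.0, None, 9100.0]
summary = summarise(monthly_income)
```

Counting missing values separately from the numeric summary matters here because MonthlyIncome and NumberOfDependents are the two attributes with gaps.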

To select predictors, the correlation of each variable with loan delinquency was calculated. The figure shows the correlation matrix for the different variables. The correlation matrix indicates that “NumberOfTime30-59DaysPastDueNotWorse”, “NumberOfTimes90DaysLate”, “NumberOfTime60-89DaysPastDueNotWorse”, “NumberOfDependents” and “age” have the greatest degree of correlation with loan delinquency. These five variables are therefore selected for analysing loan delinquency.
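
The selection step can be illustrated with a short sketch that ranks predictors by the absolute value of their Pearson correlation with the target. This is an assumed reading of how the correlation matrix was used, and the rows below are hypothetical toy values, not records from loan-delinq.csv.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data: six hypothetical customers.
toy = {
    "SeriousDlqin2yrs": [0, 0, 1, 0, 1, 0],
    "NumberOfTimes90DaysLate": [0, 0, 2, 0, 1, 0],
    "age": [61, 45, 30, 52, 38, 70],
}

target = toy["SeriousDlqin2yrs"]

# Rank predictors by strength of association, ignoring the sign.
ranking = sorted(
    ((name, pearson(values, target))
     for name, values in toy.items() if name != "SeriousDlqin2yrs"),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)
```

Ranking on the absolute value keeps negatively correlated variables such as age in contention, since a strong negative association is as informative as a positive one.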

Task 2.2:

Figure 2.2 (a) Decision Tree Procedure

Figure 2.2(b): Decision Tree

Figure 2.2(c): Decision Tree Process

To build the decision tree in RapidMiner, the five variables with the highest correlation with loan delinquency are selected. The Set Role operator is used to set loan delinquency as the target (label) variable, and the “Decision Tree” operator is used to build the tree. The least square criterion is used, with a maximum depth of 5. The minimal gain required for a split is set to 0.01.

The decision tree first splits on “NumberOfTimes90DaysLate” at 0.500; because this variable is an integer count, the split separates customers with no such events from those with at least one. For values above 0.500, the variable is split again at 1.500. For cases with “NumberOfTimes90DaysLate” greater than 1.500, “NumberOfTime30-59DaysPastDueNotWorse” is divided at 0.500, and the age attribute is used to further separate the cases where “NumberOfTime30-59DaysPastDueNotWorse” is greater than 0.500.

For “NumberOfTimes90DaysLate” below 0.500, “NumberOfTime30-59DaysPastDueNotWorse” is again split at 0.500, and “NumberOfTime60-89DaysPastDueNotWorse” is used to refine the prediction within this branch.

Overall, the decision tree uses all five variables to predict loan delinquency. The first-level splits use a threshold of 0.500, while the second-level split uses a threshold of 1.500.

Task 2.3:

Figure 2.3 (a): Logistic Regression Model

Figure 2.3(b): Logistic Regression Output

The image shows the logistic regression process for predicting loan delinquency. The process begins by importing the data. The numerical delinquency variable is converted to a binominal variable, and the predictor variables are selected based on the correlation matrix for loan delinquency. The Set Role operator then designates loan delinquency as the dependent (label) variable, with the remaining attributes as independent variables.

The fitted relationship between the dependent variable (on the log-odds scale used by logistic regression) and the independent variables is stated below:

Loan delinquency = 5.961*Age + 1.230* NumberOfTime30-59DaysPastDueNotWorse + 1.938* NumberOfTimes90DaysLate + 1.256* NumberOfTime60-89DaysPastDueNotWorse + 0.224* NumberOfDependents – 9.462

The equation shows that all the independent variables have a positive effect on loan delinquency. Age has the largest coefficient, so the equation implies that the chance of loan delinquency rises, rather than falls, with the age of the customer. The number of dependents has the smallest effect on loan delinquency.

However, age is not statistically significant for loan delinquency at the 0.05 level of significance. All the other variables have a statistically significant effect on loan delinquency.
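
The odds ratios requested for Task 2.3 follow from the coefficients: in logistic regression each coefficient is a change in log-odds, so its exponential is the odds ratio for a one-unit increase in that variable. The sketch below applies this to the coefficients reported in the equation above. It assumes they apply to the predictors on whatever scale RapidMiner used (raw or standardised), which the text does not state, and `predicted_probability` is an illustrative helper.

```python
import math

# Coefficients and intercept as reported in the fitted equation.
coefficients = {
    "age": 5.961,
    "NumberOfTime30-59DaysPastDueNotWorse": 1.230,
    "NumberOfTimes90DaysLate": 1.938,
    "NumberOfTime60-89DaysPastDueNotWorse": 1.256,
    "NumberOfDependents": 0.224,
}
intercept = -9.462

# Odds ratio for a one-unit increase in each predictor: exp(coefficient).
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

def predicted_probability(features):
    """Logistic function applied to the linear predictor (log-odds)."""
    log_odds = intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-log_odds))

# Example: all predictors at zero on the model's input scale (an assumption).
baseline = predicted_probability({name: 0.0 for name in coefficients})
```

An odds ratio above 1 means the odds of delinquency increase as the variable increases, which is the case for every variable here because all the reported coefficients are positive.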

Task 2.4

Table 2.2: Results of Model Performance Evaluation (Decision Tree, Logistic Regression)

Measures	Logistic Regression	Decision Tree
Model Accuracy	93.51%	93.51%
True Positive	113.3	154.3
False Positive	84.1	125.0
Precision	57.44%	55.29%
Recall	11.31%	15.04%
Lift	859.85	828.00
Sensitivity	11.31%	15.40%
F Measure	18.88%	24.08%

Table 2.2 provides a comparative analysis of the decision tree and logistic regression models. Cross validation was employed to compare the models. Both models reach the same accuracy of 93.51%. The precision of the logistic regression model is higher at 57.44%, compared with 55.29% for the decision tree. The recall of the logistic regression model is 11.31%, compared with 15.04% for the decision tree, and the sensitivity of the decision tree is likewise better at 15.40% against 11.31% for the logistic regression model. Under cross validation the two models are therefore identical in accuracy, while the decision tree identifies a larger share of delinquent customers and achieves the higher F measure (24.08% against 18.88%).
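
The figures in Table 2.2 are related by standard formulas, which makes them easy to cross-check. The sketch below recomputes precision from the reported true positive and false positive counts, and the F measure from the reported precision and recall. Recall and sensitivity are the same quantity, so the decision tree calculation assumes the 15.40% sensitivity value. Small differences from the table are expected if RapidMiner averaged the metrics across folds, which is also an assumption.

```python
def precision(true_positives, false_positives):
    """Share of predicted delinquencies that were actual delinquencies."""
    return true_positives / (true_positives + false_positives)

def recall(true_positives, false_negatives):
    """Share of actual delinquencies that the model identified (sensitivity)."""
    return true_positives / (true_positives + false_negatives)

def f_measure(precision_value, recall_value):
    """Harmonic mean of precision and recall."""
    return 2 * precision_value * recall_value / (precision_value + recall_value)

# Precision recomputed from the reported true and false positive counts.
lr_precision = precision(113.3, 84.1)
dt_precision = precision(154.3, 125.0)

# F measure recomputed from the reported precision and recall/sensitivity.
lr_f = f_measure(0.5744, 0.1131)
dt_f = f_measure(0.5529, 0.1540)
```

Because only about one customer in fifteen is delinquent, accuracy alone says little about either model, which is why the comparison leans on recall and the F measure.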

Task 3

Task 3.1

Figure 3.1 : Daily rainfall at NorfolkIsland

The Tableau view has been created for daily weather at NorfolkIsland in June 2012. The bar chart for this weather station plots the days of June on the horizontal axis and rainfall on the vertical axis. The chart shows that the maximum rainfall occurred on 30th June. Rainfall was also very high from the 12th to the 15th, after which there was almost no rainfall until 29th June. There was also no rainfall at the start of the month. Changing the location in the Tableau file shows the rainfall at other locations.

Task 3.2

Figure 3.2 : Yearly rainfall at NorfolkIsland

For the yearly analysis we again study the rainfall at NorfolkIsland. The chart shows total rainfall rising to a peak and then declining over the period from 2009 to 2018. The highest total rainfall occurred in 2011 and the lowest in 2017. Rainfall decreased from 2011 to 2013 and was approximately equal from 2013 to 2015. There was a rise in rainfall in 2016, followed by a drastic fall in 2017. Note that the data set ends in June 2017, so the 2017 total covers only part of the year.

Task 3.3

Figure 3.3: Monthly rainfall at NorfolkIsland

For the monthly evaluation of rainfall we have again selected the location of NorfolkIsland, and the year of analysis is 2012. The month variable (from Date) is placed in columns and rainfall in rows. The rainfall measure is aggregated as a sum to present the total rainfall, and the colour of the bars is changed to yellow. The year variable (from Date) is placed in the filter, which allows 2012 to be selected as the year. The total monthly rainfall is represented through a bar chart in which the height of each bar represents the amount of rainfall. The chart shows that the highest rainfall occurred in January. The rainfall in 2012 followed a cyclic pattern: it declined from January until June, rose from June until November, and fell again in December.
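
The aggregation behind this view (filter by year, group by month, sum the rainfall) can be sketched as follows. The observations are hypothetical values for a single location, and `monthly_totals` is an illustrative helper rather than anything Tableau exposes.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily observations: (date, rainfall in mm).
observations = [
    (date(2012, 1, 3), 12.4),
    (date(2012, 1, 20), 5.0),
    (date(2012, 6, 9), 1.2),
    (date(2011, 1, 5), 30.0),
]

def monthly_totals(rows, year):
    """Sum rainfall per month for the selected year."""
    totals = defaultdict(float)
    for day, rainfall_mm in rows:
        if day.year == year:  # the year filter
            totals[day.month] += rainfall_mm
    return dict(sorted(totals.items()))

totals_2012 = monthly_totals(observations, 2012)
```

Changing the `year` argument plays the same role as changing the year filter in the Tableau view.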

Task 3.4

Figure 3.4: Geomap of Rainfall

The geomap in Tableau is created by placing longitude in the columns and latitude in the rows. Tableau automatically creates a geomap from the given latitudes and longitudes. To highlight the locations, the variable is placed on the colour mark, and the locations are shaded in a green gradient. The year 2010 is selected for assessing the rainfall. The lowest total rainfall in 2010 was 206 mm, while the highest was 2660 mm. Changing the year in the Tableau file displays the rainfall for a different year.
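The data behind such a map is one total per location for the chosen year, paired with that location's coordinates. The following minimal sketch joins a location table to rainfall records on the location name and computes those totals. The station names, coordinates, rainfall values, and the function `totals_by_location` are all hypothetical placeholders, not values from the assignment data set.

```python
# Hypothetical location table: name -> (latitude, longitude).
LOCATIONS = {
    "StationA": (-29.04, 167.94),
    "StationB": (-33.86, 151.21),
}

# Hypothetical rainfall records: (location, year, rainfall in mm).
RAINFALL = [
    ("StationA", 2010, 120.0),
    ("StationA", 2010, 80.0),
    ("StationB", 2010, 50.0),
    ("StationB", 2011, 500.0),   # other year, filtered out
    ("StationC", 2010, 10.0),    # no matching location row, dropped by the join
]

def totals_by_location(locations, rainfall, year):
    """Inner-join rainfall to locations and sum rainfall per location for a year."""
    totals = {}
    for loc, yr, rain_mm in rainfall:
        if yr == year and loc in locations:
            totals[loc] = totals.get(loc, 0.0) + rain_mm
    return [
        {
            "location": loc,
            "latitude": locations[loc][0],
            "longitude": locations[loc][1],
            "total_mm": total,
        }
        for loc, total in sorted(totals.items())
    ]

rows_2010 = totals_by_location(LOCATIONS, RAINFALL, 2010)
lowest = min(row["total_mm"] for row in rows_2010)    # bottom of the colour scale
highest = max(row["total_mm"] for row in rows_2010)   # top of the colour scale
```

The `lowest` and `highest` values play the role of the two ends of the colour gradient on the map. Rerunning with a different `year` argument mirrors changing the year in the Tableau file.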

Figure 3.5: Tableau Dashboard

References

Booth, A., Moylan, A., Hodgson, J., Wright, K., Langworthy, K., Shimizu, N. and Maconochie, I., 2018. Resuscitation registers: how many active registers are there and how many collect data on paediatric cardiac arrests? Resuscitation.

Digitalhealth.gov.au, 2018. Privacy - Australian Digital Health Agency. [online] Available at: https://www.digitalhealth.gov.au/policies/privacy [Accessed 6 Oct. 2018].

Dinev, T., Albano, V., Xu, H., D’Atri, A. and Hart, P., 2016. Individuals’ attitudes towards electronic health records: A privacy calculus perspective. In Advances in healthcare informatics and analytics (pp. 19-50). Springer, Cham.

John, A., Dennis, M., Kosnes, L., Gunnell, D., Scourfield, J., Ford, D.V. and Lloyd, K., 2014. Suicide Information Database-Cymru: a protocol for a population-based, routinely collected data linkage study to explore risks and patterns of healthcare contact prior to suicide to identify opportunities for intervention. BMJ open, 4(11), p.e006780.

Kim, Y.H., Han, K., Son, J.W., Lee, S.S., Oh, S.W., Kwon, H.S., Shin, S.A., Kim, Y.Y., Lee, W.Y. and Yoo, S.J., 2017. Data analytic process of a nationwide population-based study on obesity using the national health information database presented by the national health insurance service 2006-2015. Journal of Obesity & Metabolic Syndrome, 26(1), pp.23-27.

Legislation.gov.au, 2018. Healthcare Identifiers Act 2010. [online] Available at: https://www.legislation.gov.au/Details/C2017C00239 [Accessed 6 Oct. 2018].

Legislation.gov.au, 2018. My Health Records Act 2012. [online] Available at: https://www.legislation.gov.au/Details/C2017C00313 [Accessed 6 Oct. 2018].

Myhealthrecord.gov.au, 2018. My Health Record. [online] Available at: https://www.myhealthrecord.gov.au/ [Accessed 6 Oct. 2018].

Myhealthrecord.gov.au, 2018. My Health Record. Privacy Policy. [online] Available at: https://www.myhealthrecord.gov.au/about/privacy-policy [Accessed 6 Oct. 2018].

Van Cauteren, D., Millon, L., De Valk, H. and Grenouillet, F., 2016. Retrospective study of human cystic echinococcosis over the past decade in France, using a nationwide hospital medical information database. Parasitology research, 115(11), pp.4261-4265.

Watanabe, K., Ricoh Co Ltd, 2015. Data management for hospital form auto filling system. U.S. Patent Application 14/194,365.

Zingg, W., Holmes, A., Dettenkofer, M., Goetting, T., Secci, F., Clack, L., Allegranzi, B., Magiorakos, A.P. and Pittet, D., 2015. Hospital organisation, management, and structure for prevention of health-care-associated infection: a systematic review and expert consensus. The Lancet Infectious Diseases, 15(2), pp.212-224.

