ITC516 Data Mining and Visualisation for Business Intelligence-East We

This item requires the dataset EastWestAirlinesCluster.xls, which can be found on the subject Interact site. The dataset contains information on 3999 passengers who belong to an airline’s frequent flier program. For each passenger, the data include information on their mileage history and on the different ways they accrued or spent miles in the last year. The goal is to identify clusters of passengers with similar characteristics so that different segments can be targeted with different types of mileage offers.

a) Apply hierarchical clustering with Euclidean distance and Ward's method. Make sure to normalize the data first. How many clusters appear?
b) What would happen if the data were not normalized?
c) Compare the cluster centroids to characterize the different clusters, and try to give each cluster a label.
d) Use K-means clustering with the number of clusters that you found above. Does the same picture emerge?
e) Which clusters would you target for offers, and what types of offers would you target to customers in that cluster?

Answer:

Rule #1: For a customer who initiates purchase of brushes, there exists a conditional probability of 1 that the same customer would also purchase nail polish.

Rule #2: For a customer who initiates purchase of nail polish, there exists a conditional probability of 0.6322 that the same customer would also purchase brushes.

Rule #3: For a customer who initiates purchase of nail polish, there exists a conditional probability of 0.5920 that the same customer would also purchase bronzer.
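
The conditional probability quoted in each rule is the rule's confidence: the share of transactions containing the antecedent that also contain the consequent. A minimal Python sketch of that calculation follows. The baskets are made up for illustration and are not the assignment's transaction data, and the function name `confidence` is an assumption.

```python
# Hypothetical baskets for illustration only; not the assignment's data.
baskets = [
    {"brushes", "nail polish"},
    {"nail polish"},
    {"nail polish", "bronzer"},
    {"brushes", "nail polish", "bronzer"},
    {"bronzer"},
]


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent, i.e. P(consequent | antecedent)."""
    with_antecedent = [t for t in transactions if antecedent <= t]
    if not with_antecedent:
        raise ValueError("antecedent never occurs in the transactions")
    with_both = [t for t in with_antecedent if consequent <= t]
    return len(with_both) / len(with_antecedent)


conf_brushes_to_polish = confidence({"brushes"}, {"nail polish"}, baskets)
conf_polish_to_brushes = confidence({"nail polish"}, {"brushes"}, baskets)
print(conf_brushes_to_polish, conf_polish_to_brushes)
```

In this toy data every basket with brushes also contains nail polish, so the first rule has a confidence of 1, while the reverse rule does not.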

Definition: A rule is termed redundant if its support level can be predicted from another rule that acts as its ancestor (Zaki, 2000).

Application: Rule #2 can be taken as an apt example.

Rule #1 (support level) = Rule #2 (support level) = 2.8735

Additionally, Rule #1 (confidence level) > Rule #2 (confidence level).

Hence, Rule #2 is confirmed as redundant.
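
The check applied above can be written out as a short Python sketch. It encodes only the simplified test used in this answer (equal support, lower confidence), not Zaki's full formulation. The `Rule` class and `is_redundant` function are hypothetical names, and the support and confidence values are those quoted above.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    support: float      # support level as reported in the answer
    confidence: float


def is_redundant(rule, other, tol=1e-9):
    """Simplified check: `rule` is redundant relative to `other` when both
    have the same support and `rule` has the lower confidence."""
    same_support = abs(rule.support - other.support) <= tol
    return same_support and rule.confidence < other.confidence


rule_1 = Rule("brushes -> nail polish", support=2.8735, confidence=1.0)
rule_2 = Rule("nail polish -> brushes", support=2.8735, confidence=0.6322)

print(is_redundant(rule_2, rule_1))  # Rule #2 checked against Rule #1
print(is_redundant(rule_1, rule_2))  # Rule #1 checked against Rule #2
```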

Utility: It makes sense to consider the association rules collectively rather than in isolation, as this maximizes their utility. Even so, the key characteristics of each individual rule, particularly its support and confidence, remain important (Liebowitz, 2015).

XLMiner output (minimum confidence increased to 75%)

Observation: The list of rules now has only one entry, in sharp contrast with the output at the 50% level. This may be attributed to the other rules failing to reach the 75% minimum confidence threshold (Ana, 2014).

Advice: Setting a high minimum confidence level may be disadvantageous, since certain rules with good support or lift ratio may be ruled out, which would undermine the utility of the association rules for the researcher and the sponsor (Ragsdale, 2014).
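
The effect of the threshold can be illustrated with the confidence values of the three rules quoted earlier in this answer. This is a sketch of the filtering step only, not of XLMiner's rule generation, and the helper name `rules_meeting` is an assumption.

```python
# Confidence values of the three rules quoted earlier in this answer.
rules = {
    "brushes -> nail polish": 1.0,
    "nail polish -> brushes": 0.6322,
    "nail polish -> bronzer": 0.5920,
}


def rules_meeting(min_confidence, rule_confidences):
    """Return the names of the rules whose confidence reaches the threshold."""
    return [name for name, conf in rule_confidences.items() if conf >= min_confidence]


at_50 = rules_meeting(0.50, rules)
at_75 = rules_meeting(0.75, rules)
print(len(at_50), at_75)
```

Of these three rules, only the first survives the 75% threshold, which matches the single entry observed in the output.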

Hierarchical Clustering Output (Dendrogram)

Total clusters = 3

Reasons: 1) Three clusters were selected while running the hierarchical clustering.

2) A horizontal line drawn at a distance between 980 and 1000 in the dendrogram would cut it into three clusters.
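
For readers without XLMiner, the same procedure (normalise, cluster with Euclidean distance and Ward's method, cut the tree into three clusters) can be sketched in Python with NumPy and SciPy. The data below are synthetic stand-ins with two made-up variables. EastWestAirlinesCluster.xls is not loaded, so the sketch illustrates the steps only and does not reproduce the dendrogram above.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic stand-in data: 3 groups of 20 "passengers", two variables on
# very different scales (e.g. a miles balance and a transaction count).
rng = np.random.default_rng(0)
centers = np.array([[20000.0, 2.0], [80000.0, 10.0], [150000.0, 25.0]])
X = np.vstack([c + rng.normal(scale=[3000.0, 1.0], size=(20, 2)) for c in centers])

# Normalise each variable to mean 0 and standard deviation 1.
Z_scores = (X - X.mean(axis=0)) / X.std(axis=0)

# Ward's method operates on Euclidean distances between observations.
link = linkage(Z_scores, method="ward")

# Cutting the tree into three clusters plays the role of drawing a
# horizontal line through the dendrogram.
labels = fcluster(link, t=3, criterion="maxclust")
print(np.bincount(labels)[1:])  # cluster sizes
```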

Not normalising the data potentially leads to the issues outlined below (Shmueli et al., 2016).
1) The distance computation (including distances to cluster centroids) is distorted, because variables measured on larger scales dominate the Euclidean distance.
2) The accuracy of the measure is compromised as the scale distorts the overall process.
3) The clusters formed would have lower validity.

Hence, the above reasons point to the need to normalise the data. Alternatively, weighting the raw variables so that they contribute equally may also resolve these issues to some extent.
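
A small worked example makes the scale problem concrete. The three passengers below are hypothetical, with a balance in miles and a count of flight transactions. On the raw data the balance dominates the Euclidean distance, while after z-score normalisation the large difference in transactions also counts.

```python
import math
from statistics import mean, pstdev

# Hypothetical passengers: (balance in miles, flight transactions in the past year).
passengers = {
    "A": (50000, 1),
    "B": (50500, 30),
    "C": (50800, 2),
}


def z_scores(rows):
    """Standardise each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sigmas)) for row in rows]


raw = passengers
norm = dict(zip(passengers, z_scores(list(passengers.values()))))

raw_ab, raw_ac = math.dist(raw["A"], raw["B"]), math.dist(raw["A"], raw["C"])
norm_ab, norm_ac = math.dist(norm["A"], norm["B"]), math.dist(norm["A"], norm["C"])

print("raw:        A-B", round(raw_ab, 1), " A-C", round(raw_ac, 1))
print("normalised: A-B", round(norm_ab, 2), " A-C", round(norm_ac, 2))
```

On the raw values A appears closer to B, even though B makes 29 more flight transactions. After normalisation A is closer to C, whose profile is similar on both variables.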

Cluster 1 (Partial Output)

Possible Label: “Middle Class Flyers”.

Rationale: The low to moderate spending of this segment is visible in several aspects, such as fewer transactions in a year (both flight and non-flight bonus), coupled with balance bonus miles that are low to moderate.

Cluster 2 (Partial Output)

Possible Label: “High Networth Flyers”.

Rationale: The high spending of this segment is visible in several aspects, such as more transactions in a year (both flight and non-flight bonus), coupled with balance bonus miles that are quite high.

Cluster 3 (Partial Output)

Possible Label: “Non-frequent Flyers”.

Rationale: The frequency of flight transactions is quite low, especially when set against the non-flight bonus transactions, which are at the highest level among all the clusters.
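
The centroids behind such labels are simply the per-cluster means of each variable. The sketch below shows how they could be computed once every record carries a cluster number. The records, variable names and values are invented for illustration and do not come from the XLMiner output.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (cluster, record) pairs; names and values are illustrative only.
records = [
    (1, {"bonus_miles": 8000, "bonus_trans": 6, "flight_trans": 1}),
    (1, {"bonus_miles": 12000, "bonus_trans": 8, "flight_trans": 1}),
    (2, {"bonus_miles": 60000, "bonus_trans": 20, "flight_trans": 12}),
    (2, {"bonus_miles": 80000, "bonus_trans": 24, "flight_trans": 16}),
    (3, {"bonus_miles": 30000, "bonus_trans": 28, "flight_trans": 0}),
    (3, {"bonus_miles": 34000, "bonus_trans": 32, "flight_trans": 2}),
]


def centroids(labelled_records):
    """Average every variable within each cluster."""
    groups = defaultdict(list)
    for cluster, row in labelled_records:
        groups[cluster].append(row)
    return {
        cluster: {var: mean(r[var] for r in rows) for var in rows[0]}
        for cluster, rows in groups.items()
    }


cluster_centroids = centroids(records)
for cluster, centre in sorted(cluster_centroids.items()):
    print(cluster, centre)
```

Reading the centroids side by side, variable by variable, is what supports a label such as a cluster with many bonus transactions but few flight transactions.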

Output (K-Means Clustering, XLMiner)

Comparison: The objective is to compare the pattern from this clustering technique with that from hierarchical clustering. The requisite step is to compare the cluster labels under both techniques and ascertain whether the results are the same (Ragsdale, 2014).

Implementation: The first K-means cluster is taken first for comparison.

Possible Label: “High Networth Flyers”

Rationale: Balance bonus miles (Highest), Flight transactions in past year (Highest), Qual_Miles (Highest)

When this cluster is compared with the label drawn for the first cluster in hierarchical clustering, the difference is clear, because that cluster comprised “Middle Class Flyers”. Examining the other clusters is not required, as the difference has already been brought out.

Conclusion: Owing to the above, it is apparent that the two clustering techniques under consideration do not lead to a uniform pattern for the given question, as established through the comparison of the clusters.
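
A cross-tabulation of the two sets of cluster assignments is one way to make such a comparison systematic, since it matches clusters by membership rather than by the number each technique happens to assign. The sketch below uses NumPy and SciPy on synthetic stand-in data, not the airline dataset. The `farthest_point_init` helper is a hypothetical, deterministic way of choosing starting centroids and is not what XLMiner does, so the agreement shown here says nothing about the actual results above.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import kmeans2

# Synthetic stand-in data: 3 groups of 20 observations, then z-score normalisation.
rng = np.random.default_rng(0)
centers = np.array([[20000.0, 2.0], [80000.0, 10.0], [150000.0, 25.0]])
X = np.vstack([c + rng.normal(scale=[3000.0, 1.0], size=(20, 2)) for c in centers])
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Hierarchical clustering (Ward), cut into 3 clusters; labels run 1..3.
hier = fcluster(linkage(Z, method="ward"), t=3, criterion="maxclust")


def farthest_point_init(data, k):
    """Deterministic start: the first row, then repeatedly the row that is
    farthest from all rows chosen so far."""
    chosen = [0]
    while len(chosen) < k:
        diffs = data[:, None, :] - data[chosen][None, :, :]
        nearest = np.min(np.linalg.norm(diffs, axis=2), axis=1)
        chosen.append(int(np.argmax(nearest)))
    return data[chosen]


# K-means with k = 3, started from the centroids above; labels run 0..2.
_, km = kmeans2(Z, farthest_point_init(Z, 3), minit="matrix")

# Cross-tabulate: rows are hierarchical clusters, columns are K-means clusters.
table = np.zeros((3, 3), dtype=int)
for h, k in zip(hier, km):
    table[h - 1, k] += 1
print(table)
```

If each row of the table has a single non-zero cell, the two techniques have produced the same grouping even when the cluster numbers differ. Counts spread across several cells indicate that the groupings genuinely disagree.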

References

Ana, A. (2014) Integration of Data Mining in Business Intelligence System (4th ed.). Sydney: IGA Global.

Liebowitz, J. (2015) Business Analytics: An Introduction (2nd ed.). New York: CRC Press.

Ragsdale, C. (2014) Spreadsheet Modeling and Decision Analysis: A Practical Introduction to Business Analytics (7th ed.). London: Cengage Learning.

Shmueli, G., Bruce, C.P., Yahav, I., Patel, R.N., Kenneth, C., & Lichtendahl, J. (2016) Data Mining For Business Analytics: Concepts Techniques and Application (2nd ed.). London: John Wiley & Sons.

Zaki, M.J. (2000) Generating non-redundant association rules. In: Proceedings of the ACM SIGKDD, pp. 34–43.
