NIT 6130 Introduction to Research
Assignment – 4
Experiment Design & Result Analysis
Big Data Predictive Analytics to Overcome Flight Delays
Masters in Applied Information Technology (NMIT) Victoria University, Melbourne, Victoria
Table of Contents
1. Collection of data for experiment
1a. Identification and selection of available data sources
1b. Collection of Raw Data
2. Experiment Design and Implementation
2a. Data pre-processing
2b. Feature Selection or Dimension Reduction
2c. Experiment Design
2d. Experiment Implementation Records
3. Experiment Result Analysis and Summary
4. Outline of Experiment and Result Analysis Chapter
1. Collection of data for experiment
1a. Identification and selection of available data sources
To conduct the experimental analysis, the available data sources were identified and assessed. The following table briefly describes each source.
Data Source Name | Source Organization | Data Description | Data File Format | URL | Charge/Fee | Target Data Source
Flight Delay Data 1 | Department of Transportation, Washington, United States | Commercial Airline Flight Delay Records in 2015 | .csv | https://www.kaggle.com/usdot/flight-delays/data | Free | Yes
Flight Delay Data 2 | Bureau of Transportation Statistics | Commercial Airline (US) Flight Delay Records in 2017 | .csv | https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time | Free | Yes
Flight Delay Data 3 | Open Flights Airport Database | Select the delay criteria or reasons | .csv, .txt | https://openflights.org/data.html | $50 | Yes
Flight Delay Data 4 | Data World Organization | Departure Delay Record | .csv | https://data.world/data-society/airlines-delay/workspace/file?filename=airlinedelaycauses%2FDelayedFlights.csv | Free | Yes
Flight Delay Data 5 | Bureau of Infrastructure, Transport and Regional Economics | International Airline Activity Link | .csv | https://data.gov.au/dataset/international-airline-activity | Free | Yes
1b. Collection of Raw Data
The data relevant to the experiment was downloaded from the web and saved in a folder called ‘Raw Dataset’. The files are in comma-separated values (*.csv) format and were opened in Microsoft Excel. The details of these records are summarised in the following table.
Data Source Name | Date of Collection | Saved File Location | Saved File Name | Saved File Format | No. of Data Records
Flight Data 1 | 19/10/2017 | C:\Users\LIZA\Desktop\Introduction to Research\Raw Dataset | AirlineDelayCauses.csv | csv (Excel) | 1048576
Flight Data 2 | 19/10/2017 | C:\Users\LIZA\Desktop\Introduction to Research\Raw Dataset | Delya_T_Ontime.csv | csv (Excel) | 450018
Flight Data 3 | 21/10/2017 | C:\Users\LIZA\Desktop\Introduction to Research\Raw Dataset | Flights.csv | csv (Excel) | 1048500
Flight Data 4 | 22/10/2017 | C:\Users\LIZA\Desktop\Introduction to Research\Raw Dataset | PredictingAirlineDelays.csv | csv (Excel) | 560002
Flight Data 5 | 22/10/2017 | C:\Users\LIZA\Desktop\Introduction to Research\Raw Dataset | InternationalAirlineActivity.csv | csv (Excel) | 402050
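The record counts above can be cross-checked outside Excel. The count of 1048576 for Flight Data 1 equals the maximum number of rows in an Excel worksheet, so that file may have been truncated when it was opened. The following is a minimal sketch, assuming each file has a single header row. The function name `count_records` and the sample column names are hypothetical and are not taken from the datasets.

```python
import csv
import io


def count_records(lines, has_header=True):
    """Count data records in CSV text, excluding the header row if present."""
    reader = csv.reader(lines)
    total = sum(1 for _ in reader)
    return max(total - 1, 0) if has_header else total


# Tiny in-memory stand-in for a raw file; the column names are hypothetical.
sample = io.StringIO(
    "FlightDate,Carrier,DepDelay\n"
    "2017-01-01,AA,12\n"
    "2017-01-01,DL,-3\n"
)
record_count = count_records(sample)
print(record_count)
```

For a file on disk, the same function can be called on an open file handle, for example `with open(path, newline="") as f: count_records(f)`, where `path` points to one of the saved files.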
2. Experiment Design and Implementation
2a. Data pre-processing
A large amount of raw data is available for the research experiment, but not all of it can be used. The collected data therefore needs to be pre-processed before the experiment is conducted.
2b. Feature Selection or Dimension Reduction
The collected data files contain many features, and not all of them are relevant to the experiment. Some fields have therefore been removed from the existing records, and the files have been updated accordingly. Reducing the dimensionality of the collected data simplifies data processing during the experimental analysis. The resulting data sets are recorded in the following table.
Date | Data Source Name | Purpose of Pre-processing | Pre-processing Method | No. of Original Data Records | No. of Result Data Records | No. of Original Features | No. of Result Features | New Data File Name
23/10/17 | Flight Data 1 | Feature selection | Manual data processing | 1048576 | 2000 | 46 | 20 | AirlineDelayCauses_Updated.csv
23/10/17 | Flight Data 2 | Clean the missing data | Pre-fill the missing values | 450018 | 4000 | 32 | 15 | Delya_T_Ontime_Updated.csv
23/10/17 | Flight Data 3 | Discard data that is more than 5 years old | Manual data processing | 1048500 | 2000 | 30 | 15 | Flights_Updated.csv
23/10/17 | Flight Data 4 | Report-making followed by better analysis | Automated data processing using Excel features | 560002 | 2000 | 35 | 15 | PredictingAirlineDelays_Updated.csv
23/10/17 | Flight Data 5 | Feature selection | Manual data processing | 402050 | 3000 | 28 | 15 | InternationalAirlineActivity_Updated.csv
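The pre-processing steps in the table (selecting features, pre-filling missing values and reducing the number of records) were carried out manually or in Excel. As an illustration only, the same steps could be scripted roughly as follows. The function `preprocess`, the column names and the fill value of "0" are assumptions made for this sketch and are not part of the recorded procedure.

```python
import csv
import io


def preprocess(rows, keep_fields, fill_value="0", max_records=None):
    """Keep selected columns, pre-fill blank values, and cap the record count."""
    result = []
    for row in rows:
        if max_records is not None and len(result) >= max_records:
            break
        cleaned = {}
        for field in keep_fields:
            # A missing or blank cell is replaced by the assumed fill value.
            value = (row.get(field) or "").strip()
            cleaned[field] = value if value else fill_value
        result.append(cleaned)
    return result


# Hypothetical raw records with four features, one of them partly missing.
raw = io.StringIO(
    "Year,Carrier,DepDelay,TailNum\n"
    "2017,AA,12,N123\n"
    "2017,DL,,N456\n"
    "2017,UA,5,N789\n"
)
reduced = preprocess(
    csv.DictReader(raw),
    keep_fields=["Carrier", "DepDelay"],
    max_records=2,
)
print(reduced)
```

Here four original features are reduced to two and three records to two, which mirrors on a small scale the reduction shown in the table. Taking the first records is the simplest choice. A random sample may represent the full data set better.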
2c. Experiment Design
Date | Experiment | Purpose of Experiment | Description of Procedure | Input Data | Expected Output | Result File Format
24/10/2017 | Experiment 1 | Evaluate Method 1 | The Ground Delay Program (GDP) procedure | Historical data and weather information, processed using MapReduce | A join key and table tag | Output1.csv
24/10/2017 | Experiment 2 | Evaluate Method 2 | Regression prediction mechanism | Database input to the Naive Bayes algorithm | Result for the prediction of departure delays | Output2.csv
24/10/2017 | Experiment 3 | Evaluate Method 3 | Flight delay propagation and delay probability distribution | The itineraries of passengers who have missed a flight; Reschedule_Pax algorithm | New passenger itineraries | Output3.txt
24/10/2017 | Experiment 4 | Evaluate Method 4 | Heuristic algorithm: schedule minimization for passenger trip delay | The flight schedule itineraries; regression-based algorithm | The updated flight schedule | Output4.txt
2d. Experiment Implementation Records
A simple delay model can be built from the empirical cumulative distribution of the observed delays. Kernel density estimation is available as a built-in function of the programming language that will be used. A MapReduce algorithm will split the input data set into independent chunks, which are processed by the map tasks in a completely parallel manner. A linear regression model of the average daily delay will analyze the effects of arrival delay, airport capacity, traffic congestion and weather conditions.
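Two of the techniques named above, the empirical cumulative distribution and the MapReduce-style split into chunks, can be sketched as follows. This is a minimal illustration under stated assumptions. The records are hypothetical (carrier, delay in minutes) pairs, all function names are invented for the sketch, and the map tasks run one after another here, whereas a real MapReduce framework would run them in parallel.

```python
from bisect import bisect_right
from collections import defaultdict
from functools import reduce


def make_ecdf(delays):
    """Return F(x) = fraction of observed delays that are <= x."""
    ordered = sorted(delays)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one delay observation is required")

    def ecdf(x):
        return bisect_right(ordered, x) / n

    return ecdf


def split_chunks(records, size):
    """Split the input records into independent chunks of at most `size`."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map task: per-carrier [delay sum, record count] for one chunk."""
    partial = defaultdict(lambda: [0.0, 0])
    for carrier, delay in chunk:
        partial[carrier][0] += delay
        partial[carrier][1] += 1
    return partial


def reduce_partials(left, right):
    """Reduce step: merge two partial per-carrier results."""
    merged = defaultdict(lambda: [0.0, 0])
    for part in (left, right):
        for carrier, (total, count) in part.items():
            merged[carrier][0] += total
            merged[carrier][1] += count
    return merged


# Hypothetical records: (carrier, delay in minutes).
records = [("AA", 10.0), ("DL", 0.0), ("AA", 30.0), ("UA", 20.0), ("DL", 40.0)]

partials = [map_chunk(chunk) for chunk in split_chunks(records, 2)]
totals = reduce(reduce_partials, partials)
mean_delay = {carrier: total / count for carrier, (total, count) in totals.items()}

ecdf = make_ecdf([delay for _, delay in records])
print(mean_delay)
print(ecdf(20.0))
```

Because each chunk is summarised as a sum and a count rather than an average, the partial results can be merged in any order without changing the final mean delay per carrier.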
3. Experiment Result Analysis and Summary
The experiments described above are expected to produce the following results:
- Flight delays are one of the major causes of Total Passenger Trip Delay (TPTD). Other passenger trip delays are due to either missed connections or flight cancellations.
- Airline network design also has a significant impact on the trip delay of passengers. The gap between direct and connecting itineraries, the frequency of service, the time lost between banks at the hubs, aircraft size selection and the target load factor also play a major role in determining trip reliability for passengers.
- Flight delays caused by bad weather should be forecast well before the scheduled departure so that alternative arrangements can be made. Passengers can also be advised in advance of the expected delay so that they can plan their journey accordingly, which reduces confusion among passengers.
- The delays that passengers experience because of late or diverted flights can be minimised. Passengers are affected by trip delay because of cancelled flights, missed connections or boarding issues. To avoid such situations, an additional flight can be scheduled so that the delay does not carry over to later flights, but this increases costs for the airline. If the frequency of the flight decreases, the load factor of the re-booked flights increases. The experimental result can minimise passenger trip delay either by rebooking passengers who arrive late from connecting flights onto the next flight, or by holding the next flight until those passengers arrive. The second option may in turn delay all the other passengers and their connecting flights.
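The trade-off in the last point, rebooking late passengers versus holding the departing flight, can be made concrete with a small worked comparison. This is only an illustrative sketch. The function names and all passenger counts and times are hypothetical, and knock-on delays to later connections are ignored.

```python
def rebook_delay(missed_pax, wait_for_next_flight_min):
    """Total extra passenger delay (minutes) if late passengers take the next flight."""
    return missed_pax * wait_for_next_flight_min


def hold_delay(missed_pax, onboard_pax, hold_min):
    """Total extra passenger delay (minutes) if the flight is held for late passengers."""
    # Everyone on the held flight, including the late arrivals, departs hold_min late.
    return (missed_pax + onboard_pax) * hold_min


# Hypothetical scenario: 5 late passengers, 150 already on board,
# next flight in 180 minutes, or a 20-minute hold.
rebook_total = rebook_delay(missed_pax=5, wait_for_next_flight_min=180)
hold_total = hold_delay(missed_pax=5, onboard_pax=150, hold_min=20)
print(rebook_total, hold_total)
```

With these assumed numbers, rebooking costs 900 passenger-minutes and holding costs 3100, so rebooking gives the lower total delay. The balance shifts when many passengers are late or the next flight is far away.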
4. Outline of Experiment and Result Analysis Chapter
4.1 Data Analysis
4.1.1 Data Pre-processing and Transformation
4.1.2 Target Data Creation
4.1.3 Model descriptions and variables
4.1.3.1 The training dataset
4.1.3.2 Decision Trees
4.1.3.3 Random Forest Model
4.2 Delay Prediction
4.2.1 Classification Technique
4.2.2 Hadoop MapReduce
4.3 Analysis of Variance (ANOVA) on the average daily arrival delay
4.3.1 ANOVA test on seasonal pattern
4.4 BRYAGH: Basic Reduction Yare Approach for Flights
4.4.1 BRYAGH Algorithm
4.4.2 Pseudo code
4.5 Conclusion