MB5021 Bioinformatics
Please answer the following questions for your laboratory write up. You should use the information that you’ve learnt in both the lectures and the laboratory session, as well as information you’ve gained from further reading.
You should write approximately 100 words for each written answer but exact word counts are not needed. References are not needed but can be added to demonstrate further reading.
- Discuss what steps are employed to improve the quality of the filtered reads compared to the raw reads and how this is assessed using FastQC reports.
- Including an explanation of short read sequencing chemistries, discuss why errors and poor quality bases may arise, and why it is important to quality filter sequencing reads.
- Produce the suffix array for the following sequences. Include your workings in your answers:
- a) ATCGTCGGAT
- b) GGCACGTACA
- Produce the BWT for the following sequences. Include your workings in your answers:
- a) ATCGTCGGAT
- b) GGCACGTACA
- List two downstream applications of sequence alignments and discuss the challenges faced during alignment.
- Using the results from Variant Effect Predictor, list the different consequences a sequence variant may have on both the DNA sequence and the protein and discuss the impact variants can have on the phenotype of an individual.
Answer:
- Steps used to improve the quality of the filtered reads compared to the raw reads, and how this is assessed using FastQC reports:
- Quality control and filtering of sequence reads are important steps during processing. However, it is often not easy to identify which reads need to be adjusted (for example, trimmed) and which should be left out. FastQC bases its most important quality metrics on the Phred quality score.
- To assess quality, the FASTQ files are given to FastQC as input and an HTML report is returned, which presents information about the quality of the reads graphically. The FastQC tool is run on the Inside DNA cloud cluster, using a special wrapper called sub. The wrapper allocates computational resources and helps distribute the workload and computations between the different nodes in the cloud.
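The Phred score behind these metrics can be illustrated with a short sketch. It assumes the common Phred+33 ASCII encoding of FASTQ quality strings; the function names are illustrative and are not part of FastQC.

```python
def phred_to_error_prob(q):
    """Convert a Phred score Q to the probability that the base call is wrong.

    By definition Q = -10 * log10(P), so P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


def decode_quality(qual, offset=33):
    """Decode a FASTQ quality string into a list of Phred scores.

    Assumes Phred+33 encoding (ASCII code minus 33), which is an assumption
    about the input file, not something checked here.
    """
    return [ord(char) - offset for char in qual]


def mean_quality(qual, offset=33):
    """Mean Phred score across one read."""
    scores = decode_quality(qual, offset)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    quality_line = "IIII55##"  # hypothetical quality string for an 8-base read
    print(decode_quality(quality_line))
    print(phred_to_error_prob(20))  # Q20 corresponds to a 1 in 100 error rate
```

A Phred score of 20 therefore means a 1% chance that the base is wrong, and a score of 30 means 0.1%, which is why per-base quality plots are usually read against these thresholds.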
- Short-read sequencing, and the errors that arise from poor-quality bases:
- Read length in short-read sequencing often depends on the instrument used. Short reads fall between 100 and 600 base pairs, while long reads vary between 10 and 15 kb.
- Short-read sequencing often yields assemblies with scaffold gaps, because the genome contains many repeated sequences and insertions can be missed.
- Raw digital DNA sequences consist of called nucleotides, referred to as bases. Quality control depends on the probability that a called base deviates from the identity of the corresponding nucleotide in the reference genome. The quality of a base call depends on the quality of the signal from which the nucleotide was read (Kalari et al., 2014, p. 224).
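As an illustration of quality filtering, the sketch below trims low-quality bases from the 3' end of a read and then discards reads that become too short. It is a simplified example rather than the algorithm of any particular trimming tool; the function names, the Phred+33 encoding and the thresholds (Q20, minimum length 5) are assumptions chosen for the example.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove trailing bases whose Phred score is below min_q.

    seq and qual are the sequence and quality lines of one FASTQ record.
    Phred+33 encoding is assumed.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def passes_filter(seq, qual, min_q=20, min_len=5, offset=33):
    """Keep a read only if enough bases remain after 3' trimming."""
    trimmed_seq, _ = trim_3prime(seq, qual, min_q, offset)
    return len(trimmed_seq) >= min_len


if __name__ == "__main__":
    # Hypothetical read: four good bases (Q40) followed by two poor ones (Q2).
    print(trim_3prime("ACGTAC", "IIII##"))
    print(passes_filter("ACGTAC", "IIII##"))
```

Trimming from the 3' end reflects the tendency for base quality to fall towards the end of a short read.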
- Suffix Array for the sequences below:
- ATCGTCGGAT
| Position | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Character | A | T | C | G | T | C | G | G | A | T | $ |
| Rank | Suffix start position | Suffix |
|---|---|---|
| 0 | 10 | $ |
| 1 | 8 | A T $ |
| 2 | 0 | A T C G T C G G A T $ |
| 3 | 5 | C G G A T $ |
| 4 | 2 | C G T C G G A T $ |
| 5 | 7 | G A T $ |
| 6 | 6 | G G A T $ |
| 7 | 3 | G T C G G A T $ |
| 8 | 9 | T $ |
| 9 | 4 | T C G G A T $ |
| 10 | 1 | T C G T C G G A T $ |
- GGCACGTACA
| Position | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Character | G | G | C | A | C | G | T | A | C | A | $ |
| Rank | Suffix start position | Suffix |
|---|---|---|
| 0 | 10 | $ |
| 1 | 9 | A $ |
| 2 | 7 | A C A $ |
| 3 | 3 | A C G T A C A $ |
| 4 | 8 | C A $ |
| 5 | 2 | C A C G T A C A $ |
| 6 | 4 | C G T A C A $ |
| 7 | 1 | G C A C G T A C A $ |
| 8 | 0 | G G C A C G T A C A $ |
| 9 | 5 | G T A C A $ |
| 10 | 6 | T A C A $ |
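The suffix array workings can be checked with a short sketch. It uses the naive approach of sorting every suffix directly, which is fine for sequences this short but is not how production aligners build the array. The function name is illustrative, and `$` is assumed to be the terminator that sorts before A, C, G and T (which holds for its ASCII code).

```python
def suffix_array(text, terminator="$"):
    """Return the start positions of all suffixes of text + terminator,
    listed in lexicographic order of the suffixes."""
    s = text + terminator
    return sorted(range(len(s)), key=lambda i: s[i:])


if __name__ == "__main__":
    for seq in ("ATCGTCGGAT", "GGCACGTACA"):
        s = seq + "$"
        print(seq)
        for rank, start in enumerate(suffix_array(seq)):
            print(rank, start, s[start:])
```

For ATCGTCGGAT this gives 10, 8, 0, 5, 2, 7, 6, 3, 9, 4, 1, and for GGCACGTACA it gives 10, 9, 7, 3, 8, 2, 4, 1, 0, 5, 6.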
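- BWT for the following sequences. The BWT is read off by taking, for each suffix in sorted (suffix array) order, the character immediately before that suffix in the text; the suffix starting at position 0 is preceded by the terminator $.
- a) ATCGTCGGAT: TG$TTGCCAGA
- b) GGCACGTACA: ACTCAGAG$CG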
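A minimal sketch of the Burrows-Wheeler transform for these two sequences, assuming `$` as the terminator and deriving the transform from a naively sorted suffix array. The function name is illustrative.

```python
def bwt(text, terminator="$"):
    """Burrows-Wheeler transform of text + terminator.

    For each suffix in sorted order, emit the character that precedes it.
    s[i - 1] with i == 0 wraps around to the terminator, as required.
    """
    s = text + terminator
    order = sorted(range(len(s)), key=lambda i: s[i:])
    return "".join(s[i - 1] for i in order)


if __name__ == "__main__":
    print(bwt("ATCGTCGGAT"))  # TG$TTGCCAGA
    print(bwt("GGCACGTACA"))  # ACTCAGAG$CG
```

The output always contains the same characters as the input plus the terminator, only reordered, which gives a quick check on hand-written workings.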
- Downstream applications of sequence alignments and the challenges of alignment
- Alignment-free sequence comparison methods have been influential in whole-genome phylogeny, the classification of proteins, the identification of horizontally transferred genes and the detection of recombinant sequences, which makes them useful for processing and analysing next-generation sequencing data.
Challenges
- One challenge is that alignment programs often assume the sequences are homologous and share conserved stretches. In addition, alignment accuracy often drops significantly once sequence similarity falls below a critical point.
- Consequences of sequence variants reported by Variant Effect Predictor, and their impact on the phenotype of an individual.
- Sequence variants that occur in coding regions can change the amino acid sequence and so affect protein function. Variants can also be found in somatic cells, and they include single nucleotide polymorphisms.
These consequences include:
- Intron variant
- Upstream gene variant
- Intergenic variant
- Missense variant
- NMD transcript variant
- Non-coding transcript variant
- Regulatory region variant
Impact of variants
- Variants can substitute one amino acid for another, or introduce a premature stop codon that leads to an incomplete protein. Synonymous single nucleotide polymorphisms leave the amino acid unchanged, but single nucleotide polymorphisms more generally can affect the phenotype, including the development of disease and the response to drugs.
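The difference between synonymous, missense and premature-stop changes can be shown with a small sketch that compares a reference codon with its variant form. It assumes the standard genetic code and only classifies single-codon substitutions; it is a simplification, not how Variant Effect Predictor itself is implemented. The function name is illustrative, while the returned labels follow the Sequence Ontology terms that Variant Effect Predictor reports.

```python
BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def coding_consequence(ref_codon, alt_codon):
    """Classify a codon substitution by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        # Same residue (or still a stop codon): the protein is unchanged.
        return "stop_retained_variant" if ref_aa == "*" else "synonymous_variant"
    if alt_aa == "*":
        return "stop_gained"  # premature stop codon, truncated protein
    if ref_aa == "*":
        return "stop_lost"  # translation reads through the original stop
    return "missense_variant"  # one amino acid replaced by another


if __name__ == "__main__":
    print(coding_consequence("GAG", "GTG"))  # Glu -> Val
    print(coding_consequence("TGG", "TGA"))  # Trp -> stop
    print(coding_consequence("CTT", "CTC"))  # Leu -> Leu
```

The first example, GAG to GTG, is a glutamate to valine substitution and is reported as a missense variant; the second produces a premature stop codon, and the third leaves the amino acid unchanged.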
References
Chaisson, M.J. and Tesler, G., 2012. Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. BMC bioinformatics, 13(1), p.238.
Kalari, K.R., Nair, A.A., Bhavsar, J.D., O’Brien, D.R., Davila, J.I., Bockol, M.A., Nie, J., Tang, X., Baheti, S., Doughty, J.B. and Middha, S., 2014. MAP-RSeq: mayo analysis pipeline for RNA sequencing. BMC bioinformatics, 15(1), p.224.
Leggett, R.M., Ramirez-Gonzalez, R.H., Clavijo, B., Waite, D. and Davey, R.P., 2013. Sequencing quality assessment tools to enable data-driven informatics for high throughput genomics. Frontiers in genetics, 4, p.288.
Patel, R.K. and Jain, M., 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PloS one, 7(2), p.e30619.