Showing posts with label Air Asia. Show all posts

Monday, December 21, 2015

UNCERTAINTIES OF THE QZ8501 INVESTIGATION

Hello World,

After a wait of almost a year following the QZ8501 accident, the final investigation report was released recently. The report cites the cumulative effect of mechanical/system failures and pilot action as the cause of the accident, but attributes the major share to pilot action. I looked up a few documents, and the story seems to be more than what is being handed to us in the form of an official conclusion. In this post, I wish to look into the uncovered/ignored aspects of the investigation. Based on my understanding of the facts [as released] and further research, there were system inadequacies that forced the flight crew to attempt to override system-driven flight protocols. Their efforts could not be completed in time, and the aircraft went down with the crew and passengers.

Findings of the Investigation:

Exhibit A:
Oddity 1:
Cracking of a solder-joint [of both channels] leading to a loss of electrical continuity indicates the electrical side of failure I had already warned of in my previous post on the accident. If there is no electrical supply to the system, the system will remain inactive unless it is powered by a back-up power line. 
The ambiguity that stands out to me is:

Was it ‘one’ solder joint that connected both channels [A&B]? 
Or
Was it ‘one’ solder joint for channel A and ‘one’ solder joint for channel B? [Both channels connected to the RTLU separately]

If it were two solder joints, one for each channel, then it should be two separate failures in which case the relationship between the two needs to be ascertained. 

If it was just one solder joint that connected channel A with channel B, then it is clear that the system did not have the needed electrical redundancy, indicating a serious design flaw considering the key flight-critical status of the equipment [the flight came to an end because this failed].

Irrespective of the real nature of the finding, the questions that remain are:
Why is ambiguity being introduced at the very beginning of an accident report? 
Also, failures such as these are usually the result of a sequence of events. 
If the investigation could go to the level of a solder-joint failure, what led to the failure of the solder joint? 
What type of load on the joint increased to the point that it had to fail? 
Why is that side of the failure not being discussed in the report?

Oddity 2:
An ‘unresolved repetitive fault’ occurred 4 times during the flight, and the responses registered indicate that the 4th response was not in accordance with the message. 

The point that stands out is:

On the first three occasions, the repetitive fault did not clear despite the ‘message-compliant’ responses from the flight crew. 

Why is this not being discussed in the report?

If the procedural response fails to provide the relief for a crisis situation, the failure needs to be attributed to the ‘Non-fail-Safe’ nature of the system [a design flaw]. If the flight crew did not get the result of ‘message-compliant’ responses, then it is natural for them to resort to out-of-procedure efforts to resolve the crisis as the flight of the aircraft was in deterioration when such off-procedure input was given by the flight crew.

Why hasn’t the report indicated the ‘state of vulnerability’ of the platform?

My Findings 

Exhibit B1:


I came across this patent, which the inventor has assigned to Airbus Operations SAS [Assignee on the patent]. Airbus is the manufacturer of the aircraft that went down as QZ8501. The patent deals with the process for limiting the steering angle of control surfaces. 

The movable parts of an aircraft [the airframe to be specific], visible from the outside, apart from the doors and landing gear are the control surfaces [These are found on the wings, tail-plane and tail-fin]. These are used to control the aircraft’s flight at all times. 

This patent covers the process to control steering angle for control surfaces, specifically the rudder [the one on the tail fin that stands upright on the tail-end of an airplane].

Here’s Exhibit B2:


This description shown above clearly indicates the significance of the technology covered in this patent. Engine failure is being used as an example of abnormal flight condition and the observation describes the way an aircraft will behave when an engine fails. 

As per this observation, the rudder, the control surface on the tail-fin of an airplane, will be required to bring the aircraft back to its flight line when an engine fails and the aircraft gets destabilized. Through this observation, the patent implies that the rudder will need a higher steering clearance so it can produce the force necessary to bring the aircraft back to its flight line [control the destabilization faced by the aircraft].

Here’s Exhibit B3:


As shown in the figure above, the patent moves on to describe the traditional system’s inability to restrict the pilot from sending several commands. This indicates the intent of this technology/process as something related to restricting pilot activity in operating control surfaces under certain ‘abnormal conditions.’

Here’s Exhibit B4:


The patent then describes the outcome under such abnormal conditions: when the pilot is allowed to send multiple commands to the rudder, dangerous failure modes can follow. The patent specifically mentions that the tail-fin may break under these conditions. 

Now look at this picture below.

Exhibit C:

http://redwiretimescom.r.worldssl.net/wp-content/uploads/2015/01/redwire-singapore-air-asia-qz8501-black-box-1.jpg

The tail part was recovered separate from the rest of the airplane. Now there could be multiple theories for how the tail might have separated from the aircraft. However, the wreckage captured in the image directly reflects what the patent describes as a worst case scenario.

Now read this.

Exhibit D:


The Airworthiness Directive issued by the FAA indicates the regulator’s acceptance of a finding that, under certain conditions, the allowable load limits on the vertical tail plane can be reached and possibly exceeded. The directive, as specified by the regulator, is valid for all Airbus model A318, A319, A320 and A321 series airplanes. It also states that it is valid from Dec 29, 2015, indicating that the finding and the directive are recent. 

Such findings and directives going out so recently indicate that the A318, A319, A320 and A321 series airplanes have so far been flying in a state of vulnerability, and that they have escaped such accidents simply because of the low probability of such failures. 

So the aircraft can fail under certain conditions. This is something that is always thrown out of the ‘Consideration Box’ used for any air accident investigation. The ideal case scenario of all-aircraft-are-safe is being thrust into our minds through carefully planned press releases and cover-up activities….All after hundreds of human beings went down into the ocean along with the aircraft.

Sadly the story doesn’t end here.

Exhibit E: 


EASA had issued a Proposal for an Airworthiness Directive dated 23rd July, 2014, indicating the need for a correction, failing which the aircraft would stand vulnerable to losing its tail fin during flying conditions. The image above indicates that this directive was deemed applicable to a wide range of Airbus aircraft, including the Airbus A320-216, the one that went down.

The proposal also says Airbus has developed modifications within the Flight Augmentation Computer [FAC] to activate a conditional aural warning within the Flight Warning Computer [FWC] to prevent pilot-induced rudder doublets. 

So the European regulator was aware of the ‘condition of vulnerability’ that Airbus aircraft were under and proposed an Airworthiness Directive [AD]. Irrespective of whether the AD was implemented or not, the fact remains that Airbus aircraft had vulnerabilities, including that of losing the tail-fin during specific flight conditions. As I have always pointed out, the probability of occurrence of any event should have no bearing on the risk perception of the same. The potential impact in this case is loss of the aircraft, and therefore it should have higher priority. For some reason, frequency and probability of occurrence are being used as key criteria for prioritising any risk-mitigation effort.
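To make the prioritisation point concrete, here is a minimal Python sketch of my own. It ranks the same two risks in two ways: by probability multiplied by impact, and by impact first. The risk names, the probabilities and the 1-to-5 severity scale are hypothetical placeholders and do not come from any regulator or report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    probability: float  # hypothetical chance of occurrence per flight
    impact: int         # hypothetical severity, 1 (minor) to 5 (loss of aircraft)


def rank_by_expected_loss(risks):
    """Order risks by probability x impact, highest first."""
    return sorted(risks, key=lambda r: r.probability * r.impact, reverse=True)


def rank_by_impact(risks):
    """Order risks by impact first; probability only breaks ties."""
    return sorted(risks, key=lambda r: (r.impact, r.probability), reverse=True)


# Placeholder entries, for illustration only.
risks = [
    Risk("cabin entertainment fault", 1e-2, 1),
    Risk("tail-fin overload", 1e-8, 5),
]

for label, ranking in (
    ("probability x impact", rank_by_expected_loss(risks)),
    ("impact first", rank_by_impact(risks)),
):
    print(label, "->", [r.name for r in ranking])
```

With these placeholder numbers, the probability-weighted ranking puts the frequent but trivial fault on top, while the impact-first ranking puts the loss-of-aircraft case on top. That is the ordering I am arguing for.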

Further research back into the past reveals this:

Exhibit F: 


‘Safety First,’ The Airbus Safety Magazine dated January, 2005 featured an article on the need for enhanced pre-flight checks involving risk conditions, one of them being the failure of the Rudder Travel Limiter Unit [RTLU]. The article, as you can see in the image above, classifies it as an ‘event of undue rudder travel limitation.’

2005 was a long time ago, and even then there were vulnerabilities with respect to the RTLU in Airbus aircraft. This indicates that Airbus aircraft, like any other aircraft, have always stood vulnerable to abnormal flight conditions, including those that concerned the RTLU, the system which failed during the QZ8501 accident.

Conclusion

Based on the oddities and interpretations I derive from the observation of the exhibits presented above, this is what I think happened with QZ8501.

The aircraft, like any other, had remained vulnerable to specific abnormal flight conditions, and the supplier’s effort to mitigate this risk [concerned with the RTLU] resulted in restricting the pilot’s capacity to take control of the aircraft. 

What was deemed as too-much-freedom for error resulted in a change that took too-much-of-necessary-capacity from the flight crew during those specific abnormal flight conditions. 

So when the aircraft went into what was deemed a ‘very-low-probability’ scenario, it deviated away from its dedicated flight-line and it had to be recovered. The flight crew had responded as per procedure three times to recover the aircraft but realised that the risk-mitigation change was not allowing them to do the same. The 4th time, the flight crew had no other choice but to try to disconnect the controls from the flight computer that was implementing the ‘pilot-restricting’ control criteria. Unfortunately, they couldn’t achieve the recovery in time and the aircraft went down with the crew and passengers.

While the nature of the abnormal flight conditions is still kept out of our minds through ‘official’ statements comprehensively covering obscurity and generality, we can recall what the patent describes as a possible abnormal flight condition: engine failure. This is why I wanted to know if the engine part of the wreckage was recovered and if yes, the details of the engine wreckage inspection. 

Summing up, many events must have occurred in a certain unfortunate sequence that led to system failure and the eventual loss of the aircraft QZ8501. We may never come face-to-face with the truth since the truth will stand in the way of a multi-billion dollar market that hangs on the ‘perception of reliability’ the aircraft brands thrust into the operators’ minds. However, we can be sure that solder joint failures leading to electrical discontinuity don’t occur out of the blue just like that. Also pilots are not fools to try to disconnect the flight computer unless the situation demands such an effort. 

When a report says, someone lost their life because of a knife entering their back and that the victim had by some means consciously maintained proximity to a sharp knife during the event, it is absolutely obvious that someone might have stabbed the victim. Just because the report doesn’t use the word stab doesn’t mean the victim absolutely walked into a knife protruding out of something uncertain [in this case the hands of the assailant]. Just my thought.


Regards,


Sunday, December 28, 2014

Commercial Aircraft Air Asia QZ8501 Goes Missing: Evolving Flight Risks Call for Upgraded Safety Measures

Hello World,


Another commercial aircraft vanishes. This time it is Air Asia QZ8501, an Airbus A320. The A320 is one of the most widely used commercial aircraft across the globe. There are airlines whose entire fleet is composed of A320s.





The news releases so far indicate that the pilot requested a path change, possibly a change in altitude and direction. This suggests that the pilot foresaw weather challenges, possibly moderate or severe turbulence related to thunderstorms crossing the designated flight path. The usual 'altitude-clearance' for escaping a potential thunderstorm cloud is 2000 feet, but this can vary depending on how bad the weather is (as judged by the pilot from the radar & other instruments). The fact that the pilot requested an altitude change from (about) 32000 to 38000 feet indicates he wanted to clear the bad weather envelope by a safe margin.

I think this aircraft faced anomalies including or related to an outage of the electrical system, resulting in the malfunctioning of the guidance, navigation and control system. It could be a faulty system failing during flight or a system failing due to natural events such as lightning strikes (or any other thunderstorm-related activity). Either way, if the pilots had access to information on where the aircraft was, where it was headed and how it was heading there, they could have achieved an emergency landing. As usual, there is always the possibility of instantaneous disintegration of the aircraft. Again, nothing can be concluded until the investigation is completed.

From the atmospheric sciences view, we can get hit by lightning even when we are significantly far from the thunderstorm cloud. As long as there is sufficient temperature difference to cause charge separation and we (the aircraft) happen to present ourselves as a means of discharge, we have every chance of inducing the discharge. The platform is built to take such weather anomalies. However, we can never say for sure what the 'limit' of such natural phenomena is. One small overshoot can set off a sequence of smaller anomalies that can spiral into something major and eventually result in a catastrophic event. It all depends on that first anomaly and how long it takes to spiral into a catastrophic event.


Global Aircraft Disappearances & Lightning Activity

Two incidents of 'aircraft disappearance' within 1 year in the same airspace indicate more than random coincidence. I don't see any reason to question technology anymore. Maybe we need to open our minds to the idea that weather-based flight operation risks are evolving and existing technologies are not versatile/robust enough to meet the new breed of 'worst-case-scenarios.'

The cause of the QZ8501 accident will only be known after the findings of the accident investigation are released.

However, there is a distant correlation which, in the larger scheme of things, has caught my attention, and I think we need to look into it and go full-force towards preventive measures.

Based on my recent research on global lightning events, I believe South-East Asia (Asia Pacific in general) is among the high-risk regions for intense lightning activity. Please find below the time lapse video of 30 days' lightning events (from June, 2014) plotted on a world map (30 days' data, with about 60000 lightning events in each plot):






Here's one plot from the video as reference for our understanding of the region's vulnerability to lightning events:




The plot above has about 60000 events plotted on the world map. If you look at the Asia Pacific region, it is clear that South-East Asia is a region of active lightning events. This plot includes only those lightning events that are 'significantly large in terms of magnitude' as detected by the World Wide Lightning Location Network (WWLLN), managed and operated by the University of Washington. The basic criterion for a lightning event to be included in the data is that the event is detected by at least 5 stations from the global spread of over 50 such stations, which are located thousands of kilometres from each other. 
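As a rough illustration of that inclusion rule, here is a minimal Python sketch. It assumes each detected event carries a count of the stations that picked it up. The `Stroke` record, its field names and the sample values are my own placeholders and are not the actual WWLLN data format.

```python
from dataclasses import dataclass

MIN_STATIONS = 5  # threshold described in the post


@dataclass(frozen=True)
class Stroke:
    lat: float     # latitude in degrees (placeholder value)
    lon: float     # longitude in degrees (placeholder value)
    stations: int  # number of stations that detected the event


def filter_strokes(strokes, min_stations=MIN_STATIONS):
    """Keep only events detected by at least min_stations stations."""
    return [s for s in strokes if s.stations >= min_stations]


# Made-up sample events, for illustration only.
sample = [
    Stroke(1.3, 103.8, 7),
    Stroke(-3.6, 109.7, 4),
    Stroke(5.0, 100.0, 5),
]

kept = filter_strokes(sample)
print(len(kept), "of", len(sample), "events pass the station threshold")
```

An event seen by exactly 5 stations passes, since the rule is "at least 5."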

For a detailed look into global lightning phenomenon, feel free to look up:






Here's something I just came across on the internet:


Source: http://www.fastcodesign.com/3027794/infographic-of-the-day/infographic-84-planes-thatve-vanished-off-the-face-of-the-earth

Key Lies in Comparison


If we look closely at the infographic given above and compare it to the global lightning plot above (right after the YouTube video), the two almost resemble each other in terms of the geographic concentration of occurrences. The locations that have the greatest number of red dots (lightning activity) are the regions where most aircraft disappearances have happened.





It is true that the image has just one day's plot for comparison. But the video has a month's data in time lapse, which shows a similar trend. That analysis has data from the World Wide Lightning Location Network, collected by over 50 stations spread across the globe. The trend that we see in one image or in the month's data is pretty much the same. Again, my point is not to draw any 'causation' from the correlation. I am, to my surprise, of the view that most of the 'aircraft disappearances' have happened in 'high-lightning' zones. To me that means that the aircraft that went missing 'might' have encountered unfavorable weather conditions as the starting point of the 'crisis situation' that eventually led to their disappearance.

Well, this resemblance cannot be used for any conclusions at this point of time, but it definitely points us towards a new angle of investigation. Maybe the flight conditions are evolving, and the current systems are struggling to cope with them. The technology is not bad but is getting outdated as we speak. The flight conditions are getting worse and our protective measures should get equally robust. They have been robust for decades, but is this decade the same as the one before? Will the next decade be the same as this decade? The answer is a plain no. Then why would we expect the technology to be time-independent in terms of effectiveness?
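The visual resemblance could also be put into numbers. Here is a minimal Python sketch of one way to do it, not something I have run on the real data: bin both sets of locations into latitude/longitude grid cells, then compute the Pearson correlation between the per-cell counts. The function names and the 10-degree cell size are my own choices, and cells where neither set has a point are left out of the comparison.

```python
from collections import Counter
from math import floor, sqrt


def grid_counts(points, cell_deg=10.0):
    """Count (lat, lon) points per cell of a cell_deg x cell_deg grid."""
    return Counter(
        (floor(lat / cell_deg), floor(lon / cell_deg)) for lat, lon in points
    )


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant counts")
    return cov / sqrt(var_x * var_y)


def spatial_correlation(points_a, points_b, cell_deg=10.0):
    """Correlate per-cell counts of two point sets.

    Only cells holding at least one point from either set are compared.
    """
    counts_a = grid_counts(points_a, cell_deg)
    counts_b = grid_counts(points_b, cell_deg)
    cells = sorted(set(counts_a) | set(counts_b))
    # Counter returns 0 for cells that one of the sets never touched.
    return pearson([counts_a[c] for c in cells], [counts_b[c] for c in cells])
```

Feeding it the lightning locations as `points_a` and the disappearance locations as `points_b` would give a single number between -1 and 1. A high value would still say nothing about causation. It would only confirm that the two maps are concentrated in the same cells.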

What Could Be Done

Going All-Out on Preventive Measures.

I think it is time the ICAO and regional regulators (from the Asia Pacific region) got together and drafted a set of mandatory requirements that include:

1. SATCOM Implementation (Cockpit & Cabin)
2. Upgrade of Lightning Protection Solutions

It doesn't stop with drafting new rules. In fact, it just begins with that. The implementation is the key here. Regulators in this region already have a tough time monitoring the operators. Every little compromise on procedures related to air-safety, at both the regulator and the operator level, renders the aircraft unsafe. Just because the probability of occurrence is very low doesn't mean nothing bad will happen. Again, the potential impact is of importance and not the probability of occurrence.

By 'satcom implementation' I am not referring to the airlines using satcom for their own communications. Satcom is considered expensive (subscription charges) and the airlines most often use it only when their aircraft is out of range of their line-of-sight communications. What I wish to suggest is a collaborative satcom implementation where, along with the airlines, the regulators/ATC have visibility of the aircraft and, more importantly, there is real-time streaming of systems data so any health deterioration can be detected and possibly predicted by an effective data-analysis regime. The collaborative effort should, in my view, focus on cost-sharing. The regulators can offset a percentage of the annual cost of satcom implementation (enforced this way) against the duties/taxes that airlines pay. This will help the airlines achieve compliance without too much cost burden. Again, there are so many ways this could be achieved. All we need is a comprehensive satcom implementation for enhanced situational awareness, irrespective of how rarely these unfavorable situations occur.

I believe both these upgrades will take existing commercial aircraft safety one level above, into what may be a 'situationally aware' state of air-safety where the airlines, air-traffic controllers (civil/military) and satellite operators have real-time 'eyes-on-the-plane.' The aircraft will also have a broader range of lightning protection in them.

I hope the plane is safe somewhere, but my hopes are a mere reflection of the human desperation for survival. If this aircraft is lost at sea, then we should stop talking about how advanced the platforms are and how great the procedures are.

The current capabilities, and any capability for that matter, will be in accordance with the concerned regulatory framework. However, we need to study evolving weather patterns and update our region-wise 'worst-case-scenarios.' In practical terms, regulatory compliance is all about pre-written standards and the equipment checking out against those standards during testing. I have great respect for the scientific community that develops the methods and the regulator community that enforces the standards, but in rather blunt layman terms, a group of humans signing off on any technology will not guarantee safety during flight. It only asserts our confidence in the equipment for uptake. Our confidence has so far been proven right, but this past year the aviation accidents have reminded us that standards and regulations have to get more customized, based on region-specific conditions. To me this means that we need an evolutionary process for constantly measuring the weather patterns and an ongoing flight-risk assessment that feeds into the regional framework of regulations.

It will cost resources, but we need to invest in that to be able to avoid aircraft disappearances. By agreeing to constantly revisit flight-risk assessment, we will not doubt or undermine any technology or any entity's capacity to provide safe equipment/service. We have paid with over 350 human lives this past year in 'aircraft disappearances' alone. We cannot go on anymore with the 'On what basis/capacity are you questioning our capability?' attitude. Even in a strict business sense, airlines need passengers to trust their 'air-safety' before they enjoy the 'enhanced passenger experience' through wi-fi and IFE. We will be questioned and we will be blamed, but that is a price we must pay to enhance safety in commercial aviation. Just my thought.

Let's face it, whether we agree or not, weather patterns are always variable and things can always go bad. In this case, it went bad twice within a year, in the same airspace. It is a waste of time identifying who went wrong and who failed to monitor them. We must focus on eliminating all possibilities of such an occurrence in future. The times are desperate and we need tough decisions followed up with implementation. Lives are more important than Return-On-Investment and budget constraints. This is the time for collaborative effort. We have already paid with human lives in hundreds. The objective is to prevent planes from vanishing. I can't believe I just wrote that, but sadly, it makes all the sense.

Continuing to hope for increasing air-safety in commercial aviation.

On a very different note [a shameless plug], if you are interested in unique Tamil short films, feel free to visit https://www.summamovies.com/ I couldn't tolerate the mass masala entertainers anymore and decided I would do my best to produce content with substance. I have a long way to go as a producer and a start-up founder, but I am glad our journey has begun. I look forward to your support. Each film on our site costs INR 15. Thanks!!!


Regards,