The figure above, “Proposed analytics and training approach”, illustrates the approach we are exploring for railway data analytics. First, railway technology and safety experts tag objects of interest in the accident and incident reports using the railway accident causation taxonomy we are developing. NLP and statistical analysis are then applied to the complete accident records to establish clusters of, and correlations between, accident causation factors. See the object that looks like a pizza in Figure 5; this is an initial cluster diagram from analysis of RAIB online accident reports.
This causation clustering and correlation describes the statistical relationships between the important causes in situations that led to accidents. The theory is that when these complex conditions are in place there is a heightened chance that a serious accident could happen, so identifying them allows the risk to be flagged.
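As a rough illustration of how correlations between tagged causation factors might be computed, the sketch below counts how often pairs of tags co-occur across reports and calculates a phi coefficient for a given pair. The tag names and the five example reports are invented for illustration only; they are not taken from the RAIB reports or from our taxonomy, and the phi coefficient is just one of several measures that could be used.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical expert-tagged reports: one set of causation tags per report.
reports = [
    {"signal_passed_at_danger", "driver_fatigue", "poor_visibility"},
    {"driver_fatigue", "poor_visibility"},
    {"track_defect", "heavy_rain"},
    {"signal_passed_at_danger", "driver_fatigue"},
    {"track_defect", "heavy_rain", "poor_visibility"},
]


def cooccurrence(reports):
    """Count how many reports contain each unordered pair of tags."""
    pair_counts = Counter()
    for tags in reports:
        for a, b in combinations(sorted(tags), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def phi(reports, a, b):
    """Phi coefficient between the presence of tag a and tag b."""
    n11 = sum(1 for t in reports if a in t and b in t)
    n10 = sum(1 for t in reports if a in t and b not in t)
    n01 = sum(1 for t in reports if a not in t and b in t)
    n00 = len(reports) - n11 - n10 - n01
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A tag present in all or none of the reports gives a zero denominator.
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


pairs = cooccurrence(reports)
strongest = pairs.most_common(3)
```

Pairs with a high phi value tend to appear together in the same reports and are candidates for the same cluster; pairs with a strongly negative value rarely appear together.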
Having trained our system to look for heightened risk, we then feed the analytics engine with railway data: real-time and historical, structured and unstructured. Let us call it operational data. Our previous work has established that data is available to flag up heightened risk and that it can be linked to accident causation. The operational data therefore serves as a proxy for the causation analysis, because it is linked to the accident causes.
The analysed operational data is compared with the data derived from the accident and incident records. If there is a similarity, a flag is raised. The system is then interrogated to determine whether we have encountered a false positive or a false negative, or whether we have indeed averted a potential accident. In simple terms, if the two pizzas look similar, then there is likely to be a heightened risk.
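The comparison of the two "pizzas" can be sketched as a similarity test between two weighted profiles of causation factors. The example below uses cosine similarity and a fixed threshold; both are assumptions made for illustration, as are the factor names, the weights and the threshold value of 0.8. The actual similarity measure used by the analytics engine is not specified here.

```python
from math import sqrt


def cosine_similarity(p, q):
    """Cosine similarity between two {factor: weight} profiles."""
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # an empty profile is treated as dissimilar to everything
    return dot / (norm_p * norm_q)


def heightened_risk(operational, accident_profile, threshold=0.8):
    """Raise a flag when the operational profile resembles an accident profile."""
    return cosine_similarity(operational, accident_profile) >= threshold


# Hypothetical profile derived from accident and incident records.
accident_profile = {
    "driver_fatigue": 0.6,
    "poor_visibility": 0.5,
    "signal_passed_at_danger": 0.4,
}

# Hypothetical profiles derived from streamed operational data.
similar_day = {
    "driver_fatigue": 0.5,
    "poor_visibility": 0.5,
    "signal_passed_at_danger": 0.3,
}
unrelated_day = {"track_defect": 0.9}

flag_similar = heightened_risk(similar_day, accident_profile)
flag_unrelated = heightened_risk(unrelated_day, accident_profile)
```

In this sketch the first operational profile raises a flag and the second does not. Each flag would still need to be reviewed, and the threshold tuned, using the feedback on false positives and false negatives.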
The system will learn from accidents and incidents on an ongoing basis and become more accurate as it receives feedback and more data. Clearly, to begin with, this type of system would have to be run in parallel with the existing safety management system until sufficient confidence is built up.