In the last few years, the term “Big Data” has been used to describe the explosion in the amount of data at the disposal of the science community, national governments and commercial companies alike. Driven by changes in technology and social media, and by data policy changes through initiatives such as INSPIRE, the modern challenges in analytics are moving towards how to store, manage and compute on these vast datasets rather than how to work with minimal amounts of data. In particular, the elasticity of Cloud platforms offers great potential for managing large volumes of data, although some practicalities around data policy and legal issues are still outstanding. The Big Data phenomenon has become one of the most important areas in Analytics today.
Social media has created a new and rich data source which is beginning to become usable in science and business alike. Just as technology and innovation have created this data stream in the first place, they must also provide the means to store it, mine it, interpret it and make it useful for analytical consumption. The challenges surrounding Big Data, such as immediacy, reliability and authenticity, are driving continuous innovation in this space. While mobile technology innovation has rapidly advanced the adoption of social media within the population, non-human sensor networks are also increasingly becoming connected to everyday decision-making via social media feeds.
High Performance Computing
High Performance Computing (HPC) describes the technology area devoted to optimising computational throughput. Grid computing remains one of the better known forms of HPC, comprising thousands of interconnected machines able to split a single piece of work into thousands of parallel processes. However, HPC also covers other forms such as high-performance databases and graphics card (GPU) programming. As these technologies become more available and others emerge, understanding these capabilities and determining which are most useful to specific analytical problems becomes increasingly important.
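As a minimal sketch of the split-and-combine pattern described above, the following Python example divides one job into chunks, runs each chunk on a worker, and merges the partial results. The function names, the sum-of-squares workload and the local thread pool are illustrative assumptions; on a grid, a cluster scheduler would play the role of the local pool.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def split_range(n_items: int, n_chunks: int) -> list[range]:
    """Divide the indices 0..n_items-1 into contiguous, near-equal chunks."""
    base, extra = divmod(n_items, n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def partial_sum_of_squares(chunk: range) -> int:
    """The unit of work each worker runs independently on its own chunk."""
    return sum(i * i for i in chunk)


def parallel_sum_of_squares(n_items: int, n_chunks: int, executor: Executor) -> int:
    """Scatter the chunks to workers, then combine the partial results."""
    chunks = split_range(n_items, n_chunks)
    return sum(executor.map(partial_sum_of_squares, chunks))


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(parallel_sum_of_squares(1_000_000, 8, pool))
```

Because the function accepts any `concurrent.futures.Executor`, swapping in a `ProcessPoolExecutor` is one way to spread CPU-bound chunks over several cores without changing the splitting or combining logic.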
Architectures and Patterns
The delivery of technology and software has changed significantly in recent years. The growth of the Internet and bandwidth has stimulated the development of distributed and interconnected systems able to exchange information over great distances using web services and common interfaces. These distributed architectures have laid the foundations for the more recent phenomenon of remote clouds of computers whereby software, platforms and services no longer need to exist within the same infrastructure or physical location as the consuming user or system. As organisations strive to adopt these computational advances, other factors such as data policies, data volumes and regulation have led to hybrid designs such as a part-cloudy paradigm whereby some computing resources reside remotely, and others in close proximity to their intended use. Many solutions are emerging and this is a busy area for research and technology advancement.
Open Source and Community Technologies
Open source software has been in existence for many years and the communities contributing to these projects have led to the creation of very mature, robust and credible software products. In insurance, one of the most current and visible examples of such a project is the Global Earthquake Model (GEM), which offers an open source earthquake modelling capability. In the geospatial world, OpenLayers is another good example. Community projects have also been created to assist with the collection of data and there are few better examples than OpenStreetMap. This well-known project is now of a scale and quality that make it a credible and increasingly reliable data source. How should these community projects be used within the commercial insurance world and what benefits do they bring to the risk and analytics arena?
Date: Apr 22, 2013 | Type: Paper
Authors: A. Rau-Chaplin, B. Varghese, Z. Yao
Fields: MapReduce model, secondary uncertainty, risk modelling, aggregate risk analysis
Summary: This paper presents the design and implementation of an extensible framework for performing exploratory analysis of complex property portfolios of catastrophe insurance treaties on the MapReduce model. The framework implements Aggregate Risk Analysis, a Monte Carlo simulation technique, which is at the heart of the analytical pipeline of the modern quantitative insurance/reinsurance company.
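To give a feel for how a Monte Carlo aggregate analysis can be phrased as map and reduce steps, here is a toy Python sketch. It is not the framework from the paper: the event frequency, the exponential loss distribution, the per-occurrence retention and limit, and all function names are hypothetical choices made only for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def simulate_trial(trial_id, rng):
    """Generate one simulated year as (trial_id, ground-up event loss) pairs."""
    n_events = rng.randint(0, 5)  # hypothetical event frequency
    # Hypothetical severity: exponential with a mean loss of 1,000,000.
    return [(trial_id, rng.expovariate(1 / 1_000_000)) for _ in range(n_events)]


def map_event(record, retention, limit):
    """Map step: apply assumed per-occurrence layer terms to one event loss."""
    trial_id, loss = record
    return trial_id, min(max(loss - retention, 0.0), limit)


def reduce_by_trial(pairs, n_trials):
    """Reduce step: sum the layered losses of each trial into a yearly loss."""
    totals = defaultdict(float)
    for trial_id, loss in pairs:
        totals[trial_id] += loss
    # Trials without any events contribute a yearly loss of 0.0.
    return [totals[t] for t in range(n_trials)]


def aggregate_analysis(n_trials, retention, limit, seed=0):
    """Run all trials and return one aggregate loss per simulated year."""
    rng = random.Random(seed)
    records = [r for t in range(n_trials) for r in simulate_trial(t, rng)]
    mapped = (map_event(r, retention, limit) for r in records)
    return reduce_by_trial(mapped, n_trials)


if __name__ == "__main__":
    yearly = aggregate_analysis(10_000, retention=500_000.0, limit=2_000_000.0)
    print("average annual loss:", mean(yearly))
```

Each call to `map_event` depends on a single record, so the map step can be distributed freely; only the per-trial summation needs records grouped by key, which is the grouping a MapReduce runtime provides.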
Date: Oct 01, 2012 | Type: Paper
Authors: Eric Mason, Andrew Rau-Chaplin, Kunal Shridhar, Blesson Varghese and Naman Varshne
Fields: post-event earthquake analysis; catastrophe modelling; loss estimation; loss visualization
Summary: Catastrophe models capable of rapid data ingestion, loss estimation and visualization are required for post-event analysis of catastrophic events such as earthquakes. This paper describes the design and development of the Automated Post-Event Earthquake Loss Estimation and Visualization (APE-ELEV) system for real-time estimation and visualization of losses incurred due to earthquakes. A model for estimating expected losses due to earthquakes in near real time is described and implemented. Since data is often available from multiple disparate sources immediately after an event, a geo-browser is described that helps users to visualize and integrate hazard, exposure and loss data.
Date: Oct 01, 2012 | Type: Paper |
Conf: International Conference for High Performance Computing, Networking, Storage, and Analysis
Authors: A. K. Bahl, O. Baltzer, A. Rau-Chaplin and B. Varghese
Summary: At the heart of the analytical pipeline of a modern quantitative insurance/reinsurance company is a stochastic simulation technique for portfolio risk analysis and pricing, referred to as Aggregate Analysis. This paper explores parallel methods for aggregate risk analysis.
Date: Sep 01, 2012 | Type: Paper |
Conf: International Workshop on Cloud Technologies for High Performance Computing
Authors: Ishan Patel, Andrew Rau-Chaplin and Blesson Varghese
Fields: Cloud computing; Amazon Cloud Services; Parallel Analytics; R and Snow
Summary: Analytical workloads abound in application domains ranging from computational finance and risk analytics to engineering and manufacturing settings. In this paper the authors describe a Platform for Parallel R-based Analytics on the Cloud (P2RAC). The goal of this platform is to allow an Analyst to take a simulation or optimization job (both the code and associated data) that runs on their personal workstation and, with minimum effort, have it run on large-scale parallel cloud infrastructure. If this can be facilitated gracefully, an Analyst with strong quantitative but perhaps more limited development skills can harness the computational power of the cloud to solve larger analytical problems in less time. P2RAC is currently designed for executing parallel R scripts on the Amazon Elastic Compute Cloud infrastructure. Preliminary results obtained from an experiment confirm the feasibility of the platform.