Ethical Considerations for Data Usage

Wed, 03/25/2020 - 09:19
by Dale Bowring & Matthew Wolff

Dale Bowring, CPCU
Rating Service Owner, Corporate Information Systems, Amica Mutual Insurance Company

Matthew Wolff, CPCU
Department VP, Corporate Information Systems, Amica Mutual Insurance Company

Ever since the first marine insurance policies were written at Lloyd’s in 1686, insurers have sought more information to better understand and price risk. Now, with the advent of modern computing, the Internet of Things (IoT), sensor data and smartphones that can track nearly every element of our lives, we are living in the midst of a data explosion. In fact, 90% of all existing data was created within the past two years alone. (Bernard Marr, n.d.)

The presence of this data poses both practical and ethical challenges to insurers who seek to develop insight from these mountains of information. In practical terms, most insurers have built technology systems that are efficient in sourcing pre-defined data and producing static outputs such as operational and managerial reports. They do not, however, support next-generation technologies such as unsupervised learning models that rely on massive sets of unknown and potentially uncorrelated data.

Many insurers are now working on strategies to source more data, much of it less structured, through their usage-based insurance platforms, sensor data and third-party providers. Yet more data does not necessarily mean better data; carriers will need to ensure they use this data in ways that are both effective and ethical.

Ethical Considerations

When sourcing and acting on data, carriers should look to principles-based frameworks rather than rules-based systems. We believe this is the most practical approach: data velocity and trends change quickly, and static rules will soon become obsolete or fail to cover all scenarios.

A relevant example of a principles-based framework is the European Union’s General Data Protection Regulation (GDPR), which establishes seven protection and accountability principles that guide market participants to an expected outcome across a broad array of situations. (European Union, n.d.)

We propose carriers consider five principles when determining the consumption and usage of data:

Principle 1 – Data must be accurate. (Accuracy)

Data consumed from internal and external sources must represent the true facts and circumstances of the observed object. As data velocity increases, it’s imperative that insurers validate this data through random validations, A/B testing and guardrails that can help identify outliers and inconsistencies.
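
As a minimal sketch of what such guardrails might look like, the Python below combines a hard plausibility range with a statistical outlier check (Tukey fences based on the interquartile range). The function names, the annual mileage figures and the thresholds are hypothetical and chosen only for illustration; a carrier would set its own limits for each data element.

```python
from statistics import quantiles


def violates_guardrail(value, minimum, maximum):
    """True when a value falls outside a hard plausibility range."""
    return not (minimum <= value <= maximum)


def flag_outliers(values, k=1.5):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical self-reported annual mileage for eight policies.
annual_mileage = [11200, 9800, 12500, 10400, 13100, 9900, 250000, 11800]

# Records flagged here would be routed to a manual or random validation step.
suspect_indices = flag_outliers(annual_mileage)
out_of_range = [m for m in annual_mileage if violates_guardrail(m, 0, 100000)]
```

In this sample, only the 250,000-mile record is flagged by either check. A flag is a prompt to validate the record, not proof that it is wrong.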

Principle 2 – Data must be relevant. (Relevancy)

Insurers must carefully assess whether a given data element is probabilistically correlated with the exposure they are assessing and pricing. Industry regulators already enforce actuarial standards that require rate correlations to be demonstrated; however, this discipline must also extend to other elements of the insurer’s lifecycle, including claims fraud and underwriting. For example, would a third-party-sourced “social media score” show a sufficient causative relationship to flag a claim as potentially fraudulent? Ultimately, carriers should consider data only when there is a clear reason for using it and a clear result from doing so.

Principle 3 – Data must be timely. (Timeliness)

When consuming data, carriers should assess whether it is sufficiently current to be accurate. For example, a consumer’s credit score can vary daily, so any use of this data must be timely at the point a carrier makes a decision or takes an action. With an increasing expectation for real-time, on-demand, data-driven decisions, collecting data at the right moment in time is a crucial dimension of data quality. Data collected either too soon or too late can lead to the wrong decisions, potentially resulting in lost time and money.
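
A timeliness check can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name is_fresh and the seven-day maximum age are hypothetical, and the acceptable age would differ for each data source.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(observed_at, decision_at, max_age):
    """True when the data was observed no later than the decision
    and no earlier than max_age before it."""
    age = decision_at - observed_at
    return timedelta(0) <= age <= max_age


# Hypothetical decision time and an assumed seven-day limit for a credit score.
decision_time = datetime(2020, 3, 25, 9, 19, tzinfo=timezone.utc)
max_credit_score_age = timedelta(days=7)

recent_pull = decision_time - timedelta(days=2)
stale_pull = decision_time - timedelta(days=30)

use_recent = is_fresh(recent_pull, decision_time, max_credit_score_age)
use_stale = is_fresh(stale_pull, decision_time, max_credit_score_age)
```

Here the two-day-old score passes and the thirty-day-old score does not, so the carrier would refresh the stale value before acting on it.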

Principle 4 – Data must not discriminate. (Non-discriminatory)

This is perhaps the most difficult principle to guarantee, but when working with data sets, insurers must be careful not to rebuild “redlines” across data attributes. Many cases exist where data sets produce outcomes that no human would want to achieve but that make sense within the context of an optimization function in a computer. An often-cited example is Amazon’s experience with machine learning models for resume scanning that ultimately omitted entire classes of applicants. (Reuters, n.d.)
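
One simple way to monitor for this kind of outcome is to compare favorable-decision rates across groups. The Python below is a minimal sketch, not a complete fairness test: the group labels and decisions are invented, and the 0.8 threshold is an assumption borrowed from the “four-fifths” rule of thumb used in U.S. employment selection guidelines rather than an insurance regulatory standard.

```python
from collections import defaultdict


def favorable_rates(decisions):
    """Share of favorable outcomes per group from (group, favorable) pairs."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, is_favorable in decisions:
        totals[group] += 1
        favorable[group] += int(is_favorable)
    return {group: favorable[group] / totals[group] for group in totals}


def adverse_impact_ratio(decisions):
    """Lowest group rate divided by the highest group rate (1.0 means parity)."""
    rates = favorable_rates(decisions)
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable outcome, so rates are equal
    return min(rates.values()) / highest


# Hypothetical model output: group A is approved 8 times in 10, group B 5 in 10.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)

ratio = adverse_impact_ratio(decisions)
needs_review = ratio < 0.8  # assumed threshold, see the note above
```

In this sample, group B’s approval rate is well under four-fifths of group A’s, so the model would be sent for review. A passing ratio does not prove a model is non-discriminatory; it only means this particular check found no disparity.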

Principle 5 – Data must align to consumer and social expectations. (Alignment with consumer and social expectations)

When deciding whether to use a data source, carriers should consider whether doing so would damage their standing with customers or the general public. A tradeoff exists between an individual’s personal privacy and a service; however, this tradeoff is not perceived consistently across industries and companies. Social media monitoring is a good example. Most Instagram users understand that their data is used to produce targeted ads on the platform; however, they would not necessarily expect that data to be used by their insurance company. Using these so-called unconventional sources of public data – court documents, social media and even credit data – could pose ethical questions. Should insurers be able to scrape the data on your Instagram or Twitter accounts to determine your level of physical activity (posting a picture of working out) or involvement in risky activities (such as regularly tweeting about skydiving, rock-climbing or go-kart racing)?

Conclusion

In conclusion, the explosion of data presents both a unique opportunity and a challenge to the insurance industry. It holds the promise of allowing insurers to truly price each risk individually, which serves an important economic function for society and for customers, who pay only for the risk they truly represent.

However, this data also carries the risk of producing incomplete, inaccurate or undesirable outcomes when it is not fully understood or not viewed as part of a holistic picture. Companies are only now starting to understand the pitfalls of these big data sets and should seek to adopt principles-based frameworks now to ensure consistency and positive outcomes for years to come.

Works Cited

Bernard Marr. (n.d.). How Much Data Do We Create Every Day? The Mind-Blowing Stats Everyone Should Read. Retrieved from Forbes: https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-...

European Union. (n.d.). What the GDPR says about... Retrieved from Complete Guide to GDPR Compliance: https://gdpr.eu/what-is-gdpr/

Reuters. (n.d.). Amazon scraps secret AI recruiting tool that showed bias against women. Retrieved from Reuters: https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/am...

