The AI Fundamentalists
A podcast about the fundamentals of safe and resilient modeling systems behind the AI that impacts our lives and our businesses.
Managing bias in the actuarial sciences with Joshua Pyle, FCAS
Joshua Pyle joins us in a discussion about managing bias in the actuarial sciences. Together with Andrew's and Sid's perspectives from both the economic and data science fields, they deliver an interdisciplinary conversation about bias that you'll only find here.
- OpenAI news plus new developments in language models. 0:03
- The hosts discuss the aftermath at OpenAI and Sam Altman's return as CEO.
- Tension between OpenAI's board and researchers on the push for slow, responsible AI development vs fast, breakthrough model-making.
- Microsoft researchers find that smaller, high-quality data sets can be more effective for training language models than larger, lower-quality sets (Orca 2).
- Google announces Gemini, a trio of models with varying parameter sizes, including an ultra-light version for phones.
- Bias in actuarial sciences with Joshua Pyle, FCAS. 9:29
- Josh shares insights on managing bias in Actuarial Sciences, drawing on his 20 years of experience in the field.
- Bias in actuarial work defined as differential treatment leading to unfavorable outcomes, with protected classes including race, religion, and more.
- Actuarial bias and model validation in ratemaking. 15:48
- The importance of analyzing the impact of pricing changes on protected classes, and the potential for unintended consequences when using proxies in actuarial ratemaking.
- Three major causes of unfair bias in ratemaking (Contingencies, Nov 2023)
- Gaps in the actuarial process that could lead to bias, including a lack of a standardized governance framework for model validation and calibration.
- Actuarial standards, bias, and credibility. 20:45
- Complex state-level regulations and limited data pose challenges for predictive modeling in insurance.
- Actuaries debate definition and mitigation of bias in continuing education.
- Bias analysis in actuarial modeling. 27:16
- The importance of dislocation analysis in bias analysis.
- Analyze two versions of a model to compare predictive power of including vs. excluding protected class (race).
- Bias in AI models in the actuarial field. 33:56
- Actuaries can learn from data scientists' tendency to over-engineer models.
- Actuaries may feel excluded from the Big Data era due to their need to explain their methods.
- Standardization is needed to help actuaries identify and mitigate bias.
- Interdisciplinary approaches to AI modeling and governance. 42:11
- Sid hopes to see more systematic and published approaches to addressing bias in the data science field.
- Andrew emphasizes the importance of interdisciplinary collaboration between actuaries, data scientists, and economists to create more accurate and fair modeling systems.
- Josh agrees and highlights the need for better governance structures to support this collaboration, citing the lack of good journals and academic silos as a challenge.
What did you think? Let us know.
Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:
- LinkedIn - Episode summaries, shares of cited articles, and more.
- YouTube - Was it something that we said? Good. Share your favorite quotes.
- Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
The AI Fundamentalists, a podcast about the fundamentals of safe and resilient modeling systems behind the AI that impacts our lives and our businesses. Here are your hosts, Andrew Clark and Sid Mangalik. Hello, everybody. Welcome to the AI Fundamentalists. We're excited to be here after a brief hiatus due to travel, conferences, and exciting projects of our own that have been going on. But nevertheless, we are back and trying to finish the year strong. And before we get into today's subject, we do want to give our own take on what had been going on in that time, particularly OpenAI. Yeah, well, that was fun to watch, wasn't it? Anyway, any longtime listeners of the podcast will definitely know our feelings about OpenAI, and we try to keep our actual opinions on OpenAI relatively guarded. And if you listened to our podcast on shifting left, we even went a little bit deeper on some of those areas. But anyway, I'm not very surprised about this whole kerfuffle that happened. I think a lot of it had to do with the, what was it, not openly candid, or however the board initially phrased it? Well, as we've talked about before, what most people seem to have forgotten is that OpenAI was originally started, not that long ago, as a research organization that was supposed to publish, fully open source, information about how to make AI systems better, and all those things. And it still, by the way, has nonprofit tax status. So with all the, you know, trying to become a big Silicon Valley player, it appears they are a nonprofit. Now, of course, they're hemorrhaging money, like many startups, but they get the nonprofit tax benefit, which is really strange based on how they're currently operating. A few of the other issues, I think, that we had going on there are a lot of hearsay, and we don't really know the specifics. Gary Marcus has a great blog that goes into a lot of the nuts and bolts, so I don't want to say too much on the podcast here, besides: the founder of OpenAI is involved in a lot of companies and has a lot of stake in different companies, and we're not sure exactly how all that plays into his recent congressional hearings, where his stakes and conflicts of interest may lie, as well as the nonprofit status. So if you want to go dig more into that, you can definitely do that. We're not going to get into any of those kind of dicey issues here. But anyway, very interesting how that all happened. I'm surprised they brought him back, specifically, I guess, because of the nonprofit tax status; it's interesting how the investors have so much of a say in that. But yeah, I'm not surprised at how this has gone. I think, actually, the fact that they brought him back is worse for everything, as now he's gotten a lot more control; one of his conditions coming back was being absolved from any wrongdoing. So that's also an interesting thing to ask as one of your conditions to come back. But in either case, he has more control than he ever has. So it's going to be interesting to see how that plays out. Any thoughts there, Susan? It put companies on alarm: don't put all your eggs in one basket, too, from the technical side, because while that was all going on, people were scrambling; they did not have a backup plan. And this technology is too new not to have a backup plan.
So I think it woke a lot of companies up, and a lot of people, even at the experimental level, you know, really needed to be wise about: why are you using this? What's your goal with it? So despite it being, you know, a big shake-up, it also was a big wake-up moment. Sid, what about you?
Sid Mangalik: Yeah, I mean, to me, on the research side, it looks like there's almost another story going on underneath this. And obviously, we can't speak to this formally, because none of us here work at OpenAI. But it feels like there's this push and pull between the board wanting to build slower, better, refined, more aligned with human goals, and the actual researchers on the team, who, you know, are from academia and want to go fast, fast, fast and make a better and better and better model. And so I want to hear what Andrew has to say here before I hop off into LLM world, but I think that's going to be a big piece of what's happening there and the tension that's happening there.
Andrew Clark: Definitely, and adding in the VC dynamic and things like that, and Microsoft's stake; Microsoft has a very large stake now, and in the couple of days of, you know, where's the CEO, the CEO's got to go, he already had a big job at Microsoft. So Microsoft has a very outsized stake in OpenAI and is trying to leverage that. So there's definitely, 100%, that tension of, especially, top researchers wanting to move fast and break things, which, as normal, they want to be the leaders in this area; there doesn't really seem to be a moat of why is OpenAI better than anybody else. And I know, Susan, your comment about companies scrambling, I would argue as well that OpenAI has nothing that's that different. You could switch out Google's or Amazon's or, sorry, Microsoft's API in the background, and that's not going to be that different to you; if you're doing that, you just lose a little bit of fine-tuning. But I agree with that. I think there's definitely a lot of push and pull on several different dimensions of: do we want to go the slow, responsible path? How much do we want to open source? How much do we actually just want to be the leaders? And also, how much are we prophesying that? So there's a lot of interesting dimensions going on there. Sure. So, Sid, you had also seen something about Orca 2 that you wanted to call out?
Sid Mangalik: Yeah, yeah. So, you know, there's a lot of murmurs and whispers about, you know, them hyping up this new big thing they're going to do, which is going to be, like, you know, Q*, or it's going to be, like, you know, we have these innovations to do beyond GPT-5, which, on the record, is happening relatively soon. But this really great paper came out of Microsoft just two weeks ago, called Orca 2: Teaching Small Language Models How to Reason. And this is an awesome paper. I think, you know, if you're technical enough, you should hop in and try to read this paper; it's relatively readable. But I think the really big takeaway we got here is that OpenAI has been talking about, well, we ran out of internet data, right? We effectively scraped the entire internet, we read every Reddit post, we read every Quora link, we read every Wikipedia article, what's left? It turns out that maybe the problem isn't going to be a problem of more and more and more data; it's going to be a problem of more effective data, more data which is good for teaching and learning for models. So it's going to be closer to: rather than being a teacher who gives a student, you know, 5,000 years of content, what if we give them 50 years of really high-quality, well-structured content? And this is, obviously, very new research, but this could bear out to be the new paradigm for how we train these LLMs: making better and better curriculums for models.
Susan Peich: Which makes complete sense. That's one of the things we've been saying on this podcast since the beginning: smaller, high-quality data is better; that brings back the stats we've been talking about. So I'm glad to see that the NLP world, or the LLM world rather, is starting to shift a little bit in that direction, as well as on the whole big-data, more-data-is-better, correlation-over-causality discussion. I'm glad that we're starting to shift back a little bit more toward higher quality is better. So happy to see that that's migrating as well. That's still not going to necessarily solve the issues with the underlying paradigm of make it sound accurate, not be accurate. That's not really going to go away with the current definition of how LLMs are built, I think, but the higher quality data will definitely help make higher quality LLMs.
Sid Mangalik: And then, just to close out, a little wrap-up on news here: just yesterday, so very recent news, Google officially, formally announced Gemini. What does this mean? Gemini, famously, triplets. So Gemini comes in three versions: Ultra, Pro, and Nano. Nano is targeted to run on phones, so that's 1.8 billion parameters, and it's great that they even gave a parameter count; you know, if you look at the GPT-4 technical report, there's no mention of parameter sizes. So 1.8 billion is what they're targeting, which is big, but it'll fit on a, you know, Google Pixel phone. Then they have the Pro, which they're going to start rolling out for Bard. Bard was originally built on PaLM 2, which, if anyone used Bard, was a little bit disappointing, not really fitting for the company that kind of designed all this tech in the first place to have such an underpowered model relative to the competition. But now, with Gemini Pro, we should expect about parity with GPT-3.5. And with Ultra, which is going to be coming out in, sounds like, a month, but, you know, delays may vary, we should be on par with GPT-4, or a little bit better on some tasks. So that's something we have to look forward to: Google finally stepping up their game. And, you know, they did all this good work in this field; they make, you know, all the tensor libraries we use, they make all the TPUs we use, but they didn't have a compelling AI product until just yesterday. So we'll see how that goes, but it's something to keep an eye on as well.
Susan Peich: We'll keep an eye on that. Well, let's get into it. We are excited because today's topic, you know, we've been covering a couple of sessions on model validation and the different aspects of that, but today we're excited to have a guest and an interview with Josh Pyle. He is the Vice President and Head of Risk and Captive Management for Boost Insurance. Before Boost, he was the head actuary at DoorDash, and before that he served in multiple actuarial and analytics roles, from his early career with Liberty Mutual, Allianz, and AAA, on through to his newer career in cyber insurance modeling with Symantec and CyberQ. Over the past 20 years, Josh has seen and experienced a great number of challenges in the actuarial field, but today he's joining us to talk specifically about bias. Josh, welcome to the podcast.
Joshua Pyle: Thanks a lot. Great to be here. Appreciate it.
Susan Peich: I know that everybody's been waiting for this, because, you know, a lot of times we get into the topics, and then when we can bring in somebody who has spent the time that you have in this field, it's always wonderful for us, and it makes it really exciting for everybody listening just to hear another take on things. So, you know, let's get started on the topic of managing bias in the actuarial sciences.

Andrew Clark: Well, thanks for being here, Josh, really excited for this. If we haven't already mentioned it, there's an article that Josh and I co-authored in Contingencies, which is the, is that the Academy of Actuaries publication, Josh? I believe, exactly, the American Academy. Excellent. So this is kind of a part two to that, and we're going to go a little deeper than what we did in the Academy of Actuaries publication. So I really wanted to take this opportunity, Josh, to ask you a couple of questions and unpack a little bit more, because that was kind of a general take, almost coming from the DS/ML side and how it works with the actuarial sciences, but I wanted to take it from the other side and really focus on: how do you look at it as an actuary? What are the main bias considerations? And then the gaps, and where do you see those areas? And then how could we, as a profession, taking an interdisciplinary approach, maybe start closing some of those gaps? So, that being said, how do you define bias in the property-casualty space?
Joshua Pyle: Yeah, no, great question. And it was a joy to write that article with you, so happy to be here today and chat more about it. Generally speaking, at a high level, you have different definitions of bias, starting, I guess, probably with the NAIC. I think, formally, they describe it as differential treatment that could lead to either favorable or unfavorable outcomes for either a person or a group. More specifically, as it relates to actuarial work, the Casualty Actuarial Society, which is the society that I'm a member of, really has a mission statement that focuses on tenets of ratemaking. So they describe the need for pricing to be not inadequate, not excessive, and not unfairly discriminatory. And that latter piece is really where bias comes into play. So there's a lot of talk about what's called a protected class. The variables that you can't price based on really vary by state, but there are going to be things like race, religion, ethnic origin, sexual orientation, in some cases gender, disability, even credit. And so that's really what you want to be aware of as an actuary as you think about ratemaking in general.
Sid Mangalik: Yeah, that's good. I think that aligns with how we want to think about this in the machine learning space; maybe we don't think about it so crisply. And I know that we have our techniques, but how do you actually tackle the problem of detecting this bias, and what kinds of tests might be normal in evaluating that?
Joshua Pyle: Yeah, for sure. So you really can, I guess, break it down into the obvious component parts, which would be inputs and outputs. In terms of outputs, probably the most important piece of ratemaking is making sure that your prices are in line with the actual underlying risk. So if you're looking at one of these protected classes, for example, or any variable, are the prices coming out of your model in line with observed historical claims and associated expenses? And where those are out of line is where you need to make adjustments. Obviously, with actuarial work, certain states will mandate that you either include or exclude certain variables, and in the case of these protected classes, a lot of times you would be forced to exclude those classes from consideration in pricing. But you really want to start with: are my rates actually in line with what I've seen historically from a claims standpoint? On the other side, the input side, it's making sure that all the data you're using for modeling is actually representative of the population you're trying to model for. So, in other words, if you're modeling for the entire state of Michigan and you're actually using as a data input some very small, biased subset of that, the model results may not be generalizable to the entire population. So looking at the inputs and outputs is really a starting point for identifying what bias exists within your model. There are a number of ways that you can test whether or not bias actually exists. I would say one that's pretty commonly used: if you're going through, let's say, a filed rate change within a given state, what is the overall impact by protected class? So even if you're not using that for pricing, you can look at what the outcomes are for those protected classes and identify whether or not that's the intended output. So let's say, for example, you're looking at race; you can actually examine the impact of those pricing changes, or that model change, on different races to make sure that's in line with what you're expecting, because that's how the DOI, the state-level DOI, will be looking at it as well. And so you're really going to be asking about dislocation, which is just defined generally as pricing change from prior. And then make sure that your independent variables going into the model are not themselves correlated with a protected class. Proxies are very common within actuarial ratemaking, but it's important that you understand, if you're not including race or disability or gender, are there other variables you're including in the model that are very correlated with that, that kind of circumvent the regulation as it's intended? And so it's important to understand the inputs, the outputs, and how those variables connect to each other.
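[Editor's note: to make the dislocation and proxy checks Josh describes a bit more concrete, here is a minimal Python sketch. The column names, data layout, and threshold are illustrative assumptions, not anything from the episode or from a specific filing.]

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def dislocation_by_class(df, protected="race",
                         old_premium="old_premium", new_premium="new_premium"):
    """Summarize dislocation (price change vs. prior) by protected class."""
    pct_change = (df[new_premium] - df[old_premium]) / df[old_premium]
    return pct_change.groupby(df[protected]).describe()[["mean", "50%", "max"]]

def proxy_screen(df, rating_vars, protected="race", threshold=0.4):
    """Flag rating variables strongly associated with a protected class.

    Uses Cramer's V between each categorical rating variable and the
    protected class; values near 1 suggest a possible proxy worth review.
    """
    flagged = {}
    for var in rating_vars:
        table = pd.crosstab(df[var], df[protected])
        chi2, _, _, _ = chi2_contingency(table)
        n = table.to_numpy().sum()
        cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
        if cramers_v >= threshold:
            flagged[var] = round(cramers_v, 3)
    return flagged

# Hypothetical usage on policy-level data:
# df = pd.read_csv("policies.csv")
# print(dislocation_by_class(df))
# print(proxy_screen(df, ["territory", "credit_tier", "vehicle_age"]))
```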
Sid Mangalik: Yeah, that's great. I guess that almost makes me ask a quick follow-up question here, which is: when we're looking at the protected classes that we're going to study, that we're going to analyze to make sure that we're not giving differential or different outcomes for different people, how much of the choice of those classes is reactive versus proactive? Meaning, are we going to purposefully select protected classes to make a principled stance, or because we anticipate these being monitored later, versus how much are we just going to fall back on what regulators want us to do?
Joshua Pyle: Yeah, that's a great question. I would say a common approach is actually to use one of the more restrictive states as a basis for the framework for your model construction, and then use that across the different states that you write in. I've seen it both ways, where you have specific regulations state by state that mandate what you can and can't do. But the problem with that is you may have a very protective state, like a California, for example, where you have to use certain variables and can't use certain variables, and then you have other states that may be a little bit looser with regulation. And then you end up with two models, four models, ten models, and are you really wanting to update and validate and calibrate all of those every time you make an update? So I think a lot of it, there's good precedent set state by state in what has been done and approved. But if you're thinking from a modeler's lens, you really almost want to start from what's most restrictive, and then how can I use that for other states in a meaningful way?
Andrew Clark: Excellent. Well, that was very helpful; thanks for that detailed explanation. What gaps do you think may exist in the current process where bias could sneak in? Or do you think, overall, the conversation, other people talking about actuarial bias and things like that, is just more people who are uninitiated and don't understand the existing controls in place? Or do you think there are actual areas to improve on in the current model structure, in making sure that you have the proof it works?
Joshua Pyle: Yeah, yeah, I think there are a few gaps within the actuarial process that could potentially lead to bias. Probably most notably, talking about how we build models and validate them and understand them: there is no standardized governance framework that actuaries can rely on, unfortunately. So you have, you know, a pricing actuary from company one analyzing model credibility and worth completely differently from company two. And so I think there is a need for that framework of how to think about model validation and calibration, and how to think about bias. I already alluded a little bit to state-level regulation, and that is a complicating factor, because you really have to think about 50 different regulations at the same time. And so it kind of clouds the process a little bit, because it isn't a free forum; you can't do what you want. It's not a process where you just put in any variables and kind of let it flow through; you're thinking about the right way to address those state-level regulations while building, obviously, a very predictive model, or as predictive as possible. I think one challenge, too, is just the idea of credibility. There's a lot of talk in actuarial science about credibility: for new lines, emerging lines, anything with limited history, you necessarily have a lack of data. And so here you're trying to assess your inputs against your outputs on very thin data; it's hard enough to get a predictive model, let alone identify nuances around bias. And so I think all of these, combined with, I would say, a general sweeping movement from the GLMs of the world to more advanced methodologies or algorithms, you know, talking about machine learning or so-called black-box algorithms, and how you interpret outcomes from those models, it really makes this process more complicated and leads to an environment where bias can exist.
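[Editor's note: for listeners less familiar with the credibility concept Josh mentions, here is a small illustrative sketch of classical limited-fluctuation credibility. The full-credibility standard and the numbers in the usage comment are textbook-style assumptions, not figures from the episode.]

```python
import math

FULL_CREDIBILITY_CLAIMS = 1082  # common frequency standard: 90% confidence of +/-5%

def credibility_weight(n_claims: float) -> float:
    """Limited-fluctuation credibility: Z = min(1, sqrt(n / n_full))."""
    return min(1.0, math.sqrt(n_claims / FULL_CREDIBILITY_CLAIMS))

def blended_estimate(observed: float, complement: float, n_claims: float) -> float:
    """Blend thin observed experience with a complement of credibility."""
    z = credibility_weight(n_claims)
    return z * observed + (1.0 - z) * complement

# A new line with only 120 claims gets roughly a third of the weight on its own
# experience, with the rest on the complement (e.g., industry or countrywide rates):
# blended_estimate(observed=0.085, complement=0.060, n_claims=120)  # ~0.068
```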
Andrew Clark: Excellent, thank you for that. Two quick follow-ups on that. Number one: have you read the new NAIC bulletin that came out, the NAIC model bulletin on the use of artificial intelligence systems by insurers? We've provided a couple of rounds of review on there. What's interesting is that in the final draft they actually removed any mention in the definitions section of what bias is. So I was curious to get your thoughts on that. If you're not familiar with this specifically, that's completely fine, but why would they? Is bias a word that's hotly debated among actuaries, even the notion of bias or the word bias? Or are there just, like, fifteen definitions? In general, even if you're not familiar with this specific document, I wanted to get your thoughts on why that might be something that was so hotly debated and removed.
Joshua Pyle: Yeah, that's a great question. I'm not super familiar, so I will say that up front. It absolutely is a topic that's getting a ton of attention within the actuarial community. So you now have a lot of webinars and attention going to continuing education around bias, a lot of white papers kind of walking through what bias is, how we define it, how we mitigate it, similar, kind of along the same lines as the short article you and I put out, just trying to help actuaries think through what they should be looking for and how to fix the issue. But it's very clear that there aren't established, objective definitions at this point. There are some really good white papers that I could refer people to, but not, I would say, widespread acceptance of exactly what bias is. And that could be the reason that they're a little bit hesitant to put commentary around bias in there. But, as I mentioned, continuing education: within the Casualty Actuarial Society, one requirement is that every year we have to attest that we've met a certain number of hours of continuing education related to professionalism, organized topics, business topics. Just recently, they've introduced the concept of bias for the first time. And so, formally, it is a requirement for all member actuaries of the CAS to have bias training each year. So it is getting a lot of attention, and it's absolutely a work in progress.
Andrew Clark: Excellent, that was very helpful, thank you. One thing, when you mentioned the ASOP standards: I'm by no means an actuary, but I've read ASOP 56 and some of the other ones, and one of the things that I found kind of interesting, as part of your discussion about the different standards, is there's not really that notion of objective, effective challenge, at least in the ASOPs that I've read, unless I'm missing something. I kind of want to get your thoughts on that. Is that not a standard practice, because you're an FCAS or whatever, so you think you can kind of review your own work? Or is that a standard thing, kind of an unspoken rule that everybody's already doing, even though it's not explicitly written down?
Joshua Pyle: Yeah, so I would say, generally speaking, ASOPs, and I know we've talked a little bit about this, are somewhat vague around certain topics. So, for those that aren't that familiar, ASOPs are Actuarial Standards of Practice, and they cover different topics, from reserving to ratemaking to assessing catastrophe loads to assessing credibility. Across the spectrum, there are something like 50 or 60 at this point, and they're issued through the American Academy of Actuaries. Bias is mentioned, but it's not clearly mentioned; it's not given a lot of space in those ASOPs, which does leave the item kind of open-ended and open to interpretation. And so there is commentary around what actuaries should and shouldn't do; these are all about best practices. And that does, I would say, hamper progress a little bit, because actuaries won't know exactly what to look for. In fact, truthfully, one of the stark differences, in my opinion, as we started talking more about data validation and model validation, was how much detail really was lacking from some of these standards. And I think a lot more can be done to define what we should be looking for. Not that it's always bad to have that vague guidance, because it is a reminder; you read that ASOP, you know, once a year, if you're studying for professionalism continuing education, and it's a good refresher. You remind yourself to do things the right way, look for bias, great, but there's not a lot of how do you look for bias? What do you look for? And I think that vagueness makes it difficult in some cases to actually assess and identify bias.
Sid Mangalik: Yes, I mean, let's dig into that. Let's get a little bit more technical here. You know, bias analysis is a big problem; it's not something that has a clear-cut solution. You can't just say, you know, we've done our bias checkbox. What does it look like to you when you have a sufficiently comprehensive bias analysis? Is it simply correlating outcomes with protected classes? Or, I guess, what goes into a sufficient and comprehensive bias analysis that you could present?
Joshua Pyle: Yeah, I would think, I guess, pulling a little bit from what I had said before and then adding on a little bit. So I think the first place to start would really be a dislocation analysis, or an analysis of how inputs are matched to outputs. Looking at it truly from an actuarial lens, you want your past experience to be reflective of your pricing going forward. So you're making sure that across all of the variables under consideration you're doing that, and, even if you're not using protected classes as a predictive variable, understanding whether or not those prices are, all else equal, affecting those protected classes differently. And so looking, again, at correlation with your protected classes: are there hidden variables in there? Are there confounders, multicollinearity? How are you stripping those protected classes out? And I think probably the best way I've really seen that done is to build two versions of a model. So you would have a version that explicitly excludes your protected classes, and then, if you're taking an example like, we'll say, race again, it's probably the easiest one, build a model where that is explicitly included. And, as you look at the results of those two, what is the difference in terms of predictive power? You're really getting at: is the model predicting the same thing in each case? What's the impact of that protected class in the model? Obviously, when we build something like a GLM, we're looking at parameters, we're looking at p-values and the ordered predictive power of each of the independent variables. And so you want to make sure that your protected class, race in this case, is not a very predictive variable, and that there aren't other variables that are super correlated with it, so that you don't have multicollinearity that you're not accounting for. So I think that's probably, I don't know if I would call it the most common approach, but an approach that I think gets at what DOIs will ultimately be asking for, which is explainable evidence of having, you know, carefully taken bias out of the picture.
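[Editor's note: here is a rough Python sketch of the two-model comparison Josh outlines, under assumptions; the Poisson frequency form, column names, and feature list are illustrative, not from the episode. It fits a GLM excluding and including the protected class, compares fit statistics, and uses variance inflation factors to look for the multicollinearity he mentions.]

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.outliers_influence import variance_inflation_factor

BASE_FORMULA = "claim_count ~ territory + vehicle_age + credit_tier"

def fit_frequency_glm(df: pd.DataFrame, formula: str):
    """Poisson frequency GLM with exposure as an offset (illustrative)."""
    return smf.glm(formula, data=df, family=sm.families.Poisson(),
                   offset=np.log(df["exposure"])).fit()

def compare_with_without(df: pd.DataFrame, protected: str = "race") -> pd.DataFrame:
    """Fit the rating model excluding and including the protected class."""
    excluded = fit_frequency_glm(df, BASE_FORMULA)
    included = fit_frequency_glm(df, BASE_FORMULA + " + " + protected)
    return pd.DataFrame(
        {"deviance": [excluded.deviance, included.deviance],
         "aic": [excluded.aic, included.aic]},
        index=["excluding " + protected, "including " + protected],
    )

def vif_table(df: pd.DataFrame, predictors: list) -> pd.Series:
    """Variance inflation factors to surface multicollinearity / proxy effects."""
    X = sm.add_constant(pd.get_dummies(df[predictors], drop_first=True).astype(float))
    return pd.Series(
        [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
        index=X.columns,
    )

# Hypothetical usage on policy-level data:
# df = pd.read_csv("policies.csv")
# print(compare_with_without(df))
# print(vif_table(df, ["territory", "vehicle_age", "credit_tier", "race"]))
```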
Sid Mangalik: Yeah, I mean, that's great. And I love the intentionality of going back and building that model with the protected class and getting a sense of what the model is really doing and where the performance is coming from. That's almost like a causality type of modeling, which goes a little bit beyond strictly correlational studies, really examining how these independent variables are part of the graph that makes these decisions. I guess, and let's not get too sidetracked here, but to get a little bit philosophical: how much do you think actuaries care about getting to this almost causal angle, versus what's just apparent through collinearities, through strict correlations and associations?
Joshua Pyle: Great question, too. I guess the first place my mind goes is: does smoking cause cancer? You know, these questions have existed forever. I do believe it's impossible to prove causality in cases like that. I think it has to be a blend of understanding, truly, some of what we've talked about, the correlation, confounding variables, multicollinearity, but also having that expert opinion on why these things are influential. There has to be an explainable reason that a variable is included in the first place: what outcome do we expect? What directionality do we expect? Does it make sense to have all these variables, or do we have too many variables? Just having that subject matter expertise helps kind of define what should and shouldn't be there and what those relationships could look like. But I don't think you're ever going to prove that smoking causes cancer.
Andrew Clark: I do love that part about domain expertise and trying for parsimony on inputs. I think that's one area that I hope actuaries don't lose at all, and that the data science world can definitely learn from, versus the opposite of, let's just throw the kitchen sink at it and use some sort of correlation-based metric to cull down to still 100 features. Are these really relevant? Do they matter? I think machine learning can be very valuable for finding those key features, but then you really need a human in the loop figuring out what it is. And then I would even argue, when you get down to it, most of how machine learning is used is more that mining for correlations that may or may not be something we want to use. The actual algorithms, if it's a parsimonious feature set, a GLM is not necessarily a bad paradigm, specifically with the interpretability, and I've done several tests myself; unless it's really complex data, those GLMs are still getting extremely high performance. So as we think about the future of how we can always improve modeling, I think the DS and ML side needs to start going a little bit more actuary on how to choose parameters, and maybe even looking at some of these more interpretable techniques, versus actuaries can use a little bit more boil-the-ocean for certain less sensitive attributes, and then use judgment in figuring out how to go forward.
Joshua Pyle: I completely agree. It's interesting, too, because I think some actuaries feel a little bit excluded from the big data era, because we're forced to explain everything we do, understandably; one of the key functions of the DOI is to make sure they're protecting the customer, so we have to be able to explain what we're doing. Imagine you're the customer, you're buying auto insurance, for example, and your price spikes year over year, and you call the company and ask what's going on, and they say, well, we put it in our deep neural net, and I don't know, that was the outcome. Not a sufficient answer, and not going to work from the DOI approval standpoint. So it is important, even though it's true that maybe we don't get to enjoy all of the fruits of these new machine learning algorithms; it's really important, even as we move more in that direction, that everything stays interpretable and defensible as we talk to DOIs to get these rate changes approved.
Andrew Clark: Excellent. Yeah, I think this has been a great discussion, and I really appreciate you being here. One more question, if you don't mind, and then, of course, Susan and Sid may have clarifying questions as well. One thing that just caught our eye recently is that Verisk announced their own solution for insurers to assess underwriting models and variables for unfair discrimination. And that's all well and good, but it sent me back to my comment earlier about the ASOPs and the lack of objective, effective challenge; the OCC and other regulations very much say you need to have that independent third party look at it, and actuaries, I think, do that a lot in practice. But I thought it was very strange that a data provider is now saying, yeah, for all of the data we have, we do data bias evaluation, so by the time you get it from us, it's all good to go, we're doing everything for you, it's great. So I want to know if you're familiar with that and had any thoughts on it. It just rubbed me a little bit the wrong way: you can't really assess yourself; they should be getting an outside party to do that for them.
Joshua Pyle: Yeah, I hear you. I think I have read that one, and I was thinking of it more as starting down the path of standardization. So can a company audit its own work? I would say yes, to some extent. I mean, if you look at some of what's going on around, say, Solvency II, or ORSA, from a risk analysis perspective, I think having that governance framework is important. It's important that companies do that, whether they're being watched or not. So I do think, to some extent, it's important, especially, again, as we go into this world of big data and more advanced algorithms; you have to understand, transactionally, how inputs are leading to outputs, and be able to defend that. I hear you on the Verisk piece. I just took it more as: I think more standardization is needed, and the more that can be done to help actuaries understand what they can do to identify the existence of bias and mitigate it, the better off we'll be. There isn't that standard today. And so maybe just from the standpoint of building a framework, or planting an idea so that people start thinking like that, I think it's beneficial. And, you know, I'm coming from the cyber security, cyber insurance world, where that also was an issue; it's hard to be the first person to put out an idea. But there's an incredible lack of standardization around what people do and how they do it, so I think it just gets everyone thinking similarly, talking similarly, not in a groupthink way, but in a way that gets people to the table and helps build toward that standardization.
Susan Peich: Excellent, thank you for that, Josh. Really great insights into bias, really from the actuarial standpoint. For all three of you, Josh, Andrew, and Sid: any final thoughts from what we discussed today? I'll give you two choices: final thoughts summarizing something that stood out to you today that you can't emphasize enough, or any hopes, you know, things you would like to call out for the future that you would like to see the actuarial field take on?
Joshua Pyle: Yeah, I'm happy to go first. I can't emphasize enough the importance of building a governance framework, because, as in the lead-in today, talking about ChatGPT and Gemini, models will only become more complicated; there will only be more data. So it is imperative for actuaries to really understand what a model governance framework looks like and to start building that out. And, like we talked about earlier, bias is an evolving issue. So, you know, do your due diligence: understand what's being talked about, the different perspectives on it, as you think through what bias is, how you can identify it, how you can mitigate it. Just, I guess, keep up with the evolving landscape; it is changing quickly, but it's really important to put some focus on trying to educate yourself there.
Sid Mangalik: Yeah, I mean, this is a great interview. I think we learned a lot about, you know, how our worlds intersect, how we have a lot of shared language between how we deal with bias in machine learning and in the actuarial world. And I'd love to see, you know, this field take off even more and really address bias in a systematic way, in a well-defined way, in a published and repeatable way, to give a guideline to people that are working in the DS/ML world, which is a little bit more like the Wild West, a lot more, you know, if it works, it works, you know, push to production. And really building out the standards that we can all use and build upon, which I think, you know, your field is just a lot more principled about, a lot more driven by, you know, ideas and policies, versus our world, which is a little bit more about creating customer value. So getting that intersection right could be really powerful.
Andrew Clark: Completely agree. Well, thank you so much for your time, Josh; this has been fantastic. And yeah, I fully agree. What I hope going forward is also that actuaries and data scientists and statisticians and economists can all start talking a little bit more and being a little bit more interdisciplinary, because actuaries, I think, are kind of closed off sometimes, not all of them, and you're obviously not one of those. But it can feel like data science versus actuaries at some companies, versus a little bit of how can we collaborate more. I think actuaries, having that knowledge and skill set and professional background, can really take a leadership role in integrating the new technologies, but how can they do it in a way that they look and see what all the new technologies are, how to integrate them, and kind of go interdisciplinary? And machine learning has the same issue: if it wasn't built here, you know, it's got to be built here first. Every field is kind of that way, and I think every field, like data science, needs to learn from actuaries how to be more disciplined and professional in machine learning as well; computer science needs to learn that it's not all about marketing. I think every field has a lot to learn from the others, and I just hope that, across disciplines, and I'm not trying to hit on any discipline specifically, all disciplines can start being a little bit more interdisciplinary in research. And that's even an academic problem: you're very siloed in papers, and if you try to go interdisciplinary, there aren't even any good journals. So I think, in general, as we work toward better and more fair modeling systems, that intersection and cross-pollination between economics, statistics, data science, and actuarial science is a great way forward, along with the governance structures you mentioned.
Joshua Pyle: Completely agreed. Yeah, these are great perspectives. Honestly, it's been a joy to talk with all of you from different disciplines. And yeah, happy to be here today.
Susan Peich: Likewise, Josh, it was a pleasure to have you with us. And for our listeners, if you have any questions about today's episode, please feel free to email us at the AI. We look forward to hearing from you.