The AI Fundamentalists

Why AI Fundamentals? | AI rigor in engineering | Generative AI isn't new | Data quality matters in machine learning

May 11, 2023 Susan Peich Season 1 Episode 1

The AI Fundamentalists - Ep1 

Summary

  • Welcome to the first episode. 0:03
    • Welcome to the first episode of the AI Fundamentalists podcast.
    • Introducing the hosts.
  • Introducing Sid and Andrew. 1:23
    • Introducing Andrew Clark, co-founder and CTO of Monitaur.
    • Introduction of the podcast topic.
  • What is the proper rigorous process for using AI in manufacturing? 3:44
    • Large language models and AI.
    • Rigorous systems for manufacturing and innovation.
  • Predictive maintenance as an example of manufacturing. 6:28
    • Predictive maintenance and simple models in manufacturing.
    • The Apollo program and Kalman filters.
  • The key things you can see when you’re new to running. 8:31
    • The importance of taking a step back.
    • Getting past the plateau in software engineering.
  • What’s the game changer in these generative models? 10:47
    • Can ChatGPT become a lawyer, doctor, or teacher?
    • The inflection point with generative models.
  • How can we put guardrails in place for these systems so they know when to not answer? 13:46
    • How to put guardrails in place for these systems.
    • The concept of multiple constraints.
  • Generative AI isn’t new, it’s embedded in our daily lives. 16:20
    • The scale is new; the technology is not.
    • Examples of generative AI.
  • The importance of data in machine learning. 19:01
    • The fundamental building blocks of machine learning.
    • AI is revolutionary, but it's been around for years.
  • What can AI learn from systems engineering? 20:59
    • The NASA Apollo program and systems engineering.
    • Systems engineering fundamentals: rigor, testing, and validating.
    • Understanding the why, data and holistic systems management.
    • The AI curmudgeons, the AI fundamentalists.



Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

  • LinkedIn - Episode summaries, shares of cited articles, and more.
  • YouTube - Was it something that we said? Good. Share your favorite quotes.
  • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
Susan Peich:

The AI Fundamentalists, a podcast about the fundamentals of safe and resilient modeling systems behind the AI that impacts our lives and our businesses. Here are your hosts, Andrew Clark and Sid Mangalik. Hi there, and welcome to our first episode of the AI Fundamentalists podcast. My name is Susan Peich, and for all intents and purposes of a first podcast, I'm here to introduce our hosts for the day. Today I'm here with Dr. Andrew Clark, the CTO of Monitaur and a research scholar in analytics and economics. I'm also here with Sid Mangalik, research scientist with Monitaur and a research scholar in natural language processing. Andrew, Sid, I'm so excited that you chose to do this. The knowledge you share with us about the value, risks, and genuinely innovative qualities of LLMs, we now get to share with everybody else, and it's so important right now. I think there's nothing left to do but dig in, so why don't we start with you, Sid?

Sid Mangalik:

Yeah, so I'm the research scientist over at Monitaur. I'm really deep into this NLP space, currently working through my PhD in the field, and I have a lot of interest in making strong, safe, and robust machine learning models. Our goal is to talk through how you do that, what that looks like, and maybe what that doesn't look like too. So I'll let Andrew introduce himself now.

Andrew Clark:

Thanks, Sid. My name is Andrew Clark. I'm co-founder and CTO of Monitaur, and one of the main reasons we started this company was to bring a holistic, first-principles, lifecycle-type approach to AI and ML. We're going to make sure this podcast is not about Monitaur whatsoever; we're just introducing ourselves since all three of us work for Monitaur. We want this to be a conversational podcast about things going on in the AI space, grounded in the viewpoint of the fundamentals of building modeling systems, how to do so responsibly, and how to do it the hard way. No matter what you do in life, and I'm a big runner, but whatever hobby you have or profession you're in, learning those first principles (it's a physics term: the simplest building blocks you can start from) is the key. There is no easy way to become an Olympic medalist; you have to do things the hard way, do things properly, and build in the proper order. We want to approach AI, and modeling in general, from that perspective, versus what often happens in the AI space today: people trying to invert that pyramid, asking "how can I start using ChatGPT tomorrow to revolutionize my business?" instead of understanding how we got here, what bases you need to have in place, what the fundamental building blocks are, and how you do any part of your business responsibly. That was the genesis of this podcast, and it will be one of the key themes that binds it together. We'll talk current events, and today we'll talk a little about LLMs, large language models, of which ChatGPT is the best-known example: what's going on there, and then introduce some of those first-principles building blocks that we'll keep talking through in subsequent episodes.

Sid Mangalik:

Yeah, that's great. I think these large language models, right, these ChatGPTs of the world, have made a big splash recently, and everyone's talking about how these AIs are going to change the world. But here on the other side of it, we're the engineers working on these systems, and we have to think about: how are these models going to be safe? How are these models going to work in ways that are predictable? And how can we create systems that are ultimately doing work for people, rather than models that just impress people? So it's definitely a great time to talk about this craze. I'm going to go through a NIST article I found here with you, Andrew, and I want to hear your thoughts on what rigorous systems look like; we can post the link somewhere. It was about manufacturing, and innovations in AI in manufacturing, and how they approached it in a very different way than we see from the OpenAIs of the world. They build from the business side, down to business tasks and business needs. So I want to hear some high-level thoughts: what does this proper, rigorous process look like, even in manufacturing, which is, you know, all about fault tolerance, basically?

Andrew Clark:

Great points. And I love that you brought this up; my first job out of college was actually at a manufacturing company, so I have a special place in my heart for manufacturing. We're going to say some heretical things on this podcast in the most loving way possible, but one of the key things about manufacturing and some of these other industries, which are sometimes slower to adopt new technologies, is that they look at it from the business perspective, like Sid said: how is this going to help me make more money, or make more widgets, or reduce waste, or be more efficient, or whatever business goals you have? Versus starting from "I found a cool new thing, how can I make it work?" A lot of times the AI space is dominated by computer scientists saying "this is cool, let's figure out how we can do this, let's add this new capability," while losing sight of the privacy constraints and the safety constraints. In manufacturing, they'll often look for the simplest model; they might not have the resources of a Facebook or a Google to run these crazy large algorithms. Sometimes those simple forecasting algorithms, those logistic regression models, those very simple modeling paradigms that aren't the new fancy, flashy things, really can make a big difference. Predictive maintenance is one example in manufacturing, and there are lots of places these technologies can be used. One of my favorite examples to go back to is the Apollo program, putting a man on the moon for the first time in the 1960s. In the United States, we didn't have the computing power we have now; I forget the exact number, but it was very, very small, less than a megabyte for the whole spacecraft. And yet they were using Kalman filters, a type of model for basically determining where you are in space, for very sophisticated, low-margin-of-error navigation, to make sure you don't burn up on re-entry. With hardly any processing power, these algorithms are pretty simplistic but can do major things. I think data science as a field has gotten lost in thinking it's discovering things for the first time, coming from a technology-first approach instead of using technology to solve a problem. So I really like what this example brings in: come from the business perspective, use the technology you need to solve a problem, and never lose sight that you're trying to solve a problem, not just apply something cool.
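For readers who want to see the idea in code: below is a minimal sketch of a one-dimensional Kalman filter in Python. All numbers are hypothetical, purely for illustration (Apollo's filter tracked a multi-dimensional state), but it shows how little computation the core idea needs: blend a prediction with a noisy measurement, weighting each by its uncertainty.

```python
import numpy as np

# Minimal 1-D Kalman filter sketch: estimate a constant true value from
# noisy measurements. All parameters are hypothetical, for illustration.
def kalman_1d(measurements, process_var=1e-4, meas_var=0.25):
    x, p = 0.0, 1.0               # initial estimate and its variance
    estimates = []
    for z in measurements:
        p += process_var          # predict: uncertainty grows between steps
        k = p / (p + meas_var)    # Kalman gain: how much to trust the new reading
        x += k * (z - x)          # update: blend prediction with measurement
        p *= (1 - k)              # blended estimate is less uncertain
        estimates.append(x)
    return estimates

rng = np.random.default_rng(42)
noisy = 10.0 + rng.normal(0, 0.5, size=50)  # sensor readings around a true 10.0
print(kalman_1d(noisy)[-1])                 # converges near 10.0
```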

Susan Peich:

I'm sorry, Sid, I want to bring together something you said, Andrew, at the beginning of the podcast with something you were explaining about your manufacturing job. There's something to elite training: you're going to push the limits and push the limits, but you're going to do it in a very calculated way. I almost equate that to what's happening now with AI and models and all the cool things people are trying to build; they're just trying to push the limits. Can you draw any more parallels there, between what an elite athlete goes through in training and what software developers might be doing to push the limits of these models?

Andrew Clark:

You can definitely keep going with that example. One of the key things you see is, say you're new to running: you're just picking it up and you're really excited about a new hobby. First, you're most likely going to get shin splints, because you start running too much and your body's not used to it; you have to take a step back and build up the volume. Another common mistake is going really high intensity right away. You'll burn yourself out, or you might see progress really fast, like "hey, I run three times a week, I run them all hard, and I'm making lots of progress," and then you hit a plateau. And then you're basically stuck there until you either get frustrated and quit, or you learn: hey, I've got to take a step back and learn the fundamentals of how to run properly, what good form is, and build that aerobic base so the mitochondria can work, a base you can build on by adding volume slowly, then slowly adding intensity and learning those skills. That's how you grow into doing it for a long period of time, the 10,000 hours, or ideally you've been running since you were five years old, which not everybody can do, but you can still get to a high level. In software engineering, I think, people quickly jump in and get some success, but unlike running, where it becomes very evident that you need to step back, learn the fundamentals, and build that base, software engineering often doesn't go back. There are a lot of people stuck at that plateau, and sometimes you don't know what you don't know; there's no objective ruler. If you go to a race and get beaten by a lot of people because you're running an 18-minute 5K while someone else comes in and runs 16, you'll know very quickly that you've got to fix something. It's a lot different in software: you can get to that plateau level and be in a decent position, but Facebook's technology is going to go way past you, and you're not necessarily going to know that; you'll just think, oh, they're a bigger company. So taking that step back is the whole genesis of this podcast: fundamentals. Helping the software and modeling community get past that plateau and come back to the fundamentals, so you can build that big, solid base, take on a higher volume of work, and use these technologies effectively to solve business problems.

Sid Mangalik:

That's absolutely right. And I'll contrast this directly, getting back to the LLMs, with what it looks like if you don't do that. When they build these large systems, they talk about how ChatGPT will become a lawyer, it'll become a doctor, it'll become a teacher. But it's not built with that intent in mind, right? It's not built on specific goals. There are some great repos on GitHub, one that's even just called chatgpt-failures, which show time and time again that, you know, we're comparing those Apollo systems, built on strong fundamentals and run on hardware less computationally powerful than your phone, against models that Microsoft and OpenAI spend millions of dollars every single day to run; that's the operating cost of these models. And then if you ask them, you know, 100,000 plus 50,000, it says 600,000. Because these models aren't built on the basis of strong fundamentals, they're not built on robustness, they hit the plateau very quickly. They got this really nice, fun, exciting result, but it doesn't meet business needs at the end of the day, right? You have this cool, flashy tool, but since it makes no promises of performance, honesty, or accuracy, it becomes a toy, almost.

Andrew Clark:

And what's so scary about these, and this is a fantastic point, Sid, is that previously it was a lot easier. Take forecasting models: say Walmart (and I don't know if they're doing this, I'm just making up an example for the sake of illustration) is forecasting Pop-Tarts for different stores at different SKU levels, and they need to know how much to stock. They're going to realize pretty fast when those models are wrong, and then they'll work on correcting them and getting them to a better spot. What's the game changer, the inflection point, with these generative models like ChatGPT is that, as Sid mentioned, they're going to give you a result regardless; they're going to hallucinate, as I believe the technical term is. They're going to make up a result, and it's going to look realistic. If it's acting as that lawyer, or a doctor, then unless you're really a deep subject matter expert, you're not going to know that it's wrong. And that's what's really scary about these things, and why you have to take a step back now. If students are starting to use these for essays, or companies are starting to use them, forget about the data privacy and all those issues for the moment; there's the basic problem that you don't know what's true. There's a bunch of talk on Twitter and other places about fake news, and now that's gotten exponentially worse, because there is no ground truth. There is a concept you can use in these ChatGPT-type systems, which we'll talk about in future episodes, called abstention, I've heard: it's how you train these systems to not give an answer when they aren't confident of one. Versus right now, as Sid was saying, if it doesn't know the math (and ChatGPT-type solutions aren't very good at math at the moment), it's going to make something up. And if you don't know any better, you might say, great, that is 600,000. So how can we put those guardrails in place for these systems, so they know when not to answer?
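As a sketch of that abstention idea, assuming the simplest possible guardrail, a confidence threshold over the model's output probabilities (real systems need calibrated probabilities and more careful training than this):

```python
import numpy as np

# Hypothetical abstention guardrail: only answer when the model's top
# predicted probability clears a threshold; otherwise decline to answer.
def answer_or_abstain(probs, labels, threshold=0.9):
    probs = np.asarray(probs)
    best = probs.argmax()
    if probs[best] < threshold:
        return "I don't know."     # abstain instead of guessing
    return labels[best]

print(answer_or_abstain([0.50, 0.30, 0.20], ["A", "B", "C"]))  # I don't know.
print(answer_or_abstain([0.97, 0.02, 0.01], ["A", "B", "C"]))  # A
```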

Sid Mangalik:

Yeah, that's absolutely right. And we have to remember that machine learning, as a paradigm, optimizes on the goal that we set. The goal that we set for these generative models, if you look at the few reports and publications we've been given on InstructGPT, is to create language that humans like; they literally hired thousands of people just to say which of these responses they liked best. The model is not rewarded for giving more accurate answers, or for readily saying "oh, I don't know the answer to this," like the abstention model. These models are incentivized on those goals. So this disconnect in how we train the models is what gets us these problems: when you don't build from the fundamentals of the goal you actually want, you just end up creating a human mimicker, a people-pleaser, essentially.
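To make that concrete: the reward model behind InstructGPT is described as being trained on pairwise human preferences, so the loss rewards scoring the preferred response above the rejected one, and says nothing about factual accuracy. A rough sketch:

```python
import numpy as np

# Pairwise preference loss (Bradley-Terry style, as described for
# InstructGPT's reward model): minimize -log(sigmoid(r_chosen - r_rejected)).
# Note there is no term anywhere for whether the answer is actually true.
def preference_loss(r_chosen, r_rejected):
    return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected))))

print(preference_loss(2.0, -1.0))  # ~0.05: ranking matches human preference
print(preference_loss(-1.0, 2.0))  # ~3.05: ranking violates human preference
```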

Andrew Clark:

And something we can dig into in future episodes, which Sid and I are very passionate about and have been talking about and researching for years, is the concept of multiple constraints. When you're training your model, you can optimize it over multiple things. It's not only accuracy, or only sounding like human speech; you can optimize for several objectives at once. That makes it a lot more complex, but if you don't understand the fundamentals of how these things work, how the math works, and what you're trying to accomplish from the business side, you won't know to do those things. The actual multi-objective modeling is not going to be as complicated as building the neural network in the first place. But if you don't understand the fundamentals and you're just pip installing PyTorch and rocking and rolling from there off a pre-built model (a lot of companies aren't building these foundation models; they're leveraging someone else's work and putting something on top of it), you're not going to know that. So hopefully this is helping to illustrate some of these risks, these second-order effects, if you will, that aren't readily apparent. As part of this podcast, we're going to take what the news world is buzzing about, ask what's true, what's the signal versus the noise, and try to distill those aspects for you and show the unsexy, hard way, but the way you can actually build these systems responsibly to solve business goals. Because that's the goal at the end of the day, unless you're just taking a Coursera course.
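A toy sketch of the multiple-constraints idea: scalarize competing objectives into one weighted loss and optimize that. The two objectives below are hypothetical stand-ins for, say, "sound fluent" and "be accurate":

```python
# Toy multi-objective optimization: minimize a weighted sum of two
# competing objectives with gradient descent. Objectives and weights
# are hypothetical, purely for illustration.
def fluency_loss(x):  return (x - 3.0) ** 2   # minimized at x = 3
def accuracy_loss(x): return (x + 1.0) ** 2   # minimized at x = -1

def grad(x, w_flu=0.4, w_acc=0.6):
    return w_flu * 2 * (x - 3.0) + w_acc * 2 * (x + 1.0)

x = 0.0
for _ in range(500):
    x -= 0.05 * grad(x)
print(round(x, 3))  # 0.6: the weighted compromise, 0.4*3 + 0.6*(-1)
```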

Susan Peich:

So, I began studying this and looking at it because I was fascinated by what was possible, and I wanted to get your opinions on generative AI. It's not new; it's been embedded in our daily lives for a while. What are the examples that stick out to you?

Sid Mangalik:

Yeah, I mean, it's a really old field when you think about it. You can go back to the early days of computers, to ELIZA, where we were creating these essentially AI-like solutions. But those were essentially just text rearrangers: they see that you ask a question and spit it back out at you as a different question. Ultimately, at the end of the day, these models aren't doing anything magical. There's this notion that this is a turning point, that the models are doing something fundamentally new that they've never done before. But really, they've just now seen almost all the language written by humans on the internet. So what we're seeing is a scale which is new, not a technology which is new. And it can feel like this is a shortcut to the problem, right? If we just show it all the human language in the world, it will become some general artificial intelligence that can solve any problem, and we won't have to deal with any of the parts in between; we'll just say, "oh, ChatGPT, please give me an answer which is fully honest and unbiased," which is the hope. But that's not ultimately how we can expect these things to work, since they're just human-speech imitators at the end of the day. They're next-word prediction systems. There's nothing going on under the hood that implies sentience, or magic, or consciousness. So we have to ground ourselves a little bit in what we're actually trying to accomplish.
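To ground the "next-word prediction" point, here is the idea at its absolute simplest, a bigram model built from a toy corpus; large language models do the same job at incomparably greater scale:

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word most often follows each word,
# then predict that. No understanding involved, just frequency.
corpus = "the cat sat on the mat and the cat ate the fish".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    return following[word].most_common(1)[0][0] if word in following else None

print(predict_next("the"))  # cat (the most frequent continuation)
```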

Andrew Clark:

Yeah, and as sophisticated as these algorithms are, it's still matrix multiplication, right? They're using vectors, but it's still just multiplication producing the results. And if you know what the use cases are and you have the domain expertise to use them: Grammarly is an example. I loved Grammarly in grad school; it helped edit my papers. But I was the one responsible for turning the paper in, and sometimes I wouldn't accept a recommendation because it actually changed the meaning of a phrase. So you have to have that domain knowledge; we can't turn our brains off. These can be accelerators in their best case. But it really comes back to one of those fundamental building blocks we'll devote a whole episode to, which is the data, and which we've kind of strayed away from. There are fields like statistics and the actuarial sciences that have really held on to this, for good reason, but focusing on data and data quality has fallen out of fashion. Correlation does not equal causality, but in machine learning as a field (and this is one of the reasons it's not always great to have computer scientists running the show from a high-level perspective), you can find correlations in any sort of data. Is the data you're putting into the system, if you're training off the whole web, really good? Or should you have a smaller set of curated data? That's where statistics comes back into play: understanding your data and making sure it's representative of what you want to accomplish is an absolute key part. At a certain point, that matters more than pure volume.
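A quick illustration of why that matters at web scale: with enough features of pure noise, some will correlate with any target by chance alone, which is exactly the trap that curated data and statistical thinking guard against. The numbers below are illustrative:

```python
import numpy as np

# 1,000 features of pure random noise vs. a random target: the best
# spurious correlation still looks "predictive" if you go hunting for it.
rng = np.random.default_rng(0)
target = rng.normal(size=50)
features = rng.normal(size=(1000, 50))

best = max(abs(np.corrcoef(f, target)[0, 1]) for f in features)
print(best)  # typically > 0.4 despite there being no real relationship
```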

Sid Mangalik:

Yeah, absolutely. And we're going to talk about this a lot: it's process, process, process. It's thinking through from the business intent to the data to the modeling, having everything documented, available, and understood by everyone, and fact-checked at every step. That diligence is the key to creating these safe and robust models.

Susan Peich:

That point you made about next-word prediction is interesting. Are there other systems that might be doing this?

Andrew Clark:

Another example: if anybody has an iPhone, it's always spell-checking and suggesting the next word, right? That's another practical application, and we've had it for years, so a lot of this is more embedded than people realize. I was reading a recent article from the Wall Street Journal (I can try to find it for the show notes) making the point that, yes, AI is revolutionary, but it's been revolutionary for years. It's not that this is some major change right now so much as the accumulation of progress we've already made. But we'll definitely be digging into these considerations, and how you build these systems properly, in future podcasts.

Susan Peich:

For sure. And are there any approaches that stand out to you that we might talk about today or in future podcasts?

Andrew Clark:

One way that Sid and I both like to look at it, and we mentioned it briefly with the NASA Apollo program, is systems engineering. Systems engineering is an engineering discipline that looks at the holistic understanding of systems: how the different pieces fit together, and how you define each of the individual steps. It's a pretty time-consuming process, but you make sure all of these complex systems work together well. There's also a rich literature in systems thinking and related disciplines that have been around a long time. One of the key things we often see in the computer science world, as with these generative AI models, is the assumption that these are new things. How we're applying the concepts might be new, and really the data and processing power are what's new, but a lot of these concepts exist in other fields and have for a long time: complex systems theory, the several different levels of micro and macro flows. I come from economics; there are a lot of different ways we'll look at those things. Even how you validate systems, how you think about segregation of duties, making sure you have independent review and diversity of thought, how you evaluate these models and the composition of teams: all of this dates back to the model risk management that banks have been using for a long time. There's a lot of emphasis on inventing new things, when really we might just be applying existing things, like Lego bricks, in a new way.

Sid Mangalik:

Yeah, that's right. So just to close this out: none of this is new, but we're finally integrating a framework we've had for a long time. When you make a black box for an airplane and you want to send it into the air, you need a lot more assurance that it's going to work in a specific way, in specific outlier and edge-case situations, in certain black swan scenarios. Integrating this systems engineering fundamentals world, which is all about rigor and testing and validating, into a world that is mostly concerned with AUC and F1 and accuracy is going to be the shift that makes these models usable, deployable, and trustworthy in the real world.
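A small synthetic example of that gap: a model can post a strong headline accuracy while failing completely on an edge-case region that a systems-engineering style stress test would be designed to catch. Everything below is made up for illustration:

```python
import numpy as np

# Synthetic task: the true class is 1 when x > 0. The "model" uses a
# slightly wrong threshold, which barely dents overall accuracy...
rng = np.random.default_rng(1)
X = rng.normal(0, 1, size=1000)
y = (X > 0).astype(int)
predict = lambda x: (x > 0.05).astype(int)

print((predict(X) == y).mean())   # ~0.98: headline accuracy looks great

# ...but a targeted edge-case test near the decision boundary exposes it.
X_edge = rng.uniform(0.0, 0.05, size=1000)   # all truly class 1
print((predict(X_edge) == 1).mean())         # 0.0: total failure here
```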

Andrew Clark:

Great point, Sid. And we'll be taking inspiration from all these different fields: control theory, statistics, computer science (which, of course, will get a lot of credit as well), systems engineering, behavioral economics. We'll try to mash them together and show the genesis of these different techniques and why we do the things we do in statistics. Then we'll take apart the discrete components, like understanding the why, which might be our next podcast, then understanding the data, and what holistic systems management means. We will break apart these components, and that will be the main thread of this podcast. And when major things happen in the space, we'll also, of course, talk about them and help you separate the signal from the noise on new technologies.

Susan Peich:

Great. Well, this has been just refreshing, hearing the level of detail in your understanding and what you want to go through. I'm excited for future episodes. Any final thoughts before we close?

Andrew Clark:

Well, thanks for listening, everyone. We definitely want your feedback: what can we do better, and what do you want to hear about? We're going to kind of be the AI curmudgeons, if you will. We're all for technology-driven innovation; we've dedicated our lives to this, right? But we also want to see things being done responsibly, and we want to make sure everyone is not getting fooled by flashy marketing but understands what is fundamentally there. We'll help peel back those layers: how do you fundamentally build good systems? How does that work? How do you think about modeling holistically? We'll even talk about what AI actually is. Everybody says it, but what does it mean? We'll really pick apart those components. And as we go down this journey together, please send any feedback on what you'd like to hear next, and the three of us will try to figure out the most logical order. I mean, we have several years' worth of podcasts we could do here on breaking down these fundamentals, but what's the most impactful, and in what order? So please give us your feedback and help guide the journey.

Susan Peich:

All right. Well, thank you, Sid; thank you, Andrew. This has been The AI Fundamentalists. Please let us know, in the comments and through your favorite podcast platforms, what you want to hear, and give us your feedback. Signing off for now; see you soon.
