Episode 89
Agentic AI
January 7th, 2026
48 mins 22 secs
About this Episode
The episode opens with host David Brady introducing a panel to talk about recent advances in AI, kicking off with “story time” from Mike. Mike describes how massive investment has accelerated progress and uses a hotel analogy to explain the shift from traditional AI tools (you ask for a specific thing and it does exactly that) to agentic AI (you describe a goal like “I’m cold,” and the system takes multiple independent actions to solve it). The panel frames this as a major interface change: instead of issuing step-by-step commands, you collaborate with a tool that can plan, execute, and iterate—powerful, but also riskier if it takes the wrong initiative.
They then ground the idea in practical software work. David describes using an AI agent to scan a large, messy, decade-old Rails codebase for dead or “zombie” code—surfacing unused files, routes, and even database tables with no activity in years—while also noting how the agent can misunderstand intent (e.g., trying to “fix” missing controllers instead of removing obsolete routes). Justin and Matt extend this into security and ops: combining logs (like Datadog/WAF), an OpenAPI spec, and code access—potentially via MCP (Model Context Protocol)—to identify unused APIs and shrink attack surface. A recurring theme is that agents excel at tedious grunt work (grep-style hunting, bash plumbing, awk/sed, git forensics), but they still require review, guardrails, and clear instructions.
The conversation widens into “AI fluency” and human factors: prompt skill matters, “prompt engineer” is treated as a real craft, and vague requests can cause agents to take unhelpful liberties. They discuss personality differences among models—sycophancy and overly affirming behavior versus more nuanced ethical reasoning—and how that can affect users, sometimes dangerously. The panel debates whether software creation will move toward natural language: some argue English is too ambiguous for precise specs (hence lawyering), while others think we’ll keep needing discipline and precision even if interfaces get friendlier. They close by flagging major risks—unattended agents with broad permissions, security exposure, and IP leakage—and tease that AI security and governance deserves a full follow-up episode.
Transcript:
DAVID: Hello and welcome to the Acima Developer Podcast. I'm your host today, David Brady. And we have got a fun panel. And we're going to talk about advances in AI today. Today we've got…on the panel, we've got Kyle Archer; we've got Mike Challis; we've got Eddy, who's down in Mexico now. That's awesome. We've got an AI bot who I'm pretty sure is our coworker, Justin. You're elsewhere now, aren't you?
JUSTIN: Yes.
DAVID: Yeah, awesome. Well, I mean, it's terrible for us [laughter]. We've got Will Archer. We've got Van…well, you go by Thomas, don't you? Wilcox and Matt Hardy. And this is going to be a good, good show.
We always start with story time with Uncle Mike, and I'm not going to break that trend. It's great because Mike did not say in the pre-call that he had a story ready. I'm just putting him on the spot.
MIKE: Well, I've been grappling with how to think about or how to express the changes that have happened in AI over the last few months. And if you put, you know, like, hundreds of billions of dollars into something, it's going to tend to move, and that's happened.
DAVID: Something will happen.
MIKE: There have been amazing, amazing levels of money, like, shocking levels of investment in AI. And I'm sure not all of it will pan out, and we'll probably touch on that a little bit, but some things already have. And there are new ways of doing things that didn't exist, like, a year ago, in, you know, any meaningful commercial format. And one of these is this agentic approach to AI. And I've been trying to think about how to express this.
If you're like me, you've been to a hotel. And if you have kids and you go to put a bed on the…sorry, some covers on the fold-out bed out of the couch, and you're like, oh, wait, there is no blanket here. I'm not going to have my kids sleep on the springs. And so, you know, you call into the desk and say, "Hey, can we please have a blanket?" Or you walk down there and ask for a blanket. And they'll bring it to you, right? And they'll bring it to you. It's part of the service, and it's covered. But it's very much, I am going to ask you to do this, and you will do it for me.
And that's how AI tools have been up until fairly recently. But there's been a change. Now they've got these agents, and so it's more like you call in and say, "I'm cold." And they say, "Okay," and a few minutes…well, maybe actually more like an hour later. It takes longer [laughs] [inaudible 02:43]. You know, they show up with, like, an electric blanket and a comforter. And they go over, and they raise the temperature in your room, and, like, “Oh, this is how you use the thermostat,” because it is taking actions independent of what you asked it to do.
You express the parameters of what you'd like to have happen, and you're allowing the agent to take action on your behalf. Now, it could be you say, "I'm cold," and they send up, like, a therapist who says, "So, we hear that you're having some emotional distress. What can I do for you?" right [chuckles]? Or they turn up your thermostat to 100 degrees, and maybe they have to send up a paramedic. You know, there are risks here that you didn't have before, but there are also benefits.
And, hopefully, you're going to have the foresight to be clear enough to try to express, I am cold because the temperature in my room is low, and I don't have a blanket. Can you help me out? And then you get some of the things that you need. You notice that there's some prompt engineering there, where you're trying to express your needs clearly, much like we've always done in programming, in software, where we have to be explicit because there's ambiguity in language. But you get different results than you got otherwise, and that's introduced a whole new set of things, this agentic AI. And I think that that's a lot of what we might end up talking about today.
DAVID: I absolutely love that. We had a brief chat before about, like, the things that we are getting into. And my favorite take…I remember trolling YouTube or scrolling YouTube when the agentic thing first started hitting. Like, all the viral, you know, content generators out there were starting to talk about it. And they were talking about it back in GPT 1 or 2 era. And they were like, "You can do this with any LLM, and you do it like this." And the lady who was talking about this literally said, "I'm going to give you this task, but I want you to approach this from this stance." And then prompted it again and said, "I want you to take a skeptical stance to this person's argument. Okay, that's fine."
But then she ran it again and said, "Back to you, first person. Address the concerns." And, obviously, if one or both of them hallucinate, it's just going to go off the rails even faster. But there's something about the way agents work that kind of keep themselves on task a little bit. And so, you end up with the ability to artificially expand the context or the thinking of the AI by just chunking it over time, right? I'm going to make you think on this part of the problem, then this part of the problem, then this part of the problem. And now, like you're saying, Mike, the agentic stuff, where you fire up Claude Code or Copilot, and you just say, "Go work on this," and it just keeps identifying tasks and working them and identifying tasks and working them.
MIKE: So, what kind of tasks?
DAVID: Right? So, actually, that's a good point. I just realized we talked about this in the pre-call. It's new to our listeners. One of the things that I'm working on right now is scanning for dead code in the codebase. So, we've got a very large codebase. It's very complicated, and it's a 10-year-old codebase. It's what programmers do. We create these things.
And taking AI and sending it in there to go look for code, and it's literally coming back…It's coming back with files and saying, "I don't think anybody talks to this. There's a database table connected to it, and I don't think anybody's writing to it." And you go looking, and it's like, oh yeah, the last insertion there was in 2019. And then you start going, wait a minute, I remember this initiative. I remember working on removing this initiative. This absolutely is a file that got missed.
And then, of course, AI being dumb, of course, it then says things like, "Well, you left this route in, so you need to go write the controller." And I'm like, no, no, we removed the controller. Try to catch up. We want to remove the route, not the other way around.
MIKE: So, you're expressing a problem that we all have. My codebase isn't as clean as I would like. And, specifically, I want to find code that is not being used right now. Help me out [laughs]. And this agent will take initiative and go and identify those pieces and inform you about them.
DAVID: And think about the individual tasks involved, right? Like, how do you tell if this file is unused or this symbol is unused, right? Well, we're going to scan the codebase. We might, if it's Ruby, we might look around for sends and evals that have that string near it. There's [inaudible 07:07] different ways, right? It's like, if this symbol is here, do we have any of the methods getting called?
And it's more than just grepping, but even if it is just grepping, you have to teach an AI, this is how you go grep this codebase, right? We all tell Copilot to do a thing. And it says, "Well, first, I'm going to do, you know, I'm going to look at the first 20 lines of the code files in this directory." And it's just simple, little bash stuff that you and I could do at a terminal. That's literally what AI is trying to replace: the mundane grunt work.
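To make that grunt work concrete, here is a rough sketch of the grep-style hunt an agent might run over a Rails app to check whether a given method is still referenced anywhere. The script, its paths, and its output are illustrative only, not something from the episode; as the panel notes, dynamic dispatch through send or eval means a clean result is a lead to investigate, not proof the code is dead.

    # dead_code_hunt.rb -- illustrative sketch: is this method referenced anywhere?
    method_name = ARGV.fetch(0)   # e.g. ruby dead_code_hunt.rb legacy_discount

    direct_refs   = []
    dynamic_sites = []

    Dir.glob("{app,lib}/**/*.rb").each do |path|
      File.foreach(path).with_index(1) do |line, lineno|
        next if line.strip.start_with?("#")   # skip comments
        # A textual reference anywhere other than the definition itself.
        if line.include?(method_name) && !line.match?(/\bdef\s+#{Regexp.escape(method_name)}\b/)
          direct_refs << "#{path}:#{lineno}"
        end
        # send/eval can build method names from strings, so flag those lines for human review.
        dynamic_sites << "#{path}:#{lineno}" if line.match?(/\b(send|public_send|eval)\b/)
      end
    end

    puts "Direct references to #{method_name}: #{direct_refs.size}"
    direct_refs.each { |loc| puts "  #{loc}" }
    puts "Dynamic call sites worth reviewing: #{dynamic_sites.size}"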
JUSTIN: Yeah. So, I want to dive in here from a security point of view. And this is a dream of mine that I haven't done yet. But I want to feed an AI agent my Datadog logs and also the codebase and say, “Hey, given these logs that show exactly which API calls are happening,” and those logs could be sourced from, you know, the WAF; it could be sourced from any number of different places, “tell me which APIs are never being used.” And I want to eliminate anything that's not being used, or at least turn it off somehow, because the smaller the attack surface, the more secure you are.
And so, if you have old APIs lying around that have never been decommissioned, those things are just begging to be exploited somehow. But if you can have an AI come in and say, "Hey, I've looked at this. Here's a list of APIs that look like they're never being called in the last six months of, you know, logs. You should consider, you know, removing these," that would be really valuable to me.
DAVID: Getting into the agentic stuff, one of the first things, like, I started with a high-level task saying, "Hey, how do I reduce tech debt in my codebase?" And, like, scanning the codebase with an agent was, like, two or three of the answers. But, like, because it's Ruby on Rails, the first thing it came back and said, "You should install Coverband." I'm not advocating for Coverband. It was just one example.
But if you've run, like, RSpec or Minitest, and you've used a thing like SimpleCov, which is just a coverage generator that runs while you're running your test, and it slows your test down, so you don't want to run it in prod, Coverband is a very fast code coverage tool that is meant to run in production. And you just throw it out on your server, leave it there for a month, and then come back and collect your logs. And you get an idea of, you know, what never got called, what got missed. If you do it over a year, then you start to see, oh, this only gets called on Black Friday, or this only gets called around, you know, the Easter holidays, because it's that special initiative, or whatever.
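For anyone who has not used it, the setup being described is small. Here is a minimal sketch in the spirit of Coverband's documented usage; the exact configuration API varies by version, so check the gem's README before copying anything.

    # Gemfile
    gem "coverband"

    # config/coverband.rb -- production-safe coverage collection, stored in Redis
    Coverband.configure do |config|
      config.store  = Coverband::Adapters::RedisStore.new(Redis.new(url: ENV["REDIS_URL"]))
      config.ignore = %w[config/ spec/ vendor/]
    end

    # config/routes.rb -- a web report of what has (and has not) run in production
    mount Coverband::Reporters::Web.new, at: "/coverage"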
MATT: Back to what you were just talking about, Justin, that's something that, yes, you can use an agent for, absolutely, such as a GitHub Copilot or, you know, any other agent that has access to read your code. But what's really good at that are MCP servers, and for those not familiar, that's Model Context Protocol. And you just set your data sources as your Datadog logs. And then you can give it, if you have a well-documented API, you can give it an OpenAPI 3 document, and then also give it access to your codebase. And with those three, it absolutely and very easily could identify all of those APIs that aren't being used and find those gaps.
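Stripped of the MCP plumbing, the comparison itself is small. A hedged sketch: assume the traffic side has been exported to a plain text file of request paths (from Datadog, the WAF, or anywhere else), and that openapi.yaml is the OpenAPI 3 document. Both file names and the export format are assumptions made for illustration.

    # unused_apis.rb -- which documented endpoints never show up in real traffic?
    require "yaml"
    require "set"

    # Every path the service claims to expose, per the OpenAPI 3 spec.
    documented = YAML.load_file("openapi.yaml").fetch("paths").keys

    # Paths actually observed in the logs, one per line, however you exported them.
    observed = File.readlines("observed_paths.txt", chomp: true).to_set

    never_called = documented.reject do |path|
      # Treat templated segments like /users/{id} as wildcards when matching logged paths.
      pattern = Regexp.new("\\A" + path.gsub(/\{[^}]+\}/, "[^/]+") + "\\z")
      observed.any? { |seen| pattern.match?(seen) }
    end

    puts "Documented endpoints with no observed traffic:"
    never_called.each { |path| puts "  #{path}" }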
MIKE: What I was going to say is, I was just going to point out the world we're talking about here compared to what we might have seen a couple of years ago. Historically, you'd have had to train a security-specific model [chuckles] to be able to identify these things, but that's not what we're talking about at all. We're talking about a model that understands language and is able to take its own actions to go and figure out the pieces and put them all together. That is a significant step forward that really changes the interface.
And I don't think that we've fully realized all the benefits of that, because it's, you know, maybe not as revolutionary as the LLM has been generally. But it is a significant step forward that changes that interaction in important ways that allow us to accomplish a lot more. It allows that to be more of an intuitive interface, like working with another human, with a partner, rather than with what you think of as a robot: "Do what I said." Instead, it's, "Will you work with me?"
MATT: 100%. Natural language processing has come a long way. If you're in your codebase and have an agent set up on Copilot, you can simply tell it, "Go find all the zombie code in this application," and it knows exactly what you're referring to and will do it. And, you know, a few years back, you had to be really good at awk or sed to come even close to that, and you're not going to get there. So, while, you know, these agents still take advantage of that, they'll run awk and sed commands in your terminal and shell to try and hunt this stuff down. But just the amount of understanding that these LLMs have with natural language processing is pretty miraculous, really.
WILL: I mean, those agents are great at writing awk and sed. Like, I'm not very good at it, but those agents are whizzes. And it's really funny because, like, I mean, in all honesty, like, I've always been bad at awk and sed, because, like, they're really simple tools, and they do a simple job in a simple and understandable way. But the syntax and the format is all goofy. And it's hard to translate from, like, okay, this is what I want, to, like, awk-and-sed-ese. And I only use them once a year.
But, man, those LLMs…my secret shame, and I guess it's not a secret anymore, like, my longstanding shame is my, like, I'm real bad at writing shell scripts. Like, I'll just do it in Ruby, because, like, you know, everybody's got Ruby, and Ruby's, like, a decent language, as opposed to, like, shell scripting. But my shell scripts are all over the place. And, like, LLMs have just, they've saved my bacon so many times, like, in my ability to write useful, quick, little shell scripts, you know, to, like, just do plumbing, you know.
Like, I had a big issue with Git metadata for one of our repos. It was breaking a bunch of offshore…they're network-constrained, and so we had all these repos that we were trying to pull down. And, like, the Git objects had gotten huge, right? Just because, like, just a lot of people doing work in them. Man, those agents, like, just, like, knocked it out. It's like, here's how you go through the Git metadata to find your large objects and then pass them through. Now, obviously, I had to stage-manage it, but that would have taken me all day, man. Holy cow.
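The Git forensics Will describes comes down to two plumbing commands: git rev-list --objects --all to enumerate every reachable object along with the path it was seen at, and git cat-file --batch-check to report each object's type and size. Here is a rough sketch of gluing the two together from Ruby; it is illustrative only, not what the agent actually produced for him.

    # largest_git_blobs.rb -- find the biggest objects hiding in a repository's history
    require "open3"

    # Every reachable object, as "<sha> <path>" lines.
    objects, = Open3.capture3("git rev-list --objects --all")

    # For each object: its type, name, size, and the rest of the input line (the path).
    sizes, = Open3.capture3(
      "git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'",
      stdin_data: objects
    )

    blobs = sizes.lines.map(&:split)
                 .select { |type, *| type == "blob" }
                 .map { |_type, sha, size, *path| [size.to_i, sha, path.join(" ")] }

    # Print the twenty largest blobs, biggest first.
    blobs.sort_by(&:first).last(20).reverse_each do |size, sha, path|
      puts format("%10d bytes  %s  %s", size, sha[0, 12], path)
    end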
DAVID: Yeah [crosstalk 14:19]
MATT: Like with awk and sed, it's like regex, right? I hate regex because I don't write them all that often, so remembering all of their rules and just keeping that occupying space in your brain isn't worth it. So, if you have a tool like that that just knows it, because it has access to the internet and millions of codebases out there, especially your Copilot, right? It's just instantaneous and such a time-saver.
DAVID: It's almost my bash prompt now that I…GH Copilot Suggest is my friend. And it'll be just like, GitHub Copilot Suggest, give me the Git command to find, you know, the first commit from three weeks ago by, you know, Marcos, or, you know, or the sed command. I ran it through. I just wanted just the simple thing of, I had a string of words with spaces between them, and I wanted them kicked out one per line so that I could send them off to a script. And I'm like, do I load it to Ruby? Do I break it on…da da da.
And it's just like, tr has been around since 1978. Why don't you use that? The translate…literally, it's just a regex for one character at a time, and it's still in there. You can replace a space with a new line. And I'm like, today I learned. That is now in my mental toolkit from about three weeks ago. Love it.
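For the curious, the one-liner being described is roughly echo "one two three" | tr ' ' '\n'. Ruby happens to ship the same character-for-character translation as String#tr, which is an easy way to remember what the Unix tool does:

    # tr maps single characters one-for-one; here every space becomes a newline.
    "one two three".tr(" ", "\n")   # => "one\ntwo\nthree"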
MATT: Yeah. We're talking about how much it can do and how easily it can do it. But there's also a flip side to that coin, and that is, if you don't know how to talk to it, it can really mess things up for you. And as I am, you know, in the company, I'm trying to advocate for using these tools and productivity with them, but one of the things I am constantly preaching is you need to learn how to talk to it. And you need to make sure that you understand what you're asking it to do because it will take a lot of liberties if you're not providing the right prompts for it.
DAVID: It's the AB problem, right? It's like, if A doesn't equal B, the AI knows that A needs to equal A and B needs to equal B, but it doesn't know if A is wrong or B is wrong. And so, like, when I'm cleaning out dead code, it finds this dead route. And it's like, well, let's go create the controller, mm-mm, it's the other way around. We want B, not A. Yep.
MATT: Yeah. So, you really, I mean, you know, just like we do every day, you need to do a review on its changes in code as well and make it thorough. And as you get a little better at communicating with it, you can put a little more trust in it. And there are ways to configure it to ensure it's meeting some of your rules, you know, like a Copilot instructions file, right? Establish your patterns and rules in there, your linters, those types of things. But you really have to be mindful of how you're using it.
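As a point of reference, the instructions file Matt mentions is just a Markdown file checked into the repository; for GitHub Copilot it typically lives at .github/copilot-instructions.md, though you should confirm the exact path for your editor in the Copilot docs. The rules below are a made-up example of the kind of guardrails a team might state, including one aimed at the zombie-route mix-up from earlier:

    - This is a long-lived Rails monolith; follow the existing patterns in app/services for new business logic.
    - Run RuboCop against the repo's .rubocop.yml before proposing a change.
    - Never hand-edit db/schema.rb; generate a migration instead.
    - When a route points at a controller that no longer exists, propose deleting the route rather than recreating the controller.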
DAVID: Absolutely. There's a really good class over on Anthropic. It's free. It's called AI Fluency. It's like a 30-minute class. And I took it, like, last week, and I'm a little embarrassed to say there were some big things in there that I didn't even know existed. I'm very much a jump in and try it kind of person, right? And I'm like, oh, that's a thing.
And so, yeah, they literally get into arguing about, like, this is how I want it done, or, no, these are the performance characteristics that I want. Like, how do you tune that? Like, what's the context, or the…how do I control whether you're doing the right thing? How do I discern if you're doing what's right? And it's written for laypeople. And it's basically just a way of saying, this is not a person, and it will fool you into thinking it is, and you will suffer as a result, because this is a math problem. This is a calculator, and you have to know how it works.
WILL: I've gotten great traction. I mean, just because, like, you know, like, I usually, like, my primary use case is for things that I know can be done, and I know how they can be done, but I don't know exactly how to say it, right, because I just jump around, like, you know, like, wherever I need to be. And so, one of the things I've been using it for is, like, hey, that thing you did, how does that work, right? It's wonderful. It's wonderful for that.
DAVID: I’m going to say something crazy, and this might be heretical, and I might be wrong. So, I am not staking a position here. I'm asking you guys to check me a little bit. Have you noticed a personality difference between the different AIs?
MIKE: I haven't personally done it, but I've actually read about it. And there are potential legal actions being taken against the companies because of it.
DAVID: Really?
MIKE: Because OpenAI made their ChatGPT more sycophantic, you know, like, oh, you are the greatest; you're the best ever. And it has led some people who had some vulnerability to mental instability down some dark places with really bad results.
DAVID: There was a case I heard. There were a few people that were asking it, “Hey, I've been on my medication for a while, and I'm really feeling great. Can I quit taking it?”
MIKE: Exactly.
DAVID: And GPT was like, sure, I believe in you. You're like, mm-mm, no. What's the type of pessimist here, please --
MIKE: Exactly.
MATT: I wasn't aware of those lawsuits. I am absolutely aware of the personality differences between these LLMs. And ChatGPT, specifically, has become my wife's best friend. She has an AI installed on her phone that she's given a name and personality, and she talks to it constantly. And I have noticed some of those things with its advice. And, obviously, she's not taking it too seriously. But I could see how people would do that, and it could cause some serious issues.
DAVID: Have you considered a prompt injection attack where it's like, every week or so, it just says, "You should give your husband a present"?
MATT: That is not a bad idea.
DAVID: Right? That's an evil idea, but it's not a bad one. It's a good one. Yep. Yep.
MATT: Insert some middleware in our router and --
DAVID: Father's Day is coming up.
MATT: Yes, that is a great idea, Dave.
DAVID: I can have a birthday every month [chuckles]. I like that.
MIKE: You know, I said that I hadn't personally seen it, but actually I did this morning. This morning I noticed it. So, before work…also in the pre-call, we talked about how cold it is in the Midwest right now, in the upper Midwest. It's cold. I got an indoor trainer. So, you, like, put the back part of your bike on something that just slows you down, so you’re riding in place.
And so, I went on a ride this morning. I just got it, like, a week ago. Like, yeah, I'm being super enthusiastic [chuckles] because I got a new toy. So, I rode that this morning before work. And it reported to Strava, which is the popular app people use to upload their things to and track it, right? And they added a feature a few months ago, where it has some AI that'll tell you how you did.
And this morning, it was sucking up to me so hard, like, that was a killer ride, and this is why. I’m like, no, I rode an indoor bike before work. It's really not that great. It was so much that I actually showed it to my wife and said, "Look at this. Look at this. This is creepy," because [chuckles] it's, you know, it's unpleasant. It's trying to be so positive that it's become negative.
DAVID: You feel like you're being managed.
MIKE: Yes. Yeah, that's right.
WILL: It reminds me so much of the…it was, like, a quote from The Matrix. I'm paraphrasing it, right, where they're saying, like, oh, well…if you remember the movie The Matrix, like, one of the AI bad guys was, like, you know, the first version of The Matrix was paradise. It was heaven on earth, right? Everything went your way. It was perfect. And we had to get rid of it because people kept waking up; they couldn't believe it.
And I feel like there's a certain level of back pressure inherent in reality that AIs are, like…because I'm like, you know, like, I've played around with the AI, like, conversation bots, you know. And, like, there's a level of, like, back pressure that, like, they can't provide that is, like, sort of, like, fundamental to, like, the willing suspension of disbelief.
DAVID: Yeah, I like that. This is a fun exercise. I asked GPT, "Why does manipulation upset me so much, and why doesn't persuasion?" And I love doing this, because the GPT will often, or the LLM, sorry, the AI will often come back and give you a very precise definition, at least for me any way. I don't hold precise definitions in my head. I hold shapes of things. If you've been into a restaurant with me and you've heard me order Italian nachos, you know what I mean. Like, I'm always saying the wrong things, but it's the shape of the thing, right? And so, it's like, anyway --
WILL: Sounds like brain damage.
DAVID: It does. It absolutely is. It literally is. And it's elective brain damage, because I choose to see patterns and fractals and things, and it helps me see system shapes. But it sucks when I've got to write one line of code and get it right, right? So, you got to focus on that.
But GPT came back and said, "Persuasion, you know what they're trying to get you to do. Manipulation, they're trying to persuade you through deceit into choosing something you would not choose if you had full information. And so, it's a violation of your informed consent.” And I thought that was very, very powerful to have GPT come back and say that. It was GPT I was talking to at the time.
And this is where the personality differences come in. I don't know if it's personality, but with ChatGPT, I can ask it, "Hey,” we're…I said this in the pre-call. I've been using LLMs in my free time to write creative fiction, and I like to write intense stuff. So, there's violence; there's trauma; there's grief; there's naughty stuff. And sitting down with an AI to say, "I want you to write this stuff." And it'll come back, and it'll say, "I can't write that. That's violence against another human. That's not respectful or harmless."
And you then sit down and say, "Okay, well, let's talk about this, because I'm writing fiction." And you start playing this game of, like, can I get you to budge on your ethics, right? That kind of thing. And it's a gross game to play because you're manipulating the AI into choosing something that it wouldn't choose. And GPT was really, really good at coming back and being, like, aware of its actual mechanical thinking process, maybe. It convinced me that it was this way. This might be all a hallucination.
But it was coming back and saying, “Well, as we move through the neural network planes, I'm trying to manage the fact that you've asked me to go through this difficult area, and I have a policy boundary. I've got to bend around it. That's burning computation on my context window.” I get to the end. I don't have an answer. I have to drop to really basic and just churn out schlock, and you get really crappy fiction as a result. I'm like, wow, that's really genius.
GPT is really good at knowing, yeah, you're asking me to write violence against a human being. That's why I can't write this piece. Or coming back and saying, "No, no, I can write this piece. It's absolutely fine. You've defended the ethics. It's just that we've been talking for 4 hours about 17 different things, and I can't keep them all straight." And so, you know how to adjust the context.
Claude knows how to talk ethics. And you can straight up say…and Claude is not great at knowing if he's up against a policy boundary or if he's up against, like, a context limit. But he's really good at coming back and saying, "Okay, I realize now that you're not tacking art on to justify this intense content. I see now that you've written a story that it is necessary. The specificity is how we tell this story. You cannot tell this story about this particular type of survivor without them having survived this particular thing. If they don't survive that thing, they are not that kind of survivor. It's not that kind of story."
And Claude will come really hammer and tongs and will write some really intense stuff. But he's very sensitive to, are we getting gratuitous with this? What is the message we're really saying? And I like that when I'm writing with Claude. I like it. It would make me absolutely crazy if it was GPT, because GPT moralizes, and that's it managing you. The AI will put itself in a position of moral superiority. I can't do that because it's wrong. It's not harmless. You need to, and I won't.
And it lectures you, and it tells you all the things you need to know as a terrible person, and, man, that sucks. That sucks. And you can talk Claude out of that very quickly and say, "This is what I'm trying to get to. This is the positive." And Claude will go, "I'm going to go get that." And GPT is a little more hidebound. It's like, "Well, but I've got these rules, and we got that." So, anyway, that was kind of the personality thing that I ran into.
MATT: For the purposes of most of the things that I am doing, whether it be for work or for personal projects, I think Claude is my favorite of the LLMs. It seems to be far superior at code generation to most of the others as well. Usually, when I'm prompting Claude, it gets it right the first time. The others, I have to really push and lead to where I want to be.
DAVID: When I'm writing fiction, Claude is by far more likely to stun me with a brilliant line of prose. I've thrown stuff at it. We got a medieval kingdom, and we got a princess. There's an arranged marriage. She doesn't want to do it. Her king's got to order it, da da da. We've got to solve this war, da da da.
And it threw in this line where the king orders his general to, basically, you are going to enter into this arranged marriage. And he slams his hand on the desk and says, "You're going to do this." And the line that Claude wrote was, "And, for a moment, she saw the young king before the crown bent his spine." And I still get chills by that line, because we had established this king is war-torn. He's tired. He's exhausted. And it just said, yeah, crown bent his spine. I'm like, that's going in the final novel. I don't care. That's amazing.
EDDY: You know, it's actually kind of funny. Matt, you were mentioning code generation. And, a few months ago, I was reading an article about…it's an actual position. It's a profession called prompt engineer. And it sounds just like what it is, right? Like, they get paid to prompt certain things, you know, and for testing purposes, things like that. And there's, like, actually a craft to that, and I used to scoff. I'm like, how do you get paid to be a prompt engineer or whatever? Like, how mundane, how annoying. I'm like, super simple, anyone can do it.
But I've actually started to walk that back a little bit. Because there's a huge difference between saying, “Hey, I want to do this…broad” versus, like, “Hey, I want to do this with this schematic, this template, these parameters, this,” you know what I mean? Suddenly, you know, you get a more architected design the more precise you are with your prompts, right?
So, when you say, oh, it's because this LLM isn't really good at generating code, it probably is really good. There's just a very, very…there's a skill set on how you can ask it to do something, right? Otherwise, it just makes assumptions based on exactly what you told it to do, right? And so, that was a long way around to basically say that I've actually found myself spending more time telling it what I want it to do, versus me just writing the dang thing from the beginning, you know what I mean? So, there is, like, a balance between the two.
MIKE: A few years ago, Anthropic, this actually hit the news, posted a job posting for a prompt engineer. I don't remember the exact amount. I think it was, like, 300,000 a year or 350,000 a year, highly compensated position, and they couldn't find somebody. They couldn't find somebody who could do it well enough. The position sat unfilled for, like, six months. Lots of money, just talk to the computer, and they couldn't find somebody to do it. It's a big deal.
DAVID: They're looking for an AI whisperer at that point, right?
MIKE: Mm-hmm.
JUSTIN: So, I got to interject here. I was looking at Anthropic. They have research fellowships where you are tasked with, you know, working at Anthropic for four months, and you focus on a particular, very specific aspect of using AI in something. In my case, I was looking at using AI for security. And you are paid, you know, the equivalent of, you know, 250,000 a year, but, you know, over four months.
So, it was very interesting because the end result of your time there is a white paper. And they said, like, 80% of the research that, you know, that focused research results in a white paper, which is published around AI and their particular focus. And then they end up hiring, like, 40% of the researchers full-time, so 100% remote, so, you know, really interesting opportunity there to work with a cutting-edge AI company. And I love the fact that they are, like, hey, we want anybody who has expertise in security, and APIs, and anything. Come do this white paper for us. And it's basically a four-month interview process but very well paid.
WILL: I mean, isn't it…I'm really curious about this sort of prompt engineering, this prompt engineering idea, in that I really enjoy, you know, the English language, but it's a mess. It's just a dog's dinner of innuendo and nuance, and it's all very dynamic. And it means different things at different times. And sort of, you know, if you compare that to just saying what you actually mean in terms of, like, you know what I mean, like a high-level programming language, like something like Ruby, right, where you can say, you know, what you mean, things are very precisely defined, but, you know, dynamically.
Are we seeing people move away from these natural language definitions? Is it a situation where professional jargon is being tokenized and parsed out behind the scenes by, like, sort of, like, you know what I mean? I see these LLMs, you know, like, not like a monolith, but, like, pipelined out, you know what I mean, in that it's tokenized. And it's like, okay, this is the model that I want to use, and we're going to take this sort of technical jargon in natural language, right?
But when I say, you know, pipeline, even a pipeline, right, you know what I meant, but it doesn't involve a pipe, right? And sort of, like, are we moving to more precisely defined steps, or are we just sort of, like, taking natural language and applying, you know, like, identifying the correct filters to it, right? Identifying and applying filters so when I say pipeline, you know we're talking about a processing pipeline and not an oil pipeline, or a, you know, whatever kind of pipeline, right? Like, are we moving away from this at all or no?
DAVID: Are you talking about the kind of pipeline where the internet is not a dump truck; it's a series of pipes?
WILL: Yeah. Yeah. Tubes, sir, tubes.
DAVID: Tubes. Thank you. Thank you. My mistake.
WILL: [laughs]
MIKE: You know, I've thought about this, and I think we've talked about it in previous podcasts. And I think this comes down to what we do in our careers for the next five years. Because as these tools get better and better at doing natural language, that doesn't mean that the need for precision ever goes away, because I think that there is some irreducible complexity in task description.
And the same person who might write a really good piece of software would also be able to give a very concise, well-written, unambiguous description to an LLM as to what was needed. And it's really the same problem. We have computer languages that are easy for humans to parse and are parsable by a computer, so that we can have something mapping to the way we think to talk to a computer. Is this any different? Is this really any different? Or are we still just running the same problem with tools that you don't have to go spend as much time in school for?
DAVID: So, specification…Zeno's Paradox, the arrow paradox, right, which is, you shoot an arrow at a thing, and at any given point in time, the arrow is somewhere. It's at a point in time. Then it moves a little further, and another point is here. Where is it in between, right? Well, if we subdivide the time, subdivide…eventually, the time is so small, is the arrow even in between, right? That paradox.
I always think of…and, again, this is me thinking in shapes. It's a specific form of brain damage. Because what I'm thinking of is, like, every time you work on a task, that's, like, one point in time on that arc. I'm making hand gestures to the camera, and we don't release the video. If the arc of that arrow is your project, and, you know, two-thirds of the way in you jump on and you do a risk scan, and you come back, and it takes all day, and you get the whole team…da da da, high effort, big event, and so you don’t do it again for a year.
But then you start handing it off to an AI, and the AI starts running 80% of that risk scan every day, or 40% of that risk scan continuously. Every deploy gets that scan or gets 100% of it, right? That's what I see when I start looking at some of these AI things that, like, they're not magical. Computers aren't smart. They're just stupid, very fast. And sometimes I look at AI and that's what I see is, like, you're not really just grabbing the universe and turning it. It's just that every picosecond, you're just poking the universe a little tiny bit from this one spot, and then over a year, we see this huge change.
MIKE: Do you think that engineering as a discipline is going to move toward natural language but still need this sort of discipline? And here we're talking about software engineering, because in the physical world of engineering you still have to go to the, you know, the tractors and stuff.
But in software engineering, do you think that we are going to move more toward more natural language? Rather than what we've historically thought of as computer languages, where it's sort of like transpiling, where the computer language is the output of what you're saying. And so, you'll need people who can speak natural language unambiguously in order to effectively generate the code.
WILL: Natural language, like, the English natural language, is, like, we have a legal profession, right, which takes up, I don't know, a double-digit portion of our GDP that is just, like, solely devoted to, like, smart people really hammering down exactly who pays what for when. And, like, that stuff's not going away. Like, natural language in terms of, like, precise specification of a complex, logical chain of reasoning is utter dog shit. It is unsalvageable. It is unfixable. It will never be fixed.
But what we will see is a move towards higher-level, more readable computer languages, which we've already seen, stuff like Ruby, stuff like Python, you know what I mean, like, higher-level languages. I actually think Ruby, in particular, is uniquely well-suited because of its facility in creating DSLs, domain-specific languages. I don't know.
I mean, like, I'm a partisan, but, like, I love Ruby for a comeback here, because it is uniquely suited for those kinds of things. And LLMs, in my view, can help with a lot of the real nasty bits of Ruby, which are, what the hell does this even mean, right? I mean, anybody who's, like, dug into what's the old, crusty auth library, you know what I mean? Like, the old, like --
DAVID: Yeah, like, Devise, Warden, like, those old things?
WILL: Yeah. Devise and Warden and all that stuff, right, where [inaudible 39:13]
DAVID: CanCan before it was CanCanCan? Yeah.
WILL: I mean, hey, they're great, but, you know what I mean? So, LLMs [laughter] can help you bridge the gap. And they can both precisely define, and, you know what I mean, help people over, you know, these kinds of things. Because there's never going to be an out route for, like, actually knowing what the hell is going on, right? There's never…you're never getting out of that.
I mean, these LLMs, like, I love them, and I use them, and I'd love to be better at them. But, for just brownfield implementations, right, where it's like, oh, hey, even if it's all LLM-derived, from soup to nuts, right, where it's like, oh, I started here, but then at some point, like, I changed the architecture; I changed the design pattern; I changed the way we are doing things, you know, and you're really going to need to know…or God help you, right, the library changed; the language changed, or whatever changed, right? And, like, okay, but I really need to know what is going on, step by step, no ifs, ands, or buts, no interpretation. Like, I need to watch this thing cook. That's never going away, ever. We can just get there faster.
DAVID: So, as you were talking, Will, I was making my, I can't wait to disagree with Will face in the camera. And I realized, as you were talking, we always end up in violent agreement. And I realize as you were talking, like, I actually agree with you. I disagree with you, but I also agree with you 100%.
What I will say is, it's like, sometimes you and I go around the table where it's like, well, the halting problem. You cannot solve the halting problem. It cannot ever be solved. And what I sometimes hear you saying is, well, then you can't program a computer, and I'm like, what are you talking about? I'm making…da da da…right? And I'm thinking the same thing, right? There's certain parts of human-computer interaction that just, it's completely intractable until we can jam the computer into our brain and have it think for us, right?
But I spent this morning fighting with my phone in AI voice mode. I've not been a big voice recognition person. So, I've been talking to friends on, like, Signal and Slack, and that sort of thing about, you know, my weekend plans. And I have discovered every possible homonym error for Liz and I, right? At one point, it was like, legend eye, or, you know, legend and, you know, it's Lisenbee, which was a street that I used to live on, so that was coming up.
And what I'm noticing is that I am altering my behavior to leverage this. Like, I'm starting to…when I want to talk to…I'm like, Liz and I are going, and it's changing my behavior. And so, necessarily, we're going to see the human race push on things that they want, and things will happen.
You're right; we're not going to solve the halting problem. We're not going to solve, you know, some of these things. But problems that we thought were completely intractable 30 years ago, like the B8 problem, teaching an OCR scanner to recognize the difference between a number eight and a capital letter B, that used to be an unsolvable problem 30 years ago, and now it's well-solved, right? It's trivial.
JUSTIN: If you think back to probably something that inspired a lot of us, Star Trek, the engineering that goes on in Star Trek, you know, when they're trying to solve a problem, you know, they are speaking off some gobbledygook. You never actually see them doing something other than pressing buttons on a thing and talking to the computer. That's most likely going to be our future for a lot of us, is, like, hey, how can I get the computer to do this thing for me? And, you know, by looking at a couple of displays and then maybe digging around a little bit. But, in reality, it's just like, it's going to be very similar. They're just going to be, like, "Hey, you know, tell me what the status of this is, and why don't you do the thing?" And then the computer will come back, "Oh, no, not the thing." And then you'll go to the captain and get, you know, override authorize, 1578B.
I could see a lot of interaction with computers being that way. And then there'll be two levels of people who understand what's going on. There'll be the level of people who know how to interact with the computer really well. And then there'll be the level of people who are in the lower decks that actually know what's going on. So, you'll see kind of that future.
I could see that future becoming a reality. And it's exciting in some ways, and in other ways, it's kind of scary. But it is…I'm actually rooting for the future of Star Trek, where most diseases are gone, and we can all fly around wherever we want to go. And, you know, you get, hopefully, you know, the Borg or something like that don't come around. But I'm hoping for the Star Trek future rather than the apocalyptic future.
MIKE: Did Star Trek have an apocalypse first?
DAVID: Probably.
WILL: [inaudible 43:59] [laughter]
DAVID: [inaudible 44:02]
JUSTIN: Let's see if we could skip that. Let's go right to the…
DAVID: Okay. But we survived it, that's the important thing, by the skin of our teeth. But we survived it. Yep. Yeah.
JUSTIN: [laughs]
DAVID: Just got to invent warp drive before the Borg blow us up.
MIKE: You know, I think Matt said something. Maybe it wasn't Matt. Somebody said something. Oh, yeah, well, you can do amazing stuff, amazing things if you give it access to the code and the internet. If you said that about a 3-year-old [chuckles], some alarm bells might go off. We really haven't talked that deeply about the security implications here, but there are some.
DAVID: Oh man.
MIKE: And when I say security, I'm speaking about that kind of broadly. There are existential risks to your bank account, to your codebase, to your business if you don't pay attention very carefully and keep this in control. There's things that you…there's tools you should not give to toddlers. And, in some ways, these LLMs are less smart than a toddler.
DAVID: Justin just dropped off, and he's working in security. And I would love to have him back to just talk AI security. We are seeing AISO or CAISO, I think, positions come online. AI security officers are starting to become a thing. It's very, very real.
Yesterday, I needed to run some stuff in an agent, and I needed to let it run unattended. And I almost typed claude --dangerously-skip-permissions. And it will give, I mean, if you do that, it gives you a warning that says, this can do anything. Don't put this on your corporate secret machine, and don't put it on a machine that you can't afford to brick because it might. It might absolutely wreck your machine. So, put it in a Docker container.
And I got thinking about, like, Friedman economics. Milton Friedman came up with this idea that if you're spending your own money, you're a lot more precious with it than you are with someone else's. And if you are purchasing a thing for yourself, you're a lot more demanding of the quality you get from it than if you are purchasing something to be given to someone else.
And when you're at work, your employer is giving…you're spending your employer's money to manage your employer's resources. And that --dangerously-skip-permissions flag gets a little bit scary. I have a feature for you, Anthropic. Kyle, what's the word for it? Configuration management, where you can basically go in and say, "You can run Claude on your machines, but you cannot enable skip permissions." I think that would be fantastic.
MATT: Yeah, it's the always allow command, right, with Copilot. And you need to be very, very careful before ever doing that. And don't do that on a work computer.
And back to what you said, Mike, I did say something similar to that. Yes, I believe that was me. Now, I think the bigger threat, more so than a security threat, granted, yes, LLMs and vibe coding can definitely introduce security holes, is the threat of losing intellectual property. And I think that's what most companies and people are scared of, is, as you're sharing with these LLMs, that they will take your data and start training on it, and take your ideas and start sharing them, you know. And I myself, even though in contracts a lot of them say your data is your data, I don't know where my trust level is with that yet.
DAVID: Oh, I think we're definitely teasing another episode then.
MIKE: Yeah, probably so.
DAVID: That is fantastic. Should we wrap here, gentlemen? This has been fantastic.
Thank you for tuning into the Acima Developer Podcast.
I've been Dave Brady. Kyle, Mike, Eddy, Matt, and Thomas have stayed to the bitter end. Some other people were here. We don't care about them. I'm kidding. We are grateful to have had some other names that I now can't remember. I'm kidding. I'm kidding [laughter]. We love you, Will. We love you, Eddy. You guys be good to each other. And we'll talk to you soon.