ALTERNATE UNIVERSE DEV

Serverless Chats

Episode #47: Programming AWS Lambda with Mike Roberts

About Mike Roberts
Mike Roberts is a partner, and co-founder, of Symphonia - a consultancy specializing in Cloud Architecture and the impact it has on companies and teams. During his career, Mike’s been an engineer, a CTO, and other fun places in-between. He’s a long-time proponent of Agile and DevOps values and is passionate about the role that cloud technologies have played in enabling such values for many high-functioning software teams. He sees Serverless as the next evolution of cloud systems and as such is excited about its ability to help teams, and their customers, be awesome.

Watch this episode on YouTube: https://youtu.be/16en-TTGNhk

Jeremy: Hi, everyone. I'm Jeremy Daly and this is Serverless Chats. This week, I'm chatting with Mike Roberts. Hey, Mike, thanks for joining me.

Mike: Thank you very much for inviting me, Jeremy.

Jeremy: So, you are a Cloud Architect and DevOps Consultant that specializes in serverless and AWS, and you're also a partner at Symphonia. So, why don't you tell the listeners a little bit about your background and what Symphonia does.

Mike: Yeah, that'd be great. So, I've been in industry now for 21 years and in that time, I've been an engineer or a senior engineer, manager or CTO, sometimes consulting, sometimes working for product companies, so a whole mixture and sort of up and down the manager versus technical ladder.

About four years ago, I was a VP of Engineering at an ad tech company here in New York and we started using a lot of sort of much higher level AWS technologies and especially at the end of that year, we were using a lot of Lambda, so I really thought that serverless was really interesting and so I wrote an article four years ago now about serverless. That proved to be really popular and I was like, "Oh, wait, other people like this, too. Maybe I should start a company about this kind of stuff." So myself and my business partner, John Chapin, we decided to start Symphonia as a consulting company to help people with the kind of technologies and lessons that we'd sort of seen over the last few years. And that's what we've been doing now for three and a half years.

Jeremy: Awesome. Alright. Well, so recently, you and your business partner, John, wrote a book called Programming AWS Lambda, and great title, right, there it is. He's got it. Okay. Now, the thing that struck me though about it was about Java. And so I'm just curious, it's 2020 and so, why would you write a book about serverless programming in Java?

Mike: Mostly because my writing is terrible and I didn't want anyone to actually read the book. No, that's not the reason. It is weird and a lot of the things that you read about Lambda, the examples are in Python or JavaScript or Go and then there's this Java thing. And who actually uses Java with Lambda? Well, it turns out a lot of people use Java with Lambda and the other thing was, it's how we got started with Lambda. So when John and I started using Lambda, which was about three and a half years ago, the Java support had just come out and we were working for a Java shop, so we had a lot of engineers who were very Java savvy. We had all of our Java toolchains all sorted out and so we decided to use Java and Lambda and see how it worked and it worked brilliantly.

And one of the reasons it worked brilliantly was that the system that we were building was pretty high throughput, like we were processing millions of messages a day with Lambda and so we never hit, even back then, any of the concerns with cold starts or anything like that, and so yeah, it really just fitted in like a glove for us. And so, when we decided to write the book, we knew that we weren't unique and we knew that there were a lot of other people out there who have built up this knowledge in Java and the ecosystem that surrounds Java and we wanted them to have a book for Lambda, just like JavaScript developers and Python developers and all that kind of thing.

Jeremy: Awesome. Well, so the funny thing is, is that I saw this book come out and I immediately was like, "Oh, no, it's a book about Java." And I haven't programmed in Java in I don't remember how long, but I said, "I know Mike and I know John." I've been following your work for a couple of years now and I know you produce good stuff. So I said, "I'm going to look at it. I just want to give it a look." And what I found was that it's not really a book about Java. It's really a book about building serverless applications with the examples in Java and there are a few very Java specific things in there, which I think is actually great and we'll get into some of those reasons why.

But yeah, but I mean, the book covers everything. All those core concepts like the execution environment and invocation types, logging, timeouts, memory, CPU, environment variables, all those things that you would want to know and it gets into detailed explanations about deployments, infrastructure as code, security, event sources, so it really is a much more complete reference. So, if you pick up this book or if you don't pick up this book, because you're like, "Oh, it's about Java."

I actually would really suggest that you pick it up and just read some of the core concepts because I really like your take, you and John's take, on just some of these different concepts because I think there's a lot of, I don't know if dogma is the right word, but there's people who approach things a certain way and they just sort of think that's the way to do it, but when you start applying those things to real world situations and real applications, you start to butt up against some of the limitations. Anyways, you had some really interesting thoughts on that.

Mike: Well, thank you.

Jeremy: So I want to get into the book because there's some really interesting things and I know I'm talking more than I probably should be here, but this is something that I found really interesting was your approach to testing.

Mike: Yeah, and it's interesting, because John and I, we actually only met about five years ago and we've both been working 20 years, but the way that we approach development and engineering in general is extremely similar. I mean, I come from 20 years of extreme programming and test-driven development and that background and John does to some extent, but not quite as in... I was working with people that were speaking and writing about this stuff back 15 years ago, but yeah, we very much felt the same way.

And while I don't, I'm not a test-driven developer all the time, I don't always write my tests first. What I learned from when I did write that way 15 years ago is to rely on unit tests and functional tests that are of a specific type and John feels exactly the same way. In fact, John was the primary author on that chapter in the book and it's interesting because I completely agree with everything he wrote in there. And so the way that we think is that when it comes down, especially with Lambda functions, Lambda functions are just code. A Lambda function is a piece of code that accepts some JSON, might interact with some downstream systems and might return some other JSON.

That's it. There's nothing else to it and what that means is that you can run unit tests and in-process functional tests and the word - the naming gets a bit weird - very, very simply. And so what we rely on a lot with our testing is writing tests that run within the IDE or within the process, the same process with the tests and the code under test. We use other forms of testing, which we'll get into maybe in a little bit, but our primary goal is to say, "Let's prove that what we've done works using unit tests that all run in one process."
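As a rough sketch of the point above, the handler below is plain Java that takes a map (standing in for the deserialized JSON event) and returns a map, and the test calls it directly in the same process. The class and method names are hypothetical and not taken from the book, and the check throws a plain AssertionError rather than using a test framework, purely to keep the example dependency-free.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical handler: names are illustrative, not from the book.
class GreetingHandler {
    // Assumption: the runtime turns the incoming JSON event into this Map
    // and serializes the returned Map back to JSON.
    public Map<String, Object> handleRequest(Map<String, Object> event) {
        Object name = event.get("name");
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("greeting", "Hello, " + (name == null ? "stranger" : name));
        return response;
    }
}

// In-process unit test: no deployment, no cloud, just a method call.
class GreetingHandlerTest {
    static void greetsCallerByName() {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("name", "Mike");

        Map<String, Object> response = new GreetingHandler().handleRequest(event);

        if (!"Hello, Mike".equals(response.get("greeting"))) {
            throw new AssertionError("unexpected response: " + response);
        }
    }
}
```

Calling `GreetingHandlerTest.greetsCallerByName()` from an IDE or a test runner exercises the handler without touching AWS at all.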

Jeremy: So yeah, because you read a lot about things like the testing pyramid, right?

Mike: Yeah.

Jeremy: And this idea of integration tests become really expensive where unit tests are really, really cheap because you can run them quickly and you can get that feedback, but then there's this sense that in order to do the testing, you need to have all of this stuff running in the Cloud and you need to do these end-to-end tests and I think I'm with you here, where I like to write a lot of unit tests that make sure my business logic works because I mean, really, the business logic is the interesting part of your application, isn't it?

Mike: Yeah. I have a feeling that I know where some of this comes from, which is that a lot of the complexities, especially when you're learning serverless development, are not about the code. It's about this brand new platform. It's about all the services that you're integrating. It's about how you deploy all that kind of stuff and that's the stuff that's new. And so people I think, naturally go, "Well, that's the stuff that we want to test," but that's the stuff that is kind of hard to write automated tests for, that's why you need all of these long running integration tests and people focus on that.

And that's understandable when you get started with this stuff, but as engineers, we have to really think about wearing two hats when we're writing software that's going to last a long time. There's our experimentation hat, which is, how does this stuff even work at all in the context of the platform that I'm using and then there's the, "I'm now writing software that is going to last a number of years." And those are two different modes of thinking, figuring out how this is going to work and then writing the production code.

And when I think about testing, what I'm talking about, is writing tests with the production code in mind and really separating out those two things. I think the trap that quite a lot of people get into is they do this experimentation stuff and then they write some code, and then they don't sort of switch gears. They don't come back to the "Okay, what is the actual code I need to write? Where is the domain logic in that and what are the tests around that?"

Jeremy: Yeah. And so then when would you suggest though that people write things like integration tests, and actually we should take a step back because you mentioned a term in the book called "Functional Test," which is something where I don't think it's a standard term maybe and maybe it is, I mean, but it's something that I typically don't hear and it's this idea of basically, it goes beyond the idea of the unit tests to test the business logic, but goes more to testing, not really the integration, but maybe the integration point. I mean, can you explain that a little bit better?

Mike: Yeah. So the term definitely is one that has been around a long time. However, I would say that it's a term that not everyone agrees on the definition of. The way that we've defined the term in the book is that like unit tests, when you're running a functional test, everything runs within one process, so you're not calling out to external processes from either your test code or your code under test. And so from a structural point of view, when you're actually running the test, the unit test and the functional test feel very similar. However, what it is that you're testing is quite different.

With a unit test, you are testing an individual method like a language method or language function within the code and it's completely in isolation. And what you're doing when you do functional testing, which is a little bit different, is you are testing how a bigger part of the application is responding to the system around it, but the important thing is that you're not actually running the system around it, you are stubbing out the external environment. So in that case, we would write an internal stub for DynamoDB, if we had a Lambda function that was calling DynamoDB. That's not anything clever. That's not a DynamoDB stub library, that's just literally us like saying, "Okay, what is DynamoDB? What are we assuming that DynamoDB returns and is our code processing that response correctly?"
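A minimal sketch of that kind of functional test, assuming a hypothetical `OrderStore` interface that wraps the one DynamoDB lookup the function needs. None of these names come from the book; the stub simply hard-codes what we assume DynamoDB would return, and production code would supply an implementation backed by the AWS SDK instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical port: the only thing the function needs from DynamoDB.
interface OrderStore {
    Optional<Map<String, String>> findOrder(String orderId);
}

// Hypothetical handler; the store is injected so a test can swap it out.
class OrderStatusHandler {
    private final OrderStore store;

    OrderStatusHandler(OrderStore store) {
        this.store = store;
    }

    public Map<String, String> handleRequest(Map<String, String> event) {
        Map<String, String> response = new HashMap<>();
        Optional<Map<String, String>> item = store.findOrder(event.get("orderId"));
        if (item.isPresent()) {
            response.put("statusCode", "200");
            response.put("status", item.get().getOrDefault("status", "UNKNOWN"));
        } else {
            response.put("statusCode", "404");
        }
        return response;
    }
}

// Hand-written stub: encodes our assumption about what DynamoDB returns.
class StubOrderStore implements OrderStore {
    private final Map<String, Map<String, String>> items = new HashMap<>();

    StubOrderStore with(String orderId, String status) {
        Map<String, String> item = new HashMap<>();
        item.put("orderId", orderId);
        item.put("status", status);
        items.put(orderId, item);
        return this;
    }

    @Override
    public Optional<Map<String, String>> findOrder(String orderId) {
        return Optional.ofNullable(items.get(orderId));
    }
}

// Functional test: whole handler, stubbed environment, one process.
class OrderStatusFunctionalTest {
    static void returnsStatusForKnownOrder() {
        OrderStatusHandler handler =
                new OrderStatusHandler(new StubOrderStore().with("o-1", "SHIPPED"));
        Map<String, String> event = new HashMap<>();
        event.put("orderId", "o-1");

        Map<String, String> response = handler.handleRequest(event);

        if (!"200".equals(response.get("statusCode"))
                || !"SHIPPED".equals(response.get("status"))) {
            throw new AssertionError("unexpected response: " + response);
        }
    }
}
```

The stub is deliberately dumb: whether its canned answer matches what DynamoDB really sends back is an assumption, and checking that assumption is a separate job from testing the code.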

Jeremy: Right. And that's another point you make in the book, too, where you say things like LocalStack, or any of these sort of local mocking libraries are essentially a bad idea and you go into it a little bit more in the book, but I mean, I feel like using those sort of systems for building local tests, great for experimentation, right? Like if you just want to check and do something quickly, but once you rely on those then you have more complex testing setups and things like that. But I mean, if you are just calling an API, an API is returning JSON to you, right? So just simulate that JSON and test against that and then you know that JSON that you're testing against is always the same and it isn't going to change because of some update to LocalStack or some other local mocking library.

Mike: So I think LocalStack has its place and LocalStack is excellent when you've got your experimentation hat on and you're wanting to do lots and lots of really quick iteration and you don't want to be constantly deploying your system to the Cloud. And I get that, like deploying to the cloud is 10 times longer than deploying to LocalStack, and so, using LocalStack as an experimentation system is brilliant, but that's not testing. That's experimentation and that's where I'd like people to take that experimentation hat off and put your testing hat on. And when you put your testing hat on, you're writing a system that is probably going to last years.

LocalStack is a simulation and an occasionally good simulation of it. There are things that LocalStack doesn't simulate properly about the Cloud and it's also a lot slower than running functional tests that are all in process and so you sort of have, it's great for experimentation, but it's almost like the worst of both worlds when it comes to regression testing.

Jeremy: Right. Yeah.

Mike: I know some people might find this word offensive, but it's a little bit of a, and this isn't about an age thing, this is about a maturity thing. There is a maturity of doing serverless development. Once you've got used to it a little bit, you need to be like, "Okay, now we need to think about testing versus experimentation is a different thing." Because I've seen people get into all kinds of messes where 90% of their testing relies on LocalStack and A) it's slow, once you've got like 100 tests and B) they're relying on shifting sand and for regression tests, that's really too much of a risk in my opinion.

Jeremy: Yeah, I totally agree. And then the value of integration tests. And again, I think to clarify, I mean, the integration tests are important with serverless applications. I mean, there are a lot of different connectivity pieces or a lot of services communicating with one another, right? So you have API Gateway calls a Lambda function that writes data to DynamoDB, that triggers a stream that loads another Lambda function that sends a message to EventBridge that triggers four more functions or something like that. So, there are certainly complex workflows, but simulating those locally with these functional tests is basically saying, "If this Lambda function gets this, does it do what it's supposed to do?" That is relatively easy with those functional tests, but what about those more complex like actually seeing that go all the way through?

Mike: Yeah. And what integration tests are about are testing your assumptions effectively. So when I talked before about functional tests, I said, "So we're going to mock or stub the response that comes back from DynamoDB and make sure that we're doing the right thing with that." That makes an assumption that we've correctly defined what comes back from DynamoDB. And so what integration tests do is validate those assumptions, they validate how you expect your code to run within the larger environment and the larger platform.

And we absolutely advocate for doing that, but remembering that running and maintaining integration tests is a costly exercise. They take a long time to run and they also take a long time to maintain because things change over time. And so, we put a lot of work into the integration test section of the book. And John did this extraordinary thing with Maven, and those of you that are Java developers understand this, but where we run Maven test, which is one command line, and what that does is it brings up an entirely new stack of all of the components in our application, runs all the integration tests against it, and then if the tests work, then it immediately tears that stack down.

We wouldn't have gone into that amount of effort to get that stuff working if we didn't think integration tests were valuable, but we also understand that because they're expensive in terms of our time and computer time, we want to minimize the number of those that we write. And so we're looking normally at just a few, but capture hopefully a number of cases, but again, we're thinking, we're not testing the code when we're writing integration tests. We're testing our assumptions about the larger environment. If we want to test the code, that's when you write a unit test or a functional test.

Jeremy: Right, plus that feedback loop is just so much faster.

Mike: Yeah.

Jeremy: So you mentioned TDD a little bit earlier and I am a big fan of this, but I am a horrible practitioner of it because it's one of those things where it's like, "Yeah, great. If I don't know what I want the code to look like yet, but I know what I want it to do, then it sort of makes sense to do this." But what are your thoughts on TDD and especially as it applies to serverless because you mentioned this idea of experimentation versus production mode and when you're doing experimentation with serverless, I feel like trying to do TDD is really tough.

Mike: Yeah, it is. And I think that there's a difference there between experimentation around, how do we expect the larger environment to respond versus how do we want to write our domain logic? Now, again, we hit these things slightly weird when we're writing Lambda code because when we're writing a large container-based app, it's really easy to see the domain logic, like it's all that. There's a little bit around the edge that's not domain logic, but most of it is domain logic. But when we're writing a Lambda function, it feels like there's all this other stuff around the edge and there's only a little bit of domain logic and sometimes that's true, but oftentimes, it's not. Oftentimes, that stuff around the edge is something that as we write more and more Lambda functions, that's going to get refactored into libraries or whatever.

Jeremy: Right.

Mike: And so the experimentation part is like, "Okay, so what is the JSON that Dynamo is going to respond to me when I make a request to it?" or whatever. That's the experimentation part. And then the TDD part is, "Okay, given that I'm going to get this request from the user and I'm getting this response from DynamoDB, what do I want my code to look like given that that's what's going to happen?" And so sometimes, I do TDD occasionally. Sometimes it's like, I have no idea what I want my code to look like, but I know I have those inputs, so let's start with a test. And remembering that TDD is test-driven design as much as anything else. It's about: how do I design my code for testing. And unit testing is great and we need unit testing, but TDD is a mode of thinking that I use occasionally where I want to be like, "Okay, I don't know how to write this code for testing."

And the good news is that you don't have to make your code testable, you don't have to do TDD. If you haven't done TDD, you can always refactor it for testing later. And one of the things that John did in the testing chapter in the book is he took one of my earlier examples that was not written with testing in mind, whatsoever, and what he did is he actually updates the code first to allow it to be more easily testable. And then we have the best of both worlds. So effectively, in chapter five, I've written the experimentation mind part, and then John sort of has done the switching hats in chapter six.

Jeremy: Yeah, and I think that's super important, too. I mean, just thinking about when you're writing code, is it going to be easy to test and I mean, that's where things like hexagonal architecture come in or your ports and adapters, things like that, where you really are separating out so you don't have to call that Lambda function handler in order to invoke business logic that you can test that outside of that and test those different things separately. So, yeah, I think that's super interesting stuff. Alright, so I want to move on to another thing you mentioned in the book. And this is something that comes up all the time, and this is cold starts.

Mike: Yeah and it's funny. So just as an example, we did a signing of an early version of this book, at a conference back in February. Do you remember conferences, Jeremy?

Jeremy: Yes. I remember. Have you ever watched something on TV and you see a crowd of people and you're like, "I don't think they're supposed to do that. What?" It's just now that's the mindset.

Mike: I saw a trailer for a movie that was just coming out, and I'm like, "When did they film this?"

Jeremy: Exactly.

Mike: Wow. Anyway. Yeah. So, I was at a conference in February and we were doing a signing of an early version of our book and about 50 people came up because O'Reilly were giving away free books and people love free books.

Jeremy: Sure.

Mike: And if I had a nickel for everyone that said, "But what about cold starts?" And it was like 60% of people said, "What about cold starts in Java?" And cold starts is this... I mean, you know this Jeremy, it's like, "Is the Boogeyman in the Lambda, well, anyway." And so yes, everyone thinks that basically because of cold starts Java is a non-starter for Lambda and obviously that's not true. Otherwise, we wouldn't have written the book because we would never have had production Lambdas. And I think a really good case in point is I was working with a client last year and the year before and they weren't just writing Java Lambdas, they were writing Scala Lambdas and for those of you who don't know, Scala is another language that runs within the JVM and Scala is even worse for cold starts because not only do you have to start the JVM, you also have to start the Scala runtime within the JVM.

Jeremy: Right.

Mike: It was not my idea, but they were very concerned about cold starts, but they were a team of Scala developers, they knew Scala really well. So very, very concerned, and I'm not sure it was going to work. Anyway, they put it in production and a month or so later, I saw some announcement about AWS cold start, this was pre the VPC stuff, so it was something else. And I went up to the tech lead on the team and said, "Oh, hey, by the way, there's this improvement to cold starts coming out." And he looked at me, he was like, "What?" I'm like, "Well, you all are worried about cold starts?" He was like, "Oh, no, no. We put it in production. It was fine."

And this happens nine times out of 10 when we talk to people. Cold starts are these big scary things because when you're in development with Lambda, every time you run your new Lambda function, you see a cold start.

Jeremy: Right.

Mike: And there's this like feeling that that's like every single time your Lambda function is going to get called in production, you're going to get a cold start. Well, if your Lambda functions are running frequently enough for big, serious applications, that's not true. Then you're actually going to be getting a cold start like one in 100,000 times or whatever it is. And so when you amortize the cold start over how many times your Lambda function is actually running, normally in many, many, many situations, it's not a problem even if the cold start was 10 seconds every time, it's not a problem. So that's one thing.
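To make the amortization concrete, here is a small sketch using the illustrative figures from the conversation (a 10 second cold start, one cold start per 100,000 invocations). These are not measurements, and the class and method names are made up for this example.

```java
// Illustrative arithmetic only; the inputs are the round numbers quoted above.
class ColdStartMath {
    // Average extra latency per invocation when one call in
    // `invocationsPerColdStart` pays the full cold-start penalty.
    static double amortizedOverheadMillis(double coldStartMillis, long invocationsPerColdStart) {
        return coldStartMillis / invocationsPerColdStart;
    }
}
```

With those inputs, `ColdStartMath.amortizedOverheadMillis(10_000, 100_000)` works out to 0.1 milliseconds of added latency per invocation on average, which is why the cost tends to disappear in high-throughput systems even though the one unlucky request still waits the full 10 seconds.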

And then the other thing is that cold starts are not as bad as people think they are, especially, we have a number of mechanisms in the book that we recommend people use. We don't completely dismiss cold starts and we spend a lot of time saying how you mitigate them. But if you think a bit about how you're going to package your code and write your code and architect your applications, because you do have to do that. You can't just throw the whole of your typical way of thinking of it because you will end up with 15 second cold starts and that's not great.

Jeremy: Right.

Mike: With a little bit of thinking and a little bit of smarts then you can fix that and this has nothing to do with Java, but especially now Amazon has fixed the VPC issue with cold starts.

Jeremy: Right.

Mike: Cold starts are just not nearly as much of a problem as they used to be.

Jeremy: Yeah. And one of the things, too that I always notice is when it comes to front-end cold starts, I mean, those are obviously more obvious, too. If you're connecting to a Lambda function through API Gateway or ALB or something like that, you get that cold start and it's fairly noticeable. Like you said, on an application in production that gets a fair amount of traffic that you usually don't get those cold starts, but even when you do, and I know I think I've said this like 1000 times, like how many times have you typed something into Google and it just didn't respond for some reason, right? There's a network hiccup or something happened. I mean, if it's 10 seconds, it's kind of insane, but I think users will be like, "Why is this not responding?" And then they click refresh, or whatever and then what do you know, it comes up just fine.

But one of the things that I always noticed is, I think the vast majority of my Lambdas now run asynchronously in the background, right? So they're not even user-facing, or you're not hitting them directly and really, when it comes to that, cold starts, they don't matter at all.

Mike: Right. Exactly and again, this is a little bit of the problem that comes from a lot of the places that people start doing serverless development is not how they're thinking when they're writing production applications. The tutorials that you see are all APIs, because those are easy to test.

Jeremy: Right.

Mike: We understand we can just hit it with a web browser, so it's easy, so it's very similar to this testing issue. It turns out that, you know this, Jeremy, but, where Lambda really shines is in large-scale, back-end asynchronous systems and that's how John and I got started with it. It was kind of lucky in some ways and that's how we got started, because it gave us this mindset, like our first real Lambda round was processing events from Kinesis and was processing millions of events a day off Kinesis, and a bunch of events off S3, like we weren't doing API stuff there. And then if like one in 100,000 of your invocations takes 10 seconds instead of half a second, do you care? No, you don't care.

Jeremy: Yeah.

Mike: One thing I would say for people listening to this, if all you've ever used Lambda for is synchronous APIs then you're missing like 95% of what Lambda is about.

Jeremy: Yeah, and the other thing, too, and I'm not a huge fan of Java. From my first class in college of remembering public static void main, I've just had nightmares about it ever since.

Mike: Yeah.

Jeremy: But I will say this, the cold starts in Java are certainly higher than something like Node or Python and so forth. They barely ever come into play, but the other thing is that once a Java function is initialized and it's warm, it is fast.

Mike: Exactly. And that's one of the other reasons that we liked using it for high-throughput systems because obviously, Go is going to beat Java because Go is compiled down to real code.

Jeremy: Sure.

Mike: Yeah, you compare the JVM with JavaScript or Python, it's going to be faster over time and the thing that people forget about that is that that means it could be cheaper, like if your Lambda function takes 800 milliseconds in Node and 700 milliseconds in Java and you'll run-

Jeremy: Times 10 million or whatever it is.

Mike: You're saving 12% on your compute costs.

Jeremy: Yeah.

Mike: Purely by using a different language. Now, would I tell people to use Java based upon that, solely that? No, but if you have Java experienced engineers on the team, then that's a really nice benefit if you're writing high-throughput Lambda systems.
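The saving Mike quotes can be checked with a short calculation. This sketch assumes the duration-based part of the Lambda bill scales linearly with execution time at a fixed memory setting and ignores the per-request charge; the 800 ms and 700 ms figures are the hypothetical ones from the conversation, and the names are invented for this example.

```java
// Illustrative arithmetic only; assumes duration cost is linear in run time.
class DurationSavings {
    // Percentage of duration-based cost saved by the faster implementation.
    static double percentSaved(double baselineMillis, double fasterMillis) {
        return 100.0 * (baselineMillis - fasterMillis) / baselineMillis;
    }
}
```

`DurationSavings.percentSaved(800, 700)` comes to 12.5, in line with the roughly 12% mentioned above, and because it is a ratio it holds whether the function runs ten times or 10 million times.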

Jeremy: Absolutely. So then another thing too, about cold starts, is because so many people have, I think, complained about them or I guess, thought they were a problem and there's other reasons for this, too, but AWS came out with provisioned concurrency and you write about this in the book. You have some interesting thoughts about this.

Mike: Yeah. It's interesting. I'll start off by saying, I'm glad that AWS did this because I've met people in this world who are like, "No. Cold starts are a problem. Must always have absolute guaranteed latency." I'm like, "Okay." And very occasionally those people actually need that, and that's fine, but normally they don't. And so I'm glad that AWS have come around because now if needs be, I can just point those people at this and say, "Fine. You have your escape hatch. It's called Provisioned Concurrency."

But oh, my word does Provisioned Concurrency come with some caveats. And the first one was my very first experience with it. It's really slow to deploy. I can't remember the numbers now, but I did some testing and this was in December, so it was only just after it came out, so I'm sure this will get better.

Jeremy: Sure.

Mike: But it took like an extra minute and a half, two minutes to deploy a single provisioned concurrency Lambda function and it took an extra four minutes to deploy something where the Provisioned Concurrency was set to 50. So that was really annoying, because I'm used to my little Lambda apps taking, well, less than that in total for it to deploy.

Jeremy: Exactly.

Mike: So, that was really concerning. The next thing is that the costs around Provisioned Concurrency are troublesome for two reasons. One, and lots of people have already talked about this, which is that the nice thing about Lambda is it's pay per use. You only pay for what your Lambda function is actually doing whereas with Provisioned Concurrency that is broken, like you are always paying a flat fee for your Provisioned Concurrency fee to the point where when AWS launched Provisioned Concurrency, they also showed how you could manually auto scale Provisioned Concurrency.

I'm like, "Wait, we're going backwards here. This is the wrong direction." So that's another part of the problem is like you have to start thinking in terms of like old economics and one of the benefits of Lambda is we don't think about those economics anymore. The other problem with the cost of PC is it's expensive.

Jeremy: Yeah, it is.

Mike: It can get really expensive.

Jeremy: Right.

Mike: So that's problem number two is the cost. And then the third problem, this is frustrating because this is my OCD kicking in a little bit where when I set up a SAM template or whatever you want, the difference between the development configuration of my Lambda versus my production configuration for my Lambda is often precisely the same, maybe different environment variables. With Provisioned Concurrency, you don't want to be using the same Provisioned Concurrency settings in development as you do in production and so you're mixing up this whole thing down into. It's just, yeah.

Jeremy: Yeah, I agree.

Mike: I love the fact that they managed to make it work without any code changes and that was very clever.

Jeremy: Yeah.

Mike: I love the fact that there is now an escape hatch for people that really, really can't have cold start, that's great, but it's something that you should almost, almost never use. And I think the major benefit for someone like me and I think this would apply to many others as well, is just the sort of ramp up that Lambda functions can do. They only scale so much so fast, like it takes five minutes or something like that in order for you to go up to the next 500 of them or whatever it is. And so, that's something that we think about Lambda being infinitely scalable, but in actuality, there's some limits to how fast that can scale.

So having something like Provisioned Concurrency is great to say, "Hey, I need to warm 2000 functions for some big flash sale that I'm having at noon," or something like that. It would come in very handy in cases like that, but I was just playing around with it and I'm like, "What if I just kept one function warm or one container warm?" And I forget, it was either $14 or $17 or something, or maybe $10, whatever it was. But if it only got hit when I actually hit it, it would cost me like $0.6 to run that Lambda function for an entire month. If I use Provisioned Concurrency, it would cost me $10, right? Yeah.

Jeremy: Which is not a lot of money, except if you multiply that by 1000 functions then all of a sudden things start to get more expensive, plus, if you multiply it by, say, keeping 50 warm as opposed to just one, so it does get pricey.

Mike: Yeah, and to be fair to AWS, I don't think they particularly wrote it for, built it for cost conscious companies. It's like I think they built it for big enterprises, frankly.

Jeremy: Right. Yeah.

Mike: But that's my guess.

Jeremy: Yeah.

Mike: And so, that becomes less of a deal there, but it does become a deal when you're doing that for a Provisioned Concurrency of 50 and you're deploying it five times for multiple environments, then it can ramp up, so yeah.
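
The multiplication described here can be written out as a small sketch. The $10 per warm instance per month is only the rough ballpark quoted in the conversation, not an official AWS price, and the function name and parameters are made up for illustration.

```python
def monthly_provisioned_cost(cost_per_warm_instance, warm_instances,
                             functions=1, environments=1):
    """Flat monthly cost of keeping instances warm, paid whether or not they are invoked."""
    return cost_per_warm_instance * warm_instances * functions * environments


# Assumption: roughly $10/month per warm instance, the approximate figure from the discussion.
APPROX_COST_PER_WARM_INSTANCE = 10.0

# One function, one warm instance.
one_function = monthly_provisioned_cost(APPROX_COST_PER_WARM_INSTANCE, warm_instances=1)

# One warm instance each across 1000 functions.
thousand_functions = monthly_provisioned_cost(
    APPROX_COST_PER_WARM_INSTANCE, warm_instances=1, functions=1000)

# Provisioned Concurrency of 50, deployed to five environments.
fifty_warm_five_envs = monthly_provisioned_cost(
    APPROX_COST_PER_WARM_INSTANCE, warm_instances=50, environments=5)

print(one_function, thousand_functions, fifty_warm_five_envs)
```

The point of the sketch is that every factor multiplies a flat fee, so the total grows with the number of functions and environments rather than with actual traffic.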

Jeremy: Yeah. Interesting. Alright. So the other thing you mentioned in the book is sort of when to use, or when not to use, custom runtimes. So what are your thoughts on those?

Mike: Yeah. This is interesting. I was actually just using one of these this morning. So yeah, for those of you that don't know, Lambda comes with however many, it's like 10 standard runtimes now, normally for different languages and different versions of languages. In fact, if you take them to multiple versions, you're up to like 30, 40, whatever it is. About a year and a half ago, I think it was re:Invent 2018, Amazon came out with a capability where you could basically write any runtime that ran on Linux. And so a number of people came along with specialized runtimes for different environments.

And so one possibility that you have with cold starts is to write your own or to deploy your own custom runtime that is configured in a different way and maybe solves some of these cold start issues for you. And there's a couple of ways of thinking about this. One is if you're a large organization, and you have a standard VM setup that you want to use, virtual machines, Java stuff, that's different to the Amazon way of doing things, then you can use your organization's Java runtimes as opposed to the Amazon runtime. The other option, another sort of way of using these things, which I haven't dug into, but I'd like to, is that there are alternative ways of running Java code other than the stock JVM.

Jeremy: Yeah.

Mike: So, one that's talked about quite a lot is a thing called Graal, G-R-A-A-L and what Graal does is at build time, it will take your Java code and actually produce something that doesn't run in a regular VM, it just runs as a regular executable and so the idea there is that your startup times using Graal are significantly faster. I think I've got that right. I think that's what Graal does. It's one of those things I want to dig into it and try it out.

Jeremy: Right.

Mike: There's also other alternative VMs that just start super quickly, so I think Graal is one that compiles down to real code, but in those situations, so yeah, that could solve it. So then your thing might be, "Okay, well, if cold starts are ever a problem, we'll just use all of these." Well, the thing is, then you have to maintain the use of these runtimes. If something about the platform changes, the Lambda platform changes, you have to update your runtime.

Whereas if you use Amazon's stock JVM, when they want to update the underlying Linux environments, whether it's because we get another Spectre or Meltdown or something, or they're doing something else to get a bunch of performance improvements, we get those improvements automatically if we use a standard runtime. Whereas if we use a custom runtime, we probably don't get that and probably have to go through a whole bunch of testing against those new environments before we roll out new runtimes, so nothing comes for free.

Jeremy: Yeah.

Mike: So, it's one of those things. For some people, it's going to be worth looking at the tradeoff.

Jeremy: Yeah, I mean, I think that's one of those things, too, where it's like with serverless you're trying to minimize all of that undifferentiated heavy lifting, so why would you want to go and maintain your own runtime? If you're a big organization, like you said, I think this makes total sense where you can bake in security and other things that you might want to do, but certainly for the average developer or the average company, I think it's something big to bite off, to chew, that's not the right way to say it. It's too much to bite off, I don't know, maybe that's the right way.

But anyway, so I want to get into some really geeky Java stuff here. And like I said, I'm not a Java person, but I did work for a company where everything was written in Java. So I did have to look at it quite a bit. So one of the things that's very, very popular with Java, especially when it comes to building APIs, is the Spring framework. And AWS has spent, I think, or has invested a significant amount of time and energy into something called the Serverless Java Container project. And this is something they maintain that makes it easy for you to write Java Spring Boot projects or whatever they're called on AWS Lambda. You are very, very clear in the book that you think this is a bad idea.

Mike: I am. So a little bit of context for those of you that have never and will never write Java applications. Back in the dawn of time, otherwise known as about 2001, those of us that wrote software for a living would normally write Java and we normally run Java in these large things called application servers that would take minutes to start up and we'd all run these on our laptops and run them in production and they were horrible and very, very slow. But they allowed doing certain things that at that time would otherwise be a lot of work for us as developers.

Over the course of a few years, people were like, "These are getting way too big and slow and heavyweight. Can we come up with something simpler?" And along came this thing called Spring and Spring tried to still be this idea of running a large application and doing a bunch of stuff for you, but started up in 10% of the time, perhaps even less, and it really became over the sort of next 10 years, the de facto way of writing Java applications. But it's still based on this idea where you are starting an application bringing in a whole bunch of things and dependencies at startup and then your application is going to last a long time and you're going to make requests over the course of days or weeks.

So, people just got used to writing Java apps in that way. However, those assumptions that it was based on, don't make sense in a world of Lambda. We're not building a huge application. We want to write an individual Lambda function. We're not depending on like a whole bunch of different environmental dependencies. We're normally just depending on two or three and those assumptions don't apply, but even if we take those assumptions out, there is still a cost to running something like Spring and those costs come normally at cold start time, and that there is a lot of stuff that we have to bring in, a lot of libraries and whatever that have to be loaded and instantiated and all that kind of stuff.

And also Spring does a bunch of stuff at startup using reflection to dynamically load code, which makes sense when you're writing one of these larger applications that's going to do a whole lot of different stuff and is going to last a long time, but it just doesn't make sense in Lambda. So this framework that you speak about, I'm sorry, Stefano if he hears this, because I know that he's put a lot of work into it. It's one of those great occasions where AWS, they use this phrase like meeting people where they are.

Jeremy: Right.

Mike: And so it's one of these situations where people, they know that there's a lot of Java developers out there that have got all these Spring apps, and they would be like, "Okay, well, we can meet you in your writing your Spring apps and still allow you to write Lambda functions." And I admire AWS for doing that, I think that's great. But there is a real problem there, that Java developers who are using Spring and use this thing will get to a point where they go, "Oh, this is how you build Lambda apps." It's not how Lambda is designed, right?

Jeremy: Right.

Mike: You are missing a big, big trick by sort of locking yourself into the Spring way of thinking when you're building Lambda applications and you will be far more effective as a Java Lambda developer if you got rid of all that Spring stuff and just thought about the underlying function that you're trying to write.

Jeremy: Right. And that's one of those things where, like you said, I think AWS has actually done a really good job of giving people on-ramps to serverless, where I mean, "Look, just throw your Flask app, throw your Node.js app or your Express app." Throw it into a single Lambda function, you get this mono Lambda or Lambda-lith, and it works with Spring Boot and all that other stuff. It's not the most efficient way to do it, but it certainly gets you started.

But rather than just complaining that this is not the way to do it, you outlined a really interesting solution, and again, I will never do this because I'm not going to write anything in Java, but for people who are listening, explain this whole idea of multi-module, multifunction Lambdas.

Mike: Yeah. And a lot of this applies to any language, as Jeremy just said. Shockingly, it's a little easier in Java because Java's tooling is better at handling this, but this is not specific to Java. So what these Lambda-liths or mono Lambdas do is they say, "Hey, we're going to express all of the logic of one application, which might have 10 different types of requests come in, and we're going to put that in one Lambda." And that, to us, is sort of missing the point of a lot of what Lambda does.

And so what I would rather do, and what John and I would like to do, is if we have 10 different types of requests then consider having 10 different Lambda functions. That way, each Lambda function only has the security that it needs, so we talk about the principle of least privilege a lot in the book when we talk about security. So each Lambda can only access the things it needs. Partly that's about bad actors, but mostly, that's about reducing the blast radius so that we don't shoot ourselves in the foot. When people think about IAM, sure, think about security, but also think about it as a safety blanket.

It's like IAM is about safety. It's also about security. So if we separate out all our functions into 10 different functions, we can have much, much smaller IAM scopes. Cold start is reduced because we're only loading up the code and the libraries that each Lambda function needs. We're not loading up that for everything and that does make a difference. Like the difference between a 45 megabyte distributable and a five megabyte distributable, like you'll notice it and even if it's not important to production, it's nice at development time to have that speed up.
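
A minimal sketch of the difference in IAM scope, assuming three hypothetical functions that each need one DynamoDB action on one table. The table ARN, function names, and the `allowed` helper are invented for illustration, and the helper only does exact string matching rather than real IAM evaluation.

```python
def policy(actions, resource):
    """Build a minimal IAM-style policy document allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": sorted(actions), "Resource": resource}
        ],
    }


def allowed(policy_doc, action):
    """Simplified check: exact action match only, no wildcards or conditions."""
    return any(
        action in statement["Action"]
        for statement in policy_doc["Statement"]
        if statement["Effect"] == "Allow"
    )


# Hypothetical table ARN, for illustration only.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/orders"

# What each hypothetical function actually needs.
per_function_actions = {
    "get_order": {"dynamodb:GetItem"},
    "create_order": {"dynamodb:PutItem"},
    "delete_order": {"dynamodb:DeleteItem"},
}

# A mono Lambda needs the union of every permission.
mono_policy = policy(set().union(*per_function_actions.values()), TABLE_ARN)

# Separate functions each get only their own slice.
function_policies = {
    name: policy(actions, TABLE_ARN)
    for name, actions in per_function_actions.items()
}

print(allowed(mono_policy, "dynamodb:DeleteItem"))
print(allowed(function_policies["get_order"], "dynamodb:DeleteItem"))
```

With the split, a bug in the read path cannot delete anything, because the read function was never granted that action in the first place.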

Jeremy: Okay.

Mike: So, you can separate it out, great, but then everyone says, "Yeah, but if I had 10 different functions, then it's 10 different repos and 10 different deployment scripts, and how do I share code and blah, blah, blah, blah." So, the way that we think about this is, well, first of all, just because it's 10 different functions doesn't mean it's 10 different applications. You can have one application and one repo that has 10 functions in it, and serverless or SAM or whatever, will support you having multiple functions in your template. That's fine, so you don't have to have multiple applications and multiple repos.

But then there's the point about, "Okay, but what if I have some shared code among these things? Maybe five of these functions are going to go out to a database. What if I don't want to have to rewrite that database code in all my five functions?" Well, wouldn't it be nice if we could have like our 10 functions and then also some shared code in the same repo. And one of the things we do in the book is show how you do that is how you build like a mini library, that's all in the same application and then your Lambda functions can rely on external dependency libraries, but also can depend on these internal little bits of shared code.
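
The shape of that idea, several small functions leaning on one internal bit of shared code, can be sketched roughly as follows. This is not the Maven multi-module setup from the book; it is a single-file Python stand-in, and `OrderStore` and both handler names are hypothetical. In a real repo the shared class would live in its own module, and separate Lambda functions would not share memory, so the store would be a real database.

```python
# --- shared code: would be its own internal module in the same repo ---
class OrderStore:
    """In-memory stand-in for database access code shared by several functions."""

    def __init__(self):
        self._orders = {}

    def put(self, order_id, order):
        self._orders[order_id] = order

    def get(self, order_id):
        return self._orders.get(order_id)


STORE = OrderStore()


# --- function 1: only knows how to create orders ---
def create_order_handler(event, context):
    STORE.put(event["orderId"], event["order"])
    return {"statusCode": 201}


# --- function 2: only knows how to read orders ---
def get_order_handler(event, context):
    order = STORE.get(event["orderId"])
    if order is None:
        return {"statusCode": 404}
    return {"statusCode": 200, "body": order}


created = create_order_handler({"orderId": "o-1", "order": {"item": "book"}}, None)
found = get_order_handler({"orderId": "o-1"}, None)
missing = get_order_handler({"orderId": "o-2"}, None)
print(created, found, missing)
```

Each handler stays tiny and deployable on its own, while the database code is written once.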

And so we have a whole system that works and when you use it, it's just a matter of running Maven package. It just does the thing. But it uses this Java tool called Maven under the covers, and Maven, trust me, has its drawbacks and it's been around 15 years, and it's XML and oh, my goodness.

Jeremy: Ah, XML.

Mike: But its semantics around modeling dependencies are far advanced from any other main language on Lambda, as far as I'm concerned. If I want to write some quick code, I write it in Node or Python. If I actually want to model some dependencies, I'm going to run away from both of those screaming and use something like Maven.

Jeremy: Yeah. I mean, reading it in the book, it was actually really, really interesting and my mind was like, how could this be applied to other languages in a way that makes it as easy as this does, maybe without XML as the configuration for it. But yeah, definitely check that out. If you are building AWS Lambda functions using Java and you're writing things in this monolithic fashion, this solution, I think, it's brilliant. It really, really works well or it looks like it works well. I mean, obviously, you've experimented with it, but breaking these Lambdas up is definitely what we want to do.

Mike: I think another sort of metaphor around this stuff is that when you're bringing your Spring based applications into Lambda, it's a little bit like lift and shifting.

Jeremy: Right.

Mike: But you're not writing code in a Lambda native way, and just like when we lift and shift from a data center onto the Cloud, we then need to go through a second activity, which is building Cloud native apps, because Cloud native apps aren't typically lift and shift. And so when you create a Lambda-lith, you're effectively lifting and shifting, and then what you need to do is learn some new skills and use new techniques to build some Lambda native code.

Jeremy: Yeah, definitely. I totally agree. Alright. So, another thing you have in the book that I think is a great section is sort of your gotchas section, and that's one of those things too where it's like, I think, serverless developers get all excited, they're like, "Oh, I can do all this great stuff." And then all of a sudden, you hit some limitation or something and it kind of kicks you a little bit. I mean, that's sort of true with all Cloud native development, but you call out a couple things. One of the things that actually wasn't in the gotchas section, but I'm kind of classifying it as one, is this idea of using versions and aliases.

Mike: Yeah. So I mean, this comes from towards the back of the book. So there's a couple of chapters towards the end, which are really not Java specific at all.

Jeremy: Right. That's another important point. This applies to all serverless developers.

Mike: Yeah. So basically, chapters eight and nine are purely about architecture. And so yeah, if you never want to read any Java, then just skip ahead and read those chapters of the book. Yeah, so versions and aliases, for those of you that have been using Lambda a while, you'll know they've been there since very early on, not quite the beginning, but very early on, and they are useful when you're using Lambda for A/B testing, canary testing, canary release.

Jeremy: Right. Canary deployments and things like that, yeah, sure.

Mike: They're really useful for that, but when AWS created them first, I think what they were trying to do was sort of another one of these meeting people where they are things, where when people deploy applications, they're deploying the test version of their app and the production version of their app, and they're sort of treating them as one thing. So, even back then with API Gateway, you'd have multiple stages. You would have your production stage and your testing stage and development stage because people weren't sort of used to this idea of ephemeral or isolated stacks, as we now think about it.

But what I tend to do now is instead of just deploying a different version of a function, I'll deploy an entirely different set of resources. So if I want my production Lambda versus my test Lambda, those would probably be in two different AWS accounts, let alone two different stacks. And so I think where people sort of got stuck a little bit with versions and aliases was like, "Oh, so I have to have my test and my production be the same Lambda function," and it all got very confusing. And I'm like, "Yes, it's all very confusing. Don't do that."

Jeremy: Right.

Mike: If what you want to do is gradual release of a production Lambda, sure, use the same Lambda function, but if you want your acceptance testing Lambda versus your production Lambda, just use two different Lambda functions and deploy them separately. And especially what that requires is having your infrastructure as code down, like you really need to have automated deployment, but if you're not doing that, you really shouldn't be doing serverless development anyway.

Jeremy: And the other thing about aliases or versions is that Provisioned Concurrency has to attach directly to a version.

Mike: Yes. Yeah, I forgot about that. That's another Provisioned Concurrency thing. And again, if you're using the canary release stuff that comes with SAM and Lambda, that requires an alias as well, but that makes sense and I'm down with that usage of it.
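
The canary usage of an alias amounts to sending a small, weighted share of invocations to the new version while the rest stay on the stable one. The sketch below only simulates that routing decision locally; `pick_version`, the version labels, and the 10% weight are hypothetical and this is not a call into any Lambda API.

```python
import random


def pick_version(stable_version, canary_version, canary_weight, rng=random.random):
    """Route one invocation: the canary gets roughly canary_weight of the traffic."""
    return canary_version if rng() < canary_weight else stable_version


# Seeded generator so the simulation is repeatable.
seeded = random.Random(42)

# Simulate 10,000 invocations with 10% of traffic shifted to version "2".
picks = [pick_version("1", "2", 0.10, seeded.random) for _ in range(10_000)]
canary_share = picks.count("2") / len(picks)

print(f"share of invocations on the canary version: {canary_share:.3f}")
```

Gradually raising the weight toward 1.0 is the gradual release; dropping it back to 0.0 is the rollback.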

Jeremy: Yeah, and I agree. The traffic shifting stuff is actually really, really cool if you've never played around with it, and there's a bunch of great plugins that just handle it automatically for you, too, so it's definitely something to check out. So another one of your gotchas in the gotchas section was at-least-once delivery, which is something that I think people who are new to distributed systems in general can get bitten by.

Mike: Yeah. And Amazon have started taking more stick for this and I'm kind of with the people giving Amazon stick for this. I think there should be a switch for this now. So yeah, so the idea is that your Lambda functions respond to events, and Amazon guarantee that when an event that is configured to be attached to your Lambda function occurs, your Lambda function will be called. They guarantee it will be called. What they don't guarantee is how many times it will be called.

Almost every time there'll be a one-to-one correspondence between the event occurring and your Lambda function being called, but sometimes your Lambda function will be called twice. Okay, what's the big deal? Well, what if your Lambda function was charging your credit card?

Jeremy: Right.

Mike: Right? You wouldn't want to be charging people twice. Well, not if you wanted to keep the business anyway. So, that's obviously a little extreme, but there's a lot of places where you don't want to do something twice. That's what your initial thinking may be when you're designing code. And so when you're writing Lambda, you have to be cognizant of that fact. If your Lambda function is making any change to an external downstream environment, in that it's not just returning something, it's actually updating DynamoDB or writing a file out to S3 or calling an external service, if it's doing any of those things, you have to be aware that your Lambda function may be called multiple times.

Jeremy: Right.

Mike: And there are ways that you can manage this, which we go into in the book.

Jeremy: Yeah, I mean, that's the thing is just this idea of building idempotent operations.

Mike: Yeah.

Jeremy: And understanding that, and it's funny, I mean, the billing your credit card twice thing, I think Stripe has this figured out pretty well. They have an idempotent ID that you can send in with every API call, and you can use something like a message ID or something like that that would be unique to each individual event so that it won't double charge that card or whatever. But I do think that's interesting, and you outlined some strategies, like DynamoDB lock tables and some of those other things, so that's certainly interesting.
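
One way to picture the idempotency idea is a handler that records each event's unique ID and refuses to repeat the side effect when the same ID shows up again. This is only a sketch under stated assumptions: `ProcessedEvents` is an in-memory stand-in for a durable store, and the event fields (`id`, `card`, `amount`) are hypothetical. In a real system the claim would have to be an atomic, durable write so that two concurrent deliveries cannot both succeed.

```python
class ProcessedEvents:
    """In-memory stand-in for a durable store of already-handled event IDs."""

    def __init__(self):
        self._seen = set()

    def claim(self, event_id):
        """Return True only the first time a given event ID is claimed."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True


processed = ProcessedEvents()
charges = []  # records the side effect we must not repeat


def charge_card_handler(event, context=None):
    """Charge the card at most once per unique event ID."""
    if not processed.claim(event["id"]):
        return "duplicate-ignored"
    charges.append((event["card"], event["amount"]))
    return "charged"


event = {"id": "evt-123", "card": "card-abc", "amount": 25}

first = charge_card_handler(event)   # normal delivery
second = charge_card_handler(event)  # the same event delivered again

print(first, second, charges)
```

The handler can now be invoked any number of times for the same event and the card is still charged once.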

Now, the last thing I want to get to, though, because this is something that I'm passionate about. I do an entire talk on this when it comes to serverless and that's the impact of Lambda scaling on downstream systems.

Mike: Yeah, this is a big deal when you're building what we call hybrid serverless, non-serverless systems. So a really, really easy example to describe is if you have a Lambda function and it's in front of a SQL database. One of the really awesome things about Lambda is that it will, by default, scale 1000 instances wide, so, 1000 concurrency. One of the terrible things about Lambda when it's connecting to a SQL database is that it will automatically scale 1000 instances wide.

Jeremy: Exactly.

Mike: Right? And if you're not careful, that could take down your non-serverless infrastructure components.

Jeremy: Or your downstream APIs or what your code is.

Mike: Or whatever, yeah.

Jeremy: Or whatever else. Yeah.

Mike: Or logging systems that you haven't thought about and all that kind of stuff. And so yeah, it can have a real impact. It's one of those things again where that's what it is. You just have to be aware about that. Lambda is a different way of architecting systems. And this is the kind of thing I've talked about for years and even talked about some of this. But when you look at Lambda code, it looks like the same code that we've been writing for years.
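
One common mitigation is to cap how wide a function is allowed to scale so it cannot exceed what the downstream system can take. The arithmetic can be sketched as below; the connection numbers are invented for illustration, and this is not necessarily one of the strategies described in the book.

```python
LAMBDA_DEFAULT_CONCURRENCY = 1000  # the default scale-out width mentioned above


def max_safe_concurrency(db_max_connections, connections_per_instance=1,
                         reserved_for_other_clients=0):
    """Largest number of concurrent function instances the database can absorb."""
    usable = db_max_connections - reserved_for_other_clients
    if usable <= 0 or connections_per_instance <= 0:
        return 0
    return usable // connections_per_instance


# Hypothetical database: 100 connections, 10 held back for admin and other clients.
limit = max_safe_concurrency(db_max_connections=100, reserved_for_other_clients=10)

# How far past the database's capacity an uncapped function could go.
overshoot = LAMBDA_DEFAULT_CONCURRENCY - limit

print(limit, overshoot)
```

A number like `limit` is the kind of value you might set as a per-function concurrency cap, so that a traffic spike is throttled at Lambda rather than exhausting the database.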

Jeremy: Right.

Mike: When you look at Lambda architecture, it's drastically different to what we've been building for years. The difference between writing stuff in a container and writing stuff that was running on bare metal, from an architectural point of view, there's been this gradual evolution through Cloud native through VMs to containers, not that much different.

Jeremy: Right.

Mike: When you're architecting for Lambda, you have to think very, very, very differently. And I think people don't necessarily realize that because the code is easy and it's like, yes, but we've shifted the mental effort from the code to architecture. And now everyone needs to be an architect and I think that's a good thing, right?

Jeremy: Right. I do.

Mike: I think that making architecture not this thing that exists in this ivory tower and bringing it to all engineers is a wonderful thing because engineers that are building these systems are much better able to make optimization decisions than someone that's completely removed, right?

Jeremy: Right.

Mike: So that's a good thing, but the flip side is that people need to learn architecture now. We need to learn about the architectural trade-offs that come when you build distributed systems.

Jeremy: Yeah, I totally agree. I think that's a good exercise, though, for developers who are getting into serverless to start looking at that. And then the thing I loved about the book was that you outlined I think four or five different strategies for mitigating those downstream issues, which are something you definitely have to think about because I don't know many people who are building entirely serverless applications where every piece of the infrastructure scales just like Lambda does. So certainly something to think about. Alright, so anything else? Anything else about the book we should know?

Mike: It's amazing. It's awesome. Everybody should buy it. It was interesting. We spent a long time on it for various reasons, but one of the nice things that came through at the end because we delayed it was that Serverlessconf New York October last year, I think it was.

Jeremy: Yeah.

Mike: I bumped into Tim Wagner, who I know pretty well by this point. Tim developed and ran the Lambda team, for those of you who don't know, for a number of years. And so I was chatting to Tim and I was telling him about the book, I was like, "Hey, would you be willing to write the foreword to our book?" And he said, "Of course." And he meant it. So Tim wrote the foreword to our book, which in and of itself was great and I'm very appreciative that we have that introduction there with Tim, who if anyone knows anything about running Lambda in the enterprise, it's Tim, because that's what he designed Lambda for, right?

Jeremy: Right.

Mike: But what also happened was, this is the story. So I'm sitting on the plane to re:Invent, and I'm doing the final read through of our tech draft. We've had all the written input from our tech reviewers. John and I applied all the changes. I'm reading the final thing that we're going to give over to O'Reilly, so they can start doing proofreading. I turned on my phone at the end of the flight, and there's an email from Tim, which has his foreword to look at. What it also has that I wasn't expecting was 15 pages of Technology Review, well, technical review for the book. So the book would have come out six weeks to eight weeks earlier, but it didn't.

Jeremy: Sure.

Mike: Apart from the fact that Tim gave us this extraordinarily useful tech review feedback.

Jeremy: Right.

Mike: And we incorporated a lot of those changes and things like for example, Provisioned Concurrency, which was announced around that time. That came into the book because Tim's like, "Yeah, you really should include Provisioned Concurrency in here," and a number of other stuff as well. So yeah, Tim was very much involved in it and we're very appreciative to him for that.

Jeremy: Yeah. Well, if you're going to have somebody write your foreword or review your book, I would say Tim Wagner, about Lambda anyways, is a very, very good source. And I actually read it. I read the foreword. The foreword itself is just worth reading because it actually is really interesting and gives some good insight. So seriously, I mean, the book itself, like I said, if you're a Java developer and you want to write Lambda functions, go pick it up. If you're not a Java developer and you just want to learn more about serverless, and get some insight from two very, very smart people in the industry, pick up the book, take a look at it. It's on oreilly.com. If you've got a subscription, you can read it there.

Excellent book, very well written, lots of awesome stuff in there. So thank you and thank John for me as well, for writing it, as I think anytime there's good, accurate serverless content out there, it really is a gift to people who are trying to adopt this crazy new thing. So if people want to get a hold of you or find out more about what you're working on, how do they do that?

Mike: Yeah, we have a couple of ways, so our website and the thing that's updated most, my website is our blog, so if you go to blog.symphonia.io, there's a bunch of stuff on there and just symphonia.io shows what we do, has a link to the book. If you want to take a quick look at the book and not sure what you want to do, O'Reilly, who it's published with, have this nice thing where you can get a one-week, I think, free subscription to their online platform. So you can go on there and look at our book for a week and then if you like it, you can either subscribe fully or buy a copy of the book.

I'm on Twitter at @mikebroberts, which I guess will be in the links for this podcast, which will be a combination of tech stuff and New York theater and my cat and not as much theater at the moment because, obvious reasons, but yep. And then we're also on Twitter at @symphoniacloud.

Jeremy: Awesome. Alright. Well, I will get all of that into the show notes. Thanks again, Mike.

Mike: Thank you, Jeremy. Thanks, everybody.

THIS EPISODE IS SPONSORED BY: Datadog and Amazon Web Services (Serverless-First Function May 21 & 28, 2020)
