#168 - Serverless as a Game Changer - Joseph Emison
Summary
Joseph Emison, author of “Serverless as a Game Changer,” shares his journey from traditional cloud infrastructure to embracing serverless architecture. He explains how the compounding effect of technological advancements every 18 months creates an enormous gap between top-performing development teams and average ones. Top teams leverage serverless and managed services to focus on differentiated work that provides business value, while average teams waste time on undifferentiated heavy lifting like running infrastructure.
A core principle Emison advocates is distinguishing between differentiated work (what makes your business unique) and undifferentiated work (common tasks like sending SMS, search indexing, or logging). He argues that developers should use managed services for undifferentiated work whenever possible, even when those services seem expensive at first glance. The true cost includes developer time, maintenance overhead, and reduced velocity—factors often overlooked when comparing managed services to self-hosted solutions.
Emison shares practical examples from his company Branch, a fully regulated insurance company built entirely serverlessly with no VMs, containers, or Kubernetes. He explains how this approach enabled them to hire junior frontend developers and train them as full-stack developers, since the infrastructure complexity was handled by managed services. The company maintains excellent uptime, security, and compliance while keeping cloud costs remarkably low—around 8,000 monthly for all environments including production, development, and testing.
The conversation covers common objections to serverless including cost concerns, vendor lock-in, and security. Emison addresses each with practical advice: calculate total cost of ownership including developer time, view managed services as temporary learning tools that can be replaced if needed, and implement robust third-party risk management programs. He emphasizes that code is a liability, not an asset, and recommends optimizing for maintainability above all else.
Emison concludes with three pieces of technical leadership wisdom: establish an optimization principle for how your team writes code (he recommends “optimize for maintainability”), remember that less code is generally more maintainable than more code, and invest more time in research than implementation—it’s better to spend two weeks researching and two days developing than two days researching and two weeks developing.
Recommendations
Books
- Serverless as a Game Changer — Joseph Emison’s book that explains how to get the most out of the cloud using serverless architecture, written specifically for skeptics of serverless approaches.
- Accelerate — Recommended for understanding DORA metrics and how to measure software delivery performance, which helps evaluate the impact of architectural choices on development velocity.
Concepts
- DORA metrics — A set of metrics for measuring software delivery performance, particularly cycle time, which Emison recommends using to evaluate the efficiency impact of architectural choices.
- Differentiated vs undifferentiated work — A key framework for deciding what to build vs. what to buy—focus development effort on what makes your business unique, and use services for everything else.
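As a rough illustration of the cycle-time measurement mentioned above, the sketch below computes a median cycle time from a handful of change records. The records, the choice of "first commit to production deploy" as the interval, and the function names are all assumptions made for the example; they are not taken from the episode or from any particular tool.

```python
from datetime import datetime
from statistics import median

# Hypothetical change records: (first commit time, production deploy time).
changes = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 15, 0)),   # 6 hours
    (datetime(2024, 3, 2, 10, 0), datetime(2024, 3, 4, 10, 0)),  # 48 hours
    (datetime(2024, 3, 5, 8, 0), datetime(2024, 3, 5, 20, 0)),   # 12 hours
]


def cycle_time_hours(started: datetime, deployed: datetime) -> float:
    """Elapsed hours between starting a change and shipping it."""
    return (deployed - started).total_seconds() / 3600


def median_cycle_time_hours(records) -> float:
    """Median of the per-change cycle times, so one slow change does not dominate."""
    return median(cycle_time_hours(start, end) for start, end in records)


print(median_cycle_time_hours(changes))
```

Tracking this figure before and after an architectural change is one way to check whether the change actually sped up delivery.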
Tools
- Algolia — A managed search service that Emison recommends over self-hosted Elasticsearch/OpenSearch for most applications, as search indexing is undifferentiated work.
- Twilio — Cited as the classic example of a managed service for undifferentiated work like sending SMS messages, allowing developers to focus on business logic.
- LinearB — A tool used at Branch to monitor DORA metrics and track development cycle times, helping quantify the impact of architectural decisions.
- Neon — A serverless Postgres database that Branch adopted early, mentioned as a good alternative to Amazon’s non-serverless Postgres offerings.
Topic Timeline
- 00:01:03 — Introduction and career journey focusing on business alignment — Joseph Emison introduces himself and shares his career evolution from focusing on cool technical problems to prioritizing business goals, organizational health, and working with people he enjoys. He emphasizes the importance of choosing growing organizations where wealth creation enables better working conditions and advancement opportunities.
- 00:04:40 — The serverless journey and writing the book for skeptics — Emison explains his progression from early cloud adoption (S3, EC2) to serverless with Lambda in 2014. He wrote “Serverless as a Game Changer” to address skeptics who claim serverless only works for toy applications, drawing from his experience building Branch—a fully serverless, regulated insurance company with no VMs or containers.
- 00:11:12 — The growing gap between top and average development teams — Emison discusses how technological advancements every 18 months create a compounding advantage for startups that can adopt the latest technologies without legacy constraints. He cites Instagram’s small team building at global scale as an example, contrasting with earlier social networks that required hundreds of developers. This efficiency gap is widening as top teams leverage serverless better.
- 00:15:14 — Differentiated vs undifferentiated work with practical examples — Emison explains the critical distinction between work that makes your business unique (differentiated) and common tasks (undifferentiated). He gives two examples: everyone agrees SMS sending is undifferentiated (use Twilio), but many developers wrongly insist on running their own Elasticsearch/OpenSearch instead of using Algolia. The latter choice represents misplaced optimization that wastes developer time on undifferentiated work.
- 00:20:29 — Calculating true costs including developer time and velocity — The discussion focuses on how to properly evaluate costs when choosing between managed services and self-hosted solutions. Emison recommends using DORA metrics (especially cycle time) and accounting for developer costs, which are often treated as ‘free.’ He notes that managed services typically win in honest total cost comparisons for 90-95% of applications.
- 00:27:58 — Code as a liability and minimizing custom development — Emison argues that code is a liability requiring maintenance, not an asset. At Branch, they follow a waterfall process: first try to buy SaaS, then downscope requirements, then use managed services, then open source—writing custom code only as a last resort. This approach, combined with small iterative releases, reduces maintenance burden and improves feedback cycles.
- 00:32:33 — Defining serverless as ‘not my uptime’ — Emison provides his favorite definition of serverless: ‘it’s not my uptime.’ He distinguishes between true serverless services (where you’re only responsible for your code functioning) and services requiring patching, scaling, or disk management. He also clarifies that serverless encompasses more than just Lambda—it includes managed services like Twilio, Algolia, and even platforms like Shopify.
- 00:36:45 — Addressing security and compliance objections — Emison tackles common security concerns about using third-party services. He emphasizes the need for robust third-party risk management programs rather than avoiding vendors altogether. As a regulated insurance company, Branch uses many managed services while maintaining compliance through proper vendor evaluation, contract review, and security assessments. He argues serverless architectures can be more secure than traditional setups.
- 00:40:46 — Uptime and reliability with third-party dependencies — The conversation addresses concerns about downtime when relying on external services. Emison shares that Branch experiences minimal downtime—mostly during major AWS US-East-1 outages that affect much of the internet. He recommends evaluating vendors’ historical uptime and SLAs, noting that punitive SLAs indicate seriousness about reliability. Proper alerting and fallback strategies handle occasional vendor issues.
- 00:48:50 — Branch’s unique development model with junior developers — Emison explains Branch’s innovative approach: hiring junior frontend developers and training them as full-stack developers because the serverless stack is simple (JavaScript/TypeScript throughout). There are no specialized infrastructure or ops roles since ‘it’s not our uptime.’ This model empowers developers, reduces coordination overhead, and has proven highly effective despite the company’s regulated nature.
- 00:55:42 — Three technical leadership wisdom principles — Emison shares his three key principles: 1) Establish an optimization principle for code (Branch optimizes for maintainability); 2) Remember that less code is more maintainable than more code; 3) Spend more time researching than developing—it’s better to research for two weeks and develop for two days than vice versa. He emphasizes understanding available managed services before building anything custom.
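The build-versus-buy order described in the 00:27:58 entry can be sketched as a simple fall-through check. The order of options comes from the summary above; the `Requirement` fields and the returned strings are hypothetical names chosen for illustration, not Branch's actual process or tooling.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    # All fields are hypothetical flags a team might fill in during research.
    name: str
    saas_exists: bool = False
    can_downscope: bool = False
    managed_service_exists: bool = False
    open_source_exists: bool = False


def choose_approach(req: Requirement) -> str:
    """Return the first option that applies; custom code is the last resort."""
    if req.saas_exists:
        return "buy SaaS"
    if req.can_downscope:
        return "downscope the requirement"
    if req.managed_service_exists:
        return "use a managed service"
    if req.open_source_exists:
        return "adopt open source"
    return "write custom code"


print(choose_approach(Requirement("send SMS", saas_exists=True)))
print(choose_approach(Requirement("search", managed_service_exists=True)))
print(choose_approach(Requirement("insurance pricing rules")))
```

The point of the ordering is that custom code is only reached when every cheaper-to-maintain option has been ruled out.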
Episode Info
- Podcast: Tech Lead Journal
- Author: Henry Suryawirawan
- Category: Technology
- Published: 2024-03-25T12:00:00Z
- Duration: 01:00:57
References
- URL PocketCasts: https://pocketcasts.com/podcast/tech-lead-journal/952099f0-a7bc-0138-e686-0acc26574db2/168-serverless-as-a-game-changer-joseph-emison/da0444e4-e3d4-4430-8d65-ce3fa6534735
- Episode UUID: da0444e4-e3d4-4430-8d65-ce3fa6534735
Podcast Info
- Name: Tech Lead Journal
- Type: episodic
- Site: https://techleadjournal.dev
- UUID: 952099f0-a7bc-0138-e686-0acc26574db2
Transcript
[00:00:00] you always want to look at what does your business need to do differently or your organization need
[00:00:04] to do differently than other organizations. If you can outsource it, if it’s not something that
[00:00:11] makes you different, you should use a service because you will always be asked to do more
[00:00:17] things that you can build as a developer that are differentiated, that are like special to
[00:00:23] your organization. So don’t add in things that aren’t.
[00:00:53] technical team and to make an impact in your personal work. So let’s dive into our journal.
[00:01:03] Hello, Joe. It’s great to have you here. Welcome to the Tech Lead Journal podcast.
[00:01:08] Great. Thanks for having me.
[00:01:09] So Joe, I always love to start my conversation by asking my guests to actually share a little
[00:01:14] bit more about yourself. Maybe if you can share any highlights or turning points that you think
[00:01:18] we all can learn from.
[00:01:19] I’d be happy to. You know, I think my main journey
[00:01:23] started as a tech lead. And I think the main thing I’ve learned across my career
[00:01:27] is how important it is to understand the business goals and the business itself that I’m working
[00:01:35] with or nonprofit organization or whatever. I think when I started, I really wanted to work
[00:01:40] on cool technical things. And that was my focus. And, you know, I realized over time that the
[00:01:47] quality of the business, the quality of the organization and the quality of the leadership
[00:01:51] and the culture.
[00:01:53] Really mattered a lot more than how cool the technical problem was. And so I’ve kind of
[00:02:00] developed a rubric for myself that it’s much more fun and rewarding to work with people you like
[00:02:06] working with in an organization that’s growing because the organization that’s growing is
[00:02:11] creating more wealth, essentially, in many different ways that can be shared. And so it’s
[00:02:18] an organization that there’s like a healthiness to it that makes it a lot more pleasant to work
[00:02:22] for. And that’s nothing that I optimized for at the beginning of my career. I’m on my sixth,
[00:02:28] starting my sixth company that I’ve been with now for about five years. But every company along the
[00:02:34] way, I’ve made mistakes around not thinking deeply enough about how growing and how great is this
[00:02:41] garden that I’m working in, as opposed to, you know, how cool does it seem? That’s the primary
[00:02:47] piece of advice I’d love to go back and give myself if I could from earlier in my career.
[00:02:52] Hey, thank you for being part of the Tech Lead Journal community. This show wouldn’t be the
[00:02:57] same without your ears. And you are the reason this show exists. If you’re loving TLJ and want
[00:03:03] to see it keep on growing, consider becoming a patron at techleadjournal.dev/patron,
[00:03:09] or buying me a coffee at techleadjournal.dev/coffee. Every little bit helps fuel the research,
[00:03:16] editing and sleepless nights that go into making this show the best it can be. Thanks for being the
[00:03:22] best listener any podcast could ask for. And now let’s get back to our episode.
[00:03:27] Yeah. Thanks for reminding us as well about choosing the right place to work with. And it’s
[00:03:32] not all about cool technologies, right? I think there are still some of us technologists who love
[00:03:37] to play with the new techs, right? Especially these days, there are a lot of advancements,
[00:03:41] new things all the time. So always understand about the business, the quality of the business,
[00:03:45] the people you work with, the culture. I think that’s also important. And yeah,
[00:03:49] working in a growing organization, I think that’s also very important.
[00:03:52] Yeah. No, just to say, you know, if you pick a growing organization, there’s going to be plenty
[00:03:57] of money and advancement and things for everyone. And so that one, I think is a real key. It is a
[00:04:03] way to get everything. You can have a good culture, and if it’s not growing, it’s not going to create
[00:04:08] those benefits for you, but if it’s growing, it will. Yeah. Not to mention when the organization
[00:04:13] grows, there’s so many challenges you can learn from. I mean, like it’s a double-edged sword.
[00:04:17] Yeah. A lot more challenges means a lot more headaches probably, but also at the same time,
[00:04:22] you grow much, much better as a person or maybe in terms of skillset. So you wrote a book,
[00:04:26] Serverless as a Game Changer: How to Get the Most Out of the Cloud. So maybe before we start talking
[00:04:31] about your book, tell us a little bit more, your relationship with serverless, how did you
[00:04:36] bump into serverless and yeah, what made you started to write this book?
[00:04:40] Yeah. You know, I’ve been on a journey. I think my entire tech career of trying to leverage the
[00:04:47] benefits of advancements in technology. And one of the things that
[00:04:52] I’ve realized over the past 25 years is that we get really big, important advancements in how to
[00:05:01] build software about every 18 months. Now, every 18 months, it’s not so much better that you should
[00:05:07] like throw away everything you’re working on and like just change everything. But there’s a
[00:05:12] compounding effect to this, like every 18 months, big changes. And this really hit me hard in about
[00:05:19] 2007. I had been building software
[00:05:22] for more than 10 years at that point in time.
[00:05:25] I had started a company, and we had a company looking at making an investment in
[00:05:31] us. And they sent someone to do technical due diligence on us and was looking at what we had
[00:05:36] done. And he said, Well, you know, what are you doing with continuous integration? And we were
[00:05:41] like, I don’t know what continuous integration is. What are you talking about? And he recommended
[00:05:45] the book, the sort of the key book on continuous integration to us. And it really changed everything
[00:05:50] in terms of how I thought about, you know, how I was going to build this software. And I was like,
[00:05:52] not only like, wow, we need to do this. But also, wow, I had no way of knowing this,
[00:06:00] like in the way I had been practicing. And so at that time, I started asking, how do I learn about
[00:06:06] new things? How do I go on my own continuing education journey? And that led me pretty
[00:06:13] quickly to a hypothesis of, we continue to figure out what parts of building software
[00:06:20] are undifferentiated for us. And we continue to be able to hand those to other companies, and pay them, and really pay
[00:06:30] them per use. Even in 2007, this was true. I mean, 2007, we had S3, Amazon S3 as a great, wonderful
[00:06:39] way to just store data that it would stay forever, it was very cheap, I mean, relative to anything
[00:06:44] else at that point in time. And when EC2 came out the next year, we had a company that really
[00:06:54] had a scheduling problem: we would get all this data in and need to process it. And being able to
[00:07:00] like flex out to n number of virtual machines, and then close them off when done running, was perfect for a
[00:07:07] need that we had. Then Hadoop came out, and we saw, wow, all of this stuff we were
[00:07:12] building ourselves, we could actually outsource even a bit more. Elastic MapReduce
[00:07:17] was Amazon’s service at that point in time. And so I saw this
[00:07:21] trajectory of, I can take these parts of these programs that we’re making or the workloads that
[00:07:27] we’re running and processing for our companies, and we can hand them more and more off to people
[00:07:33] and we can break them into chunks. I mean, from the beginning was let’s break the application
[00:07:38] into pieces and then use services for the pieces. And so when Lambda came out, I think in 2014,
[00:07:45] it was just another progression to me of how do I break an application up and just not have to
[00:07:52] worry about running things. And I was already at that point in time, I had started four companies
[00:07:57] and the amount of time and pain that we had to spend as a fairly small staff,
[00:08:03] just keeping things up. I mean, we used RDS early, but like we had cases where the volumes filled up
[00:08:10] and it took us down. And so there was never a case where those services
[00:08:15] were fully, like, I can sort of just delegate the running of these. And so really feeling like
[00:08:22] I can delegate my uptime to a managed service. Starting in 2014, 2015, I was like, this is it,
[00:08:29] this is perfect. But interesting, immediately what I saw was people, in my opinion, designing
[00:08:36] things poorly, like looking at Lambda and saying, oh, we’re going to put every function in a
[00:08:40] different Lambda, they’re going to call different Lambdas, and we’re going to take an application,
[00:08:43] which is here, and spread all that complexity out across like network latency calls to all these Lambdas.
[00:08:50] And so I started talking. So I gave a talk at the first Serverlessconf, I think it was in
[00:08:56] 2016, 2015, 2016, hosted by A Cloud Guru, and started writing more and started arguing more
[00:09:03] about not only like, this is how we should build in serverless, also building serverlessly on
[00:09:08] Firebase and even building applications like on Google Sheets and Google Apps, really trying
[00:09:13] everything, and interacting with a lot of skeptical developers who would say, oh, that’s nice. You get to build
[00:09:21] toy applications that are very simple. The real developers don’t use serverless. It doesn’t work.
[00:09:26] It’s too expensive. And so I started this insurance company. Branch is a full stack
[00:09:31] insurance company. We bear the risks, sell home, auto, renters, condo, umbrella insurance.
[00:09:37] We’re the fastest to be able to sell those bundles by orders of magnitude over any other carrier.
[00:09:45] I was in all these conversations with people saying you only know how to build toys. Serverless
[00:09:48] is just a toy. And so I said, okay, I think I need to address all of those complaints and
[00:09:54] skepticisms about building serverlessly. And really write down now, like, no, we built a
[00:09:59] full stack and a financial services carrier that has lots of regulatory compliance requirements
[00:10:04] on it. We built it fully serverlessly. There are no VMs, there are no containers, there’s no
[00:10:08] Kubernetes anywhere. And so how do we think about it? How do we do it? And why does this
[00:10:14] actually work? Why is it not a toy? And so that has been the arc. But I feel like it’s been this
[00:10:20] sort of constant attempt to take advantage of we get these new technologies and like they can
[00:10:27] really change how we develop software and make it cheaper, faster and better. And so that’s been my
[00:10:32] focus. And I’m happy to share it in this book. If you’re very skeptical of it, this book is designed
[00:10:38] for you. It’s not designed to say you believe in serverless. Here’s how to do it. It’s a book
[00:10:44] for the skeptics to help you understand how this actually works.
[00:10:48] Yeah, thanks for sharing your interesting story. So actually, when I read the book in preparation
[00:10:53] of this conversation, I’m not a skeptic of serverless. But I get more intrigued by reading
[00:10:58] what you are trying to explain in your book, including Branch, right, which I probably will
[00:11:03] talk more about later, like what makes Branch unique in terms of adoption of serverless. But
[00:11:08] what piqued my interest in the beginning when you said, right, so technology keeps advancing every
[00:11:12] 18 months or so, right? And then
[00:11:14] the gap, I think you mentioned in your book, the gap between the best software development teams
[00:11:19] and average software development teams is getting more enormous. I think it’s related to all this
[00:11:24] advancement. Maybe if you can pick a little bit, like why you think the gap is getting much bigger?
[00:11:29] Yeah, I mean, I think if you think about like, if there’s like a huge change every 18 months,
[00:11:34] there’s this compounding effect. And so startups that get to start today, get to take advantage,
[00:11:41] there’s no legacy, right? They’re not relying on any existing technology.
[00:11:44] So they can just use the absolute fastest, best way to build something today. And if you started a
[00:11:51] company like we did five years ago, you’re already going to be committed to some choices that if you
[00:11:59] were going to rebuild it today, you would build it differently and it would be cheaper, faster,
[00:12:03] and better, you know, if you had all the knowledge that you could have. And so we, you know, we look
[00:12:08] at that ourselves even. And so in the book, I give an example of Instagram when it was
[00:12:14] acquired, had 13 developers and doing about the same thing, roughly Facebook had hundreds of
[00:12:22] developers. And then before that, you know, Friendster and MySpace had over a thousand
[00:12:26] developers. And actually, Instagram had 13 employees. I think they might’ve had like
[00:12:31] five or six developers. And if anyone has gone through the process of like, how much can one
[00:12:37] developer do? And then how much can four developers do? There are these step points where you lose
[00:12:43] massive efficiency.
[00:12:44] Per developer. I mean, I tend to think of it as kind of like one developer, four developer,
[00:12:49] 12 developer, 25 developers, 50 developers, a hundred developers. And once you’re at a hundred
[00:12:55] developers, there’s like a core inefficiency that’s in the whole operation. I mean, you’re
[00:13:00] getting an order of magnitude, less productivity than if you have a solo developer. Now you can’t
[00:13:06] run everything on a solo developer. It’s not a sustainable thing for a company, but there’s also
[00:13:10] a reason why Amazon has these like two pizza teams and trying to keep these services going.
[00:13:14] It’s the only way to keep really good quality and velocity together. And so when I say that the top
[00:13:22] teams are orders of magnitude better than the average teams, it’s largely that the top teams
[00:13:28] are leveraging technology so much better than the average teams that they’re just not bogged down.
[00:13:36] And I feel that very much at Branch. The company that I was working on 10 years ago, BuildFax,
[00:13:42] which was fully in the cloud,
[00:13:44] infrastructure as code, but like running on VMs, I think it still probably is, you know,
[00:13:50] in Amazon load balancing, we were using RightScale at the time to do some of the orchestration.
[00:13:55] There was a core inefficiency that we had in running and a core inefficiency that we had in
[00:14:00] building new features that Branch doesn’t have. Branch is just faster. And so the average
[00:14:05] developer at Branch is just producing more for Branch and at a higher code quality than the
[00:14:10] average developer was at BuildFax. Yeah. It’s quite astounding,
[00:14:14] when you mentioned about this, Instagram only have maybe 13 employees or five developers in
[00:14:19] total, right? But can build a world kind of scale, you know, like with users from all around the
[00:14:24] world. And I think what you say is right. So for people like me who has been around in industry for
[00:14:29] quite some time as well, right? I mean, the recent years, you can actually see that, oh, there are so
[00:14:33] many technologies that I’m not even aware of, but actually it’s really cool to probably prototype
[00:14:38] new applications, right? Building something from scratch. And even like, for example, you mentioned,
[00:14:42] so you can actually build something like,
[00:14:44] only Google or Amazon can do last time, right? And now you can actually tap into those technologies
[00:14:50] to build similar things like them. And I think what you mentioned as well, right? If the developer
[00:14:55] can produce more, that means they are focusing a lot more on the right thing, right? Which you
[00:15:00] mentioned as differentiated versus undifferentiated things. I think this concept is very important for
[00:15:05] you to actually adopt as technology leader or technologist in general, right? So tell us more
[00:15:11] about this differentiated versus undifferentiated. Yeah.
[00:15:14] I think you always want to look at what does your business need to do differently or your
[00:15:18] organization need to do differently than other organizations? And what are things that it’s fine
[00:15:24] for your company or organization to do as well as other companies do, like the best other companies
[00:15:30] doing it? And so let me give two examples that are both undifferentiated, but that will generally,
[00:15:37] in my experience, get very different reactions from lead developers. So let’s start with the
[00:15:42] simple one where I think we’ll all agree. So if I talk about
[00:15:46] sending text messages, sending text messages is undifferentiated. You don’t care if you send text
[00:15:51] messages as well as the best other companies send text messages. Now, by the way, there’s probably
[00:15:56] some listeners who are going, no, no, no. If you don’t register your 10DLC campaigns, you’ll get
[00:16:01] the messages rejected. Yeah. But if you do it as well as the top people, you’re fine, right? If you
[00:16:07] can send text messages as well as A quality things, you will be registering 10DLC campaigns. You’ll
[00:16:14] have your short codes and you’ll use Twilio or something like Twilio, but Twilio is a great
[00:16:19] product. I know it’s a little expensive for what it is, but it’s a great product. And so I don’t
[00:16:24] run into a lot of lead developers who will tell me, no, I need to buy, you know, separate boxes
[00:16:29] and hook up phone lines and have telephony cards and like run them in a data center. I don’t find
[00:16:34] that. Although I did that in 2005, I ran IVRs that way. So I remember that it was awful. I would
[00:16:40] never do it again. And so I think everybody agrees that
[00:16:44] sending SMS is undifferentiated heavy lifting and you should use a service like Twilio to do it.
[00:16:49] Right. And in fact, you should probably use a service like Braze or Iterable or Customer.io
[00:16:55] to like actually trigger those because those help make it even easier. And you can do templating and
[00:17:00] have people do that. And if you don’t know these services, you should look them up because
[00:17:04] they’ll make your life easier. And the back of my book actually has a whole appendix on,
[00:17:07] you should know all these services because they’re very useful. Now, let me get to a
[00:17:11] controversial example. And I don’t know why this is. I mean, I know,
[00:17:14] sort of why it’s controversial. Most lead developers want to run Elasticsearch or like
[00:17:20] Amazon OpenSearch themselves as their search index, when there is a service called Algolia
[00:17:27] that will do all of the painful stuff for you. I will tell you that running a search index that’s
[00:17:33] performant is undifferentiated. You do not need to do it better than anybody else. I mean,
[00:17:38] unless you’re competing with Algolia as a business. And so you can outsource a lot more to
[00:17:44] Algolia. It works phenomenally. It’s really fast. Elasticsearch, OpenSearch, they’re very
[00:17:49] finicky services. You’ll have to be responsible for their uptime. Indexing to them is kind of a pain
[00:17:55] in the butt. Generally speaking, front ends can’t talk directly to them because the security model
[00:18:00] is too poor. So you’ll have to build a proxy. They’ll have to query to your backend. It’ll slow
[00:18:05] it down. It adds more development. But most lead developers will say, wow, I looked at the pricing
[00:18:11] for Algolia. It’s really expensive. And I’m not quite sure, oh, wait, how do I do it in a dev environment in the
[00:18:17] right way to save costs? No, no, no. It’s just safer. I’ll run it myself. I know how that works.
[00:18:22] We’ll run these things. We know how to run. But that is classically undifferentiated heavy
[00:18:27] lifting. Everything you’re choosing to do there to run OpenSearch or Elasticsearch when Algolia
[00:18:33] exists is undifferentiated. And if you take an actual cost of ownership, like the time that it
[00:18:44] takes internally to run that thing, Algolia is much cheaper, just period. But most developers
[00:18:49] will make the choice on behalf of their organization to in-house this undifferentiated
[00:18:54] heavy lifting with something like OpenSearch or Elasticsearch because of, honestly, this misplaced
[00:19:01] sense of how they should be optimizing. And so that’s my favorite example for this
[00:19:05] undifferentiated heavy lifting. If you can outsource it, if it’s not something that makes
[00:19:11] you different, you should use a service, because
[00:19:14] you will always be asked to do more things that you can build as a developer that are
[00:19:20] differentiated, that are like special to your organization. So don’t add in things that aren’t.
[00:19:26] Yeah, I like the last line, right? So you are always being asked to do more things that bring
[00:19:31] differentiators to the business, right? So I think not just Elasticsearch or such kind of a technology,
[00:19:37] people these days run their own logging, messaging bus, maybe even containers on Kubernetes and all
[00:19:43] that.
[00:19:44] So I think that takes a lot of engineers’ hours and also effort. And not to mention,
[00:19:49] if you want to make it production scale with high availability, that takes even much more effort
[00:19:54] and actually cost. So speaking about cost, I think this is probably where the, I don’t know,
[00:20:01] misconception or miscalculation, so to speak, why people are still opting for managing things
[00:20:07] themselves. They think, oh, it’s open source. I can also run it myself. Maybe build a POC. It works,
[00:20:12] pretty simple, right?
[00:20:13] And they just start from there. But I think the most important challenge is to understand the real
[00:20:19] cost, right? In your book, you have some breakdown of cost. Maybe from your advice, how would you
[00:20:23] tell us to look at cost differently so that we can actually see the differentiated versus
[00:20:28] undifferentiated?
[00:20:29] I mean, I think that you really need to internally understand how much it costs for you to develop
[00:20:38] and run things yourself in terms of your people costs. And I think you also need to understand
[00:20:43] it in terms of your velocity costs. So in general, the more teams that you have in your company
[00:20:52] that are required to get code live, the more inefficient you are and it’s a compounding effect.
[00:20:58] In my book and everywhere I talk, I recommend people read Accelerate and use the DORA metrics.
[00:21:04] We use LinearB at Branch as a way of monitoring the DORA metrics. The cycle time is critical.
[00:21:11] And so what
[00:21:13] I see is that the more undifferentiated work you do, the longer your cycle time. And especially when
[00:21:21] the undifferentiated work is running the things. And so when you use a service like Algolia versus
[00:21:29] you are running your own Amazon OpenSearch or you’re running Elasticsearch on containers and
[00:21:34] you’re building a proxy service in there, those are orders of magnitude different impact on cycle
[00:21:40] time when you release changes to how things go to the index.
[00:21:43] Yeah.
[00:21:43] And I think if you’re honest with yourself, both about the actual costs of people
[00:21:48] and about the impact on something like cycle time or the other DORA metrics,
[00:21:53] it gets really clear that you just don’t want to run… I mean, ideally, using managed services, or
[00:22:00] not writing code and using managed services, always gives you better outcomes here. Now, always does
[00:22:06] have an asterisk. I mean, there are services that are very expensive that don’t make sense for you
[00:22:12] to use.
[00:22:13] Algolia isn’t a solution for everyone all the time because if you have an index where you’re
[00:22:19] searching billions of records, probably Algolia is not reasonable. But the way I generally think
[00:22:25] of this is there’s like 90, 95% of the time, we’re all kind of building the same apps. They
[00:22:31] don’t have like enormous traffic. By enormous traffic, I mean, they don’t have so much traffic
[00:22:37] that your data transfer costs are like the biggest part of your bill. Generally speaking, none of us
[00:22:43] have more than, I don’t know,
[00:22:44] a thousand simultaneous users. I mean, generally speaking, most applications don’t have more than
[00:22:49] like 15 simultaneous users, truly simultaneous, never mind a thousand simultaneous users. Most of our
[00:22:54] database tables are sort of in the tens of millions at most.
[00:23:03] And so there’s sort of a standard 90, 95% of the time. And yet most or many lead developers I find
[00:23:11] always want in their head to be
[00:23:13] designing for like a huge amount of data transfer, like, you know, millions of simultaneous users,
[00:23:18] like that’s the model of aspirations. I built this like tank. But the problem is like,
[00:23:23] those just need to sit. Those are all exceptions. And most of the time,
[00:23:28] actually looking at DORA metrics, actually looking at costs, you’ll find that managed
[00:23:33] services that are used by everyone, like Twilio and Algolia, Cloudinary, whatever, are just going to
[00:23:38] win any sort of honest and fair comparison that actually takes into account the costs. And I’m
[00:23:43] shocked at how many people just don’t want to even think about how much developers cost. They
[00:23:50] want to view them as like developers are free. And so if you’re a lead developer and you don’t know how
[00:23:55] much your development team costs, and you’re making decisions based upon this is more expensive
[00:24:00] than that, like, you need to find out how much the team costs. I mean, like, if you don’t know,
[00:24:05] you obviously can’t be making good decisions on what actually costs too much or too little.
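The comparison described here, counting people time alongside the invoice, can be sketched as simple arithmetic. Every number below (the bills, the hours, the hourly rate) is a made-up placeholder, not a figure from Branch or from any vendor pricing page; the sketch only shows how the ranking can flip once engineering hours are priced in.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    annual_bill: float              # yearly invoice from the vendor or cloud provider
    engineer_hours_per_year: float  # time spent building, running, and being on call


def total_cost_of_ownership(option: Option, loaded_hourly_rate: float) -> float:
    """Invoice plus the cost of the engineering time the option consumes."""
    return option.annual_bill + option.engineer_hours_per_year * loaded_hourly_rate


# Hypothetical inputs, chosen only to illustrate the calculation.
managed = Option("managed search service", annual_bill=30_000, engineer_hours_per_year=40)
self_hosted = Option("self-hosted search cluster", annual_bill=8_000, engineer_hours_per_year=600)
loaded_hourly_rate = 100.0  # assumed fully loaded cost of one engineer hour

for option in (managed, self_hosted):
    cost = total_cost_of_ownership(option, loaded_hourly_rate)
    print(f"{option.name}: {cost:,.0f} per year")
```

With these assumed inputs the self-hosted option has the smaller invoice but the larger total, which is the pattern the bill-only comparison hides.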
[00:24:10] Yeah. So I think speaking from my experience, what I can see in,
[00:24:13] my area, right, industry, startups, a lot of people, especially during the tough time recently,
[00:24:18] right? The winter time, some people call it, right? So they just look at the bill. They will
[00:24:22] see, for example, all these managed services or serverless, they cost a lot. The easiest thing to
[00:24:26] reduce is actually, you know, those bills, right? And since we have already a number of people,
[00:24:31] right? Okay. Maybe let’s just switch that to something that we can manage ourselves.
[00:24:35] But what you’re saying is true, right? So at the end of the day, the cost of the people needs to
[00:24:39] be calculated equally, right? So you cannot just say, I reduce the managed service bill,
[00:24:43] but actually that cost is translated to your engineer hours, right? And also probably headache
[00:24:48] whenever there’s an incident or trying to scale things, right? The on-call thing will create some
[00:24:52] kind of a cultural issue within the team as well, right? So the harmony might not be as good as
[00:24:57] before. And speaking about this managed bills, sometimes I find as well, the bill is so high
[00:25:03] because it’s unoptimized. The usage is unoptimized. So maybe also you can think of just optimizing
[00:25:10] rather than switching gear to managing yourself. So I think that’s…
[00:25:13] We have these built-in moments, right? Where every quarter or so we take a look at all the
[00:25:19] bills that are more than 30,000 a year. And we just go ask, you know,
[00:25:24] what can we do to get them down? We move services a lot. And one of the benefits of serverless is
[00:25:29] because you have this really nice container, right? I’m using this service. It’s got a
[00:25:34] well-documented API. I know what I’m using in it. It’s all isolated over here. It actually
[00:25:39] turns out to be much easier to switch them and move them out than it is when
[00:25:43] you build it internally. So we’re on our like fourth referral management service.
[00:25:48] We used airplane.dev as an automation tool and they shut down with 60 days notice. I think today
[00:25:55] is the last day they’re running. And we actually decided to in-house what we had put there,
[00:26:00] but it was so much easier to do it because we could just copy their API.
[00:26:05] And so we were able to take all this stuff we’d written for Airplane and port it over.
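One way to get the isolation described here is to keep every call to a vendor behind a small interface of your own, so that a replacement only has to match that interface. The sketch below is hypothetical: `TaskRunner`, `VendorTaskRunner`, `InHouseTaskRunner`, and `issue_refund` are illustrative names, not Airplane's API and not Branch's code.

```python
from typing import Protocol


class TaskRunner(Protocol):
    """The only surface the rest of the application is allowed to depend on."""

    def run(self, task_name: str, params: dict) -> str: ...


class VendorTaskRunner:
    """Stand-in for a thin wrapper around a vendor's API client."""

    def run(self, task_name: str, params: dict) -> str:
        # A real wrapper would call the vendor's documented API here.
        return f"vendor ran {task_name}"


class InHouseTaskRunner:
    """Replacement that copies the same call shape, so callers do not change."""

    def run(self, task_name: str, params: dict) -> str:
        return f"in-house ran {task_name}"


def issue_refund(runner: TaskRunner, customer_id: str) -> str:
    # Business code sees only the TaskRunner interface, never the vendor.
    return runner.run("refund", {"customer_id": customer_id})


print(issue_refund(VendorTaskRunner(), "c-1"))
print(issue_refund(InHouseTaskRunner(), "c-1"))
```

Swapping the vendor then means writing one new class rather than touching every caller.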
[00:26:09] You know, I mean, I don’t want to understate it: the developers who’ve been working on that have been
[00:26:13] working very hard, like weeknights and weekends, to get that done. But to think
[00:26:18] that we took a service that we were spending like $60,000 a year on, and we were able to
[00:26:23] rebuild parts of it, like a little shim infrastructure for what we were using of it
[00:26:29] in less than 60 days, probably more like 35 days, was a result of having used that service,
[00:26:35] was a result of having this serverless mindset and of being able to say, okay, now we understand the
[00:26:40] problem so well, and we isolated it so well,
[00:26:43] that actually rebuilding it or understanding how to shift it somewhere else ended up being much
[00:26:48] easier. And so I love being in an environment where everything is kind of contained and small,
[00:26:56] there’s good separation of concerns. And the best way to do that isn’t microservices. The best way
[00:27:00] to do that is to use a lot of managed services because they’ve already built all those great
[00:27:04] APIs for you. And then, look, in the case of it’s too expensive or there’s lock-in, you just build it
[00:27:10] then. I mean, the thing that’s so wild to me is people saying, like,
[00:27:13] serverless is about lock-in and it’s terrible. It’s actually like a very cheap way to like learn how
[00:27:18] to build something. And then if you’re going to build it yourself, just copy the parts of it that
[00:27:21] you like. It’s actually much faster. Yeah. Interesting insights, right? So I think many
[00:27:27] people are, I mean, afraid of the lock-in and all that, but I think, hey, if you can get
[00:27:30] started very easily and you can use all the power of the people operating those services, right? So
[00:27:36] they are real experts in running those services, right? And actually just use that
[00:27:41] because yeah, it will make you start
[00:27:43] really, really easily. And I think you mentioned earlier that if you have a choice of writing code
[00:27:49] versus picking up managed service, you should always choose, you know, managed service. And in
[00:27:54] your book, you mentioned this thing called code is a liability. So I think some people think code
[00:27:58] is an asset, right? Some people think code is the most important thing in your company. So tell us
[00:28:03] why code is a liability. Yeah. I mean, code is something you have to maintain. And so this idea
[00:28:09] doesn’t come from me. And so in the book, there are some good links here, but you should
[00:28:13] think about it this way: anything that I write is something that’s necessary for my organization to
[00:28:19] work properly that I have to make sure has proper testing, like doesn’t regress, keeps working. And
[00:28:25] so I have to maintain all of that. That’s a liability. You know, the asset is the experience
[00:28:29] you’re building. And so if I can build something with 10 lines of code, or I can build it with a
[00:28:36] thousand lines of code, I’m going to be much happier in the long run with the 10 lines of
[00:28:41] code. It’ll be easier to maintain. It’ll have a lot of flexibility. It’ll have a lot
[00:28:43] fewer bugs. It’ll be easier to test. All of those things will be much better. And so the less code,
[00:28:48] the better. At Branch, we have a kind of a waterfall process where when somebody says,
[00:28:53] hey, can we build this thing? Will you build this thing? Our first question is actually,
[00:28:58] do we need to build anything at all? And so we do a lot of, why don’t you use this software as
[00:29:03] a service instead? And if we need to make some sort of integration, we’ll do that. But like,
[00:29:07] let’s just go buy a software as a service. That’s step one. Step two is kind
[00:29:13] of, can we like down scope it? Because if you buy a software as a service, like you get all
[00:29:17] those features, those are great. If that won’t work, how do we like take out what you’re asking
[00:29:22] for and make it really small? One of the things that I’ve found over my career, what generally
[00:29:27] happens in organizations is people say, I want this thing. And they usually make it really big
[00:29:31] because they know that you’re going to go develop it. And as soon as you put it live,
[00:29:37] you can’t work on it again for a really long period of time. And so they’re going to pack
[00:29:41] everything they can into it.
[00:29:43] Because that’s the only way they know how to get it. If you reorganize, and this is very much in line
[00:29:49] with the DORA metrics and the Agile Manifesto. If you say, look, two things, one, I’m going to
[00:29:54] make you down scope this thing to time, but two, I promise we’ll keep working on it immediately.
[00:30:00] Like we won’t stop working on it. It’s not a limited time window. And then we work on other
[00:30:04] projects for you. Keep working on it as soon as it gets released, but it’s got to be really small.
[00:30:09] Then you get all this benefit of the feedback and the cycle. So you just
[00:30:12] squash down what people want to like very tight, like just helping them a little bit.
[00:30:18] And then you get it live and then they can see it. And then they can make that next request.
[00:30:21] If you train an organization that way, it’s easier as a founder than as someone who’s
[00:30:26] inside an organization with other founders. But I can tell you, you can train an entire
[00:30:30] organization to think this way. And as long as you deliver on, we will make updates. You come to us,
[00:30:37] we’ll go get other small updates in as quickly as possible. Everything gets more efficient because
[00:30:42] when you build these big projects, like they’re poorly designed, you put them live. They don’t
[00:30:46] work right. We realized like, oh, that was a bad idea. That’s a bad interface. People don’t
[00:30:50] understand it. Like just build it small and watch what happens and iterate on it. So we have this whole
[00:30:55] process of buy SaaS, down scope, use a managed service, use open source, everything we can
[00:31:01] to not write custom code or to write as little custom code as possible.
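The order of questions in that process can be written down as a small decision function. This is a loose sketch of the sequence as described in the conversation (buy SaaS, down scope, managed service, open source, custom code last); the function name and its boolean inputs are invented for illustration, and a real decision would involve more judgment than three flags.

```python
def choose_approach(saas_fits: bool, managed_service_fits: bool, open_source_fits: bool) -> list[str]:
    """Return the steps to take, preferring whatever needs the least custom code."""
    if saas_fits:
        # Step one: if an off-the-shelf product covers the need, stop here.
        return ["buy SaaS"]

    # Step two: shrink the request before building anything.
    steps = ["down scope the request"]

    # Then pick the option that leaves the least code to maintain.
    if managed_service_fits:
        steps.append("use a managed service")
    elif open_source_fits:
        steps.append("use open source")
    else:
        steps.append("write as little custom code as possible")
    return steps


print(choose_approach(saas_fits=False, managed_service_fits=True, open_source_fits=True))
```

Custom code appears only in the final branch, which mirrors the idea that it is the last resort.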
[00:31:06] Yeah, I think it’s a good thought process for anyone who listens, right? So code
[00:31:10] is a liability. It’s a reminder again, if you’re going to be
[00:31:12] writing a lot more code. And infrastructure code or configuration also counts as code, right? So
[00:31:18] yeah, do remind yourself. It does, although, and we don’t have a ton of it, but I do think that,
[00:31:24] and this may be an unpopular opinion, but I do find that domain specific languages for
[00:31:30] infrastructure like YAML and like CloudFormation, I do in my head, put that in a different category.
[00:31:35] And for this reason, my experience is when we write CloudFormation YAML and we get it right,
[00:31:42] we never change it. And so the maintenance cost of CloudFormation YAML is very low.
[00:31:50] And so I don’t really worry about having a lot of that. In contrast, something like Pulumi or CDK
[00:31:57] that’s like code, I find it gets a lot of edits. People keep updating it. It’s got a lot of
[00:32:01] maintenance. And so it has a different profile to me. And so I would view CDK infrastructure,
[00:32:08] Pulumi infrastructure as code, as this code that you want to minimize.
[00:32:12] I actually think of YAML as more like a yarn lock file or something like that. Like it can
[00:32:18] get really big, but it’s not a maintenance headache. So I actually don’t find lots of
[00:32:22] YAML to have the same liability problem because generally speaking, you get it up and it works
[00:32:28] and you don’t change it. Interesting. So speaking about serverless, right? I think we have talked
[00:32:33] a lot about serverless, but I think it’s appropriate to get the definition right.
[00:32:36] Some people think serverless is, you know, Lambda. Some people think it’s a function as a service.
[00:32:42] Right.
[00:32:42] In your view, right? What is your definition of serverless? Because I think from your book,
[00:32:46] it encompasses more than just a function as a service.
[00:32:50] Yes. And I give, I think like four definitions in my book of it. My favorite definition of
[00:32:55] serverless is it’s not my uptime. And so it’s functionally about not having to run that. If it
[00:33:02] goes down and it’s my fault, it has to be that I misconfigured it or I wrote bad code, but otherwise
[00:33:07] I don’t control the uptime. But Ben Kehoe also has, you know, Ben Kehoe’s definition,
[00:33:12] I’m only doing differentiated code, right? Everything that’s undifferentiated, I’m handing
[00:33:18] off. And there’s a bunch of stuff about scale to zero and things like that. If your view of
[00:33:23] serverless is, and I saw a post on LinkedIn recently, which had somebody saying serverless
[00:33:28] is expensive and it’ll send you into ruin and it’s too complex. And just had all these diagrams
[00:33:32] of like Lambdas calling Lambdas and using AWS services. And like, I tend to agree that those
[00:33:37] are, I don’t like those infrastructures. I’ve never built one of those. And I think that they’re
[00:33:41] overly complex.
[00:33:42] And I don’t think he, I mean, he’s never read any of the serverless practitioners that I agree with, at least about what serverless is. And so to me, serverless is very much about asking, how do I use… most of our serverless footprint at Branch is managed services. So it is not, I mean, we use Lambda, we use DynamoDB, we use Cognito, we use AWS AppSync. These are amazing services. My book explains how we use them and why they’re so great.
[00:34:10] And so we do use Lambda, but Lambda isn’t everything in the application.
[00:34:15] It’s like where we need to run some backend code.
[00:34:18] We run it with Lambda, but most of the Lambdas are getting triggered by calls through AppSync.
[00:34:23] But if you don’t know what AppSync is, like to me, it’s like a key part of the infrastructure.
[00:34:27] I’ve also written a lot with Firebase.
[00:34:29] I think Firebase is also wonderful.
[00:34:31] I think when writing kind of hobbyist apps, Firebase is just easier to work with than AppSync and the Amazon infrastructure.
[00:34:38] So I like that Firebase infrastructure
[00:34:41] for simpler applications. The Amazon one is just much better for like financial services, at least it’s more robust in a bunch of ways.
[00:34:49] But that’s how I think about it, really: am I responsible for uptime? If I’m not, as long as the code functions properly, then that’s serverless.
[00:34:56] And so I often will tell people, you know, it’s serverless to go build your shopping site on Shopify.
[00:35:03] Like that’s a serverless architecture for me.
[00:35:05] And if you don’t think about it that way, then I don’t think you’re in line with what serverless actually is.
[00:35:10] So I think that’s a good way to get the business benefits out of it.
[00:35:13] Right.
[00:35:13] And you also mentioned managed service.
[00:35:15] I think for some people, there is a lot of confusion as well.
[00:35:18] What do you mean by managed service?
[00:35:20] I think Twilio is the best example for people of a managed service.
[00:35:23] It is a cloud-based API, usually, that you pay for, and they have a well-documented API.
[00:35:30] They’ve got a whole operations team.
[00:35:32] It’s multi-tenant as a service and you pay them money and they’re just up and they do things for you.
[00:35:37] Yeah.
[00:35:37] And I think in your book, if I remember, I also see like, for example,
[00:35:40] if you have to patch something, right?
[00:35:43] If you have to choose the size, if you have to scale it somehow, right?
[00:35:47] I think that’s not serverless, right?
[00:35:49] Right, right.
[00:35:49] And there are these gray areas, right?
[00:35:51] Because in Lambda, you will pick a size.
[00:35:54] And so like, I view Lambda as serverless.
[00:35:56] And I would just say in Lambda, just buy the largest size because you get more CPU with it, at least buy, I don’t know, four gigs.
[00:36:03] They default to this minimum size that’s terrible and you shouldn’t use it.
[00:36:07] But, you know, that’s a gray area.
[00:36:08] But yeah, I also think about it like, yeah.
[00:36:10] So if you have to patch it, but also if it runs out of disk space, whose responsibility is it to manage that?
[00:36:16] That’s a question I like to ask a lot.
[00:36:18] And so if it’s your responsibility, if you fill up a disk, like it’s not serverless.
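The questions raised in this exchange (who patches it, who handles a full disk, whose uptime it is, whether you pick a size) can be collected into a rough checklist. The sketch below is one possible reading of the conversation, not a definition taken from the book; the field names and the three labels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Responsibilities:
    you_patch_it: bool
    you_manage_disk_space: bool
    you_own_uptime: bool    # beyond your own code and configuration
    you_pick_a_size: bool   # e.g. choosing a memory setting


def classify(r: Responsibilities) -> str:
    """Apply the 'whose responsibility is it?' questions in order."""
    if r.you_patch_it or r.you_manage_disk_space or r.you_own_uptime:
        return "not serverless"
    if r.you_pick_a_size:
        # Sizing alone is treated as a gray area rather than a disqualifier.
        return "serverless (gray area)"
    return "serverless"


# Hypothetical examples of how two setups might be described.
function_service = Responsibilities(False, False, False, you_pick_a_size=True)
self_run_cluster = Responsibilities(True, True, True, you_pick_a_size=True)

print(classify(function_service))
print(classify(self_run_cluster))
```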
[00:36:22] Right.
[00:36:23] So I think that’s a good reminder, definitely.
[00:36:25] And of course, when we talk about serverless, there are many skeptics, like you mentioned, right?
[00:36:29] So many objections.
[00:36:30] In the beginning, we talked about cost, lock-in.
[00:36:33] Some people also think about security, right?
[00:36:35] Because let’s say you use managed services.
[00:36:37] Those managed services are like a vendor, third party, right?
[00:36:39] You don’t know them.
[00:36:40] And your data spreads across different services.
[00:36:43] What’s your take about all these objections?
[00:36:45] Maybe tell us how would you actually, you know, advise people?
[00:36:49] Yeah, the most important thing that you need is good third-party risk management in your organization.
[00:36:54] And I would actually say one of the biggest hindrances to serverless adoption in organizations that really want to adopt it is that they have very poor third-party risk management programs in their company.
[00:37:05] And what they have decided to do in the company, maybe not even consciously.
[00:37:10] But what they’ve decided to do in their company is basically make it really hard to do contracts with other companies as a way of security.
[00:37:18] That’s not security.
[00:37:19] That’s just pretending you’re secure.
[00:37:21] And I recognize a lot of lead developers may not have full control over this, but it’s useful to know.
[00:37:26] It’s useful to, like, have this in your head.
[00:37:28] Your company should have a robust way for you to say, I would like to use this company and to be able to review it in a reasonable amount of time and see if it meets the security guidelines
[00:37:40] for the information that that service will get. You should have a good way of looking at its historic uptime and evaluating whether you think its uptime is going to be appropriate for your use of it.
[00:37:50] And then, you know, your finance department, your legal department need to be able to review the contracts and make sure that they have an appropriate DPA, for example, and have other things.
[00:37:59] And then if they don’t meet those things, you shouldn’t do business with them.
[00:38:02] But I’ll tell you, as a regulated U.S. insurance company, there are few companies regulated, you know, as heavily as U.S. insurance companies.
[00:38:10] We use lots of these services and it’s not a problem, but it’s about having really good third-party risk management and that third-party risk management understanding that they exist to help us do contracts and work with other companies because that’s a key differentiator in how fast we can build.
[00:38:28] So getting the company aligned with that is a top-down goal of, like, helping everybody understand we need this function within our company to be able to do these things.
[00:38:37] But that’s how you do it.
[00:38:38] I’ve been in charge of information security
[00:38:40] at, you know, many organizations that have had a lot of compliance requirements.
[00:38:45] And every time I run into anyone that says, well, we would never pass compliance with that, I’m just like, no, you just don’t know how to do it.
[00:38:50] I guess to quote Twitter, that’s a skill issue.
[00:38:53] It’s not difficult, but it does require sometimes creativity and it does require knowing your stuff.
[00:38:58] You do need to understand, like, you know, what is our information security policy?
[00:39:02] Why do we have these rules?
[00:39:03] Why do these regulations exist?
[00:39:05] And you need to be fluent enough in it and have a good chief information security officer.
[00:39:10] So you don’t have all these things that can be very difficult, but I can tell you it is so much easier to be secure with serverless and managed services than it is with an internal network.
[00:39:24] And so, like, we are fully no internal network, fully zero trust at Branch.
[00:39:29] And the levels that we can achieve and what we are able to do on the budget we have, which is very low, is absolutely phenomenal and would have been completely unattainable for me 10 years ago
[00:39:39] at BuildFax, which honestly has what would be considered like a best practices cloud architecture today.
[00:39:45] I mean, it’s full infrastructure as code.
[00:39:47] It’s got great network roles and great VPCs and everything.
[00:39:50] But managing that, proving that and everything around that is so much harder.
[00:39:55] And so, you know, didn’t leave a lot of budget for other things.
[00:39:58] And at Branch, we have managed to make budget for just amazingly high quality security.
[00:40:04] I’m very amazed when I, you know, read that Branch is an insurance company, highly regulated.
[00:40:09] Yeah.
[00:40:09] And yet you succeed in adopting all these serverless, right?
[00:40:12] Because I think security compliance is always top of mind, especially in fintech or, you know, financial services industry, right?
[00:40:18] But I think what you mentioned is probably like it’s more about understanding the compliance itself, right?
[00:40:23] And be creative in solving it.
[00:40:25] And maybe you put more proper checks and balances before you actually use those managed services.
[00:40:30] And speaking about, you mentioned it’s not our uptime, right?
[00:40:33] So what happens in incidents where the third party is down, right?
[00:40:36] I think it’s also one of the most common objections.
[00:40:39] If those core services are down, then we are also down and we can’t do anything.
[00:40:44] So what’s your take on this objection?
[00:40:46] Well, one thing is we run in US East 1, which is great because if there are problems in US East 1, we know a large chunk of the internet’s going down as well.
[00:40:54] And generally speaking, we go down less than the rest of the internet.
[00:40:58] So I’ve been in US East 1 since 2008.
[00:41:00] And this is what I found.
[00:41:01] So I remember the outage in like 2012.
[00:41:04] And there was a Netflix outage there.
[00:41:05] I just realized that, you know, you get a lot of grace when it’s.
[00:41:09] So in the last five years, there was only one significant US East 1 outage.
[00:41:14] It was a couple hours.
[00:41:15] It didn’t take us fully down.
[00:41:17] It was the one that like affected Route 53 DNS, where it became clear that that service is bound in US East 1 and it’s not really free from US East 1.
[00:41:28] And so, you know, everyone says, why would you be in US East 1?
[00:41:30] But I’ll tell you, like, if US East 1 has problems, like no one’s going to blame you because all of their other products aren’t going to work either.
[00:41:36] And then outside of that, you know, we basically
[00:41:39] don’t have outages.
[00:41:40] We did have one in January 2022.
[00:41:43] So we’ve been selling insurance since the middle of 2019.
[00:41:46] And so I can tell you other than that couple-hour US East 1 outage and this January 2022 one, where this insurance-specific vendor we use,
[00:41:56] and there aren’t a lot of other choices,
[00:41:58] this insurance-specific vendor we use went down,
[00:42:00] other than that, you know, there are kind of partial,
[00:42:02] there are degraded services from time to time.
[00:42:05] Occasionally something won’t be available.
[00:42:06] Like we have to pull motor vehicle records, and
[00:42:09] those come from the states and those services go down like every other day, one of the services is down for like 15 or 20 minutes.
[00:42:15] And so, you know, you just build a lot of alerting around it.
[00:42:18] And so our applications are very aware of that.
[00:42:21] But in terms of the core infrastructure of AppSync, Dynamo, Cognito, Lambda in US East 1, I don’t think those have really gone down at all ever.
[00:42:29] Even in that US East 1 problem, they were all still working.
[00:42:32] I think Cognito had some issues, but was just slow.
[00:42:36] So I would say my experience with “it’s not my uptime,
[00:42:39] I hired a company that is very good at uptime” has resulted in, we basically don’t go down.
[00:42:45] We have much better uptime than I’ve ever had running, you know, in any of these organizations running my own infrastructure.
[00:42:50] And so we’re running some of the infrastructure.
[00:42:53] So I think it’s kind of a non-issue.
[00:42:55] It is certainly possible.
[00:42:56] I will give one sort of caveat here: I did build an application using Auth0 for authentication and that service went down all the time in my view, not for long periods of time, but all the time.
[00:43:09] And like, I would never, I mean, maybe they’ve fixed this.
[00:43:12] I’m skeptical though, after their acquisition and like everything that’s happening with Okta. I would never use Auth0 again and I would be very cautious in authentication services, but I can personally vouch for Cognito and Firebase Auth having been very reliable services.
[00:43:29] Yeah.
[00:43:29] So when you speak about uptime and reliability, right, when you choose serverless, it’s not like any kind of serverless or managed service, right?
[00:43:35] You also look at their historical uptime.
[00:43:37] You have.
[00:43:38] Yeah.
[00:43:38] Can they get it?
[00:43:39] Can they get a guarantee, a good SLA, right?
[00:43:41] Is there any compensation that they will give whenever that’s the case?
[00:43:44] Yeah, I tend to look at SLAs as non-compensatory.
[00:43:46] Like they’ll never compensate you.
[00:43:48] So I tend to like an SLA where it’s punitive to the company.
[00:43:54] What I like to feel is like, if you go down, it hurts you.
[00:43:57] And I do think like Amazon views these outages very personally and reputationally, and I think it matters to them quite a lot.
[00:44:04] And so they’re never going to like compensate me appropriately in the SLA.
[00:44:08] But I think the SLA is the thing to look at.
[00:44:11] And just to make sure you’re getting an honest look at the historic uptime, because that’s the best way to tell.
[00:44:15] We do use some services that are in beta and you know, we just do a lot of testing to get comfort with them.
[00:44:21] But there are occasionally times we’ll say, yeah, we’re not going to use that yet.
[00:44:25] We’re going to wait another six months or a year.
[00:44:27] But we used Neon, a serverless Postgres, really early because we really wanted it.
[00:44:32] And Amazon’s serverless Postgres offerings are not serverless.
[00:44:35] They’re terrible.
[00:44:36] And Neon is a great serverless Postgres database.
[00:44:38] And so we started using them when they were in private beta.
[00:44:41] We tested them out.
[00:44:42] We were like, how do we feel?
[00:44:43] And we just built in some backup caching in case it wasn’t available.
[00:44:47] And that’s worked out really well.
[00:44:49] Neon has been very reliable.
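The conversation does not describe how the "backup caching" was built. As a rough sketch only, assuming a read path that can tolerate stale data, such a fallback could remember the last successful result per key and serve it when the primary database call fails. The names `FallbackCache` and `fetchPrimary` are hypothetical and are not taken from Branch's code or from any Neon API.

```typescript
// Hypothetical sketch: serve the last known good value when the primary store is unavailable.
type Fetcher<T> = (key: string) => Promise<T>;

class FallbackCache<T> {
  private lastGood = new Map<string, T>();
  private fetchPrimary: Fetcher<T>;

  constructor(fetchPrimary: Fetcher<T>) {
    this.fetchPrimary = fetchPrimary;
  }

  async get(key: string): Promise<T> {
    try {
      const value = await this.fetchPrimary(key);
      this.lastGood.set(key, value); // refresh the backup copy on every success
      return value;
    } catch (err) {
      if (this.lastGood.has(key)) {
        return this.lastGood.get(key) as T; // primary unavailable: fall back to possibly stale data
      }
      throw err; // nothing cached for this key, so surface the original failure
    }
  }
}
```

A caller would wrap its database query in `fetchPrimary` and call `cache.get(key)` instead of querying directly. In a serverless function the in-memory map only lives as long as the warm instance, so a real setup would likely keep the backup copy in an external store.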
[00:44:51] Yeah.
[00:44:51] Apart from the historical uptime, I think one thing that is quite a recent trend, right,
[00:44:56] is people publishing a postmortem whenever there’s an incident, right?
[00:44:59] If the company diligently publishes postmortems and takes them seriously, I think that may also be an indicator that the company is serious about their uptime.
[00:45:08] Right.
[00:45:08] Yeah.
[00:45:09] It’s another reason why I just feel a lot safer for production workloads in Amazon: Amazon is serious about that in a way that Google and Microsoft are not.
[00:45:19] Although, again, I run them in the other clouds as well.
[00:45:22] I think many people here would have been intrigued by Branch, right?
[00:45:25] And let’s talk a little bit about Branch.
[00:45:27] What makes it unique?
[00:45:28] Maybe in terms of how you run the development team, how you run the infrastructure.
[00:45:32] There are a few things that you mentioned in the book.
[00:45:34] Maybe let’s start with your primary development principle, right?
[00:45:37] Where you say,
[00:45:38] “When we write code, we optimize for maintainability.”
[00:45:41] I think this is very interesting.
[00:45:42] Maybe can you talk more about it?
[00:45:45] Yeah.
[00:45:45] I’m always surprised at how infrequently development teams will say what they’re optimizing for in terms of what they’re doing.
[00:45:52] Like most teams don’t have this principle, but it’s very useful to know when you’re like looking at code or when you’re sitting down to write something like, what is my guiding principle on how to write this?
[00:46:02] My experience is, if you write something for a good organization, it is going to live for a while.
[00:46:08] You know, I wrote code in Perl in 1997 that’s still being used widely.
[00:46:14] And so I think a lot about what is the most important for the company and I don’t think it’s about making it speedy, right?
[00:46:22] I don’t think it’s about other things.
[00:46:24] I think it is:
[00:46:25] I want an average developer to be able to understand this and work on this.
[00:46:29] I think far too many companies are like, we’ve got these 10x ninjas.
[00:46:33] We write for 10x ninjas.
[00:46:34] We’re 10x ninjas.
[00:46:35] We just hire 10x ninjas and all that.
[00:46:37] First of all,
[00:46:38] if you have a bunch of people who think they’re 10x ninjas, one, they’re probably not.
[00:46:42] And two, they’re probably writing unmaintainable garbage.
[00:46:44] It’s a lot better for people to have the cultural identity of, it’s important for me to write.
[00:46:49] I do like the idea of, I’m writing this, and the next developer who has to work on this has a gun to my head.
[00:46:54] Like, write this thing
[00:46:55] so it makes sense,
[00:46:57] so it’s easy to read. And on code review: one, you should have code review. You know, everyone, senior developers included, should have their code reviewed.
[00:47:05] Everyone should have their code reviewed, and we should have,
[00:47:08] you know, an understanding that I’m writing this so that somebody else, an average developer, can come along and understand what the heck I was doing.
[00:47:16] And like, let’s do that.
[00:47:18] And it makes code reviews easier to know, like, this is what I’m optimizing for.
[00:47:23] It’s like having a linter or having style rules.
[00:47:26] You just like have opinions about things and everything gets easier.
[00:47:31] And so that’s our primary principle.
[00:47:32] And there’s a corollary to that, which is that less code in general is more maintainable than more code.
[00:47:38] And so there’s a lot of debate about, like, should you DRY things up?
[00:47:43] How do you think about repetition?
[00:47:45] I think you can obviously go overboard with DRY, but I do think, in general, too much repetition is very common.
[00:47:51] We hire a lot of junior developers at Branch.
[00:47:53] It’s very common for junior developers to write patterns that are too repetitive, that could be much better and much more readable, because everybody knows the pattern of: if i equals one, then return one; if i equals two, then return two, or whatever.
[00:48:07] Like that,
[00:48:08] obviously, you shouldn’t do. That’s too much repetition.
[00:48:11] I find more junior developers tend not to have ways of asking, how do I abstract this?
[00:48:18] Like, there’s a pattern that I’m repeating here.
[00:48:21] How do I abstract that?
[00:48:22] So in a very normal setting, I find that I do reviews and like somewhat frequently say, Hey, you can get rid of some of this repetition, like abstract some of this logic.
[00:48:32] And, you know, I think that that’s the key
[00:48:34] part of the principle: to have less code, but not at the expense of maintainability.
[00:48:38] And I think that’s the key to sustainability.
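The repetition Joe describes in review ("if i equals one, then return one; if i equals two, then return two") can be illustrated with a small, made-up example. The plan names, fee values, and function names below are hypothetical and exist only to show the shape of the refactor, not anything from Branch's codebase.

```typescript
// Repetitive version: every new plan means another copy-pasted branch.
function feeRepetitive(plan: string): number {
  if (plan === "basic") { return 10; }
  if (plan === "plus") { return 20; }
  if (plan === "premium") { return 30; }
  throw new Error(`unknown plan: ${plan}`);
}

// Abstracted version: the varying data lives in one table,
// and the shared lookup logic is written once.
const FEES = new Map<string, number>([
  ["basic", 10],
  ["plus", 20],
  ["premium", 30],
]);

function feeAbstracted(plan: string): number {
  const fee = FEES.get(plan);
  if (fee === undefined) {
    throw new Error(`unknown plan: ${plan}`);
  }
  return fee;
}
```

Both functions behave the same for the plans listed. The second is shorter to extend and arguably easier to review, which fits the stated goal of less code without giving up readability.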
[00:48:40] Yeah.
[00:48:40] Speaking about developers, I think this is also unique at Branch, right?
[00:48:43] You said that you hire more junior developers and actually you have less so-called infra developers or engineers, right?
[00:48:50] So tell us like, how do you actually run a good insurance company with mostly junior developers?
[00:48:57] Yeah.
[00:48:57] My hypothesis in starting Branch, and I had seen this at a small scale, but not at a large scale before,
[00:49:05] was that
[00:49:06] if we build serverlessly,
[00:49:07] then mostly we’re going to be building interfaces.
[00:49:10] And so my belief was I could hire front end developers and I could basically make them full stack developers,
[00:49:16] as long as we were using the same language, JavaScript/TypeScript, on the front end and the backend.
[00:49:21] And I thought if it’s not our uptime, if other people are doing the uptime, then I should, in theory, just be able to hire front end developers, make them full stack developers, and then that’s all we would have.
[00:49:33] And then everyone would just be a developer and then everybody could work on anything.
[00:49:37] They could work on
[00:49:37] the infrastructure, which is in YAML.
[00:49:40] They could work on the front end, they could work on the backend, and then we’d have it all in a monorepo.
[00:49:43] Oh, and we’d have a mobile app.
[00:49:45] That would also be using JavaScript and React as well.
[00:49:49] And you know, everything would be kind of the same.
[00:49:51] It would all be the same repository.
[00:49:52] We would deploy it monolithically.
[00:49:54] Everyone would have their own isolated Amazon account to deploy it.
[00:49:57] And so this was the vision from the beginning and it worked.
[00:50:00] It worked so well.
[00:50:02] And actually let me take this back.
[00:50:03] So at the beginning I said, I can’t hire a typical
[00:50:07] senior developer, because I know what’s going to happen. If I’d hired a typical lead dev who’s listening to this podcast into that, the average one of them would have said, oh, I know how to build this.
[00:50:19] It’s not like this.
[00:50:21] We need containers.
[00:50:22] We need to go build some containers.
[00:50:23] We need to do this thing.
[00:50:23] We need to do that thing.
[00:50:24] And my vision would be eroded.
[00:50:26] And I believe in empowering people.
[00:50:28] I want to hire someone and I want to let them do it the way they want to do it.
[00:50:32] And so I said, I can’t do that.
[00:50:34] So I said, okay, the safest thing that I can do
[00:50:37] is hire more junior front end developers who I feel are capable on the front end side. I think I can teach them how I’m thinking about building this, and they won’t know any differently, right?
[00:50:50] By the way, it’ll be so empowering because instead of just being a front end developer, who’s like dependent on other people, they’ll be able to work on everything.
[00:50:56] They’ll work on infrastructure and everything.
[00:50:58] And so we did this and it worked really well.
[00:51:00] And we said, okay, let’s hire people directly out of boot camps now, instead of like with a little experience.
[00:51:06] And so we did that.
[00:51:07] And we realized these boot camps aren’t teaching people anything useful.
[00:51:10] Like we’re having to like retrain them on everything.
[00:51:13] I’ve got lots of thoughts about boot camps.
[00:51:15] And so then we said, let’s run our own boot camp because the boot camps aren’t useful.
[00:51:20] And then we ran our own boot camp and we did it again.
[00:51:22] And so, you know, my opinion today is if you set this up correctly, and if you hire the right people who don’t have preconceptions about, you know, how things should be done, then we have lots of people who are very happy with this.
[00:51:37] So we’re going to run our own boot camp and we’re going to be able to do it.
[00:51:38] By the way, we did a search for a VP engineering fairly early on about three and a half years ago and found this wonderful guy, Ivan Herndon, who had been an engineering manager at StockX, but mainly front end, but like really understood everything and had taught at a boot camp.
[00:51:55] And he was like fully bought into like, this is a great model.
[00:51:58] I like this.
[00:51:59] But it took like eight months to find him.
[00:52:01] Like it was just like a long journey, but we had built enough then that we went out and hired more senior developers.
[00:52:04] And I think that’s what we did.
[00:52:07] And we basically screened for like, who’s going to like this, you know, and we found senior developers who were like, I love this.
[00:52:13] This is great.
[00:52:14] Like, this is exactly what I want.
[00:52:15] Like, it’s so easy and I have so much control.
[00:52:17] And like, again, when our senior developers are given a feature, they do all of it.
[00:52:23] They can do 100% of everything.
[00:52:24] The infrastructure, the front end, the back end. We truly have full stack developers, but they’re full stack developers because the stack is simple, right?
[00:52:32] It’s all the same language.
[00:52:34] It’s all in the same repository, right?
[00:52:36] It has a monolithic deploy.
[00:52:38] And so we can make full stack developers if we make the stack simple.
[00:52:43] Otherwise it’s much harder to have full stack developers because you need to know too many different technologies.
[00:52:47] And so it’s worked phenomenally well.
[00:52:49] We just have developers.
[00:52:50] There’s no front end.
[00:52:51] There’s no back end.
[00:52:52] There’s no API.
[00:52:53] There’s no infrastructure people.
[00:52:54] There’s no ops people because it’s not our uptime.
[00:52:57] Yeah, I think it’s a really interesting model, right?
[00:52:59] If people haven’t heard about it, I think this is a very interesting thing that you can learn from Joe’s book, right?
[00:53:05] So I think how to make it work is something
[00:53:07] that maybe you need to work on more, like, for example, training people and making sure your code is maintainable. But I can imagine, for example, having everything as a serverless managed service, right?
[00:53:16] Every developer is empowered to not just code and commit their changes, but also bring it up to the production, right?
[00:53:23] And even operating it in a sense, because they can just see it end to end.
[00:53:27] And many people these days try to build platform engineering, right?
[00:53:30] Simply because they cannot do self service, right?
[00:53:32] They cannot make deployment themselves.
[00:53:34] So I think this is a different kind of perspective on
[00:53:36] how you
[00:53:37] run the engineering team.
[00:53:38] So if you use more serverless, I think you probably don’t need so much platform engineering effort, right?
[00:53:43] So you can do it end to end.
[00:53:45] And I think there are many other things in the book that you can learn from Branch.
[00:53:48] Like, for example, a few things that I learned: the department head is actually the one who negotiates contracts with the SaaS vendors, not a central team.
[00:53:57] And the other thing is that the engineering lead is the one who configures the cloud, or acts as the DevOps engineer, kind of thing, right?
[00:54:04] So I think that’s also interesting.
[00:54:06] Maybe one thing,
[00:54:07] if you can share, right?
[00:54:08] People might be intrigued about how much of a cloud bill Branch is actually running.
[00:54:13] Yeah.
[00:54:14] I mean, I think in the book I shared, so we have an Amazon account for every developer and for every environment, and every developer and every environment has a full copy of production.
[00:54:24] And this is cheap when you use serverless because everything scales to zero.
[00:54:29] So for the average developer who’s working at least 40 hours a week doing development, their environment might cost four or five dollars
[00:54:37] a month, because it’s just not that much usage.
[00:54:40] But yeah, I think I shared a bill where all developer environments, everything, came to about a thousand dollars at Amazon for that month. We’re now significantly bigger.
[00:54:50] So, you know, the full bill, including the production environment, is probably more like seven or eight thousand dollars a month right now for Branch, but that’s every environment, every developer, everything, including all of the continuous integration testing that we have.
[00:55:03] So it’s so much cheaper.
[00:55:04] I mean, at BuildFax in 2015,
[00:55:06] I think the Amazon monthly bill was probably $30,000 a month, and that was doing so much less, honestly, in that environment, but having to run, you know, those VMs and containers.
[00:55:18] Wow.
[00:55:18] Seven, 8,000 a month for all environments.
[00:55:21] That’s really.
[00:55:21] All environments.
[00:55:22] Yeah.
[00:55:22] Yeah.
[00:55:23] And you are a fully running insurance company as well.
[00:55:26] So kudos to you for running that.
[00:55:27] So I think we reached the end of our conversation.
[00:55:30] So before I let you go, so to speak, I have one last question for you, Joe.
[00:55:33] It’s been a pleasant conversation by the way.
[00:55:35] So this question
[00:55:36] I call the three technical leadership wisdom.
[00:55:38] You can think of it just like advice that you want to give to us to learn from you.
[00:55:42] Yeah.
[00:55:42] So I’ve got three here for you and I’ll start with something that I’ve said before, which is one, you should have an optimization principle for how to write code.
[00:55:52] It’s just as important as having linting rules or style rules.
[00:55:56] You should also have an opinion about what you’re optimizing for.
[00:55:59] So we optimize for maintainability.
[00:56:01] I think it would be fine,
[00:56:02] you know, if in some places it’s, we’re optimizing for the lowest
[00:56:06] latency execution or whatever it is, but you should have that.
[00:56:09] You should write it down because it will help solve problems and it will help resolve questions and doubts when you’re not there.
[00:56:16] So you should do that.
[00:56:18] And then, you know, my view is that optimizing for maintainability is a great default.
[00:56:23] And the second thing is less code is more maintainable than more code.
[00:56:27] And that is just true outside of like obfuscation contests for like small lines of code.
[00:56:33] And so you should strive for less code.
[00:56:36] Okay.
[00:56:36] Again, the book has sort of an interesting waterfall about how do I, what is a tactic when I’m being asked to build something?
[00:56:41] How do I end up with less code there?
[00:56:43] And then finally, I have a saying that I give all the time.
[00:56:46] So I’ll give that as my final one, which is along these same lines, but it is that it is better to spend two weeks researching and two days developing than it is to spend two days researching and two weeks developing.
[00:57:00] And I’m always just amazed by organizations that don’t give developers
[00:57:06] time to research how they would do something.
[00:57:09] And I’m actually less amazed at developers who just want to start building.
[00:57:12] It’s the hardest thing I know.
[00:57:13] I mean, I always want to just start building, but the more you can train yourself and your teams to think about how do I spend a good period of time where I’m just trying to understand what’s out there, what are the options and how do I make that a good thing?
[00:57:30] And like, I’m going to throw away whatever’s done at the end of it.
[00:57:33] And for this, you really need leadership
[00:57:36] to help you.
[00:57:36] But I think as a tech lead, you can set this tone, and I’ll give as the example our Airplane migration.
[00:57:42] When we found out essentially like January 5th, that we were going to have 60 days, we actually spent basically all of January trying to figure out what we were going to do.
[00:57:52] We looked at a bunch of other services that were options. We evaluated them, and we talked to their teams and tried to understand things.
[00:57:59] We looked at what it would take to build it ourselves and what that would look like.
[00:58:02] And so we spent what I think a lot of people would say is
[00:58:05] too much time, you know, not acting, because the service was going to shut down and it handled critical production functions.
[00:58:11] But the reality is you just can’t make the right decision unless you spend time understanding what’s out there.
[00:58:17] And today, if you’re building new things, one of the things you need to know about what’s out there is which managed services could do a big chunk of this for me.
[00:58:27] And you’re not going to understand those unless you give yourself the time to play around with them.
[00:58:32] And almost all of these providers will give you free trials.
[00:58:35] You may have to talk to someone.
[00:58:37] I know a lot of developers really hate scheduling the meeting and doing the zoom demo and stuff.
[00:58:41] You may have to do that.
[00:58:42] I’m sorry, but it’s still worth it.
[00:58:44] You should do it.
[00:58:45] Don’t say I can’t do a self sign up on this thing.
[00:58:48] I’m not even going to consider it because there’s no pricing page.
[00:58:50] We can’t consider it.
[00:58:51] Don’t do that.
[00:58:52] Go understand it.
[00:58:53] Even if you don’t use it, there’s actually real important value in understanding what is the service, get access to documentation, understand what that interface is, understand what that price is.
[00:59:04] All of that will really
[00:59:05] help you understand what you need to do better, even if you don’t use it.
[00:59:09] And we just don’t do that.
[00:59:10] We don’t do any research and planning in dev.
[00:59:13] It’s like, oh, I know one way to build that.
[00:59:15] So I’m just going to do it that way.
[00:59:16] And like, don’t do that.
[00:59:17] Do take weeks to research.
[00:59:19] It’s worth it.
[00:59:20] Yeah.
[00:59:20] Not just research in terms of product, but sometimes like design, right?
[00:59:24] Like how would you design your solution?
[00:59:25] Yeah.
[00:59:25] And sometimes understanding the problem itself, right?
[00:59:28] Because sometimes.
[00:59:28] Oh, absolutely.
[00:59:29] Yeah.
[00:59:29] All of that is critical.
[00:59:31] Yeah.
[00:59:31] Yeah.
[00:59:32] So thank you so much, Joe, for this opportunity, for this great talk.
[00:59:35] So if people
[00:59:35] love this conversation,
[00:59:36] and they want to connect with you or find out more about your resources,
[00:59:40] like your book, is there a place where they can reach out online?
[00:59:43] Yeah.
[00:59:43] I’m Joe Emison on Twitter/X and, yeah, my book’s on Amazon:
[00:59:48] Serverless as a Game Changer.
[00:59:50] I really highly recommend people to read it if you are into serverless.
[00:59:53] And if you’re a skeptic, I think you can also read this book just for your knowledge.
[00:59:56] Yes.
[00:59:57] So thanks again for your time, Joe.
[00:59:59] Thank you.
[01:00:02] Thank you for listening to this episode and for staying right
[01:00:05] until the end. If you enjoyed it, I would appreciate it if you share it with your
[01:00:10] friends and colleagues who you think would also benefit from listening to this episode.
[01:00:14] And if you’re new to the podcast, make sure to subscribe and leave me
[01:00:18] your valuable review and feedback.
[01:00:20] It helps me a lot in order to grow this podcast better.
[01:00:23] You can also find the full show notes of this conversation on the episode page at
[01:00:28] techleadjournal.dev website, including the full transcript, interesting quotes, and
[01:00:32] links to the resources mentioned from the conversation.
[01:00:35] And lastly, make sure to subscribe to the show’s mailing list on
[01:00:40] techleadjournal.dev to get notified for any future episodes.
[01:00:44] Stay tuned for the next Tech Lead Journal episode and until then, goodbye.