Episode 10
February 5, 2026
Quinn and Thorsten announce that the Amp editor extension will soon self-destruct. The time of the sidebar is over.
Transcript
Thorsten: Hello, and welcome to Raising an Agent, Episode 10. I’m Thorsten and here with me is Quinn. Hi, Quinn.
Quinn: Hey, Thorsten. All right, we shipped "Deep Mode." There’s a lot more stuff coming, but how have you been using Deep Mode? You’ve fallen in love with some of these new models.
Thorsten: Yes, a big fault of mine, but yes. We shipped Deep Mode last week. Deep Mode is a new agent mode in Amp. We now have three: Smart Mode (the main mode, powered by Opus 4.5 right now), Rush Mode (using a fast model, I think it's Haiku right now), and the new one, Deep Mode. It's powered by GPT-5.2-Codex Medium, quite a name. We shipped it because GPT-5.2-Codex is amazing in a lot of ways. It's really smart, it can write some really good code, and it's really thorough in how it researches and goes off and finds stuff. We called it Deep Mode because we realized the model isn't great at the fast, back-and-forth assistant tasks you might use Opus for. It's kind of "lazy" for that; it researches for too long. But if you have a well-scoped problem or a big task, you can send this thing off and say, "Go and do this." You'll need to switch tabs and do something else while it goes off, researches thoroughly, and often comes back with incredibly impressive results. We put this in another mode because it's a different way of working with Amp.
Quinn: When we are trying a new model, we try it on internal tasks a lot. How did we get to the point where we realized actually this is a model that needs to be used in a different way? It's not a replacement for the way that you work with Opus.
Thorsten: I think when I first tried it, there had already been a lot of chatter about GPT-5.2-Codex: Peter Steinberger famously loves it, Theo loves it, and they all say it's a different beast. We tried it when it came out and it just didn't feel like it fit the way I work. So then I gave it an honest try again, anticipating that I'd have to adjust my expectations. What we realized is that you don't look at the assistant anymore. A lot of the stuff we've been saying about "this is how you should prompt" and the feedback loop, a lot of that doesn't apply. But the results don't lie. Opus is trigger-happy; it wants to run stuff and get back to you. Deep Mode just goes off and does something. When you see this in action, you realize: "Oh, this is another way to get to these results, but we need to treat it differently." Nico and Hitesh started working on Deep Mode to optimize for this mode of working. Basically, we don't say "make it work exactly like Opus, make it work like an assistant"; we make it work like a thing you can send off, and it does stuff for you.
Quinn: Yeah. A lot of times when you hear people talking about "this model is better than this other model," it's a lot of people who were stuck on a model they were used to and they try a new model with the same kind of prompts and expectations. That doesn't work. One of the things with Agent Modes is we’re trying to get you out of that mindset. Even if you don't use these things right, they still are so magical and they often work.
Thorsten: Last episode we said the assistant is dead, long live the factory. That was already based on Deep Mode. Once we started playing with it, it became obvious that if the models are going in this direction—where they go for longer and further—then the assistant model of having the model watch you in a sidebar just doesn't feel like the thing you would do. You start async tasks that take 45 minutes and then you just check on them once they're done.
Quinn: If Codex 5.5 is just 10% more accurate and reliable, then you have to wonder: Why only have one running at a time? Why watch it in a sidebar?
Thorsten: We’ve optimized a lot of our dev tooling for agents over the last few months. We added more "skills" for how to do things in our codebase. For example, I added a skill so it knows how we release our web server and how we release our clients. If I have a bug, I can ask Amp, "Help me figure out why this wasn't deployed." It then loads the skill and knows how to do it. The more you accumulate of these skills, the more it knows how to navigate, and you don't have to provide that stuff in the prompt.
Quinn: Skills are the solution to the general problem of wanting to give the agent a little more help with a common task. It feels like skills as an abstraction have really stuck.
Thorsten: I don’t know who tweeted it, but somebody said: if you do something with the agent, say you use the gh tool to analyze a build failure, at the end of the conversation you can say, "Now build me a skill. Take everything you learned about solving this problem and put it in a skill." And Amp has a built-in skill that's good for creating skills. It can load that skill and figure out where to put the new one and how to write it. At the low-level end of the spectrum, we have a tmux skill. It basically explains how to use tmux in our codebase, how you can test the CLI using tmux. Our TUI (terminal user interface) needs to be killed with a double Control+C. The agent always got it wrong, so I put it in this skill. Now when you ask it to test the CLI, you can see it often goes: "Oh, there's something running, double Control+C." We have skills about gcloud stuff, like how to use the gcloud command with our hosting on GCP. I talked with Tim about this. I never loved gcloud; the CLI feels hard to use. Now with agents, it doesn't matter that much. I'm just like: "Here's an error message, the user said this request failed, use gcloud and figure this out. Get the logs." And you can see it rip through gcloud commands. It made me think: What is software? If I can give my agent a skill and it figures out how to do the most complex log analysis queries, why would I go to the web dashboard?
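To make this concrete: a skill like the tmux one described above is typically just a markdown file with a short frontmatter header that the agent loads on demand. Here is a minimal, hypothetical sketch; the file name, session name, and wording are illustrative, not Amp's actual skill:

```markdown
---
name: tmux-cli-testing
description: How to test the CLI in this repo by running it inside tmux, and how to kill the TUI.
---

Run the CLI in a detached tmux session so you can observe its output:

    tmux new-session -d -s cli-test 'pnpm cli'
    tmux capture-pane -t cli-test -p

The TUI must be killed with a DOUBLE Control+C. Send it twice:

    tmux send-keys -t cli-test C-c C-c
```

Because the instructions live in a file rather than in your prompt, you only pay the context cost in the conversations where the agent actually loads the skill.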
Quinn: One of the things with Amp is it’s connected to ampcode.com and you have all these shared threads. I want to show something we’ll be shipping soon: the ability to see all of the skills your team has been using. Just looking at this, we can see the BigQuery skill is used a lot.
Thorsten: The BigQuery skill is amazing. This would be the biggest argument to say: "Stop copy-pasting stuff to and from ChatGPT." This is it. You ask it, "How many users used this feature?" use BigQuery, and it goes off, finds the table names, and queries stuff. It’s really good at SQL.
Thorsten: We optimized our testing—we have one command to run all tests called pnpm test. It uses cached results, so if you haven't changed anything in a specific package, it doesn't run the tests.
Quinn: This sounds like Bazel. Surely we used Bazel, this big, hulking, heavyweight thing that’s perfect for this, because there’s no way we could write a caching test harness on our own, right?
Thorsten: Some people are going to hate me for this, but I feel like agents are not a confirmation that these complex build tools like Bazel will win. Agents are really good at using "dumb" tools. You don't need heavy, crazy tools. You just say, "This is the one command you should always run." We have 19 people coding on Amp. Tim on our team optimized a lot of our tests and our Svelte Check. He basically built a whole new implementation of Svelte Check in Zig—it’s called Zvelte Check. It turns out it actually did regress the "human" dev experience using SvelteKit in VS Code for us a bit. We had this difficult decision: which is more important—to preserve the human dev experience or to preserve the agent dev experience? Half of us are in VS Code and the other half are in NeoVim. We did end up briefly rolling it back to preserve the human dev experience, but we are willing to really regress the human experience. It makes me feel like I don't want to stay in VS Code with all of my extensions if that means it’s going to slow us down in making our codebase better for the agents. Once you have an agent in your codebase, if you optimize the codebase for it, then things can move faster.
Thorsten: For many years, there's been this "Broken Window Theory"—don't let the dev tooling get too broken, fix it as soon as possible. Dev tooling is so important; the best companies have the best dev tooling. And what they mean is that for a human, it's the best tooling. But now, we’re like: "This is not ideal for humans, but it doesn't matter that much because the agent is calling this command a lot of times." I don't care how readable the log file is if the agent can read it better than I can. You’ve got to be willing to let go in order to get the benefits of an agent.
Quinn: Who are we building for? We’ve always wanted Amp to be on the frontier. Now that it seems like there’s a lot of things that—if you do those things in your codebase—you’re going to be able to do a lot more with an agent, we want to exclusively make Amp for the people willing to put in that investment.
Thorsten: The assistant is dead, long live the factory. We mean that you need to optimize your codebase for these agents that go longer and more autonomously. We didn't really focus on the "assistant" part that much and didn't spell it out. But basically—and you can say the line—I think the time of you one-on-one with an agent in a sidebar, going back and forth, I think that's coming to an end with programming. You looking at the code in a text editor and having the ping-pong thing—it feels like this is not it anymore.
Quinn: When we say "coming to an end," we mean for the 1% of developers that want to be most ahead. For that 1%, they only need to do the last 20% of their work in the editor. And we think we can get that to 10% or 1%. So for the people that use Amp, we will be killing our editor extension. We’re going to be killing it because we think it's no longer the future. We think the sidebar is dead.
Quinn: When will we be killing it? It will self-destruct in about 60 days from now. We don't know exactly when, we’ll put a timer. We don't know exactly what will replace it because we are trying a lot of stuff. But we are confident that no matter what, we need to replace it, and we want to "burn the boats." We really strongly encourage you in the meantime to switch to the Amp CLI. If there are things where you miss it from VS Code, let us know. I know a lot of you will probably ask: "Couldn't we keep it around but not maintain it?" It's just a focus thing for us. We can't do that without taking our eye off the thing that you all think is 100 times more important. Our sole purpose with Amp is to let software builders harness the full power of AI. That means staying on the very edge. We’re a small team. We cannot keep up while at the same time maintaining something that was the state of the art a year ago.
Thorsten: Imagine if the switch from personal computer to the internet to mobile happened in a year. "Oh, personal computers, we’ve got to do personal software distributed on shrink-wrap CDs." Four months later, we have the internet. "Why shrink-wrap software when they can type in a URL?" Six months later, the App Store happens. All of this is happening right now with software. Two condensed examples: Geoffrey Litt was in a hotel gym. He asked Claude: "I’m in a hotel gym, here are photos of the equipment I have available. Make me a personalized workout plan." Claude generated a little HTML thing where you could check off reps and sets. Emergent software UI. Ryan Florence (of Remix) posted a video where he’s in his home gym. He just used OpenAI's voice mode: "Hey ChatGPT, walk me through a workout." And ChatGPT was like: "Okay, first I want you to do one set... let me know when you’re done." He does the set and says, "I’m done." Then it says, "Take a 60-second pause." Everything disappeared; it was just the voice. Is that software? There’s no code, is there? I also set up an agent for me and my wife to manage our shopping list in Todoist. Then I thought: Why do I need Todoist? The system can just write a text file somewhere and use that as memory. All of this stuff is melting. The whole premise of the last 20 years was: find a market, find a customer base, find a niche, and scale it up. Now, the market you’re selling to is changing every three or four or six weeks. The only way to not die is to constantly reinvent yourself.
Quinn: Thorsten walked through why you’ve got to be really humble and just prepare your company for change. For us on Amp, we have 19 people. Every single person uses Amp; every single person codes every single day. Should we take this "magic in a bottle" and make it enterprise-ready, hire a sales team, a marketing team, and buy a Super Bowl ad? That makes sense if a few assumptions hold: if the product will still be competitive in a year without completely changing it. None of those assumptions hold right now, in particular when you’re building a coding agent. One of our competitive advantages is we want to be the most radically on the frontier. We want to be the fastest-moving one. You can only do that if you make a lot of decisions and you don't say "Okay, we’ll leave the VS Code extension in because a lot of people like it." If you start to allow things to slip, everything falls apart. You have to be paranoid all the time.
Quinn: Let’s talk about Mario Zechner’s Pi assistant and Peter Steinberger’s OpenClaw. How did they make them? Mario Zechner is awesome; he made Pi, which is a general agent. Peter Steinberger made OpenClaw. They are both people who have really strong opinions and a great degree of independence. They’re not building at some VC-backed company with a big sales team and quarterly targets. Those are the kinds of incentives that start to dull the product edge.
Thorsten: Every two to six months, you get the rug pulled out from under you with something new. Are you going to sit down on the rug and put up your Lego pieces? Or do you want to keep the Lego pieces in your pockets knowing that the rug is going to be pulled?
Thorsten: Cursor built an amazing business, a historically incredible business, really smart people. But now people are saying: "Guys, VS Code is holding you back." This is a year after I said I think the editor's days are numbered. Now the fastest-growing startup in history is being told "you're being held back." And now you have Pi and OpenClaw with just Telegram interfaces and agents running autonomously somewhere else. It shows you cannot rest on your laurels. Cursor made Copilot look old. Copilot once was the king of the world. Then Cursor made them look old. And half a year later, Claude Code, Amp, and agents made Cursor look old. And now people are saying, "Yeah, but the editor, is that still a thing?" So now it all looks old.
Thorsten: I think the time of manual context management is also coming to an end. For the majority of last year, I’ve been saying you need to be good at managing the context window: what you put in, when you create a new thread. That gave you a lot of good results. But now with GPT-5.2-Codex, you get really good results even if you don't do this. The abstraction of auto-compaction, making it seem like you don't need to care about the context, is actually viable now. I said for a long time that auto-compaction does not work because signal gets lost. But GPT-5.2-Codex, with its inclination to always research, is less trigger-happy. If you have a long-running thread and auto-compaction kicks in, it goes, "Let me research," and builds the context back up before it does something.
Quinn: We’re going to figure out how we can change the concepts of a thread—is there something "above" a thread? Is it tied to a certain repo, branch, and commit? Because we’ve said "no" to so many other things, we have more freedom to explore that. Thank you to Amp users for coming along on this ride with us. It’s bumpy, it’s exciting, it’s not easy.
Thorsten: We’re a successful business making money, numbers are really good, and yet there’s an aspect to Amp that makes it more like an art installation than a software company. I said this to you two days ago: "What if Amp itself self-destructs?" We joked about the construction and destruction of Amp. But the funny thing is, our customers appreciate this. They come to us and say, "Guys, what's the next thing? How can we help?" When we ripped out four features in January—To-Dos, Forking, Amp Tab, and Custom Commands—people actually really liked it. The customers that pay us money said, "Yeah, I stopped using this," or "That makes sense, I will now stop using this." We need to be doing different things. We all believe deeply that all the usage of Amp today, all the revenue we're doing, we have to totally re-earn that every three months. We are the kind of company—because of this unique history where we spun off from Sourcegraph—we all know what we signed up for. Some people might say: "Shouldn't you take a step back and wait until you can see the big picture and then build it?" I would argue that you might be too far away from where the dice will fall. Shipping is research. We ship to figure out what works and doesn't work. We're not going to sit there and say, "Well, we can't ship this feature because we’re not 100% sure that it'll survive the next three months." You’ve got to put it out there.
Thorsten: All right, we’re at time. That was a lot of change. Any last words?
Quinn: Happy coding.
Thorsten: Happy hacking. We’ll talk to you. Bye-bye.
Quinn: Bye.