What Prompt Engineers Actually Do All Day
The phrase “prompt engineer” entered the popular lexicon around 2023 and immediately acquired the peculiar status of being mocked and wildly overhyped at the same time. Twitter was full of people scoffing that “writing instructions to an AI” was not a real skill, even as a handful of job postings advertised salaries of $300,000 for the role. Both reactions were wrong in the same way: neither camp actually knew what the job involved.
Six years later, prompt engineering is a real occupation with real practitioners, real skill gradients, and real career paths. It looks almost nothing like what either the mockers or the hype merchants described.
What the Job Is Not
Start with the misconception, because it’s so persistent. Prompt engineering is not, primarily, about finding the magic phrase that makes an AI model do what you want. The image people had in 2023 was something like a wizard muttering incantations — say “pretend you are a senior tax attorney” and suddenly you’d get excellent tax advice. That framing was always silly, and it described something that would obviously be temporary even if it were accurate. Models get better at understanding intent. The crude hacks that worked on GPT-3 were largely irrelevant by GPT-5.
The actual skill set has almost nothing to do with magic words.
What the Job Is
A prompt engineer at a mid-sized insurance company — and I’m describing a real person here, though I’ve changed identifying details — spends most of their day doing three things: designing evaluation frameworks, debugging failure modes, and translating between domain experts and AI systems.
The evaluation work is the most time-consuming and least glamorous. When an AI system is being used to help adjusters assess claims, someone has to define what “good” looks like. This requires building test sets of claims with known correct answers, running model outputs against those test sets, analyzing where and why the model fails, and updating the system design to address those failures. The job is closer to quality engineering than it is to anything that would fit the “wizard talking to AI” caricature.
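To make that loop concrete, here is a minimal Python sketch of a test set being run against a model. Everything in it is hypothetical and chosen for brevity: the `ClaimCase` fields, the `toy_model` stand-in for a real model call, the sample claims, and the use of exact-match scoring are all assumptions, not a description of any particular insurer's system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ClaimCase:
    claim_text: str  # input shown to the model
    expected: str    # known correct decision, e.g. "approve" or "deny"


def evaluate(model: Callable[[str], str], cases: list[ClaimCase]) -> tuple[float, list[ClaimCase]]:
    """Run the model on every case; return accuracy and the cases it got wrong."""
    if not cases:
        raise ValueError("test set is empty")
    failures = [c for c in cases if model(c.claim_text).strip().lower() != c.expected]
    accuracy = 1 - len(failures) / len(cases)
    return accuracy, failures


def toy_model(claim_text: str) -> str:
    # Hypothetical stand-in for a real model call: denies anything mentioning "flood".
    return "deny" if "flood" in claim_text.lower() else "approve"


# Hypothetical test set with known correct answers.
cases = [
    ClaimCase("Hail damage to roof", "approve"),
    ClaimCase("Flood damage, no flood rider", "deny"),
    ClaimCase("Flood damage, flood rider active", "approve"),
    ClaimCase("Kitchen fire, policy lapsed", "deny"),
]

accuracy, failures = evaluate(toy_model, cases)
print(f"accuracy: {accuracy:.2f}")
for case in failures:
    print("failed:", case.claim_text)
```

The list of failed cases matters more than the single accuracy number, because it is the raw material for the failure analysis and the next round of system changes.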
The debugging work is analytically demanding in a specific way. AI systems fail in non-obvious patterns. They might handle one class of claim with 97% accuracy and a superficially similar class with 71% accuracy, and figuring out why requires both technical understanding of how the model works and domain understanding of what distinguishes those claim types. The prompt engineer is the person who sits at the intersection of those two knowledge sets.
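One common first step in this kind of debugging is to break accuracy down by claim class instead of looking at a single overall score. The sketch below assumes graded results are already available as (class, correct?) pairs; the class names and the data are invented for illustration.

```python
from collections import defaultdict


def accuracy_by_class(results: list[tuple[str, bool]]) -> dict[str, float]:
    """Map each claim class to the fraction of its cases the model got right."""
    totals: dict[str, int] = defaultdict(int)
    correct: dict[str, int] = defaultdict(int)
    for claim_class, is_correct in results:
        totals[claim_class] += 1
        correct[claim_class] += int(is_correct)
    return {cls: correct[cls] / totals[cls] for cls in totals}


# Hypothetical graded results: (claim class, did the model match the known answer?)
results = [
    ("water_burst_pipe", True),
    ("water_burst_pipe", True),
    ("water_burst_pipe", True),
    ("water_burst_pipe", True),
    ("water_gradual_leak", True),
    ("water_gradual_leak", False),
    ("water_gradual_leak", False),
    ("water_gradual_leak", True),
]

by_class = accuracy_by_class(results)
# Print the weakest classes first, since those are where the analysis starts.
for cls, acc in sorted(by_class.items(), key=lambda item: item[1]):
    print(f"{cls}: {acc:.0%}")
```

In this toy data the overall accuracy is 75%, which hides the fact that one class is handled perfectly and a superficially similar one is right only half the time. The breakdown shows where to look; explaining the gap still takes domain knowledge.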
The translation work is perhaps the most underrated component. Domain experts — claims adjusters, underwriters, actuaries — know what they need but often can’t articulate it in terms that produce reliable AI behavior. The prompt engineer interviews these experts, builds taxonomies of the use cases, develops test cases that capture the edge conditions the expert cares about, and iterates between the AI system and the domain expert until the behavior is acceptable. This is essentially a specialized form of requirements engineering, applied to a non-deterministic system.
The Skill Stack
The people who do this job well have an unusual combination of abilities. Strong analytical reasoning, because the failure analysis work requires holding a lot of conditional logic in your head. Domain curiosity, because you can’t translate between experts and systems without caring about what the experts actually know. A particular kind of patience with ambiguity, because AI systems are probabilistic and the work of improving them involves improving distributions rather than fixing discrete bugs. And writing ability, not because prompt engineering is about writing clever instructions, but because so much of the job is documentation — creating the evaluation frameworks, the system specifications, the runbooks that allow other people to maintain systems you built.
The technical floor is lower than most people expected. You don’t need to be able to train models, or to understand the mathematical details of transformer architectures, to do this job well. You need to understand how these systems fail, what kinds of inputs produce what kinds of outputs, and how to structure information for a language model — but these things can be learned without a computer science degree. The people who came into prompt engineering from linguistics, technical writing, UX research, and certain corners of business analysis have often been as effective as those who came from software engineering backgrounds.
The Career Path That Emerged
There’s now a reasonably well-defined hierarchy. Junior prompt engineers are doing a lot of test-case creation and documentation work — the entry-level version of evaluation engineering. Mid-level practitioners are owning systems end-to-end: designing the evaluation framework, building the prompt architecture, doing the failure analysis, writing the documentation. Senior practitioners are working more on organizational process — how do you build the cross-functional workflows that allow AI systems to be maintained and improved over time? How do you build the measurement infrastructure that lets you know whether a system is getting better or worse after a model update?
There’s a separate track that went in a more technical direction: people who crossed over into what’s now called “AI systems engineering,” which involves more work with embedding systems, retrieval architectures, and fine-tuning pipelines. That track pays more and requires more technical depth.
The Geographic Concentration
The job is concentrated in a way that creates real policy headaches. Most prompt engineering roles are in major metro areas and at companies that are digitally sophisticated enough to be running AI systems at scale. The geographic and firm-size distributions of this occupation are heavily skewed toward the already-advantaged. A mid-sized manufacturer in a smaller city is unlikely to have a formal prompt engineering function, even if they’re using AI tools. They’re more likely to have someone doing the work informally, as part of a broader IT or operations role.
This concentration means the income from these roles isn’t reaching the workers and communities that most need new economic opportunities. The people who are becoming prompt engineers were already in relatively good economic positions. This isn’t unique to prompt engineering — it’s a pattern that shows up across most of the new AI-adjacent occupations — but it’s worth stating plainly because the policy rhetoric around “AI creates new jobs to replace the ones it eliminates” often implies a more even distribution than actually occurs.
The Lifespan Question
The obvious challenge hanging over this occupation is whether it will be automated away. Every year, AI systems become somewhat more capable of specifying, evaluating, and improving other AI systems. If the AI can eventually do its own evaluation engineering, what happens to the humans doing that work?
The honest answer is that this depends on how good AI systems get at the translation work — the part that requires understanding what domain experts actually care about, which often involves understanding organizational context, political dynamics within companies, the unspoken assumptions that shape what “good” means. The technical parts of prompt engineering are probably more automatable than the translation and organizational parts. My best guess is that the job will continue to exist but will shift further toward the organizational and translation work, and the technical parts will be handled by increasingly capable AI tools. Which is, incidentally, what happened to software development — and software developers are still employed in large numbers, just doing different work than they were doing twenty years ago.
The mockery of prompt engineering in 2023 was based on a failure to understand what the job would actually entail. The hype was based on an overestimate of how exotic and valuable the most primitive version of the skill was. What emerged in between was a genuine occupation, less glamorous than the hype suggested and more substantive than the mockers allowed for, following the same basic patterns as every other technical specialty that has consolidated around a new category of tools.