ECO 7377: AI and Economic Research
How AI is reshaping the research pipeline, and what to do about it
Why this note
The capabilities of large language models (LLMs) have been advancing rapidly: ChatGPT was released less than four years ago, and yet AI tools are already woven into many parts of the empirical research workflow. This raises a natural question for PhD students in economics: how should we adjust how we work, what we learn, and which problems we pick?
This note synthesizes two perspectives that I find clarifying — and situates a practical, task-level view from my own use between them:
- Paul Goldsmith-Pinkham’s blog post “Research in the Time of AI” (March 2026), which argues that AI compresses the research pipeline rather than replacing it.
- Isaiah Andrews’ note “Some Thoughts on AI and Research” (April 2026), which lays out a useful framework for thinking and planning under deep uncertainty about where models are heading.
Part II then walks through the main task types — coding, writing, background research, reading, modelling, data — and the heuristics I have found useful for each. Neither author claims to know how this all plays out, and neither do I. The point is to think clearly about it now — because a PhD is itself a long, expensive human-capital investment, and the right investments depend on what the research production function ends up looking like.
Part I: The research pipeline in the time of AI
Following Paul Goldsmith-Pinkham, it helps to start with what economic research actually consists of before asking where AI fits in.
A paper is not a single shot
A paper is not “downloading a dataset, running a regression, and writing it up.” It is an iterative process — fits and starts, groping and guessing — in which the question, the design, the data, and the findings all reshape each other. A useful decomposition of the pipeline is roughly:
1. Ideation — generating, refining, and discarding research questions.
2. Design and feasibility — checking whether a credible empirical strategy is even possible.
3. Data assembly — sourcing, cleaning, validating.
4. Core analysis — implementing estimators, running specifications.
5. Robustness and extensions — stress tests, alternative specifications, heterogeneity.
6. Writing — turning results into a coherent argument.
7. Submission, review, and revision.
8. Dissemination — replication packages, talks, slides, follow-up work.
Each stage typically requires multiple passes, and progress in stage 4 or 5 often forces you back to stage 1.
Two natural anxieties
When AI tools touch every stage above, two anxieties commonly surface:
1. The “slop” problem. Will publication-pressured researchers use AI to short-circuit iteration — going straight from idea to paper, supercharging p-hacking, flooding journals with thin work? It is a real risk; the partial offsets are stronger replication norms, open data and code, and faster verification by other researchers and referees.
2. Career depreciation. Many of us have spent years building human capital in coding, data wrangling, and technical writing. If those execution skills get cheap, the private return on that investment falls, and the anxiety is genuine.
The compression model, not the elimination model
The key reframing is that AI compresses the time between stages of the pipeline rather than removing the stages themselves. You can move from a half-formed idea to a real first cut at data in an afternoon instead of weeks. The right interpretation of that compression is more cycles of iteration on the questions that matter — not “I can now produce three mediocre papers in the time it took to produce one.” Whether researchers actually use the compression that way is a professional and personal choice.
What stays hard
What does not get compressed by current models:
- Identifying questions that are genuinely worth answering.
- Developing taste about what counts as a credible research design.
- Knowing the institutional context well enough to recognize when the data are lying to you.
- Iterating until the argument actually holds together.
A useful image: the LLM is a flashlight that brightly illuminates the path in front of you. It does not choose where to walk. Choosing where to walk is still where most of the value of an empirical paper is created.
Part II: How AI helps, task by task
Part I described the research pipeline at a high level. To make the discussion concrete — and useful as a workflow guide — it helps to zoom in on specific task types and ask, for each, what current AI tools do well, where they fail, and what discipline that requires from us. The six categories below come from my own use; the principles are heuristics, not rules, and both the tools and the right way to use them will keep changing. For more detailed treatments of individual use cases, see Anton Korinek’s Generative AI for Economic Research and Ingar Haaland’s notes.
1. Coding
This is where AI tools have changed daily life the most. Pick an IDE with deep AI integration — VS Code, Cursor, Antigravity, Neovim — so that chat, completion, generation, and error-aware debugging all sit inside the same workflow. Beyond simple completion, coding agents (Cursor, Codex, Claude Code, Antigravity) plan and execute multi-step changes; they are powerful but tend to overbuild, doing more than you asked. A few principles keep this useful rather than hazardous:
- Iterate with short feedback loops. Describe the task, test the code on sample data, report what failed, and re-run. Treat the model as a fast collaborator, not a one-shot oracle.
- Work incrementally. Build scripts step by step rather than asking for a full pipeline; this keeps the logic transparent and errors contained.
- Specify context and outputs clearly. State the language, data format, and expected output so the model writes for your environment, not a generic one.
- Verify before scaling. Run on toy data first; never trust untested code on the full dataset (a minimal sketch follows this list).
- Never lose sight of your code. Empirical work is not a website front page, where mistakes are visible at a glance; errors here hide inside the estimates. Do not commit code you do not understand.
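A minimal sketch of that verify-before-scaling discipline, assuming a pandas workflow; the file name, column names, and sanity checks are placeholders for your own project:

```python
# A minimal sketch of "verify before scaling" (all names are placeholders).
import pandas as pd

def run_pipeline(df: pd.DataFrame) -> pd.DataFrame:
    """Cleaning and estimation steps go here."""
    return df

# 1. Develop against a small sample and inspect the output by hand.
sample = pd.read_csv("full_data.csv", nrows=1_000)
result = run_pipeline(sample)
assert result["wage"].notna().all()         # sanity checks you can state
assert result["age"].between(15, 99).all()  # and verify yourself

# 2. Only once the sample run survives inspection, scale to the full data.
full = pd.read_csv("full_data.csv")
result_full = run_pipeline(full)
```

The point of the explicit asserts is that they force you to write down, before scaling, what a correct output must look like.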
2. Writing
The right division of labor in writing is the inverse of the naive one: you draft the substance in your own words; the model polishes phrasing, tone, and flow. A useful jump-start is to give the model your bullet points and underlying logic, then ask for a first draft you can rewrite.
- Use it for clarity, not creativity. Treat the model as an editor that sharpens language, not as an author that invents content. Asking it to produce a table listing what it changed and why is a good way to keep the edits accountable.
- Feed data-rich sections for context only. Let the model see empirical material so its prose stays consistent with it; keep interpretation and numbers strictly human.
- Give it guardrails. Specify which parts may be edited and which must remain unchanged, to avoid distortion or quiet overstatement (an example prompt follows this list).
- Ask for review and feedback. Also use AI as a critical reader to identify stylistic and structural weaknesses; specialized writing-review services (e.g. refine.ink) go beyond grammar checks but can be expensive.
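As a concrete illustration, a guardrailed editing prompt might look like this (the wording is mine, not from either source):

```text
You are editing the introduction of an economics paper.
- Improve clarity, tone, and flow only.
- Do not change any numbers, citations, or substantive claims.
- Leave the sentences marked [KEEP] unchanged.
- Return a two-column table listing each edit and the reason for it.
```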
3. Background research and idea generation
By mid-2025, agentic chatbots and deep-research tools (ChatGPT Deep Research, Gemini Deep Research, Google Scholar Labs, Consensus, Undermind, …) had transformed background research from a passive “summarize this” exercise into an active investigation — scanning, strategic planning, and workflow integration in one. They are also unusually good at breaking down disciplinary walls and surfacing relevant work outside your usual reading.
Two cautions:
- Deep research is not deep enough. Most of the time, far from it. Domain knowledge is central; do your own deep reading on top of what the agent retrieves.
- The “spark of insight” is still human. Identifying a genuinely novel question, or seeing a connection between disparate phenomena, remains an irreducibly human contribution. AI can help along the way and sometimes seed an idea, but it does not own the framing.
For idea generation specifically, useful patterns are:
- Use it to clarify and challenge. A discussion partner that asks clarifying questions, surfaces logical gaps, and points out counterexamples — not a source of original theory.
- Generate structured options, then choose. Ask for several alternative mechanisms, estimation strategies, or conceptual framings, and use the variants to sharpen your own reasoning.
- Lean on it when stuck. Treat it as a way to break out of local minima — possible explanations, alternative angles, next steps.
- Critiques and counter-arguments. Simulating skeptical peer reviews is one of its highest-value uses; it helps fight your own confirmation bias (an example prompt follows this list).
- Keep ownership of judgment and framing. Let the model expand your option set; rely on your own expertise to decide which options are coherent, feasible, and worth pursuing.
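One pattern for the referee simulation — the wording here is illustrative, not a canonical prompt:

```text
Act as a skeptical referee at a general-interest economics journal.
Below are my abstract and identification strategy: [paste].
List the three strongest objections to identification, the main
external-validity concern, and the robustness checks you would demand
before believing the results.
```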
4. Reading and learning
Reading apps and browsers (Gemini for Chrome, Comet by Perplexity, NotebookLM, …) now ship with AI integration that does much more than synthesize information.
- Survey, do not substitute. Use these tools to map a literature or get oriented, not to replace careful reading. Depth and accuracy still come from sitting with the paper.
- Ask conceptual or comparative questions. “How does X differ from Y?” or “Which assumption is doing the work here?” are far more useful than generic summarize prompts.
- Always verify citations and data. Treat every reference and data source the model returns as a lead to check, never as verified truth.
- Feynman technique and active recall. Roughly: `while (not able to recall and explain it simply) { study; explain it in your own words; identify gaps; review and simplify }`. AI can help along the way; it cannot do the recalling for you.
5. Modelling and mathematical derivation
AI for mathematics is a very active field — see, for example, the crowdsourced math projects led by Terence Tao. Frontier reasoning models (ChatGPT 5.1 Pro, Gemini 3 Pro) and specialized math models (Qwen3-math, DeepSeek-V3.2-math) can already handle PhD-level derivations with the right prompting; agentic systems built on top extend this further. For modelling work in economics specifically:
- Use the model to generate ideas, fill technical gaps, and check logic. Do not expect end-to-end theoretical development to come out clean.
- Keep the math grounded in economics. When using AI to assist with modelling, ensure that generated equations reflect sensible behavioral assumptions, constraints, and equilibrium concepts — not just mathematically valid manipulations.
- Describe the economic environment in words first. Spell out agents, objectives, and timing before asking the model to formalize (a toy example follows this list). The framing is yours; the formalization is where AI helps.
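A toy illustration of that division of labor (the environment is my own example, not from either source): in words, a consumer earns wage w today, saves at rate r, and consumes over two periods with discount factor β. The formalization the model should produce is

```latex
\max_{c_1,\, c_2} \;\; \log c_1 + \beta \log c_2
\qquad \text{s.t.} \qquad c_1 + \frac{c_2}{1+r} = w
```

A correct derivation recovers the Euler equation \(c_2/c_1 = \beta(1+r)\) from the first-order conditions; checking that the generated algebra actually delivers this is exactly the verification step that stays with you.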
6. (Unstructured) data processing
Increasingly, data work in economics involves text and images, not just panels of numbers. Here AI is genuinely transformative.
- Start with exploration. Understand the data well before scaling up; AI tools can help you write the visualization and exploratory code quickly.
- Use AI for extraction, classification, and simulation. It can pull structured data from text and images, score text on dimensions like sentiment or tone, and even simulate human-subject responses to surveys or experiments. When using the outputs in downstream estimation, follow the econometrics-of-AI-output literature (e.g. Ludwig et al. 2025).
- Calibration is mandatory. Randomly sample and hand-check a portion of the AI-extracted data to detect systematic biases or omissions before trusting it.
- Prompt with examples. Show the model both the source text and what a correctly extracted record should look like, so it has a template to anchor on (a sketch follows this list).
- Chunk long texts and add context tags. Split books or archival sources into short, labeled excerpts so the model can process them without losing track of what each piece is.
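Putting the extraction, calibration, and few-shot points together, here is a minimal sketch assuming the OpenAI Python client; the model name, record schema, and example advertisement are illustrative placeholders:

```python
import json
import random
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# Few-shot instructions: show both the source format and the target record.
SYSTEM = """Extract one JSON record from each job advertisement.
Example input:
  "Wanted: experienced weaver, 12s/week, apply at 4 Mill Lane."
Example output:
  {"occupation": "weaver", "wage": "12s/week", "address": "4 Mill Lane"}
Return only the JSON object."""

def extract(ad_text: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; substitute your own model
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": ad_text},
        ],
    )
    return json.loads(response.choices[0].message.content)

# Replace with your chunked, context-tagged source texts.
ads = ["Wanted: experienced weaver, 12s/week, apply at 4 Mill Lane."]
records = [extract(ad) for ad in ads]

# Calibration: hand-check a random sample against the originals before
# trusting the extracted data in any downstream estimation.
audit = random.sample(list(zip(ads, records)), k=min(100, len(records)))
```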
A short coda
There is much we have not covered — local models (Ollama, LM Studio), fine-tuning on private data, APIs, RAG, MCP-based tool use, and full agentic systems — and the toolkit will keep changing. The lesson worth carrying forward is the one that has shaped my own use most: accumulate AI-augmented human capital, not AI-substitute human capital. The upper limit of artificial intelligence (for you) is your own intelligence.
Tools I use
A snapshot of my current setup, with the caveat that it will keep changing:
- IDE. Cursor for almost everything; considering a transition to Vim + Copilot.
- LaTeX. Cursor or VS Code with the LaTeX Workshop and HyperSnips extensions, plus Overleaf for collaboration. Link to a tutorial.
- Coding agents. Codex and Claude Code, often used together.
- Initial drafts. Claude Code.
- Literature reviews. Undermind and ChatGPT Deep Research.
- Writing review. ChatGPT Pro and refine.ink.
- Math derivations. Iterating between ChatGPT Pro and Claude Code.
Part III: Thinking under uncertainty
Paul’s piece is mostly about what to do given current capabilities. Isaiah Andrews’ note complements this by offering a framework for thinking about the trajectory — what investments make sense given that no one knows where capabilities will land.
Three cases
Andrews suggests reasoning about three coarse scenarios:
Case 1. Models become better than humans at essentially all intellectual tasks. In this world your current human-capital investments do not pay off. There is little you can do to insure against this case from inside academia, so conditional on continuing to do research at all, this case should not drive your decisions much.
Case 2. Models get much better at things current models are OK at (coding, drafting, proofs of standard results, literature recall) but remain meaningfully below strong human performance on things they are currently bad at (taste, judgment, problem selection, recognizing what is genuinely new or important). Andrews — and many practitioners — view this as the most plausible near-term world.
Case 3. Capabilities level off only modestly above the current frontier. Even this is transformative: the production possibility frontier for both theoretical and applied research has already shifted, and not learning to take advantage of these tools would itself be costly.
What this implies
In Cases 2 and 3, the implication is sharper than the bumper-sticker “AI threatens economists.” Returns to scarce skills that complement AI go up:
- Skills the models are bad at — taste, judgment, problem selection.
- Access to and fluency with frontier tools, which are often paid and often institutional.
- The ability to verify model output — to catch the wrong-but-plausible answer.
- The ability to direct research, including across collaborators and tools.
In Case 1, none of this matters. In Cases 2 and 3, these are exactly the dimensions a PhD program should be helping you build.
Part IV: High-level advice
Pulling the two together, here is the practical guidance I want to leave you with.
1. Be intentional, not reflexive
Reach for AI tools when they expand what you can iterate on, not as a default substitute for thinking. A short prompt that produces a plausible answer is not the same as understanding. The compression is only valuable if it buys you more thought, not less.
2. Invest in three dimensions of using the tools
Following Andrews, treat tool use itself as a skill with structure:
Experimentation. Try things aggressively. The tools are new and evolving; nobody has “solved” how to use them best. Most PhD students under-invest here. Frontier paid models are often substantially better than free versions — the cost of access is usually worth it if you actually get value from them.
Verification. Model output sounds plausible and is sometimes wrong in ways that require expertise to detect. Auditing output — proofs, code, citations, magnitudes — is a core skill. Notably, this raises the returns to taste and domain knowledge rather than substituting for them.
Division of labor. Tools change not only solo work but how you organize collaboration: which tasks need an RA, how coauthors split work, what projects are now feasible at all. Plan how the tools fit into your workflow, not just how to call them.
3. Invest in what compounds
Pour energy into the things that do not get cheaper:
- Question development and research taste.
- Deep knowledge of an institutional setting and a literature.
- The ability to communicate clearly with applied researchers about what their problems actually are.
- The willingness to iterate until the argument is right, not until it is finished.
A hard but useful exercise: ask honestly whether your past successes leaned on an execution advantage (you out-coded the median student) or on an intellectual one (you saw the question or the design first). The first is being commoditized. The second is not.
Sources
- Paul Goldsmith-Pinkham, “Research in the Time of AI,” March 16, 2026.
- Isaiah Andrews, “Some Thoughts on AI and Research,” April 3, 2026.
Resources
- https://www.youtube.com/playlist?list=PLPKR-Xs1slgTqMU3E1UJszSK4PYRFF-Py
- https://github.com/pedrohcgs/Claude-Mini
- https://github.com/pedrohcgs/claude-code-my-workflow
- https://psantanna.com/claude-code-my-workflow/workflow-guide.html#sec-acknowledgments