From Command Line to Conversation: How Agentic AI Is Changing Bioinformatics

For people working in the biological sciences who don’t have a background in programming, it is often difficult to work in the language of DNA at the genomic level.

The most powerful methods for manipulating sequences, testing hypotheses, and processing data are computational. And using them requires a working knowledge of software that runs outside of graphical interfaces—tools that operate through text-based, command-line environments.

Those of you 50 and older might remember MS-DOS, when getting anything done meant typing commands into a prompt.

That’s still largely how things are done in bioinformatics today.

So, as you can imagine, there’s a learning curve. For many people, especially those trained primarily in experimental biology, this way of working feels foreign at first.

Since becoming a professor, I’ve spent a lot of time teaching this type of analysis to non-experts—often people with little to no programming experience.

It’s one of the most fulfilling parts of teaching. Watching a student go from staring at what looks like unintelligible characters on a screen to actually writing code, running commands, and processing data from the command line—it’s a real transformation.

You can almost see the moment it clicks.

It’s like learning a new tool.
Or maybe more accurately, gaining a new kind of superpower—something that once felt completely out of reach suddenly becomes accessible.

Now, though, we’re at an interesting crossroads.

I think it’s worth pausing to describe how different it feels to work with agentic AI compared to the traditional way of doing things in bioinformatics.


A Biological Question

I’ve been thinking about a possible mechanism for how mRNA processing might take place.

In mRNA processing, there are choices that have to be made—specifically, which exons are included or excluded. This process, known as alternative splicing, allows a single gene to produce multiple different transcripts.

You can think of it like assembling a sequence from a set of building blocks.

Imagine a gene with 10 possible exon “blocks.” One version of the processed mRNA might include:

1, 2, 3, 4, 5, 6, 7, 8, 9, 10

But another version might skip one block:

1, 2, 3, 4, 5, 6, 8, 9, 10

Or choose a different combination:

1, 2, 3, 4, 5, 7, 8, 9, 10

Or even:

1, 2, 3, 5, 6, 7, 8, 9, 10

Each of these choices produces a different mRNA—and potentially a different protein.
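
For readers who think in code, the block picture above can be sketched in a few lines of Python. This is only an illustration of the combinatorics, not a model of real splicing: the function name `skip_one_isoforms` is made up for this post, and it assumes that exactly one internal block is skipped while the first and last blocks are always kept.

```python
def skip_one_isoforms(exons):
    """Return every isoform made by dropping exactly one internal exon.

    Assumption for this sketch: the first and last exons are always kept.
    """
    isoforms = []
    for skipped in exons[1:-1]:
        isoforms.append([e for e in exons if e != skipped])
    return isoforms


exons = list(range(1, 11))  # the 10 hypothetical exon "blocks"
isoforms = skip_one_isoforms(exons)

for isoform in isoforms:
    print(", ".join(str(e) for e in isoform))
```

Even under this very restrictive rule, a 10-block gene gives eight alternative versions on top of the full-length one, and the three examples listed above are among them.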

What I’ve been focusing on is one more block at the very end of the transcript: call it block 11.

Unlike the others, this one isn’t protein-coding. It’s part of the untranslated region (the 3′ UTR).

My hypothesis is that the sequence within this block can influence the choices made upstream—for example, whether exon 7 or exon 8 gets included.

In other words, information at the end of the transcript might be feeding back to affect decisions made earlier.

The mechanism I’m imagining is based on sequence complementarity.

To recap, the blocks are exons, and the sequences between them are introns, which can be very long. The idea is that the 3′ UTR contains regions that can base-pair with sequences located in the introns between the alternative exons.

For this hypothesis to work, the 3′ UTR would need to contain sequences complementary to specific intronic regions surrounding the alternative splice sites. If such pairing occurs, it could physically influence how the splicing machinery makes its decisions.
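
Here is a minimal sketch of what "complementary" means in practice. Two RNA strands pair in opposite orientations, so the test is whether one segment equals the reverse complement of the other. The sequences below are invented for illustration, the function names are my own, and the check only accepts perfect Watson-Crick pairing (A with U, G with C); real RNA duplexes also tolerate G-U wobble pairs and mismatches, which this sketch ignores.

```python
# Watson-Crick partners in the RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the sequence that would pair perfectly with `rna`."""
    return rna.translate(COMPLEMENT)[::-1]


def can_pair(utr_segment, intron_segment):
    """True if the two segments are perfect antiparallel complements."""
    return reverse_complement(utr_segment) == intron_segment


# Made-up example segments, not taken from any real gene.
utr_segment = "GGAUCCAU"
intron_segment = "AUGGAUCC"

print(reverse_complement(utr_segment))
print(can_pair(utr_segment, intron_segment))
```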


The Traditional Barrier

To test this idea, you would need to take the sequence from the 3′ UTR and ask whether it can pair with any of the intervening intronic sequences.

That means scanning across introns—between exon 1 and 2, 2 and 3, 3 and 4, and so on.
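
A rough sketch of that scan, under some loud assumptions: the sequences are written in the DNA alphabet (as they would typically come out of a genome file), only exact reverse-complement matches of a fixed length `k` are reported, and the intron names and sequences below are toy data invented for this example. The function `find_complementary_seeds` is hypothetical, not part of any library.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(DNA_COMPLEMENT)[::-1]


def find_complementary_seeds(utr, introns, k=8):
    """Find k-long intron windows that could pair perfectly with the UTR.

    Returns a list of (intron_name, intron_pos, utr_pos, intron_window).
    """
    # Index every k-mer in the 3' UTR by its start position.
    utr_index = {}
    for i in range(len(utr) - k + 1):
        utr_index.setdefault(utr[i:i + k], []).append(i)

    hits = []
    for name, intron in introns.items():
        for j in range(len(intron) - k + 1):
            window = intron[j:j + k]
            # A window pairs with the UTR if its reverse complement occurs there.
            for i in utr_index.get(reverse_complement(window), []):
                hits.append((name, j, i, window))
    return hits


# Toy data: one intron carries a planted match, the other does not.
utr = "AAAAGATTACAGAAAA"
introns = {
    "intron_6_7": "CCCCCTGTAATCCCCC",
    "intron_7_8": "GGGGGGGGGGGGGGGG",
}

hits = find_complementary_seeds(utr, introns)
for hit in hits:
    print(hit)
```

Exact matches like this are only a first filter. Deciding whether a candidate pairing is plausible still calls for thermodynamic tools that score mismatches and estimate free energy.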

Turning this into a bioinformatics analysis is not trivial.

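It would require tools like BLAST to find stretches of the introns that are reverse-complementary to the 3′ UTR, and RNA structure software such as RNAfold or RNAduplex (both part of the ViennaRNA package) to predict base-pairing interactions and estimate free energies. In principle, this is all doable with freely available software.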

But in practice, it’s complicated. It takes time, setup, and programming comfort.

To be honest, in the context of a busy life, this is the kind of task that would take me a long time to actually carry out myself.

And so, somewhat ironically, this is exactly the kind of analysis I’ve often suggested to students.


Enter Agentic AI

It struck me that this might be exactly the kind of problem that agentic AI could help solve.

Not just tools that suggest code—but systems that can actually interact with files, run scripts, and manipulate data directly.

I hadn’t really tried this before.

But I had access to Codex, running locally on my MacBook Pro. I gave it access to a folder on my computer and started interacting with it through a prompt-based interface.

And then I began to ask:

How could I test this idea?


The Result

To cut to the chase, the results were incredible.

I had a clear vision in my mind—a figure of a gene showing interactions between the 3′ UTR and intronic regions, driven by sequence complementarity.

That vision—essentially a final figure—was the goal.

And instead of building everything step by step, I described that end result.

With very little prompting, I generated a visual—a cartoon of the analysis—that captured exactly what I had in mind.

More importantly, I was able to go back and verify the underlying sequences. The predicted interactions held up: the paired sequences really are complementary.

What would have taken me a month took less than an hour.


A Double-Edged Sword

This kind of power comes with trade-offs.

On one hand, my experience allows me to sanity-check the results. I know when something looks wrong.

But someone without that background might not.

At the same time, the entire process has changed.

I don’t need to open a terminal, remember syntax, or debug commands. I can describe what I want—and the system executes it.

That’s incredibly powerful.

But you do lose something.

Some of the friction that forced understanding disappears. Some of the skill fades.

And yet, maybe that’s not entirely bad.

There are more important things to think about than debugging syntax.


A Massive Shift

I recently read that AI is improving coding efficiency by about 10% in tech.

For me, it’s not 10%.

It’s closer to 300%.

This isn’t incremental improvement. It’s a collapse of a barrier.

And that has implications for the field.

Bioinformatics has long been built around tool creation—papers, grants, and careers based on building new methods.

But if tools can be generated on demand, that model changes.


Teaching at an Impasse

This brings me to teaching.

I start my computational genomics course next week. Last year, I taught it without AI.

Now, I’m not sure what to do.

For now, I’ll probably keep things largely the same.

Because even with AI, students need to understand the language. If they can’t read the output, they can’t evaluate it.

But something deeper is shifting.

We’re moving from a world of skills to a world of ideas.

The ability to think clearly, ask good questions, and reason through problems in real time—that’s becoming the core skill.

Conversation is becoming the interface to computation.


What Should Students Learn?

My advice to students has changed.

Not long ago, I would tell them: learn computation.

Now, I’m not so sure.

The things that seem more durable are the ones that are harder to automate—the physical, hands-on aspects of biology.

Running a Western blot.
Working with mouse tissues.
Doing experiments in the real world.

In a strange way, my advice has almost flipped 180 degrees.


Final Thought

I don’t think we fully appreciate how big this shift is.

It’s not incremental.

It’s a fundamental change in how we work, how we teach, and how people learn.

And we’re right at the beginning of it.