He Built Google for Newsletters—Here's How the AI Actually Works

The newsletter tools I wish existed.

Jun 08, 2025

How many times have you landed on a brilliant newsletter, only to wonder: "What else have they written that I'm missing?"

I face this constantly. Scrolling through endless archives hoping to find relevant content? There has to be a better way.

A few weeks ago, my friend Ryan Ong, Ph.D. 🎮 solved this exact problem. He built AskSubstack and Substack Search, two tools that let you ask direct questions and get intelligent answers pulled from any newsletter's entire archive.

The AI methodology behind them is fascinating: combining semantic search with RAG (Retrieval-Augmented Generation) to make newsletter content truly discoverable. Instead of hoping readers stumble across your best work, they can now find exactly what they need.

Try my newsletter chatbot here for FREE (limited to three chats; DM me for unlimited access). The real value, though, is understanding Ryan's approach and what you can learn from it.

If you're into self-improvement mixed with AI, check out Ryan's newsletter—his practical AI insights consistently impress me.

Now, Ryan will walk you through exactly how he built these tools and the methodology that makes them work...

Hello!

I'm Ryan. I am a long-time Substack writer (since 2020) with more than seven years of experience in AI. I recently built AskSubstack and Substack Search because I love Substack and wanted to make it easier to search, ask, and discover more from the writers we follow (and the ones we haven’t found yet).

One thing I kept running into: when I land on a new publication, I want to ask it something. Like, "What have they said about focus?" or "Did they ever write about burnout?" That kind of deep access just wasn't possible… at least, that is, until now.

In this post, I try my best to give you the maximum value per word (a concept I love from James Clear's newsletter). So I'll walk you through:

Why I built these tools
The two tools I built for Substack readers and writers
And the powerful AI method behind it all → what actually happens when you ask a question

Let’s dive in.

Why I built these tools

As a writer, I have spent years building up an archive (over 130 Substack articles), but most readers only ever see the latest post.

As a reader, I know many great newsletters have deep archives full of insights I will probably never find, even when they contain exactly what I'm looking for.

I built AskSubstack and Substack Search to fix that: to make it easy to ask questions, find older gems, and make newsletter content way more discoverable.

Discovery is key.

You can now find writers through your questions even if they don’t have huge followings! These tools help readers surface new voices and help writers get found based on what they’ve written, not how famous they are.

This shift with AI highlights a move away from SEO and algorithms toward something better: Answer Engine Optimization (a term I came across this week and love).

The content that gets surfaced now is the content that actually answers the question. That's what matters.

1. AskSubstack: Chat with any newsletter

AskSubstack turns any Substack into a chatbot. Instead of scrolling through the archive, just ask questions and get answers pulled directly from that writer’s archive.

Want to know what someone said about "community" or "starting a newsletter"? Just ask. Writers can add it to their publication, giving readers an easy, interactive way to explore past posts.

Here’s a demo video:

2. Substack Search: Google for newsletters

Substack Search lets you search across multiple newsletters at once. Just type a phrase or question and instantly see matching posts from different Substack writers.

It's perfect for finding the right article fast or discovering new voices you didn't know about.

Here's a demo video:

Under the hood: What actually happens when you ask a question

AskSubstack and Substack Search work differently under the hood, but both follow the same core idea: understand your question, find the most relevant content, and surface the best answers.

Here's how:

1. Chunking and Encoding the Content

Before any questions are asked, every newsletter is broken into smaller, meaningful sections (chunks), usually by paragraph or theme. This allows the system to retrieve just the right parts instead of scanning entire posts. I use semantic chunking, which splits based on meaning, not just structure.
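To make the idea concrete, here is a minimal sketch of meaning-based splitting. It is not the actual AskSubstack code. It assumes a simple rule: start a new chunk whenever two neighboring sentences stop looking alike. The `toy_embed` function is a hypothetical stand-in (plain word counts) for a real embedding model, and the `threshold` value is an arbitrary choice for illustration.

```python
import math
import re
from collections import Counter


def toy_embed(sentence: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(text: str, threshold: float = 0.2) -> list[str]:
    """Group consecutive sentences; open a new chunk when similarity drops."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(toy_embed(prev), toy_embed(cur)) < threshold:
            chunks.append([cur])      # topic shift: start a new chunk
        else:
            chunks[-1].append(cur)    # same idea: extend the current chunk
    return [" ".join(chunk) for chunk in chunks]


post = (
    "Deep focus needs quiet. Quiet helps focus last longer. "
    "Burnout creeps in slowly. Burnout recovery takes rest."
)
for chunk in semantic_chunks(post):
    print(chunk)
```

With word counts the split only reacts to shared vocabulary. Swapping in a learned embedding model would let it react to shared meaning, which is the point of semantic chunking.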

"Semantic chunking splits documents such that each chunk contains a coherent idea or concept. Unlike simple methods that divide text by a fixed number of words or characters, semantic chunking uses the actual meaning and structure of the content to decide where to break it up. This approach avoids splitting sentences or ideas in half, which can make information harder to understand or use."

2. Turning questions into meaning

Your question is then converted into a vector (a bunch of numbers capturing its meaning). This lets us match it to content that expresses similar ideas, even if the exact words are different.

Here's an example:

Ask: "How do I stay focused while writing?"
Match: "Tips for maintaining concentration during long writing sessions."
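A rough sketch of how that matching works: each piece of text becomes a vector, and the closest vectors win. The three-number vectors below are made up by hand for illustration (a real embedding model outputs hundreds of dimensions), and the second chunk about pricing is a hypothetical distractor, not something from the tools themselves.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: closer to 1.0 means more alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(query_vec: list[float], chunk_vecs: dict[str, list[float]]) -> str:
    """Return the chunk text whose vector is most similar to the query vector."""
    return max(chunk_vecs, key=lambda text: cosine_similarity(query_vec, chunk_vecs[text]))


# Hand-made, hypothetical embeddings (assumption: not produced by a real model).
question_vec = [0.9, 0.8, 0.1]  # "How do I stay focused while writing?"
chunk_vecs = {
    "Tips for maintaining concentration during long writing sessions.": [0.8, 0.9, 0.2],
    "How I priced my paid subscription tier.": [0.1, 0.2, 0.9],
}

print(best_match(question_vec, chunk_vecs))
```

The question and the winning chunk share almost no words, yet their vectors point in nearly the same direction. That is what lets semantic search match ideas instead of exact phrasing.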

3. Hybrid search

Sometimes keyword match is important too. As such, we combine semantic search (meaning) with keyword search (exact match). This hybrid approach improves both depth and precision, ideal for nuanced questions and specific phrases.
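The post does not say how the two result lists get merged, so the sketch below uses reciprocal rank fusion, one common way to combine rankings. Treat it as an assumption, not a description of the actual tools. The post IDs are hypothetical, and `k = 60` is a conventional default for this technique.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists; items ranked high in more lists score higher."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic_hits = ["post_a", "post_b", "post_c"]  # ranked by meaning
keyword_hits = ["post_b", "post_d", "post_a"]   # ranked by exact-term match

print(reciprocal_rank_fusion([semantic_hits, keyword_hits]))
```

Posts that appear in both lists (`post_a` and `post_b` here) end up above posts found by only one method, so neither search style has to be perfect on its own.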

4. Reranking the results

Matched chunks are reranked using scoring algorithms that predict which ones best answer your question. The strongest matches rise to the top.
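As an illustration only, reranking can be sketched as "score every candidate against the question, then sort". The scorer below is a deliberately simple, hypothetical one (the share of question words found in the chunk). Production systems often use a trained reranking model for this step, and nothing here is taken from the actual tools.

```python
import re
from typing import Callable


def term_overlap_score(question: str, chunk: str) -> float:
    """Toy relevance score: share of question words that appear in the chunk."""
    q_terms = set(re.findall(r"[a-z']+", question.lower()))
    c_terms = set(re.findall(r"[a-z']+", chunk.lower()))
    return len(q_terms & c_terms) / len(q_terms) if q_terms else 0.0


def rerank(
    question: str,
    chunks: list[str],
    score: Callable[[str, str], float] = term_overlap_score,
    top_k: int = 3,
) -> list[str]:
    """Order candidate chunks by score (best first) and keep the top_k."""
    return sorted(chunks, key=lambda chunk: score(question, chunk), reverse=True)[:top_k]


candidates = [
    "Pricing your newsletter",
    "Ways to avoid burnout as a writer",
    "Burnout is common",
]
print(rerank("how to avoid burnout", candidates, top_k=2))
```

Because `score` is just a function argument, a stronger model can replace the toy scorer without touching the sorting logic.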

This is where the Substack Search workflow ends, surfacing the top posts from the reranked results.

5. Retrieval-Augmented Generation (RAG)

AskSubstack goes one step further. It sends the top chunks to an AI model (like GPT-4o), which reads them and generates a grounded answer.

This is called Retrieval-Augmented Generation (RAG).

What is RAG?

RAG is a method that combines two main steps: retrieving relevant information from an external knowledge base and then using a large language model (LLM) to generate an answer.

Retrieve: When a user asks a question, the system first searches a database (often made of chunked text) for the most relevant pieces of information.
Augment: The retrieved information is added to the user's question, giving the AI more context.
Generate: The AI model (LLM) uses both the original question and the retrieved context to generate a more grounded, up-to-date answer.
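The three steps can be sketched in a few lines. This is a simplified illustration under stated assumptions: `retrieve` and `generate` are hypothetical placeholders passed in as functions (in a real system they would be the search pipeline and a call to an LLM), and the prompt wording is invented for the example.

```python
from typing import Callable


def build_prompt(question: str, chunks: list[str]) -> str:
    """Augment: put the retrieved chunks ahead of the question as numbered context."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer using only the context below. "
        "If the context is not enough, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def answer(
    question: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve the top chunks, augment the question with them, then generate."""
    chunks = retrieve(question)[:top_k]        # Retrieve
    prompt = build_prompt(question, chunks)    # Augment
    return generate(prompt)                    # Generate


# Stand-ins so the sketch runs without a search index or a model API.
def fake_retrieve(question: str) -> list[str]:
    return ["Focus improves with short sessions.", "Burnout needs rest."]


def echo_model(prompt: str) -> str:
    return prompt  # a real LLM would return an answer grounded in the prompt


print(answer("How do I stay focused?", fake_retrieve, echo_model))
```

Running it prints the exact prompt a model would receive, which makes it easy to see that the answer can only draw on what retrieval supplied.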

This is a powerful method because standard language models can only answer questions based on what they learned during training, which may be outdated or incomplete. RAG lets the model access the latest or domain-specific information without retraining the whole model, by pulling in relevant, up-to-date chunks from external sources.

Here's a great diagram from Anthropic (AI company behind Claude) that shows the whole flow I talked about, plus some extra steps like adding context to each chunk so nothing gets lost. I won't get into the details of contextualizing chunks, but I can explain more in the future if anyone's interested.

Image from Anthropic’s Introducing Contextual Retrieval article

Anyway, I built these two tools for anyone on Substack. Think of AskSubstack as chatting with a newsletter's brain, and Substack Search as your personal librarian.

I built them because I love writing and tech. Both tools are still at an early stage, and I am constantly improving them. If you run a newsletter and want to try AskSubstack with your archive, just DM me.

If you read a lot and want to discover new writers, give Substack Search a spin. And please reach out with your thoughts. It helps me make the tools better!

Happy learning,

Ryan

A guest post by

Ryan Ong, Ph.D. 🎮

Obsessive learner. 7+ years in AI. Love building practical AI tools for everyday life and sharing powerful, actionable ideas from the world’s top thinkers, helping you get 1% better each day and live a happier, more fulfilling life.

Karen Smiley

Jun 9

Love having a better search capability for newsletters, @Ryan Ong! And I'm psyched to see my own newsletter listed 1st under "AI ethics" - and 3rd for "ethical AI" (interesting that the lists were close but not quite the same).

You had mentioned that you were working on preventing the underlying AI platform from scraping newsletters with AI training disabled, as a side effect of driving a search tool like this. Curious where that stands?

It would be great to have this kind of search for Notes too 😊 (no 'AI training' concerns there)

David Todd

Jun 11

Excellent, Ryan. I haven’t launched my Substack yet but read and subscribe to many. I’ve tried both Ask & Search. Also, TY for this thorough explanation. You did the community a solid.
