Introduction
Large Language Models (LLMs) have transformed the way we interact with machines. Yet, while these models are powerful, they are also limited by two constraints: instructions and context. Instructions tell the model what to do, but context provides the knowledge needed to do it. Without relevant context, models are prone to mistakes and hallucinations. This is where two critical patterns come into play: Retrieval-Augmented Generation (RAG) and Agents.
RAG enhances models by retrieving relevant external knowledge, while Agents empower models to interact with tools and environments to accomplish more complex tasks. Together, these paradigms represent the next frontier of AI applications.
In this blog post, we will take a deep dive into both approaches—how they work, their architectures, the algorithms involved, optimization strategies, and their transformative potential.
Part 1: Retrieval-Augmented Generation (RAG)
What is RAG?
Retrieval-Augmented Generation is a technique that enriches model outputs by retrieving the most relevant information from external data sources—be it a document database, conversation history, or the web. Rather than relying solely on the model’s training data or its limited context window, RAG dynamically builds query-specific context.
For example, if asked “Can Acme’s fancy-printer-A300 print 100 pages per second?”, a generic LLM might hallucinate. But with RAG, the model first retrieves the printer’s specification sheet and then generates an informed answer.
This retrieval-before-generation workflow ensures:
Reduced hallucinations
More detailed responses
Efficient use of context length
RAG Architecture
A RAG system typically consists of two components:
Retriever – Finds relevant information from external memory sources.
Generator – Produces an output using the retrieved information.
In practice:
Documents are pre-processed (often split into smaller chunks).
A retrieval algorithm finds the most relevant chunks.
These chunks are concatenated with the user’s query to form the final prompt.
The generator (usually an LLM) produces the answer.
This modularity allows developers to swap retrievers, use different vector databases, or fine-tune embeddings to improve performance.
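The four-step workflow above can be sketched end to end. This is a minimal illustration, not a production pipeline: the retriever here is a naive keyword-overlap scorer standing in for BM25 or embeddings, and the final prompt would be handed to an LLM for generation.

```python
# Sketch of retrieve-then-generate: chunk documents, rank chunks against
# the query, and assemble the final prompt. Scoring is naive keyword overlap.

def chunk(text, size=50):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query, chunks, k=2):
    """Rank chunks by how many query words they contain."""
    q = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, retrieved):
    """Concatenate retrieved chunks with the user's query."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "The fancy-printer-A300 prints up to 40 pages per minute in draft mode.",
    "Acme was founded in 1985 and is headquartered in Springfield.",
]
chunks = [c for d in docs for c in chunk(d)]
prompt = build_prompt("How fast does the fancy-printer-A300 print?",
                      retrieve("fancy-printer-A300 print speed", chunks, k=1))
print(prompt)
```

In a real system, `build_prompt`'s output would be passed to the generator; everything up to that point is the retriever's job.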
Retrieval Algorithms
Retrieval is nearly a century-old idea; its roots go back to early information retrieval systems of the 1920s. Modern RAG employs two main categories:
1. Term-Based Retrieval (Lexical Retrieval)
Uses keywords to match documents with queries.
Classic algorithms: TF-IDF and BM25 (search engines such as Elasticsearch build on these via Apache Lucene).
Advantages: fast, cheap, effective out-of-the-box.
Limitations: doesn’t capture semantic meaning. For instance, a query for “transformer architecture” might return documents about electrical transformers instead of neural networks.
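To make BM25 concrete, here is a compact self-contained scoring function over pre-tokenized documents. The `k1` and `b` defaults are the commonly used values; production systems would rely on an engine like Elasticsearch rather than hand-rolling this.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency of each query term
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "transformer architecture for neural networks".split(),
    "electrical transformer voltage conversion".split(),
    "attention is the core of the transformer architecture".split(),
]
print(bm25_scores(["transformer", "architecture"], docs))
```

Note how BM25 rewards documents matching both query terms while damping raw term frequency, but it still cannot tell a neural transformer from an electrical one.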
2. Embedding-Based Retrieval (Semantic Retrieval)
Represents documents and queries as dense vectors (embeddings).
Relevance is measured by similarity (e.g., cosine similarity).
Typically served from vector databases (e.g., Pinecone, Milvus) or ANN libraries (e.g., FAISS).
Advantages: captures meaning, handles natural queries.
Limitations: slower, costlier, requires embedding generation.
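The similarity computation itself is simple. Below is cosine similarity over toy 3-dimensional vectors; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and the document names and values here are purely illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product normalized by vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-d "embeddings"; an embedding model would produce these.
query_vec = [0.9, 0.1, 0.0]
doc_vecs = {
    "neural transformers": [0.8, 0.2, 0.1],
    "electrical transformers": [0.1, 0.9, 0.2],
}
best = max(doc_vecs, key=lambda name: cosine(query_vec, doc_vecs[name]))
print(best)
```

Because relevance is measured in the embedding space rather than by shared keywords, the semantically closer document wins even when surface terms overlap.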
Hybrid Retrieval
Most production systems combine both approaches. For instance:
Step 1: Use BM25 to fetch candidate documents.
Step 2: Use embeddings to rerank and refine results.
This ensures both speed and semantic precision.
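The two-stage pattern can be sketched as follows. For self-containment, stage 1 uses simple keyword overlap as a stand-in for BM25, and stage 2 reranks with cosine similarity over hypothetical precomputed embeddings.

```python
import math

def keyword_filter(query, docs, k=10):
    """Stage 1: cheap lexical candidate fetch (stand-in for BM25)."""
    q = set(query.lower().split())
    scored = [(len(q & set(d.lower().split())), d) for d in docs]
    return [d for s, d in sorted(scored, reverse=True)[:k] if s > 0]

def rerank(candidates, embed, query_vec):
    """Stage 2: semantic rerank of the candidates by cosine similarity."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))
    return sorted(candidates, key=lambda d: cos(embed[d], query_vec),
                  reverse=True)

docs = ["transformer architecture overview",
        "electrical transformer safety",
        "cooking recipes"]
# Hypothetical 2-d embeddings keyed by document text.
embed = {"transformer architecture overview": [0.9, 0.1],
         "electrical transformer safety": [0.2, 0.9],
         "cooking recipes": [0.0, 1.0]}
cands = keyword_filter("transformer architecture", docs)
result = rerank(cands, embed, [1.0, 0.0])
print(result[0])
```

The cheap lexical stage prunes the corpus so the costlier embedding comparison only touches a handful of candidates.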
Vector Search Techniques
Efficient vector search is key for large-scale RAG. Popular algorithms include:
HNSW (Hierarchical Navigable Small World Graphs) – graph-based nearest neighbor search.
Product Quantization (PQ) – compresses vectors for faster similarity comparisons.
IVF (Inverted File Index) – clusters vectors for scalable retrieval.
Annoy, FAISS, ScaNN – popular libraries for approximate nearest neighbor (ANN) search.
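The IVF idea in particular is easy to illustrate: vectors are assigned to clusters up front, and a query probes only the nearest cluster(s) instead of scanning everything. This toy version assumes the centroids are given; libraries like FAISS learn them and add compression (PQ) on top.

```python
import math

def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def build_ivf(vectors, centroids):
    """Assign each vector to its nearest centroid ('inverted lists')."""
    lists = {i: [] for i in range(len(centroids))}
    for idx, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda i: dist(v, centroids[i]))
        lists[nearest].append(idx)
    return lists

def ivf_search(query, vectors, centroids, lists, nprobe=1):
    """Scan only the nprobe closest clusters, not the whole collection."""
    order = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    candidates = [idx for i in order[:nprobe] for idx in lists[i]]
    return min(candidates, key=lambda idx: dist(query, vectors[idx]))

vectors = [[0.1, 0.2], [0.3, 0.1], [9.8, 10.1], [10.2, 9.9]]
centroids = [[0.0, 0.0], [10.0, 10.0]]
lists = build_ivf(vectors, centroids)
print(ivf_search([9.0, 9.0], vectors, centroids, lists))
```

The trade-off is classic approximate search: raising `nprobe` improves recall at the cost of scanning more candidates.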
Evaluating Retrieval Quality
Metrics for evaluating retrievers include:
Context Precision: % of retrieved documents that are relevant.
Context Recall: % of relevant documents that were retrieved.
Ranking Metrics: NDCG, MAP, MRR.
Ultimately, the retriever’s success should be measured by the quality of final generated answers.
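Context precision and recall reduce to simple set arithmetic over retrieved versus relevant document IDs, as this small sketch (with made-up document IDs) shows:

```python
def context_precision(retrieved, relevant):
    """Fraction of retrieved documents that are actually relevant."""
    if not retrieved:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(retrieved)

def context_recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(relevant)

retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d7"]
print(context_precision(retrieved, relevant))  # 2 of 4 retrieved are relevant
print(context_recall(retrieved, relevant))     # 2 of 3 relevant were retrieved
```

A retriever can score well on one and poorly on the other, which is why ranking-aware metrics like NDCG, MAP, and MRR are used alongside them.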
Optimizing Retrieval
Several strategies enhance retrieval effectiveness:
Chunking Strategy – Decide how to split documents (by tokens, sentences, paragraphs, or recursively).
Reranking – Reorder retrieved documents based on relevance or freshness.
Query Rewriting – Reformulate user queries for clarity.
Contextual Retrieval – Augment chunks with metadata, titles, or summaries.
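Chunking is the strategy most often under-specified, so here is a minimal word-based chunker with overlap (a common default, so content straddling a boundary appears in both neighboring chunks). Real systems usually count model tokens rather than words.

```python
def chunk_by_tokens(text, chunk_size=100, overlap=20):
    """Split text into word chunks of chunk_size, overlapping by `overlap`
    words so boundary-straddling content lands in two adjacent chunks."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

# 250 synthetic "words" -> chunks of 100 with a 20-word overlap.
text = " ".join(f"w{i}" for i in range(250))
chunks = chunk_by_tokens(text, chunk_size=100, overlap=20)
print(len(chunks))
```

Larger chunks preserve more context per retrieval hit but dilute relevance scoring; smaller chunks do the opposite, so the size is worth tuning per corpus.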
Beyond Text: Multimodal and Tabular RAG
Multimodal RAG: Retrieves both text and images (using models like CLIP).
Tabular RAG: Converts natural-language questions into SQL (Text-to-SQL) for querying structured databases.
These extensions broaden RAG’s applicability to enterprise analytics, ecommerce, and multimodal assistants.
Part 2: Agents
What Are Agents?
In AI, an agent is anything that perceives its environment and acts upon it. Unlike RAG, which focuses on constructing better context, agents leverage tools and planning to interact with the world.
Examples of agents include:
A coding assistant that navigates a repo, edits files, and runs tests.
A customer-support bot that reads emails, queries databases, and sends responses.
A travel planner that books flights, reserves hotels, and creates itineraries.
Components of an Agent
An agent consists of:
Environment – The world it operates in (e.g., web, codebase, financial system).
Actions/Tools – Functions it can perform (search, query, write).
Planner – The reasoning engine (LLM) that decides which actions to take.
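These three components fit together in a simple loop: the planner picks an action, a tool executes it against the environment, and the observation feeds back into the next planning step. In the sketch below the planner is a hard-coded stub standing in for an LLM, and the tool names are illustrative.

```python
# Minimal agent loop: plan -> act -> observe, repeated until "finish".

def run_agent(planner, tools, goal, max_steps=5):
    """Drive the loop; `planner` returns (action_name, argument)."""
    history = []
    for _ in range(max_steps):
        action, arg = planner(goal, history)
        if action == "finish":
            return arg
        observation = tools[action](arg)   # act on the environment
        history.append((action, arg, observation))
    return None  # step budget exhausted

# Stub planner: look up a fact, then finish with the observed answer.
def stub_planner(goal, history):
    if not history:
        return ("lookup", "printer_speed")
    return ("finish", history[-1][2])

tools = {"lookup": lambda key: {"printer_speed": "40 ppm"}[key]}
print(run_agent(stub_planner, tools, "How fast is the printer?"))
```

The `max_steps` budget matters in practice: it bounds cost and keeps a confused planner from looping forever.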
Tools: Extending Agent Capabilities
Tools are the bridge between AI reasoning and real-world actions. They fall into three categories:
Knowledge Augmentation: e.g., retrievers, SQL executors, web browsers.
Capability Extension: e.g., calculators, code interpreters, translators.
Write Actions: e.g., sending emails, executing transactions, updating databases.
The choice of tools defines what an agent can achieve.
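A common implementation pattern is a tool registry: each tool is registered with a name and description (which would be surfaced to the model), and the model's chosen call is dispatched by name. The tools below are toy stand-ins for the categories above.

```python
# Sketch of a tool registry with dispatch by name.

TOOLS = {}

def tool(name, description):
    """Decorator that registers a function as a callable tool."""
    def decorator(fn):
        TOOLS[name] = {"fn": fn, "description": description}
        return fn
    return decorator

@tool("calculator", "Evaluate a basic arithmetic expression.")
def calculator(expression):
    # eval on untrusted input is unsafe; restrict the alphabet for this sketch
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported characters")
    return eval(expression)

@tool("search", "Look up a fact in a tiny in-memory knowledge base.")
def search(query):
    kb = {"capital of france": "Paris"}
    return kb.get(query.lower(), "not found")

def dispatch(name, argument):
    """Execute the tool the planner selected."""
    return TOOLS[name]["fn"](argument)

print(dispatch("calculator", "2 * (3 + 4)"))
print(dispatch("search", "Capital of France"))
```

The descriptions are not decoration: they are what the planner reads when deciding which tool fits the current step.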
Planning: The Agent’s Brain
Complex tasks require planning—breaking goals into manageable steps. This involves:
Plan Generation – Decomposing tasks into steps.
Plan Validation – Ensuring steps are feasible.
Execution – Performing steps using tools.
Reflection – Evaluating results, correcting errors.
This iterative loop makes agents adaptive and autonomous.
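The generate-validate-execute-reflect cycle can be sketched as below. The plan generator is a fixed stub (an LLM would propose the steps), and the validation and reflection checks are deliberately trivial placeholders.

```python
# Sketch of plan generation, validation, execution, and reflection.

def generate_plan(goal):
    """Stub: an LLM would decompose the goal into steps."""
    return ["fetch_data", "summarize"]

def validate_plan(plan, available_actions):
    """Reject plans containing steps no tool can perform."""
    return all(step in available_actions for step in plan)

def execute(plan, actions):
    """Run each step; each action sees the results so far."""
    results = []
    for step in plan:
        results.append(actions[step](results))
    return results

def reflect(results):
    """Trivial reflection: did every step produce output?"""
    return all(r is not None for r in results)

actions = {
    "fetch_data": lambda prev: [3, 1, 4, 1, 5],
    "summarize": lambda prev: sum(prev[-1]) / len(prev[-1]),
}
plan = generate_plan("summarize the data")
assert validate_plan(plan, actions)
results = execute(plan, actions)
print(results[-1], reflect(results))
```

When reflection fails in a real agent, the failure description is fed back to the plan generator, closing the adaptive loop described above.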
Failures and Risks
With power comes risk. Agents introduce new failure modes:
Compound Errors – Mistakes in multi-step reasoning accumulate.
Overreach – Misusing tools (e.g., sending wrong emails).
Security Risks – Vulnerable to prompt injection or malicious tool manipulation.
Thus, safety mechanisms, human oversight, and constrained tool permissions are critical.
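One concrete safeguard is to gate write actions behind an approval check while letting read-only tools run freely. The wrapper below is an illustrative sketch; the tool functions and approver are hypothetical.

```python
# Sketch of constrained tool permissions: write actions need approval.

def guarded(fn, requires_approval, approve):
    """Wrap a tool so it runs only if the approver allows the call."""
    def wrapper(*args):
        if requires_approval and not approve(fn.__name__, args):
            return "BLOCKED: approval denied"
        return fn(*args)
    return wrapper

def send_email(to, body):          # a write action
    return f"sent to {to}"

def read_docs(query):              # a read-only action
    return f"results for {query}"

deny_all = lambda name, args: False   # stand-in for a human-in-the-loop check
safe_send = guarded(send_email, requires_approval=True, approve=deny_all)
safe_read = guarded(read_docs, requires_approval=False, approve=deny_all)

print(safe_send("user@example.com", "hi"))
print(safe_read("printer specs"))
```

In production the approver might prompt a human, enforce an allowlist, or apply rate limits; the point is that the permission boundary lives outside the model's control.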
Evaluating Agents
Evaluating agents is complex and multi-layered:
Task success rate
Efficiency (steps, latency, cost)
Robustness against adversarial inputs
User trust and satisfaction
Unlike single-shot LLMs, agents need evaluation frameworks that capture their sequential reasoning and tool use.
The Convergence of RAG and Agents
While distinct, RAG and Agents are complementary:
RAG provides better knowledge.
Agents provide better action.
Together, they enable AI systems that are:
Knowledge-rich (RAG reduces hallucinations).
Action-oriented (Agents execute tasks).
Adaptive (feedback-driven planning).
Future enterprise AI systems will likely embed both patterns: RAG for context construction and Agents for execution.
Conclusion
RAG and Agents represent two of the most promising paradigms in applied AI today. RAG helps models overcome context limitations by dynamically retrieving relevant information. Agents extend models into autonomous actors that can reason, plan, and interact with the world.
As models get stronger and contexts expand, some may argue RAG will become obsolete. Yet, the need for efficient, query-specific retrieval will persist. Similarly, while agents bring new challenges—such as security, compound errors, and evaluation hurdles—their potential to automate real-world workflows is too transformative to ignore.
In short, RAG equips models with knowledge, and Agents empower them with action. Together, they pave the way for the next generation of intelligent systems.