LAI #76: Qwen Fine-Tuning, Real-Time RL, Agent-to-Agent Systems, and Verifiable RAG
Startup & research collabs, custom training tips, and tools that make agents actually useful.
Good morning, AI enthusiasts,
This week’s issue is all about bridging research and practice. We’re starting with a guide to fine-tuning Qwen-3 using Unsloth, built for anyone customizing models with speed and memory in mind. From there, we get into real-time reinforcement learning to recalibrate forecasts without retraining, and a solid breakdown of how A2A + MCP + LangChain combine for structured, agent-to-agent communication.
We’re also covering visual grounding in RAG systems for better transparency, and a critical look at whether Phi-4’s reasoning is really any stronger than GPT-4o or o1. Add in community-built learning resources, startup collabs, and the weekly poll on open models — and you’ve got a practical, well-rounded issue.
Let’s get into it.
What’s AI Weekly
Every company uses AI, but often it feels more like hype than something that truly helps you get things done. Today, in What’s AI, I want to talk about how to spot the difference between AI that actually solves real problems and AI that’s just there for the sake of being trendy. We’ll also look at why Notion’s new Notion AI is a great example of a useful AI integration: it solves real problems instead of adding AI for its own sake. Read the complete blog and case study here or watch the video on YouTube.
— Louis-François Bouchard, Towards AI Co-founder & Head of Community
Learn AI Together Community Section!
Featured Community post from the Discord
Stormtrooper4432 created a resource called CodeSparkClubs to help high schoolers start or grow AI and computer science clubs. It offers free, ready-to-launch materials, including guides, lesson plans, and project tutorials, all accessible via a website. This is a great initiative to get students interested in AI now, rather than leaving them to play catch-up later. Check it out here and support a fellow community member. If you have any questions or suggestions on the material, share them in the Discord thread!
AI poll of the week!
The vote is almost evenly split between fully open models and hybrid models. For those who think open models are the way to go: beyond the flexibility to fine-tune, what does having access to the code behind an LLM help with on a day-to-day basis? Share your thoughts in the thread; I am very curious to know!
Collaboration Opportunities
The Learn AI Together Discord community is flooded with collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too — we share cool opportunities every week!
1. Efficientnet_99825 is writing a research paper on recommendation systems, covering existing methods and the improvements made to date, and wants to build accompanying code bases. He is looking for someone experienced to guide or collaborate on the paper. If this sounds like you, reach out to him in the thread!
2. Sweatysteve123 is looking for an AI/ML partner to build a startup. If you are in or around the Bay Area and would like to explore the idea, connect with him in the thread!
Meme of the week!
Meme shared by bin4ry_d3struct0r
TAI Curated section
Article of the week
A2A + MCP + LangChain = Powerful Agent Communication By Gao Dalie (高達烈)
This article outlines the development of a multi-agent chatbot by combining Google’s Agent-to-Agent (A2A) and Model Context Protocol (MCP) with LangChain. It explains MCP for tool access and A2A for inter-agent communication. The process involved creating an OpenAI-powered A2A server for financial analysis and an MCP server with tools for fetching stock data and scraping news. These components were converted to LangChain tools and unified under a meta-agent to handle stock-related queries.
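The article walks through its own server and agent code; as a rough illustration of the pattern only (not the author’s implementation), here is a minimal MCP tool server built with the official mcp Python SDK’s FastMCP helper and the yfinance package. The server name, tool name, and yfinance usage are assumptions for the sketch, and the A2A analysis agent is not shown.

```python
# Illustrative sketch only: a minimal MCP tool server exposing stock data.
# Server and tool names are hypothetical, not taken from the article.
import yfinance as yf
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("finance-tools")  # hypothetical server name

@mcp.tool()
def get_stock_price(ticker: str) -> str:
    """Return the latest closing price for a ticker symbol."""
    history = yf.Ticker(ticker).history(period="1d")
    if history.empty:
        return f"No price data found for {ticker}"
    return f"{ticker} last close: {history['Close'].iloc[-1]:.2f}"

if __name__ == "__main__":
    mcp.run()  # serve over stdio so an MCP client (or an agent) can connect
```

In the article’s setup, tools like this and the A2A financial-analysis agent are wrapped as LangChain tools and routed through a single meta-agent, which decides which component to call for each stock-related query.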
Our must-read articles
1. Qwen-3 Fine Tuning Made Easy: Create Custom AI Models with Python and Unsloth By Krishan Walia
This blog presented a process for fine-tuning the Qwen-3 language model using Python and the Unsloth library, which enables faster training with reduced memory use. It covered data preparation with reasoning and chat datasets, model initialization, and applying LoRA adapters for efficient customization, and also detailed training with SFTTrainer, running inference in different “thinking” modes, and saving the specialized model.
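For orientation, here is a condensed sketch of that recipe, not the article’s exact code: the checkpoint name, dataset file, hyperparameters, and the assumption of a pre-formatted “text” field are all placeholders, and exact argument names vary across Unsloth/trl versions.

```python
# Minimal sketch of the Unsloth + LoRA + SFTTrainer workflow described above.
# Model name, dataset, and hyperparameters are illustrative assumptions.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-8B",   # hypothetical checkpoint name
    max_seq_length=2048,
    load_in_4bit=True,               # 4-bit loading keeps memory use low
)

# Attach LoRA adapters so only a small set of extra weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Assumes a JSONL file whose rows already contain a formatted "text" field
# (e.g., chat-template-rendered reasoning and conversation examples).
dataset = load_dataset("json", data_files="chat_and_reasoning.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        output_dir="qwen3-finetuned",
    ),
)
trainer.train()
model.save_pretrained("qwen3-finetuned")  # saves the LoRA adapters
```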
2. Plug-and-Play Reinforcement Learning for Real-Time Forecast Recalibration By Shenggang Li
The author presented a real-time method for recalibrating ARMA sales models using reinforcement learning. Instead of full retraining, a Proximal Policy Optimization (PPO) agent acts as an auto-tuner, observing forecast errors and market context to apply small corrective nudges to baseline ARMA predictions. A case study showed this PPO-corrected approach reduced forecast errors (RMSE and MAE) compared to the standalone ARMA model, enabling quick adaptation to market changes.
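The core idea is that the corrected forecast is the ARMA baseline plus a small, bounded action chosen by the PPO agent. Below is a conceptual sketch of that loop using gymnasium and stable-baselines3; it is not the author’s code. The environment class, the error-only observation (the article also uses market context), the reward shaping, the nudge bound, and the synthetic data are all assumptions.

```python
# Conceptual sketch (not the author's implementation): a PPO agent that nudges
# an existing baseline forecast instead of retraining the forecaster.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class RecalibrationEnv(gym.Env):
    """Observes recent forecast errors; outputs one small additive correction."""

    def __init__(self, baseline_forecasts, actuals, max_nudge=5.0):
        super().__init__()
        self.baseline = np.asarray(baseline_forecasts, dtype=np.float32)
        self.actuals = np.asarray(actuals, dtype=np.float32)
        self.max_nudge = max_nudge
        # Observation: the last 3 forecast errors; action: one bounded nudge.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 3
        return self._obs(), {}

    def _obs(self):
        errors = self.actuals[self.t - 3:self.t] - self.baseline[self.t - 3:self.t]
        return errors.astype(np.float32)

    def step(self, action):
        corrected = self.baseline[self.t] + float(action[0]) * self.max_nudge
        reward = -abs(self.actuals[self.t] - corrected)  # penalize remaining error
        self.t += 1
        terminated = self.t >= len(self.baseline)
        obs = self._obs() if not terminated else np.zeros(3, dtype=np.float32)
        return obs, reward, terminated, False, {}

# Hypothetical data: a biased baseline standing in for ARMA forecasts.
rng = np.random.default_rng(0)
actuals = 100 + np.cumsum(rng.normal(0, 2, 200))
baseline = actuals + rng.normal(3, 2, 200)

env = RecalibrationEnv(baseline, actuals)
agent = PPO("MlpPolicy", env, verbose=0)
agent.learn(total_timesteps=5_000)  # the trained agent then corrects new forecasts
```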
3. Visual Grounding for Advanced RAG Frameworks By Felix Pappe
This piece detailed a visual grounding approach for Retrieval-Augmented Generation (RAG) systems to enhance answer verifiability. It explained how Docling, Qdrant, and LangChain can highlight the information’s source on the original document page. The process utilized Docling for layout-aware parsing and metadata extraction, Qdrant for vector storage, and LangChain for the RAG pipeline. This method aims to build user trust by making AI-generated answers transparent.
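The article wires this into a full RAG chain; the minimal sketch below shows only the grounding-relevant step, assuming Docling’s item provenance (page number plus bounding box) and an in-memory Qdrant collection via LangChain. The file name, embedding model, collection name, and query are placeholders.

```python
# Rough sketch of the grounding idea (not the article's exact pipeline):
# keep each chunk's page number and bounding box as metadata so the source
# region can be highlighted next to the generated answer.
from docling.document_converter import DocumentConverter
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_qdrant import QdrantVectorStore

result = DocumentConverter().convert("report.pdf")  # layout-aware parsing

docs = []
for item, _level in result.document.iterate_items():
    text = getattr(item, "text", "") or ""
    if not text.strip() or not getattr(item, "prov", None):
        continue
    prov = item.prov[0]  # provenance: page number + bounding box on that page
    docs.append(Document(
        page_content=text,
        metadata={
            "page_no": prov.page_no,
            "bbox": [prov.bbox.l, prov.bbox.t, prov.bbox.r, prov.bbox.b],
        },
    ))

store = QdrantVectorStore.from_documents(
    docs,
    embedding=HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"),
    location=":memory:",            # in-memory Qdrant for the sketch
    collection_name="grounded_rag",
)

# Retrieved chunks carry page/bbox metadata, which a UI can use to draw
# highlights on the original document page alongside the answer.
for hit in store.similarity_search("What were Q3 revenues?", k=3):
    print(hit.metadata["page_no"], hit.metadata["bbox"])
```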
4. Why Phi-4 Reasoning Is Not Much Better Than GPT-4o and o1 — Here is the Result By Gao Dalie (高達烈)
The author evaluated Phi-4 against GPT-4o and o1 using mathematical and logical reasoning tasks. Findings indicated that while Phi-4 demonstrated reasoning ability, its responses were often lengthy and less user-friendly than those of the other models. The piece questions whether Phi-4 currently offers a significant practical advantage over established LLMs for general local use.
If you want to publish with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.