





















































Welcome to DataPro 138, where graphs aren’t just visuals, they’re the future of machine learning. Where maps aren’t static, they’re smart, dynamic tools. And where every scroll brings you closer to mastering the bleeding edge of data, AI, and analytics.
🔍 AI Breakthroughs You Need to Know
This month’s top research drops and product releases are setting the stage for next-gen AI development:
📘 Graph Machine Learning, Second Edition – Reinvent Your ML Stack
Forget flat data. The world is connected, and your models should be too. The newly updated Graph Machine Learning dives deep into graph-native thinking with:
Whether you're building models for fraud detection or brain data analysis, this is your leap forward.
🗺️ Learn QGIS, Fifth Edition – Spatial Thinking Starts Here
If QGIS has ever felt like deciphering an alien control panel… this book is your Rosetta Stone. The Fifth Edition of Learn QGIS is built for curious beginners and seasoned pros alike, offering:
It’s not just a manual. It’s a mentor in book form, authored by the legends of the QGIS ecosystem.
💬 What the Data World’s Talking About
From DuckDB pipelines to Claude-powered code boosts, and Jupyter grads leveling up to full-stack devs - this edition is packed with practical takeaways, including:
🧠 Case Studies & Cloud Innovations from the Tech Titans
Google, AWS, and Snowflake just raised the bar on AI-integrated workflows:
Sponsored
🔐 Mobile App Security
Future-proof your app. Discover how your mobile app can evolve automatically, leaving reverse engineers in the dust with every release.
👉 Register Now
🤖 AI Side Hustle
Earn up to $50/hr building your AI skills, no experience needed!
💰 Competitive Pay | ⏰ Flexible Schedule | 🚀 Remote & Beginner-Friendly
👉 Apply Now
TL;DR: Graph ML is getting smarter. Geospatial data is going mainstream. And AI tooling is evolving faster than ever. Whether you’re coding smarter, mapping clearer, or just trying to stay ahead - DataPro 138 is your unfair advantage.
👉 Ready to dive in? Let’s explore the future of data, together.
Cheers,
Merlyn Shelley
Growth Lead, Packt
Build Your Own AI Agents Over The Weekend
Join the live "Building AI Agents Over the Weekend" workshop starting on June 21st and build your own agent in 2 weekends.
In this workshop, the instructors will guide you through building a fully functional autonomous agent and show you exactly how to deploy it in the real world.
🔶 OpenAI Introduces Four Key Updates to Its AI Agent Framework: OpenAI just dropped a major upgrade to its AI agent stack: TypeScript SDK support, real-time voice agents with human-in-the-loop control, full traceability for voice sessions, and smoother speech-to-speech interactions. These updates make agents easier to build, audit, and deploy across web, server, and multimodal voice apps.
🔶 From Exploration Collapse to Predictable Limits: Shanghai AI Lab Proposes Entropy-Based Scaling Laws for Reinforcement Learning in LLMs. Reinforcement learning for reasoning-centric LLMs just got a breakthrough: Researchers tackled the entropy collapse bottleneck by modeling the entropy-performance link and introducing Clip-Cov and KL-Cov, two novel strategies that sustain exploration during RL. Tested on top open-source models, these techniques deliver major performance gains.
🔶 Snowflake Charts New AI Territory: Cortex AISQL & Snowflake Intelligence Poised to Reshape Data Analytics. Snowflake just redefined data-AI synergy: At the Snowflake Summit, they unveiled Cortex AISQL and Snowflake Intelligence, two new tools that embed AI into SQL workflows and enable natural language data queries. These innovations make advanced analytics intuitive for both analysts and business users, signaling a major leap in accessible enterprise AI.
🔶 Mistral AI Introduces Mistral Code: A Customizable AI Coding Assistant for Enterprise Workflows. Mistral AI enters the enterprise dev arena with Mistral Code: Their new coding assistant prioritizes security, on-prem deployment, and tunability to internal codebases. Backed by four specialized models, it supports full-stack workflows (debugging, refactoring, and more) across 80+ languages. With partners like Capgemini on board, it’s built for real-world, regulated environments.
The future of ML is graph-native, and this book puts you ahead of the curve.
Fully updated with PyTorch Geometric, new chapters on LLMs and temporal graphs, and expert-backed case studies, it’s your guide to building smarter, more dynamic models.
👉 Preorder now and stay ahead while others catch up.
🚀 Why it matters:
👨‍🔬 Meet your expert guides:
🔶 Data Science ETL Pipelines with DuckDB: ETL just got easier for data scientists with DuckDB: This open-source, in-process SQL engine streamlines data pipelines, from extracting and transforming raw datasets to loading them into cloud warehouses like MotherDuck. With seamless SQL and Pandas support, you can efficiently prep data for analysis, modeling, and beyond, all from your IDE.
🔶 Unlocking Your Data to AI Platform: Generative AI for Multimodal Analytics: SQL meets multimodal AI in the modern data warehouse: Traditional platforms are evolving, now integrating generative AI to natively analyze text, images, and PDFs alongside structured data. With tools like BigQuery’s AI.GENERATE and ObjectRef, analysts can now ask nuanced, semantic questions using pure SQL, no external ML pipelines or prompt engineering required.
🔶 The Journey from Jupyter to Programmer: A Quick-Start Guide. From notebook to production: why it’s time to graduate from Jupyter. This guide unpacks how transitioning from .ipynb files to modular Python scripts empowers data scientists with structure, scalability, and team collaboration. With tools like Cookiecutter and VS Code, and best practices like if __name__ == '__main__', you’ll be coding like a pro.
🔶 Supercharge your development with Claude Code and Amazon Bedrock prompt caching: Claude Code + Amazon Bedrock prompt caching is now live: Anthropic’s AI coding assistant, Claude Code, now leverages Bedrock’s prompt caching to cut token costs and speed up coding workflows, especially in large, iterative projects. With support for Model Context Protocol, it’s enterprise-ready, secure, and optimized for real-world software development on AWS.
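For readers curious about the if __name__ == '__main__' practice mentioned in the Jupyter item above, here is a minimal sketch of what a notebook might look like once it becomes a module. The function names (load_scores, summarize, main) and the sample numbers are hypothetical, made up for illustration; they are not taken from the linked guide.

```python
def load_scores():
    """Stand-in for a notebook cell that loads data (values are made up)."""
    return [72.0, 85.5, 90.0, 64.5]


def summarize(values):
    """Return (count, mean); safe to import and test from other modules."""
    count = len(values)
    mean = sum(values) / count if count else 0.0
    return count, mean


def main():
    """Script entry point: wire the steps together and report the result."""
    count, mean = summarize(load_scores())
    print(f"n={count} mean={mean:.2f}")


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported,
    # so teammates can reuse summarize() without triggering the whole script.
    main()
```

The point of the guard is that the same file works both as a script and as an importable module, which is what makes the code testable and shareable.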
Every now and then, a tech book shows up that doesn’t just teach a tool, it redefines how you think about the problem. Learn QGIS, Fifth Edition is exactly that kind of book. It’s not a recycled walkthrough. It’s a no-fluff, deeply practical guide to working with geospatial data like a modern pro, even if you’re just getting started. Whether you're wrangling satellite data or just trying to make sense of your city's zoning chaos... this book has your back.
But wait, what even is QGIS?
QGIS blends the power of Excel with the spatial smarts of Google Maps, plus the logic of environmental science, urban planning, and Python. It’s a leading open-source GIS tool used by governments, researchers, and analysts. But learning it solo? Confusing and overwhelming. This guide makes it simple. From install to building a mobile-ready GIS app, this guide takes you from “Where do I start?” to “Look what I built.”
Meet the Dream Team Behind the Book
This book is built for people solving real-world problems, not just collecting certifications. It’s fully updated for QGIS 3.38, QField, open data workflows, and AI tools, so you're learning what actually works from the experts shaping the future of GIS. If your work touches the physical world, spatial thinking leads to better decisions. Learn QGIS, Fifth Edition helps you master it, one hands-on chapter at a time. Now available for pre-order: Click Here to Buy.
🔶 New MCP integrations to Google Cloud Databases: Google’s new MCP Toolbox for Databases streamlines AI-assisted dev: Now GA, Toolbox connects Claude Code, Cursor, and other AI agents directly to databases like BigQuery, AlloyDB, and Cloud SQL. Developers can query, refactor, and generate tests with simple natural language, all within their IDE. Schema changes? Test updates? Just prompt and go.
🔶 Launching our new state-of-the-art Vertex AI Ranking API: Google launches Vertex AI Ranking API to fix noisy search and flaky RAG: With up to 70% of retrieved content often irrelevant, this precision reranker improves answer quality, speeds up AI agents, and cuts costs. It integrates easily with legacy search, RAG, or tools like AlloyDB, LangChain, and Elasticsearch, so you get better results in minutes.
🔶 Introducing Lightning Engine for Apache Spark: Google Cloud unveils Lightning Engine to supercharge Apache Spark: Now in preview, this next-gen engine boosts query performance up to 3.6x with advanced optimizations from scan reduction to columnar shuffle. Built on Velox and Gluten, it integrates seamlessly with Iceberg, Delta Lake, BigQuery, and GCS, delivering faster insights and lower costs without rewriting code.
🔶 AWS Agentic AI Options for migrating VMware based workloads: AWS streamlines VMware migrations with agentic AI: AWS Transform for VMware accelerates rehost planning by 80x, auto-translating networking configs and sizing EC2 workloads. For complex migrations, Amazon Bedrock enables multi-agent orchestration with deep domain expertise, MCP integrations, and traceability. Use both tools to blend speed and sophistication across your cloud migration strategy.
🔶 Building a Modern Dashboard with Python and Gradio: Gradio makes building interactive dashboards refreshingly simple: This guide walks through creating a polished sales performance dashboard using a CSV file and Python, complete with date filters, key metrics, visualizations, and raw data views. With minimal setup, Gradio offers a lightweight, flexible way to turn data into insights without heavy front-end code.
🔶 Decision Trees Natively Handle Categorical Data: Decision trees handle categories just fine, until they don’t: While DTs natively split on categorical features, high cardinality makes training slow. Mean Target Encoding (MTE) elegantly sidesteps this by reducing the number of splits from exponential to linear, without sacrificing accuracy. Empirical tests confirm: MTE delivers the same split, but exponentially faster.
🔶 LLMs + Pandas: How I Use Generative AI to Generate Pandas DataFrame Summaries. Tired of manually analyzing massive datasets? This guide shows how to pair Pandas with local LLMs (via Ollama) to generate polished executive summaries from raw data, no need to leave your machine or break the bank. With a one-time setup, you can transform data insights into clean, readable reports in seconds.
🔶 Data Drift Is Not the Actual Problem: Your Monitoring Strategy Is. Data drift isn’t the real threat, misinterpreting it is: In ML systems, drift is often treated as a red flag, but it's just a signal. Without context, statistical monitoring can trigger false alarms or, worse, blind spots. A robust strategy layers statistical, contextual, and behavioral monitoring to answer what really matters: does the drift affect outcomes?
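To make the Mean Target Encoding idea from the decision trees item above more concrete, here is a rough sketch rather than the article's own code. It assumes a regression target and a squared-error split criterion, and all function names and sample rows are hypothetical. The brute-force search tries every two-way grouping of the categories, while the MTE version orders categories by their mean target and tries only the k - 1 cut points along that order.

```python
from itertools import combinations


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_cost(rows, left_categories):
    """Squared-error cost of sending left_categories to the left child."""
    left = [y for c, y in rows if c in left_categories]
    right = [y for c, y in rows if c not in left_categories]
    return sse(left) + sse(right)


def best_split_brute_force(rows):
    """Try every non-trivial subset of categories: exponential in k."""
    cats = sorted({c for c, _ in rows})
    best = None
    for size in range(1, len(cats)):
        for left in combinations(cats, size):
            cost = split_cost(rows, set(left))
            if best is None or cost < best[0]:
                best = (cost, set(left))
    return best


def best_split_mean_target(rows):
    """Order categories by mean target, then try only k - 1 cut points."""
    totals = {}
    for c, y in rows:
        s, n = totals.get(c, (0.0, 0))
        totals[c] = (s + y, n + 1)
    ordered = sorted(totals, key=lambda c: totals[c][0] / totals[c][1])
    best = None
    for cut in range(1, len(ordered)):
        left = set(ordered[:cut])
        cost = split_cost(rows, left)
        if best is None or cost < best[0]:
            best = (cost, left)
    return best


# Made-up (category, target) rows for illustration only.
rows = [("a", 1.0), ("a", 2.0), ("b", 10.0), ("b", 11.0),
        ("c", 1.5), ("c", 2.5), ("d", 9.0), ("d", 12.0)]
print(best_split_brute_force(rows))
print(best_split_mean_target(rows))
```

On this toy data both searches land on the same grouping of categories, which is the behavior the blurb describes; the difference is how many candidate splits each one has to score.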
See you next time!