
This Week in AI: Dolphins, Deepfakes & the Dawn of the Robot Olympics



Happy Easter, and welcome to another episode of "AI is evolving faster than your GPU drivers." This week, we saw machines talk to dolphins, robots run marathons, deepfake tech level up again, and OpenAI casually drop its smartest models yet—because it’s 2025 and this shit is in timelapse mode, baby.


Here's the no-fluff rundown of this week’s steps toward the Skynet future. (Just kidding. Probably.)



Wanna Talk to Dolphins?


Remember Flipper? Like Lassie of the sea! Or Mr. Ed? That crazy bastard! Well, Google just introduced DolphinGemma, an AI trained to understand dolphin vocalizations—and even speak back in dolphin.


It’s built on the lightweight Gemma architecture (~400M params) and runs directly on a Google Pixel. It processes clicks, whistles, and squawks in real time using Google’s SoundStream, which tokenizes audio the way ChatGPT tokenizes text.
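To make "tokenizing audio" concrete, here's a toy sketch of the idea: chop the waveform into frames and map each frame to a discrete token ID via nearest-neighbor lookup in a codebook. SoundStream's real tokenizer is a learned residual vector quantizer, not this random codebook—but the token-sequence output is the same shape of thing a Gemma-style language model consumes.

```python
import numpy as np

# Toy vector-quantizer: maps fixed-size audio frames to discrete token IDs.
# The codebook here is random for illustration; SoundStream's is learned.

rng = np.random.default_rng(0)
FRAME = 160          # samples per frame (10 ms at 16 kHz)
CODEBOOK_SIZE = 64   # tiny vocabulary for illustration

codebook = rng.normal(size=(CODEBOOK_SIZE, FRAME))

def tokenize(waveform: np.ndarray) -> list[int]:
    """Chop a 1-D waveform into frames, emit one token ID per frame."""
    n_frames = len(waveform) // FRAME
    frames = waveform[: n_frames * FRAME].reshape(n_frames, FRAME)
    # nearest codebook entry (Euclidean distance) for each frame
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1).tolist()

clicks = rng.normal(size=16000)  # one second of fake "dolphin" audio
tokens = tokenize(clicks)
print(len(tokens))  # 100 tokens—now a sequence an LM can model
```

Once squawks are tokens, "speaking dolphin" is just next-token prediction in a very weird vocabulary.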


The wild part? It’s going open source this summer. Dr. Doolittle meets TensorFlow. Dude… think of the podcast opportunities.



One Prompt and Just Like That, I Was Runnin’


In Beijing, over 20 humanoid robots competed in a literal half-marathon. The standout? Tiangong Ultra, a 1.8m bot from the Beijing Humanoid Robot Innovation Center, sprinting like a caffeinated Boston Dynamics prototype. Unitree’s G1 and others held their own too.


I'm not saying cybernetic track and field is the next Olympic sport... but I’m not not saying it.




AI Image & Video Gen Gets Personal (and Weirdly Good)


This week’s creative tool drops were straight-up cheat codes:


Seaweed-7B by ByteDance: A 7B-parameter video model that generates crisp 720p at 24 FPS—in real time. It supports image-to-video, start/end frame control, object-specific generation, and synced audio. It’s fast, freaky, and way ahead of its weight class.




InstantCharacter by Tencent: Upload a reference image, then drop your character into any scene or style (anime, Ghibli, realistic). Fidelity is 🔥—it even outperformed GPT-4o in direct comparisons. Think “your OC but everywhere.”


UniAnimate: Animate any still image using a reference motion video. It nails hands, rotation, even guessing body parts not shown in the original image. Works with pets too. Run it locally—no 4090 required.


COBRA (Comic Book AI): Comic colorization powered by reference images. High-accuracy panel-by-panel coloring, plus direct editing tools. Also works on frame-by-frame line art videos. Comic studios, take note—this is a game-changer.



Deepfake Lip-Sync 2.0: Meet Sonic


Tencent’s Sonic model animates faces from a single photo and any audio clip—producing realistic talking-head videos up to 10 minutes long. It handles cartoons, anime, 2.5D, real humans... and it lip-syncs like a boss.

D-ID who? This is a Windows XP-to-RTX-level upgrade.



OpenAI’s o3 & o4-mini: Small Models, Big Brains


OpenAI’s new o3 and o4-mini models are here, and they’re scary smart—especially in math, code, and reasoning. They’re also multimodal with full-on agentic tool use baked in:

  • Autonomously pick and run tools (Python, search, image analysis)

  • Parallelize tasks

  • Generate full research reports, visuals, and more
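"Agentic tool use" sounds abstract, so here's a minimal sketch of the loop: the model picks a tool and arguments, the runtime executes it, and the result flows back into context. The `pick_tool` stub and tool names are hypothetical—real o3/o4-mini tool selection happens inside the model as structured output.

```python
# Minimal agent-loop sketch. Everything here is a stand-in: real agentic
# models emit the tool choice themselves; we hard-code a routing rule.

def run_python(expr: str) -> str:
    return str(eval(expr))  # stand-in for a sandboxed interpreter

def search(query: str) -> str:
    return f"top result for {query!r}"  # stand-in for a search backend

TOOLS = {"python": run_python, "search": search}

def pick_tool(task: str) -> tuple[str, str]:
    # A real model chooses based on reasoning; this toy routes on digits.
    if any(ch.isdigit() for ch in task):
        return "python", task
    return "search", task

def agent_step(task: str) -> str:
    name, args = pick_tool(task)
    result = TOOLS[name](args)   # execute, then feed result back to context
    return f"[{name}] {result}"

print(agent_step("2**10 + 7"))          # routes to the python tool
print(agent_step("dolphin acoustics"))  # routes to search
```

Run enough of these steps in parallel and you get the research-report behavior OpenAI is demoing.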



Benchmark-wise? They edge out Gemini 2.5 Pro in key areas, and o4-mini is cheaper to run. These “mini” models are not playing small.



AI-Powered Time Travel? No, But This Is Cool


Visual Chronicles, a collab from Google and Stanford, scans years of Street View data to detect real-world changes—like when a roof got solar panels or a crosswalk turned red.


It’s a time-traveling urban planning assistant powered by vision models. Huge potential for journalism, civic planning, and environmental research. Just waiting on that public release…
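The core mechanic—compare two captures of the same spot, flag what changed—is simple enough to sketch. Visual Chronicles does this with vision models over years of imagery; this numpy toy just thresholds a pixel diff to show the shape of the problem.

```python
import numpy as np

# Toy change detector for two aligned grayscale captures of the same scene.
# Real systems use learned features, not raw pixels, but the question is the
# same: what fraction of the scene changed between visit A and visit B?

def changed_fraction(before: np.ndarray, after: np.ndarray,
                     thresh: float = 0.2) -> float:
    """Fraction of pixels that differ by more than `thresh` (images in [0,1])."""
    return float((np.abs(after - before) > thresh).mean())

roof = np.zeros((8, 8))
roof_with_panels = roof.copy()
roof_with_panels[2:5, 2:6] = 1.0   # "solar panels" appear on the roof

print(changed_fraction(roof, roof_with_panels))  # 0.1875 — 12 of 64 pixels
```

Scale that from an 8x8 toy to a decade of Street View and you get Visual Chronicles.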



Minecraft Gets a Neural Makeover


Microsoft dropped MineWorld, a playable AI version of Minecraft that responds to your keypresses in real time. It builds scenes on the fly using a visual-action autoregressive transformer.


FPS is still low (4–7), but for live generative gameplay? This is a major leap.
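The "visual-action autoregressive" loop is worth unpacking: each keypress becomes an action token, the model predicts the next frame's visual tokens conditioned on history plus action, and the decoded frame is shown. Here's a hedged stub of that loop—`next_frame_tokens` is a placeholder, not MineWorld's actual model.

```python
# Action-conditioned autoregressive loop, MineWorld-style. The transformer
# is replaced by a trivial stub; only the control flow is the point here.

ACTIONS = {"w": 0, "a": 1, "s": 2, "d": 3}  # hypothetical key-to-token map

def next_frame_tokens(history: list[int], action: int) -> list[int]:
    # Stand-in for the model: the real thing samples visual tokens here.
    return [(t + action + 1) % 256 for t in history]

def play(keys: str) -> list[list[int]]:
    """One generated frame per keypress, each conditioned on the last."""
    frames, history = [], [0, 0, 0, 0]  # seed "frame" of 4 visual tokens
    for k in keys:
        tokens = next_frame_tokens(history, ACTIONS[k])
        frames.append(tokens)
        history = tokens  # autoregressive: output becomes next input
    return frames

frames = play("wasd")
print(len(frames))  # 4 frames — one per keypress
```

The 4–7 FPS ceiling is exactly this loop's cost: every frame is a full forward pass.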



Other AI Goodies This Week


  • NVIDIA PartField: Segments 3D models with precision—perfect for animation or custom textures. Super helpful for game devs, robotics, and 3D artists.

  • Grok Memory Update: xAI’s chatbot now has long-term memory—just like ChatGPT’s. Remembers your projects, preferences, and your... "creative" prompts. Combine that with its massive context window? Keep your eyes on this one.

  • Alibaba’s Wan 2.1 Video Gen: Upload two keyframes and let the model generate the in-between. Control your story from start to finish. Gonna be huge for AI filmmaking.
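The keyframe in-betweening idea in its absolute simplest form is a cross-fade: blend frame A into frame B. Wan 2.1 generates actual motion, not a fade—but this sketch shows the start/end-frame control interface at its most naive.

```python
import numpy as np

# Naive in-betweening between two keyframe images via linear blending.
# A diffusion model like Wan 2.1 hallucinates real motion instead; this
# just demonstrates what "give me n frames between these two" means.

def inbetween(first: np.ndarray, last: np.ndarray, n: int) -> list[np.ndarray]:
    """Return n evenly spaced intermediate frames between two keyframes."""
    return [first + (last - first) * (i / (n + 1)) for i in range(1, n + 1)]

start = np.zeros((4, 4))   # keyframe 1: all black
end = np.ones((4, 4))      # keyframe 2: all white
mids = inbetween(start, end, 3)
print([round(float(f.mean()), 2) for f in mids])  # [0.25, 0.5, 0.75]
```

Replace the linear blend with a learned video prior and you've got the real product.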



Final Thought: AGI? Not Yet. But the Blocks Are Being Built.


This week wasn’t about intelligence explosions or singularity scares—it was about how deeply embedded AI is in everyday life. From dolphin linguistics to design workflows and city planning, the big nebulous dream of AGI is starting to coalesce into real-world tools that actually work.


The blocks are stacking. And every week, we’re coding our way one step closer to whatever comes next.


Happy Easter, everyone. Maybe next year, I can teach a rabbit to code. 🐰




© 2018 Rich Washburn
