ThursdAI - The top AI news from the past week

By: Weights & Biases. Join AI Evangelist Alex Volkov and a panel of experts to cover everything important that happened in the world of AI from the past week.

About this content

Every ThursdAI, Alex Volkov hosts a panel of experts, AI engineers, data scientists and prompt spellcasters on Twitter Spaces to discuss everything major and important that happened in the world of AI over the past week. Topics include LLMs, open source, new capabilities, OpenAI, competitors in the AI space, new LLM models, AI art and diffusion, and much more.

sub.thursdai.news
Alex Volkov

Politics & Government

Episodes

📆 ThursdAI - Jul 3 - ERNIE 4.5, Hunyuan A13B, MAI-DxO outperforms doctors, RL beats SWE bench, Zuck MSL hiring spree & more AI news

2025/07/03

Hey everyone, Alex here 👋

Welcome back to another mind-blowing week on ThursdAI! We’re diving into the first show of the second half of 2025, and let me tell you, AI is not slowing down. This week, we’ve got a massive wave of open-source models from Chinese giants like Baidu and Tencent that are shaking up the game, Meta’s jaw-dropping hiring spree with Zuck assembling an AI dream team, and Microsoft’s medical AI outperforming doctors on the toughest cases. Plus, a real-time AI game engine that had me geeking out on stream. Buckle up, folks, because we’ve got a lot to unpack!

We had incredible guests like Michael Luo from Agentica, dropping knowledge on RL coding agents, and Ivan Burazin from Daytona, revealing the infrastructure powering the agent era. We had an incredible episode this week, with over 8,000 views for the live show (as always, links and show notes are at the end, and the YT live video is here for your convenience if you'd prefer watching).

Open Source AI & LLMs: The Chinese Powerhouse Wave

Man, if there’s one takeaway from this week, it’s that Chinese companies are absolutely dominating the open-source LLM scene. Let’s break down the heavy hitters that dropped this week and why they’ve got everyone talking.

Baidu’s ERNIE 4.5: A Suite of 10 Models to Rule Them All

Baidu, a giant in the Chinese tech space, just flipped the script by open-sourcing their ERNIE 4.5 series. We’re talking 10 distinct models ranging from a whopping 424 billion parameters down to a tiny 0.3 billion. With an Apache 2.0 license, 128K context window, and multimodal capabilities handling image, video, and text input, this is a massive drop. Their biggest Mixture-of-Experts (MoE) model, with 47B active parameters, even outshines OpenAI’s o1 on visual knowledge tasks like DocVQA, scoring 93% compared to o1’s 81%!

What’s wild to me is Baidu’s shift. They’ve been running ERNIE in production for years (think chatbots and more across their ecosystem), but they weren’t always open-source fans. Now, they’re not just joining the party, they’re hosting it. If you’re into tinkering, this is your playground: check it out on Hugging Face (HF) or dive into their technical paper (Paper).

Tencent’s Hunyuan-A13B-Instruct: WizardLM Team Strikes Again

Next up, Tencent dropped Hunyuan-A13B-Instruct, and oh boy, does it have a backstory. This 80B parameter MoE model (13B active at inference) comes from the legendary WizardLM team, poached from Microsoft after a messy saga where their killer models got yanked from the internet over “safety concerns.” I remember the frustration: we were all hyped, then bam, gone. Now, under Tencent’s wing, they’ve cooked up a model with a 256K context window, hybrid fast-and-slow reasoning modes, and benchmarks that rival DeepSeek R1 and OpenAI o1 on agentic tasks. It scores an impressive 87% on AIME 2024, though it dips to 76% on AIME 2025, hinting at some overfitting quirks. Still, for a model with 13B active parameters, this is all VERY impressive.

Here’s the catch: the license. It excludes commercial use in the EU, UK, and South Korea, and bans usage if you’ve got over 100M active users. So, not as open as we’d like, but for its size, it’s a beast that fits on a single machine, making it a practical choice for many. They’ve also released two datasets, ArtifactsBench and C3-Bench, for code and agent evaluation. I’m not sold on the name (Hunyuan doesn’t roll off the tongue for Western markets), but the WizardLM pedigree means it’s worth a look. Try it out on Hugging Face (HF) or test it directly (Try It).

Huawei’s Pangu Pro MoE: Sidestepping Sanctions with Ascend NPUs

Huawei entered the fray with Pangu Pro MoE, a 72B parameter model with 16B active per token, and here’s what got me hyped: it’s trained entirely on their own Ascend NPUs, not Nvidia or AMD hardware. This is a bold move to bypass US sanctions, using 4,000 of these chips to pre-train on 13 trillion tokens. The result? Up to 1,528 tokens per second per card with speculative decoding, outpacing dense models in speed and cost-efficiency. Performance-wise, it’s close to DeepSeek and Qwen, making it a contender for those outside the Nvidia ecosystem.

I’m intrigued by the geopolitical angle here. Huawei’s proving you don’t need Western tech to build frontier models, and while we don’t know who’s got access to these Ascend NPUs, it’s likely a game-changer for Chinese firms. Licensing isn’t as permissive as MIT or Apache, but it’s still open-weight. Peek at it on Hugging Face (HF) for more details.

DeepSWE-Preview: RL Coding Agent Hits 59% on SWE-Bench

Switching gears, I was blown away chatting with Michael Luo from Agentica about DeepSWE-Preview, an open-source coding agent trained with ...

1 hr 36 min

📅 ThursdAI - Jun 26 - Gemini CLI, Flux Kontext Dev, Search Live, Anthropic destroys books, Zucks superintelligent team & more AI news

2025/06/26

Hey folks, Alex here, writing from... an undisclosed tropical paradise location 🏝️

I'm on vacation, but the AI news doesn't stop of course, and neither does ThursdAI. So huge shoutout to Wolfram Ravenwolf for running the show this week, and to Nisten, LDJ and Yam who joined.

So... no long blogpost with analysis this week, but I'll def. recommend tuning in to the show that the folks ran. They had a few guests on, and even got some breaking news (a new Flux Kontext that's open source).

Of course many of you are readers and are here for the links, so I'm including the raw TL;DR + speaker notes as prepared by the folks for the show!

P.S - our (rescheduled) hackathon, WeaveHacks, is coming up in San Francisco on July 12-13. If you're interested in a chance to win a RoboDog, you're welcome to join us and give it a try. Register HERE

Ok, that's it for this week, please enjoy the show and see you next week!

ThursdAI - June 26th, 2025 - TL;DR

* Hosts and Guests
  * WolframRvnwlf - Host (@WolframRvnwlf)
  * Co-Hosts - @yampeleg, @nisten, @ldjconfirmed
  * Guest - Jason Kneen (@jasonkneen) - Discussing MCPs, coding tools, and agents
  * Guest - Hrishioa (@hrishioa) - Discussing agentic coding and spec-driven development
* Open Source LLMs
  * Mistral Small 3.2 released with improved instruction following, reduced repetition & better function calling (X)
  * Unsloth AI releases dynamic GGUFs with fixed chat templates (X)
  * Kimi-VL-A3B-Thinking-2506 multimodal model updated for better video reasoning and higher resolution (Blog)
  * Chinese Academy of Science releases Stream-Omni, a new Any-to-Any model for unified multimodal input (HF, Paper)
  * Prime Intellect launches SYNTHETIC-2, an open reasoning dataset and synthetic data generation platform (X)
* Big CO LLMs + APIs
  * Google
    * Gemini CLI, a new open-source AI agent, brings Gemini 2.5 Pro to your terminal (Blog, GitHub)
    * Google reduces free tier API limits for previous generation Gemini Flash models (X)
    * Search Live with voice conversation is now rolling out in AI Mode in the US (Blog, X)
    * Gemini API is now faster for video and PDF processing with improved caching (Docs)
  * Anthropic
    * Claude introduces an "artifacts" space for building, hosting, and sharing AI-powered apps (X)
    * Federal judge rules Anthropic's use of books for training Claude qualifies as fair use (X)
  * xAI
    * Elon Musk announces the successful launch of Tesla's Robotaxi (X)
  * Microsoft
    * Introduces Mu, a new language model powering the agent in Windows Settings (Blog)
  * Meta
    * Report: Meta pursued acquiring Ilya Sutskever's SSI, now hires co-founders Nat Friedman and Daniel Gross (X)
  * OpenAI
    * OpenAI removes mentions of its acquisition of Jony Ive's startup 'io' amid a trademark dispute (X)
    * OpenAI announces the release of DeepResearch in API + Webhook support (X)
* This Week's Buzz
  * Alex is on vacation; WolframRvnwlf is attending AI Tinkerers Munich on July 25 (Event)
  * Join the W&B Hackathon happening in 2 weeks in San Francisco - grand prize is a RoboDog! (Register for Free)
* Vision & Video
  * MeiGen-MultiTalk code and checkpoints for multi-person talking head generation are released (GitHub, HF)
  * Google releases VideoPrism for generating adaptable video embeddings for various tasks (HF, Paper, GitHub)
* Voice & Audio
  * ElevenLabs launches 11.ai, a voice-first personal assistant with MCP support (Sign Up, X)
  * Google Magenta releases Magenta RealTime, an open weights model for real-time music generation (Colab, Blog)
  * ElevenLabs launches a mobile app for iOS and Android for on-the-go voice generation (X)
* AI Art & Diffusion & 3D
  * Google rolls out Imagen 4 and Imagen 4 Ultra in the Gemini API and Google AI Studio (Blog)
  * OmniGen 2 open weights model for enhanced image generation and editing is released (Project Page, Demo, Paper)
* Tools
  * OpenMemory Chrome Extension provides shared memory across ChatGPT, Claude, Gemini and more (X)
  * LM Studio adds MCP support to connect local LLMs with your favorite servers (Blog)
  * Cursor is now available as a Slack integration (Dashboard)
  * All Hands AI releases the OpenHands CLI, a model-agnostic, open-source coding agent (Blog, Docs)
  * Warp 2.0 launches as an Agentic Development Environment with multi-threading (X)
* Studies and Others
  * The /r/LocalLLaMA subreddit is back online after a brief moderation issue (Reddit, News)
  * Andrej Karpathy's talk "Software 3.0" discusses the future of programming in the age of AI (YouTube, Summary)

Thank you, see you next week!

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe

1 hr 40 min

📆 ThursdAI - June 19 - MiniMax M1 beats R1, OpenAI records your meetings, Gemini in GA, W&B uses Coreweave GPUs & more AI news

2025/06/20

Hey all, Alex here 👋

This week, while not the busiest week in releases (we can't get a SOTA LLM every week now, can we), was full of interesting open source releases and feature updates such as the ChatGPT meetings recorder (which we live tested on the show, the limit is 2 hours!)

It was also a day after our annual W&B conference called FullyConnected, and so I had a few goodies to share with you, like answering the main question: when will W&B have some use of those GPUs from CoreWeave? The answer is... now! (We launched a brand new preview of an inference service with open source models)

And finally, we had a great chat with Pankaj Gupta, co-founder and CEO of Yupp, a new service that lets users chat with the top AIs for free, while turning their votes into leaderboards for everyone else to understand which Gen AI model is best for which task/topic. It was a great conversation, and he even shared an invite code with all of us (I'll attach it to the TL;DR and show notes). Let's dive in!

00:00 Introduction and Welcome
01:04 Show Overview and Audience Interaction
01:49 Special Guest Announcement and Experiment
03:05 Wolfram's Background and Upcoming Hosting
04:42 TLDR: This Week's Highlights
15:38 Open Source AI Releases
32:34 Big Companies and APIs
32:45 Google's Gemini Updates
42:25 OpenAI's Latest Features
54:30 Exciting Updates from Weights & Biases
56:42 Introduction to Weights & Biases Inference Service
57:41 Exploring the New Inference Playground
58:44 User Questions and Model Recommendations
59:44 Deep Dive into Model Evaluations
01:05:55 Announcing Online Evaluations via Weave
01:09:05 Introducing Pankaj Gupta from Yupp.ai
01:10:23 Yupp.ai: A New Platform for Model Evaluations
01:13:05 Discussion on Crowdsourced Evaluations
01:27:11 New Developments in Video Models
01:36:23 OpenAI's New Transcription Service
01:39:48 Show Wrap-Up and Future Plans

Here's the TL;DR and show notes links

ThursdAI - June 19th, 2025 - TL;DR

* Hosts and Guests
  * Alex Volkov - AI Evangelist at Weights & Biases (@altryne)
  * Co Hosts - @WolframRvnwlf @yampeleg @nisten @ldjconfirmed
  * Guest - @pankaj - co-founder of Yupp.ai
* Open Source LLMs
  * Moonshot AI open-sourced Kimi-Dev-72B (Github, HF)
  * MiniMax-M1 456B (45B Active) - reasoning model (Paper, HF, Try It, Github)
* Big CO LLMs + APIs
  * Google drops Gemini 2.5 Pro/Flash GA, 2.5 Flash-Lite in Preview (Blog, Tech report, Tweet)
  * Google launches Search Live: Talk, listen and explore in real time with AI Mode (Blog)
  * OpenAI adds MCP support to Deep Research in ChatGPT (X, Docs)
  * OpenAI launches their meetings recorder in the Mac app (docs)
  * Zuck update: Considering bringing Nat Friedman and Daniel Gross to Meta (information)
* This Week's Buzz
  * NEW! W&B Inference provides a unified interface to access and run top open-source AI models (inference, docs)
  * NEW! W&B Weave Online Evaluations delivers real-time production insights and continuous evaluation for AI agents across any cloud. (X)
  * The new platform offers "metal-to-token" observability, linking hardware performance directly to application-level metrics.
* Vision & Video
  * ByteDance new video model beats VEO3 - Seedance.1.0 mini (Site, FAL)
  * MiniMax Hailuo 02 - 1080p native, SOTA instruction following (X, FAL)
  * Midjourney video is also here - great visuals (X)
* Voice & Audio
  * Kyutai launches open-source, high-throughput streaming Speech-To-Text models for real-time applications (X, website)
* Studies and Others
  * LLMs Flunk Real-World Coding Contests, Exposing a Major Skill Gap (Arxiv)
  * MIT Study: ChatGPT Use Causes Sharp Cognitive Decline (Arxiv)
  * Andrej Karpathy's "Software 3.0": The Dawn of English as a Programming Language (youtube, deck)
* Tools
  * Yupp launches with 500+ AI models, a new leaderboard, and a user-powered feedback economy - use thursdai link* to get 50% extra credits
  * BrowserBase announces director.ai - an agent to run things on the web
  * Universal system prompt for reduction of hallucination (from Reddit)

*Disclosure: while this isn't a paid promotion, I do think that Yupp offers great value. I get a bit more credits on their platform if you click my link, and so do you. You can go to yupp.ai and register with no affiliation if you wish.

1 hr 42 min
