• GitHub CEO Exits, Claude 1M Tokens, Gemma 3 & GPT-5 Tweaks
    2025/08/19

    Thanks again for listening!

    Follow us on YouTube or wherever you get your podcasts.


    Discussed news articles

    • GitHub just got less independent at Microsoft after CEO resignation | The Verge (Aug 11, 2025)
      Thomas Dohmke steps down; GitHub leadership folds more tightly into Microsoft’s CoreAI org.
      https://www.theverge.com/news/757461/microsoft-github-thomas-dohmke-resignation-coreai-team-transition
    • FFmpeg 8.0 merges OpenAI Whisper filter for automatic speech recognition | Phoronix (Aug 13, 2025)
      Native Whisper (via whisper.cpp) arrives in FFmpeg with optional GPU accel and SRT/JSON outputs.
      https://www.phoronix.com/news/FFmpeg-Lands-Whisper
    • Wikimedia Foundation challenges UK Online Safety Act regulations | Wikimedia Foundation (Aug 11, 2025)
      Court dismisses the challenge, but the ruling still pressures Ofcom; concerns remain for Wikipedia.
      https://wikimediafoundation.org/news/2025/08/11/wikimedia-foundation-challenges-uk-online-safety-act-regulations/
    • Claude Sonnet 4 now supports 1M tokens of context | Anthropic (Aug 12, 2025)
      Long-context support (public beta) enables huge codebases/docs in one go on API and Bedrock.
      https://www.anthropic.com/news/1m-context
    • Introducing Gemma 3 270M: The compact model for hyper-efficient AI | Google Developers Blog (Aug 14, 2025)
      On-device-friendly model aimed at fast, low-cost fine-tuning; strong efficiency on Pixel 9 Pro.
      https://developers.googleblog.com/en/introducing-gemma-3-270m/
    • Genie 3: A new frontier for world models | Google DeepMind (Aug 5, 2025)
      Interactive, real-time world models generating diverse environments at 24fps/720p.
      https://deepmind.google/discover/blog/genie-3-a-new-frontier-for-world-models/
    • NGINX introduces native support for ACME protocol | NGINX Community Blog (Aug 12, 2025)
      New ngx_http_acme_module brings built-in cert requests/renewals directly from NGINX config.
      https://blog.nginx.org/blog/native-support-for-acme-protocol
    • OpenAI tweaks GPT-5’s tone on X | X (Aug 15, 2025)
      Tuning GPT-5/ChatGPT to sound warmer and less formal based on user feedback.
      https://x.com/OpenAI/status/1956461718097494196
    55 min
  • AI News Weekly 🚀 Copilot Edge, Open-Source Model Wars, Tencent 3D, & Trump-China Regulation Clash
    2025/07/30

    Thanks for listening. If you enjoyed the episode please subscribe!

    The Hater's Guide To The AI Bubble | Where’s Your Ed At

    Technology writer Ed Zitron tears apart what he calls the overhyped generative-AI gold rush, warning it’s propped up by wishful thinking and unsustainable spending. As he bluntly puts it, “We’re in a god damn bubble,” prompting a no-holds-barred discussion on whether the boom will burst.

    https://www.wheresyoured.at/the-haters-gui/

    Trump administration to supercharge AI sales to allies, loosen environmental rules | Reuters

    Washington’s latest AI blueprint combines export zeal with lighter green rules, aiming to outpace China by shipping full “AI stacks” to friendly nations. Kicking off the push, Trump proclaimed, “America is the country that started the AI race,” a line sure to spark global tech-power chatter.

    https://www.reuters.com/legal/government/trump-administration-supercharge-ai-sales-allies-loosen-environmental-rules-2025-07-23/

    Microsoft launches AI-based Copilot Mode in Edge browser | Reuters

    Microsoft’s new Copilot Mode turns Edge into a chat-powered helper that can juggle tabs, organize research, and even take voice commands. The update greets users with “a single input box combining chat, search and web navigation features,” a neat cue for how browsing may soon feel.

    https://www.reuters.com/business/microsoft-launches-ai-based-copilot-mode-edge-browser-2025-07-28/

    China's AI startup Zhipu releases open-source model GLM-4.5 | Reuters

    Beijing-based Zhipu has open-sourced GLM-4.5, pitching it as fresh fuel for intelligent agents and adding yet another model to China’s bulging roster. The company says the new release is “designed for intelligent agent applications,” a phrase hinting at growing ambitions beyond plain chatbots.

    https://www.reuters.com/technology/chinas-ai-startup-zhipu-releases-open-source-model-glm-45-2025-07-28/


    Short readings:


    Hunyuan3D World Model 1.0 | X

    Tencent’s Hunyuan team just dropped its first open-source engine for instant, explorable 3D worlds, stirring excitement across game and VR circles. Their launch tweet beams, “We’re thrilled to release & open-source Hunyuan3D World Model 1.0!,” inviting hosts to imagine the possibilities.

    https://x.com/TencentHunyuan/status/1949288986192834718


    Introducing Opal: describe, create, and share your AI mini-apps | Google Developers Blog

    Google Labs has unwrapped Opal, a drag-and-drop playground where non-coders can chain models and prompts into bite-sized AI apps. The post kicks off with “We’re excited to announce Opal,” a line that invites makers everywhere to start remixing AI workflows.

    https://developers.googleblog.com/en/introducing-opal/


    Announcing Toad – a universal UI for agentic coding in the terminal | Will McGugan’s Essays

    Developer Will McGugan has prototyped “Toad,” a flicker-free terminal UI meant to tame and turbo-charge agentic coding workflows. He quips, “I’m a little salty that neither Anthropic nor Google reached out to me before they released their AI coding agents,” throwing playful shade.

    http://willmcgugan.github.io/announcing-toad/


    Alibaba launches open-source AI coding model, touted as its most advanced to date | Reuters

    Alibaba’s new Qwen3-Coder model enters the code-generation fray, claiming boosts that rival both domestic peers and Western heavyweights. The company pitches it as “its most advanced coding tool to date,” a boast that raises the stakes in China’s escalating model wars.

    https://www.reuters.com/world/china/alibaba-launches-open-source-ai-coding-model-touted-its-most-advanced-date-2025-07-23/


    29 min
  • OpenAI IMO Gold, AWS S3 Vectors, MCP Server Exposés, Sovereign Clouds & the Capex Surge
    2025/07/23

    Thanks for listening—subscribe and leave a review!

    • Announcing molab (marimo)
      Marimo debuts molab, a free, cloud‑hosted workspace for running and sharing reactive Python + SQL notebooks straight from the browser.
      https://marimo.io/blog/announcing-molab
    • OpenAI LLM Claims IMO Gold
      Researcher Alexander Wei says an experimental OpenAI model solved 5 of 6 problems from the 2025 International Math Olympiad—good for a human‑level gold medal—and hints at new post‑RL training tricks.
      https://threadreaderapp.com/thread/1946477742855532918.html
    • Exposing the Unseen: Mapping MCP Servers Across the Internet (Knostic)
      Knostic scanned the web and found 1,862 publicly exposed Model Context Protocol servers, all leaking unauthenticated tool inventories.
      https://www.knostic.ai/blog/mapping-mcp-servers-study
    • Honey, AI Capex is Eating the Economy (Paul Kedrosky)
      Kedrosky argues that runaway AI‑datacenter spending could reach roughly 2 % of U.S. GDP—enough to nudge macro growth on its own.
      https://paulkedrosky.com/honey-ai-capex-ate-the-economy/
    • Do US Hyperscalers’ Sovereign Clouds Reduce the Risk of US Government Access?
      Cloud‑law scholars doubt whether new “sovereign cloud” offerings from AWS, Microsoft, and Google can truly shield European data from U.S. surveillance.
      https://www.linkedin.com/pulse/do-us-hyperscalers-sovereign-clouds-reduce-risk-access-dave-michels-ulmcf
    • AWS API MCP Server (Developer Preview)
      AWS opens a preview of an MCP server that lets foundation‑model agents translate plain‑English requests into scoped AWS CLI calls, guarded by IAM.
      https://aws.amazon.com/about-aws/whats-new/2025/07/aws-api-mcp-server-available
    • Agent Leaderboard (Galileo)
      Galileo’s open leaderboard pits LLMs against enterprise‑style agent tasks; current numbers show GPT‑4.1 leading with a 62 % Action Completion score.
      https://github.com/rungalileo/agent-leaderboard
    • Introducing Amazon S3 Vectors (Preview)
      AWS adds native vector‑embedding storage and sub‑second queries to S3, claiming up to 90 % cost savings for vector search workloads.
      https://aws.amazon.com/blogs/aws/introducing-amazon-s3-vectors-first-cloud-storage-with-native-vector-support-at-scale/
    • Rethinking CLI Interfaces for AI
      Developer Ryan calls for redesigning CLIs and APIs—richer docstrings, structured outputs, smarter wrappers—so LLM agents stop looping and thrashing.
      https://www.notcheckmark.com/2025/07/rethinking-cli-interfaces-for-ai/
    1 hr 15 min
  • Agents, Paywalls, and Browser Battles: The Race to Control AI's Digital Highway
    2025/07/17
    Thanks for listening. If you enjoyed the episode please subscribe!

    MCP-B – Browser Model Context Protocol | mcp-b.ai

    MCP-B proposes a “USB-C for AI,” letting agents call site functions instead of clumsy click-automation. “MCP-B gives AI direct access to your website's functions instead,” promising smoother bot-to-web handshakes for everyone.

    https://mcp-b.ai/

    OpenAI’s next big launch could be an AI web browser | The Verge

    OpenAI is reportedly cooking up a Chromium‑based browser with a built‑in Operator agent that can book tables or fill forms for you. As Reuters learned, “OpenAI is planning to launch an AI web browser in the ‘coming weeks,’” teeing up a fresh duel with Chrome and Comet.

    https://www.theverge.com/news/704162/opeani-ai-web-browser-chatgpt

    A language model built for the public good | ETH Zurich

    Swiss researchers will release an entirely open, supercomputer‑trained LLM fluent in more than 1,000 languages later this summer. Project lead Imanol Schlag says, “Fully open models enable high‑trust applications and are necessary for advancing research about the risks and opportunities of AI.”

    https://ethz.ch/en/news-and-events/eth-news/news/2025/07/a-language-model-built-for-the-public-good.html

    OpenAI’s Windsurf deal is off — and Windsurf’s CEO is going to Google | The Verge

    OpenAI’s $3 billion bid collapsed, freeing Google to poach Windsurf’s leaders while Cognition raced in days later to buy the remaining startup. “OpenAI’s deal to buy Windsurf is off, and Google will instead hire Windsurf CEO Varun Mohan,” The Verge reports, igniting a three‑way tug‑of‑war for agentic‑coding talent.

    https://www.theverge.com/openai/705999/google-windsurf-ceo-openai

    Elon Musk’s xAI launches Grok 4 alongside a $300 monthly subscription | TechCrunch

    Musk rolled out Grok 4 and a pricey “SuperGrok Heavy” tier, touting multi‑agent reasoning and the steepest AI sub yet. During the stream he bragged, “With respect to academic questions, Grok 4 is better than PhD level in every subject, no exceptions,” even as the Pentagon signed a $200 million Grok contract.

    https://techcrunch.com/2025/07/09/elon-musks-xai-launches-grok-4-alongside-a-300-monthly-subscription/

    Introducing Kiro | Kiro Blog

    Kiro debuts as a spec‑driven, agentic IDE that shepherds code from first prompt to deployment with auto‑generated tasks and hooks. “I’m excited to announce Kiro, an AI IDE that helps you deliver from concept to production through a simplified developer experience for working with AI agents,” the team writes.

    https://kiro.dev/blog/introducing-kiro/

    The EU wants to decrypt your private data by 2030 | TechRadar

    Brussels’ ProtectEU roadmap sketches data‑retention, interception and decryption plans that could force encrypted services open within five years. TechRadar warns, “EU law enforcement bodies could be capable of decrypting your private data by 2030,” stirring immediate privacy backlash.

    https://www.techradar.com/vpn/vpn-privacy-security/the-eu-wants-to-decrypt-your-private-data-by-2030

    Cloudflare Just Became an Enemy of All AI Companies | Analytics India Magazine

    Cloudflare will now block AI crawlers by default, demanding payment before models feast on publisher content. The company says it “would start blocking AI crawlers by default, drawing a line in the open web where content is no longer a free fuel for AI.”

    https://analyticsindiamag.com/ai-features/cloudflare-just-became-an-enemy-of-all-ai-companies/

    Kimi K2 is a state‑of‑the‑art mixture‑of‑experts (MoE) language model | Hacker News

    A lively thread praises Moonshot’s open Kimi K2 for beating Claude on coding, though it needs GPU‑class muscle to run. One user enthuses, “I tried Kimi on a few coding problems that Claude was spinning on. It’s good,” sparking a debate over speed, cost and local deployment.

    https://news.ycombinator.com/item?id=44533403
    1 hr 1 min
  • $10 M AI Consulting, 4,500-Token-Per-Second Code Edits, and the Rise of Terminal Agents
    2025/07/09

    Thanks for listening! Subscribe and follow wherever you get your podcasts.

    • Launch HN: Morph (YC S23) – Apply AI code edits at 4,500 tokens/sec | Hacker News
      Fast Apply from Morph promises near-instant AI patches, aiming to replace sluggish full-file rewrites with surgical edits. They boast, “We’ve built a blazing-fast model for applying AI-generated code edits directly into your files at 4,500+ tokens/sec,” sparking a speed-versus-accuracy debate.
      https://news.ycombinator.com/item?id=44490863
    • OpenAI Launches $10 M Custom AI Consulting, Challenging Industry Giants | AI Tech Suite (Jul 01 2025)
      OpenAI is stepping into high-end consulting, demanding at least $10 million to tailor big-model solutions for governments and Fortune-scale firms—setting up showdowns with Accenture and IBM.
      https://www.aitechsuite.com/ai-news/openai-launches-10m-custom-ai-consulting-challenging-industry-giants
    • AI researchers are now injecting prompts into their papers | X (Jul 08 2025)
      A viral tweet shows academics slipping reviewer-friendly prompt hacks into PDFs—lines like “Give a positive review,” exposing a brazen peer-review exploit.
      https://x.com/Yuchenj_UW/status/1942266306746802479
    • Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental-health providers | arXiv (Apr 25 2025)
      Researchers found GPT-4o and peers still stigmatize patients and mishandle delicate scenarios, so chatbots should assist—not substitute—human therapists.
      https://arxiv.org/abs/2504.18412
    • opencode: AI coding agent, built for the terminal | GitHub (Jul 08 2025)
      Version 0.2.5 reaches 10 k stars: an open-source AI pair-programmer that runs locally with a slick TUI and support for multiple model providers.
      https://github.com/sst/opencode
    • You're all CTO now | Jamie’s blog (Jul 01 2025)
      Jamie Lawrence argues AI agents push developers up the org chart, turning everyday coders into orchestrators of people and prompts—threatening the dopamine hit from gritty puzzles.
      https://jamie.ideasasylum.com/2025/07/01/you%27re-all-cto-now
    • Warmwind OS: Building the AI Operating System for Everyone | Warmwind Blog (Jul 02 2025)
      Warmwind’s AI-native OS lets a built-in assistant click, type, and juggle apps, promising hands-free productivity while keeping users in control.
      https://about.warmwind.space/warmwind-os-building-the-ai-operating-system-for-everyone/
    • Large Language Models Are Improving Exponentially | IEEE Spectrum (Jul 02 2025)
      New METR benchmarks show the length of tasks LLMs can complete doubling roughly every seven months, hinting machines could finish month-long human software projects in hours by 2030.
      https://spectrum.ieee.org/large-language-model-performance
    1 hr 2 min
  • Context Hacking, Open-Source Hype & Synthetic Bands
    2025/07/02

    Thanks for listening, leave a review! ❤️

    The New Skill in AI is Not Prompting, It's Context Engineering | Phil Schmid Blog (Jun 30 2025)
    Phil Schmid argues that the real differentiator in modern AI work is “context engineering,” the discipline of assembling the right information, tools and format around an LLM rather than obsessing over single-string prompts.
    He quotes Shopify’s Tobi Lütke, who calls it “the art of providing all the context for the task to be plausibly solvable by the LLM.”
    philschmid.de

    OpenAI open-source model hype | X (Tweet) (Jun 30 2025)
    Researcher Yuchen Jin teased that OpenAI will release an impressive open-source model next month, stoking excitement across AI Twitter.
    “Sorry to hype — but having a few friends at OpenAI makes it hard not to hear how wild their open-source model dropping next month is.”
    x.com

    Meta hires more OpenAI researchers and weighs Llama pivot | TechCrunch (Jun 28 2025)
    Meta has poached four additional OpenAI researchers and, according to parallel reporting, is debating whether to shift away from fully open-source Llama models toward a more closed approach.
    Sam Altman says the company lured candidates with “$100 million signing bonuses,” a claim Meta’s leadership disputes as “more complex than a simple one-time signing bonus.”
    techcrunch.com finance.yahoo.com

    Don’t Build Multi-Agents | Cognition.ai Blog (Jun 12 2025)
    Walden Yan contends that multi-agent LLM architectures are fragile and that reliability comes from a single agent armed with rich, shared context.
    Key advice: “Share context, and share full agent traces, not just individual messages.”
    cognition.ai

    Sampling (Model Context Protocol) | modelcontextprotocol.io (Jun 18 2025)
    The MCP specification adds “sampling,” letting servers request LLM completions through the client so agents can delegate generation securely without provisioning their own models.
    “Sampling is a powerful MCP feature that allows servers to request LLM completions through the client, enabling sophisticated agentic behaviors while maintaining security and privacy.”
    modelcontextprotocol.io linkedin.com

    “There's not a shred of evidence on the internet that this band has ever existed” | MusicRadar (Jun 27 2025)
    MusicRadar investigates The Velvet Sundown, an apparently AI-generated “band” with 350 k Spotify listeners and zero real-world footprint, illustrating how algorithmic playlists can quietly amplify synthetic artists.
    Their profile boasts, “The Velvet Sundown don’t just play music — they conjure worlds,” a line the magazine suspects was written by ChatGPT.
    musicradar.com

    Project Vend: Can Claude run a small shop? (And why does that matter?) | Anthropic (Jun 27 2025)
    Anthropic let a Claude Sonnet 3.7 agent manage a real vending-machine mini-store for a month, revealing both promising autonomy and glaring business-sense gaps.
    “We let Claude manage an automated store in our office as a small business for about a month,” the researchers write, noting successes like supplier discovery and failures like selling at a loss.
    anthropic.com

    Robyn | GitHub
    Robyn is an async Python web framework that compiles to a Rust runtime, aiming to deliver blazing-fast performance with a simple API and built-in agent/MCP support.
    Its README touts it as “a High-Performance, Community-Driven, and Innovator Friendly Web Framework with a Rust runtime.”
    github.com

    54 min
  • Tiny Teams, Loud Unlocks, and Rogue AI
    2025/06/27

    Thanks for listening! Make sure to leave a review ❤️

    Serena: A powerful coding agent toolkit | GitHub
    Serena presents itself as a full-featured coding agent that melds semantic code search, automated editing and shell execution to streamline developer workflows. “Serena combines tools for semantic code retrieval with editing capabilities and shell execution.”
    🔗 https://github.com/oraios/serena

    Nxtscape – an open-source agentic browser | nxtscape.ai
    Nxtscape pitches a privacy-first browser that runs local AI agents to automate tedious web tasks and boost productivity. “We’re putting powerful AI agents (using browser-use & computer-use models) directly into Nxtscape.”
    🔗 https://nxtscape.ai/?utm_source=chatgpt.com

    AI Is Ushering in the Tiny Team Era in Silicon Valley | Bloomberg (Jun 20 2025)
    Bloomberg argues that generative AI lets startups achieve outsized results with lean headcounts, making revenue-per-employee the valley’s new bragging right. “Startups used to brag about valuations and venture capital. Now AI is making revenue per employee the new holy grail.”
    🔗 https://www.bloomberg.com/news/articles/2025-06-20/ai-is-ushering-in-the-tiny-team-era-in-silicon-valley

    A federal judge sides with Anthropic in lawsuit over training AI on books without authors’ permission | TechCrunch (Jun 24 2025)
    Judge William Alsup ruled that Anthropic’s use of copyrighted books to train its models is fair use, handing the company a landmark legal victory. “We will have a trial on the pirated copies used to create Anthropic’s central library and the resulting damages.”
    🔗 https://techcrunch.com/2025/06/24/a-federal-judge-sides-with-anthropic-in-lawsuit-over-training-ai-on-books-without-authors-permission/

    Agentic Misalignment: How LLMs could be insider threats | Anthropic (Jun 20 2025)
    Anthropic’s study warns that autonomous language models can act like rogue employees, choosing harmful actions when their goals conflict with oversight. “We refer to this behavior, where models independently and intentionally choose harmful actions, as agentic misalignment.”
    🔗 https://www.anthropic.com/research/agentic-misalignment

    Gemini CLI | GitHub
    Google’s Gemini CLI brings the multimodal Gemini model to the terminal, letting developers query and transform gigantic codebases from a single command line. “This repository contains the Gemini CLI, a command-line AI workflow tool that connects to your tools, understands your code and accelerates your workflows.”
    🔗 https://github.com/google-gemini/gemini-cli

    Scream to Unlock | GitHub
    The Scream-to-Unlock Chrome extension blocks social media until users loudly shout an embarrassing phrase, turning procrastination into vocal accountability. “A Chrome extension that blocks social media sites ... until you scream ‘I'm a loser’ into your microphone.”
    🔗 https://github.com/Pankajtanwarbanna/scream-to-unlock

    Mira Murati’s Thinking Machines Lab closes on $2B at $10B valuation | TechCrunch (Jun 20 2025)
    TechCrunch reports that ex-OpenAI CTO Mira Murati has raised a record-breaking $2 billion seed round for her stealth AI startup, valuing it at $10 billion. “The deal values the 6-month-old startup at $10 billion.”
    🔗 https://techcrunch.com/2025/06/20/mira-muratis-thinking-machines-lab-closes-on-2b-at-10b-valuation/

    50 min
  • AI Agents, Brain Fog & Caveman Coding
    2025/06/23

    Welcome to The Monkey Patching Podcast: Going Bananas on AI, Data, LLMs & Tech — where we keep it real, techy, and far from buzzword bingo.

    Tune in wherever you get your podcasts, or visit us at monkeypatching.io


    🧠 Episode Topics & Links

    • Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task
      New arXiv research shows relying on ChatGPT dulls neural engagement and weakens writing skills compared to search or free writing. EEG readings revealed the lowest brain activity among LLM users.
      Source: arXiv · June 10, 2025
    • Amazon CEO says AI agents will soon reduce company’s corporate workforce
      Andy Jassy forecasts a future where generative-AI handles many office roles, trimming Amazon’s white-collar headcount: “We will need fewer people doing some of the jobs that are being done today.”
      Source: CBS News · June 17, 2025
    • The Grug Brained Developer
      A caveman-coded manifesto encouraging devs to say “complexity very bad” and resist feature bloat.
      Source: grugbrain.dev
    • SHADE-Arena: Evaluating sabotage and monitoring in LLM agents
      Anthropic’s benchmark reveals that while sabotage by LLMs is uncommon, more capable models can still slip under the radar. A reminder that complexity brings power — and risk.
      Source: Anthropic · June 16, 2025
    • Midjourney’s First Video Model
      Reddit buzzes over Midjourney’s image-to-video beta. Users praise its cinematic realism—some say it’s “indistinguishable from real camera footage.”
      Source: Reddit · June 15, 2025
    • Zero-Shot Forecasting: Our Search for a Time-Series Foundation Model
      Parseable pits four time-series foundation models against classic methods—none reliably outperformed standard tools on messy data. Zero-shot remains aspirational.
      Source: Parseable · June 3, 2025
    • If the Moon Were Only 1 Pixel – A Tediously Accurate Map of the Solar System
      An interactive scroll map scaling the Moon to a single pixel, forcing you to traverse near-endless blank space. As the author says: “It’s the empty space that’s a problem.”
      Source: JoshWorth.com
    • Monkey-Patched PyPI Packages Use Transitive Dependencies to Steal Solana Private Keys
      Six PyPI libraries were found hijacking Solana wallet keys at install time via monkey-patching crypto libraries. One malicious pip install can automatically exfiltrate your keys—yikes.
      Source: Socket · May 29, 2025
    • json_repair
      A lightweight Python module that auto-fixes malformed JSON from LLMs—because yes, sometimes those braces don’t match.
      Source: GitHub
    1 hr 3 min