AI Explained Official Podcast

By: Philip - Host of AI Explained YT
  • Summary

  • Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.

    © 2024 AI Explained Official Podcast

Episodes
  • o3 - wow
    2024/12/21

    o3 isn’t one of the biggest developments in AI for 2+ years because it beats a particular benchmark. It is so because it demonstrates a reusable technique through which almost any benchmark could fall, and at short notice. I’ll cover all the highlights, benchmarks broken, and what comes next. Plus, the costs OpenAI didn’t want us to know, Genesis, ARC-AGI 2, Gemini-Thinking, and much more.


    FrontierMath: https://epoch.ai/frontiermath

    https://arxiv.org/pdf/2411.04872

    Chollet Statement: https://arcprize.org/blog/oai-o3-pub-breakthrough

    MLC Paper:

    https://www.scientificamerican.com/article/new-training-method-helps-ai-generalize-like-people-do/?utm_campaign=socialflow&utm_source=twitter&utm_medium=social

    AlphaCode 2: https://storage.googleapis.com/deepmind-media/AlphaCode2/AlphaCode2_Tech_Report.pdf

    Human Performance on ARC-AGI: https://arxiv.org/pdf/2409.01374v1

    Wei Tweet ‘3 months’: https://x.com/_jasonwei/status/1870184982007644614

    Deliberative Alignment Paper: https://openai.com/index/deliberative-alignment/

    Brown Safety Tweet: https://x.com/polynoamial/status/1870196476908834893

    SWE-bench Verified: https://openai.com/index/introducing-swe-bench-verified/

    Amodei Prediction: https://x.com/OfirPress/status/1858567863788769518

    David Dohan: 16 hours https://x.com/dmdohan/status/1870171404093796638

    OpenAI Personal Writing: https://openai.com/index/learning-to-reason-with-llms/

    https://simple-bench.com/

    John Hallman Tweet: https://x.com/johnohallman/status/1870233375681945725


    00:00 - Introduction

    01:19 - What is o3?

    03:18 - FrontierMath

    05:15 - o4, o5

    06:03 - GPQA

    06:24 - Coding, Codeforces + SWE-verified, AlphaCode 2

    08:13 - 1st Caveat

    09:03 - Compositionality?

    10:16 - SimpleBench?

    13:11 - ARC-AGI, Chollet



    22 min
  • Never Browse Alone? - Gemini 2 Live and ChatGPT Vision
    2024/12/12

    The ‘Gemini 2 Era’ begins … with screen-sharing? But really, it’s a great free tool, better suited to satisfying curiosity than to bleeding-edge intelligence. I give you the benchmarks, the highlights and, of course, the latest from OpenAI Advanced Voice Mode with Vision.

    Plus Deep Research in Gemini Advanced, Simple Bench updates, Santa, and what might be, for some of you, a deflating admission from Google.


    00:00 - Introduction

    00:38 - Live Interaction

    03:43 - Gemini 2.0 Flash Benchmarks

    05:10 - Audio and Image Output

    06:38 - Project Mariner (+ WebVoyager Bench)

    08:49 - But Progress Slowing Down?

    10:43 - OpenAI Announcements + Games



    https://aistudio.google.com/live

    Gemini 2.0 Flash Benchmarks: https://deepmind.google/technologies/gemini/

    Project Mariner: https://deepmind.google/technologies/project-mariner/

    WebVoyager: https://x.com/laurentsifre/status/1858918588683296875/photo/1

    Gemini Game play: https://www.youtube.com/watch?v=IKuGNHJBGsc

    Advanced Voice Mode OpenAI: https://www.youtube.com/watch?v=NIQDnWlwYyQ

    https://simple-bench.com/

    Claude Computer Use: https://docs.anthropic.com/en/docs/build-with-claude/computer-use

    Oriol Vinyals Interview: https://www.youtube.com/watch?v=78mEYaztGaw&t=687s



    14 min
  • Sora is Out, But is it a Distraction?
    2024/12/10

    After a 10-month wait, OpenAI have released Sora to paying users. With just a prompt it can generate videos of up to 20 seconds in lower resolutions, and 10 seconds at 1080p if you can fork out $200/month. I’ve tested it and read the system card. The user interface is quite beautiful, even if the videos themselves operate under entirely new rules of physics. But I can’t help wondering if OpenAI want us to focus on releases like this, rather than some quietly broken promises.



    80,000 hours Website, Podcast + Channel:

    https://80000hours.org/

    https://open.spotify.com/show/2WzJwXWBDnn4iZ7odKwDib

    https://www.youtube.com/@eightythousandhours/videos


    https://openai.com/sora/


    Sora Countries: https://help.openai.com/en/articles/10250692-sora-supported-countries

    Sora Credits: https://help.openai.com/en/articles/10245774-sora-billing-credits-faq

    https://runwayml.com/ and https://pika.art/home


    DeepMind Veo: https://deepmind.google/technologies/veo/


    Sam Altman Ads as Last Resort: https://www.windowscentral.com/software-apps/openai-could-chase-intrusive-ads-as-last-resort


    But OpenAI Considering Ads: https://www.inc.com/ben-sherry/is-openai-getting-into-the-advertising-business-the-company-is-sending-mixed-messages/91033533


    OpenAI Backtracks on Microsoft AGI Clause: https://www.ft.com/content/2c14b89c-f363-4c2a-9dfc-13023b6bce65


    As Microsoft Boast of Labor Savings: https://www.theinformation.com/articles/microsofts-new-sales-pitch-for-ai-spend-less-money-on-humans?rc=sy0ihq


    OpenAI Military Pivot: https://www.technologyreview.com/2024/12/04/1107897/openais-new-defense-contract-completes-its-military-pivot/


    Employees Have Doubts: https://www.washingtonpost.com/technology/2024/12/06/openai-anduril-employee-military-ai/?nid=top_pb_signin&arcId=KZIV7PLRHBCVNPAIAAAVUNRHIM&account_location=ONSITE_HEADER_ARTICLE



    16 min
