Oh boy. o1 pro mode out on the same night as o1 full. I read the 49-page paper, ran my own tests, spent my fuel allowance on Pro Mode and will give you all the highlights. Suffice it to say the story is not as simple as it first appears.
Weights and Biases’ Weave: wandb.me/ai_explained
Plus, GPT-4.5? MLE Bench, Simple Update, Image Analysis and much more
o1 System Card: https://cdn.openai.com/o1-system-card-20241205.pdf
Apollo Research: https://www.apolloresearch.ai/research/scheming-reasoning-evaluations
Altman Tweet: https://x.com/AnonCEOMakeItAi/status/1864763052622504344
ChatGPT Pro: https://openai.com/index/introducing-chatgpt-pro/
Tibor Blaho: https://x.com/btibor91/status/1864709670470066605
Simple-bench.com
00:00 - Introduction
00:27 - ChatGPT Pro is $200
01:25 - OpenAI Benchmarks
03:20 - o1 System Card, o1 and o1 Pro Mode vs o1-preview
06:18 - Simple Bench: surprising results on a sample
08:31 - Weights & Biases
09:05 - Image Analysis Compared
12:51 - More Benchmarks and Safety