• (Voiceover) Building on evaluation quicksand

  • 2024/10/16
  • Duration: 17 min
  • Podcast

  • Summary

  • Read the full post here: https://www.interconnects.ai/p/building-on-evaluation-quicksand

    Chapters

    00:00 Building on evaluation quicksand

    01:26 The causes of closed evaluation silos

    06:35 The challenge facing open evaluation tools

    10:47 Frontiers in evaluation

    11:32 New types of synthetic data contamination

    13:57 Building harder evaluations

    Figures

    Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/manual/openai-predictions.webp



    Get full access to Interconnects at www.interconnects.ai/subscribe

