
PodXiv: The latest AI papers, decoded in 20 minutes.

Author: AI Podcast

About this content

This podcast delivers sharp, daily breakdowns of cutting-edge AI research. Perfect for researchers, engineers, and AI enthusiasts, each episode cuts through the jargon to unpack key insights, real-world impact, and what's next. This podcast is purely for learning purposes and will never be monetized. It's run by research volunteers like you! Questions? Write to: airesearchpodcasts@gmail.com
Episodes
  • Text2Tracks: Music Recommendation via Generative Retrieval
    2025/06/11

    Natural language prompts are changing how we ask for music recommendations. Users want to say things like, "Recommend some old classics for slow dancing." But traditional LLMs often just generate song titles, which has drawbacks: an extra resolution step is needed to map each title to an actual track, and decoding is inefficient.

    Introducing Text2Tracks from Spotify! This novel research tackles prompt-based music recommendation using generative retrieval. Instead of generating titles, Text2Tracks is trained to directly output relevant track IDs based on your text prompt.

    A critical finding is that how you represent the track IDs makes a huge difference. Using semantic IDs derived from collaborative filtering embeddings proved most effective, significantly outperforming older methods like using artist and track names. This approach boosts effectiveness (48% increase in Hits@10) and efficiency (7.5x fewer decoding steps).

    Designing effective ID strategies was a key challenge explored in the paper, and with them Text2Tracks ultimately outperforms traditional retrieval methods, making it a powerful new model particularly well suited to conversational recommendation scenarios.

    Paper link: https://arxiv.org/pdf/2503.24193

    13 min
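The semantic-ID idea described above can be sketched as a toy two-level residual quantization of collaborative-filtering embeddings. Everything here (the random embeddings, the codebook size, the simple k-means routine) is illustrative only, not the paper's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
track_embeddings = rng.normal(size=(100, 8))  # stand-in CF embeddings

def kmeans(X, k, iters=10):
    """Minimal k-means: returns k centroids for rows of X."""
    cent = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - cent[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (assign == j).any():
                cent[j] = X[assign == j].mean(0)
    return cent

def semantic_ids(X, k=4):
    """Two-level residual quantization: coarse code, then code of the residual.
    Each track becomes a short tuple of discrete codes (its 'semantic ID')."""
    c1 = kmeans(X, k)
    a1 = np.argmin(((X[:, None] - c1[None]) ** 2).sum(-1), axis=1)
    resid = X - c1[a1]
    c2 = kmeans(resid, k)
    a2 = np.argmin(((resid[:, None] - c2[None]) ** 2).sum(-1), axis=1)
    return list(zip(a1.tolist(), a2.tolist()))

ids = semantic_ids(track_embeddings)
```

A generative model can then decode a track as two code tokens (e.g. `<c1_2> <c2_1>`) instead of spelling out a title token by token, which is where the fewer-decoding-steps efficiency gain comes from.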
  • (LLM Explain-Apple) The Illusion of Thinking
    2025/06/09

    13 min
  • (LLM Scaling Meta) MEGABYTE: Modelling Million-byte Sequences with Transformers
    2025/06/08

    Explore MEGABYTE from Meta AI, a novel multi-scale transformer architecture designed to tackle the challenge of modelling sequences of over one million bytes. Traditional large transformer decoders scale poorly to such lengths due to the quadratic cost of self-attention and the expense of large feedforward layers per position, limiting their application to long sequences like high-resolution images or books.

    MEGABYTE addresses this by segmenting sequences into patches, employing a large global model to process relationships between patches and a smaller local model for prediction within patches. This design leads to significant advantages, including sub-quadratic self-attention cost, the ability to use much larger feedforward layers for the same computational budget, and improved parallelism during generation. Crucially, MEGABYTE enables tokenization-free autoregressive sequence modelling at scale, simplifying processing and offering an alternative to methods that can lose information or require language-specific heuristics.

    The architecture demonstrates strong performance across various domains, competing with subword models on long context language modelling, achieving state-of-the-art density estimation on ImageNet, and effectively modelling audio from raw files. While promising, the current experiments are conducted at a scale below the largest state-of-the-art language models, indicating that future work is needed to fully explore scaling MEGABYTE to even larger models and datasets.

    Learn how MEGABYTE is advancing the frontier of efficient, large-scale sequence modelling.

    Paper link: https://proceedings.neurips.cc/paper_files/paper/2023/file/f8f78f8043f35890181a824e53a57134-Paper-Conference.pdf

    16 min
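The sub-quadratic claim in the MEGABYTE summary can be sanity-checked with a back-of-envelope cost model. This counts only attention cost with constants dropped, and the global-plus-local split is an illustrative assumption, not the paper's exact FLOP accounting:

```python
# Vanilla self-attention over T tokens costs ~T^2.
# MEGABYTE splits T bytes into T/P patches: a global model attends over
# T/P patch positions (~(T/P)^2), and a local model attends within each
# of the T/P patches (~(T/P) * P^2).
def attention_cost(T):
    return T * T

def megabyte_cost(T, P):
    n = T // P                 # number of patches
    return n * n + n * P * P   # global + sum of local attentions

T, P = 1_000_000, 1_000
vanilla = attention_cost(T)    # 10^12
mega = megabyte_cost(T, P)     # 10^6 + 10^9
```

For a million-byte sequence with 1,000-byte patches, the toy model gives roughly a 1000x reduction in attention cost, which is the intuition behind "sub-quadratic self-attention" in the episode description.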

Listener reviews for PodXiv: The latest AI papers, decoded in 20 minutes.
