Modern machine learning often faces scenarios where models cannot fully utilize the vast amounts of available data, and agents operate in environments so complex that they cannot feasibly visit all possible states. Deciding which data to train on, or how to explore effectively, is crucial to overcoming these challenges. In the first part of this talk, I will discuss generalization challenges in deep reinforcement learning, demonstrate how effective exploration strategies can improve generalization, and examine the implications this has for scaling RL algorithms. In the second part, I will show how similar principles can be applied to dynamically select high-quality data for language model pretraining, improving performance on a wide range of downstream tasks.