Tahoe Therapeutics, the Arc Institute and Biohub announced a joint commitment to generate a publicly available, perturbation‑heavy single‑cell dataset for training and validating virtual cell models. The partners said Tahoe will contribute more than 120 million single‑cell data points across roughly 225,000 drug–patient interactions. The dataset will expand the diversity of perturbations, cell types and patient‑relevant contexts, and will be made public after a brief exclusive period. Johnny Yu of Tahoe framed data as the primary bottleneck for virtual cell technologies. The dataset is intended to accelerate AI models that predict transcriptomic responses to drugs and genetic perturbations, targeting a critical infrastructure gap in computational therapeutics.