🎰 Deterministic flash-attn

June 17, 2024

[NOTE]: For additional details, refer to the W&B Report.

Simple tests to confirm that the loss is exactly reproducible across independent runs launched with the same seed.

Figure 1: Plot of the loss curve for 3 independent runs with deterministic=True
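
For readers who want to run a similar check, here is a minimal sketch of what "exactly reproducible" means in practice: the per-step losses of each run are compared bit for bit, not within a tolerance. This is not the test code used for the runs above. `toy_run` is a hypothetical stand-in for a real training run (it only draws seeded random numbers), and `float_bits` and `runs_match_exactly` are illustrative helper names.

```python
import random
import struct


def float_bits(x):
    """Return the raw IEEE-754 bytes of x, so the comparison is bit-exact."""
    return struct.pack("<d", x)


def toy_run(seed, steps=5):
    """Hypothetical stand-in for one training run; returns one loss per step."""
    rng = random.Random(seed)  # all randomness flows from the seed
    loss = 10.0
    history = []
    for _ in range(steps):
        loss *= 0.9 + 0.05 * rng.random()  # fake, noisy loss decrease
        history.append(loss)
    return history


def runs_match_exactly(runs):
    """True if every run has the same losses, bit for bit, as the first run."""
    reference = [float_bits(v) for v in runs[0]]
    return all([float_bits(v) for v in run] == reference for run in runs[1:])


# Three independent "runs" launched with the same seed.
same_seed = [toy_run(seed=1234) for _ in range(3)]
# Two runs launched with different seeds, as a negative control.
different_seed = [toy_run(seed=1234), toy_run(seed=4321)]

print("same seed matches:", runs_match_exactly(same_seed))
print("different seeds match:", runs_match_exactly(different_seed))
```

In a real setup, `toy_run` would be replaced by the loss history logged from each training job. Comparing raw bytes rather than using an approximate comparison is deliberate: a tolerance would hide the small run-to-run differences that non-deterministic kernels introduce.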
