FLUX-Reason-6M & PRISM-Bench

A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

1CUHK   2HKU   3BUAA   4Alibaba   5Sensetime
†Equal Contribution   *Corresponding Author
Teaser Image for FLUX-Reason-6M

We introduce FLUX-Reason-6M and PRISM-Bench. FLUX-Reason-6M is a 6-million-scale synthesized dataset designed to incorporate reasoning capabilities into the architecture of T2I generation. PRISM-Bench serves as a comprehensive and discriminative benchmark with 7 independent tracks that closely align with human judgment.

Abstract

The advancement of open-source text-to-image (T2I) models has been hindered by the absence of large-scale, reasoning-focused datasets and comprehensive evaluation benchmarks, resulting in a performance gap compared to leading closed-source systems. To address this challenge, we introduce FLUX-Reason-6M and PRISM-Bench (Precise and Robust Image Synthesis Measurement Benchmark). FLUX-Reason-6M is a massive dataset consisting of 6 million high-quality FLUX-generated images and 20 million bilingual (English and Chinese) descriptions specifically designed to teach complex reasoning. The images are organized according to six key characteristics: Imagination, Entity, Text rendering, Style, Affection, and Composition. We also design an explicit Generation Chain-of-Thought (GCoT) that provides detailed breakdowns of the image generation steps. The full data curation took four months of computation on 128 A100 GPUs, providing the community with a resource previously unattainable outside of large industrial labs. PRISM-Bench offers a novel evaluation standard with seven distinct tracks, including a formidable Long Text challenge using GCoT. Through carefully designed prompts, it uses advanced vision-language models for nuanced, human-aligned assessment of prompt-image alignment and image aesthetics. Our extensive evaluation of 19 leading models on PRISM-Bench reveals critical performance gaps and highlights specific areas requiring improvement. Our dataset, benchmark, and evaluation code are released to catalyze the next wave of reasoning-oriented T2I generation.

Contributions

  • FLUX-Reason-6M: A Landmark Dataset. We release the first 6-million-scale T2I dataset engineered for reasoning, featuring 20 million bilingual captions and pioneering Generation Chain-of-Thought (GCoT) prompts. The dataset was created using 128 A100 GPUs over a four-month period and aims to serve as the foundational dataset for the next generation of T2I models.
  • PRISM-Bench: A New Standard for Evaluation. We establish a comprehensive, seven-track benchmark that uses GPT-4.1 and Qwen2.5-VL-72B for nuanced and robust evaluation, offering the community a reliable tool to measure models’ true capabilities.
  • Actionable Insights from Extensive Benchmarking. Our extensive and rigorous evaluation of leading models reveals the gaps between different models and potential areas for improvement, providing a clear roadmap for future research.
  • Democratizing a Revolution in T2I. We publicly release the entire dataset, benchmark, and evaluation suite to lower the financial and computational barriers to entry, enabling researchers worldwide to build and test more capable generative models.

Official Leaderboard

🚨 To submit your results to the leaderboard, please send them to this email.

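Each score cell lists Align / Aes / Avg.

| # | Model | Source | Date | Overall | Imagination | Entity | Text rendering | Style | Affection | Composition | Long text |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-Image-1 [High] 🥇 | Link | 2025-09-10 | 86.9 / 85.6 / 86.3 | 86.2 / 86.6 / 86.4 | 90.0 / 86.3 / 88.2 | 68.8 / 80.1 / 74.5 | 92.8 / 93.3 / 93.1 | 90.7 / 90.9 / 90.8 | 96.2 / 89.4 / 92.8 | 83.8 / 72.8 / 78.3 |
| 2 | Gemini2.5-Flash-Image 🥈 | Link | 2025-09-10 | 87.1 / 83.4 / 85.3 | 92.4 / 84.8 / 88.6 | 87.0 / 81.3 / 84.2 | 65.2 / 74.1 / 69.7 | 90.5 / 90.8 / 90.7 | 96.0 / 88.2 / 92.1 | 92.5 / 88.5 / 90.5 | 85.9 / 76.2 / 81.1 |
| 3 | Qwen-Image 🥉 | Link | 2025-09-10 | 81.1 / 78.6 / 79.9 | 80.5 / 78.6 / 79.6 | 79.3 / 73.2 / 76.3 | 54.3 / 68.9 / 61.6 | 84.5 / 88.7 / 86.6 | 91.6 / 89.1 / 90.4 | 93.7 / 86.9 / 90.3 | 83.8 / 65.1 / 74.5 |
| 4 | SEEDream 3.0 | Link | 2025-09-10 | 80.5 / 78.7 / 79.6 | 77.3 / 76.4 / 76.9 | 80.2 / 73.8 / 77.0 | 56.1 / 70.2 / 63.2 | 83.9 / 87.4 / 85.7 | 89.3 / 90.3 / 89.8 | 93.3 / 86.3 / 89.8 | 83.2 / 66.7 / 75.0 |
| 5 | HiDream-I1-Full | Link | 2025-09-10 | 76.1 / 75.6 / 75.9 | 74.4 / 75.6 / 75.0 | 74.4 / 72.4 / 73.4 | 58.2 / 70.4 / 64.3 | 81.4 / 84.8 / 83.1 | 90.1 / 88.8 / 89.5 | 90.1 / 85.4 / 87.8 | 63.8 / 52.0 / 57.9 |
| 6 | FLUX.1-Krea-dev | Link | 2025-09-10 | 74.3 / 75.1 / 74.7 | 71.5 / 73.0 / 72.3 | 69.5 / 67.5 / 68.5 | 47.5 / 61.3 / 54.4 | 80.8 / 83.5 / 82.2 | 84.0 / 90.3 / 87.2 | 90.9 / 85.8 / 88.4 | 76.2 / 64.1 / 70.2 |
| 7 | FLUX.1-dev | Link | 2025-09-10 | 72.4 / 74.9 / 73.7 | 68.1 / 74.0 / 71.1 | 70.7 / 71.2 / 71.0 | 48.1 / 64.5 / 56.3 | 72.3 / 80.5 / 76.4 | 88.3 / 91.1 / 89.7 | 89.0 / 84.6 / 86.8 | 70.6 / 58.5 / 64.6 |
| 8 | SD3.5-Large | Link | 2025-09-10 | 73.9 / 73.5 / 73.7 | 73.3 / 71.2 / 72.3 | 76.7 / 71.9 / 74.3 | 52.0 / 65.8 / 58.9 | 77.1 / 84.2 / 80.7 | 87.1 / 85.2 / 86.2 | 87.0 / 84.7 / 85.9 | 64.3 / 51.7 / 58.0 |
| 9 | HiDream-I1-Dev | Link | 2025-09-10 | 70.3 / 70.0 / 70.2 | 68.2 / 69.7 / 69.0 | 72.0 / 67.0 / 69.5 | 53.4 / 64.1 / 58.8 | 68.7 / 78.6 / 73.7 | 84.2 / 83.1 / 83.7 | 87.6 / 79.8 / 83.7 | 58.1 / 47.5 / 52.8 |
| 10 | SD3.5-Medium | Link | 2025-09-10 | 70.1 / 68.9 / 69.5 | 69.5 / 73.0 / 71.3 | 72.8 / 63.7 / 68.3 | 33.3 / 50.1 / 41.7 | 77.4 / 80.3 / 78.9 | 84.9 / 85.5 / 85.2 | 89.4 / 79.2 / 84.3 | 63.3 / 50.5 / 56.9 |
| 11 | SD3-Medium | Link | 2025-09-10 | 65.6 / 65.2 / 65.4 | 61.0 / 65.6 / 63.3 | 64.8 / 56.3 / 60.6 | 32.8 / 53.1 / 43.0 | 74.8 / 75.6 / 75.2 | 78.7 / 80.3 / 79.5 | 85.5 / 79.1 / 82.3 | 61.5 / 46.1 / 53.8 |
| 12 | Bagel-CoT | Link | 2025-09-10 | 65.4 / 65.0 / 65.2 | 68.4 / 74.2 / 71.3 | 62.4 / 60.0 / 61.2 | 23.2 / 40.1 / 31.7 | 64.4 / 70.1 / 67.3 | 87.1 / 80.5 / 83.8 | 88.5 / 77.9 / 83.2 | 64.0 / 52.0 / 58.0 |
| 13 | Bagel | Link | 2025-09-10 | 66.7 / 63.4 / 65.1 | 69.4 / 68.0 / 68.7 | 59.0 / 50.1 / 54.6 | 30.2 / 44.5 / 37.4 | 67.9 / 71.3 / 69.6 | 81.7 / 81.4 / 81.6 | 90.5 / 73.1 / 81.8 | 68.1 / 55.3 / 61.7 |
| 14 | FLUX.1-schnell | Link | 2025-09-10 | 67.1 / 61.2 / 64.2 | 63.3 / 66.2 / 64.8 | 61.8 / 51.2 / 56.5 | 46.2 / 54.1 / 50.2 | 68.6 / 70.1 / 69.4 | 75.4 / 69.9 / 72.7 | 85.1 / 67.5 / 76.3 | 69.4 / 49.7 / 59.6 |
| 15 | Playground | Link | 2025-09-10 | 62.6 / 65.6 / 64.1 | 62.3 / 70.6 / 66.5 | 72.5 / 69.1 / 70.8 | 10.4 / 37.3 / 23.9 | 77.3 / 80.9 / 79.1 | 91.8 / 83.8 / 87.8 | 77.5 / 76.5 / 77.0 | 46.7 / 41.0 / 43.9 |
| 16 | JanusPro-7B | Link | 2025-09-10 | 64.2 / 57.2 / 60.7 | 70.4 / 65.8 / 68.1 | 67.1 / 51.9 / 59.5 | 15.5 / 36.7 / 26.1 | 71.4 / 73.8 / 72.6 | 79.2 / 71.5 / 75.4 | 83.7 / 61.0 / 72.4 | 62.4 / 39.7 / 51.1 |
| 17 | SDXL | Link | 2025-09-10 | 58.9 / 61.8 / 60.4 | 55.3 / 61.1 / 58.2 | 72.5 / 67.4 / 70.0 | 13.8 / 37.0 / 25.4 | 72.4 / 75.4 / 73.9 | 78.9 / 77.1 / 78.0 | 75.5 / 75.3 / 75.4 | 44.2 / 39.6 / 41.9 |
| 18 | SD2.1 | Link | 2025-09-10 | 50.7 / 45.3 / 48.0 | 47.9 / 41.2 / 44.6 | 60.9 / 46.7 / 53.8 | 11.2 / 30.6 / 20.9 | 62.7 / 58.6 / 60.7 | 66.7 / 58.5 / 62.6 | 65.7 / 53.1 / 59.4 | 40.1 / 28.2 / 34.2 |
| 19 | SD1.5 | Link | 2025-09-10 | 44.9 / 43.5 / 44.2 | 36.6 / 36.1 / 36.4 | 53.8 / 41.1 / 47.5 | 8.0 / 33.1 / 20.6 | 55.3 / 55.3 / 55.3 | 64.4 / 57.5 / 61.0 | 61.1 / 51.0 / 56.1 | 35.3 / 30.4 / 32.9 |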
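
Each score cell lists Align / Aes / Avg.

| # | Model | Source | Date | Overall | Imagination | Entity | Text rendering | Style | Affection | Composition | Long text |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-Image-1 [High] 🥇 | Link | 2025-09-10 | 82.7 / 78.7 / 80.7 | 79.8 / 53.3 / 66.6 | 87.3 / 81.0 / 84.1 | 66.7 / 86.8 / 76.8 | 87.3 / 87.8 / 87.5 | 88.1 / 79.8 / 84.0 | 92.2 / 84.9 / 88.5 | 77.2 / 77.5 / 77.4 |
| 2 | Gemini2.5-Flash-Image 🥈 | Link | 2025-09-10 | 85.0 / 75.8 / 80.4 | 84.7 / 38.1 / 61.4 | 86.0 / 76.7 / 81.3 | 72.8 / 84.3 / 78.5 | 89.5 / 87.8 / 88.6 | 94.3 / 74.8 / 84.5 | 91.2 / 88.2 / 89.7 | 76.3 / 80.6 / 78.4 |
| 3 | SEEDream 3.0 🥉 | Link | 2025-09-10 | 80.1 / 72.3 / 76.2 | 75.8 / 38.0 / 56.9 | 81.3 / 74.2 / 77.7 | 58.8 / 74.0 / 66.4 | 84.4 / 84.1 / 84.2 | 90.5 / 74.6 / 82.5 | 93.6 / 85.1 / 89.3 | 76.2 / 76.4 / 76.3 |
| 4 | Qwen-Image | Link | 2025-09-10 | 80.0 / 68.3 / 74.1 | 75.5 / 37.4 / 56.5 | 79.5 / 64.5 / 72.0 | 57.9 / 71.2 / 64.5 | 86.6 / 84.4 / 85.5 | 89.9 / 70.4 / 80.1 | 93.9 / 79.5 / 86.7 | 76.8 / 70.9 / 73.8 |
| 5 | FLUX.1-Krea-dev | Link | 2025-09-10 | 74.4 / 73.7 / 74.0 | 69.6 / 43.1 / 56.3 | 72.2 / 70.7 / 71.4 | 51.7 / 76.1 / 63.9 | 80.0 / 86.6 / 83.3 | 82.6 / 78.7 / 80.6 | 90.8 / 87.1 / 88.9 | 73.6 / 73.4 / 73.5 |
| 6 | HiDream-I1-Full | Link | 2025-09-10 | 76.6 / 68.6 / 72.6 | 73.0 / 44.0 / 58.5 | 76.3 / 72.8 / 74.5 | 60.5 / 76.4 / 68.4 | 81.4 / 81.5 / 81.4 | 90.0 / 76.6 / 83.3 | 88.5 / 80.3 / 84.4 | 66.3 / 48.6 / 57.4 |
| 7 | SD3.5-Large | Link | 2025-09-10 | 73.4 / 67.8 / 70.6 | 66.7 / 43.4 / 55.0 | 76.8 / 72.7 / 74.8 | 53.6 / 73.1 / 63.3 | 77.3 / 78.2 / 77.7 | 85.6 / 73.9 / 79.7 | 87.8 / 80.9 / 84.3 | 65.8 / 52.2 / 59.0 |
| 8 | HiDream-I1-Dev | Link | 2025-09-10 | 72.3 / 67.0 / 69.6 | 68.8 / 45.8 / 57.3 | 73.5 / 68.1 / 70.8 | 56.7 / 75.7 / 66.2 | 70.2 / 77.4 / 73.8 | 88.2 / 74.3 / 81.2 | 84.7 / 78.5 / 81.6 | 64.0 / 49.3 / 56.6 |
| 9 | FLUX.1-dev | Link | 2025-09-10 | 72.1 / 64.9 / 68.5 | 65.5 / 42.9 / 54.2 | 70.6 / 61.9 / 66.2 | 52.3 / 73.0 / 62.6 | 72.6 / 74.2 / 73.4 | 86.0 / 72.9 / 79.4 | 87.4 / 75.8 / 81.6 | 70.5 / 53.8 / 62.1 |
| 10 | SD3.5-Medium | Link | 2025-09-10 | 68.6 / 65.1 / 66.8 | 65.1 / 34.7 / 49.9 | 72.5 / 70.9 / 71.7 | 36.6 / 64.5 / 50.5 | 75.5 / 80.0 / 77.7 | 81.8 / 73.9 / 77.9 | 85.4 / 81.0 / 83.2 | 63.5 / 50.6 / 57.0 |
| 11 | SD3-Medium | Link | 2025-09-10 | 68.0 / 64.2 / 66.1 | 64.3 / 37.7 / 51.0 | 69.4 / 63.3 / 66.3 | 38.5 / 63.3 / 50.9 | 74.6 / 79.5 / 77.0 | 80.5 / 75.5 / 78.0 | 85.6 / 79.5 / 82.5 | 63.4 / 50.3 / 56.8 |
| 12 | FLUX.1-schnell | Link | 2025-09-10 | 68.3 / 61.1 / 64.7 | 62.8 / 35.6 / 49.2 | 64.8 / 56.8 / 60.8 | 54.3 / 68.1 / 61.2 | 70.3 / 71.5 / 70.9 | 75.4 / 65.9 / 70.6 | 81.7 / 75.6 / 78.6 | 68.7 / 54.4 / 61.5 |
| 13 | JanusPro-7B | Link | 2025-09-10 | 64.9 / 59.4 / 62.1 | 65.0 / 38.8 / 51.9 | 68.6 / 63.5 / 66.0 | 23.1 / 50.3 / 36.7 | 70.7 / 75.2 / 72.9 | 80.7 / 68.0 / 74.3 | 82.4 / 71.1 / 76.7 | 63.9 / 49.0 / 56.4 |
| 14 | Bagel-CoT | Link | 2025-09-10 | 67.5 / 56.5 / 62.0 | 68.0 / 44.1 / 56.0 | 67.6 / 53.4 / 60.5 | 29.4 / 42.3 / 35.8 | 69.0 / 69.7 / 69.3 | 87.1 / 66.7 / 76.9 | 86.6 / 69.2 / 77.9 | 64.5 / 50.2 / 57.3 |
| 15 | Bagel | Link | 2025-09-10 | 67.5 / 56.6 / 62.0 | 68.0 / 45.0 / 56.5 | 67.6 / 53.4 / 60.5 | 29.4 / 42.3 / 35.8 | 69.0 / 69.7 / 69.3 | 87.1 / 66.7 / 76.9 | 86.6 / 69.2 / 77.9 | 64.5 / 50.2 / 57.3 |
| 16 | Playground | Link | 2025-09-10 | 62.2 / 52.1 / 57.1 | 59.0 / 39.0 / 49.0 | 69.4 / 56.7 / 63.0 | 15.3 / 31.9 / 23.6 | 74.6 / 74.6 / 74.6 | 88.8 / 66.0 / 77.4 | 72.2 / 61.3 / 66.7 | 56.0 / 35.3 / 45.6 |
| 17 | SDXL | Link | 2025-09-10 | 60.1 / 54.0 / 57.0 | 54.5 / 34.1 / 44.3 | 71.1 / 65.0 / 68.0 | 18.6 / 37.3 / 27.9 | 71.7 / 72.6 / 72.1 | 78.7 / 66.5 / 72.6 | 72.2 / 67.8 / 70.0 | 54.1 / 34.5 / 44.3 |
| 18 | SD2.1 | Link | 2025-09-10 | 54.0 / 47.7 / 50.8 | 48.9 / 28.4 / 38.6 | 66.0 / 57.6 / 61.8 | 16.7 / 31.4 / 24.0 | 62.7 / 66.5 / 64.6 | 68.5 / 62.1 / 65.3 | 64.8 / 58.3 / 61.5 | 50.7 / 29.8 / 40.2 |
| 19 | SD1.5 | Link | 2025-09-10 | 48.8 / 43.3 / 46.0 | 40.7 / 23.7 / 32.2 | 61.2 / 52.7 / 56.9 | 11.4 / 24.1 / 17.8 | 56.7 / 61.5 / 59.1 | 66.9 / 60.7 / 63.8 | 57.5 / 53.4 / 55.4 | 47.3 / 26.8 / 37.0 |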
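
Each score cell lists Align / Aes / Avg.

| # | Model | Source | Date | Overall | Imagination | Entity | Text rendering | Style | Affection | Composition | Long text |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-Image-1 [High] 🥇 | Link | 2025-09-10 | 87.7 / 87.2 / 87.5 | 88.8 / 90.4 / 89.6 | 85.9 / 92.4 / 89.2 | 83.9 / 67.7 / 75.8 | 93.9 / 91.7 / 92.8 | 91.5 / 86.5 / 89.0 | 92.4 / 97.3 / 94.9 | 77.2 / 84.3 / 80.8 |
| 2 | SEEDream 3.0 🥈 | Link | 2025-09-10 | 81.9 / 82.0 / 82.0 | 77.2 / 77.8 / 77.5 | 77.6 / 78.6 / 78.1 | 79.7 / 71.9 / 75.8 | 87.8 / 83.2 / 85.5 | 88.7 / 85.1 / 86.9 | 87.7 / 94.4 / 91.1 | 74.3 / 82.7 / 78.5 |
| 3 | Qwen-Image 🥉 | Link | 2025-09-10 | 80.8 / 81.3 / 81.1 | 80.1 / 79.6 / 79.9 | 75.6 / 79.7 / 77.7 | 76.9 / 62.9 / 69.9 | 90.2 / 84.3 / 87.3 | 87.4 / 84.9 / 86.2 | 86.6 / 93.4 / 90.0 | 68.9 / 84.2 / 76.6 |
| 4 | Bagel | Link | 2025-09-10 | 65.5 / 65.2 / 65.4 | 72.8 / 64.7 / 68.8 | 53.9 / 62.2 / 58.1 | 49.2 / 29.0 / 39.1 | 73.9 / 68.4 / 71.2 | 81.4 / 73.5 / 77.5 | 69.0 / 89.8 / 79.4 | 58.1 / 68.7 / 63.4 |
| 5 | Bagel-CoT | Link | 2025-09-10 | 64.4 / 62.4 / 63.4 | 75.1 / 69.3 / 72.2 | 53.3 / 58.8 / 56.1 | 42.6 / 16.3 / 29.5 | 73.6 / 66.6 / 70.1 | 81.2 / 78.0 / 79.6 | 74.0 / 83.6 / 78.8 | 50.7 / 64.3 / 57.5 |
| 6 | HiDream-I1-Full | Link | 2025-09-10 | 60.8 / 54.9 / 57.9 | 53.6 / 47.3 / 50.5 | 63.1 / 60.8 / 62.0 | 34.6 / 16.3 / 25.5 | 74.1 / 65.5 / 69.8 | 80.9 / 67.3 / 74.1 | 73.8 / 76.1 / 75.0 | 45.4 / 50.8 / 48.1 |
| 7 | HiDream-I1-Dev | Link | 2025-09-10 | 55.0 / 48.3 / 51.7 | 47.3 / 41.1 / 44.2 | 52.8 / 49.0 / 50.9 | 35.2 / 14.5 / 24.9 | 64.5 / 52.4 / 58.5 | 76.3 / 66.5 / 71.4 | 67.6 / 68.3 / 68.0 | 41.1 / 46.4 / 43.8 |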
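
Each score cell lists Align / Aes / Avg.

| # | Model | Source | Date | Overall | Imagination | Entity | Text rendering | Style | Affection | Composition | Long text |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-Image-1 [High] 🥇 | Link | 2025-09-10 | 78.0 / 77.4 / 77.7 | 73.0 / 37.6 / 55.3 | 80.4 / 82.1 / 81.3 | 73.1 / 89.9 / 81.5 | 77.1 / 92.4 / 84.8 | 78.0 / 77.8 / 77.9 | 91.9 / 85.7 / 88.8 | 72.4 / 76.3 / 74.4 |
| 2 | SEEDream 3.0 🥈 | Link | 2025-09-10 | 76.2 / 73.2 / 74.7 | 71.4 / 36.6 / 54.0 | 74.8 / 73.8 / 74.3 | 70.7 / 88.0 / 79.4 | 74.1 / 88.0 / 81.1 | 79.0 / 71.4 / 75.2 | 90.3 / 83.2 / 86.8 | 73.0 / 71.2 / 72.1 |
| 3 | Qwen-Image 🥉 | Link | 2025-09-10 | 75.0 / 65.5 / 70.3 | 71.4 / 29.9 / 50.7 | 74.7 / 67.8 / 71.3 | 64.3 / 73.1 / 68.7 | 75.2 / 83.2 / 79.2 | 77.3 / 64.5 / 70.9 | 89.8 / 74.1 / 82.0 | 72.6 / 65.8 / 69.2 |
| 4 | Bagel-CoT | Link | 2025-09-10 | 62.0 / 57.4 / 59.7 | 64.4 / 36.6 / 50.5 | 62.6 / 53.8 / 58.2 | 25.2 / 51.9 / 38.6 | 65.4 / 76.7 / 71.1 | 74.0 / 65.0 / 69.5 | 81.3 / 71.3 / 76.3 | 61.4 / 46.6 / 54.0 |
| 5 | Bagel | Link | 2025-09-10 | 61.5 / 54.3 / 57.9 | 64.6 / 36.3 / 50.5 | 62.7 / 55.5 / 59.1 | 18.6 / 26.3 / 22.5 | 66.0 / 76.6 / 71.3 | 74.9 / 66.2 / 70.6 | 81.3 / 72.2 / 76.8 | 62.4 / 47.3 / 54.9 |
| 6 | HiDream-I1-Full | Link | 2025-09-10 | 55.9 / 55.3 / 55.6 | 51.2 / 30.8 / 41.0 | 60.1 / 61.3 / 60.7 | 20.7 / 40.6 / 30.7 | 64.5 / 73.8 / 69.2 | 65.2 / 69.1 / 67.2 | 72.4 / 69.0 / 70.7 | 57.1 / 42.8 / 50.0 |
| 7 | HiDream-I1-Dev | Link | 2025-09-10 | 52.2 / 49.7 / 50.9 | 48.3 / 24.6 / 36.5 | 52.6 / 54.1 / 53.4 | 18.6 / 35.3 / 27.0 | 59.0 / 68.3 / 63.7 | 65.9 / 62.3 / 64.1 | 66.5 / 64.6 / 65.6 | 54.2 / 38.6 / 46.4 |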
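
The leaderboard reports three numbers per track (Align, Aes, Avg) plus an Overall column. The page does not spell out how these are aggregated, so the following Python sketch rests on an assumption inferred from the published numbers: Avg is taken to be the mean of Align and Aes, and Overall the unweighted mean over the seven tracks. The function name `aggregate` and the dictionary layout are illustrative only and are not part of the released evaluation code. The input values are the GPT-Image-1 [High] scores from the first leaderboard table.

```python
from statistics import mean

def aggregate(track_scores):
    """Aggregate per-track (align, aes) pairs.

    Assumption (inferred from the table, not stated by the authors):
    Avg = (Align + Aes) / 2, and Overall = unweighted mean over tracks.
    """
    per_track_avg = {
        track: (align + aes) / 2
        for track, (align, aes) in track_scores.items()
    }
    overall_align = mean(align for align, _ in track_scores.values())
    overall_aes = mean(aes for _, aes in track_scores.values())
    overall_avg = (overall_align + overall_aes) / 2
    return {
        "per_track_avg": per_track_avg,
        "overall": (overall_align, overall_aes, overall_avg),
    }

# GPT-Image-1 [High], first leaderboard table: track -> (Align, Aes)
gpt_image_1 = {
    "Imagination": (86.2, 86.6),
    "Entity": (90.0, 86.3),
    "Text rendering": (68.8, 80.1),
    "Style": (92.8, 93.3),
    "Affection": (90.7, 90.9),
    "Composition": (96.2, 89.4),
    "Long text": (83.8, 72.8),
}

result = aggregate(gpt_image_1)
print([round(x, 1) for x in result["overall"]])  # [86.9, 85.6, 86.3]
```

Under this assumption the sketch reproduces the Overall cell published for that row (86.9 / 85.6 / 86.3). Per-track Avg values may differ from the table in the last digit, since the table's inputs are already rounded.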

BibTeX

@article{fang2025flux,
  title={FLUX-Reason-6M \& PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark},
  author={Fang, Rongyao and Yu, Aldrich and Duan, Chengqi and Huang, Linjiang and Bai, Shuai and Cai, Yuxuan and Wang, Kun and Liu, Si and Liu, Xihui and Li, Hongsheng},
  journal={arXiv preprint arXiv:2509.09680},
  year={2025}
}