PreciseCam: Precise Camera Control for Text-to-Image Generation

Bernal-Berdun, Edurne; Gadelha, Matheus; Hold-Geoffroy, Yannick; Gutierrez, Diego; Sun, Xin; Masia, Belen; Serrano, Ana
doi:10.1109/CVPR52734.2025.00260
000163836 001__ 163836
000163836 005__ 20251107115329.0
000163836 0247_ $$2doi$$a10.1109/CVPR52734.2025.00260
000163836 0248_ $$2sideral$$a145835
000163836 037__ $$aART-2025-145835
000163836 041__ $$aeng
000163836 100__ $$0(orcid)0000-0002-5275-8652$$aBernal-Berdun, Edurne$$uUniversidad de Zaragoza
000163836 245__ $$aPreciseCam: Precise Camera Control for Text-to-Image Generation
000163836 260__ $$c2025
000163836 5060_ $$aAccess copy available to the general public$$fUnrestricted
000163836 5203_ $$aImages as an artistic medium often rely on specific camera angles and lens distortions to convey ideas or emotions; however, such precise control is missing in current text-to-image models. We propose an efficient and general solution that allows precise control over the camera when generating both photographic and artistic images. Unlike prior methods that rely on predefined shots, we rely solely on four simple extrinsic and intrinsic camera parameters, removing the need for pre-existing geometry, reference 3D objects, and multi-view data. We also present a novel dataset with more than 57,000 images, along with their text prompts and ground-truth camera parameters. Our evaluation shows precise camera control in text-to-image generation, surpassing traditional prompt engineering approaches.
000163836 536__ $$9info:eu-repo/grantAgreement/ES/DGA/T25-24$$9info:eu-repo/grantAgreement/ES/MICIU/PID2022-141766OB-I00
000163836 540__ $$9info:eu-repo/semantics/embargoedAccess$$aAll rights reserved$$uhttp://www.europeana.eu/rights/rr-f/
000163836 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/acceptedVersion
000163836 700__ $$0(orcid)0000-0002-7796-3177$$aSerrano, Ana$$uUniversidad de Zaragoza
000163836 700__ $$0(orcid)0000-0003-0060-7278$$aMasia, Belen$$uUniversidad de Zaragoza
000163836 700__ $$aGadelha, Matheus
000163836 700__ $$aHold-Geoffroy, Yannick
000163836 700__ $$aSun, Xin
000163836 700__ $$0(orcid)0000-0002-7503-7022$$aGutierrez, Diego$$uUniversidad de Zaragoza
000163836 7102_ $$15007$$2570$$aUniversidad de Zaragoza$$bDpto. Informát.Ingenie.Sistms.$$cÁrea Lenguajes y Sistemas Inf.
000163836 773__ $$g2025 (2025), 2724-2733$$pProc.- IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit.$$tProceedings - IEEE Computer Society Conference on Computer Vision and Pattern Recognition$$x1063-6919
000163836 8564_ $$s16450327$$uhttps://zaguan.unizar.es/record/163836/files/texto_completo.pdf$$yPostprint$$zinfo:eu-repo/date/embargoEnd/2026-08-13
000163836 8564_ $$s2562457$$uhttps://zaguan.unizar.es/record/163836/files/texto_completo.jpg?subformat=icon$$xicon$$yPostprint$$zinfo:eu-repo/date/embargoEnd/2026-08-13
000163836 909CO $$ooai:zaguan.unizar.es:163836$$particulos$$pdriver
000163836 951__ $$a2025-11-07-10:25:42
000163836 980__ $$aARTICLE
Universidad de Zaragoza Repository