Abstract: We present a method that computes an interpretable representation of material appearance within a highly compact, disentangled latent space. This representation is learned in a self-supervised fashion using a VAE-based model. We train our model with a carefully designed unlabeled dataset, avoiding possible biases induced by human-generated labels. Our model demonstrates strong disentanglement and interpretability by effectively encoding material appearance and illumination, despite the absence of explicit supervision. To showcase the capabilities of such a representation, we leverage it for two proof-of-concept applications: image-based appearance transfer and editing. Our representation is used to condition a diffusion pipeline that transfers the appearance of one or more images onto a target geometry, and allows the user to further edit the resulting appearance. This approach offers fine-grained control over the generated results: thanks to the well-structured, compact latent space, users can intuitively manipulate attributes such as hue or glossiness in image space to achieve the desired final appearance.
Language: English
DOI: 10.2312/sr.20251187
Year: 2025
Published in: Eurographics Symposium on Rendering 2025 (2025), [13 pp.]
ISSN: 1553-0574
Funding: info:eu-repo/grantAgreement/EC/H2020/956585/EU/Predictive Rendering In Manufacture and Engineering/PRIME
Funding: info:eu-repo/grantAgreement/ES/MCIU/FPU20-02340
Funding: info:eu-repo/grantAgreement/ES/MICIU/PID2022-141766OB-I00
Type and form: Article (final version)
Area (Department): Área Lenguajes y Sistemas Inf. (Dpto. Informát.Ingenie.Sistms.)