TY - JOUR
T1 - Decoding the third dimension in the metaverse
T2 - A comprehensive method for reconstructing 2D NFT portraits into 3D models
AU - Deng, Erqiang
AU - You, Li
AU - Khan, Fazlullah
AU - Zhu, Guosong
AU - Qin, Zhen
AU - Kumari, Saru
AU - Xiong, Hu
AU - Alturki, Ryan
N1 - Publisher Copyright:
© 2024 Elsevier B.V.
PY - 2024/11
Y1 - 2024/11
AB - In the metaverse, a virtual, shared, and persistently online space that combines the real world, virtual reality, and augmented reality, 3D modeling techniques and autoencoders offer a novel approach for handling 2D portraits of Non-Fungible Tokens (NFTs). Within the metaverse, NFTs can represent virtual items and assets, and 3D modeling techniques can be used to build three-dimensional models of these assets. In this paper, we propose a novel method for inferring 3D structure and texture from 2D NFT portraits using image-decoupled autoencoders. Through 3D facial modeling, a depth value is associated with each pixel in the canonical view, yielding 3D faces with fine textures and accurate structures from 2D NFT portraits. The input image is decomposed into four elements, a depth map, an albedo image, a light direction, and a viewpoint, all of which are used in the 3D reconstruction process. Asymmetry in NFT portraits is also addressed: a symmetry confidence map records the symmetry prediction probability for each pixel. In the experiments, datasets of both human faces and anime faces are used to cover the diverse styles of NFT images. The Adam optimizer is used for training, and a set of evaluation metrics, including cosine similarity, PSNR, SSIM, and LPIPS, is used to assess the quality of texture reconstruction. The proposed method achieves state-of-the-art performance in 3D facial reconstruction and performs especially well on anime faces compared with other methods.
UR - http://www.scopus.com/inward/record.url?scp=85200208938&partnerID=8YFLogxK
U2 - 10.1016/j.asoc.2024.111964
DO - 10.1016/j.asoc.2024.111964
M3 - Article
AN - SCOPUS:85200208938
SN - 1568-4946
VL - 165
JO - Applied Soft Computing
JF - Applied Soft Computing
M1 - 111964
ER -