Preprint · CC-BY
via bioRxiv
Probing the Link Between Vision and Language in Material Perception
Liao, C., Sawayama, M., Xiao, B.
biorxiv · 2024
Abstract
We can visually discriminate and recognize a wide range of materials. Meanwhile, we use language to express our subjective understanding of visual input and communicate relevant information about the materials. Here, we investigate the relationship between visual judgment and language expression in material perception to understand how visual features relate to semantic representations. We use deep generative networks to construct an expandable image space to systematically create materials of well-defined and ambiguous categories. From such a space, we sampled diverse stimuli and compared the representations of materials from two behavioral tasks: visual material similarity judgments and free-form verbal descriptions. Our findings reveal a moderate but significant correlation between vision and language at the categorical level. However, analyzing the representations with an unsupervised alignment method, we discover structural differences that arise at the image-to-image level, especially among materials morphed between known categories. Moreover, visual judgments exhibit more individual differences than verbal descriptions. Our results show that while verbal descriptions capture material qualities at a coarse level, they may not fully convey the visual features that characterize the materials' optical properties. Analyzing the image representations of materials obtained from various pre-trained, data-rich deep neural networks, we find that the similarity structures of human visual judgments align more closely with those of the text-guided visual-semantic model than with those of purely vision-based models. Our findings suggest that while semantic representations facilitate material categorization, non-semantic visual features also play a significant role in discriminating materials at a finer level. This work illustrates the need to consider the vision-language relationship in building a comprehensive model for material perception.
Moreover, we propose a novel framework for quantitatively evaluating the alignment and misalignment between representations from different modalities, leveraging information from human behaviors and computational models.
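As a rough illustration of what comparing the two behavioral representations at the level of similarity structure can look like, the sketch below rank-correlates the off-diagonal entries of two dissimilarity matrices, one per task. This is an assumption for illustration only: the abstract does not state which correlation measure the authors use, the function names are hypothetical, the 4-stimulus matrices are made-up toy values, and the unsupervised alignment method mentioned above is not covered here.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm_spearman(rdm_a, rdm_b):
    """Spearman correlation between two dissimilarity matrices (upper triangles)."""
    return pearson(
        average_ranks(upper_triangle(rdm_a)),
        average_ranks(upper_triangle(rdm_b)),
    )


# Hypothetical toy data: pairwise dissimilarities among 4 stimuli.
vision_rdm = [
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.6, 0.8],
    [0.7, 0.6, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]
language_rdm = [
    [0.0, 0.1, 0.8, 0.7],
    [0.1, 0.0, 0.9, 0.6],
    [0.8, 0.9, 0.0, 0.2],
    [0.7, 0.6, 0.2, 0.0],
]

rho = rdm_spearman(vision_rdm, language_rdm)
print(f"Spearman correlation between toy matrices: {rho:.3f}")
```

A rank correlation is used in this sketch because it only assumes that the two tasks order stimulus pairs similarly, not that their dissimilarity scales are linearly related.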
◌ CITATION ONLY
Full text is not openly licensed for redistribution here. Read it at the source:
Provenance
- Source: bioRxiv
- DOI: 10.1101/2024.01.25.577219
- Canonical: link ↗
- Fetched: 2026-05-31 MST
Cite this
APA
Liao, C., Sawayama, M., & Xiao, B. (2024). Probing the Link Between Vision and Language in Material Perception. bioRxiv. https://doi.org/10.1101/2024.01.25.577219
Vancouver
Liao C, Sawayama M, Xiao B. Probing the Link Between Vision and Language in Material Perception. bioRxiv. 2024. doi:10.1101/2024.01.25.577219.
BibTeX
@unpublished{liao2024Probin,
title = {Probing the Link Between Vision and Language in Material Perception},
author = {Liao, C. and Sawayama, M. and Xiao, B.},
note = {bioRxiv preprint},
year = {2024},
doi = {10.1101/2024.01.25.577219},
}