No abstract available

Keywords: artificial intelligence; deep learning; neural networks; neuroscience; perception; psychology.

Figures

Figure 1

A “variational auto-encoder” (VAE) deep network (13 layers) was trained using an unsupervised “generative adversarial network” procedure (VAE/GAN, Goodfellow et al., ; Larsen et al., 2015) on a labeled database of 202,599 celebrity faces (15 epochs). The latent space (1024-dimensional) of the resulting network provides a description of numerous facial features that could approximate face representations in the human brain.

(A) A picture of the author as seen (i.e., “encoded”) by the network is rendered (i.e., “decoded”) in the center of the panel. After encoding, the latent space can be manipulated with simple linear algebra. For example, adding a “beard vector” before decoding creates a realistic image of the author with a beard. The beard vector is obtained by subtracting the average latent description of 1000 faces having a “no-beard” label from the average latent description of 1000 faces having a “beard” label. The same operation can be done (clockwise, from right) by adding average vectors reflecting the labels “bald,” “old,” “young,” or “smile.” In short, the network manipulates concepts, which it can extract from and render to pixel-based representations. It is tempting to envision that the 1024 “hidden neurons” forming this latent space could display a pattern of stimulus selectivity comparable to that observed in certain human face-selective regions (Kanwisher et al., ; Tsao et al., ; Freiwald et al., ; Freiwald and Tsao, 2010).

(B) Since the network (much like the human brain) was trained solely with upright faces, it encodes an upside-down face incorrectly, partly erasing important facial features (the mouth) and “hallucinating” nonexistent features (a faint nose and mouth in the forehead region). This illustrates how human-like perceptual behavior (here, the face inversion effect) can emerge from computational principles.

The database used for training this network is available from Liu et al. (2015).
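The latent-space arithmetic described in panel (A) can be sketched in a few lines. This is a minimal, dependency-free illustration, not the author's implementation: the function names (`mean_vector`, `attribute_vector`, `add_attribute`), the `strength` parameter, and the synthetic Gaussian "latents" are all assumptions made for the example, and the trained encoder and decoder are not modeled here.

```python
import random

LATENT_DIM = 1024  # latent size stated in the caption


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def attribute_vector(with_label, without_label):
    """Mean latent of faces with a label minus mean latent of faces without it."""
    mean_with = mean_vector(with_label)
    mean_without = mean_vector(without_label)
    return [a - b for a, b in zip(mean_with, mean_without)]


def add_attribute(latent, direction, strength=1.0):
    """Shift one encoded face along an attribute direction before decoding."""
    return [z + strength * d for z, d in zip(latent, direction)]


# Synthetic stand-ins for encoder outputs (hypothetical data, not the real model).
# The "beard" latents are offset by +2.0 on dimension 0 only.
rng = random.Random(0)
no_beard = [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] for _ in range(200)]
beard = [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] for _ in range(200)]
for z in beard:
    z[0] += 2.0

beard_vector = attribute_vector(beard, no_beard)
author_latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
author_with_beard = add_attribute(author_latent, beard_vector)
# In the real pipeline, author_with_beard would be passed to the trained decoder.
```

In this toy setup the recovered `beard_vector` is close to 2.0 on dimension 0 and close to zero elsewhere, which is the sense in which averaging over many labeled faces isolates one attribute direction. The caption uses 1000 faces per label; 200 are used here only to keep the example fast.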

References

1. Anselmi F., Leibo J. Z., Rosasco L., Mutch J., Tacchetti A., Poggio T. A. (2013). Unsupervised learning of invariant representations in hierarchical architectures. CoRR 1311.4158.
2. Cadieu C. F., Hong H., Yamins D. L., Pinto N., Ardila D., Solomon E. A., et al. (2014). Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLoS Comput. Biol. 10:e1003963. doi: 10.1371/journal.pcbi.1003963
3. Cao Y., Chen Y., Khosla D. (2015). Spiking deep convolutional neural networks for energy-efficient object recognition. Int. J. Comp. Vis. 113, 54–66. doi: 10.1007/s11263-014-0788-3
4. Champandard A. J. (2016). Semantic style transfer and turning two-bit doodles into fine artworks. CoRR 1603.01768.
5. Cichy R. M., Khosla A., Pantazis D., Oliva A. (2016a). Dynamics of scene representations in the human brain revealed by magnetoencephalography and deep neural networks. Neuroimage. [Epub ahead of print]. doi: 10.1016/j.neuroimage.2016.03.063

URL: https://pubmed.ncbi.nlm.nih.gov/28210237/

Perception Science in the Age of Deep Neural Networks - PubMed
