Multimode fibers scramble input images because of intermodal dispersion and random mode coupling. While this scrambling can be characterized, for instance by measuring the transmission matrix of the fiber, slight changes in the geometrical conformation of the fiber modify its response and make the calibration obsolete. In the present paper, the authors train a deep-learning model on data sets acquired over a wide range of deformations to reconstruct images sent through unknown configurations of a multimode fiber. This is possible thanks to invariant properties that the model learns.
For a given configuration of the fiber geometry, one can fully characterize the response of the fiber by measuring its transmission matrix, which links the input field to the output field, regardless of the amount of disorder. When the fiber is modified by bending, vibrations, or temperature changes, its response changes, and the previously acquired transmission matrix is no longer valid. For weak deformations, different techniques have been explored to reconstruct the input information, for instance estimating the effect of the deformation analytically, or measuring a set of transmission matrices for various configurations of disorder and picking the right one afterward. Deep learning has also been proposed and demonstrated for weak deformations: in that regime, a system trained for a given realization of the disorder is still able to reconstruct images through unknown configurations.
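To make the role of the transmission matrix concrete, here is a minimal numerical sketch of why a calibration measured before a deformation stops being useful afterward. It is a toy model and not the authors' setup: the fiber is replaced by a random complex Gaussian matrix, the deformation by an added random matrix, and the names `random_tm`, `correlation`, `n_modes`, and `strength` are hypothetical choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_modes = 64  # ASSUMPTION: illustrative number of input/output modes


def random_tm(n, rng):
    """Random complex Gaussian matrix, a toy stand-in for a fiber transmission matrix."""
    real = rng.standard_normal((n, n))
    imag = rng.standard_normal((n, n))
    return (real + 1j * imag) / np.sqrt(2 * n)


def correlation(a, b):
    """Normalized overlap |<a|b>| of two complex fields (1 means identical up to a global phase)."""
    return abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))


# Calibration: the transmission matrix measured for one fiber configuration.
T_calibrated = random_tm(n_modes, rng)

# An arbitrary complex input field to be recovered from the output field.
x_in = rng.standard_normal(n_modes) + 1j * rng.standard_normal(n_modes)

# Case 1: the fiber has not moved, so inverting the calibrated matrix recovers the input.
y_same = T_calibrated @ x_in
x_rec_same = np.linalg.solve(T_calibrated, y_same)

# Case 2: the fiber is deformed. ASSUMPTION: the deformation is modeled as an
# added random matrix of relative weight `strength`, which is not a physical model.
strength = 1.0
T_deformed = T_calibrated + strength * random_tm(n_modes, rng)
y_deformed = T_deformed @ x_in
x_rec_deformed = np.linalg.solve(T_calibrated, y_deformed)  # stale calibration

fidelity_same = correlation(x_in, x_rec_same)
fidelity_deformed = correlation(x_in, x_rec_deformed)
print(f"fidelity, same configuration:     {fidelity_same:.3f}")
print(f"fidelity, deformed configuration: {fidelity_deformed:.3f}")
```

Note that this sketch inverts the full complex field, whereas the experiment discussed here only has access to the output intensity pattern, which makes the real problem harder.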
However, for stronger modifications of the fiber geometry, these techniques all fail. The idea here is to use a deep learning model that is trained over a wide range of configurations, with just a few images for each. The goal is then, for an unknown configuration of disorder, to reconstruct the image using only the output intensity speckle pattern of the fiber and the trained network. The principle of the experiment is presented in Fig. 1. The data consists of sets of input images, sent using a spatial light modulator (a digital micro-mirror device), and the corresponding output intensity patterns.
Figure 1. Principle of the experiment. From [S. Resisi et al., arXiv, 2011.05144 (2020)].
The fiber is deformed using a set of 37 actuators that press on the fiber at different locations in a controllable manner. The space of possible fiber configurations is very large, so it is not realistic to explore it fully and calibrate every combination. The only hope is the existence of invariant properties that are resilient to disorder and that could be used to reconstruct the input information. Fortunately, deep learning is famously good at finding invariant properties and exploiting them. The perturbations applied totally randomize the output pattern, but hidden information can still be present. The network is trained over 943 different configurations, but with only 800 images (taken from the MNIST handwritten digit database) for each configuration. Typical results are shown in Fig. 2, with a good reconstruction accuracy even for unknown configurations of the disorder.
Figure 2. Typical reconstruction results. From [S. Resisi et al., arXiv, 2011.05144 (2020)].
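The numbers quoted above give a feel for how sparsely the configuration space is sampled. The short calculation below uses the 37 actuators, 943 configurations, and 800 images per configuration from the text. The number of distinguishable positions per actuator is not given here, so `levels_per_actuator = 10` is purely an assumption for the sake of an order-of-magnitude estimate.

```python
n_actuators = 37           # from the text
n_configs_trained = 943    # from the text
images_per_config = 800    # from the text
levels_per_actuator = 10   # ASSUMPTION: hypothetical number of distinct positions per actuator

# Total number of (input image, output speckle) pairs seen during training.
total_training_pairs = n_configs_trained * images_per_config

# Size of the configuration space if every actuator could take any of its levels independently.
n_possible_configs = levels_per_actuator ** n_actuators

# Fraction of that space covered by the training configurations.
sampled_fraction = n_configs_trained / n_possible_configs

print(f"training pairs:          {total_training_pairs}")
print(f"possible configurations: {n_possible_configs:.1e}")
print(f"fraction sampled:        {sampled_fraction:.1e}")
```

Whatever the true number of levels, the sampled fraction is vanishingly small, which is why the reconstruction of unseen configurations has to rely on invariant properties rather than on exhaustive calibration.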
To understand the origin of the generalization capabilities of the approach, the authors analyze the dataset using clustering, an unsupervised machine learning approach that looks for groups of elements with similar properties in the data set. Some results obtained with the t-SNE algorithm are shown in Fig. 3. They show that the algorithm is able to group digits sent through the same configuration. This indicates that the transmission properties of light are not totally randomized by the deformations, as outputs would otherwise be indistinguishable. It supports the interpretation that the deep learning model learns invariant properties, common to all deformations, which are then used to infer and reconstruct the input information.
Figure 3. Clustering applied to a data set consisting of different input digits sent through various fiber configurations. Left and right share the same points; only the color labels differ: one color per realization of disorder (left) or one color per input digit (right). From [S. Resisi et al., arXiv, 2011.05144 (2020)].
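The kind of analysis shown in Fig. 3 can be sketched on synthetic data with the `TSNE` class of scikit-learn. This is not the authors' code or data: the speckle patterns come from a toy model in which every configuration shares a common matrix `T_common` plus a configuration-specific random part, and `n_modes`, `n_digits`, `n_configs`, and `perturbation` are hypothetical parameters chosen to keep the example small.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
n_modes, n_digits, n_configs = 32, 3, 20  # ASSUMPTION: small toy sizes
perturbation = 0.5  # ASSUMPTION: relative weight of the configuration-specific part


def random_complex(shape, rng):
    """Complex Gaussian array with unit-variance entries."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


# Random input fields standing in for the digit images.
inputs = random_complex((n_digits, n_modes), rng)

# Toy model: a part of the response shared by all configurations.
T_common = random_complex((n_modes, n_modes), rng)

speckles, digit_labels, config_labels = [], [], []
for config in range(n_configs):
    # Each configuration adds its own random contribution to the shared part.
    T_config = T_common + perturbation * random_complex((n_modes, n_modes), rng)
    for digit in range(n_digits):
        speckles.append(np.abs(T_config @ inputs[digit]) ** 2)  # output intensity pattern
        digit_labels.append(digit)
        config_labels.append(config)

X = np.asarray(speckles)  # shape: (n_configs * n_digits, n_modes)

# Embed the intensity patterns in 2D; perplexity must stay below the number of samples.
embedding = TSNE(n_components=2, perplexity=10, init="pca", random_state=0).fit_transform(X)
print(embedding.shape)
```

Plotting `embedding` twice, once colored by `config_labels` and once by `digit_labels`, reproduces the two-panel layout of Fig. 3, and varying `perturbation` changes how much shared structure remains in the toy data.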