Machine learning (ML) techniques are being used as an alternative to numerical methods for solving the partial differential equations (PDEs) of fluid dynamics. Numerical methods offer higher fidelity but are computationally expensive, whereas ML methods are lower fidelity but provide significant speed-ups. The main argument against using ML in fluid dynamics is its limited ability to generalize across boundary conditions, initial conditions, geometries and flow scales. Another downside is that an ML model is a ‘black box’ with no physical interpretability. We aim to address some of these disadvantages by 1) building time-stepping ML models that can predict Rayleigh-Bénard convection (RBC) across three orders of magnitude of flow scales (Rayleigh number Ra = 10^6 to 10^9), and 2) designing these models so that they encode the flow into a physically informed, lower-dimensional latent space. The latent space is ‘physically informed’ in the sense that the latent vector is partitioned into a part corresponding to convection-dominant (or advection-dominant) regions and a part corresponding to diffusion-dominant regions of the flow, which makes the latent space more physically interpretable. The proposed models are hybrid, combining ML tools such as Gaussian mixture model (GMM) clustering, principal component analysis (PCA), convolutional autoencoders (CAEs), and multilayer perceptrons (MLPs). Our results indicate that the proposed physically interpretable models achieve higher time-stepping accuracy than baseline models, whilst providing computational speed-ups over numerical (pseudo-spectral) methods.