Immersive virtual reality (VR) applications require ultra-high data rates and low latency for smooth operation. Hence, in this paper, aiming to improve the VR experience in multi-user VR wireless video streaming, a deep-learning-aided scheme for maximizing the quality of the delivered video chunks with low latency is proposed. Therein, the correlations in the predicted field of view (FoV) and locations of viewers watching 360$^\circ$ HD VR videos are capitalized on to realize a proactive FoV-centric millimeter wave (mmWave) physical-layer multicast transmission. The problem is cast as a frame quality maximization problem subject to tight latency constraints and network stability. The problem is then decoupled into HD frame request admission and scheduling subproblems, and a matching theory game is formulated to solve the scheduling subproblem by associating requests from clusters of users to mmWave small cell base stations (SBSs) for their unicast/multicast transmission. Furthermore, for realistic modeling and simulation purposes, a real VR head-tracking dataset and a deep recurrent neural network (DRNN) based on gated recurrent units (GRUs) are leveraged. Extensive simulation results show how the content reuse for clusters of users with highly overlapping FoVs brought in by multicasting reduces the VR frame delay by 12\%. This reduction is further boosted by proactiveness, which cuts the average delays of both reactive unicast and multicast baselines by half while preserving HD delivery rates above 98\%. Finally, enforcing tight latency bounds shortens the delay tail, as evidenced by 13\% lower delays in the 99th percentile.