This study addresses multicast scheduling for M × N input-queued switches. An enhanced first-in-first-out (FIFO)-based round-robin multicast scheduling algorithm is proposed that searches deeper into the queues to reduce head-of-line (HOL) blocking and thereby the multicast latency. The fan-out information of each input cell composes a traffic matrix, and the scheduler executes a round-robin algorithm on each column independently. The scheduling decisions form a decision matrix, according to which the scheduler releases multicast cells. A matrix operation called sync is carried out on the decision matrix to reduce the number of transmissions for each cell. To reduce HOL blocking, a complement matrix is constructed from the traffic matrix and the decision matrix, and the queues are searched deeper to find cells that can be sent to the idle outputs. Simulation results show that searching deeper into the queues alleviates HOL blocking and, as a result, reduces the multicast latency significantly. Under both balanced and unbalanced multicast traffic, the proposed algorithm maintains a stable throughput.
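The column-wise round-robin step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `round_robin_columns` and `idle_outputs`, the per-output pointer list, and the pointer update rule (advance to one past the granted input) are assumptions made for illustration. The sync operation and the complement matrix are not specified in enough detail here to be reproduced, so the sketch only derives a decision matrix from a traffic matrix and lists the outputs left idle, which are the candidates for the deeper queue search.

```python
def round_robin_columns(traffic, pointers):
    """Run one round-robin pass per output column (hypothetical helper).

    traffic[i][j] == 1 means the HOL cell at input i requests output j.
    pointers[j] is the input at which the search for output j starts.
    Returns the decision matrix and the updated pointers.
    """
    m = len(traffic)
    n = len(traffic[0]) if m else 0
    decision = [[0] * n for _ in range(m)]
    new_pointers = list(pointers)
    for j in range(n):
        # Scan the inputs cyclically, starting at this column's pointer.
        for k in range(m):
            i = (pointers[j] + k) % m
            if traffic[i][j]:
                decision[i][j] = 1
                # Assumed update rule: move one past the granted input.
                new_pointers[j] = (i + 1) % m
                break
    return decision, new_pointers


def idle_outputs(decision):
    """Return the outputs (columns) that received no grant in this slot."""
    n = len(decision[0]) if decision else 0
    return [j for j in range(n) if not any(row[j] for row in decision)]


# Example with M = 3 inputs and N = 4 outputs (made-up traffic).
traffic = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
]
pointers = [0, 1, 0, 0]
decision, pointers_next = round_robin_columns(traffic, pointers)
idle = idle_outputs(decision)
```

In this example output 3 is requested by no HOL cell, so it stays idle; a scheduler that searches deeper into the queues would look behind the HOL cells for one destined to that output.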