HIV infection provokes a myriad of pathological effects on the immune system, and many markers of CD4+ T cell dysfunction have been identified. However, most studies to date have focused on single or double measurements of immune dysfunction, while the identification of pathological CD4+ T cell clusters that are highly associated with a specific biomarker for HIV disease remains less studied. Here, multi-parametric flow cytometry was used to investigate immune activation, exhaustion, and senescence of diverse maturation phenotypes of CD4+ T cells. The traditional method of manual data analysis was compared to a multidimensional clustering tool, FLOw Clustering without K (FLOCK), in two cohorts: 47 untreated HIV-infected individuals and 21 age- and sex-matched healthy controls. To reduce the subjectivity of FLOCK, we developed an "artificial reference" using 2% of all CD4+ gated T cells from each of the HIV-infected individuals. Principal component analyses demonstrated that using an artificial reference led to a better separation of the HIV-infected individuals from the healthy controls than using a single HIV-infected subject as a reference or analyzing the data manually. Multiple correlation analyses between laboratory parameters and pathological CD4+ clusters revealed that the CD4/CD8 ratio was the preeminent surrogate marker of CD4+ T cell dysfunction using all three methods. Increased frequencies of an early-differentiated CD4+ T cell cluster with high CD38, HLA-DR and PD-1 expression were best correlated (Rho = -0.80, P value = 1.96 × 10⁻¹¹) with HIV disease progression as measured by the CD4/CD8 ratio. The novel approach described here can be used to identify cell clusters that distinguish healthy from HIV-infected subjects and are biologically relevant for HIV disease progression.
These results further emphasize that a simple measurement of the CD4/CD8 ratio is a useful biomarker for assessment of combined CD4+ T cell dysfunction in chronic HIV disease.
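
As an illustration of the "artificial reference" idea, pooling 2% of the CD4+ gated events from each HIV-infected individual might be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the function name `build_artificial_reference`, the use of uniform random sampling without replacement, and the layout of each subject's data as a cells-by-markers NumPy array are all hypothetical, and the demo values are synthetic.

```python
import numpy as np

def build_artificial_reference(subject_events, fraction=0.02, seed=0):
    """Pool a fixed fraction of events from every subject into one reference set.

    subject_events: list of 2-D arrays (cells x markers), one per subject.
    Assumption: events are drawn uniformly at random without replacement;
    the abstract does not state how the 2% were selected.
    """
    rng = np.random.default_rng(seed)
    sampled = []
    for events in subject_events:
        n_cells = events.shape[0]
        # Take at least one cell so that every subject contributes.
        n_take = max(1, int(round(n_cells * fraction)))
        idx = rng.choice(n_cells, size=n_take, replace=False)
        sampled.append(events[idx])
    # Stack the per-subject samples into a single pooled reference matrix.
    return np.vstack(sampled)

# Synthetic demo: three "subjects" with 4 markers each (values are random noise).
demo_rng = np.random.default_rng(1)
subjects = [demo_rng.normal(size=(n, 4)) for n in (1000, 2500, 500)]
reference = build_artificial_reference(subjects)
print(reference.shape)
```

Because each subject contributes the same fraction of its own events, the pooled reference reflects every individual rather than the phenotype of a single subject chosen as the template.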