This study systematically analyses the questionnaires used by rating agencies, financial institutions and companies to collect ESG data, highlighting significant heterogeneity in the structure, language and content of the tools currently in use.
Through a three-level approach (descriptive analysis, textual similarity measured by cosine similarity, and cluster analysis), the research shows how differences in data collection formats compromise the comparability and transparency of sustainability assessments. The empirical results, based on a diverse sample of questionnaires, reveal low levels of linguistic similarity, wide variations in the number and type of questions, and structural fragmentation reflected in six distinct clusters.
This evidence is particularly relevant in light of the new Regulation (EU) 2024/3005 on ESG ratings, which aims to strengthen their integrity through more stringent requirements in terms of processes, transparency, and data quality. The study suggests that the effectiveness of the European regulatory framework could be further enhanced by extending harmonisation efforts to the information gathering phase, promoting minimum standards and greater methodological consistency. The analysis thus contributes to the debate on the quality of ESG metrics, providing useful evidence for improving the information base on which sustainability assessments in the European market are founded.
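
As a brief illustration of the textual similarity measure named above, the following minimal sketch computes cosine similarity between two questionnaire items represented as raw term-count vectors. The whitespace tokenisation, the use of unweighted counts (rather than, for example, TF-IDF weights or embeddings), the function name `cosine_similarity` and the two sample items are all assumptions made for illustration; the text representation actually used in the study is not specified here.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between raw term-count vectors of two texts.

    Assumption: lower-casing and whitespace tokenisation; no stemming,
    stop-word removal or term weighting is applied.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())

    # Dot product over the terms the two texts share.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty text has no direction, so similarity is defined as 0 here.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical questionnaire items, invented for illustration only.
q1 = "does the company report scope 1 emissions"
q2 = "does the company disclose board gender diversity"

print(round(cosine_similarity(q1, q2), 3))
```

Values close to 0 indicate that two items share little vocabulary, while values close to 1 indicate near-identical wording. A pairwise matrix of such scores across questionnaires is the kind of input on which a cluster analysis could then be run.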