
D&I mistakes are fueled by dirty data.


Dr. Neil Morelli


If you want to get a real handle on your diversity and inclusion (D&I) efforts, start with people analytics. The only way to know whether you are meeting your D&I goals, and where you are falling short, is to track relevant human capital data across your company.

When companies have comprehensive people data, any number of trends can be tracked, including how many people of color you are hiring per quarter, location, or department; whether salaries are fair across employee groups; and where you may have attrition problems that warrant further investigation. These kinds of data-driven D&I insights are the best way to uncover buried problems in your talent management processes so solutions can be developed.

Unfortunately, most companies do a terrible job of collecting even the most basic information about their workforce, which makes it impossible to conduct meaningful workforce analytics.

In a recent report from PwC, more than half (55%) of companies surveyed got a failing grade for tracking their people data. In the lowest-rated companies, half of the most basic data were missing or inconsistently tracked, and every company had consistency issues, including basic data fields that were not populated across all employee data sets. The authors argue that such inconsistent data management prevents companies from tracking even basic diversity metrics around representation, pay, development, and time to first promotion.

This problem is known in the analytics world as “dirty data.”

Your data is a mess

When a workforce database is full of empty fields, unstructured content, and information captured using inconsistent language, it becomes impossible to compare results or track the progress of individual employee groups against the whole. This makes it much harder to achieve D&I goals.
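A quick way to see how many fields are empty is to measure the share of records missing each one. The following is a minimal sketch, not a definitive audit: the record layout, the field names (`department`, `title`, `salary`), and the helper `missing_rates` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical employee records; field names are illustrative assumptions.
RECORDS = [
    {"id": 1, "department": "Sales", "title": "Manager", "salary": 72000},
    {"id": 2, "department": "", "title": "Sr. Manager", "salary": None},
    {"id": 3, "department": "sales", "title": None, "salary": 65000},
    {"id": 4, "department": "Engineering", "title": "Engineer", "salary": 90000},
]


def is_missing(value):
    """Treat None and blank or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())


def missing_rates(records, fields):
    """Return, per field, the share of records where that field is missing."""
    counts = Counter()
    for record in records:
        for field in fields:
            if is_missing(record.get(field)):
                counts[field] += 1
    total = len(records) or 1  # avoid dividing by zero on an empty data set
    return {field: counts[field] / total for field in fields}


rates = missing_rates(RECORDS, ["department", "title", "salary"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0%} missing")
```

Note that the sample also contains "Sales" and "sales" as separate department values. A completeness check like this one will not catch that kind of inconsistent language, which has to be handled separately.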

Dirty data occurs when companies take an ad hoc approach to implementing organizational databases. They add or update HR systems, acquire a new firm, and assign people across the organization to enter information without creating formal rules, codes, and language for data capture. As the company grows, so does the problem, resulting in a morass of human capital information that can’t deliver value because it doesn’t follow any rules.

Dirty data prevents companies from conducting even basic workforce analytics, like counting how many employees you have in each department and what they are being paid. It also exacerbates systemic diversity issues, like pay inequity and high rates of attrition among certain populations, because no one can see where they are happening.

And it happens all the time.

One test conducted by PwC showed the number of unique titles for a manager-level role varied so widely that a single organization could have more than 660 different titles. With this variation alone, there is no way to consistently chart management career paths, align salaries with roles, or compare salaries and promotion rates among different populations or departments.
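To illustrate how title variation inflates the count of "unique" roles, here is a small sketch that compares the number of distinct titles before and after a simple normalization pass. The sample titles, the abbreviation map, and the `normalize_title` helper are hypothetical; a real mapping would come from an agreed job architecture, not from code.

```python
import re

# Hypothetical abbreviation map, for illustration only.
ABBREVIATIONS = {"mgr": "manager", "sr": "senior", "jr": "junior"}


def normalize_title(title):
    """Lowercase, drop punctuation, collapse whitespace, expand known abbreviations."""
    words = re.sub(r"[^a-z0-9\s]", " ", title.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


# Hypothetical raw titles as they might appear across several HR systems.
raw_titles = [
    "Sr. Manager",
    "Senior Manager",
    "senior  manager",
    "Sr Mgr",
    "Manager, Sales",
    "Sales Manager",
]

unique_raw = set(raw_titles)
unique_normalized = {normalize_title(title) for title in raw_titles}

print(len(unique_raw), "raw titles ->", len(unique_normalized), "after normalization")
```

Simple text rules only go so far. In this sketch "Manager, Sales" and "Sales Manager" still count as two titles because word order differs, which is why a formal, maintained list of approved titles tends to be needed alongside any automated cleanup.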

It’s time to clean house

Chances are very high that your human capital data is a mess. And if you want to uncover the D&I problems in your own organization and gain greater transparency over the entire workforce, you have to clean it up.

It won’t be easy, especially in large organizations that have potentially thousands of employee data sets scattered across dozens of systems. But it can be done.

The work begins with cleaning up existing data sets. This process, which we outline in our 6 Steps to Cleaner Data blog post, requires going through every single set of data to eliminate duplicate information, update files using consistent language, and fill in every missing field.
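The three cleanup tasks named above (removing duplicates, applying consistent language, and finding missing fields) can be sketched in a few lines. This is an assumption-laden illustration, not the method from the 6 Steps to Cleaner Data post: the record layout, the `CANONICAL_DEPARTMENTS` vocabulary, and the `clean_records` helper are hypothetical. The sketch also only reports missing fields and does not fill them, since the correct values have to come from the people who own the data.

```python
# Hypothetical canonical vocabulary for one field, for illustration only.
CANONICAL_DEPARTMENTS = {
    "sales": "Sales",
    "sales dept": "Sales",
    "eng": "Engineering",
    "engineering": "Engineering",
}


def clean_records(records):
    """Merge duplicates by employee id, standardize department names,
    and report which fields are still missing for each employee."""
    merged = {}
    for record in records:
        target = merged.setdefault(record["id"], {})
        for field, value in record.items():
            # Keep the first non-empty value seen for each field.
            if target.get(field) in (None, ""):
                target[field] = value

    cleaned, gaps = [], {}
    for emp_id, record in merged.items():
        dept = record.get("department")
        if dept:
            key = dept.strip().lower()
            record["department"] = CANONICAL_DEPARTMENTS.get(key, dept.strip())
        missing = sorted(f for f, v in record.items() if v in (None, ""))
        if missing:
            gaps[emp_id] = missing
        cleaned.append(record)
    return cleaned, gaps


# Hypothetical input: employee 1 appears twice with partial information.
records = [
    {"id": 1, "department": "sales dept", "salary": None},
    {"id": 1, "department": "", "salary": 72000},
    {"id": 2, "department": "Eng", "salary": None},
]

cleaned, gaps = clean_records(records)
print(cleaned)
print("Still missing:", gaps)
```

Keeping the first non-empty value is a deliberately simple merge rule. When two systems disagree about the same field, a real cleanup would need an agreed rule for which source wins.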

Once the existing data is clean, companies need to rethink how they will capture human capital data going forward. That means formalizing job titles and structures, establishing benchmarked salary ranges, and using the same language about skills and attributes in every job description, recruiting post, and performance rating. It also means defining every point in the employee journey where information should be captured or updated.

Ideally, data collection begins during hiring, with the capture of all employee assessment and personality test results, which can be used to track progress against diversity hiring goals.

A new view

Cleaning up dirty human capital data can be a long, painful process, but it is necessary if you want to be able to leverage this gold mine of information to build a better, stronger, and more diverse workforce.

Once the data is clean, companies can do all kinds of predictive analytics to figure out what skills they will need in the future, who’s likely to quit, and which managers are thwarting their ability to hire and hone new talent. The potential impact of having clean data is profound, and the risk of not addressing this issue is equally large.

Don’t wait for your hidden workforce biases to go viral when an unfairly treated employee shares their story on social media. The sooner you clean up your data and set new goals for data collection, the sooner you can identify and fix the simmering issues that are causing good employees to feel excluded from your company’s path to success.

2970 Peachtree Road NW, Suite 300, Atlanta, GA 30305

© All Rights Reserved. Berke ® is a registered trademark of Berke Group, LLC.