Organizations increasingly rely on big data analytics to make important decisions, so it’s critical that those analyses are based on accurate, high-quality data. If a company makes a major decision based on inaccurate data, the results can be costly.
For example, consider a company that manufactures products for babies and young children. The company notices that there have been thousands of complaints on social media about its new crib. These complaints state that the crib is prone to sudden collapse.
The company’s leadership is horrified at the thought of its products potentially causing injuries to babies, and so it prepares to issue an urgent recall notice. Suppose the aforementioned Gretchen then takes a closer look at these thousands of social media complaints. She finds clues that the complaint dataset actually stems from questionable sources.
In other words, the complaints were generated by “robot” social media accounts and are not accurate: no cribs have actually collapsed. Gretchen saves the company from issuing a costly and unnecessary recall notice, and the company can instead launch a public relations campaign informing consumers of the mix-up.
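As a rough illustration of the kind of source check Gretchen might run, the sketch below flags complaints that look automated. Everything in it is an assumption made for illustration: the `Complaint` fields, the `flag_suspicious` and `suspicious_share` helpers, and the thresholds (accounts younger than 30 days with fewer than 5 followers, or text copied word for word across complaints) are hypothetical and do not come from any real platform or from the scenario above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Complaint:
    # Hypothetical fields; a real social media export would look different.
    account_id: str
    account_age_days: int
    follower_count: int
    text: str


def flag_suspicious(complaints, min_age_days=30, min_followers=5):
    """Return the complaints that match simple, assumed bot-like patterns."""
    # Count how often each normalized complaint text appears in the dataset.
    text_counts = Counter(c.text.strip().lower() for c in complaints)
    flagged = []
    for c in complaints:
        # Assumed signal 1: a very new account with almost no followers.
        new_and_isolated = (
            c.account_age_days < min_age_days and c.follower_count < min_followers
        )
        # Assumed signal 2: the same text posted more than once.
        copied_text = text_counts[c.text.strip().lower()] > 1
        if new_and_isolated or copied_text:
            flagged.append(c)
    return flagged


def suspicious_share(complaints):
    """Fraction of complaints flagged as suspicious (0.0 for an empty list)."""
    if not complaints:
        return 0.0
    return len(flag_suspicious(complaints)) / len(complaints)


# Made-up sample data, for demonstration only.
sample = [
    Complaint("a1", 2, 0, "The crib collapsed!"),
    Complaint("a2", 3, 1, "The crib collapsed!"),
    Complaint("a3", 900, 250, "One of the slats arrived scratched."),
    Complaint("a4", 1, 0, "Crib fell apart overnight"),
]
print(f"Share of complaints flagged as suspicious: {suspicious_share(sample):.0%}")
```

A high share of flagged complaints would not prove the reports are false, but it would be a reason to verify the sources before acting on the data.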
Veracity in big data can also directly affect individuals, not just companies. For example, consider the software needed to operate an autonomous vehicle. Autonomous vehicles are equipped with high-tech sensors to detect obstacles and prevent potential collisions. However, the sensors can only receive data; software is needed to interpret that data and decide whether something is actually a collision threat.
There is a significant difference between an overturned big rig on the road ahead and a plastic bag being blown across the road by the wind. The more accurate the data the software interprets, the better the vehicle will perform.