
Insufficient or biased data


Then there is the problem of insufficient training data. In 2017, following reports of possible student visa fraud, the U.K. Home Office used voice-recognition software to flag cases where the same voice appeared to have taken an English-proficiency test on behalf of multiple students. However, voice-recognition accuracy depends on having known samples of the voice under review, and the organization doing the review had no independent voice samples from the English-proficiency test candidates. Based on the results of the review, the government refused, cut short, or canceled the visas of nearly 36,000 people.

In the fall of 2019, the U.S. National Institute of Standards and Technology tested 189 facial recognition algorithms from 99 developers using 18.27 million images of 8.89 million people taken from four sources: domestic mugshots, immigration applications, visa applications, and border crossings.

They tested two common matching tasks for false positives (finding a match where there isn’t one) and false negatives (failing to find a match when there is one), as illustrated in the sketch after this list:

 One-to-one matching: Match a photo to another photo of the same person in a database. Examples: Unlock your smartphone, board an airplane, check a passport.

 One-to-many searching: Determine whether a photo has a match in a database. Example: Identify suspects in an investigation.
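
Here is a minimal, illustrative Python sketch of the two tasks, not taken from the book or from any NIST-tested algorithm. It assumes each face photo has already been converted into a numeric embedding by some face-recognition model; the function names and the 0.8 threshold are invented for illustration. A false positive occurs when a different person’s embedding still clears the threshold; a false negative occurs when the same person’s embedding does not.

```python
import numpy as np

def cosine_similarity(a, b):
    # Similarity between two face embeddings; 1.0 means identical direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def one_to_one_match(probe, enrolled, threshold=0.8):
    # Verification: does the probe photo match this single enrolled photo?
    # Matching someone else's photo is a false positive; failing to match
    # your own photo is a false negative.
    return cosine_similarity(probe, enrolled) >= threshold

def one_to_many_search(probe, gallery, threshold=0.8):
    # Identification: return every gallery entry whose similarity to the
    # probe clears the threshold (candidate matches for investigators).
    return [person_id for person_id, embedding in gallery.items()
            if cosine_similarity(probe, embedding) >= threshold]

# Toy 4-dimensional "embeddings"; real systems use hundreds of dimensions.
probe = np.array([0.90, 0.10, 0.20, 0.40])
gallery = {
    "person_A": np.array([0.88, 0.12, 0.21, 0.41]),  # near-duplicate of the probe
    "person_B": np.array([0.10, 0.90, 0.30, 0.20]),
}
print(one_to_one_match(probe, gallery["person_A"]))  # True
print(one_to_many_search(probe, gallery))            # ['person_A']
```

The threshold is the lever that trades one error against the other: raising it reduces false positives but increases false negatives, and the NIST results described next show that where this trade-off lands can differ sharply by demographic group.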

For one-to-one matching, most systems showed higher rates of false positives for Asian and African American faces than for Caucasian faces, although algorithms developed in Asia did better at matching Asian faces. Algorithms developed in the U.S. consistently registered a high rate of false positives for Asian, African American, and Native American faces. For one-to-many searching, African American females had the highest rates of false positives.

Essentially, facial recognition works best for people with the same phenotype (the observable characteristics produced by a person’s genetic makeup) as the people who developed the algorithm. Those outside the bias of the model will have trouble verifying their identity for travel and law enforcement purposes, or risk being falsely accused of a crime when a one-to-many search returns a false positive.
