Is bias and discrimination in AI a problem?

September 2022

Artificial Intelligence - good governance will need to catch up with the technology

The AI landscape

We hear about the deployment and use of AI in many settings. The types and frequency of use are only going to increase. Major uses include:

  • Cybersecurity analysis to identify anomalies in IT structures
  • Automating repetitive maintenance tasks and guiding technical support teams
  • Ad tech to profile and segment audiences for advertising targeting and optimise advertising buying and placement
  • Reviewing job applications to identify the best-qualified candidates in HR
  • Helping research scientists find patterns in health data that could lead to new cancer treatments
  • Predicting equipment failure in manufacturing
  • Detecting fraud in banking by analysing irregular patterns in transactions.
  • TV and movie recommendations for Netflix users
  • Inventory optimisation and demand forecasting in retail & transportation
  • Programming cars to self-drive

Overall, the different forms of AI will serve to improve our lives, but from a privacy point of view there is a danger that the governance around AI projects is lagging behind the evolving technology.

In that context, tucked away in its three-year plan, published in July, the ICO highlighted that AI-driven discrimination might become more of a concern. In particular, the ICO is planning to investigate concerns about the use of algorithms to sift recruitment applications.

Why recruitment applications?

AI is used widely in the recruitment industry. A Gartner report suggested that all recruitment agencies used it for some of their candidate sifting. The CEO of the US jobs site ZipRecruiter is quoted as saying that three-quarters of submitted CVs are read by algorithms. There is plenty of scope for data misuse, hence the ICO’s interest.

The Amazon recruitment tool – an example of bias/discrimination

The ICO is justified in its concerns around recruitment AI. Famously, Amazon developed its own tool to sift through applications for developer roles. The model was trained on 10 years of recruitment data from an employee pool that was largely male. As a result, the model discriminated against women and reinforced the gender imbalance by penalising applications from women.

What is AI?

AI can be defined as: 

“using a non-human system to learn from experience and imitate human intelligent behaviour”

The reality is that most “AI” applications are machine learning. That is, models are trained to calculate outcomes by learning from data collected in the past. Pure AI is technology designed to simulate human behaviour. For simplicity, let’s call machine learning AI.

Decisions made using AI are either fully automated or with a “human in the loop”. The latter can safeguard individuals against biased outcomes by providing a sense check of outcomes. 

In the context of data protection, it is becoming increasingly important that those impacted by AI decisions should be able to hold someone to account.

You might hear that all the information is in a “black box” and that how the algorithm works cannot be explained. This excuse isn’t good enough – it should be possible to explain how a model has been trained and risk assess that activity. 

How is AI used? 

AI can be used to make decisions:

1.     A prediction – e.g. you will be good at a job

2.     A recommendation – e.g. you will like this news article

3.     A classification – e.g. this email is spam. 
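
To make this concrete, here is a minimal sketch, in Python and purely for illustration, of the kind of machine-learning classifier described above: a model trained on past data that then produces a prediction (and a classification) for a new case. The feature names, the training data and the use of the scikit-learn library are my own assumptions, not a description of any particular vendor's system.

```python
# Minimal, hypothetical sketch: a classifier trained on past outcomes.
# The features and data below are invented purely for illustration.
from sklearn.linear_model import LogisticRegression

# Historic data: [years_of_experience, interview_score] for past applicants,
# plus whether each was ultimately judged "good at the job" (1) or not (0).
X_train = [
    [1, 55], [2, 60], [3, 72], [5, 68],
    [7, 80], [8, 83], [10, 90], [12, 88],
]
y_train = [0, 0, 0, 1, 1, 1, 1, 1]

model = LogisticRegression()
model.fit(X_train, y_train)  # "learn from experience", i.e. from past data

# A new applicant: the model outputs a classification based entirely on
# the patterns it found in the historic data.
new_applicant = [[6, 75]]
print(model.predict(new_applicant))        # e.g. [1] -> predicted "good at the job"
print(model.predict_proba(new_applicant))  # the model's confidence in that prediction
```

The point for governance is that the model’s behaviour is entirely a product of the historic data it was trained on, and that is exactly where bias can creep in.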

The benefits of AI

AI is generally a force for good:

1.     It can automate a process and save time

2.     It can optimise the efficiency of a process or function (often seen in factory or processing plants)

3.     It can enhance the ability of individuals – often by speeding up processes

Where do data protection and AI intersect?

An explanation of AI-assisted decisions is required: 

1.     If there is a process without any human involvement; and

2.     If it produces legal or similarly significant effects on an individual – e.g. not getting a job.

Individuals should expect an explanation from those accountable for an AI system. Anyone developing AI models using personal data should ensure that appropriate technical and organisational measures are in place to integrate safeguards into processing. 

What data is in scope?

  • Personal data used to train a model
  • Personal data used to test a model
  • On deployment, personal data used or created to make decisions about individuals

If no personal data is included in a model, AI is not in scope for data protection. 

How to approach an AI project?

Any new AI processing of personal data would normally require a Data Protection Impact Assessment (DPIA). The DPIA is useful because it provides a vehicle for documenting the processing, identifying the privacy risks and setting out the measures or controls required to protect individuals. It is also an excellent means of socialising the understanding of AI processing across an organisation.

Introducing a clear governance framework around any AI projects will increase project visibility and reduce the risks of bias and discrimination. 

Where does bias/discrimination creep in?

The Equality Act 2010 prohibits behaviour that discriminates against, harasses or victimises another person on the basis of any of these “protected characteristics”:

  • Age
  • Disability
  • Gender reassignment
  • Marriage and civil partnership
  • Pregnancy and maternity
  • Race
  • Religion or belief
  • Sex
  • Sexual orientation. 

When using an AI system, your decision-making process needs to ensure, and you need to be able to show, that it does not result in discrimination.

Our Top 10 Tips

  1. Ask how the algorithm has been trained – the “black box” excuse isn’t good enough
  2. Review the training inputs to identify possible bias with the use of historic data
  3. Test the outcomes of the model – this seems obvious, but it isn’t done regularly enough (see the sketch after this list)
  4. Consider the extent to which the past will predict the future when training a model – recruitment models will have an inherent bias if only based on past successes
  5. Consider how to compensate for bias built into the training – a possible form of positive discrimination
  6. Have a person review the model’s outcomes when they are challenged, and give that person genuine authority to challenge them
  7. Incorporate your AI projects into your data protection governance structure
  8. Ensure that you’ve done a full DPIA identifying risks and mitigations
  9. Ensure that you’ve documented the processes and decisions to incorporate into your overall accountability framework
  10. Consider how you will address individual rights – can you easily identify where personal data has been used or has it been fully anonymised? 
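
As a rough illustration of tip 3, here is one simple way to test a model’s outcomes for bias: compare selection rates across a protected characteristic and flag a large gap. The outcomes, the grouping by sex and the 0.8 threshold below are all assumptions made for illustration; real fairness testing is considerably more involved.

```python
# Hypothetical sketch: compare a model's selection rates across groups.
# The outcomes below are invented; in practice they would come from running
# your trained model over a representative test set.
from collections import defaultdict

# (protected characteristic, model decision) pairs: 1 = shortlisted, 0 = rejected
outcomes = [
    ("female", 0), ("female", 1), ("female", 0), ("female", 0),
    ("male", 1), ("male", 1), ("male", 0), ("male", 1),
]

selected = defaultdict(int)
total = defaultdict(int)
for group, decision in outcomes:
    total[group] += 1
    selected[group] += decision

rates = {group: selected[group] / total[group] for group in total}
print(rates)  # e.g. {'female': 0.25, 'male': 0.75}

# A crude "four-fifths"-style check: flag any group selected at well below
# the rate of the most-favoured group. The 0.8 threshold is an assumption.
best = max(rates.values())
for group, rate in rates.items():
    if rate < 0.8 * best:
        print(f"Possible bias: '{group}' selected at {rate:.0%} vs best rate of {best:.0%}")
```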

In summary

AI is complex and fast-changing. Arguably, the governance around the use of personal data is having to catch up with the technology. Even when these models are seen as mysterious and difficult to understand, a lack of explanation for how they work is not acceptable.

In the future, clearer good-governance processes will have to develop so that organisations understand the risks and find ways of mitigating them, ensuring that data subjects are not disadvantaged.

Data Subject Access Request Guide

Being prepared and handling DSARs

Handling Data Subject Access Requests can be complex, costly and time-consuming. How do you make sure you’re on the front foot, with adequate resources, understanding and the technical capability to respond within a tight legal timeframe?


This guide aims to take you through the key steps to consider, such as…

  • Being prepared
  • Retrieving the personal data
  • Balancing complex requests
  • Applying redactions & exemptions
  • How technology can help

Managing Erasure Requests or DSARs via Third-Party Portals

January 2022

Do organisations have to honour them? Well, it depends…

Over the past few years GDPR, the California Consumer Privacy Act (CCPA) and other privacy regulations have led to specialist companies offering to submit Erasure or Data Subject Access Requests (DSARs) on behalf of consumers.

These online portals say they want to help people exercise their privacy rights, while enabling them to make requests to multiple organisations simultaneously.

Companies on the receiving end of such requests often receive them in volume, and not necessarily from consumers they even know. Requests can quote swathes of legislation, some of which may be relevant and some of which won’t apply in your jurisdiction.

If you haven’t had any yet, you may soon. Companies like Mine, Privacy Bee, Delete Me, Revoke and Rightly all offer these services.

They don’t all operate in the same way, so be warned the devil is in the detail.

How third-party portals work

Okay, bear with me – as I said, there are different approaches. They may use one, or a combination, of the following elements:

  • Offer to simply submit requests on the individual’s behalf, then the consumer engages directly with each organisation
  • Offer people the opportunity to upload their details and proof of ID, so the portal can submit requests on their behalf without the consumer needing to validate their ID each time.
  • Provide a bespoke link which organisations are invited to use to verify ID/authority. (Hmmm, we’re told not to click on links to unknown third parties, right?)
  • Allow consumers to select specific named organisations to submit requests to
  • Make suggestions for which organisations the individual might wish to ‘target’
  • Offer to scan the individual’s email in-box to then make suggestions about which organisations are likely to hold their personal data. (Again, really? Would you knowingly let any third party scan your in-box?)

Is this a good thing? Does it empower the consumer?

On the surface, this all seems fairly positive for consumers, making it simpler and quicker to exercise their privacy rights.

For organisations, these portals could be seen as providing an easier way of dealing with rights requests in one place – and perhaps a more secure way of sharing personal data, for example when responding to a DSAR.

I would, however, urge anyone using these portals to read the small print, and any organisation in receipt of these requests to do their homework.

Why it’s not all straight-forward

The following tale from one DPO may sound familiar…

We tend to find these requests slightly frustrating and time-consuming. First, we have to log all requests for our audit trails. We cannot simply ignore the requests otherwise this can cause regulatory issues, not to mention if they are genuine requests.

More often than not, they are sent in batches and do not contain the information we require to search and make the correct suppression. Where we do have enough information to conduct searches, we often find the personal details do not exist on our database.

Another concern is whether the requests are actually meant for us. We recently received a number of requests for a competitor, who was clearly named on the requests. When we tried to contact the portal to explain this issue, we did not get a response and were essentially ignored, which leaves us in a predicament – do we continue with the request, and was it actually for our organisation or not?

So, there’s a problem. Requests might be submitted on behalf of consumers the organisation has never engaged with. Requests can arrive with insufficient information. We can’t always verify people’s identity, or the portal’s authority to act on their behalf. In these circumstances, do people genuinely want us to fulfil their Erasure or Access request?

What does the ICO say about third-party portals?

The regulator does reference online portals in its Right of Access guidance. It tells us we should consider the following:

  • Can you verify the identity of the individual?
  • Are you satisfied the third-party has authority to act on their behalf?
  • Can you view the request without having to take proactive steps (e.g. paying a fee or signing up to a service)?

The ICO makes it clear it would not expect organisations to be obliged to take proactive steps to discover whether a DSAR has been made. Nor are you obliged to respond if you’re asked to pay a fee or sign up to a service.

The Regulator says it’s the portal’s responsibility to provide evidence of their authority to act on an individual’s behalf. If we have any concerns, we’re told to contact the individual directly.

If we can’t contact the individual, the guidance tells us we should contact the portal and advise them we will not respond to the request until we have the necessary information and authorisation.

This all takes time…

This is all very well, but for some organisations receiving multiple requests it is incredibly time-consuming. Some organisations are receiving hundreds of these requests in a single hit, as Chris Field from Harte Hanks explains in ‘You’ve been SAR-bombed’.

In addition, we need to do our research and understand how the portal operates, checking whether we believe it is bona fide or not.

Another DPO, whose company receives around thirty privacy requests from third-party portals a month, says: “Often these tools don’t provide anything more than very scanty info, so they all require responses and requests for more info”. This company takes the following approach: “We deal with the individual if it’s a legitimate contact detail, or we don’t engage.”

It really is a question of how much effort is reasonable and proportionate.

We must respect fundamental privacy rights, understand third-party portals may be trying to support this, but balance this with our duty to safeguard against fraud or mistakes.

Are Data Subject Access Requests driving you crazy?

January 2022

Complicated. Costly. Time-consuming...

… And driving me crazy. We’ve all heard the dreaded words, right? I’d like a copy of my personal data.

Which led me to think; is the fundamental privacy right of accessing our personal data becoming part of our increasingly litigious culture? The DSAR is now a staple opening shot for law firms handling grievance claims or employment tribunals, looking for potentially incriminating morsels of information.

Of course, this right must be upheld, but is the process fit for purpose? Employee-related requests, in particular, can entail a massive amount of work and the potential for litigation makes them a risky and complex area.

For some organisations, this is water off a duck’s back; they’ve always had access requests, anticipated volume would increase after GDPR, have teams to handle them, invested in tech solutions, have access to lawyers and so on.

Great stuff, but please spare a thought for others.

Plenty of businesses have lower volumes of DSARs. They’re unable to justify, or afford, extra resources. These guys are struggling under a system that assumes one size fits all.

Then there are businesses who’ve never even had a DSAR. For them, just one request can be an administrative hand grenade.

Of course some businesses are guilty of treating employees badly, but I wish things could be different. It’s about getting the balance right, that most elusive of things when creating regulatory regimes. Are the principles behind the DSAR important? Of course. Can the processes be improved? Definitely!

So be warned – here begins a micro-rant on behalf of the smaller guys. I’m feeling their pain.

What’s that sound? It’s wailing and the gnashing of teeth

It’s clear from our Privacy Pulse Report that DSARs are a significant challenge facing data protection professionals. One DPO told us:

“Vexatious requests can be very onerous. Controllers need broader scope for rejection and to refine down the scope, plus criteria for when they can charge… In my view, the ICO should focus on helping controllers to manage complex and vexatious DSARs.”

Some access requests are straightforward, especially routine requests where ‘normal’ procedures apply. However, some requests are made by angry customers or disgruntled ex-employees on a mission… and there’s no pleasing them. A troublesome minority appear to submit DSARs because they’re angry and want to cause inconvenience, but they don’t go so far as to fall under the ‘manifestly unfounded’ exemption.

Anyhow, for all those of you out there dealing with this stuff, know that I feel your pain. Without any further ado…

My THREE biggest DSAR bugbears (there are others)

Everything!

We’re entitled to a copy of ALL our personal data (to be clear, this doesn’t mean we’re entitled to full documents just because our name happens to appear on them somewhere).

It’s true organisations are allowed to ask for clarification, and the ICO’s Right of Access Guidance provides some pointers on how to go about this.

Yet that tiny glimmer of hope is soon dashed – we’re told we shouldn’t seek clarification on a blanket basis. We should only seek it if it’s genuinely required AND we process a large amount of information about the individual.

Furthermore; “you cannot force an individual to narrow the scope of their request, as they are still entitled to ask for ‘all the information you hold’ about them.”

Why?

Let’s take the hypothetical (but realistic) case of an ex-employee who believes they’ve been unfairly dismissed. They worked for the company for 10 years, they submit a DSAR but choose not to play along with clarifying their request. They want everything over a decade of employment.

Do they really need this information? Or are they refusing to clarify on purpose? Is this a fair, proportionate ‘discovery process’? As I’ve said before, large organisations may be better placed to absorb this; it’s the not-so-big ones who can really feel the pain. And in my experience, much of the personal data retrieved after hours of painstaking work isn’t relevant or significant at all.

Emails!

I get conflicted with the requirement to search for personal data within email communications and other messaging systems.

On the one hand we have the ICO’s guidance, which to summarise tells us:

  • personal data contained within emails is in scope (albeit I believe GDPR has been interpreted differently by other countries on this point);
  • you don’t have to provide every single email, just because someone’s name and email address appears on it;
  • context is important and we need to provide emails where the content relates to the individual (redacted as necessary).

If you don’t have a handy tech solution, this means trying to develop reasonable processes for retrieving emails, then eliminating those which won’t have (or are highly unlikely to have) personal data within the content. This takes a lot of time.
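
If you are building that process by hand, a first pass might look something like the sketch below: scan an exported set of messages for the requester’s name or email address and set aside the matches for human review and redaction. The data structure and matching rules are my own assumptions for illustration; real mail stores, name variants and attachments make this far harder in practice.

```python
# Hypothetical first-pass filter over exported email messages.
# Each message is assumed to be a dict with 'from', 'to', 'subject' and 'body'.
def find_candidate_emails(messages, requester_name, requester_email):
    """Return messages that mention the requester, for manual review/redaction."""
    name = requester_name.lower()
    address = requester_email.lower()
    candidates = []
    for msg in messages:
        text = " ".join(
            str(msg.get(field, "")) for field in ("from", "to", "subject", "body")
        ).lower()
        if name in text or address in text:
            candidates.append(msg)
    return candidates

# Example usage with invented data:
mailbox = [
    {"from": "hr@example.com", "to": "manager@example.com",
     "subject": "Jane Doe - probation review", "body": "Notes on Jane's performance..."},
    {"from": "it@example.com", "to": "all@example.com",
     "subject": "Server maintenance", "body": "Downtime this weekend."},
]
for msg in find_candidate_emails(mailbox, "Jane Doe", "jane.doe@example.com"):
    print(msg["subject"])  # only the first message is flagged for review
```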

Why am I conflicted? In running a search of your email systems for a person’s name and email address, you’ll inevitably retrieve a lot of personal data relating to others.

They might have written emails about sensitive or confidential matters, now caught within the retrieval process. Such content may then be reviewed by the people tasked with handling the request.

I suspect this process can negatively impact wider employee privacy. Yes, we’re able to redact third-party details, but by searching the emails in the first place, we’re delving into swathes of other people’s personal data.

It seems everyone else’s right to privacy is thrown out in the interests of fulfilling one person’s DSAR.

It also makes me wonder; if I write a comment that might be considered disparaging about someone in an email, do I have any right to this remaining private between me and the person I sent it to? (Even if it wasn’t marked confidential or done via official procedure).

I know many DPOs warn their staff not to write anything down, as it could form part of a DSAR. I know others who believe they’re justified in not disclosing personal data about the requester, if found in other people’s communications. Which approach is right?

Time!

Who decided it was a good idea to say DSARs had to be fulfilled within ‘one calendar month’?

It wasn’t! This phrase led to the ICO having to offer this ‘clarification’:

You should calculate the time limit from the day you receive the request, fee or other requested information (whether it is a working day or not) until the corresponding calendar date in the next month.

If this is not possible because the following month is shorter (and there is no corresponding calendar date), the date for response is the last day of the following month.

If the corresponding date falls on a weekend or a public holiday, you have until the next working day to respond.

This means that the exact number of days you have to comply with a request varies, depending on the month in which an individual makes the request.

For practical purposes, if a consistent number of days is required (e.g. for operational or system purposes), it may be helpful to adopt a 28-day period to ensure compliance is always within a calendar month.

I hope you got that.
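
For what it’s worth, the rule can at least be automated. The sketch below is simply my reading of the ICO wording quoted above, not an official calculation: take the corresponding date in the following month (or the last day of that month if there isn’t one), then roll forward past weekends and whatever public holidays you supply.

```python
import calendar
from datetime import date, timedelta

def dsar_deadline(received, public_holidays=frozenset()):
    """One calendar month from receipt, per the ICO wording quoted above,
    rolled forward to the next working day if it lands on a weekend/holiday."""
    year = received.year + received.month // 12
    month = received.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    # Corresponding date in the next month, or the last day of a shorter month.
    deadline = date(year, month, min(received.day, last_day))
    # Weekend or public holiday: respond by the next working day.
    while deadline.weekday() >= 5 or deadline in public_holidays:
        deadline += timedelta(days=1)
    return deadline

# Hypothetical examples (the bank-holiday date is a placeholder):
print(dsar_deadline(date(2022, 12, 2)))                      # 2023-01-02
print(dsar_deadline(date(2023, 1, 31)))                      # 2023-02-28
print(dsar_deadline(date(2022, 12, 2), {date(2023, 1, 2)}))  # 2023-01-03
```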

Wouldn’t it have been easier to have a set number of days? And perhaps a more realistic timescale?

Let’s take a hypothetical (but realistic) case: you receive a DSAR on 2nd December. You can’t justify an extension as it isn’t unduly complex.

Yes, I know you’re with me; bank holidays and staff leave suddenly means the deadline is horribly tight.

I wish there was a specific number of days to respond. I wish they excluded national bank holidays and I wish there was a reprieve for religious festivals. I know, I’m dreaming.

DSARs and UK data reform

Is the UK Government going to try and address the challenges in their proposal to reform UK data protection law?

The consultation paper makes the right noises about the burden DSARs place on organisations, especially smaller businesses.

Suggestions include introducing a fee regime, similar to that within the Freedom of Information Act. One idea is a cost ceiling, while the threshold for responding could be amended. None of this is without challenges. There’s also a proposal to re-introduce a nominal fee.

On the latter point, GDPR removed the ability to charge a fee. You may recall prior to 2018 organisations could charge individuals £10 for a copy of their personal data.

Many will disagree, but I think the nominal fee is reasonable. I realise it could be seen as a barrier to people on lower incomes exercising a fundamental right. However, my view is that organisations wouldn’t be forced to charge. It would be their choice. They would also be able to use their discretion by waiving the fee in certain situations. It makes people stop and think: ‘do I really want this?’

Whatever transpires, I truly hope some practical changes can be made to support small and medium-sized businesses. Balancing those with individual rights isn’t easy, but that’s why our legislators are paid the big bucks.

And here, dear reader, endeth my rant!