September 24, 2024 11 mins

Spoken (by a human) version of this article.

In a previous article, we discussed algorithmic fairness, and how seemingly neutral data points can become proxies for protected attributes.

In this article, we'll explore a concrete example of a proxy used in insurance and banking algorithms: postcodes.

We've used Australian terminology and data. But the concept will apply to most countries. 

Using Australian Bureau of Statistics (ABS) Census data, this article aims to demonstrate how postcodes can serve as hidden proxies for gender, disability status and citizenship.

About this podcast

A podcast for Financial Services leaders, where we discuss fairness and accuracy in the use of data, algorithms, and AI.

Hosted by Yusuf Moolla.
Produced by Risk Insights (riskinsights.com.au).


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Yusuf (00:00):
This article was published in September 2024, and
it's titled "Postcodes: hidden proxies for protected
attributes".
The background to this is that we've seen postcodes used in a
variety of circumstances, and so this article is meant to help
explain why postcodes are actually potential proxies for
protected attributes and why they need to be carefully

(00:22):
considered.
So here we go.
In a previous article, we discussed algorithmic fairness
and how seemingly neutral data points can become proxies for
protected attributes.
In this article, we'll explore a concrete example of a proxy used
in insurance and banking algorithms: postcodes.

(00:43):
The Universal Postal Union uses the generic term postcode to
describe this addressing system, and it's used by more than a
hundred countries.
Certain specifics, like how granular a postcode area is, may
vary.
The term used to denote postcode also varies slightly.
So there's a list of those in the article.

(01:04):
Postcode in Australia, Malaysia, Netherlands, New Zealand and UK.
Postal code in Canada, South Africa and Singapore.
I can't properly pronounce this, but something like Postleitzahl
in Switzerland and Germany.
Postnummer in Sweden, Denmark and Norway.
Zip code in the United States.
I think it's Eircode in Ireland.

(01:25):
Kode Pos in Indonesia.
Code postal in France, although the French would probably say
that a little bit differently.
Código postal in Spain, Código de Endereçamento Postal in
Brazil, Yūbin bangō in Japan and Postal Index Number in
India.
Most of this naming variation is due to language differences

(01:48):
or there might be some subtle differences in wording.
In this article we'll use Australian terminology and data,
but the concept will apply to most countries.
Using Australian Bureau of Statistics, so that's ABS, Census
data from 2021, this article aims to demonstrate how
postcodes can serve as hidden proxies for gender, disability
status and citizenship.

(02:11):
Gender ratios: not as uniform as you might think.
The overall gender ratio in Australia is about 50-50.
It's 50.7 percent female, 49.3 percent male.
Pretty much one to one.
But individual postcodes can show surprising variations.
Now there's a visual based on Power BI in the actual article.

(02:33):
I'm going to try to explain it. You might just have to go to the
article to see it; there's a couple of visuals like that as
we go through.
Considering that visual:
based on 2021 census data, if we use 55 percent as the cutoff
point, so that's approximately 10 percent above the average, or
below the average, depending on how you want to think about it,

(02:53):
we have 200 postcodes that are either more than 55 percent male
or more than 55 percent female.
And that's out of the 2,640 postcodes in Australia.
So that means 200 out of the 2,640, which is around
7.5 percent of the postcodes in Australia, were not gender

(03:16):
balanced.
And just under 200,000 of the 25 million people in Australia
actually lived in those 200 areas.
And the ratios can be quite significant.
So they can go all the way from 70 percent female to 100 percent
male.
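
As a rough sketch of the cutoff check just described, assuming a simple table of female and male counts per postcode: the postcode labels and counts below are made up for illustration and are not ABS figures, and the function name is hypothetical. Only the 55 percent cutoff and the 200 out of 2,640 figure come from the episode.

```python
# Hypothetical per-postcode counts as (female, male); not real ABS data.
SAMPLE_COUNTS = {
    "A001": (5200, 4800),
    "A002": (1100, 1500),
    "A003": (30, 70),
    "A004": (700, 300),
}


def imbalanced_postcodes(counts, cutoff=0.55):
    """Return postcodes where either gender's share exceeds the cutoff."""
    flagged = {}
    for postcode, (female, male) in counts.items():
        total = female + male
        if total == 0:
            continue  # no residents counted, so there is no ratio to compute
        female_share = female / total
        if female_share > cutoff or (1 - female_share) > cutoff:
            flagged[postcode] = round(female_share, 3)
    return flagged


flagged = imbalanced_postcodes(SAMPLE_COUNTS)

# Share of postcodes over the cutoff, using the counts quoted in the episode.
share_of_postcodes = 200 / 2640
```

Note how the made-up postcode with only 100 residents swings furthest from 50-50, which is the small-population effect that exaggerates the percentages.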

(03:37):
Now, those extremes are usually in areas with low populations,
where the percentages can be exaggerated. But if we look at
the largest of those, that's postcode 2010 in New South
Wales, of the people who live in that postcode, 11,000 are female,

(03:59):
15,000 are male.
So that's a 42% to 58% ratio.
And so it's not only smaller postcodes that have this issue,
as demonstrated by the 177,000 people that live in those 200
areas.
So 180 postcodes have a higher proportion of males, and this is
perhaps due to male-dominated industries.

(04:22):
There are 20 postcodes that show a higher proportion of females,
and this could be influenced by a range of factors, so for
example retirement communities with longer female life
expectancy.
Okay, so that's gender.
Let's look at disability ratios across postcodes.
The census collects data on core activity need for assistance.

(04:43):
This serves as an indicator of disability, a proxy for
disability if you like.
I know we're talking about proxies of proxies, but just
bear with me for a sec.
Analysis of this data reveals significant variations across
postcodes.
The average across Australia is 6%, but for individual
postcodes, the ratio can range from 0% all the way up to 59%.

(05:08):
While 59% is an outlier,
there are many postcodes in the 10 to 15% range, which is, again,
significantly above 6%.
So for example, 11% of people in
postcode 4655, Queensland, need core assistance, versus 4
percent in postcode 6164,
Western Australia. And each of those have 000 people in them.

(05:35):
These differences can be attributed to various factors,
including proximity to specialized healthcare
facilities,
availability of accessible housing, and socio-economic
factors influencing health outcomes.
The last item we're going to talk about in this article is
citizenship status.
So we spoke about gender, then disability.
Let's talk about citizenship status.

(05:57):
Is postcode a predictor of citizenship status?
Citizenship status is a protected attribute.
It's illegal to discriminate against someone based on their
citizenship status.
It may also be a proxy for race and ethnicity, which themselves
are protected attributes.
Australian citizenship status varies significantly across

(06:17):
postcodes.
On average, 11 percent of the 2021 census respondents
identified as non-citizens.
11 percent on average.
More than half of all postcodes, with more than half of the
country's population, had non-citizenship ratios that were
much higher, greater than 16%, or much lower, less than 6%,

(06:43):
than the average.
Remember, the average was 11%, so anything above 16 percent
could be considered much higher, or is, for the purposes of
this article anyway. Anything below 6 percent could
be considered much lower.
So for example, as reflected in the third visual in
the article, in the postcode with the largest population in

(07:05):
Australia, that's Victoria 3029, 25 percent of the population
were not Australian citizens.
So in that postcode, the total population is just under
130,000: 90,000 citizens, 32,000 non-citizens.
That's quite a significant difference from the average.
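
The banding used here can be sketched as follows, as a minimal illustration only. It assumes the 11 percent average and a 5-point band on either side, which gives the 6 and 16 percent cutoffs mentioned in the episode. The function and constant names are hypothetical, and the counts passed in are the rounded figures quoted for postcode 3029, so the computed rate lands near, not exactly on, the 25 percent stated.

```python
NATIONAL_NON_CITIZEN_RATE = 0.11  # average quoted in the episode
BAND_WIDTH = 0.05                 # assumed band, giving the 6% and 16% cutoffs


def citizenship_band(citizens, non_citizens):
    """Return (non-citizen rate, band label) for one postcode."""
    total = citizens + non_citizens
    if total == 0:
        raise ValueError("postcode has no counted residents")
    rate = non_citizens / total
    if rate > NATIONAL_NON_CITIZEN_RATE + BAND_WIDTH:
        return rate, "much higher"
    if rate < NATIONAL_NON_CITIZEN_RATE - BAND_WIDTH:
        return rate, "much lower"
    return rate, "near average"


# Rounded counts quoted for postcode 3029: 90,000 citizens, 32,000 non-citizens.
rate, band = citizenship_band(90_000, 32_000)
```

With those rounded counts the postcode falls in the "much higher" band, consistent with the description above.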

(07:28):
Postcodes with higher proportions of non-citizens
might be characterized by things like proximity to universities
attracting international students, areas with seasonal
worker programs, or suburbs that are popular among expatriate
communities.
In contrast, postcodes with higher citizen ratios might
reflect established suburban areas with multi-generational

(07:51):
Australian families or regions with fewer employment
opportunities for migrants.
What are the implications for algorithmic fairness of all of
what we've said?
So the variations in disability, gender and citizenship ratios
highlight a critical issue in algorithm design.
If postcodes are used as input variables, as they are, they can

(08:14):
inadvertently introduce biases related to these protected attributes.
For example, a lending algorithm using postcode data might
unfairly disadvantage applicants from areas with higher
disability ratios.
A hiring algorithm could perpetuate gender imbalances by
favoring candidates from postcodes with specific gender

(08:35):
ratios.
An insurance pricing algorithm that uses postcode data might be
discriminating, illegally or unfairly,
against immigrants.
So how do we mitigate this postcode bias?
To address these hidden biases, there are a number of things
that we need to do.
First, we need to be aware of the potential for postcodes to

(08:58):
act as proxies for protected attributes.
We conduct thorough analyses to identify correlations between
postcodes and sensitive variables.
We consider using alternative geographic identifiers when
appropriate.
We implement fairness constraints that account for
postcode-based variations in protected attributes.

(09:21):
And we regularly audit our algorithms for unintended biases
introduced by geographic data.
Those are some of the things that we can do.
It's not a holistic list.
But by recognizing the complex information encoded in
postcodes, we can work towards creating fairer, more equitable
algorithms that serve all members of society regardless of
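
One of the steps above, checking for correlations between postcodes and sensitive variables, might be sketched like this. It is a minimal illustration under stated assumptions, not a prescribed audit method: the postcode-level figures are invented, the 0.5 review threshold is arbitrary, and the variable names are hypothetical. It computes a plain Pearson correlation between a postcode's share of residents needing assistance and the approval rate a lending model produced for that postcode.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical postcode-level data, one entry per postcode (not real figures):
# share of residents needing assistance, and the model's approval rate there.
disability_share = [0.03, 0.04, 0.06, 0.11, 0.14]
approval_rate = [0.82, 0.80, 0.78, 0.69, 0.61]

r = pearson(disability_share, approval_rate)

# Illustrative threshold only; it is not a legal or regulatory standard.
needs_review = abs(r) > 0.5
```

A strong correlation like the one in this made-up data would not prove unfair treatment on its own, but it would be a prompt to look more closely at how postcode feeds into the model.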

(09:44):
where they live.
We need to look beyond surface-level variables in data analysis
and algorithm design.
As we strive for fairness and equity, we must remain vigilant
about hidden proxies that exist within our data sets, including
the humble postcode.
That's the end of the article.
This is quite a complex subject.

(10:07):
There are various pieces of legislation, pieces of guidance,
actuarial outputs that explain these things in a bit more
detail.
As we've said before, there are reasonable circumstances that
can be used.
And then there are other variables, factors, that may not

(10:30):
be as reasonable and just directly relate to protected
attributes with no mitigation.
And what we've outlined in this article simply illustrates that
we need to give these things a bit more thought before using
them.
And if we don't need to use them, we might be better off.
I know there's various arguments for and against completely

(10:55):
eliminating variables. At a minimum, we need to understand
the impact that they may have, educate our people across the
supply chain of algorithms about these, and strive to eliminate
the biases that exist in those algorithms.

(11:16):
Thanks for listening.