May 31, 2021 36 mins

What's the difference between a commercial data science project and a Data Science project for social benefit? Often so-called Data Science for Good projects involve a throwing together of many people from different backgrounds under a common motivation to have a positive effect.

We talk to a Data Science team that was formed to tackle the unemployment crisis that is coming out of the pandemic and help people to find excellent jobs in different industries for which they have a good skills match.

We interview Erika Gravina, Rajwinder Bhatoe and Dehaja Senanayake about their story helping to create the Job Finder Machine with the Emergent Alliance, DataSparQ, Reed and Google.

Further Information

Some links above may require payment or login. We are not endorsing them or receiving any payment for mentioning them. They are provided as is. Often free versions of papers are available and we would encourage you to investigate.

Interview date: 25 March 2021
Recording date: 13 May 2021

Intro audio Music 4 Video Library (Patreon supporter)

Thanks for joining us in the DataCafé. You can follow us on twitter @DataCafePodcast and feel free to contact us about anything you've heard here or think would be an interesting topic in the future.

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Jason (00:01):
Welcome to the DataCafe.
I'm Jason.

Jeremy (00:04):
And I'm Jeremy. And today we're talking about data
science for good.

Jason (00:23):
Okay, cool. So what is data science for good, Jeremy?

Jeremy (00:26):
So this really comes from, you know, where we find
ourselves today, I think, and it's become quite a movement in
recent years, but especially around the pandemic, of course,
yeah. And we've seen lots of initiatives, lots of events spun
up by people trying to use data, trying to get value from data,

(00:50):
and then trying to use data science, of course, to generate
some level of socially beneficial outcome, some good in society.

Jason (01:00):
A real noble cause then, where we've got so much data,
and we've got experts out there. In a lot of the data science
that we tend to do or talk about, the applications
usually have an industrial application or commercial
application. But we're talking here about a more charitable
cause or a more noble cause where you're giving back

(01:21):
somehow.

Jeremy (01:22):
Yeah, a moral cause, I think. Absolutely. I think one
where you're not doing it necessarily for your
job, although that can, of course, have a moral dimension
as well. No reason why not. But for this one, maybe
you're coming together, possibly with people you haven't met
before, or who, you know, aren't in your normal social or work team.

(01:46):
And you're developing a product, using some publicly available
data sources maybe, to generate something interesting,
and hopefully impactful.

Jason (01:57):
Yeah, impactful actually is the real word, right? It's
got some impact and some use for somebody out there who maybe
can't afford to buy it, but it's of impact, of use. Yeah, it is
great. And there's loads of cases out there.

Jeremy (02:12):
So one of the ones that I became aware of quite early on
in the pandemic last year was a couple of projects that were
spun up by the Royal Society in the UK, a scientific
organisation, and they started two projects, the RAMP
project and the DELVE project, which are great acronyms, so hat

(02:33):
tip to them.

Jason (02:34):
They're two different projects.

Jeremy (02:36):
Yes, yeah, related. So RAMP stands for Rapid
Assistance in Modelling the Pandemic and DELVE was Data
Evaluation and Learning for Viral Epidemics. So one was more
data focused, the other was more modelling focused, but I mean,
both, arguably, are data science projects at some level, and

(02:56):
they brought together totally disparate sets of academics,
organisations, companies, charities, and, you know, tried
to get some operating insight from data that they were able to
get hold of, you know, sometimes from government,
sometimes from publicly available sources. Yeah. You
know, I mean, RAMP had a mission to try to get out of lockdown

(03:20):
more quickly. Right? What is our exit strategy?

Jason (03:23):
Yeah. And we can see why that's such a strong impact. And

Jeremy (03:26):
Yeah, two lockdowns later.

Jason (03:29):
Yeah. And it's amazing, even when you said it there, of
using publicly available data, how much publicly available data
there is, and how many opportunities there are for any
of us with skills or interests or curiosity to take some of
the publicly available data and use it and go in with that
question: what impact can I make?

Jeremy (03:49):
I think so. You see datasets that come out of
government and sort of open data initiatives that are supported
by local government, central government as well. You've got
news organisations then sourcing their own data and then
putting it on the website, and often they're in eminently
scrapable form, and can then be picked up and people can

(04:12):
start using them to spot patterns and make deductions
around these areas. Exactly. So you were involved with a charity
hackathon, weren't you, Jason, a couple of years ago. Do you want
to tell us about that?

Jason (04:25):
Well, we had an initiative to bring together the
team in a hackathon for the social good aspect, and it was
with the Missing People charity, who came in and worked with us and, you
know, they had cleansed, anonymized data and we looked at
how we can apply some nice statistical methods to segment

(04:45):
certain risk factors for where they might put their efforts in
targeting risk groups who, you know, may go missing.
You know, missing is kind of the umbrella term for any
reason that somebody disappears, yeah, for their own reasons or

(05:06):
because they've been forced to go into hiding or whatever it
might be.

Jeremy (05:11):
Did you feel it was a different sort of team-working
experience then, working on this project for Missing People,
over your other experiences of doing data
science in a company setting?

Jason (05:23):
There is an element of, yeah, collaboration and
camaraderie that just gets kind of emphasised when you're
working on something that everybody is inspired a little
bit by, and has some passion or drive to make sure that it's as
impactful as it can be. Yeah, because you know that there's a

(05:45):
feel-good factor as well as a product to deliver that you've
kind of promised. You said, do you want to do this, so you
really put your heart and hard work into it.

Jeremy (05:57):
Yeah, we're lucky enough today to be talking to some
people who participated in one such project for an organisation
called the Emergent Alliance, which was set up by Rolls Royce
and Google and other organisations. I think there's 50 organisations
involved in this alliance now.
And this was an organisation that was set up to try
(06:20):
and address challenges surrounding the pandemic, not
just in the teeth of it, but also how on earth you recover
from it. And I think that's where they really, really put
their efforts in, in trying to drive use of data to help
people recover. So we were lucky enough to speak to Erika and
Rajwinder and Dehaja, who were contributors to one of these

(06:45):
projects.
I'm joined in the DataCafe today by three people. I'm very
excited to be talking to Erika Gravina, Rajwinder Bhatoe and
Dehaja Senanayake. Welcome.
Welcome to the DataCafe. Thank you. Hello, good. Thank you so
much for joining us. So I'm particularly excited to be

(07:10):
talking to you today about a project you're just coming to
the end of, and one that I think will chime with the
listeners quite a lot, which is a project to do with sort
of social good and around the pandemic. So Erika, could you
just give us an overview of the project and what you're hoping
to achieve with that?

Erika (07:31):
Sure. So we got onboarded onto the project around
September 2020. And the project for us started off as being
about thinking of ways that we can help people think about the
job market in a way that isn't as fixed and stationary as maybe
the way that they think about it now. And what I mean by

(07:52):
that is that due to the pandemic, of course, there has
been a lot of unemployment. And what we were trying to achieve
was to try and think about the industries within the job
market in a way that was more flexible, and to then allow
job applicants to move across industries more fluently. And in

(08:13):
order to do so we focused on a representation of jobs that
wasn't just to do with kind of the structure of a job
description, but to think in terms of skill sets. And I
suppose that's kind of the overarching story of the whole
project.

Jeremy (08:28):
Brilliant. So let's just take each of you in turn. So
Raj, how did you get involved in this in the first
place? What was your pathway into the team then,
before the project?

Rajwinder (08:40):
So our team came about when we took part in the Code
First Girls data hackathon back in September. It was a week
long, and we were analysing some data based on the economic and
environmental impact of the pandemic. And thankfully, we

(09:02):
ended up winning and our prize was to join this project. But
before that, I had taken a Code First Girls Python course and
when it was recommended, I applied and here we are.

Jeremy (09:17):
Brilliant. Dehaja, was this a team that you got
together before that initial project, or was
the team put together on day one to tackle the first challenge?

Dehaja (09:30):
No, the team was put together on day one, basically.
So I didn't know Erika or Raj.
Yeah, we didn't know each other beforehand. Which, it doesn't
feel like we didn't, because I think we've become
actually quite good friends over the course of working on that
challenge but also doing, like, the Reed project. Yeah, I think
Erika messaged us and just said, I'll be your team leader for the
challenge. Welcome. Would you like to share any of your, like,

(09:54):
hobbies or anything? And I think we all found that we all quite
like to cook and eat and things.
So we bonded over that, and then eventually over coding, but
it's been really good.

Jeremy (10:07):
Raj, you talk about coding experience. How much
coding experience did you have, Dehaja, before
you started this? Was this new to you? Or was this something
you've done a little bit of before, or a lot of?

Dehaja (10:17):
Yeah, it was really new to me. So like Raj, I've done
one of the Code First Girls courses in Python. But that was
my only experience in Python beforehand. So kind of thrown in
a little bit at the deep end, but I've had pretty supportive
people around to help me when needed, so that's been good.

Jeremy (10:36):
Erika, what about yourself?

Erika (10:37):
Yeah, it was a real roller coaster for me as well.
It was definitely kind of a new environment. And I had been
selected to be team leader. And I didn't put myself forward for
not thinking that I was actually going to get the position. So I
think it was definitely kind of an interesting start to the
project, like focusing on getting the group to get to know

(11:01):
each other, and actually, like, enjoying having many calls and
discussing both the problem and our personal lives. So I think,
you know, looking back, I'm really happy that we focused on
the social aspect of it as well, because it made the whole
experience a lot more fun. And in terms of coding experience, I
had some experience through university, but I never did an

(11:22):
official course of any kind, and not through CFG. That was my
first experience being involved with CFG.

Jeremy (11:29):
And then in this project, you joined a sort of
loose collection of teams that formed part of what's called the
Emergent Alliance. And you were asked to tackle this really
super relevant, interesting problem around jobs, finding
jobs in a pandemic. So, Raj, how did you

(11:50):
sort of go about doing this?

Rajwinder (11:52):
I think, right at the start, we had a couple of
meetings with the wider team, kind of thinking about what the
aim was of the project, because, of course, lots of people had
lost their jobs, unfortunately.
And there was a lot of push to get people to kind of use their
skills to apply to jobs that were currently in demand. And

(12:16):
that's what they were kind of focusing on from the company's
perspective, that they wanted people to be applying to things
that they had the skills for, but maybe they hadn't thought of
applying to. But those were currently in demand. So they
would have the opportunity to go to those.

Jeremy (12:39):
So it wasn't jobs that they might have done
previously, necessarily. They could have been roles that
match their skill set, but weren't ones they might
have considered before. Is that right?

Rajwinder (12:51):
Yeah. So I think there was a lot of discussion
about how the current process of applying to jobs doesn't really
take into account your skills, necessarily. You're kind of just
looking for a job title and applying to those. However, we
kind of wanted something that focuses on people picking out

(13:13):
their skills, their best skills, and kind of searching for those.
And I think that was what kind of came out of those design
thinking workshops we had at the start.

Jeremy (13:23):
So Dehaja, in tackling the problem, what sort of
avenues did you go down? You've got this
really interesting approach of not
doing a sort of traditional CV scan, I guess, on an individual
applicant. What was the pipeline on this, for how you
thought you might tackle this initially?

Dehaja (13:41):
So we first kind of looked down the avenue of having
kind of sets of skills, and those were related to a certain
job title. And then we looked at how we can match skills that
a user enters to the skills related to each of those

(14:02):
job titles. So you could have skills that vary across many
industries. And that's kind of like the whole aim of the
project. So, as Raj has mentioned, it's to try and see
where maybe the gaps are that you haven't thought of. And,
yeah, so in order to do that, we went down kind of the matching

(14:23):
route. I don't know how technical to go, but we kind of
created matrices of the skills and job titles, and we matched
those together. And it was really interesting for me. I had
never worked with natural language processing, NLP, before.
So that was new and, you know, tokenizing, and stemming and

(14:46):
kind of the differences between lemmatization and stemming. And we
had quite a few interesting conversations about the benefits
of both of those, and then we ended up looking towards kind
of creating a network or a cluster of job titles. And yeah,
that's kind of where we got up to, but I think I'll hand over to

(15:06):
Erika to talk about the network of job titles and clustering
because she focused on that.

Erika (15:11):
Yeah, I think everything that Dehaja said is
essentially kind of the groundwork that went into it.
And I think that without everything that happened in the
beginning, in the first part of the MVP, the network approach
wouldn't have come out of it. I think there was definitely a lot
of interesting work and thought, like thinking out loud, like

(15:33):
during discussions with the team about what parts of the process
were working and what wasn't working. And I think a lot of it
came down to the data that we had available. I think we had an
idea of what data we were gonna get. And then the more we were
looking into the data we actually had on hand, we
realised that what we initially set ourselves out to do wasn't

(15:55):
gonna work as well as we thought it was. So I think,
yeah, it's definitely interesting, you know, for
whomever might be interested: instead of having a corpus of,
like, a job description and a job title, we had very specific,
almost like, you know, single string inputs with just a few

(16:15):
words within it, that were representative of the skill
sets, such as the skill strings, I suppose. And those were
connected to job titles, but it was a very different format
than the way that one could think about the kind of job
search and the job data around.
So when we actually ended up looking into the network
approach, that was a way of trying to extract as much

(16:35):
information as possible from the data that we had. So it was a
very, very data-specific approach. And it was very
interesting. It kind of led to this idea of being able to think
about job titles with respect to their sets of skills, and use a
set of skills to create connections between the job
titles, and these percentage matches across job titles were

(16:59):
created by looking at the sets of skills. So it was a very
back and forth approach between the two elements, I
suppose. And this is kind of where the network came out.
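
As a rough illustration of the network idea Erika outlines, the sketch below links job titles whose skill sets overlap and labels each link with a percentage match. The episode does not say how the percentage was computed, so the Jaccard overlap used here is an assumption, as are the toy `TITLE_SKILLS` data, the `threshold` parameter and the function names.

```python
from itertools import combinations

# Hypothetical toy data: each job title with its set of skill strings.
TITLE_SKILLS = {
    "care assistant": {"communication", "teamwork", "record keeping", "empathy"},
    "customer service advisor": {"communication", "teamwork", "problem solving"},
    "warehouse operative": {"stock handling", "record keeping"},
}


def percent_match(skills_a, skills_b):
    """Shared skills as a percentage of all skills in either set (Jaccard overlap)."""
    if not skills_a and not skills_b:
        return 0.0
    return 100.0 * len(skills_a & skills_b) / len(skills_a | skills_b)


def build_network(title_skills, threshold=10.0):
    """Return {(title_a, title_b): percent match} for pairs at or above the threshold."""
    edges = {}
    for a, b in combinations(sorted(title_skills), 2):
        score = percent_match(title_skills[a], title_skills[b])
        if score >= threshold:
            edges[(a, b)] = score
    return edges


def neighbours(title, edges):
    """List the titles connected to `title`, strongest match first."""
    linked = []
    for (a, b), score in edges.items():
        if title == a:
            linked.append((b, score))
        elif title == b:
            linked.append((a, score))
    return sorted(linked, key=lambda item: -item[1])
```

With this toy data, `neighbours("care assistant", build_network(TITLE_SKILLS))` suggests the customer service role first and the warehouse role second, which is the kind of cross-industry suggestion the project was aiming for. A graph library could replace the plain dictionary of edges once clustering is needed.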

Jeremy (17:12):
I think it's amazing, the idea that you're
not doing something fairly traditional around this. You're
not doing a sort of search for a particular title, because
that's exactly what people would have done
previously. You're trying to encode the sort of serendipitous
nature of discovering a job that you are qualified for, but you

(17:33):
didn't know you were qualified for before you started
searching. So how did that play out? And what does the user journey
sort of look like for that?

Rajwinder (17:43):
So we got to the approach where a user would
input the skills that they think they have, but we also
wanted to be able to include skills that maybe they hadn't
directly mentioned, but were related to the skills they

(18:05):
input, because I think people can undersell themselves, and
maybe not be completely specific about all the skills they have.
So let's say they're collaborative. But
they're also good at working in groups and such. So we wanted

(18:28):
to be able to kind of have as many skills as possible to
search for. So I think one of the more useful ways to do that
was we kind of used job titles that were exactly the
same, and grouped together their skills. So you would assume that

(18:50):
if the job title is the same, the kind of skills you'd need
for them were also the same. But if they were written in, say, a
different manner, or used different wordings, you'd be
able to capture that by aggregating those together. And I
think we wanted to be able to match as many people to the
correct jobs as we could.
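
The aggregation step Raj describes can be sketched as follows: pool the skills of adverts that share a job title, then use the pooled sets to add related skills the user did not type in. The `ADVERTS` records and both function names are hypothetical, and treating titles as identical after lower-casing and whitespace clean-up is an assumption about what "exactly the same" meant in practice.

```python
from collections import defaultdict

# Hypothetical advert records: (job title, list of skill strings).
ADVERTS = [
    ("Care Assistant", ["collaborative", "record keeping"]),
    ("care assistant ", ["good at working in groups", "empathy"]),
    ("Warehouse Operative", ["stock handling"]),
]


def aggregate_skills(adverts):
    """Pool the skills of adverts whose titles match after simple normalisation."""
    pooled = defaultdict(set)
    for title, skills in adverts:
        key = " ".join(title.lower().split())  # assumed notion of "same title"
        pooled[key].update(skill.strip().lower() for skill in skills)
    return dict(pooled)


def expand_user_skills(user_skills, pooled):
    """Add every skill that shares a pooled job title with something the user entered."""
    entered = {skill.strip().lower() for skill in user_skills}
    expanded = set(entered)
    for skills in pooled.values():
        if entered & skills:
            expanded |= skills
    return expanded
```

For a user who only enters "Collaborative", `expand_user_skills(["Collaborative"], aggregate_skills(ADVERTS))` also returns "good at working in groups", because both phrasings appear under the same pooled title. This is one simple way to soften the effect of people underselling or under-describing themselves.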

Jeremy (19:12):
I see. So the process was one of learning from the
sort of collection of jobs and their associated skills as to
what those likely patterns would be, not for one job, but for many
jobs of hopefully a similar type. But, interestingly,
you mentioned that there was a concern over how
people might describe themselves when they're giving them.

(19:34):
They're entering the skills. And I understand you were
working on some of the bias elements in there. Was there a
concern that there might be a bias even at
the very start of that process,
in how someone might describe their own skill set?
Dehaja (19:50):
Yeah, so you've kind of hit the nail on the head there.
So that's where we ended up seeing where, I guess, the
majority of bias could come from. So if I described myself
as, say, self-assured, but maybe not confident, those two skills
are quite similar, but maybe one would be captured more than the

(20:12):
other. And so we did a bit of investigation into bias
in the data set. And interestingly, there was a
significant amount of bias, in the sense that we used a list of
masculine words and a list of feminine words. And we
explored the data set for the occurrences of each of those

(20:32):
words. And the occurrences of those words was reasonably high.
So it was something that we were happy that we
investigated, just to be aware of the levels of bias in the
data set. But also, we've kind of thought about what we can do
to reduce the bias, the effect of the bias even on the user.

(20:54):
And one of the main things that's good about the tool is
that it kind of captures these additional skills in the
process, which could reduce the level of bias associated with
someone inputting certain skills that may not be written in the
job descriptions themselves.
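
The bias check Dehaja mentions, counting how often words from a masculine list and a feminine list occur in the data set, might look something like the sketch below. The two word lists are short placeholders invented for this example, not the lists the team used, and `gendered_counts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Placeholder word lists for illustration only; not the lists used in the project.
MASCULINE = {"competitive", "confident", "dominant"}
FEMININE = {"collaborative", "supportive", "empathetic"}


def gendered_counts(texts):
    """Count tokens that appear in each word list, plus the total token count."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            counts["total"] += 1
            if token in MASCULINE:
                counts["masculine"] += 1
            elif token in FEMININE:
                counts["feminine"] += 1
    return counts
```

Dividing the `masculine` and `feminine` counts by `total` gives a rough rate for each list that can be compared across job titles or industries. It only detects exact word matches, so it is a starting point for awareness rather than a measure of bias in itself.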

Jeremy (21:11):
So Erika, what would you say then, for the team or
yourself personally, maybe, what would you say was the
biggest challenge in the project?

Erika (21:18):
I think there were a lot of ideas at all times. Because
there were so many discussions with wider teams and people
having a lot of input. And it was really hard, I think, to
start to try and narrow down what was feasible and what
wasn't feasible, at least for the sake of this first

(21:38):
sprint. And I think that was definitely challenging from my
perspective, because I was always used to maybe working at
university or with much smaller teams, so the influx of ideas
was definitely more manageable.
But it was a real challenge. And I think we did a great job by
keeping having lots of calls and discussions and thinking out
loud about what we thought would have been the best approach

(22:01):
forward in the limited amount of time that we
had. And that was a real, new type of difficulty. And I think
we did a really good job with it.

Jeremy (22:12):
That's nice. So I'd like to finish with the same
question to each of you. And it is: if you could give advice to
somebody who was maybe in your shoes six months
ago, wanting to get into coding, wanting to get into data science
particularly, what advice would you give them?

Rajwinder (22:31):
Oh, I think the best way is to kind of get stuck in.
You can't wait for stuff like this, you've just got to try
something. There's lots of resources online. Such as, I
know Kaggle is a really good data science way to get started.
And with data science it's always a good start to start

(22:54):
with asking the question, figure out how to answer that question,
where you could get data for that question, whether there's
stuff already out there. And I think, also, I'd like to plug
Code First Girls, which is a great opportunity for girls to get
involved in learning Python and SQL. And I know me and

(23:16):
Dehaja took part, and it's a really good opportunity to do
so.

Jeremy (23:21):
Great. Dehaja, what about yourself? What advice
would you give?

Dehaja (23:26):
I think I completely echo what Raj has just said.
Definitely, if you're looking to get started, Code First Girls is
a great opportunity to do so. I think also, what I would add is
trying to build a little bit of a network with maybe other
people who are learning, because for me, it's been really helpful
to have people to ask questions to but also talk through, like,

(23:49):
problems. And I think you just learn so much from talking to
other people as well. So yeah, that'd be my one bit of advice
for people getting into coding.

Jeremy (23:57):
Brilliant. Erika?

Erika (23:59):
Aside from the brilliant answers that already came
through, I did have one more thing to add. I think,
personally, it's to not be too afraid of not knowing. Don't be
afraid to ask questions, any questions about anything that
you might be stuck on. And just don't be afraid to, you know,
think outside the box.
Because I think especially data science is such a great

(24:21):
tool to try things out in a very, like, nice and, you know,
simple way. Just keep the problem simple in your head and
play around with it.

Jeremy (24:33):
That's great advice. I love that. That's been
fantastic. I believe the tool's gone live. So we will put
a link to the live job search tool in the show notes
and people can have a play with that and have a go.
So I think that's been a really nice discussion. I'd
like to thank all of you, Dehaja, Erika, Raj. Thank you very much

(24:56):
for joining us today in the DataCafe, it's been great. Thank
you for having us. Thank you.

Jason (25:06):
The product that they've built really inspires me,
because they've straightaway recalibrated the job application
process in searching for jobs in a way I've never heard of
before, which is really cool, like, really innovative. Like,
what is my skill set? And now I will use that to go and find a
(25:28):
job, where every, you know, approach, certainly that I've
done, is: I want to be an insert-blank-here, right? Now I go and
read what it is by other people's definitions, and try and almost
sandwich what I think my skill set is into what has been
presented in a job description.

Jeremy (25:47):
Yeah, I should declare an interest myself, because I
was involved with this project.
And working with Erika and Dehaja and Raj was really
exciting, and was enormous fun.
And I think they've done a lovely job with this. And the
fact that they were thinking about it in a, you know, I think
Erika said in a flexible way, to allow people to, you know, move
(26:11):
between jobs that they might genuinely be totally skilled for,
that they may need no extra training to do, or
maybe a tiny bit of extra training or education to do, you
know, as everyone does online courses these days. So that's
certainly well within people's capability. And I think
thinking of the problem in a different way gives you so much

(26:36):
more flexibility to really sort of start again with a tool like
that and to go, right, well, if we're going to think of it in
those terms, if we're going to start with
skills, and we're going to look at how jobs define themselves by
those skills, then we can make some really interesting
assertions and suggestions and give people hopefully some

(26:56):
really useful ideas for what they might otherwise have not
considered and not thought about.

Jason (27:02):
Exactly. Yeah. And how you lay out those skills,
it's not so much about the job title, that's always a starting
point. And another point they made that was really
interesting to me was this difficulty in self-evaluating.
So yes, how do I know what skills I have? Because I'm very

(27:24):
biassed, you know, purely living my life in whatever skill set I
have, and what is my norm means I'm not necessarily calling out
what is a skill, because it's just a normal behaviour to me,
learned or otherwise, and what is something that I want to
(27:45):
apply in a job, you know. So a very clear, like, distinction
would be having confidence in public speaking, which can be learned
or some people are naturally good at it, versus having the
ability to code, and code in a certain language and code in a
certain framework. And that has to be learned. I don't know many
people who are just, you know, prodigies straightaway at coding.

Jeremy (28:08):
Yeah, they set themselves a genuinely
challenging task here, because they first came to realise that
jobs aren't described, even in the title, in the same way. And
skills aren't described by everybody in the same way. Yeah.
So someone might say it's communication skills, someone

(28:28):
might call it presentation skills, you know, so
there's lots of ways of talking about these skills
and jobs as well. And then, as you said,
absolutely true, then you've got to somehow marry that up to the
way that an individual applicant might describe themselves and

(28:50):
the time limit or that bias. Right, right. And
then you've got that difficulty of your own
sort of preconceptions about your own skills, be that
ambitious or unambitious, really. And I think that, you
know, all of these are considerable challenges for a
project like that, and there's lots of hurdles at which you
could fall early on, trying to tackle that problem and trying

(29:13):
to make it, you know, all connect together when you've got
so many disparities there. There is no common lexicon, there
is no taxonomy that's officially used. I think they've
done themselves proud with this kind of project.

Jason (29:26):
Yeah. And even when you said there, that is such a
challenge in the lexicon, and then they were applying natural
language processing and programming techniques to try
and turn what's so difficult, you know, with the human biases
involved, and the various datasets out there, into logic
that can be modelled and trained and applied in certain ways.

(29:50):
Such a challenge. Yeah, really.
Yeah.

Jeremy (29:53):
I mean, we'll probably do another episode on bias
because it's such a hot topic and one that deserves
really close sort of investigation. But then I mean,
there's just a couple of instances in their project, I
think, that they had to tackle.
One was the bias, as we said, from the individual who was
describing their own skills. And another would be the bias of the

(30:14):
person writing the job specifications, saying, well,
in my mind's eye, I'm seeing someone who has these skills,
when, of course, in their mind's eye, they might be seeing
someone who was maybe a man, maybe in their mid 30s, maybe,
you know, all of that kind of thing going on. They wouldn't
put that down. But

Jason (30:34):
Yeah, exactly. Put scientist in the title and
they're wearing a white coat.
Yes.

Jeremy (30:39):
So I think that there's a lot to get your teeth
into if you're engaging in this kind of project. I
think it shows that AI for good and data science for good
doesn't mean it's a straightforward problem. In
fact, actually quite the opposite here. The data is not
necessarily there, or if it is, it can be in a very poor state

(31:01):
sometimes. And it can mean that you, you know, you don't have
recourse to go back to the person who gathered it and say,
well, can you do it better? Or can you do it differently? Or
can you give us another dataset which shows us this? Because
that just may not exist. You literally have to work with what
you have, and try and get something from it. So you have
to be quite careful and quite sort of disciplined with

(31:22):
yourself. But I think, you know, these are really quite
tough data science challenges.

Jason (31:27):
And we can see how it will get more attention as the,
you know, prototype is out there, and people start using it
and start thinking in this way, you know, that entrepreneurial
shift in mindset of what is it that we can build and use, and
it's built on people's experience. You know, they
talked about the pandemic driving a need for something

(31:48):
like this, because I need to understand what my skill set is
to match with what the needs are in whoever I'm looking to hire.
You know, whether they know it or not, they may not have
classified it right. A really, really great initiative.

Jeremy (32:06):
And also it does throw together people who would
otherwise never have worked together and never have
met under any other circumstances. So it does
create these nice serendipitous cross-network matches that you
would never otherwise have entertained or
made happen.

Jason (32:25):
This is great, because I love hearing about the bringing
together of different initiatives like this. So, you
know, I'm wondering, if somebody hears about this for the first
time, how can they get involved?

Jeremy (32:37):
I mean, lots of ways. There's hackathons
going on as part of data science drives all over the world. So
I'd encourage you to, you know, have a look online, and you
know, don't be afraid to have a go, really. You don't have to be
a massively experienced data scientist, in fact, quite the

(32:59):
opposite. You just need to have some good ideas and some
willingness to contribute and give your time.
And, you know, a bit of technical skill maybe, to just
help bring a team of people together. And when I talk to
people about this, if it hasn't worked, it's always

(33:20):
because there wasn't quite enough time to do
stuff. And I think, you know, just having a set
of people with a common goal, who are prepared to give it,
whether it's a day, or half a day, or a week or
however long it lasts for, but just be prepared to give
that time. So I mean, if you're a woman and you're looking to
get into this, and you've never coded before, like, you know,

(33:41):
they mentioned it in the interview, then Code First Girls
in the UK is a fantastic organisation that helps women
get into coding, and data science now, I see, is very much on
their radar. So I wholly encourage people to look at
that. But there's loads of boot camps and organisations who run

(34:02):
sort of quickstart courses on this sort of thing, which
would enable you to get into this.

Jason (34:06):
Yeah, really trying to reach out and build that network
is part of the advice that they gave, which is really key,
just to reiterate. It seems like nowadays as well, there's a lot
of access online. So you don't even need to be situated in a
tech hub, you know, like being in the centre of London, for

(34:28):
example. You can be anywhere and get involved across virtual
events, which is another kind of one of the side effects of the
pandemic, because we've seen this increasing accessibility to
things like this.

Jeremy (34:41):
Yeah, yeah, that's absolutely true. I don't think
Erika, Raj and Dehaja ever actually met in person. Well,
yeah, I think they've only ever met over Zoom or
whatever. So

Jason (34:52):
It kind of makes it a little more challenging as well
when you think about it

Jeremy (34:55):
As Dehaja said, you learn so much from talking
with people who may be from different backgrounds, that
you haven't met before, who are not in your network. And then
suddenly you're thrown together.
And I thought that was a really nice way of sort of
demonstrating the power of these, you know, social events
and social hackathons and data science for good in general. So

(35:18):
I think that was a really nice part of their
story. But the thing that stood out for me, and I think it
applies just as much in data science in industry as it
does in this setting, is that Erika, at the
end, said her piece of advice was that you shouldn't be
afraid of not knowing and you shouldn't be afraid of
(35:40):
asking questions. I think that's terrific advice for any aspiring
data scientist.

Jason (35:47):
Thanks for joining us today at the DataCafe. You can
like and review this on iTunes or your preferred podcast
provider. Or if you'd like to get in touch, you can email us,
Jason at datacafe.uk or Jeremy at datacafe.uk, or on Twitter at
datacafepodcast. We'd love to hear your suggestions for future
episodes.