September 6, 2023 71 mins

In our latest episode of the Digital Discovery Podcast, we're joined by Ed Peel, Co-Founder of Data Cubed. With a global clientele, Ed demystifies the complexities surrounding data governance, protection, and security. Many businesses believe they have a grasp on their data's origins and use. However, revelations often prove otherwise.

Notice: Before making any decisions regarding your ecommerce strategy, we strongly recommend you conduct your own in-depth research and seek expert advice.

Show Notes 👉 https://digitaldiscovery.group/podcast/discovery-podcast-s1e1

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
Oh, yeah.

(00:08):
Oh, yeah.
So on the new digital discovery shirts bro.
That looks sick.
You okay?
Yeah, this is three so.
Show me the writing just go all the way down to the bottom.
Yeah, I'm just gonna say move the mic out the way.
Yeah, great.

(00:33):
The, um, the funny thing is the package still has that Google admin sticker on it for some reason.
That's weird, isn't it? I think I uploaded that because it was the only asset I had at the time when I said it.
When you order merch, Ed, the packaging literally has a little logo and it says Google admin on it.
Really?
Yeah, it is. I'm gonna take a photo of it. It's funny.

(00:55):
Anyway, Ed, you'll get one of these beautiful mugs. I'll send one out to you. So just I'll tell you what Christian would be well impressed. You know, my business partner.
Okay, so so into merch.
All right, Eddie. Um, so a bit of an introduction, I think, is due.
So tell us all about yourself, your business, your career and where you're at at the moment.

(01:19):
So, uh, so I'm Ed and, um, about two years ago, I started a startup called Data Cubed with a couple of other people, based here in Auckland in New Zealand.
My other founder is based in Wellington in New Zealand, and then the third founder is based in the UK.

(01:49):
And so here we specifically do the New Zealand, Australia market.
We're a data company, but with the rest of the group we've got Data Cubed UK, Data Cubed Europe, and then us, Data Cubed Asia Pacific, I suppose.
But as I said, specifically, we deal with New Zealand and Australia at the moment.

(02:15):
It's interesting, you know, for people on the podcast, we were talking in the intro about some data things going on at the moment.
And we kind of do a modular thing, Nigel, where I would say, you know, because in data, it's always hard when people go, what do you do?

(02:38):
And so we have like this modular approach, we generally, when we first meet someone, we'll come in and we'll do a data maturity score with them.
And that is along the lines of, you know, what are you looking like on governance?
What are you looking like on data protection?

(02:59):
What are you looking like on data security? You know, those sorts of things.
People find that very useful and we are then able to kind of benchmark them and help them to see where they are, you know, in the scale.
And I have to be honest with you, most companies, you know, are trying to get into this and do better and, you know, get into a data led world.

(03:26):
Most of them are probably low on the scale at the moment.
Even the ones that actually think that they're good. You know, once we've done one of our assessments, I think they're often quite surprised, you know, to find that they didn't score as highly as they thought they would.
Yeah.
And then generally leading from that, we go into like a sort of data discovery module, which again sounds simplistic, doesn't it?

(03:54):
But it's literally like, you know, where's your data? Because we find in companies, small, medium and large, that they think they know where their data is, but they don't necessarily know.
And they think that they know what the key data is and what's driving the company and what's, you know, the key performance indicators.

(04:18):
And again, often seriously, we'll have people think that, you know, oh yeah, that stuff's in the data warehouse. You know, we know all about that.
Yeah, we just pull the data out. That just appears for us.
But it's Mary's spreadsheet, you know, she's cobbling together stuff from kind of 16 different branches.
And actually hers is the key piece of information, but you know, they think it's elsewhere.

(04:43):
Yeah. So in terms of, if you don't mind, I'm just going to jump in as we go. Yeah.
So in terms of, because I'm in data, I'm super interested in data.
And I really enjoy working with products like Tableau.
And I find it really easy to visualize the data that, you know, that gets produced by a particular department.

(05:13):
So, you know, in my career heading up e-commerce, you know, it generates a lot of data, you know, customer interactions, business interactions, internal, external, you know, point of sale systems.
Warehousing systems. So there's all these disparate systems that are generating really interesting data.

(05:36):
And I find that, you know, you can put them together as a one-off and visualize it really well, but it's very hard to, as you say, find Mary's spreadsheet again when she updates it.
You know, she might only update it on a Wednesday, but if she's not in on a Wednesday, it's done on a Friday.
And then all of a sudden, you know, you're running the report on a Wednesday. You don't know she's not there on a Wednesday.

(05:57):
And then next minute, all your numbers are wrong. I've seen similar things in, you know, sort of previous lives for me.
What are you finding when you're dealing with these businesses, and do you find that when you come in, as you say, their scorecard isn't as good as what they're expecting? They're expecting to get an A and you come in and it's a bit of a B or B plus.

(06:20):
Is there much of a jump between a B, say a B or a B plus to a to an A in your mind, like for a business? Is it generally the size of the business or is it that those that he would like?
No, it's not. What's a D? I mean, is a D like you've been hacked into the data on the dark web?
I don't know. A D is literally, you know, we can't kind of find any evidence that, you know, any of these criteria are being matched.

(06:51):
So, you know, I mean, I could go on and on guys, as you know, but like some examples are, you know, you've got a SQL server.
You've claimed that it's okay. We've patched it. It's got all the latest security patches applied. Yep. But you haven't rebooted it in a year. So there's no point in applying the patches in the first place.
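
The "patched but never rebooted" check Ed describes can be expressed as a simple comparison of timestamps. The following is a minimal sketch, not something described in the episode: the function name and the dates are hypothetical, and it assumes every listed patch needs a restart to take effect, which is not true of every patch.

```python
from datetime import datetime

def patches_awaiting_reboot(last_boot, patch_install_times):
    """Return the install times of patches applied after the last boot.

    Assumption: each patch only takes effect after a restart.
    """
    return [t for t in patch_install_times if t > last_boot]

# Hypothetical server: last rebooted about a year ago, patched three times.
last_boot = datetime(2022, 9, 1, 3, 0)
patches = [
    datetime(2022, 8, 15),  # installed before the last reboot, so active
    datetime(2022, 11, 8),  # installed after the last reboot
    datetime(2023, 6, 13),  # installed after the last reboot
]

pending = patches_awaiting_reboot(last_boot, patches)
print(f"{len(pending)} patch(es) installed since the last reboot")
```

In a governance review, a non-empty result would be the cue to ask when the next restart window is.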

(07:13):
You know, so it's kind of, it's that whole governance-type thing. If we bring it into, you know, the breaches that are constantly going on at the moment. I mean, we read about it.
Yeah, every week there's a new one. The big thing with Latitude Financial Services was that, and you know, I'm angry about this personally.

(07:35):
Yeah. It's that the data should not have been there in the first place. Yeah. So, you know, no data retention policies, you know, no cleaning out of the systems.
And so in actual fact, people like myself who had closed an account 10 years ago, like me, I think it was actually transacted with them 15 years ago.

(07:58):
You know, my personal details, date of birth, address, driver license, etc. So yeah, things, things like that. We also find Nigel and NDA knows this as well, you know, from work we've done in the past that people are very good at saying, you know, I got a security company in and they've secured the network and our firewalls and, you know, kind of

(08:27):
they've done this security assessment, etc. But again, we often find that what's actually left is the data. And it's almost you're educating people to have a reverse stance on it and to go, look, just assume that you're going to be broken into.
Now let's try and make this data pretty much useless to whoever has lifted it, you know, attack vectors shouldn't be allowed to get to data that is in plain text.
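
One way to act on "assume you're going to be broken into" is to avoid storing sensitive fields in plain text at all. The sketch below uses keyed hashing (HMAC-SHA-256) from the Python standard library to replace values with tokens. This is an illustration chosen for this transcript, not a method named in the episode; the key, the field names and the record are hypothetical, and where the original values must be recoverable, proper encryption with managed keys would be used instead.

```python
import hashlib
import hmac

def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a sensitive value with a keyed SHA-256 token (one-way)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key; in practice it would live in a secrets manager,
# never alongside the data it protects.
key = b"example-key-not-for-production"

# Hypothetical customer record.
record = {"name": "Jane Example", "driver_licence": "AB123456"}

# What lands in the database: tokens rather than plain text.
stored = {field: pseudonymize(value, key) for field, value in record.items()}
print(stored)
```

The same input always maps to the same token, so records can still be matched, but someone who lifts the table without the key gets nothing readable.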

(08:58):
Often the lack of data governance itself says, well, there's kind of a lack of a data strategy, because we can help with that and often assist in re architecting something that they already have, we often go in and we, we show people what a modern data platform looks like, and what a customer data platform looks like.

(09:23):
Or in other cases, there's a very, very large company that we work with in Australia. They brought us in because they had an excellent cloud platform, fit for purpose at the time when it went in.
But technology and concepts had moved along, and it wasn't (a) being run as efficiently as it could be anymore, so, you know, it was costing a lot of money, and (b) it wasn't as modular and modern as it could be.

(09:54):
And I often I have this conversation with people where I point out to them almost. Here is the amount of money that you are spending fighting your data platform, instead of an amount of money that you are spending to get value out the end of the data platform.
Yeah, that's right. Yeah, making being able to make decisions, sort of informed decisions on the data versus trying to maintain it. Yeah, spending money at the other end super important. Actually, it's funny you say that because it's in real world.

(10:26):
That's exactly what happens, isn't it? I mean, people or businesses put these systems in.
And I've seen it well and I and I've seen it before because we've worked together that they put them in and then they realize, well, the problem they're trying to solve back when that was put in no longer exists. Yeah.
So the tool is no longer required kind of thing and now it's the business has moved on is trying to solve different problems and try and solve. It's trying to solve different, you know, different kind of nuances around the data that it has.

(11:01):
And it just can't do it anymore. Like it just can't string the things together that it needs to because it just doesn't have the structure. Well, as you say, the modularity of these systems is super important.
The landscape has changed, you know, so in our time, you had, you know, a data lake and a data warehouse. Now with, you know, the kind of concept of a lakehouse, you know, there's no distinction really.

(11:32):
It's not, well, I either have to build, you know, a data warehouse or set of data warehouses, or I just want to go mining, so I want the data lake. Yep. Now there's modern modular systems that we use these days, you know, things like Databricks and so on, where in actual fact the lakehouse concept works well for any of those use cases.

(11:57):
Yeah, it's interesting, isn't it. And I think the other thing I noticed while using Tableau myself and doing that discovery on the data that you know the department that was running at the time was was generating was the often visualizing the data in a way that you can make make decisions quickly

(12:18):
is different in, you know, in terms of what you what you think you need to see versus what you actually need to see, and then the gap between what you think you need and what you actually need starts to close and I think that happens because a lot of a lot of people, generally speaking,

(12:40):
are more visual learners than what they are auditory or kinesthetic or you know there's different types of learning profiles right. Yeah, and I think being a visual learner, you know for for us as human beings is it's a lot easier to visualize.
And one thing I also found was it was easy to make things complicated. Yeah, the 60,000 reports. And it was hard to make them simple. Yeah, and I mean this is a philosophy that goes back, I mean, to the early days of Apple actually, but, you know, making things

(13:14):
easier, the UX, the UIs, reports, whatever it is, making them simple, the simpler the better. And if you can make exactly the same decisions on one or two pieces of useful information versus thousands of useful pieces of information, you're able to make those decisions a lot quicker and be more nimble.

(13:35):
I worked a lot on that concept to know where we, you know, for example to drive marketing decisions, stock and inventory decisions.
You know, it was funny. We had buyers at the time, I remember this clearly. We have thousands of people looking for iPhones, but we didn't have any stock of iPhones.

(14:02):
And I'm like, well, it's costing us $20 to acquire a customer. What if we bought an iPhone from somewhere else and sold it at a loss?
How much is that loss? If it's less than or equal to $20, well then, let's try and get some iPhones and sell them at net cost, or a net profit of zero. But in actual fact you've just gained a customer that you wouldn't have had before.
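
The break-even reasoning here is simple arithmetic: selling at a loss is worthwhile as long as the loss per unit does not exceed what it would otherwise cost to acquire that customer. A minimal sketch follows; the $20 acquisition cost is from the conversation, while the function name and the iPhone prices are hypothetical.

```python
def worth_selling_at_loss(unit_cost, sale_price, acquisition_cost):
    """True if the loss on the sale is no more than the usual acquisition cost."""
    loss = max(unit_cost - sale_price, 0)
    return loss <= acquisition_cost

# Hypothetical prices: buy a phone elsewhere for $1,000, sell it for $985.
# The $15 loss is below the $20 it normally costs to acquire a customer.
print(worth_selling_at_loss(unit_cost=1000, sale_price=985, acquisition_cost=20))

# A $50 loss would cost more than acquiring the customer the usual way.
print(worth_selling_at_loss(unit_cost=1000, sale_price=950, acquisition_cost=20))
```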

(14:25):
Yeah. And just to have simple concepts like that and to be able to visualize the data. Yeah.
And that's a really good example. Yeah.
You know we, we had a similar set we analyze data as well Nigel and we were helping a large insurance company out. And

(14:47):
without boring you to death with the details, the analysis basically was that
the lower-cost customer to get on board, so like, you know, in your example, say the $10, $20 per customer to get on board, was maybe potentially staying much longer with the insurance company.

(15:11):
And, you know, a better profit for them at the end of the day, you know, a better revenue earner. And they were doing these kind of marketing initiatives, you know, offering to do this and that and the other for people like me, or if you had a car or two cars or, you know, whatever. But the bottom line is that on the analysis, they found out that the customers that, in effect, they were giving more to, so they were maybe coming in at, what, kind of $100 per acquisition, you know, $110 per acquisition,

(15:47):
something like that. They would churn back out of the system again about a year later, because they'd take the deal. They'd come into the system. But then, you know, as soon as the deal was over, they'd churn back out and go looking for another deal from someone else.
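
The insurance example comes down to comparing what a customer brings in over their stay against what it cost to acquire them. The sketch below is illustrative only: the acquisition costs ($20 and $110) and the one-year churn echo the conversation, but the annual margin and the four-year retention figure are hypothetical numbers chosen to make the arithmetic concrete.

```python
def net_customer_value(annual_margin, years_retained, acquisition_cost):
    """Margin earned over the customer's stay, minus the cost to acquire them."""
    return annual_margin * years_retained - acquisition_cost

ANNUAL_MARGIN = 150  # hypothetical margin per customer per year, in dollars

# Low-cost acquisition, customer stays much longer (4 years is an assumption).
loyal = net_customer_value(ANNUAL_MARGIN, years_retained=4, acquisition_cost=20)

# Deal-driven acquisition, customer churns out after about a year.
deal_seeker = net_customer_value(ANNUAL_MARGIN, years_retained=1, acquisition_cost=110)

print(f"loyal customer: ${loyal}, deal seeker: ${deal_seeker}")
```

Under these assumed numbers the cheaper-to-acquire customer is worth far more, which is the pattern the analysis in the conversation surfaced.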
Yeah. Super interesting. Yeah. I mean when you talk about those types of acquisition costs.

(16:12):
Yeah. Sorry, I just have a question right like me being, I guess, the spanner guy in the back shoveling data to various users.
We love spanner guys on the back.
Working in a steam room. Just because you have access to Tableau and Power BI, does that make you an analyst? Because at the moment, you know, we do things like, you know, give them access to the platforms, opening up access to database systems so I can do this massive

(16:40):
data, then I got to worry about, you know, where the data is going, data encryption at rest in transit, etc. But I mean, is it.
If you give everybody all the full data sets. Can they actually figure it out or is like, how do you sort of say well this is the only data set you should have access to versus, here you go.
You're the analyst do whatever you want. And is everybody turning into analysts now.

(17:04):
No, I mean you're spot on and a right and that that is another fallacy that somehow.
Oh, it's okay, look, Microsoft have launched this cool Power BI and, you know, we get it free and it's awesome, you know, etc. And of course we all know you don't, because if you want to do something meaningful with it, you then suddenly have to pay for the Power BI

(17:25):
Pro licensing. But look, all of that aside. It is, you're right, there's this modern notion almost to pick up on Nigel's points about, oh look, you know, this is dead easy it's just like an Apple iPhone.
You still need to understand your data, you still need to understand your company. And I think that's, I'd like to think that's what we bring to the table is that, you know, we work with people really to kind of work out with them.

(17:54):
What's meaningful to you, you know, what's meaningful that can change the direction of this company at value, you know, change the way that you work whatever.
Make decisions.
And then we'll work on trying to construct something for you. Now as I said, unlike a report, you can still go drilling, and you can see why, you know, that in Nigel's terms you know why the lovely looking pie graph looks like that, you know, so you can go down behind the

(18:28):
surface and look at the figures and get an understanding of what's you know that looks up what's changed that. Why, why, you know, is that coming up as a big red quarter of the pie chart, you know.
But no, as you rightly pointed out, you can't just kind of go to somebody: hey, here's Tableau, here's Power BI, knock yourself out. You know, connect to this data lake and have a merry old time.

(18:58):
Because, yeah, it's almost like the self-service analytics concept, you know, that Tableau tried to push for a while, and then they realized fairly early. They tried to push it, they were introducing self-service analytics: oh, just connect to the data lake.
And here's your, you know, here's all the standard reports but hey you might want to create something beautiful for your department. And then they realized oh shit that's creating analytical anarchy I think they called it was their term.

(19:25):
It was interesting because in that same.
In that same in that same seminar that they were doing they also did describe that point of understanding the data and understanding the the the caveats let's just say behind each one of your data points you might bring into a report and knowing that you're analyzing

(19:48):
things in the right way.
I mean, just on that point, which is a really good one, about not creating analytical anarchy and creating an environment which gives, I suppose, people, you know, sitting in front of Power BI or Tableau, whichever tool it is.

(20:12):
You know, they, they quickly become empowered when they see how easy it is to generate really interesting things.
And they can get caught up in this.
I think, you know, they create tend to create.
You know, I sort of describe them as kind of like shrapnel, you know, from reports, and they then try to make decisions on that, and these reports are essentially just rubbish, they're just shrapnel from what was the original report, and so on.

(20:49):
And then in turn, I said I'm stealing yours, but we all know the old one, which is true: analysis paralysis. Oh yeah, yeah, yeah, they literally get caught in a loop. Yeah, I mean, countless times, I mean, I'm guilty of it myself, I'd be working on a report in Tableau for a day.
And you, all of a sudden, forget what you're actually doing.

(21:10):
You know, it's so funny. You're like, actually, what was I doing, starting off, looking at you know the the delivery in full on time reports from Australia Post for example for all of my ecommerce customers to now all of a sudden I'm looking at my, like, overcharges on, you know, delivery costs for a certain
suburb because somehow it's you know strung the data together and I never had that report before and then. So, you end up creating all this shrapnel but some of it's really, there's nuggets in there you know there's golden nuggets in there there's good good stuff that you can

(21:45):
get out of it, and I think having time to do this form of self-discovery when you do get access to, you know, a new system, if it's been put in by yourself and your company. And do you find that you sit there and you tend to try to rein a lot of that in and say, you know,
because I know with things like Tableau, and I've looked at this with Tableau Online, there's different, you know, different versions, and access levels and permissions can become this gigantic task in itself.

(22:15):
Just making sure that the right people have access to the right data.
And I think we can touch on this point later on I think but you know, having third parties coming in and helping you with that data and helping you with that discovery now that you guys have sort of cleaned up the data you've done your ETL processes you've got your business
policies in place you the data is nice and clean. Everybody understands it.

(22:41):
You know, do you, yeah, do you find that, you know, you have to sort of sit down and once you finish the project and you've handed it over.
Do you say right guys these are the, let's call it those sort of like the rules of engagement, you know, like before you start shooting people.
Can you just take a moment to understand, you know, these are the parameters that you work within. How does your company normally handle, you know, those situations?

(23:08):
Yeah, look, we, we work through. I mean part of that is working through with them what you know the operating model is, because again back to what and I said you know, it depends, not in small companies obviously but you know you might have data engineers, data

(23:29):
administrators you know like what we would have called database administrators dashboard, you know, visualization people, etc etc.
So part of it is also, we're always honest and upfront with everyone right, you know, one of the reasons we started it was because we kind of got fed up working in places where, you know, less than the truth was being told to us quite frankly.

(23:56):
And so, so no we're, we're very clear about, you know, this.
This is the kind of operating model that you would need to run this. And therefore here is the kind of people that you would need.
And again, one recent one I kept thinking of it now again another Australian one is a, you know, the gap analysis. So basically, we believe that actually you've got people in your team that are capable of these particular roles.

(24:31):
But we think you're a bit short here you know you've got no one who can like do the data modeling for you or something like that.
And so we will highlight things like that. And then, also, help them with it, because I'm not selling you everything, we don't do everything. As I said, in many respects you could still argue

(24:53):
we're the data engineers, the data plumbers. And so when it comes to some of these other areas that they want to cover, we will partner up with companies like yourself, Nigel, you know, and say that actually that's where the skills gap can be filled.
You know, that this isn't necessarily something where you should be employing people to do it anyway, because some of them are kind of not one off jobs but you know that they're, they're jobs that you do, then effectively you know you probably leave them

(25:30):
for about four or five maybe six months before you come back to do some adjustments or some changes, or whatever you know especially things like data modeling work right.
So no, that's where we help them: either we do believe that you've got the skills yourself, or that you can, you know, retrain people or whatever, or here are partners that we work with that we think can help you fill that gap.

(26:01):
Yeah, it's funny. You should say that because I've, I've been working and this is probably on the topic of just partnering up or contracting or going out like knowing your weakness and your limitations.
I've been in a few, quite a few organizations and I think it varies wildly like I've been in currently working with some organization now where they're, they're more than happy to outsource they understand we're understaffed.

(26:26):
We're not built for it. Let's not try and take this on ourselves, we've got plenty of good resources out there. You partner up with a good MSA, or whoever, or an MSP that can provide you those contacts and find the good people in the industry.
And then, you know, in stark contrast, you find people who've just clammed up, they've got the same guy that's been working there since 1985.
And they're trying to shoehorn all this modern type work into him. And he's just failing, and they don't, they don't actually recognize it because they don't even know if he's doing a good job or not.

(26:54):
I think as well. One thing I will point out to sort of switch it on its head slightly is that sometimes we are called in to, if you like assist with what I refer to as a hostage situation.

(27:15):
And it's the classic, you know, like, back to that original story, you know, the guy writing the script himself and, you know, don't worry, leave it to me, and I'm the only person who knows how SSRS runs, you know, Reporting Services.
I'm the only person who can write SQL code you know etc etc. We, we often get called in to help both improve the company and, you know, take them in a new direction a new strategic direction you know a new roadmap.

(27:51):
Often, they themselves before even getting in touch with us have realized that they're kind of being held hostage by a couple of individuals or maybe you know if you're smaller, a single individual.
And it's the single point of failure, you know that that again, you're carrying risk as a company, because back to where we originally started this podcast Nigel, you know, some companies are very good at realizing where they carry the risk,

(28:24):
and others aren't and can't actually necessarily see it.
And again, you know, big companies, big brand-name companies. We all know the case about banks, you know, banks running the COBOL systems where, you know, the team were all 70 years old because, you know, they've not migrated to anything else and nobody can find anyone who

(28:47):
knew the code anymore.
Oh god yeah, don't we know about that in the.
Okay, so one of the things I remember just around around the data piece is understanding the difference between reporting and analytics.
You know, reporting is you know you report on things for the sake of reporting but then you have analytics for the sake of, you know, understanding what's happening, I suppose, at more real time, you know.

(29:23):
Yeah, what's your what's your thoughts around that I mean because I usually dealt with analytics rather than reporting that reporting is a point in time of something that has essentially happened.
And analytics is.
I have a question.
And I would like to know what the answer is. Yeah, yeah.

(29:47):
And do you feel that, you know, tools like Tableau and Power BI are more analytical tools and they are actually about answering questions, rather than doing reporting? Because I found I use Tableau to build analytics.
And to, you know, in, and the online version obviously gives you sort of like a sort of a semi AI tool and we'll get on to AI in a minute I think but you know you could say okay what was the sales result for Friday, you know the 11th of August, and bang

(30:21):
and it gives you that gives you the results but you know you couldn't really say to it, show me a visualization of all of my customers from postcode 2000.
Yeah, it would be able to create that report on the fly, generate, you know what you wanted to see.
I feel my general feeling is, and using Tableau for for many many years.

(30:45):
I feel that it's going to get to a point where you just ask something to either Tableau or whatever system it is that has access to the data.
And all of the caveats and all of the data that has been, you know, in terms of governance, has been presented in a way that it understands this AI or whatever it is understands.

(31:10):
It's just going to be able to show you what you want to see. So, the, the thousands and thousands of hours that goes into generating these useless reports as you said that 60,000 reports which only, you know, 60 were being used.
Those situations won't happen anymore. It'll just be, you know, somebody sitting there, you know, a CFO saying, okay, show me, you know, all of my net new customers for the last seven days, and where they're from and how much they've spent, and bang, it just appears.

(31:43):
I just might bring us back into the cyber security realm right so obviously it's fresh topic for me.
In terms of data, and we talked about the difference between reporting and analysis, right. I'm working now, I've stood up a few SIEM solutions now, security, you know, monitoring: it monitors events and picks up anomalies.

(32:05):
So at the moment, the tools I've seen have a lot of nice stuff out of the box. It knows what to look for if there's an attack or an attempt to evade, you know, being detected across your systems. So I don't know if anyone's familiar with
what a SIEM is, right: a security information and event management, or monitoring, tool. So it gets all your data sources, it correlates it all together, and then it's able to detect if something funky is going on, if it's related to something else on another system.

(32:35):
But what I found is, with the tools I've been looking at, there is a lot of well-defined,
I guess, signals that it's looking for. But also there's a lot of things that a person, like, you know, someone in my position, where I actually have to think about, now knowing what it's looking for, ways to get around it.
You know, how would I counter it.

(32:58):
So, for me there's a bit of a degree of, yeah, there's cut-and-dried reports, it could probably do it by some parameters it's learned or what we've defined. But honestly, for me there's still going to be instances where, like, I'm working on both sides, and a red team and blue team is
going to figure out, well, you know, I know that a SIEM would be looking for something like this, so what's something really mundane where I can push traffic or hide my activity that a SIEM normally wouldn't be able to detect? Because, like, you know,

(33:31):
because you're extracting data out of a system essentially and storing it somewhere potentially. So you're talking about a report which is a point in time.
And if you're talking about basic operations yeah okay I'm using x amount of disk space a month looks normal. Yeah, anomaly would be used 80 gigs more than what you normally use that's an anomaly it shows up on a report AI can do it.
You know, I mean all the system anomaly detection do it now.

(33:53):
But how do you, how do you know that maybe it's incrementally growing.
Yeah, I mean, there's one extra data source that it doesn't normally get into. Like, I guess the built-in tools could probably pick that up as an anomaly, but it's a constant battle for me.
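
The two cases raised here, a sudden jump in disk usage versus usage that creeps up a little each month, need different checks. The sketch below uses only the Python standard library and is not how any particular SIEM product works: the function names, the thresholds and the usage figures are hypothetical. It shows a z-score test catching an 80 GB spike while missing steady growth, which a simple first-versus-last comparison does catch.

```python
from statistics import mean, stdev

def is_spike(history, latest, threshold=3.0):
    """Flag `latest` if it sits more than `threshold` standard deviations
    above the mean of the earlier readings."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return (latest - mu) / sigma > threshold

def drift_ratio(history, window=3):
    """Ratio of the average of the last `window` readings to the first `window`."""
    return mean(history[-window:]) / mean(history[:window])

# Hypothetical monthly disk usage in GB.
steady = [500, 505, 498, 502, 497, 503]
creeping = [500, 510, 520, 530, 540, 550, 560, 570]

# A sudden extra 80 GB stands out against a steady baseline.
print(is_spike(steady, 580))

# Slow growth never looks like a spike month to month...
print(is_spike(creeping[:-1], creeping[-1]))

# ...but comparing recent months with the earliest ones exposes the drift.
print(round(drift_ratio(creeping), 3))
```

The design point is the one made in the conversation: a point-in-time report catches the obvious jump, while gradual change needs a comparison over a longer window.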
There was a there was a system that I saw a few years ago called Black Lotus, I think, and it had, it had an AI machine learning proprietary thing built into it and you know I'm not a cyber security and I'm not going to start throwing buzzwords around.

(34:25):
And from what I saw from from a from a just a, you know, sort of high level point of view. I thought it looked pretty cool and, and this was in Singapore.
And one of the things it showed was for payment systems. It's very useful because with payment systems, you know, generally speaking, transactions are a certain packet size and all this kind of stuff.

(34:51):
And as soon as you have very large transactions going on the packet sizes of the information coming from different systems is different sizes and it starts to segment these things and it was actually incredible how it did it.
But the end result was it's this this this concept as I mentioned before it's hard to, it's easy to make things complicated it's hard to make it simple was just one simple report on a screen that basically showed a threat meter, and that was it.

(35:18):
And the cyber security guys were all sitting around sort of clapping and going wow that's awesome. It was just this one little needle, you know just going up and down showing what's going on in their system.
And obviously they all had propellers kind of spinning, you know kind of stuff going on.
But they were saying that they were using this concept for large banking systems, where, as Eddie was saying, there's these old COBOL systems and so on. Sometimes these COBOL systems, they're rock solid, but they don't like talking to anything else, they don't like consuming

(35:50):
anything outside of themselves as systems. So this is why they're all sitting around cheering and stuff, but that's exactly the point that we were raising in that conference: from a cyber security point of view, how do I see what is normal?
How do I see what is semi-abnormal, just to reduce the amount of false positives they're getting and have to investigate? Because otherwise you have huge teams looking at just red herrings all the time. You know, how do you define an anomaly, right,

(36:16):
like yeah yeah and the very definition of anomaly you know what is it, and how does it become that when you've got AI doing similar things as humans.
Essentially, yeah, yeah, super super interesting and they yeah good point. This is why I need massive amounts of data.
So, you know, don't forget, whatever.

(36:37):
Anything that we talk about at all, relating to AI.
It always needs huge amounts of data.
So to help with its training. And, you know, the big thing at the moment of course is ChatGPT. You know, show me two years' worth of data off systems that aren't being hacked, and everything's going okay.

(37:03):
Right now that I've trained you on two years of what the normal world looks like.
So I'm not sure if you see anything out of that pattern. Yeah, but this is the, the very challenge, right, because you're looking at a point in time part backwards in time.
If the business has grown to a point where it no longer recognizes that the new data is naturally different because it's now looking at real time data or data that is closer to real time.

(37:34):
So I don't have false positives on that because it's just now looking at different data sets.
Yeah, yeah. So there's all these caveats: the machine learning models and stuff need to understand the natural progression.
Oh yeah, the normalization going on in the data sets and so on, to make sure that you don't get these false positives.
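The drift problem raised here can be sketched in a few lines, assuming a simple z-score detector. A baseline frozen at training time starts flagging ordinary growth as anomalous, while a baseline that rolls forward with the data does not. The series, the window length, and the 3-sigma threshold below are hypothetical choices for illustration only.

```python
from statistics import mean, stdev

def is_anomaly(history, value, threshold=3.0):
    """Flag value if it sits more than `threshold` std devs from history."""
    mu, sigma = mean(history), stdev(history)
    return sigma > 0 and abs(value - mu) / sigma > threshold

def count_flags(series, train_len, rolling):
    """Count flagged points using a frozen or a rolling training window."""
    flags = 0
    for i in range(train_len, len(series)):
        if rolling:
            history = series[i - train_len:i]   # baseline follows the business
        else:
            history = series[:train_len]        # baseline frozen at training time
        if is_anomaly(history, series[i]):
            flags += 1
    return flags

# Hypothetical daily transaction counts: steady growth plus a small wobble,
# with no real incident anywhere in the data.
series = [1000 + 10 * day + (5 if day % 2 else -5) for day in range(60)]

fixed_flags = count_flags(series, train_len=20, rolling=False)
rolling_flags = count_flags(series, train_len=20, rolling=True)
```

With this data every alert from the frozen baseline is a false positive caused by growth alone, which is the caveat being described. A rolling window has its own trade-off: a slow, incremental attack can drift into the baseline and be treated as normal.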

(37:58):
Yeah, yeah, no super interesting.
One, one.
One thing I just wanted to mention just on the larger data volumes one thing I noticed with these different types of systems is, you know, the, the ability for a very basic user you know small business owner for example, who might have normally spun up an Excel spreadsheet

(38:26):
to analyze, you know, thousands of customers potentially they might have a database or whatever.
And the amount of raw compute power that that needs for you to you know to run macros or run, you know run automations or run filters or running, you know all of these things I've seen people in business tried to do everything inside a spreadsheet and Google Sheets

(38:51):
and then realize two years later I should I should have just gone straight to, you know, Google Cloud and used BigQuery or something like that.
One thing I've realized, you know, as a small business owner myself now, is I can analyze huge amounts of data in Tableau.
And all I need is a MacBook Air with an M2 processor and a little bit of memory, and it smashes millions of rows of data, which normally would have crashed if I tried to use Excel.

(39:19):
I mean, do you have, you know, do you have any thoughts around, you know, without the analysis paralysis coming in.
Do you think that this is some good thing or a bad thing or like, you know, what are your thoughts about that.
Look, it's not a good or a bad thing. It's all about education, isn't it? It's like, you know, again you're back to what's my kind of risk scenarios, you know, what's the difference between running something on Google Cloud, AWS, whatever, versus

(39:56):
some kind of, you know, nuclear power station of a laptop I've got. And I'll just, you know, I'll share with you the reason why I'm smiling so much in this segment is that Andy knows that I am currently on the podcast with you guys on my M2 MacBook Pro, which is

(40:19):
the one that's been the electricity grid for New Zealand while we're talking.
Yeah, yeah it's it's amazing what small business owners tried to even, you know, just generally like once you get you don't think you just do it you just start doing it once you can.

(40:40):
And you realize you can you never stop to think should you.
That's probably my point. That's exactly it. Yeah, because so many times I've done it and then I've realized, and I've put it back into Excel, and I've realized Tableau has double-counted rows or something, you know, and it's just it's not.
Unless you are an analyst, you know, and you're looking at the data and you're going back to your business model and you're just you're reconciling things and making sure those numbers are correct.

(41:07):
You can basically make some pretty bad decisions. I mean have you got any stories around businesses making bad decisions on data maybe I mean that that that might be interesting.
Let's not name names. Yeah, let's not name any names. I don't want to get sued, and I don't want you to get sued either, but yeah.
It is a, you know, a free country you can say what you want but it's all in your opinion of course.

(41:32):
You know then they've handed the data to someone or done this or such and such as had a breach and you know all the all the data has been stolen.
Well, like the police, I think at the moment they've got a big issue. There was a laptop which was sold and it had a spreadsheet of all the police officers' names, details, addresses.
And they mentioned in the news that police who were on that spreadsheet had moved out of their homes because they're worried about something to do with paramilitary attack.

(42:05):
In Northern Ireland, the Northern Ireland Police Service has managed to send a spreadsheet out with the name of every single serving police officer in the country, and their address, and all their details.

(42:28):
Yeah, yeah, okay, maybe, maybe this is my I saw it on the yeah so I caught the end of the story, because I believe, you know, because I was catching up myself and they said that, you know, one of the dissident army groups or something claimed that they now had that information.

(42:50):
And I literally I was just my mouth was hanging open, because I was going, you know that literally it when you just can't get your head around it. That's like any country going, oops, we just released every single police officer that works for us and where they live.
I mean, it just screams something ain't right. You know, it's just like the way to the truth is to ask the right questions, it's Socrates, you know. And I do remember there was that, and Andy, you're going to correct my facts on this, I know you're going to, this is why

(43:27):
I love podcasts.
I can't think of the ideas and you correct me. No, there was this phone. There's a company who created these phones, I think it was a Turkish company, and they had somehow distributed these between the Bandidos or one of the bikie clubs.
And the Australian Federal Police were actually listening in to all of their conversations on these phones, and they didn't realize that, you know, the information was there, the voices were being recorded, and they obviously

(43:59):
were tracking for a while and they cracked down. Now, this data breach, as I understood it, had to do with the Colombian cartels, the drug cartels and everything.
And somehow it's just, I feel like this is a bit of retribution. Now like, whenever I hear the police, you know, because they're on our side they're the good guys are trying to keep the drugs off the streets.

(44:21):
You know, being exposed, all of a sudden, and it just screams to me like, I don't know enough of the story. But
basically there was a, there's a device sold by the law enforcement agencies that was made up as a product, I think it's called ANOM, something like that. So basically they created this product, I think it was Android based, and they tatted it and sold it to the

(44:47):
criminals as a secure messaging app. And so they pretended to sell it, they said, oh, you know, you can do your communications within your organization, your organized crime network, and we can't, you know, it's uncrackable or whatever.
So after a while they distributed lots of these units to everybody and, lo and behold,
it was already hacked. Yeah, it was pre-hacked off the shelf. And I guess that's just a little bit of marketing and spin and then, you know, people snapping it up.

(45:18):
I believe it was around more than 800 people nabbed or arrested in that sting but there was.
Yeah, that was the one you're probably referring to.
It's not. Yeah, yeah that yeah it was I mean I thought it was brilliant. Be honest I mean what a fantastic way of, you know, sort of getting in there and getting the data that you, you know,
that they needed obviously it has to be, you know, in court there's, you know, you got there's so many rules you have to go by in order for you to present that information in court and has to be beyond reasonable doubt and everything else and so it was clearly the framework

(45:54):
of the data that was siphoning out of these phones was enough to appease a court of law. I think that made him sign in with Google first.
Yeah, yeah, yeah. I was just going to circle back to the bad data decisions. I mean, and we're going to name names here, who could forget the Robodebt scheme.

(46:21):
Yeah, you know I actually didn't follow that.
I would you go into that. Yeah, we heard about it over here as well. Yeah, you know, how on earth was that thought that that was a great idea.
Yeah, I believe I'm looking at this is Wikipedia. So, we're just quoting what I've seen on Wikipedia at this, at this time date and time. So, they came up with this idea that they'll use data to go and have automated recovery scheme so the systems would ring you

(46:52):
and go, hey, you owe us money.
And there were 470,000 wrongly issued debts to be repaid in full. And the biggest, the tragedy here is obviously there was a death, because somebody had committed suicide because of the crap, and apparently
he wasn't even supposed to be owing money on it. So I mean, exactly, that's what, you know, these government agencies, I just don't understand what people go through. You know, that's terrible.

(47:26):
Yeah, it shows you, it shows you this whole conundrum about data, because data itself is gold now. Yeah, yeah, it's power it's information in this podcast, you can use it for good things you can use it for bad things.
That's why everybody's kicking off about AI, you know, and AI, because we know that any of these technologies that, you know, we come up with can be used for good and can be used for nefarious activities as well.

(47:59):
But, yeah, I remember I followed that I followed some of the hearings as well. And, I mean, it was just awful. You know things like, well, you know, the memo went to you for approval.
Well, I never saw the memo because I was on holiday for two weeks. You know, but do you just let people loose with data. No, you don't give people the wrong data bad things are going to happen.

(48:29):
Yep, yep.
That's right.
One of the things we do as a company I'll just highlight to you is, we decided this right at the beginning when we started.
We never ever, ever have anyone's data.
We always work with the customers and you know, with their systems and so on. We never ever, ever hold any data.

(48:57):
Now, there are other companies, you know, as we know, like analytics companies and that that actually do say look, effectively give us your data will mine it for you.
We'll find the gold in there. And then, you know, you make some, some good wedge out of it.
And I'm not knocking that in this podcast either that's not what that's about. I'm just simply saying that we chose right at the beginning that we did not ever want to be holding any customers data.

(49:31):
Yeah, yep.
Is that not, Andy, from your point of view, is that something like from the vetting process? You know, if I'm a business owner and I say, you know, Eddie, come do some work with me, but first of all I want you to have a chat with Andy.
Yeah, yeah, I would think you would look at this as totally part of your vendor vetting process, which I think we touched on a couple of podcasts ago.

(49:57):
I mean you should just probably do do your due diligence ask the questions like if if I was to engage with Eddie's group and I'd say, you know, we're about to shift the whole bunch of database, shift the database over to be certainly I mean, certainly the IT department
would be asking what's your data policy what's your data handling policy.

(50:18):
What are your procedures in place to protect it where is it being stored. This is the questions that everybody should ask when you're handing stuff over, like I, when I sort of before I started to cyber sort of looking at privacy laws, etc.
And I think when I was going to go and click, this is JB Hi-Fi, biggest retailer.

(50:39):
They took a copy of my license I think it's for a click and collect or something like that right. Yeah.
And I went okay.
I don't care too much because I tend not to put myself out there too much but I at that point in time was freshman minus and hey, what are you going to do with it.
And credit to JB Hi-Fi at the time, the staff member rattled off exactly what they can do. Yeah, we take a photocopy, we hand yours back, we hold it here.

(51:04):
Once the transaction is done, at the end of the day it all gets shredded and put in a secure bin. They knew the process off the top of their head.
Yeah, either they can ask or they've been trained, and it's very well, and I went, great. So, we don't put it anywhere else, it just goes straight, we just use it to verify and then it gets shredded.
Perfect. That's what I wanted to hear. And so those are the sort of things you want to get from a company that you're going to deal with, especially if you're going to push data to them.

(51:28):
You should probably ask that question in general if you're going to have some sort of interconnect agreement and stuff like that you know how do you guys handle stuff on your site.
So absolutely, yeah. I mean, in this day and age with APIs and everything, you know, as you're saying, you don't know where your data ends up sometimes. Do businesses know where the data physically ends up? In that example you could

(51:49):
have, if they had maybe taken it with their phone, uploaded it to a system somewhere to mark off your thing, and then next minute that's stored on a database in the cloud somewhere and now it's available to an API query.
Yeah, which somehow gets exposed to the web.
Unfortunately, and now next minute your license is out on the web, you know, for sale on the web. You know, sometimes it is better just to photocopy things and shred them.

(52:18):
So it becomes part of that, you know, risk and reward process doesn't it. Let's talk about future predictions and AI and I'll start with what I think is a future prediction, and we can do some cutting but, and then I'll let you comment on how much bullshit you think
I'm talking.
So my prediction, again, I think, is you'll go: hey ChatGPT, tell me about the situation of my business at the moment.

(52:51):
What is what's, you know what's the short term and long term outlook, and then it's just going to spit out based on the data you have.
So at the moment.
I'll just rewind back a bit. So at the moment ChatGPT, these language models, they're all trained on big data sets. As you said, you need big, big amounts of data.
And one of the quotes I saw just recently was that the difference between humans and AI is that humans can make decisions on a small amount of data and AI needs a lot of data so if it doesn't have a lot of data, it's kind of useless, where humans have, you know,

(53:28):
artificial intelligence there's there's all these different things that come into play. When humans make decisions about certain things right. So, AI is only always going to be a tool to solve a problem.
So it becomes the human's decision then to work out what problems they're trying to solve. And then the AI can work out whether or not those problems that the human is trying to solve are the real problems.

(53:52):
My general feeling about Tableau and all these BI platforms, and to your point before when you mentioned it, does the average business owner become all of a sudden an analyst?
I think the role of an analyst, in my view, is one where it becomes sort of the gateway between a business wanting to ask the right questions to get, you know, the way to the truth, and that being, it may not come about the first

(54:27):
time. So it's like ChatGPT: you ask it a question, then you realize, oh no, I was actually trying to. It gives you the answer and then you realize, oh shit.
It's a bit like C-3PO: sorry, can you rephrase the question? You know, it's still very much like, you know, it's not going to pass the Turing test anytime soon.
You know, the human becomes, no, actually what you're trying to ask the data set is this, and that becomes the analyst role, where I think there will be a big shift between data science and data analysts or analyzing data and then the end user.

(55:04):
And I think AI is going to bridge that gap between the data scientists and the end user, and make the person in the middle so to speak that the analyst, more of a kind of proposition kind of analysts it's like you know how do you describe to work out what you're
trying to describe and then ask the AI, and then it's going to start to evolve from that where you can pretty much ask it anything, and it will start to refine down your question to a point where you start to answer the questions you're trying or problems you're trying to

(55:36):
solve. What are your thoughts about that it what do you think your your sort of future predictions around data and AI.
Well look, first of all, and we know this we've seen it through our lives so far, anybody who does predictions always gets them wrong.
So genuinely I'm always like, I have no idea what is going to happen in the future, because, you know, I love when I see old clips on YouTube now of science programs from the 1970s, you know, going, oh by the way, by 1975 you'll all have flying cars and this is what they look like.

(56:17):
So, seriously it's worth putting that caveat in there. Look, my view at the moment is this, that more than what we would generally have termed the job data analyst the role we know it today.
I think the, the role of prompt engineer is going to be big. Right, because it's all about, or certainly in current iterations you know it's all about well, how do you ask it how what way do you build the prompt.

(56:55):
And, and clearly, look the human still has to be there for context because, you know, we've all got stories like this you know you say to it something like, you know, tell me how to get profitable within the next five years.
And it says, okay, well, if you sack everyone in Queensland, you'll become profitable in one year.

(57:17):
Like, Elon Musk says that humans become like the house cat.
All your valuable people with who you love and you know are in your business and have made it what it is become the house cat.
Yeah.
So, yeah, we know that, like, you know, you can, you can ask it some things and then it can come back, and it can come back with something that is not a logical you know it makes sense but in the real world you go.

(57:52):
Yeah, but that's not a good thing for me to do. Right. So that's number one. Number two is, you almost did it as a little throw away phrase at the beginning.
Our dear friend that we're talking about is the most important thing data.
So it's like, where's the data. What data is it looking at the answer the question.

(58:14):
This is 2023.
And even in 2023.
I still have no good universal search available to me, where I can go.
Where is that PowerPoint Nigel sent me.
15 minutes go get a coffee. Come back grab a donut on the way. I mean it's highly productive, or did he call it data discovery PowerPoint.

(58:43):
I think he might have called it Nigel, I'm going to look for Nigel the point is you didn't even open it so it doesn't really matter.
You just read the title of it. So that AI, we should use the AI to actually rename documents, so then they can actually there's a business right there.
Renaming documents like Google wants to rename your document the first thing you put on a spreadsheet or a, or a doc but it's not actually what you want to name it.

(59:08):
And everyone, I guarantee, is nodding their heads and going, oh God, yeah, look, I'm there, I'm with you. Right, because you're going, is it in a Teams channel? Did he send it to me in Outlook?
Maybe it's on SharePoint or something. Did he, what happened to me.
You know, I think it was on Slack, I think he Slacked it to me. Right. So the point is that even today, with, like, minor bits of data, half the time we go, I've no idea where it is.

(59:41):
And then you open it and it's not even useful. Yeah.
So there goes your morning. Now it's now it's time for lunch.
So now you can say, hey, AI where should I have lunch.

(01:00:04):
According to your urine sample this morning.
Did anybody ever see that the Japanese toilet that would analyze your urine and send it to your doctor.
That's a true story.
Yeah, yeah, but it uses AI to determine whether or not you're healthy or not.

(01:00:25):
Next it'll have an ejector seat with a guillotine, you know, with some sort of shredding machine, to just get rid of the humans as they don't need them.
I'll add my three points, like, okay, just closing out. Look, number one, I love what it's doing now. I love ChatGPT, Bard. It's a man I use.

(01:00:48):
It's really helpful. Yeah, that's the one I use every day. I use GrammarlyGO, which is the new little event their business model, I think, because they were using AI, their own.
I want to die.
So it's such a useful tool to make me either sound and tell like it even asked you to a tone that you want.

(01:01:10):
I just literally bash a paragraph of random shit in my emails before I send it and I go, GrammarlyGO, see if you can make it better.
Right.
And I, I think my typing skills have gotten better, but my grammar and my in actual fact my the ability to, you know, it's a bandwidth interface problem you know this neural lace thing that Elon Musk is doing I think is going to be pretty amazing

(01:01:38):
but you type stuff out and then you look up and read what you've actually typed and it's actually not. It hasn't got anything to do with what you were thinking about.
I mean, what is it there's a brain to keyboard user interface problem.
Wonderful vision of what's coming out but then you end up with like a wish.com email.

(01:02:00):
I don't have any times I've typed things on LinkedIn, and then I go back and read them again I thought how the can I sound so stupid.
I mean my head was amazing and now it's just garbage. Yeah, but like I mean for what it's worth and now it's it's such a wonderful tool to help me with my everyday life and yeah, you know, encourage it but that's the thing that a lot of people

(01:02:24):
in the tin foil hat brigade are saying, and rightly so, and I'm working on specifically a generative AI policy that I've got to review, is like, okay, learning models are great, but where does your data go? This is probably more Ed's realm, like
if I type in a question like, you know, this is my company and this is the. Well, first of all, let me preface this: is anybody else really polite to the AI, start with please, or do you just go fuck and just write me up the code.

(01:02:54):
I started off being so cool, it's been like Siri. I started off being so, thank you Siri for that. Oh, thank you. Excuse me Siri, can I ask you a question, and my kids were doing the same. Next minute: hey Siri, what's the weather.
You can do better than that Siri I want a sunny day.

(01:03:15):
But I digress. It's like, where does it go? Because I know with Copilot, which I've been looking at specifically, it says, do you want to use code from the wealth of knowledge, like the breadth of knowledge, or not, right.
So that stops you from accidentally pulling in some malicious line of code that you shouldn't have put in your thing. Right.
How does that work.
I don't know the answer but I'm just I'm asking like when you're typing shit is it learning already or how does it what you type go into the pool.

(01:03:43):
Yes, well I would do, and it would use things like latent semantic indexing to know what your next most likely word is that you're going to type.
And then from there it starts to predict your tone and what it thinks essentially is the right next thing. That's where machine learning models come in, and that's why ChatGPT has the thumbs up and down, because it's.

(01:04:05):
They've actually said that GPT-4 is actually dumber than what it was when it was first released. All of a sudden, I just laughed and I thought, that's because there's so many stupid humans using it.
That's right.
I can't see Michio Kaku or, like, you know, theoretical physicists sitting there talking and using ChatGPT to answer the theory of everything, right.

(01:04:31):
It's just dumb people like us using ChatGPT, exactly, brain to keyboard into the app.
I think it's a some of it is just unknown.
Right, like, and that's what is worrying people that's what sets people off because, you know, more and more people are understanding that the, the neural networks, you know, even the scientists you know the super

(01:05:00):
technical people themselves don't know, like, how or why it came up with the output.
You know, in the old days you simply tracked back code. Now I was going to say exactly the same, spot on, right, like you would go back and reverse engineer, but I don't understand when the people who, you know, Sarah Connor famous quote, thought it up, right.

(01:05:23):
Yeah, you know, at this point, to the average person, they're going, well, this is your idea, you mean you don't know what it's doing anymore? That freaks the hell out of people.
Yeah, it freaks me out like what you mean you don't know how it works anymore it's just doing what it was why or how it's, you know, coming out of the other end of the neural network.

(01:05:48):
Obviously, you've heard me before say about retraining I'm working with a lot of AI algorithms at the moment for while he scratches his forehead and like he's got a gigantic headache.
I'm just I'm thinking to myself.
We're trying to do.

(01:06:10):
Image recognition. And again you go ask you about driving. Yeah, it's not. It's a nightmare.
It's really, really hard. Yeah, like a stop sign could look like a giant tomato on top of a pole.
Like, yeah, I mean it's. Yeah, it's amazing how far they've come, but it's it is.

(01:06:32):
It's extremely difficult, like, you like the whole, there's no intuition with the right. There's no equivalent of intuition which is what we have, the intuition is the big thing isn't it.
They're talking about hallucinations, like it will just start to make up what it thinks the thing that it's looking at is. That's actually even more dangerous than saying it doesn't know.

(01:06:55):
Because it thinks. So it's, it's basically it's, it's that concept of you tell a lie. Often enough and you start to believe your own lie.
And it's the same thing with hallucinations. So the next time it sees that, you know, stop sign, which, you know, it says is a tomato sitting on top of a pole.
Giant tomato sitting on top of a pole, it's going to believe it next time it sees it.

(01:07:19):
Right. Yeah, the reason why the buttons are there on ChatGPT, and you know, the comment section, is because, back to what we were talking about, we're helping to train it.
If it has a hallucination, you know, we're meant to do the thumbs down and then write in the bottom, you know, this isn't true, you know, a tree doesn't have six legs. Yeah, yeah, yeah, yeah, yeah, that's right.

(01:07:49):
You can actually do that versus accept it, because there has to be a measurement there of statistical significance, as they call it, before at some point it says, okay, well, that's no longer.
The tree with legs or whatever whatever it might be you know so if people aren't doing that enough they have to, they would have to somehow. And I'm sure there's people a lot smarter than the nuts you're doing this but you know the thumbs down and the correction must weigh a lot more in a scorecard

(01:08:21):
than, then, you know, than anything to retrain that model because, you know, you'd have to think one in 200 or one in 20 maybe might actually put a thumbs down to a hallucination or, and actually know what it's actually doing people just take the answer and, and go
oh okay I started a new chat, and then ask the question again, you're absolutely right. Yeah, but again, in these things, we'll we'll take training from anyone. Right, you know, like, my biggest problem at the moment is, I can't find enough images to train one of these

(01:08:55):
models.
And again in this modern world you'd kind of go, God, that's impossible, isn't it. But does anybody know what quantum entanglement is? Yeah.
So basically right there they've demonstrated this with photons so basically you get two particles or whatever right that are entangled you entangle them.

(01:09:18):
And you're doing the same way and you take them anywhere in the universe. Yes, not what you change the direction the other one the other one changes as well anywhere in the right. Yeah, that's what does that work.
I don't think anybody knows.
They think they know.
I was like oh god here we go.
Because you're talking about like you know numeral.

(01:09:42):
Five, but I need. Yeah, I think we need some beers for that one and we might we might do that one. But you know, that's how quantum computing works and that's why it has to be almost at absolute zero because anything, any vibrations or any kind of interference
with the entanglement can cause them to become untangled and the computer gets dropped.

(01:10:08):
I'm just sorry.
Just buy yourself one of those D-Wave machines.
They're pretty amazing. Apparently.
Let's wrap it up here. I think it's been awesome having you on the show, Ed, and Andy, again, thanks very much for joining us.

(01:10:39):
A lot more to talk about actually I think if you've got some time in the future I'd love to have you back on it I think it's been a great podcast.
And thanks for joining us again today. I appreciate it. Anytime.
Thanks very much.