Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
I think the biggest trap to potentially fall into is, hey,
it's moving so fast, so much is changing, let's just wait it
out. Completely the wrong approach.
You just got to get started. As tech people within the
shipping industry, how do we explain, how do we make it
accessible to all our users? So that's where we came up with
the idea of Data Supermarket with in mind really the target
(00:23):
of enabling self-service for our business.
I see. So by giving the analogy of a
supermarket, it was much easier at the beginning to explain to
our business. Welcome to Analyze Asia, the
premier podcast dedicated to dissecting the powers of
business, technology and media in Asia.
I'm Bernard Leong, and I'm at the Data + AI event by Databricks
(00:47):
today in Singapore. And with me today is Nick Ayers,
Vice President of Field Engineering, Asia Pacific and
Japan, at Databricks. We are going to
dive into the evolving landscape of AI in Asia from
infrastructure to intelligence and why data intelligence may be
more important than general intelligence for enterprises.
So Nick is going to come to us to discuss how companies can
(01:09):
build the foundation for responsible and scalable AI in
the region. So, Nick, welcome to the show.
Thank you very much for having me.
Like every other guest on Analyze Asia, we want to hear your
origin story. So how did you start your career,
and what got you into the area of data and AI?
Yeah. Well, first things first, it's
been a journey, a 25-year journey in analytics.
(01:31):
And to be honest, it's all I've ever done.
Not in a bad way. I absolutely love the space.
I'm super passionate about the space.
I wouldn't be doing anything else.
So about 25 years ago I started working in sub-Saharan Africa,
South Africa for a management information systems company and
actually was developing an early version of what you would call
an intelligence platform. Yes.
(01:52):
Specifically for financial services customers in
sub-Saharan Africa and we had great success with that product.
I was the developer, I was the engineer from a pre-sales, post-sales
perspective. I was technical support, I was
marketing, I was everything. But it was a wonderful
experience building that product, taking that product to
market and getting really good success and traction in
(02:13):
financial services from there. That product, obviously all of
the contracts and the work that we had done, got transitioned to
a company known as Computer Sciences Corporation, now DXC.
And I established myself in DXC as a data and AI practitioner
and got spotted by a pretty prominent analytics company
called SAS, Statistical Analysis Software.
(02:36):
So I joined SAS based on my experience working in financial
services and actually built out an enterprise architecture
division for SAS in South Africa and eventually grew that to an
international scale, running an international function that
spanned Europe, Middle East, Africa and Asia Pacific and
Japan, and actually relocated to the UK as a part of that
(02:57):
process. From SAS, I saw the wave of open
source innovation and I got really passionate about that
space. You know, SAS very much closed
source. I was a part of the open source
movement and really saw a lot of value and a lot of future
success in that model. So opted to join a company
called Hortonworks, one of the original open source creators
(03:20):
around data and analytics and took Hortonworks to the merger
with Cloudera. So now today it's known as
Cloudera. And you know, during that merger
phase, obviously sought other opportunities to take my
knowledge and experience to the next level, and Databricks,
thankfully, was looking for someone with my skills and
talents. So I joined Databricks in EMEA.
(03:42):
Now I've been with the company for
seven years, three years in EMEA and four years in Asia Pacific
and Japan, based in Singapore. Right.
So reflecting on your journey because it's pretty interesting
in South Africa, EMEA and now Asia Pacific, what are the
lessons that you can share with my audience from navigating, I
guess, very interesting career transformations across, but
(04:04):
still within the same realm of engineering, data and AI?
Yeah, I think, number one, it's really being customer obsessed in
everything that you do. And I don't mean to say that
just to pay lip service. Like, I genuinely believe you've
got to immerse yourself into the customer's world and you've
really got to deeply appreciate the norms around their industry,
(04:26):
their environment, their business context.
And I think that's something that I've learned throughout my
career and it's uniquely different in every single market
and especially in Asia Pacific and Japan. Let's be honest, Asia
Pacific and Japan is, you know, I think by last count, 51
discrete countries, you cannot approach a customer in APJ the
same way. You've really got to meet them
(04:47):
where they are and deeply, deeply understand the industry
and the socio economic state that they're operating in.
So I think having deep appreciation for customers and
being customer obsessed and having great empathy for the
journey they're on and the journey they've been on is
probably one of the most important lessons learned.
I think the other thing is over the years I've learnt to be very
(05:08):
good at challenging customers from an intellectually honest
but humble sort of perspective. I think sometimes vendors can
come across maybe a little bit arrogant and a little bit
stubborn in their opinions. I always try and err on the
side of, you know, be humble, be intellectually
curious and challenge customers to think differently but with
real empathy for the position that they're in.
(05:29):
And I think that's a really good thing that I've learnt over the
years as well. And that's a great description
of what customer obsession meansAlso specifically, sometimes we
have to say no to customers as well.
So I want to get to the main subject of the day, which is
about data intelligence and AI infrastructure in Asia Pacific
and Japan. So maybe let's start or
baselining by defining some terms for the audience here.
(05:50):
How would you define data intelligence and how is it
different from what people typically think of as artificial
general intelligence? Yeah, great question.
Let's start with the latter first around describing general
intelligence. I think everyone today will be
familiar with these ChatGPT-type interfaces, right?
Very prevalent, especially in the consumer market.
(06:12):
I think everyone at some point has leveraged these and
experimented with them, be it to solve your son or daughter's
maths homework or to plan a holiday for the family, or to
just, you know, get some basic quiz or general knowledge
questions answered. The reason I bring that up is
that's probably the best example to bring to life of general
intelligence. So what is it?
(06:34):
It's AI that's been applied, models that have been
built upon the public corpus of
knowledge. So everything that's out on the
web, effectively the Internet. So effectively, these models
have been trained on a large corpus, a large footprint of
knowledge in the public domain. Very good for general questions,
(06:55):
general knowledge, Q&A, planning your holiday, solving general
maths problems. Not very good for solving your
business problems. Because it's not trained on your
business, it doesn't know your products and services, it
doesn't know your policies and procedures, and it doesn't know
your operational norms and taxonomy around how you describe
(07:16):
your business. So that's really what we're
trying to solve. It's not the general problems,
but the domain specific problems, the business specific
problems. To do that, you don't want to
train the AI on the general corpus of knowledge.
You want to bring all of that AI capability and you want to train
it and attach it and add it to your data because the
(07:39):
competitive differentiator in enterprise is your data.
So bring the model, bring the AI to your data and apply it to
your world and allow it to learn your business, your business
taxonomy, and about your products, services, and
policies. That's what we call data
intelligence. So why is data intelligence now
more important than, say, general intelligence for
(08:01):
enterprises today, especially in Asia?
Yeah. Well, again, what's going to
uniquely differentiate you with how you engage with a citizen,
if you're a government agency, a patient, if you're in health and
life sciences or consumer, if you're in a consumer orientated
world, it's going to be your products and services.
And, and really what you want to do is you want to leverage the
(08:21):
information that you have around your products, your services,
what you're taking to market and your customers.
And the best way to do that in a differentiated way, again, is to
leverage the insights you have internally, not the insights
externally. So really that's the secret
sauce, right? For most customers that we
work with, it's their data being AI-infused and AI-enabled so they
(08:45):
can uniquely differentiate the products and services that they
take into market and the customers that do that, we're
seeing massive growth and adoption.
So I remember a study that was done with The Economist and
Databricks, where they said that probably 15% of companies are
actually ready, having the correct
infrastructure for AI. I guess the question for me is
(09:07):
what are the kind of unique challenges that companies in
Asia Pacific and Japan face when building that infrastructure?
I think people don't appreciate that everybody wants to do AI,
but if they don't organize their data, everything is just garbage
in, garbage out. Yeah, you're spot on.
Yeah, absolutely right. We talk about having really good
data and AI foundations. What does that mean?
(09:29):
To your point, these models, and when you bring these models and
you bring this AI capability to your data, they're only gonna be
as good as your data. So if the data is garbage,
you're feeding garbage into the model.
You're feeding garbage into these AI systems, these agents,
you will get garbage out. So garbage in, garbage out.
So the most foundational thing you can do to ensure that you're
(09:50):
getting great insights, great outcomes, is to make
sure that you collate, store and curate high-quality data and as
much data as possible. So yes, the quality matters, but
you also want to make sure the data is very rich and that
you've got as much data as possible that represents your
customers, your products and services all combined in one
(10:11):
place and in a quality form. So the most important thing
again is good quality data and having lots of data that is a
good quality all together, so you can use that to drive
insights and build AI agents, applications and experiences.
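Nick's "garbage in, garbage out" point is often enforced in practice with an automated quality gate that runs before any data reaches a model. A minimal sketch of the idea (the required fields, validation rules, and threshold below are invented for illustration, not anything Databricks ships):

```python
# Minimal data-quality gate: validate and de-duplicate records before they
# feed a model. Required fields, rules, and threshold are illustrative only.

REQUIRED_FIELDS = {"customer_id", "product", "amount"}

def validate_record(record: dict) -> list[str]:
    """Return a list of quality problems found in one record."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    amount = record.get("amount")
    if amount is not None and (not isinstance(amount, (int, float)) or amount < 0):
        problems.append("amount must be a non-negative number")
    return problems

def quality_gate(records: list[dict], max_bad_ratio: float = 0.05):
    """Split records into (clean, bad) and fail loudly if too much is garbage."""
    clean, bad = [], []
    seen_ids = set()
    for record in records:
        key = record.get("customer_id")
        if key is not None:
            if key in seen_ids:  # drop duplicate customers within the batch
                continue
            seen_ids.add(key)
        (bad if validate_record(record) else clean).append(record)
    if len(bad) > max_bad_ratio * max(len(records), 1):
        raise ValueError(f"{len(bad)} bad records exceed the allowed ratio")
    return clean, bad
```

The point of failing loudly when the bad ratio is exceeded is exactly the one made above: it is better to stop the pipeline than to quietly feed garbage into an AI system downstream.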
The other thing that I think is really important is connecting
this to the why. Because let's be honest,
(10:33):
technology alone is never going to be the secret to success.
Technology becomes incredibly impactful when it's connected to
a higher mission and a higher purpose.
So understanding why this matters to the business, why
this matters to the enterprise, how is it going to transform the
way that we serve our customers? How is it going to transform the
(10:54):
lives in the way that we are serving patients?
Perhaps, again, in healthcare. That really is important
because when you connect the people in the organization to
the why, then they can really understand, OK, now I know why
we need this data, and now I know why we need to build these
AI agents and models and understand the purpose and the
(11:15):
mission. And I can connect to that and I
can get behind that and I'm enthused by that.
Yeah. So I, I think 1 interesting
thing is I know a lot of high growth companies, whether it's
the Grab and the Zendis, are all using Databricks and they're
all publicly known references. So one question I do really
think about is how can companies in the region future-proof their
(11:35):
data and AI strategy to stay competitive in the next 5-10
years? I think enterprise AI, I think
we're really at the beginning because most of the current
generative AI solutions tend to be point solutions, but not end
to end. For example, like the way
Databricks does it, even combining with AI as well.
Yes, yes. Well, I think, number one, you've got to
start. Like, I think the biggest trap
(11:58):
to potentially fall into is, hey, it's moving so fast, so
much is changing, let's just wait it out.
Completely the wrong approach. You just got to get started and
the best way to get started is to leverage all of the best
practices and all of the success stories that are already out
there. It's OK to pick up, rinse and
(12:19):
repeat and reuse the best practices that are out there,
but make sure you're doing that with a vendor or partner of
choice that has a proven track record of success.
And really what we're doing there at Databricks is we're
bringing together a lot of research, a lot of PhD level
research around the space. And let's be honest, the space
(12:40):
is changing every single day. Like every single day.
There's a new model, there's a new technique, there's a new
product surface area that's coming out.
We're trying to put all of our researchers to work to make sure
that they're picking the best sort of technological
improvements around algorithms, models, techniques, packaging
that up into really, really easy-to-consume, what we call Agent
(13:02):
Bricks, and then allowing our
use those to solve domain specific problems.
So rather than you figuring out which analytical technique to
choose, which model to choose, you focus on the business
problem. We've got that tried and proven
track record of success of bringing the best techniques
(13:22):
together and we package it up in a very simple, easy-to-consume
product surface area, all driven by natural language.
So literally our customers can interact and start to build AI
agents and AI agent systems just using natural language.
So the most important thing is get started now.
Don't wait because it's going to move so fast.
(13:43):
Work with a partner or a vendor that has a proven track record
of success in innovating in this space, leading and defining the
best practices and the technology evolution which we
are doing. And that can really help
abstract that complexity for you, so you can focus on the
business problem and not stitching together technology,
so. Can you share a real-world
(14:04):
example where Databricks enabled a customer, say in Asia
Pacific and Japan, to derive, say, transformative value from this
data intelligence? Yeah, absolutely.
I've got tons that I can talk about probably.
As much as you can. Maybe I'll start with some of
the customers that we had this week up on stage talking for us
in various forms. You know, one of those customers
(14:26):
is none other than Standard Chartered Bank.
Standard Chartered Bank actually
use case around cybersecurity and more specifically what we
call SEAM. So, you know, really looking at
security incidents and figuring out how do they triage those and
how do they get ahead of cybersecurity threats and power AI-
(14:48):
driven analysis, threat detection and prevention.
So if you think about this worldhistorically, you would have to
go and license a lot of software, what we call SIEM
software, to go and pull together logs from all over your
enterprise landscape. And it's also like that,
multiple threads of information from different sources, right?
Exactly. So you're pulling information
(15:10):
from routers, from firewalls, from applications, from, you
know, human footfall in and out of the building.
And you're trying to connect the dots around, is this activity on
the app in the building through these systems and interfaces?
Is that good behaviour or bad behaviour?
Is it a bad actor? Is it a good actor?
(15:31):
And all of that data, if you think about the data there, it's
all from different vendors. You might have one firewall
vendor, you might have 5 firewall vendors across the
estate that you have. So there's lots of different
data formats, and this data is typically high volume, high
telemetry that's coming in super fast.
And security is a problem space. You can't wait a week, two weeks
(15:52):
to figure out that someone's hacked your system.
Like you need to intervene and react in real time, and you need
to prevent it before it even happens.
So the problem is high volume, high complexity and real time in
nature. And we worked obviously with the
bank to solve that problem. And as a result, what they've
been able to do is they've been able to take all of that
(16:14):
telemetry from all of these different systems into
Databricks. They've been able to build a
world class cybersecurity platform on top of Databricks
and they are driving massive improvements like the TCO
savings were absolutely off the charts.
But not only that, it's the timeto value that has come down so
significantly. So now instead of you know,
(16:34):
waiting weeks, months to kind of figure out a cybersecurity
threat, they're down to days and hours and they will get down to
minutes.
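The SIEM pattern described above, many vendors and formats feeding one analysis layer, comes down to normalizing events into a common schema and then running detections over the unified stream. A toy sketch in that spirit (the vendor formats and the failed-login rule are invented for illustration; a real deployment does this over a streaming engine at far higher volume):

```python
# Toy SIEM-style pipeline: normalize heterogeneous security logs into one
# schema, then run a simple detection rule over the unified event stream.
# The vendor formats and the threshold rule are invented for illustration.
from collections import Counter

def normalize(raw: dict) -> dict:
    """Map vendor-specific log fields onto one common event schema."""
    if raw.get("vendor") == "firewall_a":
        return {"source_ip": raw["src"], "event": raw["action"]}
    if raw.get("vendor") == "app_logs":
        return {"source_ip": raw["client_ip"], "event": raw["outcome"]}
    raise ValueError(f"unknown vendor: {raw.get('vendor')}")

def detect_bruteforce(events: list[dict], threshold: int = 3) -> set[str]:
    """Flag source IPs with too many failed logins: a classic first rule."""
    fails = Counter(e["source_ip"] for e in events if e["event"] == "login_failed")
    return {ip for ip, n in fails.items() if n >= threshold}
```

The normalization step is the part the interview emphasizes: once five firewall vendors' logs share one schema, every downstream detection, and every model trained on the telemetry, can ignore where the data came from.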
So you have one customer case. Do you have other customer stories
you want to share too? Yeah, 100%.
Another one I can talk to is a customer called Techcombank,
one of the largest privately held banks in Vietnam.
(16:59):
So we have a wonderful stakeholder and champion in
Santosh who was up on stage telling the story.
But I think the difference in this one is it's all about
customer centricity and effectively what Santosh sent
out to do with the bank. And we've been very fortunate to
partner with Santosh and the bank on this journey is to build
a customer brain. So think about how can you
(17:19):
intelligently derive insights about all of your customers and
all the attributes about your customer in real time to really
hyper-personalize everything that you're doing with them
in the moment and to make sure that you enrich their lives and
they have a better experience with the bank.
(17:39):
But ultimately they're getting better banking
offers, and all of that delivered through app experiences, through
digital experiences in real time.
So there Databricks is pulling together obviously all the
information around the customer attributes that they have.
They have, the last time I checked, over 12,000 attributes
(18:00):
associated with the customer. Wow.
So everything from demographics all the way to various other
attributes. And they use this with
Databricks to derive a one-to-one personalized offer in real time,
delivered in an experience that the customer knows and loves.
So if they love the app, it's in the app.
If they prefer e-mail, it's in e-mail.
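The delivery logic Nick describes, choosing the channel and the moment from the customer's own attributes, might look like this in miniature (the attribute names and rules here are hypothetical, not Techcombank's actual model):

```python
# Toy "customer brain" delivery decision: choose channel and send time from
# customer attributes. Attribute names and rules are hypothetical examples.

def plan_delivery(profile: dict) -> dict:
    """Decide where and when a personalized offer should be delivered."""
    channel = "app" if profile.get("prefers_app") else "email"
    # Deliver just before the customer's usual engagement window,
    # e.g. before the morning run and coffee mentioned in the interview.
    hour = max(profile.get("usual_active_hour", 9) - 1, 0)
    return {"channel": channel, "send_hour": hour}
```

In a real system both decisions would come from models trained over the thousands of attributes mentioned above; the point of the sketch is only that the output is per-customer, not per-segment.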
(18:21):
And not only that, it's delivered in the moment that
matters to them. So they might check e-mail in
the evening, they'll get it in the evening, they might like
their morning run and a coffee, it'll get delivered before they
hit the Starbucks store. So really personalized, tailored
one-to-one marketing through this customer brain that we've built
with Santosh. I'll keep the third one for later,
but we'll get into the subject of responsible AI, which has now
(18:44):
become a priority for enterprises.
How does Databricks ensure, say, AI products are deployed
with safety, transparency and ethical considerations in
mind? I do bear in mind that just now
for the two use cases, they wereall financial services.
So it comes with a lot of scrutiny, and I probably will
appreciate that Databricks must have done a lot of work to
deal with all the regulations and how do you put systems in
(19:07):
control as well?
Yeah, absolutely. Look, and we have a number of
other use cases in highly regulated industry across the
board, not just financial services, government and other
areas as well. But I think it comes down to two
things. Look, ultimately it is a shared
responsibility model, right? There is a shared responsibility
around the trusted relationship that we establish with our
(19:29):
customers because ultimately the customers own the data, right?
So the shared responsibility side is making sure that the
data has the correct classification and is used for
ethical, responsible, and sustainable purposes.
From our side, we work really closely with our customers to do
a couple of things. One, making sure they have the
right organization model to ensure that AI is successful,
(19:52):
sustainable, responsible and ethical.
What does that mean? We help them think through how
do you marry the vision, missionand purpose with real concrete
strategies, a very executable plan that can be measured.
And then how do we drive the literacy up so the entire
organization is aligned behind that?
Why? And everybody can contribute to
(20:14):
that journey and understands the impact along the way in
measurable terms. So there's a lot that we do
around organizing the company, the leadership and the
organizational entities for success.
The other part then is everything that we do to work
both with government and to work with industry around
understanding and appreciating and shaping and informing a
(20:37):
position on regulatory and compliance requirements in
market. And this is complex, right?
Again, as I said, APJ, 51 discrete countries, all of these
countries are looking at the regs and compliance, some of
them in slightly different ways. But we work very closely with
government. We work very closely with our
customers to interpret this changing landscape and to figure
out what is best practice, what are the emerging best practices
(21:00):
and to ensure that they're thoughtful in their deployment
of the technology. In addition to that, there's all
the technology aspects, right? Helping our customers make sure
that when they stand up the data and AI foundations, that the
data is secure by default, that the right authorization and
authentication mechanisms are in place, that they are looking at
things and leveraging best practices such as red teaming.
(21:24):
Red teaming is an approach that you use on models to check that
the models are responding well and are giving great responses
and can't be hacked or subject to prompt injection or other
techniques. So there's a bunch of stuff that
we do around ensuring that the technology is well architected,
well implemented and tested and validated and is secure and is
(21:47):
robust. And also I think dealing with
things like drift as well, data and concept drift within the models,
and so on. So do you help your customers to
embed those kind of trust and governance in their AI life
cycle right from the beginning? Yes, yeah.
Foundational to everything
platform is this concept of governance, security and
(22:08):
control. What's very unique to the way
that Databricks approaches this is we've always been a data and
AI company. So when we think about this, we
don't think about it just from a tabular data perspective or an
unstructured data perspective. We think about that entire life
cycle to the agent being built, the model being built, the agent
being served, the multi agent, multi agentic framework being
(22:30):
stood up. Because you have to traverse
that entire life cycle. If you're only focused on the
data, you're missing everything else that gets derived from the
data and ultimately impacts consumers, all citizens
leveraging your products and services.
So data governance is fundamental to the data
intelligence platform. Key capability there is what we
(22:51):
call a catalog. So we've got this wonderfully
rich capability called Unity Catalog.
We've even then got an open source project that we've
created off the back of that. So our customers have choice,
they have interoperability, they have a safe exit strategy should
they choose to do so. But defining that catalog,
setting up that catalog, putting all the data under the management
(23:12):
of that catalog is a foundational step.
And then once it's in there, everything around how that data
evolves into reports, dashboards, models and models
being served and inference being done and agents being deployed
into apps. All of that life cycle
management, all of that lineage, all of that traceability comes
(23:33):
with that platform capability. Now in the enterprise,
just to dive a little bit into this business trust and governance
thing, do you find that customers are
now much more concerned about, say, the traces of what
the AI is doing, getting some form of an audit trail of how the agents
(23:53):
are operating or maybe the AI models?
Do you see that kind of much deeper dive into the
transparency of the models or interpretability or
explainability of the models then?
Yeah, it's absolutely an area of concern.
And I think increasingly as the regulations evolve, I think this
is an area for everyone to stay on top of.
(24:14):
Again, traceability, lineage, auditability: really important,
all key platform capabilities that we're solving for.
But I think the other part of that question is, is the problem
the explainability and the interpretability or is the
problem the quality of the output?
And I think more customers right now are facing problems around
the quality of the output. How do you actually know when
(24:36):
you build these things that you're getting a good quality
outcome? That's one of the hardest
problems. And actually it goes into a very
complex area that we call evaluation, right?
And so model evaluation, agent evaluation there, it's quite a
complex domain, right? And if you're trying to tackle
this yourself, really hard to kind of get set up for success
there. We're trying to make that super
(24:58):
easy. So effectively, if you go back
to what I referenced earlier around Agent Bricks, the way
Agent Bricks works, you define this problem that you're trying
to solve, this domain specific problem, and we will
automatically evaluate the quality of it for you.
So we'll automatically figure out, hey, like this sort of
domain problem, you probably want this sort of quality and
(25:19):
you probably want to look at these sort of quality metrics.
And then with human-based feedback in the loop, we then
start to raise the quality bar. So the human feedback,
through continuous iteration, continues to improve
the quality of the output. So I think most customers are
actually really worried about that.
They can get a model built quickly, they can get an agent
(25:40):
built quickly; getting it to be of quality, super hard.
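The evaluate-and-improve loop Nick outlines, automatic quality scoring plus human review that raises the bar over time, can be caricatured as follows (the scoring metric and thresholds are invented for illustration; Agent Bricks' actual evaluation is far richer):

```python
# Caricature of an evaluate-and-improve loop: score agent outputs
# automatically and route low scorers to human review. The scoring
# function and thresholds are invented for illustration.

def auto_score(answer: str, must_mention: list[str]) -> float:
    """Crude automatic quality metric: fraction of required facts mentioned."""
    if not must_mention:
        return 1.0
    hits = sum(1 for fact in must_mention if fact.lower() in answer.lower())
    return hits / len(must_mention)

def eval_round(answers: list[str], must_mention: list[str], bar: float):
    """Return (passing answers, answers routed to human review)."""
    passing, needs_review = [], []
    for answer in answers:
        (passing if auto_score(answer, must_mention) >= bar else needs_review).append(answer)
    return passing, needs_review
```

Raising `bar` round after round, informed by the human reviews, is the "continuous feedback" idea from the interview in its simplest possible form.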
That, I think, is an art by itself.
So I have these three beautiful
questions before we go to the close.
So my first question is, what's the one thing you know about
data intelligence and AI and enterprise AI in Asia that very
few do but should? Well, I won't profess
(26:02):
to be the only one that knows this, but what I think is super
unique and what I love talking about is just the psychology is
different in Asia Pacific and Japan.
I'm not going to cast aspersions any which way, but
when I talk to customers and when I talk to folks in Asia
Pacific and Japan, it's always about the future.
(26:23):
It's not about the past. It's not about bringing back the
past or bringing back things historically.
It's about the future. And I love that mentality around
the future: future success, future growth, future
opportunity, securing the future for the next generation,
securing a better future for all.
I just love the fact that, you know, the culture and the
(26:44):
psychology in Asia Pacific and Japan is all about the future as
opposed to how do we bring back the past?
That's interesting. That means the mindset is really
changing in the region. So what is the one question you
wish more people would ask you about Databricks or building
responsible and scalable AI systems?
But they don't. I wish more people would ask us,
(27:04):
how do you get started? How do you get started?
Like it sounds super simple, but I think sometimes we get so
bogged down into the technical details and into the weeds of
discussions around this model versus that model, this
evaluation technique versus that evaluation technique.
And it goes back to what I said earlier, just get started.
Like just ask us how we get started.
(27:25):
And we just want to build with customers.
Like we love building with customers.
So for us, you know, ultimately, I wish we
could just build more with more customers and I would love that
opportunity to do so. So all customers out there, all
prospects out there, anyone who's interested, just come and
ask us how do you get started? And we'd love to build
with you. When the customers come today, do they actually first come
(27:46):
to you with a set of problem sets and then you have to try to
figure out, like, which part of the platform can help you to
get those quick wins? We do all of that as well.
It's a bit of both, right? So there's a couple of ways you
can slice that problem. Some customers come to us with
very opinionated perspectives around industry specific things
they're trying to solve. In that instance, we'll work
(28:06):
back from that specific business challenge and we'll try and
figure out, OK, what are the pain points and how is that
inhibiting success ultimately for the business and what sort
of value can we create for the customer by adopting our
technology and then put forward a solution. Other customers
are more interested in: what are you seeing with regards
(28:27):
to our peers? What are you seeing in the
industry more broadly? What's your perspective on which
use cases are working and which are not for those customers?
Typically what we'll do is more of a blue sky thinking sort of
ideation session and show them the art of the possible.
This is what we're seeing. These are the use cases that we
think are valuable, high impact, high feasibility of success if
(28:49):
you have the requisite data in place.
And then we'll showcase that to customers so they can see it,
they can believe it, and then select it as a use case to
tackle. I see.
So they're actually able to see and imagine what is the art
of the possible. Correct. And then basically you can
actually distill down to those top two or three use cases, correct, and maybe
(29:09):
be able to help them and then start the building blocks moving.
Correct. Because you never want to boil
the ocean, right? Again, I think some of that "why, why,
why are we not getting started?" question is inertia and fear of change and, oh my gosh,
it's going to take like another 12 to 18 months.
Whereas if we can really focus on something and really distill
(29:29):
it down to something that is, you know, business critical,
high impact, high feasibility, we can very quickly get to
business value and then build that trust and build that
excitement and energy around tackling the next one.
I just want to get to one little part of the getting started.
How about the education for the customer?
Huge. Yeah, absolutely huge.
You know, as I said, technology alone is not going to be the
(29:52):
solution. So we spoke a little bit around
obviously that AI organization piece, right, around the
leadership alignment, aligning the vision, the
mission. But you need to enable people
too. But you need
Databricks and how exactly you use it?
Well, exactly, exactly.
So part of that is really thinking deeply around the
talent that you have, the talentyou need to acquire and the
(30:13):
talent you need to develop. And it's never just one of these
things; you're gonna have to do all of those things.
You're gonna have to figure out about talent acquisition
development and making sure that all the talent in the
organization is super clear on what performance management
looks like, what good looks likewhen you're using data in AI.
So you've got to set some really, really objective
(30:33):
measures: we're going to enable you, we're going to give
you these tools, we're going to give you this learning and
enablement, but we're also going to measure you in this way.
And if you're successful with data and AI, this is how you
know you've achieved it. So that would be what great looks
like for that. So my traditional closing
question: what does great look like for Databricks in Asia
Pacific and Japan in the context of enabling data intelligence at
(30:54):
scale? Yeah.
I think for us, ultimately, we want to be the destination of
choice. We want to be the first thing
that customers consider when they're looking to solve any
data and AI business problem. And we really want everyone,
whether they're a customer today, a prospect tomorrow,
academia, we want everyone to consider Databricks as a
(31:16):
destination of choice for building and solving your data
and AI problems. Wow, that's a great one.
So in closing, two very quick questions.
Any recommendations that have inspired you recently?
Cool. Well, I definitely recommend to
the learning and enablement point, you know, get out there
and get busy with Databricks. We've got this wonderful thing
(31:38):
called Databricks Free Edition. So we recently ran a very cool
hackathon. We called it the Smart Business
Intelligence Insights Challenge. And we basically set a bunch of
people free on the platform to go and experiment and build
crazy cool insights on top of Databricks.
And that was wonderful. It was absolutely wonderful.
I must get the link from you there.
(31:59):
Yes, I would like to go and try it out.
And there are some fantastic, again, custom examples of what
they built. So I would love for people to do
that, right? Get out there, get into Databricks
Free Edition, start building your own apps, start building your
own solutions with a completely free, full cut of Databricks,
and you can get started today. So how can my audience find you
and/or stay current with what is going on with Databricks in
(32:21):
terms of field engineering and all the work with customers?
Yeah. So always check out the
Databricks blog, the company blog.
There is an RSS feed on our documentation as well.
If you're a bit of a live update nerd like I am, I follow
religiously every single update to the documentation.
And then please connect and find me on LinkedIn and follow me on
(32:41):
LinkedIn as well. And I'd be happy to support our
customers out there. Yeah.
And of course, everybody knows there's just one more thing
here. So we have been talking about
customers. So one of the interesting things
that I got from Nick today is I'm going to interview a
customer. So do you want to talk about the
customer? I think it's Simon Facett,
General Manager and Global Head of Data and Analytics at Hafnia.
(33:04):
Yes, so wonderful that we'll be introducing Simon in a bit.
Great example, again, of an organization that has very clear
vision, mission and purpose around how they're going to use
data and AI. And it's just been wonderful
partnering with them. So yeah, you got a great next
speaker coming up. All right.
Thank you very much, Nick, and let's continue the conversation.
Thank you. So as promised with me today,
(33:29):
Simon Facett, General Manager and Head of Global Data and
Analytics at Hafnia. Right, Yes.
So Simon, we promised a customer story for Databricks today, and
Nick was speaking very highly of the initiative that's
going on at your company. Can you first share your role
and designation within Hafnia? Of course.
(33:50):
Hi everyone. So I'm Simon; I've been working at Hafnia
for three years. Previously I was working in a
corporate and investment bank, a French bank.
I'm French. I guess you get it, yes?
Yeah. So my role in Hafnia is I'm
leading the data analytics and AI team.
So we have a team of around ten engineers, and we are doing
(34:15):
our best to produce value out of all the data that we have,
in a highly regulated industry, which is shipping.
Can you introduce Hafnia as a company? The business, as you
just said, is in the area of shipping; what does it do in the
world of shipping itself? All right.
So Hafnia is one of the leaders in the tanker industry.
So we have over 200 vessels. We have mainly two activities.
(34:41):
So we are ship owners. So we own vessels and we are
also running an activity of pool management, where we will
basically take care of the operations of the vessels from
other ship owners. So these are the two businesses
that Hafnia is running. So it operates globally.
It's global, of course, on the seas, and we have
(35:03):
offices in Singapore, where the HQ is, and we also have another
one in Copenhagen, another one in Houston, and one in
Dubai. Those are the main ones.
I know you have about, say, 200 vessels and probably a fully
integrated shipping platform. So where does data sit so
that it enables you, at scale, to know what's
going on across the world? I mean, shipping is a very
(35:25):
important business, as we learned from the pandemic on, you
know, the supply chain side. So in terms of IT architecture,
we worked hard internally after COVID to be fully cloud
enabled, which really helps us to be very agile in what we are
(35:47):
delivering to the business. So most of our data now
resides in Databricks, yes. So we are fully cloud oriented.
Of course, we use a lot of applications for the business,
for HR, for operations, for the commercial teams.
Of course we have dozens of applications, but all the data
gets centralized into Databricks, which helps with making a lot
(36:12):
of value out of all the data we're collecting across those
applications, which are not always talking to each other.
But thanks to the Databricks developments we're deploying
to our businesses, we are enabling the applications to talk
together, and businesses to talk together, to break silos.
Yeah. So what was the mental model
behind choosing Databricks in the end?
(36:33):
Is it because it's easy to use, or very easy to deal with
the use cases that you have, and also to be able to work
with the data itself? Yeah.
So historically we were working on SQL Server with a classic
legacy data warehouse. Now the management really wanted
to break silos and enable self-service within the company.
(36:59):
Of course, we have more than 300 employees on shore.
We have more than 4,000 employees on the vessels, yes.
So we would like everyone to have
easy access to data. How to do it?
Basically, through enabling self-service.
So what tools were on the market two or three years ago? What was the
(37:20):
tooling that was available? So we had a few options.
We really assessed Databricks, and of course Snowflake, plus the
Microsoft solutions. We ended up with Databricks
really because of all the different layers and all the
different, how can I say, maturities of
(37:44):
our businesses. So we have three different
kinds of users. We have the users who are not
interested in manipulating data; they want to conduct their
daily operations. I see. Then we have those kinds of
users who need data to make decisions.
So they need to consume dashboards, they need to consume
any kind of... So they need the business
(38:05):
intelligence very quickly, and they don't
really want to do a lot of the work in order to put
the data into different visualization dashboards.
Yeah, clearly. And the third kind are what
we call the domain experts. And this is what made us
choose Databricks: this capacity of having all kinds of
users accessing the data. So the domain experts are
(38:27):
basically experts that we've trained within the different
business teams on how to use Python, how to use SQL, how to
themselves bring value to their colleagues within that
business team, in order to be basically an extension of the
data team. So we needed to choose a tool
that would allow us to do that, and Databricks
(38:50):
was chosen and designated as the best
candidate. From that, they also
enabled the team of ten that is working with you, to make sure
that everything is in place so that you can build with
Databricks as well. So how is Databricks
transforming how your commercial, financial, even
operational teams access and act on the data with the data
(39:11):
intelligence platform itself? All right, so before focusing on
the architecture, we had to focus on building a setup with
the different businesses and my team, by splitting the activities
but bringing the activities together without segregating or
building silos. So how did we do that?
(39:34):
We basically chose to build three different squads within my team:
we have the commercial squad, the financial squad and the
technical squad. By building this setup, we have
squads that are dedicated
to specific areas of the business and can really focus on
each business's value. What we are focusing on now,
(39:56):
is projects with high ROI. So we work hard with the
business and those domain experts, who are embedded within
the business, on selecting the projects with the highest ROI. So
each squad will define a list of
projects that will bring real value to our business and we
(40:16):
will focus on developing those projects.
I see. Whereas for the day-to-day
activity, we are counting on the skills of our domain
experts to develop the BI dashboards, so the data team can
focus on the more advanced developments.
And that's how we've managed to really
focus where the data can bring value.
(40:40):
So how do you make the data accessible, say, to non-technical
stakeholders? I guess, what's the response from, say, business
units when they work with Databricks invisible to them, but
they actually see all the data that they need?
All right, so we have different levels of access.
We can speak about Gen. AI.
(41:01):
Yeah. So you have the classic data
engineering, where Databricks can work as an IT system and will
help applications speak together.
Yes, this is invisible to our businesses.
We have also the classic AI/BI, where the business will consume
data through dashboards. So you have also implemented
retrieval-augmented generation. That's interesting.
(41:22):
Yes. So it's the last layer. But
before enabling AI, we had to ensure that our data platform
and our data were pristine. In the past two
years, right after ChatGPT came out, we tried to enable very
quick wins, very nice POCs, but we noticed that without a solid
data foundation, all those POCs were just not strong enough in terms
(41:47):
of quality. I see.
So quality was at the centre of everything. You have the
nice demonstrations from the different vendors, from the
different marketing campaigns, but in reality, what matters
most is good data. And thanks to Databricks and all
the innovation that we can put in place on this great
platform, on the Unity Catalog platform, we were able to
(42:11):
control the quality of the data we were providing to our Gen AI
or to our AI use cases. I see.
And this is the key to enable AI.
Right. So with the foundation model
within the platform itself, so you're able to do questions and
answers on there. Just walk me through like, how
did how did the Gen. AI enable that?
(42:32):
And maybe, what kind of questions can it now answer?
Yeah, of course. So for Gen
AI, for me, we have two different phases.
The first one is how to use Gen
AI to ingest data we could not ingest in the past.
I'm talking about unstructured documents mainly.
That's right. So we have contracts, we can
(42:53):
have SOPs, vessel manuals; all those documents are complex and
of different shapes. At the beginning, we tried to
find the magical solution, the generic solution that would help
us load all that data seamlessly and have a very nice
output to enable Q&A and chatbots. But the reality is each kind of
(43:15):
document, each kind of input, has to have its own way of
working on the data. So making sense out of that
data is not so simple. It's a lot of work, a lot of
trial and error. And we need a solid platform
that will help us monitor the performance of what we are
developing. So this is the first phase: how
to ingest unstructured data into the data platform.
(43:38):
Then the second phase is how to
basically consume the data. Yes, correct.
And then we have the data that
is coming from unstructured documents.
We have the data coming from our databases, our data lake.
Basically, we have data coming from emails, from
communications, yeah, and data coming from the different
applications that we are building on top of the Delta
(43:58):
Lake platform. So how does it work?
For each scenario, we will choose the right tool on
the Databricks platform. So for instance, for the documents
we will use the vector database from Databricks.
For the structured data, we will use Genie spaces.
And now, with the enablement of the agentic solutions, yes, all
(44:22):
those tools can speak together. And I'm extremely excited about
the future and how this will look. Before, those tools would
work one by one, but now we can see
applications that bring those tools together to make sense out
of all that data; it's going to be very, very powerful.
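The "right tool per scenario" routing described here can be sketched roughly as follows. Both tool functions are stubs standing in for real calls to a vector database and a Genie space, and the keyword heuristic stands in for an LLM-based agent choosing the tool; all names are invented for illustration:

```python
# Stub for document retrieval (a real system would call a vector database).
def doc_search(question):
    return f"[doc tool] top passages for: {question}"

# Stub for structured-data Q&A (a real system would query a Genie space).
def structured_query(question):
    return f"[sql tool] table result for: {question}"

# Naive keyword heuristic standing in for an agent's tool-selection step.
STRUCTURED_HINTS = ("how many", "total", "average", "per month", "count")

def route(question):
    """Send quantitative questions to the SQL tool, the rest to documents."""
    q = question.lower()
    tool = structured_query if any(h in q for h in STRUCTURED_HINTS) else doc_search
    return tool(question)
```

An agentic layer generalizes this sketch: instead of fixed keywords, a model decides which tool (or combination of tools) each question needs.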
So you're one of those rare customers I know who actually
got retrieval-augmented generation to work.
(44:44):
And I think in this case, you achieved it with Databricks.
Do you find that, let's say, for example, question answering
is more of an art, like trying to figure out the correct data to
get the correct answer for the Q&A from maybe different
business units, rather than a science?
And then you have to do a lot of different evaluations on the
questions and answers. Yes, exactly.
(45:04):
So as I said, you have the beautiful POCs that work for
demos, and then you have real life, where
you just need one bad answer from the LLM to lose the trust and
confidence of your users, and completely lose them.
And this is really what we try to avoid.
That's why we really take our time before releasing any
(45:25):
solution in production. I can give a simple use case
where we use Gen AI, but not as a chatbot.
And I think that by embedding Gen AI technology within the
business process, that's where we will bring huge value
compared to only using chatbots.
(45:46):
So I can give an example. We have the SIRE report;
the SIRE report comes from inspectors who come aboard our vessels.
Yes, those inspectors will give some comments, some observations
about things that could be improved on board the vessels.
Then this document is sent back to our superintendents on shore,
(46:09):
and the superintendent has to analyse this document and answer
each one of the comments. In our company we have
thousands of SOPs, and for each observation we have a
corresponding SOP where we could find the answer.
So the objective of this application is basically, for one
observation: do we have an SOP that
(46:31):
will be able to answer the observation, or at least help our
superintendent come up with a very good answer, or shorten
the time it takes? Here we are not using a chatbot,
but we are using RAG within our workflow in the application, to
deliver a real use case that has an impact for the business.
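The observation-to-SOP workflow just described can be sketched as a small retrieval loop. The SOP snippets below are invented, and plain word overlap stands in for a real vector-similarity search, so treat this purely as the shape of the workflow:

```python
# Invented SOP catalogue standing in for Hafnia's real document store.
SOPS = {
    "SOP-101": "procedure for mooring line inspection and replacement",
    "SOP-212": "fire damper maintenance and testing procedure",
    "SOP-305": "cargo tank cleaning and gas freeing procedure",
}

def top_sops(observation, k=2):
    """Rank SOPs by word overlap with the inspector's observation."""
    obs_words = set(observation.lower().split())
    scored = sorted(
        SOPS.items(),
        key=lambda item: len(obs_words & set(item[1].split())),
        reverse=True,
    )
    return [sop_id for sop_id, _ in scored[:k]]

def draft_answer(observation):
    """In the real application an LLM would draft the superintendent's
    reply from the retrieved SOPs; here we just cite them."""
    refs = ", ".join(top_sops(observation))
    return f"Observation noted. Relevant procedures: {refs}."
```

The key point of the design is that retrieval runs inside the existing reporting workflow: the superintendent receives a drafted, SOP-grounded reply to review, rather than having to query a standalone chatbot.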
So it actually follows the business process rather than
(46:53):
being a separate chatbot by itself.
Exactly. And this concept of having
embedded Gen AI technology as a tool in the workflow, I think, is
key for process automation, or at least to assist our
business users with better performance and
higher quality. So I've done some pre-research
(47:15):
before this. I know there are two initiatives: one
is Marvis and the other one is the Data Supermarket.
Can you talk a bit about what Marvis and the Data Supermarket
are supposed to do for Hafnia? Of course.
So we don't work in a tech industry;
we work in the shipping industry.
Yeah, the shipping industry. It's
different: our users are focused on the shipping industry and everything that
(47:38):
goes with that. So as tech people within the shipping industry,
how do we explain it, how do we make it accessible to all our users?
That's where we came up with the idea of the Data Supermarket,
with in mind really the target of enabling self-service for our
business. I see.
(47:58):
So by giving the analogy of a supermarket, it was much easier
at the beginning to explain to our business: OK, if you want to
get access to our data, you come to the supermarket, you do your
shopping, you select the right data at the right place, you go
back home with a tool like Power BI, and you can cook for yourself and
(48:19):
build your own dashboard. And that's how we've enabled
self-service: by building this Data Supermarket.
And at the beginning we were talking about the supermarket just to
help users understand, but the name stayed.
It has stuck ever since.
What about Marvis then? Yeah.
So on top of this Data Supermarket: we spent two
years working hard on building something strong that works very
(48:40):
well, built on top of Databricks.
Now we have a very well organised data structure.
As I said earlier in the podcast, without this solid
structure, Gen AI use cases are really not even possible.
So now that we're confident that this structure is solid, we
wanted to enable Gen AI use cases. Again, we wanted to
(49:05):
find a way to personalize it, to basically give a persona to
this Gen AI within the company.
And instead of saying, OK, this is a Gen AI tool, or this is a
Gen AI process, we call it Marvis.
So for our users, when we speak about Gen
AI, we speak about Marvis. I see.
And it really helped us to enable different use cases and
for the business to project themselves in some processes
(49:28):
saying, oh, but if I ask Marvis this question, then I could get
a nice answer from the Gen AI system, so they don't have to
understand the technicalities behind the scenes.
So Marvis, what is it? It's a Gen
AI tool. It's a GPT kind of tool based on
our company data. It's a kind of copilot, from the
(49:51):
way you are explaining it to me. It's definitely a copilot
connected to the company's data. So we have different ways of
accessing that data. As we spoke about,
we have the RAG, we have the structured data.
We also have, as we showed in the talk today, a partnership with
Neo4j, with Databricks data fed into Neo4j. I see.
(50:14):
So it's a Gen AI tool connected with our whole
data ecosystem, and then we can use it for chatbots and also for
internal processes. I think people do not
appreciate this, but in the maritime industry, you have a lot of
complex documents, standard operating procedures, and you
need to manage these documents effectively.
(50:36):
I suppose Marvis has sort of been a great help in doing that.
Yes, yes, basically as I said, Marvis helped us to integrate
all those documents in our knowledge base and to consume
those documents. And by enabling all those use
cases that were not possible two or three years ago, we basically have
(50:57):
access to a huge amount of data, compared to the
structured-data-only world. Really.
I think the Gen AI use cases are a new chapter
for us as a data engineering team, where we now need to
really upskill, to ensure that the internal team of developers we
(51:21):
have within the company is constantly learning, constantly
keen to know more, to dig even deeper. And it's really with this
kind of tool, like Databricks, that we can enable that without so much
complexity, where the complexity is not in using the tool but
more in finding the right use case.
(51:43):
Do you foresee a day where anybody in the shipping industry,
without really knowing the technical aspects behind it, is
able, because of generative AI, because you can now prompt in
English or in any language, French, etcetera, to
actually just prompt and get whatever insights you want from
your system? Exactly.
(52:04):
I think with the agentic AI systems that we see more and
more, we will be able to build systems that are smart enough to
handle any kind of use case. We also have to be careful with
the hype. We have to stay focused on what
works, and we maybe have to take our time, not push too
(52:28):
quickly on the features that we are not in control of.
But yes, definitely, agentic AI, by bringing all kinds of
data from the industry and from the different departments
together, will basically be a good change.
So one question: what excites you most about
(52:49):
where Hafnia is heading with AI and data?
All right. So now that we've built our Data
Supermarket, the Data Supermarket is well established.
And you have Marvis as well. Marvis as well is known within
the company. It's used quite a lot.
Marvis AI is also becoming a tool that people are using in
(53:13):
their day-to-day activities. The next step is: how do we break
silos? How do we bring data to everyone
in the business? How can someone from the
commercial team have access to financial data, have access to
operational data? How do we bring everyone
together using data, using the same data?
(53:35):
So this is our next initiative. Now we are building what we call
DNA Port. It's a platform built on top
of Databricks, on top of Marvis AI, where we will bring all
that data together in one single place.
I see. Instead of having scattered
dashboards around application A, application B, we'll bring
everything under the same hood and really break the silos, and
(54:02):
Marvis AI will be the link among all those processes, where,
within this application, IMOS data, which is a very famous ERP
system in shipping, can speak with our financial system.
Instead of having two disconnected systems, this DNA
Port will be the platform where applications can finally
talk together. So what excites you about your partnership with
(54:26):
Databricks, and how are they going to support you on
the data mission for Hafnia?
Databricks helps us to stay focused on business value, and
this is the number one criterion for us.
We don't want to spend time, like we used to in the past,
on the operational side of IT.
We really want to spend time on business value.
(54:47):
And with the constant innovation that Databricks brings to the
table, it has really enabled us to develop very nice applications
like this DNA Port, built on the very new Lakebase technology
from Databricks. Where before we had to consume
different systems from different SaaS services, now with this
(55:07):
unified platform, we are extremely pleased with the velocity
we can have and the development pace we are achieving,
so we can bring data, AI and content together under one
single platform. And this is a game changer for
us for velocity, but also for the skills within the team, where
(55:29):
everybody, all the data engineers, software engineers and AI
engineers, are talking the same language, which is this unified
platform. So Simon, many thanks for coming
on and really telling me about shipping and what Hafnia is doing
with Databricks. And also, I think this is the
first time I've really seen something that works for
retrieval-augmented generation as well.
(55:49):
So thank you for coming on. Thank you.
Thank you very much.