2025 Semantic Augmentation Challenge

Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:09):
Hello, I'm Karen Quatromoni,
Director of Public Relations for Object Management Group, OMG.
Welcome to our OMG Podcast series. At OMG,
we're known for driving industry standards and building tech communities.
Today we're going to be speaking to David Blaszkowsky,
managing director of the Financial Semantics Collaborative.
He's going to be talking about the Semantic Augmentation Challenge.

(00:33):
To lead today's podcast will be Bill Hoffman,
Chairman & CEO of OMG. Bill?
Hey, thanks Karen! And David, thanks for joining us today.
I'd like to know if you could talk a little bit about this challenge.
Sure, sure. First,
thanks so much for the invitation to tell your audience,

(00:55):
OMG's podcast listeners, about this 2025 OMG Semantic
Augmentation Challenge. It's now underway.
Entries are due on July 1st, with final judging and selections on the
22nd of July. About the challenge,
we really started with a problem.
Literally one walked in the door to us at the financial sector domain task force

(01:19):
at OMG or DTF.
Members of the data team of a leading hedge fund approached us,
the financial sector DTF, that is, with a problem that was costing them time,
effort, and precision.
And one that they expected that others in financial services and probably beyond
also had.
They receive and ingest data from so many sources and in so many formats

(01:43):
and of so many almost innumerable kinds.
They receive and ingest data from datasets with facts and columns and rows and
fields. But those facts and rows and fields are disassociated from what they
actually are and what they mean. The context,
really the semantics of that information.
For many CSV files, JSON, XML, and others,

(02:05):
no definitions of meaning are provided, and users are left to guess
meaning from cryptic column names, printed directories, and even data examples.
That kind of approach made sense with the constraints,
the data constraints and space constraints of 1995 perhaps.
But in 2025,
when the exchanged data itself often originated in association with rich

(02:29):
ontologies and taxonomies,
so CSVs and other formats, they're not about to disappear.
So what's needed is a specification,
a standard to make it easy to reconnect the facts with their meaning, to
reassociate what has been disassociated.
And we're calling that semantic augmentation. So to the challenge itself,

(02:51):
we're asking participants to recommend and demonstrate how best to include
references such as context, definitions,
and pointers such that the receiver of the file,
whether it's a human or a program or an AI,
will have the information needed to understand and use that data successfully in

(03:11):
its processing. In the spirit of a hackathon,
we want participants to propose ways to augment a dataset with metadata that
links its elements, like the columns, to their semantic
meaning. We've got a dataset in the challenge from the US banking supervisor,
the FDIC, that can be associated with sources. And we want to see great ideas,

(03:33):
whether based on existing methods, using new methods,
innovation, or combining what exists and what should exist.
And why did we choose a challenge as a way to gather all of this information?
OMG's role as a standards development organization is to facilitate solutions
to these kinds of problems by convening industry experts.

(03:55):
There are so many methods that we use, RFIs,
requests for information, conferences, and we're doing those,
but sometimes a bake-off contest can really be valuable,
and that's what we've chosen to do here, to make it an event.
Let's share the problem and invite the tech and the data in tech communities to
propose answers, and gather the ideas and even offer a public platform for

(04:18):
the best ideas, with a prize for the winner.
Let's get a lot of ideas, recognizing that a spec may draw on many existing
as well as new ideas. And by the way, finally,
we hope that all the submitters who participate will also choose to stick around
to help us craft a specification.
Speaking of some submitters, who should enter this, David?

(04:40):
Well,
we think that there are an awful lot of people who have an interest in what
we're doing here. So just to, as the preface,
we're looking for really folks who have knowledge of the problem
set. Data and analytical professionals who are afflicted by this problem.
Developers looking to solve a problem because it's fun or it's profitable.

(05:03):
Data governance professionals and data architects who are particularly affected
by this problem. Even students or folks looking to enter the profession,
looking for a bite-sized problem, which this is, to call their own,
come on in and help us solve it. And of course, we welcome all of those,
as I said a few moments ago,
who want to be part of the challenge and the satisfaction of participating in

(05:24):
the analysis and development of industry standards and specifications, to not
just submit some ideas, but to help us craft these ideas,
all the ideas received, into something valuable,
a potential standard that will help the industry,
the financial services industry, as well as all the other industries,
all the other sectors that are consuming and using data coming from

(05:46):
other players.
Did you mention prize?
I sure did, Bill, yes. A thousand dollars,
maybe even with the prize will come some fame.
Definitely the satisfaction of helping solve a problem afflicting
submitters' friends and colleagues, not just across financial services,

(06:09):
but across every sector, and that prize. But to get to the prize,
you got to enter,
you got to enter. July 1st is the deadline for
entrants. We're going to have the public event,
which will include presentations by a shortlisted set of
finalists on July 22nd, followed by the selection of a winner,

(06:31):
and we're excited to be able to put on this show on the 22nd.
Everything will be online and it'll be an opportunity for everyone to learn
about the ways that we might be able to solve this important problem of semantic
augmentation of data sets.
That's great. Of course, that will be open to the public.
Absolutely. This is going to be public. Hopefully it'll be fun.

(06:53):
Certainly it'll be a learning opportunity. And of course,
to your point a few moments ago, for the winner,
there'll be some money and hopefully some fame and excitement.
I think they'll get bragging rights.
There's nothing wrong, and there's nothing wrong with bragging rights. When you
solve a problem, it's you that solved it,

(07:16):
and we'll be excited to help celebrate,
broadcast the bragging through OMG
materials as well. I mean, we want this to be out there.
We want this to be known and to support the spec.
We're probably going to do more like these as well,
so we'll be excited to see you now and we'll be excited to see you later.

(07:37):
That's great. Hey, David, thanks for your time this morning.
We hope we get a lot of entries, and it's going to be an awful lot of fun
watching this thing unfold.
Wonderful. Bill, thank you very much, and again, to everyone out there,
July 1st is the deadline. What we didn't say,
what we didn't talk about, is where to get information.
Information is on the OMG website.
It should be also on the page where you got this broadcast,

(07:58):
but go to the URL and we'll have basic
information about the contest. Also important links, links
to the full problem, to the dataset that we're starting with,
to the requirements for entries and how to submit an entry.
We're looking forward to seeing you there. If you've got any questions,

(08:20):
we also have a link to an email for sending us those questions.
All questions
will be publicized or will be shared in a link to the frequently asked
questions, the FAQs, so reach out to us with your questions. More importantly,
reach out to us with your ideas. Thank you all.
Look forward to seeing you on the 22nd of July.

(08:40):
Great, thanks David.
Cheers.
