
April 2, 2025 • 38 mins
By some minor miracle (even on April Fools) the R Weekly Highlights podcast has made it to episode 200! We go "virtual" shopping for LLM-powered text analysis and prediction using the mall package, and how recent advancements in the grid and ggplot2 packages empower you to make use of highly-customized gradients. Plus listener feedback!

Episode Links
Supplement Resources
Supporting the show
Music credits powered by OCRemix

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:02):
Hello, friends. Do not adjust your earbuds. Guess what? It is episode 200
of the R Weekly Highlights podcast.
And there was much rejoicing.
We somehow made it, folks. Yes. We made it to the big 200.
And if you're new to the show or tuning in for the first time, this is the weekly podcast, well, mostly weekly anyway, where we talk about the latest happenings and resources that are shared in this week's R Weekly issue.

(00:32):
My name is Eric Nantz, and I still cannot believe we made it this far.
So maybe before I go too much further, let's put a little PSA out there.
Yeah. This 200 thing, yeah, after this.
You know what? I've had a change of heart. I think as of episode 201 and onward, we're gonna call this the SAS Weekly podcast, because they don't get enough love.

(00:56):
April Fools.
If you didn't catch it, you should have. But as we're recording this, it is April Fools' Day, which may be a bad omen for things going forward, but we're gonna make it work. As always, I'm joined by my awesome cohost on this show, who's been with me for definitely more than half of those 200 episodes,
Mike Thomas. Mike, how are you doing today?

(01:17):
Doing great, Eric. What a milestone. Very excited.
Yeah, you scared me there for a second. But it is April Fools' Day, in the US here at least. I don't know how international that holiday is. So watch out. Beware. Especially, I'm sure, for those
like you who have young children who'd love to take advantage of that type of thing. He already did yesterday as a preview for it. I discovered these tablets we use for his, like, online art class

(01:45):
were suddenly taped to my desk,
taped to it for some reason. I'm asking, why did you do that? He's like, hey, we're fools, Daddy. Well, okay. I'd imagine I'm gonna be in much worse shape after he gets home from school today.
But, nonetheless, we're not gonna try to be fools here. We're gonna keep it real as usual with the awesome highlights that we have on tap today. And this week's issue is curated by Jon Calder, another one of our awesome contributors

(02:11):
and curators here at R Weekly.
And as always, he had tremendous help from our fellow R Weekly team members and contributors like all of you around the world, with your pull requests and other terrific suggestions along the way.
Just to put a little PSA out there, poor Mike's
actual mic wasn't quite working as expected, so we have a little bit different audio quality.

(02:33):
We'll make it work, though. So, again, we're not immune to the shenanigans
on this day for sure.
Well, I guess it's the time capsule. Right? It is episode 200, and typically, every week now, we got something to talk about in the world of large language models. And, yeah, we are leading off episode 200 with another great showcase on some of the tooling that's out there

(02:55):
for an aspect
of the usage of large language models that I personally haven't done as much, but I can see in future projects this will be
extremely helpful if you're dealing with a lot of text-based resources.
And this guest post on the Posit blog was authored by Camilla Levio.
She is a research professional at the Center for International Trade and Security

(03:18):
at the University of Georgia.
And this very practical post leads off with, I think, a use case that many of us will very much relate to when we deal with our documents in the enterprise
or maybe from clients or other
projects
where the motivation
for her recent effort
was being able to compile

(03:40):
a list of final reports.
These are in PDF format
from an annual conference called the Conference of the Parties,
spanning from 1995 to 2023.
And I hadn't heard about this until this post here. This is a yearly conference for global leaders

(04:01):
to talk about climate change, some strategies
involved, and, you know, some new
avenues to address those issues.
So these conferences
do release, I guess, the summaries of these proceedings
in what looks to be a pretty
dense,
pretty comprehensive PDF file for each year.

(04:23):
And so the question is, how can her team
generate summaries and insights from the contents of these reports?
Certainly, before the advent of large language models, you might tap into the world of text mining,
textual analysis, sentiment analysis,
where you're gonna be setting up, you know, those keywords, maybe those search patterns.

(04:48):
There are lots of avenues to do that. I know Julia Silge has been a huge part of the efforts in the R community
to bring text mining packages to the R ecosystem. We've actually covered, I believe, a fair share of those on some of the previous 200 episodes
of this program here. But with the advent of large language models, you've got a new take on making life a little bit easier in these regards.

(05:13):
And in particular, this post is highlighting the use of the mall package,
which is authored by Edgar Ruiz, one of the software engineers at Posit.
And this is an LLM framework in both R and Python, with separate packages for each, to let you run multiple large language model predictions

(05:33):
against a data frame and, in particular, looking at, you know, methods of textual prediction, textual extraction,
and textual summarization.
So there are two steps that are outlined in this post about the use of
the mall package.
The first step, I guess, the precursor of this is that
we have a cleaned version

(05:55):
of the report data that Camilla
has assembled here in a CSV file. But, of course, you might have to leverage other means, and perhaps even an LLM itself, to get the text out of these PDFs, which you could do with, say, the ellmer package and the like.
But the dataset in the example here
has three columns, one of which is the name of the file.

(06:17):
Second is the raw text that's been scraped from this file.
And then another column, which is a slightly cleaner version of this text, but it's still, again, the raw text
from the PDF, just maybe without the formatting junk inside.
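For reference, you can picture the cleaned input looking something like this. The column names here are hypothetical, just to make the later steps concrete; the post's actual names may differ.

```r
library(tibble)

# Hypothetical stand-in for the cleaned report dataset described above.
reports <- tibble(
  file_name  = c("cop_1995_report.pdf", "cop_2023_report.pdf"),
  raw_text   = c("<text scraped from the PDF>", "<...>"),
  clean_text = c("<lightly cleaned version of that text>", "<...>")
)
```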
So step one of the mall package,

(06:38):
she wanted to generate a new column in this dataset
that gives a summary
of each of these, PDF reports.
And that's where, after you plug in your model of choice, which in this case
is the
Llama 3.2 model served through Ollama,
you can run a function

(06:58):
called llm_summarize(),
where you give it, you know, the clean text column that you want to
summarize,
then give it a name for the new column, and then, more importantly,
additional language for the prompt itself that's gonna be fed into the model. It was a very basic prompt here, just saying to summarize

(07:19):
the key points in the report
from these proceedings.
And then she gives a little extra help on the different categories that she'd like summarized.
Again, one thing I'm learning is that the more verbose you can be with your prompt, the better, within reason, to give a little better context.
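For reference, here is a minimal sketch of what that summarization step can look like with mall, continuing the hypothetical reports data frame from the earlier sketch. The backend, model name, and argument names are my best recollection of mall's interface, so double-check them against the package documentation and the post itself.

```r
library(mall)

# Point mall at a locally served model via Ollama (assumes the model has
# already been pulled; "llama3.2" is just an example).
llm_use("ollama", "llama3.2")

# Summarize the cleaned text of each report into a new column.
reports_summarised <- reports |>
  llm_summarize(
    col = clean_text,              # hypothetical column name
    pred_name = "summary",
    additional_prompt = paste(
      "Focus on the decision-making process, mitigation and adaptation,",
      "and emission reduction."
    )
  )
```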
And then when that function is completed,

(07:40):
she gives a little preview of those
new columns here, and you can see,
it looks pretty straightforward: the three key areas that she asked to be highlighted,
such as the decision-making process,
mitigation and adaptation,
you know, emission reduction, and other key points were covered
in these clean summaries. So, again, the Llama model does a pretty nice job there, it looks

(08:05):
like. Again, this is only three or maybe five or so reports here, so we're not getting a huge set here.
But this does look promising.
That's step one. Step two
is, now with this clean summary that, again, is much easier to digest than reading the text, you know, verbatim,
what about extracting some key parts of this summary? Perhaps some keywords, if you will, maybe ones that relate to certain topics.

(08:32):
And she does that here
using the llm_extract() function from mall, where you can give it the different
labels
that you want to basically be extracted. And in this case, she's looking for energy transition.
This is where things get a little more flexible than in the typical text mining world,

(08:54):
where you might have to look at all these different synonyms or other adjectives that would, say, match this phrase
and try to grab all of that at once.
But in this case, with llm_extract(), she's able to leverage just this labels argument with a very
fit-for-purpose, you know, description here, like I said, energy transition.
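For reference, that step might look roughly like this, continuing from the previous sketch. The argument and column names follow my reading of mall's documentation rather than the post's exact call.

```r
# Extract phrases matching (or adjacent to) a single label; the model
# handles synonyms and related wording for us.
reports_extracted <- reports_summarised |>
  llm_extract(
    col = summary,
    labels = "energy transition",
    pred_name = "extract_energy_trans"   # hypothetical column name
  )
```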

(09:15):
And the model is gonna be smart enough, hopefully,
to extract
this phrase, but also ones that are, you know, adjacent to this as well.
So after running this function, with, again, an additional prompt that was supplied here,
you now get a new column that shows
that phrase, or an adjacent or, you know, similar phrase to it,

(09:37):
in this extracted energy transition column. And you see
renewable energy is put in here, energy transition,
solar energy.
Again, that's one nice benefit here: she didn't have to prospectively think of all the phrases that could be possible.
It was just one phrase, one label, and then the model took care of the other patterns that this could be, you know, closely adjacent to.

(10:02):
So
she then did a little visualization
on these different keywords that are identified, and you can see that renewable energy
was at the top of the list, with the other ones at about one occurrence each.
And, again, you can, you know, build more custom prompts if you wish,
such as maybe telling the LLM to answer yes or no,

(10:25):
whether this text would mention any challenges in the transition to other forms of energy.
And, again, you can
feed this in with the llm_custom() function,
which gives you that more customized prompt to grab this information, and sure enough, you get this quote unquote prediction column afterwards
with a yes or no that addresses that particular question.
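For reference, a hedged sketch of that custom-prompt step, continuing from the earlier sketches; valid_resps is how I recall mall constraining the allowed answers, so verify it against ?llm_custom.

```r
# Ask a bespoke yes/no question of each summary.
reports_checked <- reports_summarised |>
  llm_custom(
    col = summary,
    prompt = paste(
      "Answer only yes or no: does this text mention challenges",
      "in the transition to other forms of energy?"
    ),
    pred_name = "prediction",
    valid_resps = c("yes", "no")
  )
```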

(10:49):
So what am I seeing here as someone who doesn't do a lot of day-to-day text mining or textual analysis?
There have definitely been projects at the day job where we have these say study reports or study protocols,
and there's a lot of information there. Some of it's, quote, unquote, structured in tables,
and some of it is not. Some of it's more free form.

(11:11):
This would be a great utility, I think, to extract those different pieces of a study design,
maybe those different variables of interest that maybe have a little difference
in their characteristics from one study to another.
This seems really promising here. And to be able to leverage mall combined with
these large language models

(11:33):
seems like a pretty nice example here to supercharge your text mining efforts. So a really great blog post here by Camilla, and it makes me wanna go shopping virtually at this mall, so to speak. What about you, Mike? Absolutely.
Yeah. You know, we're getting so many different
LLM-related R packages that sometimes it's hard to keep track of them all. And I really appreciate the blog post here around the mall package that, you know, certainly seems to be tailored specifically

(12:04):
towards
the open-weights models, right, for use cases, perhaps, when you don't wanna hit a third-party API or you wanna use a local model with Ollama. You know, I was taking a look at the beautiful pkgdown site that we have here, and I think this references
a blog post from our prior week's highlights

(12:25):
In that, we have these tab sets under the getting started section
of the
sort of home page for the pkgdown site, where we have code in both R and Python
for leveraging
mall.
And as soon as you switch from one to the other, all of the tab sets switch from R to Python, so they're grouped together, which is pretty cool. It's, you know, incredible that we

(12:50):
have these
tools, or common sort of
interfaces to maybe more back end APIs across whatever language
we care about and not to,
I don't know, get a little too meta.
But I recently
was selected to speak at posit::conf(2025), for anybody that's going to be there.

(13:11):
Woo. Sort of a pat on the back for myself, to talk about this exact thing: how
we have a really great data science ecosystem
now that's evolving
where the language doesn't necessarily matter, because the syntax looks pretty similar in both cases, and we're just, you know, able to interface to these common underlying APIs. So I think that's one of the cases that we have here in the mall package. And if you take a look at the differences between the R syntax

(13:40):
and the Python syntax,
it's really, really minimal. So it's pretty cool that, you know, you can sort of bring your own tool and get the same results
at the end of the day. You know, from another higher level perspective, you know, one thing I wonder about in a lot of these use cases is accuracy.
Right?
Versus what you spoke about here, you know, traditional sort of text mining. And I think that's always probably going to depend on how simple or complex the use case is. You know, for simpler cases

(14:09):
where you have specific keywords and patterns that you really know are going to,
you know, be the case 99%
of the time or more, you're probably gonna benefit from, like, a more traditional text mining approach that's a white-box solution. You know exactly what's going on, and when it does miss, you know why it missed.
But, obviously, when the problem becomes too big for traditional text mining, that's when we can bring in, I think, these LLM based approaches that are probably really the only tool that we have to be able to accurately

(14:40):
do this, but with the trade off that
we don't have as much of a white box to understand
when it missed. So we have to do sort of manual evals a lot of the time, if you will. But, yeah, the fact that we have these tools at all is fantastic, and that, you know, folks
are not only developing tools like mall to allow us to leverage them

(15:03):
as comfortably as possible from our own R environment, but we also have folks like
Camilla who are drafting up fantastic blog posts that walk us through exactly how to do this stuff. So hats off to everybody involved, the mall team and Camilla, for developing this fantastic deep-dive walkthrough.
A great start to the highlights this week. You bet. And I can see, you know, in my case, you may have this, you know, large set of documents, and it's not like we're eliminating a human in the loop here of, like, reviewing these results. This is a way to get that

(15:37):
intermediate
step to make life easier, to look for maybe
more targeted summaries, and maybe whittling down a list of, say, 150
or 200-some reports down to maybe a list of five or 10 that then become a lot more interesting, a lot more easily
digestible to review those results. So, again, my gears are turning in my head here, because there are projects on top of these study documents that we have, but also just research manuscripts out there in given, you know, therapeutic areas like, say, Alzheimer's or other disease states. And we pay a vendor a lot of money to curate this stuff manually, and it sure would be nice

(16:19):
instead of paying all that money, to just get, quote, unquote, that study table out or that, you know, high-level summary out, if we grab that with the LLM and then, of course, you know, vet it through a human kind of in the loop for review. But my goodness, this could save a lot of money. So I've got some people to talk to, the higher-ups, about this, because

(16:43):
we have the models now. This is clearly demonstrating this could be plug and play for the model of interest, whether it's Llama, or, if you're in the enterprise, maybe you could use, like, Claude or other models on top of this. Seems like it's the right time to start exploring this further.
Should we pivot to
an hour's worth of hot takes on whether LLMs are going to do peer review

(17:08):
for science?
I'm I'm here for it.
Alright. Let's go to highlight number two.
And rounding out our highlights today, we're gonna take a visit to the good old visualization corner, which has always been a staple on these previous episodes of R Weekly Highlights.

(17:34):
Because as we've seen throughout the life cycle of this very show,
my goodness, the things you can accomplish with packages like ggplot2
are absolutely amazing for creating publication-quality,
and to be honest more than just publication-quality,
but really eye-catching
visualizations
that would not look like they came from R.

(17:57):
So in this highlight here, we're gonna talk about a newer feature that landed into R recently
that I think gives your plots a little extra pizzazz
especially with the use of colors.
And this post comes to us from James Goldie
whose name may sound familiar. He is a data journalist, but he was also one of the co-leads
of that recent Quarto Closeread contest that we talked about a few weeks ago. And he's been a very active member in the visualization and kind of data storytelling,

(18:29):
you know, piece of the landscape here.
And so from his blog, we're gonna talk about how you can make
novel uses
of gradients
within R and, in particular,
the ggplot2 package.
So he leads off with the fact that in today's world of ggplot2, we can do a lot of awesome,

(18:49):
you know, features or a lot of awesome visualization
techniques with use of fonts, you know, sizes of labels,
and, of course, colors.
But sometimes he would find himself, before some of these recent advancements,
getting, like, 90% of the way there, but then, to give it that extra little polish, he had to port it over to Adobe Illustrator or some other software and just give it that last extra

(19:15):
bit of polish,
especially in the world of colors.
Well, it sounds like that might be a thing of the past, because
now, with ggplot2 and some recent advancements in the grid package,
you have a lot more flexibility to control or make use of
gradient
color scales
in your visualizations.

(19:36):
And as I mentioned just now,
this is building upon the shoulders of the grid package,
which actually comes with R itself, but you have to explicitly load it into your session
to start taking advantage of the lower level functionalities.
But grid has seen some really awesome advancements that we've covered from Paul Murrell and other contributors in this visualization

(19:59):
space.
And with the grid package that ggplot2 is standing on the shoulders of,
you
can leverage
two great functions.
I'll talk about the first one, and I'll turn over the mic for the second one.
But in the case of, say, a bar chart or other types of visualizations,
you might wanna consider a linear type of gradient.

(20:22):
And that is literally using the linearGradient() function under the hood to do all this.
So in this first example with a bar plot of the mtcars dataset, it's actually a histogram.
But for the fill of the bars, it's not just a single color: he leverages linearGradient() going from red to orange.
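For reference, a rough sketch of the idea (not James's exact code), assuming R 4.1 or later, ggplot2 3.5.0 or later, and a graphics device that supports gradient fills.

```r
library(ggplot2)
library(grid)   # linearGradient() and radialGradient() live in grid

# A histogram whose bars share a red-to-orange gradient fill.
ggplot(mtcars, aes(mpg)) +
  geom_histogram(bins = 10, fill = linearGradient(c("red", "orange")))
```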

(20:42):
Now, again, this plot itself doesn't look that great, but you're already gonna start to see the potential here.
Within these linearGradient() calls,
you've got a lot of flexibility for how many colors you feed into this and how you distribute them.
So you can feed in any number of colors, and in this other rainbow-like example, he's got, looks like, about seven colors here.
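For reference, something along these lines, with made-up colors and stop positions (ggplot2 and grid loaded as above).

```r
# Any number of colors, plus a stops vector saying where (from 0 to 1)
# each color sits along the gradient.
rainbow_fill <- linearGradient(
  colours = c("red", "orange", "yellow", "green", "cyan", "blue", "purple"),
  stops   = c(0, 0.15, 0.3, 0.5, 0.7, 0.85, 1)
)

ggplot(mtcars, aes(mpg)) +
  geom_histogram(bins = 10, fill = rainbow_fill)
```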

(21:06):
And then the accompanying piece of this call
is a vector of what you'd call
stops between zero and one, kinda like giving the thresholds
for going from one color to another and another.
So that's all well and good. But you can also change the actual size and position

(21:28):
of these gradients
using coordinate-type arguments like x1, y1, x2, y2 to give kind of a bounding box, if you will, of where this is gonna fit in. So you could do this horizontally by default, or you could do it vertically.
And on top of that, you also get flexibility in the type of units that's gonna be used in this kind of gradient transition within the bar itself,

(21:53):
and that is using a different unit, in this case the snpc,
or square npc, unit
that will give you kind of a customization on the angle of how this gradient is gonna transition.
As usual for an audio podcast, we're trying our best to describe this, but you can clearly see a difference between when he used

(22:14):
the default unit
versus specifying this snpc unit,
where the transition
from red to orange
is a lot more pronounced, especially earlier on at the bottom of this chart.
So you can see you got a lot more flexibility
here, and all this can be plugged in
to many different aspects of your plot such as with groups

(22:37):
or within the scales themselves. So that if you have a bar chart of, say, positive and negative values,
you can easily
feed in
these different positional arguments, so that maybe
above y equals zero on the axis
you've got a more orangish-red color, versus below you might have a bluish color, to really distinguish that threshold going from positive to negative.
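Here's one hedged way to sketch that kind of effect, with ggplot2 and grid loaded as above: the coordinate arguments run the gradient vertically, and a repeated stop forces a hard color switch partway up. It assumes zero falls roughly at the vertical midpoint of the bars, which won't hold for every dataset.

```r
# Vertical gradient: x held fixed, y running bottom (0) to top (1),
# flipping abruptly from blue to red halfway up.
diverging_fill <- linearGradient(
  colours = c("steelblue", "steelblue", "firebrick", "firebrick"),
  stops   = c(0, 0.5, 0.5, 1),
  x1 = 0.5, y1 = 0, x2 = 0.5, y2 = 1
)

df <- data.frame(x = LETTERS[1:6], y = c(3, -2, 5, -4, 1, -1))
ggplot(df, aes(x, y)) +
  geom_col(fill = diverging_fill)
```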

(23:03):
Lots of great ideas here on the linear side of it. But
as I've seen, you know, visuals don't always just have those nice little rectangular
boundaries.
We gotta get some circle action here, Mike, and that comes to us with the radial gradients.
Yes. And radial gradients work
pretty similarly.
You're gonna have this radialGradient() function,

(23:25):
and it has a few different parameters:
cx1, cy1, cx2, and cy2. Those are the four parameters that really drive the center points of the gradient, and then r1 and r2 establish the start and end
radii.
That's how you sort of define this entire circular gradient within this radialGradient() function.
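For reference, a minimal sketch of those parameters in action, with illustrative values rather than the blog's exact code (ggplot2 and grid loaded as above; the group argument, covered next, needs R 4.2 or later).

```r
# Offset the start circle to fake a light source, and apply the gradient
# per shape (group = FALSE) so each point gets its own "3D ball" shading.
ball <- radialGradient(
  colours = c("white", "steelblue4"),
  cx1 = 0.35, cy1 = 0.65, r1 = 0,
  cx2 = 0.5,  cy2 = 0.5,  r2 = 0.5,
  group = FALSE
)

ggplot(mtcars, aes(wt, mpg)) +
  geom_point(shape = 21, size = 8, fill = ball, colour = NA)
```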

(23:50):
Excuse me.
So there are some fantastic examples here in the blog post that really demonstrate this nicely.
And one thing that I'm, you know, really sort of taken aback by is this group parameter that you had foreshadowed here, Eric. I guess that's been available since R 4.2,

(24:11):
and it controls whether a gradient applies to individual shapes or to a set of them. And I guess it's TRUE by default,
which, in the case of this scatterplot example,
means the gradient
sort of applies to the whole chart, if you will, so the points have different colors. As opposed to the second case, when the group argument is set to FALSE,

(24:37):
all of the points look exactly the same, but they have this radial gradient
inside
each point. And it creates this sort of three d look,
where, you know, each of these points on the scatter plot looks three-dimensional
itself. It looks like a ball that's sort of coming off the page,
which is absolutely incredible. It makes me wanna throw away every single ggplot I have ever made, because they don't look nearly as nice as

(25:07):
what James has put together here in terms of his data visualization. This is a Quarto blog; you would almost not know it, just because the theming and the CSS that's going on here is absolutely beautiful. And as you mentioned,
you know, some of these
different bar charts as well, where we get into this grouping possibility,

(25:30):
really drive a pretty stark contrast
depending on whether you set group equal to TRUE or FALSE.
You know, another thing that James employs here is the ggforce package.
And I have an awful confession to make today: I have never used the ggforce package. And I need to use the ggforce package, because I think it's doing

(25:51):
a lot of the really cool stuff that's going on here, or at least making it easier than just doing it, you know, in raw ggplot2. Although I imagine that's probably possible. Have you
been a user of ggforce, Eric?
I have not. And like you, I'm making a note to try this out, because I'm seeing a lot of novel use cases for it these days.

(26:12):
Me too. So the blog sort of concludes with
being able to sort of abstract a lot of things that you may have done in
CSS
to, you know, R code for these radial gradient specifications,
so that you can create sort of backgrounds for your plot, as opposed to creating gradients within the elements of your plot, really sort of setting kind of these

(26:37):
interesting rainbow backgrounds
on these simple plots that really provide for some cool theming here, I would say. And this is a lot of stuff that I would have never really thought about or considered
in the type of data visualization
work that I typically do. And there's a really handy
preview_gradient()
function,

(26:57):
that allows you to, I believe, get a small
bar in your plot viewer that shows the gradient before you apply it to, you know, a particular element of your chart. So that's a really, really nifty option. Sort of reminds me of the
theme previewing that we can do in bslib, if I'm not mistaken.

(27:20):
So a really handy utility function here, but a phenomenal walkthrough of applying gradients
in R and in ggplot2. I think, you know, James really covers sort of the whole universe of what's possible.
Yeah. And what James is showing here, especially as you alluded to in this last part of the post,
this kind of stacking
mechanism of these different gradients. This is one of those

(27:43):
pain points that he used to have, with going to Illustrator to create these more custom
layered approaches to these different gradients for backgrounds and the like. And now, with a great example of code here that he has,
he calls this function stack_patterns().
You can see the recursive nature of this, to be able to wrap all this into one type of gradient and to be able to, you know, leverage that as much as you like, from both the linear side of it as well as the radial

(28:13):
side of it. And you can see at the end of the post
these really
eye catching backgrounds
on these different scatter plots
and bar charts. There's a lot of wonderful features here.
And, yeah, like I said, even those that may be more familiar with CSS,
he ties it all together with how that works on the CSS side

(28:34):
as well. Lots of expandable details here.
He makes great use of Quarto here. This is a fantastic post, and I'm bookmarking it right now before I leave this episode, because I wanna up my visualization
game, and gradients look like
a wonderful way to do just that. Excellent post here by James. Coming straight back to this post the next time I make a ggplot.

(28:59):
Yeah. And combined with, like, the twenty or thirty others that we covered in recent weeks, it seems like there's so much in this visualization space that I'm just scratching the surface of, for sure.
I've been on a kick of interactive plots, but, man, at some point you gotta get to those static plots at the end, whether it's a report
or, I hate to say it, even one of those PowerPoint decks. So if I have to be in those static confines, I'm gonna bling it up with this for sure.

(29:26):
And there's a lot more you could bling up, too, by reading the rest of the issue. It's a jam-packed issue that Jon Calder has put together for us, and we'll take a couple of minutes for our additional finds here.
And, well, this is someone whose journey on reproducible analysis we've covered, especially in the case of leveraging

(29:46):
Nix as part of that journey for, gosh, over a year, if not more.
But in my additional find here, Bruno Rodrigues has
turned, you know, what I knew about Nix and rix upside down, so to speak,
because he is announcing
a new package called rixpress.

(30:07):
This is,
to put it mildly here,
a new take on reproducible analysis
pipelines
powered by rix.
That should have a keyword that you may latch onto
if you know some other key reproducible analysis pipeline toolkits we're talking about.

(30:27):
But I'm still wrapping my head around what he's accomplishing here with rixpress. But in a nutshell,
when you have a pipeline of analysis,
rixpress is leveraging rix for each step in this process,
which opens up the possibility
that maybe for a given step for analysis or text mining or whatever have you,

(30:48):
because of the power of Nix under the hood, you have the flexibility in these steps to go from, say, R
to Python
or another language that Nix supports.
This is unbelievable.
Now, again, he is very much stressing this is a prototype. Do not use it in production yet. He's still working out the kinks of a few things, such as how we pass objects back and forth

(31:12):
through these steps in the pipeline.
But, the elephant in the room, so to speak: he has been heavily inspired by targets,
and this is not replacing targets. Let's not kid ourselves.
But what this is showing is the potential
for a multi language
analysis pipeline
in the spirit of targets,

(31:34):
but with a lot of granular control
over the dependencies
at each step,
not just in the overall pipeline.
Immensely
thought-provoking here. I am gonna wrap my head around this, but I admit this is kinda timely, because I'm gonna be speaking about the virtues of Nix and rix in my Shiny development at the upcoming Shiny conference, which is happening in a week and a half. So rixpress is something that's new, but I'll make sure to plug it in my talk at the end. So, Bruno,

(32:06):
as usual, every time I think I've figured it out, you change the questions, so to speak, as Rowdy Roddy Piper would say.
But nonetheless, an excellent find here. So, Mike, what did you find?
That's awesome.
I found a recreation
of
FiveThirtyEight's
Hack Your Way to Scientific Glory
website that was done by Andrew Heiss, who's always just

(32:30):
pushing out incredible content. And it is a dashboard,
so to speak, but all in Observable
JS,
OJS.
So what that means is that, I think, it's essentially
serverless. Right? It's a static site, if you will, but it is very interactive and really feels
almost like a Shiny app, if you will. And it's beautifully done; I really love the theming here. And the idea is that you're a social scientist with a hunch that the US economy is affected by whether Republicans or Democrats are in office.

(33:02):
And you can choose a few different toggles here, your political party,
which sort of politicians you wanna include, and what measurement you want to use for economic performance, like GDP or inflation or stock prices.
And then
what you're gonna get is a p-value at the end of this,
based upon whether that political party had a negative or positive

(33:24):
impact on the economy. So there's a little bit of p-hacking going on here, a cool exercise to be able to just, you know, kind of switch dials until you get what's called a publishable result, which would be a p-value of less than, I think, 0.05
or
0.01,
in the case of how I believe Andrew has put it together here. But it's awesome. It's a fantastic,

(33:48):
I think, use of OJS, and I imagine maybe Quarto
to publish it, and
really, really cool work.
This is fun to play with. I'm playing with it right now, and this is one of the advantages of bringing in Observable JS in a Quarto doc.
It is just snappy, responsive,
and this loaded right away.

(34:10):
Obviously, this is just taking advantage of OJS, so we don't have the WebAssembly stuff going on here, but you don't need it in this case. It is fit for the purpose. And,
obviously, in my industry, I take the issue of p-hacking quite seriously. But in this case, this could be a fun way to exercise that. So I can play with this without fear that I'm gonna lose my job, so to speak.

(34:31):
This looks fun for sure.
Absolutely. Yeah. I think it's great in situations where you have fairly small data, right, to leverage OJS and
Quarto here. And probably,
if you wanna continue using a static site with OJS and Quarto and your data gets big, that's where we get into maybe DuckDB

(34:51):
WASM.
I think that's the future, man. That's the future. I can't wait to play with some of that further. Absolutely.
And,
before we close out here, we had put a call out for, you know, any kudos or other appreciations for R Weekly. We did hear back from one of our dedicated listeners,
Mauro Lepore, who I've had a great chance of meeting at previous Posit conferences and other

(35:15):
avenues. He had posted a response to one of our
posts on LinkedIn
about our discussion
on some of the issues with CRAN recently. But first, he had said he was loving
the use of the Continue
service
as a way to leverage LLMs
in Positron

(35:36):
as a way to have a front end, you know, a Visual Studio Code or Positron
extension, to these services, and I am literally using that now in my open source setup. It is really cool. So a great, great recommendation there, Mauro. And,
his feedback on the issues that we were talking about with CRAN recently, he says,
I hear the pains of developers maintaining packages on CRAN.

(35:59):
Also, I understand the effort the core team puts in to allowing the R community to live at the head, so to speak. This is a rare, costly, and beneficial approach that I came to better understand thanks to this section in the book, and he plugs
it: Software Engineering at Google, which I'll put a link to in the show notes. And that's a pretty fascinating

(36:19):
read if you wanna get into the nooks and crannies of software engineering.
But, Mauro, this is a terrific summary here, a terrific piece of feedback.
In the end, this has always been a multifaceted
issue with the place that CRAN has
in the R community
combined with some recent issues that we've been seeing. But in the end,
there are areas where I would say CRAN is still a step ahead of some of the other languages, which can be a bit more of a free-for-all in terms of package repositories.

(36:49):
Sometimes of varying quality, so to speak. So, again, a great take on that. It's never a black or white issue, I feel, with these things, but a great piece of feedback, and we enjoyed hearing from you.
And with that, we're always welcoming more feedback. So from episode 200 on, if you wanna get in touch with us, we have a few different ways of doing that, one of which is the contact page in this episode's show notes. There's a link right there to a little web form for you to fill out. You can also,

(37:21):
send us a fun little boost along the way if you're on a modern podcast app, like CurioCaster or
Fountain,
which, in particular, make it easy to get up and running with these. I have linked details on that in the show notes, and you can get in touch with us on social media.
I am on Bluesky, where I'm at rpodcast.bsky.social.
I'm also on Mastodon at @rpodcast@podcastindex.social,

(37:46):
and I'm on LinkedIn. Search my name, and you'll find me there. And also, like I mentioned earlier, you'll find me as one of the presenters in a couple weeks at the upcoming Shiny conference, which I'm super excited about.
And, Mike, where can our listeners find you?
I'll be there at Shiny Conf,
watching you present here. That's super exciting. You can find me on Bluesky at mike-thomas.bsky.social

(38:10):
or on LinkedIn if you search Ketchbrook Analytics, k-e-t-c-h-b-r-o-o-k, you can see what I'm up to.
Awesome stuff. And, yeah, I remember you gave a recent plug to some great advancements in using
AI in your Shiny app development. That was a really great teaser there. So, hopefully, you're getting a lot of success with that as well.

(38:32):
I really thank you all for joining us, wherever you are listening, for this episode 200 of R Weekly Highlights.
Who knows if we're gonna get to 200 more, but nonetheless, we're gonna have fun along the ride one way or another.
And, hopefully, we'll be back with another episode
of R Weekly Highlights
next week.