Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:02):
Hello, friends. We are back with episode 208
of the R Weekly Highlights podcast, and this is the mostly, I dare say, weekly show where we talk about the excellent resources that are shared in the highlights section and
elsewhere in this week's R Weekly issue. My name is Eric Nantz. And, yes, I must have jinxed myself a couple weeks ago because I said, oh, yeah, we should be back to a regular weekly schedule. Well, sometimes things out of our control happen too, so it didn't quite happen that way. But we are back at least this week with another episode for you. And joining me as always
(00:34):
in between putting out various,
you might call, fires at the various projects he has running,
Mike Thomas.
Mike, hope you can keep a level head here. I know you've been
knee deep in some fire squashing. Right? Well, haven't we all? Haven't we all? But none today yet, so knock on wood.
Alright. Well, we'll keep that good positive vibe going.
(00:58):
Alright. I told him about some various legacy things I've gotta fix. So, a note to those of you out there that have maybe, like, IT groups that migrate servers from time to time.
Better check those things after migration. Sometimes things happen.
Hashtag just saying.
Anyways,
we got some great content to talk about today. You don't need to hear my rants about that. And this week's issue was curated by Jon Calder, yet another one of our OG curators on the team. And he had tremendous help from our fellow R Weekly team members and
(01:33):
contributors like you all around the world with your pull requests and suggestions.
And later on in the show, actually, one of our other curators is quite involved with this, so stay tuned for that.
First off,
as you well know from our previous episodes and also in the general R community,
you are aware that Posit, the company behind the RStudio IDE that has been, you know, in production use for many organizations
(01:59):
and data scientists for years,
has been working on a new IDE called Positron,
which is,
at a high level,
a very opinionated
fork of the Visual Studio Code IDE.
It has a lot of nice enhancements that tailor it better to a data science kind of workflow
in R, Python, and frankly, even other languages,
(02:21):
it supports.
Things are going really well with it. I have been using it in my day job now at my local system for almost,
gosh, six months, seven months now.
Plays nicely with Nix, by the way. You know, I couldn't let that not go in there.
But one of the newer things that we knew was coming,
(02:42):
so to speak, maybe via rumors or whatnot, but now it is in preview mode
along with the IDE itself, which, again, we wanna stress is still in beta. So it's not like it's been a full production release yet. But one of the preview features that has landed
is the Positron Assistant.
You probably can guess by the name what this is empowering. Yes. This is empowering
(03:05):
the use of large language models for both code completion
and
agentic coding,
which is in essence giving you that nice interface for you to help, you know, ask questions,
and it will help you with development or answering questions much like a typical chatbot would,
but powered within Positron.
(03:28):
This is early days, but I am eager to say that I've had some use of this recently. I'll share that in a minute. But the first round up that we're talking about today in our highlights is a great blog post by Stephen Turner
who just recently,
changed roles, and now he is an associate professor of data science at the University of Virginia. So congratulations,
(03:50):
Stephen.
He's actually been an early pioneer
of using
AI and LLMs
in his code development,
especially with Positron.
So he mentions that he's looked at this in the past in previous blog posts, which are linked in the aforementioned post here,
where at the time, this wasn't built in. So he looked at different solutions, like plug-in
(04:13):
code completion. For example, at that time, GitHub Copilot wasn't baked in. So he had tried frameworks like Codeium,
Tabnine, and Continue, which I've dabbled in a bit in the past.
But with this Positron Assistant, it now comes with batteries included, so to speak, with the use of GitHub Copilot,
(04:33):
which is something you can opt into,
in your GitHub account.
There is a professional version, but I think you can get away with just using their free tier for this. But, you know, you may have to fact check me on that. But it lets you configure that, and then you can use Copilot in your Positron IDE
for your code completion.
(04:55):
It isn't a radically
new thing per se in the world of IDEs, because even VS Code has been able to do this for years. RStudio had support for this earlier on, maybe a couple years ago, a year and a half ago.
But overall, you know, it does work well.
I will admit, in my uses, at times it still hallucinates just a little bit on what I wanna do, especially with some Shiny
(05:18):
stuff. But it has saved me some time, especially with writing change logs or writing release notes
or writing documentation.
It's actually not too bad, not too shabby.
It is much better than when I tried it a few years ago. So
promising there.
What I think Stephen was really curious about is the agent mode. And for this,
(05:40):
within the Positron Assistant, you can configure not just the GitHub Copilot tie in, but you can also
configure
the use of Anthropic's
Claude.
You bring in your Anthropic API key or your OAuth. Yeah, I think it's an API key for Anthropic.
That is the only one supported at the moment for this mode of the assistant, or the plug in, I should say.
(06:04):
Perhaps when it gets to production, it'll get future plug ins, or I should say, account tie ins.
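As a side note, here's a minimal sketch of one way to store that key, assuming the assistant can pick up the conventional ANTHROPIC_API_KEY environment variable, which is the same variable R tooling like ellmer looks for; the assistant can also prompt you for credentials in the IDE itself, so treat this as one option, not the required setup.

```r
# A minimal sketch, assuming the conventional ANTHROPIC_API_KEY environment
# variable is what gets picked up. Whether Positron Assistant reads it directly
# is an assumption; it can also prompt for credentials in the UI.

# Open your user-level .Renviron for editing (requires the usethis package):
usethis::edit_r_environ()

# Then add a line like this to .Renviron -- never hard-code the key in a script:
# ANTHROPIC_API_KEY=sk-ant-...

# After restarting R, confirm the key is visible to your session:
nzchar(Sys.getenv("ANTHROPIC_API_KEY"))
```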
But once you have that, he was curious about what it would be like to use this agent mode
to actually try creating a package.
And when he did, to test this out, he just simply
created a skeleton of a package with usethis. With create_package(), you get your kind of scaffolding lined up,
(06:30):
but a very minimal setup, maybe just your DESCRIPTION, your NAMESPACE, some placeholders.
And then
he asked the agent in this Positron Assistant
to basically
create a package
with a function
to reverse complement a DNA sequence, because Stephen has been heavily involved in genetics for years,
(06:55):
documented with roxygen2
and then write tests.
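To make that concrete, here's a minimal sketch, not Stephen's actual generated code, of the kind of function, roxygen2 block, and testthat test the agent was asked to produce:

```r
# Not the code the agent actually generated -- just a sketch of the kind of
# function, roxygen2 documentation, and testthat test it was asked for.

#' Reverse complement a DNA sequence
#'
#' @param seq A character string containing only A, C, G, and T.
#' @return A character string with the reverse complement of `seq`.
#' @examples
#' revcomp("ATGC")  # returns "GCAT"
#' @export
revcomp <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  comp <- c(A = "T", C = "G", G = "C", T = "A")[bases]
  paste(rev(comp), collapse = "")
}

# And a matching test, e.g. in tests/testthat/test-revcomp.R:
# test_that("revcomp reverse complements a sequence", {
#   expect_equal(revcomp("ATGC"), "GCAT")
# })
```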
And he has a screen capture of the agent in action to build this. Every time I see one of these, there's a part of me that's like,
oh, gosh, does that actually work? But in essence, it wasn't too shabby, he said, and you can see in the video,
(07:15):
it basically, you know,
scanned the structure of what he had in the directory,
you know, creates a function, writes the documentation in roxygen2,
and then
runs the tests that it helps create as well to check that that's working,
and then runs the devtools check to make sure everything is working,
(07:39):
and no errors were found.
He did have to make one little correction in the license field, but that was mostly due to the defaults of usethis.
He wanted to put in the MIT license, but that's all, you know, a very minor
enhancement there. But overall, he seems to be, you know, pretty impressed, at least with this very fit for purpose,
(08:01):
you know, much more streamlined
example.
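For reference, here's a rough sketch of the usethis and devtools loop described above; use_mit_license() stands in for the license fix Stephen made, and the rest mirrors the document, test, and check cycle the agent ran:

```r
# A rough sketch of the standard package-development loop, not a transcript of
# what the agent executed. use_mit_license() stands in for the license fix.

usethis::use_mit_license()  # swap the usethis placeholder license for MIT
devtools::document()        # regenerate NAMESPACE and man/ pages from roxygen2
devtools::test()            # run the testthat tests that were created
devtools::check()           # full R CMD check; the goal is 0 errors, 0 warnings
```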
Time will tell how this works with a larger scale
ask.
Maybe you have a more complex thing you're working on.
I do wanna mention, for the Shiny
enthusiasts out there, Mike and I included,
Positron does have support for the @shiny tag as well that you can put at the beginning of your query
(08:26):
so that it can be Shiny aware, so to speak. If you're trying to enhance an existing app, or maybe building a new app, then you're able to take advantage of its, I'm guessing, pretrained
hook into Shiny knowledge and documentation or whatnot. So that's a little note for the Shiny users out there. So Stephen's gonna look at this, you know, in more
(08:50):
robust situations, but overall,
not too bad.
And one thing to note is that this really didn't cost a whole lot
to run through his account.
It didn't cost too much. It was about
9¢
for that trivial example. You know? Yeah. Obviously, if you do that a whole bunch of times, it might add up a little bit, but
(09:15):
certainly way cheaper than some of the other services you might pay for in this space. So
the key though on that, and there's a footnote about this too,
is that
a lot of people are paying for, like, the default
account access to these services, which could be, like, $20 a month, sometimes even more depending on which tier you opt into.
(09:37):
He's doing this via the use of
buying API,
you might say, tokens or credits,
beforehand. They're kinda like a pool that you can draw from, and you pay as you go.
And then when you run out of them, you buy more.
In things like this, I think that's a huge cost savings
compared to using the more traditional accounts. So if you're new to this and you're wondering just which way to do this paid approach:
(10:04):
definitely, if the service you want to use offers this kind of token
bucket approach,
take that, and take it now. That way your budget person, whether it's you or your organization,
will thank you later, and it makes sure you don't run through your budget just with that default account access.
(10:26):
I've been using this a little bit. I did it for some more basic troubleshooting, and it did help very nicely, about 80% of the way there. So as always with these things, I'm pretty new to it, but definitely promising that now
I don't have to flip over to another interface just for that agentic chat. I can now do it within the confines of Positron.
(10:48):
So I think there is improvement to be had, but overall,
not too bad, all things considered. So I'll keep an eye on this space.
Yeah. This is pretty cool, and it's nice to
see the sort of seamless integration that we're getting now with AI assistants in the Positron
IDE.
It's a really nice sort of chat pane on the left side of the IDE that, you know, takes up maybe
(11:13):
20%
of the screen where you can sort of follow along and type in your prompts and take a look at what Claude, in this case, is thinking and doing, and then sort of see it happen in the other 80% of your screen, within the scripts and the editor and things like that. So that agentic mode was pretty wild, and I would definitely agree with the notion of trying to leverage
(11:37):
the APIs
here, in terms of Claude. From a cost perspective,
as mentioned in the blog post by Stephen, you really don't have to worry about running up some giant bill because
you can
and this is something that we've done at Ketchbrook with some of our
AI enabled Shiny apps that leverage
(11:58):
Claude on the back end,
is you can just do $5 at a time, essentially. I think that's probably the minimum that they make you preload.
And
we have, you know, maybe
10 or 20 prompts hitting that app a day. I know some of our users
are probably,
leveraging that API, some of our developers, I should say, to
(12:22):
have these AI sort of chat and edit features within the VS Code IDE.
And
I maybe have to spend $5 every six months. Like, it's
yeah. It's so cost effective.
I'm glad you are saying that, because, I mean, this will sound crazy.
I spend almost double that when I'm at the cafeteria at the day job for lunch. Like, six months, $5. That's amazing, especially when you hear about all the horror stories of people with AWS
(12:53):
accidentally racking up their bill because of something they kept turned on.
This hopefully reassures you that, Mike, you of all people, are not breaking the bank on this. That's amazing. No. And they give you a notification, I think,
when, I think we have it on auto renew, and they give you a notification when they're about to charge your credit card another $5 because you're coming up to that limit. But it's so cost effective right now, especially, you know, I'm all about self hosting.
(13:23):
And I think in terms of privacy,
there's a lot of times when we do that and have to do that, and it makes sense.
But if you do compare
the cost of a third party like Claude or OpenAI, I'm not sure how their pricing compares, but I imagine, you know, that they're gonna have to be competitive in terms of their API offering, so it's probably fairly close. But from a cost perspective, you know, versus self hosting, leveraging these third parties, the third parties, I think, really have it at this point in time. When you think about having to stand up sort of the cloud infrastructure, unless you already have your own server, you know, on your server rack like Eric has in his basement.
(14:04):
But we're not doing a whole lot of that these days,
leveraging the cloud. But a great walkthrough, I think, of how to
leverage this agentic mode within the Positron IDE that
if you didn't believe it before, you can see it and believe it. Because there's a couple great sort of short one minute, one and a half minute video clips in here that Stephen adds that I think really
(14:27):
tell the story, you know, just as well as he articulates it in the blog post. So a great walk through of the combination here of Positron and Claude in this case.
Yeah. And like I mentioned, I think, you know, right now, it's Copilot and Claude.
I would, again, not be surprised if we start seeing more offerings in that configuration
(14:49):
that you can do, including
for those that are living on the self hosted side that might bring their own, you know, Llama or a service on top of that. I've experimented with a few things, like you said, in that
nice little
basement server that I have here three feet away,
trying out some various things. One thing to note, and I remember, you know, Joe Cheng and others have said this,
(15:12):
Claude still seems to be, at least as of now, one of the better
R code, you know, LLM, you know, model
providers. I know there are some things that come in here and there, and there is some interesting work that
Simon Couch at Posit has done to help benchmark various models, so you want to check out his blog posts
(15:34):
to get a good take on those.
So my guess is these two will be kind of the standard for at least the short term, but
they're set up for success, I think, to bring in others
as this matures.
And, again, for the Shiny fans out there, just put the @shiny tag
before the start of your real query, and it'll basically call that Shiny assistant
(15:58):
right off the bat, provided you have the Shiny extension also
in your Positron extension list. So really,
really exciting to see and
I think
over time we'll start to see, at least from the Posit company standpoint,
the fruits of the labor of going in this newer direction for an IDE to give them more
(16:19):
opportunities to hook into novel technologies
to
make the data science development experience much easier down the road.
(16:40):
Well, Mike, in our last highlight here, it's a reminder that it's just a couple months away. Oh, it's coming up fast.
We're headed to posit::conf(2025).
You and I have the good fortune of being
speakers at the event. I'm also gonna be helping out with the R/Pharma Summit there. Lots of cool things happening.
And as usual, these conferences, there is a wealth
(17:04):
of different topics that are gonna be talked about
in the various presentations,
the lightning talks, the full talks, and the workshops.
It's a lot to take in. Definitely,
whether you're even on the fence about going or not, you'll probably want to look through the program, and then you might see, man, that's a lot to digest.
(17:24):
Well, this episode definitely has a theme to it, because
it's one thing to peruse the agenda via the traditional web page that they expose.
But what if you had a friendly interface to kinda ask questions and then it would give you the information about the conference right at your leisure? Well, that's where our next highlight comes in
(17:46):
because
our very own curator at R Weekly, one of our curators, Sam Parmar,
a few weeks ago
had released this quietly,
a Shiny app that's an LLM powered app to explore the posit::conf(2025) agenda.
And, yeah, it definitely got some attention, so much so that in this last highlight, Isabella Velásquez from Posit
(18:10):
has literally written a blog post on their official
blog here about
this new LLM powered,
you know, agenda pilot app.
So you can get to the link to the app. We'll put it in the show notes, of course, as well as the link to the post.
But this is really nice. It's a very straightforward interface.
(18:32):
You ask it a question.
In fact, in the blog post, Isabella asked what are all the sessions related to Positron, and
sure enough, within, you know, less than a second, it lists out all the different sessions there. If you wanna learn more about Positron, like we were just talking about a few minutes ago. Already I'm seeing two or three great talks in this screen capture,
(18:54):
four even. Wow. There's a lot going on there.
So name your topic.
You can find it in this
LLM powered
assistant here that Sam's put together.
If you had asked me if this was possible a year ago, I would have said, you're crazy.
But, no, we're not crazy. We're living the good life, I guess, here, because all of this,
(19:16):
every bit of this is powered
by open source in terms of the architecture and then leveraging
one of these aforementioned services under the hood to power the LLM itself. So what did Sam have to do to accomplish this? First,
he used the httr2 and chromote packages
to scrape the agenda details from the conference portal. So again, there's a bit of homework you have to do
(19:42):
to figure out, okay, what are my CSS selectors? How do I select this particular table information?
That's scraping
101 at that point, but it is definitely achievable as long as these agendas,
this is a note to future conference organizers,
put them in a web format. Makes it easier for things like this to exist.
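As a rough sketch of that scraping step, and not Sam's actual code: the blog post names httr2 and chromote, and here chromote drives a headless Chrome session so the rendered agenda loads, with rvest standing in for the parsing step; the URL and CSS selectors are placeholders you'd replace after inspecting the real conference portal.

```r
# A rough sketch only. chromote drives a headless Chrome session so a
# JavaScript-rendered agenda page fully loads; rvest stands in for the parsing.
# The URL and the CSS selectors below are placeholders, not the real ones.

library(chromote)
library(rvest)

b <- ChromoteSession$new()
b$Page$navigate("https://example.com/conf-2025-agenda")  # placeholder URL
b$Page$loadEventFired()  # wait for the page to finish loading

# Pull the fully rendered HTML back out of the browser session
agenda_html <- b$Runtime$evaluate(
  "document.documentElement.outerHTML"
)$result$value

sessions <- read_html(agenda_html) |>
  html_elements(".session-card") |>   # hypothetical selector
  html_element(".session-title") |>   # hypothetical selector
  html_text2()

head(sessions)
```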
(20:02):
So that was the data gathering exercise. He got all that data, then a little cleaning up with the tidyverse, and certainly the purrr package is involved as well.
But then how does the LLM
ingest this? Because obviously, the pregenerated, you know, training data that, say, OpenAI or Anthropic's Claude use definitely doesn't have this information.
(20:25):
This is definitely a novel development now,
but typically speaking, in this
workflow,
you would somehow feed this data
into a retrieval augmented generation, or RAG, approach.
Now there is a new package in the kind of AI ecosystem in R called ragnar
(20:47):
that helps you create
this more
readily available knowledge store,
so that the LLM can use it and query it for the information
based on what the user asks. This is relatively new, but I remember, you know, in my early days of exploring this, this seemed like a bit of a gap,
(21:07):
other than some real, you know, custom solutions I've heard people like James Wade over at Dow talk about a couple years ago, where you kind of engineer this all yourself.
ragnar is giving you a nice interface to do this all from the confines of R.
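Here's a rough sketch of what that ragnar workflow can look like, under the assumption that the package's store create, read, chunk, insert, and retrieve functions behave as documented; the file names, embedding choice, and query are placeholders rather than Sam's actual setup.

```r
# A rough sketch of a ragnar-based knowledge store, not Sam's actual code.
# File names, the embedding choice, and the query are placeholders.

library(ragnar)

# Create a DuckDB-backed store; OpenAI embeddings are assumed here
store <- ragnar_store_create(
  "agenda.ragnar.duckdb",
  embed = \(x) embed_openai(x)
)

# Read the cleaned-up agenda (placeholder file), chunk it, and insert it
chunks <- ragnar_read("agenda.md") |>
  ragnar_chunk()
ragnar_store_insert(store, chunks)
ragnar_store_build_index(store)

# At question time, pull back the chunks most relevant to what the user asked
ragnar_retrieve(store, "Which sessions cover Positron?")
```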
So you've got that knowledge store. So how do we let users use this? And that's where obviously Shiny comes in, because you can connect this with ellmer.
(21:33):
ellmer is really becoming that
hugely powerful engine
to power a lot of this from the R side. And I dare say the more I use it, the more I'm really enjoying the experience. And while I have the audio mic here,
an audio thanks to Hadley Wickham himself
(21:53):
who saw an issue I filed on the ellmer GitHub about AWS Bedrock support.
He turned it around in less than twenty four hours. So thank you, Hadley. I won't try to be spoiled about that in future PRs, but boy oh boy, that really makes my life easier at the day job. So ellmer
combined with shinychat
is how this Shiny interface is powered here. shinychat is a great package that gives you that, you know, very familiar looking query prompt and then streams the text results back in near real time, as if you're on one of the other external services.
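Here's a minimal sketch of wiring ellmer and shinychat together in that pattern, with the ragnar store from the previous sketch registered as a retrieval tool; the model, system prompt, and the store object are assumptions here, not the app's actual configuration.

```r
# A minimal sketch following the documented chat_ui() / chat_append() pattern.
# The system prompt, model choice, and `store` (from the ragnar sketch above)
# are assumptions, not the real app's configuration.

library(shiny)
library(shinychat)

ui <- bslib::page_fluid(
  chat_ui("chat")
)

server <- function(input, output, session) {
  chat <- ellmer::chat_openai(
    system_prompt = "You answer questions about the posit::conf(2025) agenda."
  )

  # Give the model a retrieval tool backed by the agenda knowledge store
  ragnar::ragnar_register_tool_retrieve(chat, store)

  observeEvent(input$chat_user_input, {
    # Stream the model's reply into the chat UI as it is generated
    stream <- chat$stream_async(input$chat_user_input)
    chat_append("chat", stream)
  })
}

shinyApp(ui, server)
```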
(22:30):
So that would be great if you're just looking at this locally. How do you share it with people? So in this case, Sam published it on Posit Connect Cloud, a newer
offering from Posit
that has
the, you know, the keys required for the authentication to OpenAI
baked right in, so the end user doesn't have to worry about this.
This is a part where I'm maybe saying this for Sam's benefit.
(22:54):
Obviously, this app probably is gonna take off a bit, especially in a month or so from now.
Hopefully, it doesn't break your bank, but we'll see. I guess later on, hopefully, you're taking the token approach like we were advocating
a few minutes ago. But nonetheless,
obviously, you can deploy this in many different ways, but Posit Connect Cloud certainly is a convenient way to make all that happen.
(23:18):
So again, look at where we are now, folks.
An AI powered conference explorer,
where in the past, we would have, like, maybe, say, a Shiny app to have, like, a dashboard or table
of these, you know, conference talks, and you can do filtering and whatnot. Yeah, it served a purpose at the time, but we know this is the way a lot of people are going to interact with this stuff. So they wanna find that one topic
(23:42):
or that one domain. They wanna find it quickly. So I could definitely see people using this literally at the conference itself when they start to decide what talks they wanna go to next. So I'm curious to see if it'll hold up. But
overall,
really impressed and credit to Sam for the awesome job on this and Isabelle for spreading the word.
(24:03):
Does this mean that filterable
DT tables and reactable tables are a thing of the past, Eric?
Ask me another day, Mike. I'm not sure if I wanna talk about that now.
I think the answer is no. I agree with you. But this chatbot is awesome.
And
I gave it a shot, and I asked if there were any
(24:26):
talks about building data science teams that use both R and Python.
And,
my talk is
multilingual
data science, essentially.
So it's not like, you know, I'm not using all of the keywords perfectly in there. I wanted to see if it could figure it out. And my talk was the first one that came back. So I was very impressed with this. The response time was very fast, which made me immediately dive into the GitHub repository,
(24:52):
which has been linked, I think, in the blog post as well as if you're on the app itself. There's a nice little hyperlink button on the left side of the app. And trying to figure out exactly how this goes on
under the hood. Looks like there's a few different OpenAI models that take place, maybe one for,
creating the vector database. It looks like there's some evals that Sam is doing, which is really cool to see how well it's performing.
(25:19):
And then obviously, the call itself
to
be able to retrieve the
answers to the user's prompt essentially.
So,
really interesting. There's also a requirements.txt
file in here with a bunch of Python packages that says it was generated
by rsconnect-python.
I couldn't find any Python code. So I'm trying to figure out if that's legacy or if this is truly a multilingual project in and of itself. There's DuckDB going on on the back end, so I think it's a really cool
(25:53):
repository
as well. Sorry to get, you know, in the weeds here, but I think it's a great repository to be able to use in future projects and take a look at sort of how all the different pieces
came together between, you know, creating that vector database,
having system prompts,
working with shinychat
(26:13):
and
getting sort of everything to work together, including evals, which I feel like a lot of folks just sort of forget to
care about, or
have some sort of approach to evaluate the accuracy of your LLM model that you're creating. So awesome job. This is definitely a repository that I'm going to
come back to quite a bit as well as
(26:34):
hit this, especially as the conference comes closer, hit this app pretty hard, probably to help build my agenda, which I think is gonna be really cool. I imagine, as you mentioned, Eric, that there may be a lot of folks, if this gains in popularity, that will be hitting this pretty hard, especially as it comes towards September here.
And my hope is that Posit would be willing to split the bill here,
(26:57):
but we'll see.
My guess is, or if I had to guess, there's gotta be some kind of support here for this, because this is gonna make their life a lot easier too, to boot. Right? I mean, good grief, they didn't have to build this themselves. The community came through once again with one of these. It's
it's amazing to see. And, yes, I was,
while you were talking there, seeing if it could pick up my talk, because I just asked, can you tell me if there are any presentations about Shiny bookmarkable state? And thank goodness mine did get picked up there, even for such a niche kind of topic. I found it very quickly. So I'm excited to see
(27:30):
see what else I can find in this exploration. But,
yes, I am gonna geek out on the GitHub repo that Sam has put together because
this is, I'm sure you can attest to this, Mike, this is something we are being asked time and time again.
How do we leverage tech like this
in apps that focus maybe solely on it, or maybe there's a part of the app that focuses on having this tie in. So I think
(27:56):
I'm gonna definitely check this out a bit, because in my
explorations with tying ellmer and AWS Bedrock together, I just had a major breakthrough. And, again, thanks to Hadley for pushing that to the finish line. I think I'm gonna be starting to play with this a bit.
But, hopefully, my job security of doing, you know, fun interfaces with filterable tables is still intact. I think it will be, like you said, but, you know, we gotta give in to the times too sometimes.
(28:23):
I hope so as well. Well, Eric, I just asked it, are there any podcast hosts who are presenting at the conference? Just to see what would happen.
It says,
yes, Eric Nantz
is a podcast host presenting at posit::conf(2025), the voice behind the R Podcast,
and you co host the R Weekly Highlights podcast.
I was not mentioned, which just further goes to prove
(28:47):
the correct notion that you are the star of the show.
Oh,
we don't wanna end on that, do we? But, no, that's intriguing, nonetheless.
So How funny is that?
That is, that's a cute one. Yeah.
You know what's next? Somehow, I've heard this on other podcasts. There will be
(29:07):
some service that somehow takes my voice and butchers it to do some random things. Like, it's happening, folks. It's just a matter of time. So for those of you listening out there, don't take that as a challenge. Somebody else is probably gonna do it.
We're already there. We're already there. So yep. Well, on that, you know, rather, entertaining
(29:27):
note, I think,
well, I don't have much to follow after that. So we're gonna close up shop here pretty quick. We got some aforementioned
things to put out at our respective day jobs here, getting back to the weeds, so to speak. But, you know, really fun issue to talk about today.
And, you know, much like what Sam demonstrated
here, he from the community has helped the industry greatly.
(29:50):
You all in the community can help R Weekly greatly.
You know how to do that. You can find the upcoming draft of this issue and if you found that great new resource, that new blog post, that new app, that great new package that's doing mind blowing things,
you can send us a pull request with the template already provided for you to put that into the upcoming issue draft, and our curator for the week will be glad to review that and most likely merge that in on the spot. So we definitely value your contributions there. And we also value hearing from you and the community
(30:23):
for all the various things that we ever get wrong or things you like hearing about on the show. We'd love to hear from you.
We have a contact form right in the episode show notes. Definitely take advantage of that. Or if you're just in your podcast app and you wanna quickly send some stuff to us, if you're on a modern podcast app, you can send us some fun little boosts along the way as well. Details are in the show notes. And if you wanna get in touch with us on social, well, social media that is, you can find me on Bluesky at rpodcast.bsky.social.
(30:54):
You can also find me on Mastodon at rpodcast@podcastindex.social.
And last but not least, you can find me on LinkedIn
after you sift through all the AI generated fluff sometimes on there. You can find me. Just search my name and you'll find me there causing all sorts of chaos.
Mike, where can the listeners find you?
(31:14):
You can find me on Bluesky at mike-thomas.bsky.social. You can find me on LinkedIn.
If you search Ketchbrook Analytics, K-E-T-C-H-B-R-O-O-K, you can see what I'm up to lately. And although I am a business owner and a CEO, you will not find me at any Coldplay concerts.
(31:36):
Yeah. I'm glad that wasn't your face that popped up on that when I saw that feed pop up. Yeah. Data science adjacent, though.
Airflow. That was. Yes. That was. Yes. And there were some ramifications,
as they might say. That made the waves. So
Mike and I are not that,
that kind of person, so we are safe in our respective homes here. But we hope you are safe in your respective home wherever you listen to this episode. We promise we won't cause you any additional chaos. That will conclude this episode of R Weekly Highlights.
(32:07):
Thank you so much for listening from wherever you are, and we hopefully will be back with another episode of R Weekly Highlights
next week.