Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
Welcome to Innovation Pulse, your quick no-nonsense update covering the latest in startups and
(00:10):
entrepreneurship news. Anthropic, an AI company founded by former OpenAI researchers, has
secured a $2.5 billion credit line, while Affiniti is transforming expense management
software for small businesses with innovative tech. After this, we will dive deep into measuring
software developer productivity and its impact on the tech industry.
(00:34):
Anthropic, an AI company founded by former OpenAI researchers, recently secured a $2.5
billion five-year revolving credit line to bolster its position in the rapidly evolving AI market.
Known for its Claude chatbot, launched in March 2023, Anthropic has swiftly scaled its operations,
(00:57):
achieving an annualized revenue of $2 billion in the first quarter, doubling from the previous period.
This financial boost aids Anthropic's expansion and strengthens its balance sheet amidst the
intense AI competition. With a recent valuation of $61.5 billion, the company is seeing a surge
(01:19):
in high-value customers, with those spending over $100,000 annually increasing eight-fold
from the previous year. Major financial institutions like Morgan Stanley, Barclays and JP Morgan
are backing this credit facility, showcasing confidence in Anthropic's business model and mission.
(01:40):
As the AI sector races towards a projected $1 trillion revenue within a decade, Anthropic's
strategic financial moves position it uniquely in the competitive landscape, enabling it to keep
pace with industry giants like Google, Amazon and OpenAI in the quest to develop advanced AI agents.
(02:04):
Affiniti, founded by Aaron Bai and Sahil Phadnis, is revolutionizing expense management software
for small businesses, offering features traditionally enjoyed by tech startups.
After raising an $11 million seed round, they quickly secured a $17 million series A, led by
SignalFire. Affiniti targets small and mid-sized businesses like pharmacies and auto dealerships,
(02:31):
providing them with customizable expense management credit cards and software.
Unlike traditional options from banks like American Express, Affiniti delivers what
Bai calls V3 fintech, which not only improves UX and data access but also offers analytics and
financial advice crucial for businesses lacking finance teams. The Series A funding
(02:57):
will enable Affiniti to introduce features like banking, bill pay and cash flow analytics, and
integrate with enterprise and point of sale apps. Current offerings include customized cashback
rewards, native QuickBooks files and short-term loans against invoices. Instead of broad targeting,
(03:18):
Affiniti focuses on niche verticals with complex cash flow, achieving rapid growth to 1,800
customers and $20 million in monthly transaction volume. The platform is on track to handle $1
billion in transactions by year-end, with revenue growth around 10-fold in a year.
(03:39):
And now, let's pivot our discussion towards the main entrepreneurship topic.
All right everybody, welcome to another episode of Innovation Pulse. I'm your host,
Dana, and I'm joined as always by my brilliant co-host, Yakov Lasker. Today we're diving into
(04:00):
a topic that's been causing quite a stir in engineering departments everywhere.
Yakov, why don't you set the stage for us?
Thanks, Dana. Today we're tackling the thorny question of measuring developer productivity.
Companies are obsessed with metrics these days, but when it comes to software development,
can you really capture a developer's effectiveness with numbers?
(04:23):
Is counting lines of code like measuring an aircraft's progress by weight?
Oh, I love that analogy. Was that Bill Gates who said that?
Spot on. Gates made that comparison to highlight how misleading pure code volume can be,
and he's not alone in being skeptical. Martin Fowler, that guru from ThoughtWorks,
argues we fundamentally have no way of reasonably measuring software developer productivity.
(04:48):
Yet companies keep trying. I was talking to a CTO last week who's tracking everything from
commit counts to story points per developer. I couldn't help but wonder if that approach
is helping or actually hurting their team. That's exactly the tension. Let's break down some of
these common metrics. First up is the classic lines of code, or LOC. Simple to count, it gives you a rough
(05:12):
indicator of system size, but completely misses the point of good code, right?
I remember a senior developer telling me her proudest achievement was deleting 2,000 lines of
code while keeping all the functionality. Under an LOC metric, that would look like negative productivity.
Exactly. As Fowler points out, a good developer might achieve in 10,000 lines what another does in
(05:36):
100,000. The shorter version isn't less productive. It's just more elegant and probably easier to
maintain. What about commit counts and pull requests? I see those on dashboards all the time.
Ah, the green squares on GitHub. The problem is they're easily gamed. One engineering leader
(05:57):
noted that after they started tracking commit counts, developers suddenly started splitting
changes into extra commits. One change, five commits, tada, productivity quintupled, on paper
anyway. I've totally seen that happen. So what about story points? That's supposed to be more
meaningful, right? Story points are interesting because they were designed for team estimation,
(06:20):
not individual performance. Scrum experts are actually quite emphatic about this. One quote
from scrum.org stands out. If you start measuring individual performance by story points, it's
game over. This will crush the spirit of teamwork. That makes sense. If I'm measured on my personal
(06:42):
point quota, why would I stop to help a colleague? I'd be shooting myself in the foot, metrics-wise.
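The quota problem just described can be shown with a toy sketch. All names and point values here are hypothetical, just to make the incentive visible:

```python
# Toy illustration of why individual story-point quotas discourage helping.
# All names and point values are hypothetical.

def individual_points(assignments):
    """Sum story points per developer from (dev, points) records."""
    totals = {}
    for dev, points in assignments:
        totals[dev] = totals.get(dev, 0) + points
    return totals

def team_total(assignments):
    """Total points delivered by the whole team."""
    return sum(points for _, points in assignments)

# Sprint A: Alice works solo on her own tickets.
solo = [("alice", 8), ("alice", 5), ("bob", 3)]

# Sprint B: Alice spends a day unblocking Bob; the team finishes the
# same total work, but the 5-point ticket lands under Bob's name.
helping = [("alice", 8), ("bob", 5), ("bob", 3)]

print(individual_points(solo))     # Alice's tally looks strong
print(individual_points(helping))  # Alice's tally drops...
print(team_total(solo) == team_total(helping))  # ...yet team output is identical
```

The team delivers 16 points either way; only the individual leaderboard changes, which is exactly the behavior an individual quota punishes.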
Exactly. This leads to a broader cultural problem with individual metrics. They can create an
environment where knowledge sharing takes a backseat. I've seen teams where developers hoard
tasks or avoid helping others because they're worried about their own numbers. That sounds toxic.
(07:05):
What about the psychological impact? Do we know how developers feel about being measured this way?
There's a telling anecdote about that. When one CTO suggested tracking commit counts,
a senior engineer called it garbage and warned that no developer would want to be measured like
this. Engineering is creative work. Imagine if we tracked an artist's productivity by counting
(07:30):
brush strokes. Congratulations, Picasso. You've hit your KPI of 500 brush strokes today. Keep it up
and you might earn that bonus. Exactly. And this brings us to Goodhart's Law, which warns that
when a measure becomes a target, it ceases to be a good measure. People will change their behavior
to hit the metrics, often in ways that aren't actually helpful. Like splitting commits or inflating
(07:54):
story point estimates? Precisely. Or rushing through bug fixes without addressing root causes,
just to close more tickets. One engineering blog gave the example that focusing too much on bug
count could drive teams to favor quick and dirty fixes, creating technical debt that will cause
bigger problems down the road. So are all metrics bad news? Should we just throw our dashboards out
(08:19):
the window? Not necessarily. The consensus from industry leaders seems to be that metrics can be
valuable if used wisely. McKinsey's 2023 report claims that, yes, you can measure developer
productivity, but not with one magic number. You need a balanced approach. What does a balanced
(08:40):
approach look like? There's this framework called SPACE that I find really helpful. It stands for
satisfaction and well-being, performance, activity, communication and collaboration, and efficiency and flow. The idea is to look
at multiple dimensions rather than fixating on just activity metrics like commits or PRs. I like that
(09:02):
holistic view. So satisfaction would be like, are developers happy and engaged? And performance would
be actual outcomes like features shipped or customer impact? You've got it. And then communication
metrics might track how well developers collaborate. Things like code review participation.
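The multi-dimensional idea can be sketched as a simple scorecard. Everything here is illustrative, the dimension weights, field names, and sample values are hypothetical, not part of the SPACE framework itself:

```python
# A minimal sketch of a SPACE-style scorecard: blend several dimensions
# instead of ranking on a single activity metric. Weights and sample
# values are hypothetical.

SPACE_DIMENSIONS = (
    "satisfaction",    # e.g. survey score, normalized to 0-1
    "performance",     # outcomes, e.g. features shipped vs. planned
    "activity",        # normalized activity volume (commits, PRs)
    "communication",   # e.g. share of code reviews participated in
    "efficiency",      # flow, e.g. uninterrupted focus-time ratio
)

def space_scorecard(team_metrics, weights=None):
    """Weighted average across SPACE dimensions (all values in [0, 1])."""
    if weights is None:
        weights = {dim: 1 / len(SPACE_DIMENSIONS) for dim in SPACE_DIMENSIONS}
    return sum(team_metrics[d] * weights[d] for d in SPACE_DIMENSIONS)

team = {
    "satisfaction": 0.7,
    "performance": 0.8,
    "activity": 0.9,       # lots of commits...
    "communication": 0.3,  # ...but little review participation
    "efficiency": 0.5,
}

print(round(space_scorecard(team), 2))
```

On activity alone this team looks great; the blended view surfaces the weak collaboration dimension, which is the whole point of looking at multiple dimensions together.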
By looking at all these dimensions together, you get a much more realistic picture than any
(09:27):
single metric could provide. You know what I've been wondering throughout this conversation,
Yakov? We're talking about how challenging it is to measure individual developer productivity,
but I've worked with sales and marketing teams where individual metrics seem to work just fine.
Why is software development so different? That's a fantastic question, Dana. Software
(09:47):
development is fundamentally different from departments like sales, support, and marketing,
where individual leading indicators tend to be much more useful and effective.
Tell me more about that. I mean, sales has quotas, marketing has lead generation targets,
customer support has tickets resolved, all individual metrics that seem to work reasonably well.
(10:09):
Right. In those departments, there's often a clearer line between individual effort and outcome.
A salesperson closes a deal that's countable revenue directly tied to their work.
A support rep resolves a ticket. That's one customer problem solved. These are relatively
atomic, discrete tasks where one person's contribution is easy to isolate.
(10:34):
And I'm guessing software development isn't like that?
Not at all. Software development is inherently collaborative and interconnected.
The code I write today might be building on someone else's foundation from six months ago,
or my contribution might be reviewing and improving another developer's work rather than
writing new code myself. So in sales, if I make a hundred calls and close 10 deals,
(10:59):
that's clearly my work. But in development, if a feature ships, it's usually the result of multiple
people's contributions. Exactly. And the value of those contributions isn't always immediately
apparent. A developer might spend weeks refactoring code that doesn't add any visible features,
but prevents future bugs and makes the next 10 features easier to implement.
(11:23):
How do you measure that impact? I see what you're saying. In marketing,
if I run a campaign that generates 500 leads, that's a clear metric. But in development,
sometimes the most valuable work isn't easily quantifiable.
Precisely. And there's another fundamental difference. In sales or support, the work tends to be more
(11:43):
homogeneous. One sale is comparable to another. One support ticket follows similar patterns.
But in development, every problem can be unique. The time it takes to build feature A versus feature
B might differ by 10X, not because of developer skill, but because of inherent complexity.
So applying the same metric across different types of development work is like comparing
(12:07):
apples to quantum physics? That's not far off. And here's another angle. In sales or marketing,
the time horizon is relatively short. You make a call, you close a deal, you run a campaign,
you count the leads. But in software, the true value of work might not be apparent for months or
(12:27):
years. Speaking of different contexts, I'm curious about how company size affects these metrics.
Do large enterprises and small startups approach developer metrics differently?
Absolutely. In large enterprises, leaders often seek metrics to gain visibility across many teams
and projects. They tend to focus on process efficiency and consistency using standardized
(12:51):
frameworks. Like those DORA metrics, deployment frequency, lead time and reliability? Exactly.
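Two of those DORA measures can be roughly sketched from deploy records. The data and record layout here are hypothetical, not any particular vendor's schema:

```python
# Rough sketch of two DORA metrics, deployment frequency and lead time
# for changes, computed from hypothetical (commit time, deploy time) records.
from datetime import datetime, timedelta

deploys = [
    (datetime(2024, 5, 1, 9, 0),  datetime(2024, 5, 1, 15, 0)),
    (datetime(2024, 5, 3, 10, 0), datetime(2024, 5, 4, 10, 0)),
    (datetime(2024, 5, 7, 8, 0),  datetime(2024, 5, 7, 20, 0)),
]

def deploys_per_week(records, window_days=7):
    """Average number of deploys per 7-day window over the observed span."""
    first = min(deploy for _, deploy in records)
    last = max(deploy for _, deploy in records)
    span_days = max((last - first).days, 1)
    return len(records) * window_days / span_days

def median_lead_time(records):
    """Median commit-to-deploy delay."""
    delays = sorted(deploy - commit for commit, deploy in records)
    return delays[len(delays) // 2]

print(deploys_per_week(deploys))
print(median_lead_time(deploys))
```

Note that both are team- or pipeline-level numbers: nothing here attributes a deploy to an individual, which is consistent with using DORA to benchmark teams rather than rank people.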
Large organizations can use those to benchmark teams against industry standards. They also
typically track quality metrics like defect rates and product value metrics like features delivered
to ensure productivity translates into business impact. But I imagine implementing metrics in a
(13:16):
5,000 person engineering org is quite different from a 20 person startup. Completely different.
Big companies often have dedicated teams measuring developer experience with sophisticated dashboards
to find systemic bottlenecks like slow build times or excessive meetings. They need that structure to
manage complexity. What about the risks? I would think larger organizations might be more tempted
(13:41):
to misuse these metrics. That's a real concern. With hundreds of developers, there's definitely
temptation to rank people by numbers. That's why forward thinking enterprises use metrics to
identify teams that need help rather than singling out low performing individuals. Privacy and trust
become even more crucial at scale. And for smaller teams and startups? They typically prioritize
(14:05):
speed of delivery and customer feedback above all. Their metrics tend to be more outcome focused,
user adoption, feature completion or revenue impact per engineer rather than internal process metrics.
I guess when you have a 10 person team, everyone's contributions are pretty visible day to day.
Exactly. Startups rely more on direct observation and conversations than dashboards.
(14:31):
They might track something simple like lead time per feature or error rates. But these are typically
discussed in all hands meetings rather than formal reports. It seems like small teams can be more
flexible and qualitative in their approach. Absolutely. They should avoid over-bureaucratizing
metrics. Rigid KPIs introduced too early can reduce the flexibility and creativity that small
(14:56):
teams thrive on. As small teams pivot quickly, their metric frameworks should stay nimble,
focus on a few key indicators rather than drowning in data. Speaking of complexity in measuring
developer productivity, there's a massive elephant in the room we haven't addressed yet. AI assisted
coding tools like GitHub Copilot and Claude are completely changing the game. How does that fit
(15:20):
into our metrics conversation? Oh wow, that's such a timely point, Dana. AI coding assistants are
turning traditional developer metrics on their head. Just when companies thought they had figured
out how to measure productivity, along comes AI to complicate everything. Right? Imagine trying to
measure a developer's output when they can generate 200 lines of code with a well-crafted prompt.
(15:43):
Lines of code as a metric just got even more meaningless. Absolutely. And the variability
is huge. Two developers might solve the same problem, but one uses AI effectively and produces a
solution in 20 minutes, while the other takes three hours coding everything manually. Both
solutions work equally well, so who was more productive? It's a fascinating question. In
(16:08):
some ways developer work might be shifting toward prompt engineering and solution architecture
rather than manual coding. I think you've hit on something important there. The metric that matters
might be becoming time to solution rather than volume of code written. The best developers might
actually write very little code themselves in the future. Instead, they'll excel at directing AI to
(16:33):
generate well-structured, maintainable code. So in a way, AI might be making individual metrics
even less useful since the actual coding is being partially outsourced to the AI. In some ways,
yes. But it also raises new questions about what skills we should be measuring. Maybe instead of
tracking code volume, we should be looking at a developer's ability to effectively leverage these
(16:57):
new tools. Their AI utilization effectiveness, if you will. I can already see the next wave of
dashboards. AI prompt efficiency index and Copilot utilization rate. Please don't give them ideas.
It sounds like the rise of AI in coding means we need to focus even more on outcomes rather than
output. Did the problem get solved? Is the solution maintainable? Does it meet requirements?
(17:21):
Those questions matter more than how the code came to be. You've nailed it. It's pushing us
toward what we should have been measuring all along, the impact and value of the solution,
not the mechanical process of producing it. Maybe AI is forcing a healthy recalibration of our
metrics. So what advice would you give to engineering leaders navigating this new AI-assisted
(17:44):
development landscape? I'd say embrace the change rather than fight it. Accept that traditional
individual metrics like lines of code are becoming even less relevant. Focus on measuring outcomes
and impacts, working software, customer value, problem resolution, and create a culture where
using AI effectively is celebrated rather than hidden. And from a metrics perspective? Consider
(18:10):
developing new frameworks that account for this hybrid human AI coding environment. Maybe what
we should measure is a developer's ability to break down problems effectively, to know when to
code manually versus when to leverage AI, and to evaluate and refine AI-generated solutions.
These meta-skills might become the new productivity differentiators. Fascinating,
(18:33):
so the metrics themselves need to evolve alongside the technology. Exactly, and organizations that
cling to outdated metrics will not only get misleading data, they might actively discourage
the effective use of these powerful new tools. This conversation reminds me of the principle
that what you measure is what you get. If we measure individual output alone, we get individualism.
(18:56):
If we measure team outcomes and collaboration, we get teamwork. That's really the heart of it.
No metric is neutral. Each one sends a signal about what the organization values. So choose
metrics that align with your true goals and culture. If you value knowledge sharing and
quality, don't just measure raw output. And remember that context matters. A developer tackling a
(19:21):
complex critical task might close fewer tickets or write less code. Yet their contribution could be
more valuable than someone churning through trivial changes. Absolutely, as Allen Holub put it,
focusing too much on raw productivity numbers destroys the careers of highly productive people
who happen to be working on very hard problems. So what should tech leaders take away from all this?
(19:44):
I'd say there are three key takeaways. First, be wary of any single metric claiming to capture
developer productivity, especially individual metrics like lines of code or commit counts.
Second, focus more on team outcomes and business impact than raw activity. And third,
(20:06):
use metrics to drive improvement conversations, not as a cudgel for performance reviews.
And I'd add, listen to your developers. If they're telling you a metric feels unfair or is causing
unhealthy behaviors, take that seriously. The goal should be creating an environment where great work
can happen, not optimizing for dashboard numbers. Couldn't have said it better myself. If your
(20:30):
metrics are creating anxiety and competition rather than clarity and improvement, something's wrong
with the metrics, not the developers. Well, this has been an illuminating dive into the world of
developer metrics. If you're a tech leader listening to this, I hope we've given you some food for
thought about how to approach measurement in your organization. And if you're a developer being
(20:51):
measured by some of these metrics, maybe share this episode with your manager. Subtly, of course.
Oh, I just happened to be listening to this fascinating podcast.
That explains why counting my GitHub commits might not be the best way to evaluate my work.
That's all for today's episode of Innovation Pulse. Thanks for tuning in. And remember,
(21:13):
not everything that counts can be counted and not everything that can be counted counts.
Until next time, keep innovating.
We've explored how Anthropic and Affiniti are making waves in the tech industry with their
innovative solutions and impressive growth and delved into the complexities of measuring
(21:35):
software developer productivity, advocating for balanced and outcome focused metrics.
Don't forget to like, subscribe, and share this episode with your friends and colleagues
so they can also stay updated on the latest news and gain powerful insights. Stay tuned for more updates.