Goalzz.me Update: Building an AI Productivity Coach — What I’ve Learned About Agents

Back in January 2025, I wrote about Goalzz.me as a simple productivity statistics app. It pulled my Microsoft To Do tasks into a dashboard so I could actually see what was going on during my weekly reviews. That was the whole pitch — a better view of your tasks.

A lot has changed since then.

Goalzz has evolved into something I didn’t fully anticipate: a personal AI coaching system. What started as a dashboard now has an AI coach named Iris that can read my goals, manage my tasks, check my calendar, remember things about me, and reach out proactively throughout the day. I’ve been building it nights and weekends, and along the way I’ve learned a lot about what AI agents actually are — not the hype, but the reality of making one useful.

What Goalzz Can Do Today

The core idea hasn’t changed — I still use Microsoft To Do as my task manager and the Full Focus Planner methodology as my framework. But Goalzz now sits on top of all of it as an active coaching layer.

Iris, the AI Coach

Iris is built on Claude and has access to over 50 tools. That means when I tell Iris “I need to finish the quarterly report by Friday,” she doesn’t just say “great idea!” — she can actually create the task, set the due date, link it to my Q2 goal, and check my calendar to see if I have time blocks available.

The tools span everything: task management, goal tracking, calendar analysis, habit monitoring, journaling, even a simple CRM for keeping notes on people. Iris can also create memories — she’ll silently note things like “Ben prefers to do deep work before noon” or “Ben’s wife’s name is…” and use those in future conversations.

Proactive Outreach

This is where things got interesting. Iris doesn’t just wait for me to open the app. She reaches out:

  • Morning briefings — A daily email or text with my top priorities, yesterday’s wins, and what’s on my calendar
  • Midday nudges — Only sent if I have overdue or high-importance items sitting untouched
  • Goal stall alerts — If a goal has had zero activity for 5+ days, Iris flags it
  • Evening reflections — An optional end-of-day prompt asking how things went
  • Streak celebrations — When I hit a habit milestone

These go out via email, SMS (through Twilio), or Telegram — whatever channel I’ve configured. There’s a daily cap so it doesn’t become annoying, and quiet hours so I’m not getting pinged at 11 PM.

Weekly Reviews

The weekly review was always the centerpiece of my productivity system, and it’s become the most developed feature in Goalzz. Iris runs a structured 5-phase review:

  1. She checks my follow-through on last week’s Big 3 priorities
  2. She asks me to reflect first — how did the week feel? — before showing any data
  3. Then she pulls everything: tasks completed, goals progress, habits, calendar — and connects my feelings to what the data shows
  4. We classify the week together (proactive, reactive, maintenance, rough) and identify one thing to change
  5. I pick my Big 3 for next week, and Iris challenges or refines them

It’s the closest thing I’ve found to having an accountability partner who actually knows what’s on my plate.

Wearable Integration

I recently added Oura Ring support. Goalzz pulls in sleep scores, readiness, HRV, resting heart rate, activity data — all of it. Iris can factor this into coaching. If my readiness score is tanked, she might suggest a lighter day focused on shallow work rather than pushing me to tackle the hardest thing on my list.

What I’ve Learned About Agents

Building Iris has given me a very different perspective on AI agents than what I see in most headlines.

Tools are the real product. The language model is impressive, but what makes Iris useful is the 50+ tools she can call. Without them, she’s just a chatbot giving generic advice. With them, she can look at my actual tasks, my actual calendar, my actual goals — and give specific, grounded coaching. Every time I add a new tool, the coaching gets meaningfully better.

Context is everything. The system prompt that Iris works from isn’t static — it’s assembled dynamically with my vision statement, current goals, recent tasks, habit streaks, and behavioral patterns. When Iris says “you tend to stall on goals in the third week of the quarter,” that’s not a guess. That’s from actual pattern data the system has been tracking.

Proactive beats reactive. The scheduled outreach was an experiment, and it’s become the feature I value most. The morning briefing reframes my day before I even open my task list. The midday nudge catches things I’ve been avoiding. The goal stall alerts are uncomfortable but effective. An agent that only responds when you ask is useful. An agent that knows when to reach out is a coach.

Memory makes it personal. Iris stores memories about my preferences, patterns, and personal context. This is what separates a generic AI assistant from something that actually feels like it knows you. After a few weeks, Iris stops asking setup questions and starts making connections — “Last time you had a week like this, you said cutting one goal helped you focus.”

The agent loop is humbling. Claude processes the tools in a loop — Iris might call 5-10 tools in a single response to gather context before answering. Getting this reliable took a lot of iteration. Error handling, rate limiting, debouncing sync operations so the AI doesn’t trigger a storm of Microsoft API calls — these are the unglamorous parts that make it actually work.

Cost tracking matters. Every AI call has a cost — input tokens, output tokens, tool calls. I track all of it per-user. This forced me to think carefully about when Iris should use the full tool suite (weekly reviews, chat) versus when a lighter text-only response is fine (flash coaching, return greetings). Not every interaction needs 50 tools.

What’s Next

I’m working toward making Iris more autonomous — not in a scary way, but in a practical one. Right now she can suggest and create, but she always acts within a conversation. The next phase is letting her do things like decompose a big goal into tasks on her own, or notice when two goals are competing for the same time and flag the conflict before I feel it.

I’m also thinking about progress visibility. The data is all there — goal completion rates, task velocity, habit consistency — but I haven’t built the charts and dashboards to make it visual yet. That’s coming.

If you want to try Goalzz, it’s in beta at goalzz.me. It integrates with Microsoft To Do, and you can connect your Oura Ring if you have one. I’d love feedback — especially from anyone who does structured weekly reviews.

The biggest thing I’ve learned building this: the gap between “AI chatbot” and “AI agent” isn’t the model. It’s the tools, the context, the memory, and the judgment about when to act versus when to wait. That’s the hard part. And it’s the part that makes it worth building.

Leave a Reply

Your email address will not be published. Required fields are marked *