User Experience in the Age of AI

Artificial intelligence is no longer a futuristic concept—it’s here, reshaping industries and changing the way we work, communicate, and solve problems. But while AI’s potential is undeniable, its success hinges on how effectively it meets real user needs.

Designing user experiences with large language models is challenging because the models are fundamentally tools that can sometimes be wrong. This fuzziness makes it very important to:

  • Break Complex Tasks into Smaller Ones
  • Build for Failure and Handle Edge Cases

In this article, I'll explore some of the challenges I experienced while designing a chatbot and scheduling assistant, and share some of the prompts and approaches that have worked for me.

"AI is as revolutionary as personal computers, mobile phones and the internet" - Bill Gates (The Age of AI Has Begun)

Many companies are rushing to insert artificial intelligence into existing products. Some of these implementations are genuinely transformative. Tasks that once required a user interface may no longer need one, and entirely new workflows are emerging. Some users and designers are wary of the speed at which these features are getting added.

“Microsoft & Google adding generative AI into office apps is a classic pattern of incumbents making the new thing a feature. But the new thing generally also enables completely new ways to solve the problem. ‘Easier spreadsheets’ is less important than ‘why is that a spreadsheet?’” - Benedict Evans (on X)

Both Microsoft and Google have published principles on how and when to use AI effectively and they are great reads. At a high level, they both emphasize the importance of user-centric design, transparency and clear communication.

Discoverability vs. Chat

We are still in the very early days of AI usefulness and adoption, but the dominant interface that has emerged is an old one: chat. The success of ChatGPT has led multiple other companies to copy and paste that interface into their own products, and not everyone loves it:

If companies are determined to shove chatbots into their products with no consideration as to whether their users want, need, or will benefit from them, then there's no way to "design better." - Internanal-Box47 (reddit)

Chat as an interface is something that I love (but I also prefer using the terminal and feel nostalgic for my old MS-DOS prompts). Unfortunately, it breaks the two basic characteristics that good user experiences share: discoverability and understanding.

Two of the most important characteristics of good design are discoverability and understanding. It is easy to design devices that work well when everything goes as planned. The hard and necessary part of design is to make things work well even when things do not go as planned. - Donald Norman (The Design of Everyday Things)

Chat users are greeted with a blank canvas and no indication of what is possible. In traditional UX, we solve the problem of discoverability with intuitive buttons or graphical indicators, but chat has none of those. If we are to make chat useful, it needs to be very smart, especially since user expectations tend to become more unrealistic as dialogue becomes more convincing and human-like. When it feels like you are just chatting with a human and not a really advanced autocomplete, you start to expect that it understands what you mean.

My Experience Building Scheduled Messages

I have been working on a smart personal assistant that is built on top of these powerful new large language models, and recently had to solve a particularly challenging design - I wanted to allow users to request a message be delivered later. Some of the use cases I had in mind were:

  • "Remind me to call my wife and ask about dinner in 2 hours"
  • "Remind me to check my TODO list the night before Black Friday"
  • "Every morning at 9am send me a digest of information about bitcoin."
  • "Send me a new vibrant, dynamic wallpaper for my phone every Sunday at 7pm"
  • "Every Monday morning at 8am, Send me a message to ask how last week went, and remind me to set measurable goals for the next week. Use the SMART framework to keep me honest."

Defining the Problem

There are a few problems that make this difficult. First, we need to extract a date and time along with (potentially) a frequency from the messages. Second, we need instructions for what to do later. Both are natural language processing problems, and the current popular language models handle them well. The third issue is the bane of every developer who has spent any amount of time with dates: timezones. Users speak locally, as if the system they are talking to is in the same timezone as they are, and expect it to understand.

This problem seems tractable and interesting, so I started breaking it down into manageable steps. Here's the rough flow I came up with when planning this feature:

The green path is what engineers and designers call the "happy path." If we know the user's timezone and are able to extract a date, time, frequency, and message instructions from the input, we are all clear and can let the user know that things have succeeded.

Designing the Tools

In order to add this functionality to my assistant, I knew I was going to need to add some tools (more details at that link if you haven't used these before). I began building by adding two tools: schedule_message and set_user_timezone. Based on the incoming messages, the model decides if it needs to call those and with what parameters. The timezone tool was defined like this:

{
  "name": "set_user_timezone",
  "description": "Sets a users timezone. User input required for location.",
  "parameters": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string",
        "description": "The users current location."
      }
    },
    "required": [
      "location"
    ]
  }
}

The second tool, called schedule_message, has a more intricate implementation, but the definition is similarly simple. Giving the AI examples of user requests along with the matching parameters helped make it consistent. Here's the definition:

{
  "name": "schedule_message",
  "description": "Schedules a future message. Include the full original request in request parameter so we can extract a datetime for sending. For example: { request: 'remind me to work out each day at 9am', instructions: 'Time to work out!' } or { request: 'send me a scheduled summary of current bitcoin news every morning at 9am', instructions: 'Search the web and summarize what is new with bitcoin'}",
  "parameters": {
    "type": "object",
    "properties": {
      "request": {
        "type": "string",
        "description": "The original user request"
      },
      "instructions": {
        "type": "string",
        "description": "The message to send"
      }
    }
  }
}

When schedule_message is called, the first thing we do is check if a user has a timezone set up. We'll need this for the next steps, so failing early is the best bet. If they don't, we return this to the AI:

{
   success: false,
   reason: 'User does not have timezone set.'
}

The flow usually looks something like this:

  1. Assistant tries to schedule message
  2. Receives a message that timezone is needed
  3. Assistant asks user where they are
  4. Assistant sets timezone AND schedules the message

Since assistants can call multiple tools before responding, the assistant doesn't need to be asked again to schedule something. Once it knows that it has set the timezone, it tries scheduling again. If it succeeds that time, we return:

{ success: true }

and the assistant lets the user know that their message is all set up. This is the flow of information and success states.
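
The flow above can be sketched as a small tool dispatcher. This is only an illustration under assumptions, not the assistant's actual implementation: the in-memory `users` and `scheduled` stores, the `handleToolCall` function, and the `lookupTimezone` stub are hypothetical names. Only the tool names and the returned objects come from the description above.

```javascript
// Hypothetical in-memory stores; a real app would use a database.
const users = new Map(); // userId -> { timeZone }
const scheduled = [];    // saved schedule requests

// Stub for illustration only: a real version would resolve the location
// some other way (the article uses a model call for this).
function lookupTimezone(location) {
  const known = { 'new york': 'America/New_York', london: 'Europe/London' };
  return known[String(location).trim().toLowerCase()] ?? null;
}

function setUserTimezone(userId, { location }) {
  const timeZone = lookupTimezone(location);
  if (!timeZone) {
    return { success: false, reason: 'Could not determine a timezone for that location.' };
  }
  users.set(userId, { timeZone });
  return { success: true };
}

function scheduleMessage(userId, { request, instructions }) {
  const user = users.get(userId);
  // Fail early: every later step depends on the timezone.
  if (!user || !user.timeZone) {
    return { success: false, reason: 'User does not have timezone set.' };
  }
  scheduled.push({ userId, request, instructions, timeZone: user.timeZone });
  return { success: true };
}

// Routes a tool call chosen by the model to its handler.
function handleToolCall(userId, name, args) {
  switch (name) {
    case 'set_user_timezone':
      return setUserTimezone(userId, args);
    case 'schedule_message':
      return scheduleMessage(userId, args);
    default:
      return { success: false, reason: `Unknown tool: ${name}` };
  }
}
```

Calling `handleToolCall` with `schedule_message` before `set_user_timezone` returns the failure object, which is what prompts the assistant to ask the user for a location and then retry.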

Extracting Frequency and DateTime

Of course, we still need to solve the problem of extracting the actual date and time that a user wants to receive a future message. To accomplish this, I separated those tasks into simple completions inside the schedule_message tool. Since language models don't really know the current date, we'll need to tell them. My system instructions for this completion look like this:

const systemMessage = `Given a user request, extract the date and time.
Today’s date and time are '${new Date().toLocaleString('en-US', { timeZone })}'.
Format the response as an ISO8601 date time.
If the date is recurring, return the first instance.
If you can't extract a date, return 'fail'`

We tell the model the user's current date and time so that the user and the AI are working from the same clock. This instruction works well with the gpt-4o model, and not quite as well with simpler ones. Since the response is important, I use the more expensive model. I use a similar prompt for extracting the recurrence and simply tell the model to return one of ['daily', 'weekly', 'monthly', 'yearly'] or non-recurring.
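
Because the model's reply is free text, it is worth checking it in code before trusting it. The following is a minimal sketch of that check, assuming the reply is either an ISO 8601 date time or the string 'fail'. The function names `parseExtractedDateTime` and `parseRecurrence` are hypothetical. Note that a reply without a UTC offset is read by `Date` as server-local time, so it still needs a timezone adjustment.

```javascript
const RECURRENCES = ['daily', 'weekly', 'monthly', 'yearly', 'non-recurring'];

// Returns a Date for a well-formed ISO 8601 reply, or null for 'fail'
// and for anything else that does not look like a full date and time.
function parseExtractedDateTime(reply) {
  const text = String(reply).trim();
  // Require a complete date and time so Date does not guess at partial input.
  const isoPattern = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?$/;
  if (!isoPattern.test(text)) {
    return null;
  }
  const date = new Date(text);
  return Number.isNaN(date.getTime()) ? null : date;
}

// Returns one of the allowed recurrence values, or null if the reply is off-list.
function parseRecurrence(reply) {
  const text = String(reply).trim().toLowerCase();
  return RECURRENCES.includes(text) ? text : null;
}
```

When either function returns null, the tool can return a failure with a reason, in the same way as the missing-timezone case.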

Solving Timezone Challenges

Determining a timezone may be easy if your application lives in a browser (and your users don't use VPNs or hide their location) since you can read it directly. I didn't have that luxury since my application operates via text messages. I also wanted users to be able to change their timezone when they are travelling, as I am, so I simply have the assistant ask for the user's location and extract the timezone with the following:

const systemMessage = `Given a user request, return an IANA timezone name.
If you cannot fulfill the request return 'fail'`;

Again, this works well with gpt-4o and is a very small context request, so I recommend springing for the more expensive model here since making a mistake is a bad user experience.
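
The returned name can also be checked before it is saved. This sketch relies on the standard `Intl.DateTimeFormat` constructor, which throws a `RangeError` for a time zone name the runtime does not recognize. The function name `validateTimeZone` is hypothetical, and the check is only as complete as the runtime's time zone data.

```javascript
// Returns the zone name as resolved by the runtime, or null when the
// reply is 'fail', empty, or not a recognized time zone name.
function validateTimeZone(reply) {
  const text = String(reply).trim();
  if (text === '' || text.toLowerCase() === 'fail') {
    return null;
  }
  try {
    // Throws a RangeError for unknown time zone names.
    return new Intl.DateTimeFormat('en-US', { timeZone: text }).resolvedOptions().timeZone;
  } catch (err) {
    return null;
  }
}
```

A null result can be passed back to the assistant as a failure so that it asks the user for their location again.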

Of course, with timezones, it is always a little harder than you initially think. The server where my app runs also exists in a timezone of its own, and daylight saving time may or may not come into play, so we need to be careful to adjust the extracted date times before putting them into a database.
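
One way to make that adjustment without depending on the server's own timezone is to treat the extracted value as a wall-clock time in the user's zone and convert it to UTC before storing it. The sketch below is an assumption about how this could be done, not the app's actual code. `offsetMinutes` and `localToUtc` are hypothetical names, the input is assumed to be a local date time with no offset, and times that are skipped or repeated during a daylight saving change are not handled specially. A dedicated date library would be more robust.

```javascript
// Offset of `timeZone` from UTC, in minutes, at the given instant.
function offsetMinutes(instant, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
    hour: '2-digit',
    minute: '2-digit',
    second: '2-digit',
  }).formatToParts(instant);
  const get = (type) => Number(parts.find((p) => p.type === type).value);
  // Re-read the zone's wall-clock fields as if they were UTC, then compare.
  const asUtc = Date.UTC(get('year'), get('month') - 1, get('day'), get('hour'), get('minute'), get('second'));
  return Math.round((asUtc - instant.getTime()) / 60000);
}

// Converts a local wall-clock time such as '2024-07-01T09:00:00' in
// `timeZone` to the matching UTC instant.
function localToUtc(localIso, timeZone) {
  const m = /^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(?::(\d{2}))?$/.exec(localIso);
  if (!m) {
    throw new Error(`Expected a local date time without an offset, got: ${localIso}`);
  }
  const guess = Date.UTC(Number(m[1]), Number(m[2]) - 1, Number(m[3]), Number(m[4]), Number(m[5]), Number(m[6] ?? 0));
  // First pass uses the offset at the guessed instant.
  let utc = guess - offsetMinutes(new Date(guess), timeZone) * 60000;
  // Second pass corrects the result when the first guess fell on the
  // other side of a daylight saving change.
  utc = guess - offsetMinutes(new Date(utc), timeZone) * 60000;
  return new Date(utc);
}
```

Storing the UTC instant together with the user's zone name keeps the database independent of where the server happens to run.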

Extending The Functionality

Users will likely want to view or cancel scheduled messages. To enable this, I added two additional tools: get_scheduled_messages and cancel_message. Users can now get back a list of messages and cancel them conversationally. The assistant is able to infer an internal identifier from the list and a message like "Cancel the daily reminder of standup".
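
Since the identifier is inferred by the model, the cancel handler should confirm that it exists and belongs to the requesting user. Here is a rough sketch of both handlers, assuming a hypothetical in-memory `scheduledMessages` list with made-up ids and fields. The real tools and storage may look quite different.

```javascript
// Hypothetical stored messages; ids and fields are invented for illustration.
const scheduledMessages = [
  { id: 'msg_1', userId: 'u1', instructions: 'Time for standup!', recurrence: 'daily' },
  { id: 'msg_2', userId: 'u1', instructions: 'Check the TODO list', recurrence: 'non-recurring' },
  { id: 'msg_3', userId: 'u2', instructions: 'Time to work out!', recurrence: 'daily' },
];

// Returns only the caller's messages, including the ids the model needs.
function getScheduledMessages(userId) {
  return scheduledMessages
    .filter((m) => m.userId === userId)
    .map(({ id, instructions, recurrence }) => ({ id, instructions, recurrence }));
}

// Cancels a message only when the id exists and belongs to the caller.
function cancelMessage(userId, messageId) {
  const index = scheduledMessages.findIndex((m) => m.id === messageId && m.userId === userId);
  if (index === -1) {
    return {
      success: false,
      reason: `No scheduled message with id ${messageId}. Call get_scheduled_messages for valid ids.`,
    };
  }
  scheduledMessages.splice(index, 1);
  return { success: true };
}
```

The failure reason tells the model how to recover, following the same pattern as the missing-timezone error.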

Delivering Efficiently

Once we have saved the messages with delivery times and recurrence (or non-recurrence) schedules in the database, we also need a process that can check for messages that need delivering, and activate parts of the system that will do that delivering when required. That process looks (from a high level) like these three steps:

  1. Collect all the messages that need delivering from the database
  2. Insert a system message into the correct thread for each user with the instructions
  3. Run the thread and deliver the message to the user.

I include the current localDateTime in the instructions in case the scheduled message needs to know that information. This is useful when users schedule things like daily digests where the assistant might need to search the web for up to date content.

{
   role: 'assistant',
   content: `The current date and time is ${localDateTime}. ${message.instructions}`
}
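
The delivery steps can be sketched roughly as follows. This is an illustration under assumptions rather than the real delivery process: `collectDue`, `nextOccurrence`, and `buildDeliveryMessage` are hypothetical names, each stored message is assumed to carry a `sendAt` Date, a `recurrence`, a `timeZone`, and `instructions`, and the recurrence arithmetic is done in UTC for simplicity.

```javascript
// Step 1: the messages whose delivery time has arrived.
function collectDue(messages, now) {
  return messages.filter((m) => m.sendAt.getTime() <= now.getTime());
}

// Next delivery time for a recurring message, or null when it does not repeat.
// Simplification: this works in UTC, so the local delivery hour shifts when
// daylight saving time changes, and a monthly message on the 31st rolls over
// into the following month when the next month is shorter.
function nextOccurrence(sendAt, recurrence) {
  const next = new Date(sendAt.getTime());
  switch (recurrence) {
    case 'daily':
      next.setUTCDate(next.getUTCDate() + 1);
      break;
    case 'weekly':
      next.setUTCDate(next.getUTCDate() + 7);
      break;
    case 'monthly':
      next.setUTCMonth(next.getUTCMonth() + 1);
      break;
    case 'yearly':
      next.setUTCFullYear(next.getUTCFullYear() + 1);
      break;
    default:
      return null;
  }
  return next;
}

// Step 2: the message inserted into the user's thread before it is run.
function buildDeliveryMessage(message, now) {
  const localDateTime = now.toLocaleString('en-US', { timeZone: message.timeZone });
  return {
    role: 'assistant',
    content: `The current date and time is ${localDateTime}. ${message.instructions}`,
  };
}
```

After a recurring message is delivered, its `sendAt` can be replaced with the value from `nextOccurrence`, while non-recurring messages can simply be removed.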

Designing AI That Truly Delivers

This design has been a challenge but I was guided along the way by starting with how users wanted to interact with the system. This feature was requested by more than one user, and I think it enables some really great use cases. My takeaways when designing for a chat interface:

  • Language models can handle error states and self-correct - Returning an error from a function call that explains what went wrong (e.g. "User does not have timezone set.") was a very effective way to make the system self-sufficient
  • Leverage knowledge embedded in language models - ChatGPT understands what an IANA timezone is. It knows the ISO standards for formatting dates. You can use this to your advantage
  • Don't assume language models are correct - It's a good idea to instruct your agent to "explicitly verify" details with users before proceeding and double check that inputs are valid in your own code

Thanks for reading! I hope you found something useful in here. It was a blast building this. I have included sources and interesting links below.

  • The Age of AI has begun - Bill Gates explains why AI is as revolutionary as personal computers, mobile phones, and the Internet, and he gives three principles for how to think about it.
  • Microsoft HAX Toolkit - Collaborative tools to help you create more effective and responsible human-AI experiences.
  • AI Principles (Google AI) - A guiding framework for our responsible development and use of AI, alongside transparency and accountability in our AI development process.
  • Why Conversational Interfaces are taking us back to the Dark Ages of Usability
  • The Design of Everyday Things (Wikipedia)