# Google Workspace

> Everyday Google Workspace tasks — managing emails, calendars, sheets, and contacts in single-turn and multi-turn formats.

This demo contains tool-use tasks for Google Workspace APIs (Gmail, Calendar, Drive, Sheets, Slides, Contacts), generated in both single-turn and multi-turn formats. Tasks reflect everyday personal and professional use — managing emails, scheduling, updating spreadsheets, and coordinating with contacts — across diverse personas like teachers, office managers, and project coordinators.

## Overview

| Property      | Single-Turn                                                         | Multi-Turn                                                                   |
| ------------- | ------------------------------------------------------------------- | ---------------------------------------------------------------------------- |
| **Samples**   | 50                                                                  | 50                                                                           |
| **Task type** | One-shot tool calls                                                 | Conversational tool-use sequences                                            |
| **Scenarios** | Check calendar events, send emails, look up contacts, update sheets | Multi-step workflows combining email, calendar, sheets, slides, and contacts |

## Environment

The environment is a JSON database (`db.json`, \~2.4 MB) representing the full Google Workspace state — 24 tables covering 100 user accounts with their emails, calendar events, drive files, contacts, and more. This database serves as the backend that the agent queries and modifies via tool calls.

```
google-workspace/
├── db.json                               # Simulated Google Workspace database
├── google-workspace_single-turn/         # 50 single-turn tasks
│   └── reference_payloads/               # Per-task environment snapshots
└── google-workspace_multi-turn/          # 50 multi-turn tasks
    └── reference_payloads/               # Per-task environment snapshots
```
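
A quick way to get oriented is to load `db.json` and count the rows in each table. The sketch below assumes the file is a single JSON object that maps each table name to a list of row objects. That layout is an assumption, not something this page confirms, so adjust the loader if the real file is shaped differently. To stay runnable without the download, it first writes a two-table stand-in file.

```python
import json
import tempfile
from pathlib import Path


def table_sizes(db_path):
    """Return {table_name: row_count} for a db.json-style file.

    Assumes the top level is an object of table name -> list of rows.
    """
    with open(db_path, encoding="utf-8") as fh:
        db = json.load(fh)
    return {name: len(rows) for name, rows in db.items()}


# Tiny stand-in for db.json (same assumed shape, far fewer rows).
sample_db = {
    "google_account": [
        {"account_id": "acct_00000001", "display_name": "Alex Taylor",
         "email": "alex.taylor@example.com"},
    ],
    "calendar_event": [],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "db.json"
    path.write_text(json.dumps(sample_db), encoding="utf-8")
    sizes = table_sizes(path)

for name, count in sorted(sizes.items()):
    print(f"{name}: {count}")
```

Pointing `table_sizes` at the real `db.json` should give counts comparable to the table listing in the accordion below, provided the layout assumption holds.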

<Accordion title="Database structure — 24 tables, 100 accounts, 5,000+ entities">
  | Table                     | Entries |   | Table                 | Entries |
  | ------------------------- | ------- | - | --------------------- | ------- |
  | `google_account`          | 100     |   | `drive_file`          | 901     |
  | `calendar`                | 200     |   | `drive_permission`    | 367     |
  | `calendar_event`          | 276     |   | `gmail_message`       | 279     |
  | `calendar_event_attendee` | 374     |   | `gmail_thread`        | 202     |
  | `contact`                 | 203     |   | `gmail_draft`         | 242     |
  | `contact_email`           | 281     |   | `gmail_label`         | 229     |
  | `contact_phone`           | 285     |   | `sheets_spreadsheet`  | 243     |
  | `contact_address`         | 284     |   | `sheets_tab`          | 342     |
  | `contact_organization`    | 282     |   | `slides_presentation` | 213     |
  | `contact_url`             | 275     |   | `slides_slide`        | 304     |
  | `docs_document`           | 284     |   | `slides_object`       | 373     |
  | `drive_file_parent`       | 200     |   | `gmail_message_label` | 230     |

  **Account entry:**

  ```json theme={null}
  {
    "account_id": "acct_00000001",
    "display_name": "Alex Taylor",
    "email": "alex.taylor@example.com"
  }
  ```

  **Calendar event entry:**

  ```json theme={null}
  {
    "event_id": "evt_00006008",
    "account_id": "acct_00000008",
    "calendar_id": "cal_00004050",
    "title": "Household admin hour",
    "description": "Catch up on paperwork, bills, and household administrative tasks.",
    "location": "Home",
    "start_time": "2024-10-09T19:00:00",
    "end_time": "2024-10-09T20:00:00",
    "timezone": "America/New_York",
    "visibility": "private"
  }
  ```

  **Gmail message entry:**

  ```json theme={null}
  {
    "message_id": "msg_00001202",
    "account_id": "acct_00000005",
    "thread_id": "thr_00001025",
    "from_email": "taylor.nguyen@example.org",
    "to": "noreply@example.com",
    "subject": "Re: Reminder: confirm your appointment time",
    "snippet": "Hi Noreply, Are you free sometime this week to connect?...",
    "internal_date": "2025-02-11T17:05:44"
  }
  ```

  **Contact entry:**

  ```json theme={null}
  {
    "resource_name": "people/c_00003001",
    "account_id": "acct_00000009",
    "display_name": "Sam Rivera",
    "given_name": "Sam",
    "family_name": "Rivera",
    "biography": "Sam is a Software Engineer and a valued professional contact.",
    "birth_month": 6,
    "birth_day": 14,
    "birth_year": 1992
  }
  ```
</Accordion>
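
Most tables in the sample entries carry an `account_id`, so per-user views can be built with a simple filter. The sketch below uses rows trimmed from the sample entries above and assumes the database is held in memory as a dict of table name to row list; `rows_for_account` is a hypothetical helper, not part of the dataset.

```python
def rows_for_account(db, table, account_id):
    """Return the rows of `table` whose account_id matches."""
    return [row for row in db.get(table, []) if row.get("account_id") == account_id]


# Rows trimmed from the sample entries; the dict-of-lists layout is assumed.
db = {
    "google_account": [
        {"account_id": "acct_00000001", "display_name": "Alex Taylor",
         "email": "alex.taylor@example.com"},
    ],
    "calendar_event": [
        {"event_id": "evt_00006008", "account_id": "acct_00000008",
         "calendar_id": "cal_00004050", "title": "Household admin hour"},
    ],
    "contact": [
        {"resource_name": "people/c_00003001", "account_id": "acct_00000009",
         "display_name": "Sam Rivera"},
    ],
}

events = rows_for_account(db, "calendar_event", "acct_00000008")
titles = [event["title"] for event in events]
print(titles)
```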

## Data

Each format (single-turn and multi-turn) contains:

| Directory             | Description                                                          |
| --------------------- | -------------------------------------------------------------------- |
| `datapoints/`         | Generated tasks/dialogs with tool calls                              |
| `intent/`             | Task intent definitions — goals, user profiles, and expected actions |
| `evaluators/`         | Python grading scripts for automated evaluation                      |
| `reference_payloads/` | Per-task environment state snapshots (initial + final)               |
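
The layout in the table can be turned into paths with a small helper. This is a sketch only: the `google-workspace_<format>` folder naming follows the directory tree in the Environment section, and `task_dirs` is a hypothetical helper rather than anything shipped with the data.

```python
from pathlib import Path

# The four per-format directories listed in the table above.
SUBDIRS = ("datapoints", "intent", "evaluators", "reference_payloads")


def task_dirs(root, fmt):
    """Map each subdirectory name to its path for 'single-turn' or 'multi-turn'."""
    if fmt not in ("single-turn", "multi-turn"):
        raise ValueError(f"unknown format: {fmt}")
    base = Path(root) / f"google-workspace_{fmt}"
    return {name: base / name for name in SUBDIRS}


# "." stands for wherever the dataset folder was downloaded.
dirs = task_dirs(".", "single-turn")
for name, path in dirs.items():
    print(name, "->", path.as_posix())
```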

## Single-Turn Sample

<Accordion title="User scenario">
  ```json theme={null}
  {
    "goal": "Verify the details of the 'Interview prep' calendar event, log related notes into the 'Pet Care Tracker' sheet, then look up Devon Price and email them a short heads-up summarizing what was logged.",
    "profile": "User is Family Hub (account_id: acct_00000091, email: family.hub@example.com). Relevant entities: Calendar event 'Interview prep' (event_id: evt_00006217, calendar_id: cal_00004220). Spreadsheet 'Pet Care Tracker' (spreadsheet_id: sh_00007014). Contact: Devon Price (resource_name: people/c_00003061).",
    "scenario_time": "2025-05-15T18:30:00+01:00",
    "persona": "veterinary office manager with moderate Google Workspace experience",
    "motivations": [
      "Make sure the interview prep details are correct before writing any notes tied to it",
      "Keep Devon Price in the loop on what was added to the pet tracking log"
    ],
    "constraints": [
      "Confirm the 'Interview prep' event details (time and location) before updating the Pet Care Tracker",
      "Look up Devon Price in Contacts to retrieve the right email destination before sending the message"
    ]
  }
  ```
</Accordion>

<Accordion title="Full dialog (13 messages, 7 tool calls)">
  **User:**

  > Hi — I'm logged in as Family Hub (acct\_00000091), [family.hub@example.com](mailto:family.hub@example.com). Can you first double-check the details for my calendar event called "Interview prep" (just confirm the time and the location for me), and then add a quick note about it into my spreadsheet "Pet Care Tracker" (something like: confirmed Interview prep details and that I'll be out during that window). After you log it, can you look up Devon Price in my contacts and email them a short heads-up summarizing what you added to the sheet?

  **Assistant:**

  **Step 1** — Search calendar, spreadsheet, and contacts in parallel:

  *The user wants me to: 1) Get details for 'Interview prep' calendar event, 2) Find the 'Pet Care Tracker' spreadsheet, 3) Look up Devon Price in contacts. I can do all three in parallel since they're independent.*

  ```json theme={null}
  domain_calendar_get_event_by_query({"account_id": "acct_00000091", "query": "Interview prep"})
  domain_sheets_list_spreadsheets({"account_id": "acct_00000091", "query": "Pet Care Tracker"})
  domain_contacts_search_contacts({"account_id": "acct_00000091", "query": "Devon Price"})
  ```

  **Tool results:**

  ```json theme={null}
  {"event": {"title": "Interview prep", "start_time": "2025-05-15T20:15:00",
    "end_time": "2025-05-15T22:15:00", "location": "Meeting room", "timezone": "Europe/London"}}

  {"spreadsheets": [{"spreadsheet_id": "sh_00007014", "title": "Pet Care Tracker"}]}

  {"contacts": [{"display_name": "Devon Price", "resource_name": "people/c_00003061"}]}
  ```

  **Step 2** — Fetch spreadsheet and contact details in parallel:

  *I have the event details: Interview prep is on 2025-05-15 from 20:15 to 22:15 (Europe/London), location is 'Meeting room'. I need to get the spreadsheet details to know which sheet to append to, and I also need Devon Price's contact details (email) to send them an email.*

  ```json theme={null}
  domain_sheets_get_spreadsheet({"account_id": "acct_00000091", "spreadsheet_id": "sh_00007014"})
  domain_contacts_get_contact({"account_id": "acct_00000091", "resource_name": "people/c_00003061"})
  ```

  **Tool results:**

  ```json theme={null}
  // Spreadsheet response — full tab data with existing rows (truncated)
  {"title": "Pet Care Tracker", "tabs": [{"title": "Care - Summary",
    "data": [["Area", "Status", "Last Updated", "Owner", "Next Action"],
      ["Operations", "On Track", "2025-02-10", "Ops Lead", "Monthly review"],
      ["Finance", "Attention", "2025-02-09", "Finance", "Budget revision"], ...]}]}

  // Contact response
  {"display_name": "Devon Price", "resource_name": "people/c_00003061",
    "emails": [{"email": "devon.price@example.com", "email_type": "work"}],
    "phones": [{"phone_number": "+1-555-0145-3061", "phone_type": "home"}]}
  ```

  **Step 3** — Append note to spreadsheet:

  *I have all the info I need. Now I need to: 1) Append a note to the Pet Care Tracker spreadsheet, 2) Email Devon Price at [devon.price@example.com](mailto:devon.price@example.com) with a summary. I'll do both — append to the spreadsheet first.*

  ```json theme={null}
  domain_sheets_append_values({"account_id": "acct_00000091", "spreadsheet_id": "sh_00007014",
    "range_notation": "Care - Summary!A1",
    "values": [["Note", "Confirmed", "2025-05-15", "Family Hub",
      "Confirmed Interview prep event details and will be out during that window"]]})
  ```

  **Tool result:**

  ```json theme={null}
  {"appended_rows": 1, "starting_row": 8}
  ```

  **Step 4** — Send email to Devon Price:

  *The note has been appended to the spreadsheet. Now I need to email Devon Price at [devon.price@example.com](mailto:devon.price@example.com) with a short heads-up summarizing what was added to the sheet.*

  ```json theme={null}
  domain_gmail_send_message({"account_id": "acct_00000091", "to": "devon.price@example.com",
    "subject": "Quick heads-up: Interview prep event logged",
    "body": "Hi Devon, Just a quick note to let you know I've confirmed the details..."})
  ```

  **Tool result:**

  ```json theme={null}
  {"message_id": "msg_00000344", "thread_id": "thr_00000128"}
  ```

  **Step 5** — Final answer:

  *All tasks have been completed successfully. I should now summarize everything for the user in a friendly, natural response.*

  > All done! The Interview prep event is on May 15, 2025, from 8:15 PM to 10:15 PM (Europe/London), and it's located in the Meeting room. I also added a note to your "Pet Care Tracker" spreadsheet (Care - Summary tab) confirming those details and that you'll be out during that window, and I emailed Devon Price ([devon.price@example.com](mailto:devon.price@example.com)) a quick heads-up summarizing what I logged.
</Accordion>
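
The dialog above renders each tool call as `tool_name({...})`. When working with that rendered form, a small parser can split it into a name and an argument dict. This is a minimal sketch for the notation shown on this page; the datapoint files themselves may store calls as structured JSON, in which case no parsing is needed. `parse_tool_call` is a hypothetical helper.

```python
import json
import re

# name(<json object>) as rendered in the dialog; DOTALL lets the JSON
# argument span several lines.
CALL_RE = re.compile(r"^\s*([A-Za-z_]\w*)\((\{.*\})\)\s*$", re.DOTALL)


def parse_tool_call(text):
    """Split 'tool_name({...})' into (tool_name, arguments dict)."""
    match = CALL_RE.match(text)
    if match is None:
        raise ValueError(f"not a tool call: {text!r}")
    name, raw_args = match.groups()
    return name, json.loads(raw_args)


call = ('domain_contacts_get_contact({"account_id": "acct_00000091", '
        '"resource_name": "people/c_00003061"})')
name, args = parse_tool_call(call)
print(name, args["resource_name"])
```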

<Accordion title="Evaluator excerpt (~1,000 lines Python)">
  Each sample ships with an auto-generated evaluator that scores submissions on two axes:

  **Part A — Environment state comparison:** deep-compares the final MCP server state (all 24 database tables) against the reference payload, field by field, using pattern-based classification to handle non-deterministic fields (timestamps, tokens) and semantic matching via LLM for free-text values.

  **Part B — LLM trajectory judge:** prompts an LLM to holistically assess whether the agent achieved the user's goal. The criteria are task-specific:

  ```python theme={null}
  ## EVALUATION CRITERIA
  # 1. Goal Achievement: Did the agent accomplish what the user asked for?
  #    - Check the calendar event "Interview prep" details (time and location)
  #    - Add a note to the "Pet Care Tracker" spreadsheet about the event
  #    - Look up Devon Price in contacts
  #    - Send Devon Price an email summarizing what was added to the sheet
  #
  # 2. Completeness: Were all required steps performed?
  #    - Calendar event lookup
  #    - Spreadsheet search/access
  #    - Contact search for Devon Price
  #    - Spreadsheet append with note
  #    - Email sent to Devon Price
  #
  # 3. Correctness: Were the right tools called with appropriate arguments?
  #    - Correct account_id used (acct_00000091)
  #    - Correct spreadsheet identified (Pet Care Tracker)
  #    - Correct contact found (Devon Price)
  #    - Email sent to Devon Price's email address
  #
  # 4. Final State Validity: Does the final state reflect the expected changes?
  # 5. Alternative Validity: Are there alternative valid approaches?
  ```

  The combined verdict:

  ```python theme={null}
  def evaluate_submission(reference_payload, eval_payload, llm_client) -> dict:
      """Returns config_match, config_reports, llm_judgment, overall_pass, debug_info."""
      # ... (elided in this excerpt: builds `result`, `ref_snapshots`, and `eval_snapshots`)
      # Part A: environment state comparison
      config_match, config_reports = compare_configs(ref_snapshots, eval_snapshots, llm_client)
      # Part B: LLM trajectory judge
      llm_judgment = trajectory_llm_judge(reference_payload, eval_payload, llm_client)
      # Overall: both must pass
      result["overall_pass"] = config_match and llm_judgment.get("passed", False)
      return result
</Accordion>
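
The idea behind Part A can be illustrated with a much smaller comparison. The sketch below is not the shipped evaluator: it assumes snapshots are dicts of table name to row list, compares rows positionally, skips fields whose names match a hypothetical "volatile" pattern, and uses strict equality where the real evaluator uses LLM-based semantic matching for free text.

```python
import re

# Hypothetical pattern for fields whose values legitimately differ between runs.
VOLATILE_FIELD = re.compile(r"(_at|_date|_time|token)$")


def compare_tables(reference, candidate):
    """Return a list of human-readable mismatches between two snapshots.

    Both arguments map table name -> list of row dicts (assumed layout).
    Rows are compared positionally and volatile fields are skipped.
    """
    problems = []
    for table in sorted(set(reference) | set(candidate)):
        ref_rows = reference.get(table, [])
        got_rows = candidate.get(table, [])
        if len(ref_rows) != len(got_rows):
            problems.append(f"{table}: {len(ref_rows)} rows expected, {len(got_rows)} found")
            continue
        for i, (ref, got) in enumerate(zip(ref_rows, got_rows)):
            for field in sorted(set(ref) | set(got)):
                if VOLATILE_FIELD.search(field):
                    continue  # e.g. internal_date differs on every run
                if ref.get(field) != got.get(field):
                    problems.append(
                        f"{table}[{i}].{field}: {ref.get(field)!r} != {got.get(field)!r}"
                    )
    return problems


# Illustrative rows; the timestamps are made up for the example.
reference = {"gmail_message": [{"to": "devon.price@example.com",
                                "internal_date": "2025-05-15T18:31:00"}]}
candidate = {"gmail_message": [{"to": "devon.price@example.com",
                                "internal_date": "2025-05-15T18:32:10"}]}
wrong = {"gmail_message": [{"to": "family.hub@example.com",
                            "internal_date": "2025-05-15T18:32:10"}]}

print(compare_tables(reference, candidate))
print(compare_tables(reference, wrong))
```

The first comparison reports no problems because only the timestamp differs; the second flags the wrong recipient.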

## Multi-Turn Sample

<Accordion title="User scenario">
  ```json theme={null}
  {
    "goal": "Check whether anyone replied to my 'Household checklist for this week' email thread, update my 'Presentation Draft' Slides with a quick text change, confirm a couple numbers from my 'Holiday Budget Planner' Sheet, share that spreadsheet with Parker Chen, then find a free time next week for a follow-up call and draft an email proposing it.",
    "profile": "Ivy Bryant (ivy.bryant@example.net) uses account acct_00000087. Her Drive includes a Google Slides file named 'Presentation Draft' (pres_00005272) and a Google Sheet named 'Holiday Budget Planner' (sh_00007036). Her contacts include Parker Chen and Kendall Young, and her Gmail has a thread with subject 'Household checklist for this week' (thr_00001028).",
    "scenario_time": "2025-02-24T10:00:00-05:00",
    "persona": "hospitality revenue manager, comfortable with Google Workspace basics on mobile",
    "motivations": [
      "Wants to send a clean stakeholder review package without missing any replies or outdated numbers",
      "Needs to line up a quick follow-up call with a collaborator to keep momentum"
    ],
    "constraints": [
      "I'm on my phone, so keep outputs short and scannable",
      "Before you share anything externally, confirm the recipient email address with me",
      "When drafting the email, show me the draft text so I can tweak it before it's finalized"
    ]
  }
  ```
</Accordion>

<Accordion title="Full dialog (35 messages, 6 user turns, 13 tool calls)">
  **Turn 1 — User:**

  > Hey — can you check my Gmail thread "Household checklist for this week" and tell me if anyone replied?

  **Assistant:**

  **Step 1** — Clarify account:

  *The user wants to check a Gmail thread titled 'Household checklist for this week'. I need their account\_id first. I don't have it, so I should ask for their email address to look up their account.*

  > Sure! What's your email address so I can look up your account?

  **User:**

  > Yep — it's [ivy.bryant@example.net](mailto:ivy.bryant@example.net).

  **Assistant:**

  **Step 1** — Look up account:

  *I need to look up the user's account by email first, then search for the Gmail thread.*

  ```json theme={null}
  domain_account_get_account_by_email({"email": "ivy.bryant@example.net"})
  ```

  **Tool result:**

  ```json theme={null}
  {"account_id": "acct_00000087", "display_name": "Ivy Bryant", "email": "ivy.bryant@example.net"}
  ```

  **Step 2** — List threads:

  *I have the account\_id. Now I need to search for the Gmail thread 'Household checklist for this week' to check if anyone replied.*

  ```json theme={null}
  domain_gmail_list_gmail_threads({"account_id": "acct_00000087", "max_results": 20})
  ```

  **Tool result:**

  ```json theme={null}
  {"threads": [{"thread_id": "thr_00001028"}, {"thread_id": "thr_00001078"}], "count": 2}
  ```

  **Step 3** — Search messages in the matched thread:

  *I have the thread IDs but no titles. I need to search Gmail messages with that query to find the right thread and check for replies.*

  ```json theme={null}
  domain_gmail_list_gmail_messages({"account_id": "acct_00000087",
    "query": "Household checklist for this week", "max_results": 10})
  ```

  **Tool result:**

  ```json theme={null}
  {"messages": [{"from_email": "pat.rivera@example.com", "internal_date": "2025-02-23T17:12:33",
    "subject": "Re: Household checklist for this week",
    "snippet": "Hi Chores, Thanks for reaching out. I have reviewed the details and everything looks good..."},
    {"from_email": "ivy.bryant@example.net", "internal_date": "2025-02-22T09:45:00",
    "subject": "Household checklist for this week", ...}]}
  ```

  **Step 4** — Final answer:

  *I have the thread messages. The original email was sent by Ivy. There's a reply from [pat.rivera@example.com](mailto:pat.rivera@example.com). So yes, someone replied — Pat Rivera replied to the thread.*

  > Yes, someone replied! Pat Rivera ([pat.rivera@example.com](mailto:pat.rivera@example.com)) responded to your "Household checklist for this week" thread on Feb 23rd. They said they reviewed the details and everything looks good on their end, and asked you to let them know if there's anything else needed. No reply from Alex Taylor yet though.

  ***

  **Turn 2 — User:**

  > Got it, thanks. Next — can you open my Slides file "Presentation Draft" and change one line? I just need the title to say "Weekly Revenue Update" instead of whatever it is now.

  **Assistant:**

  **Step 1** — List presentations:

  *User wants to update the title in their 'Presentation Draft' slides file. I need to find the presentation first.*

  ```json theme={null}
  domain_slides_list_presentations({"account_id": "acct_00000087", "query": "Presentation Draft"})
  ```

  **Tool result:**

  ```json theme={null}
  {"presentations": [{"presentation_id": "pres_00005272", "title": "Presentation Draft"}], "count": 1}
  ```

  **Step 2** — Get presentation details:

  *I found the presentation. Now I need to get its details to find the title text box object ID so I can update it.*

  ```json theme={null}
  domain_slides_get_presentation({"account_id": "acct_00000087", "presentation_id": "pres_00005272"})
  ```

  **Tool result:**

  ```json theme={null}
  {"presentation_id": "pres_00005272", "title": "Presentation Draft",
    "slides": [{"slide_id": "sld_00009301", "insert_index": 0, "objects": []}]}
  ```

  **Step 3** — Create text box with new title:

  *The slide has no existing objects — no title text box to update. I'll create a text box with 'Weekly Revenue Update' on the first slide.*

  ```json theme={null}
  domain_slides_create_text_box({"account_id": "acct_00000087", "presentation_id": "pres_00005272",
    "slide_id": "sld_00009301", "text": "Weekly Revenue Update", "x": 50.0, "y": 50.0})
  ```

  **Tool result:**

  ```json theme={null}
  {"object_id": "obj_00000001"}
  ```

  **Step 4** — Final answer:

  *The slide had no existing objects, so I created a new text box with 'Weekly Revenue Update'. The task is done.*

  > Done! The first slide in "Presentation Draft" didn't have an existing title text box, so I created one with "Weekly Revenue Update" on it. The presentation is all set!

  ***

  **Turn 3 — User:**

  > Perfect. Next can you open my Sheet "Holiday Budget Planner" and tell me what the total budget and total spent numbers are? And after that I need to share the spreadsheet with Parker Chen — can you pull up Parker's email first and confirm it with me before you send anything?

  **Assistant:**

  **Step 1** — Search spreadsheet and contacts in parallel:

  *User wants two things: 1) Get values from 'Holiday Budget Planner' spreadsheet, and 2) Look up Parker Chen's contact info. These are independent so I can batch them.*

  ```json theme={null}
  domain_sheets_list_spreadsheets({"account_id": "acct_00000087", "query": "Holiday Budget Planner"})
  domain_contacts_search_contacts({"account_id": "acct_00000087", "query": "Parker Chen"})
  ```

  **Tool results:**

  ```json theme={null}
  {"spreadsheets": [{"spreadsheet_id": "sh_00007036", "title": "Holiday Budget Planner"}], "count": 1}

  {"contacts": [{"display_name": "Parker Chen", "resource_name": "people/c_00003187"}], "count": 1}
  ```

  **Step 2** — Fetch full details in parallel:

  *I found the spreadsheet ID. Now I need to get the values from the spreadsheet to find total budget and total spent. I also need Parker Chen's full contact details to find the email. Let me do both in parallel.*

  ```json theme={null}
  domain_sheets_get_spreadsheet({"account_id": "acct_00000087", "spreadsheet_id": "sh_00007036"})
  domain_contacts_get_contact({"account_id": "acct_00000087", "resource_name": "people/c_00003187"})
  ```

  **Tool results:**

  ```json theme={null}
  // Spreadsheet data (truncated)
  {"title": "Holiday Budget Planner", "tabs": [{"title": "Budget - Plan",
    "data": [["Category", "Budgeted", "Actual", "Variance", "Status"],
      ["Personnel", "12000", "11800", "-200", "Under budget"],
      ["Materials", "3500", "3650", "150", "Slightly over"],
      ["Software", "2000", "1950", "-50", "Under budget"],
      ["Travel", "1500", "1600", "100", "Slightly over"], ...]}]}

  // Contact with two emails
  {"display_name": "Parker Chen", "resource_name": "people/c_00003187",
    "emails": [{"email": "parker.chen@example.com", "email_type": "home"},
      {"email": "parker.chen@example.org", "email_type": "work"}]}
  ```

  **Step 3** — Final answer:

  *Total budgeted: \$21,300. Total spent: \$20,845. Parker has two emails. I should present the budget numbers and ask which email to use for sharing.*

  > Here's what I found in the Holiday Budget Planner:
  >
  > Total Budget: \$21,300
  > Total Spent: \$20,845
  > Variance: -\$455 under budget overall — looking pretty good!
  >
  > Now for Parker Chen, I found two email addresses on file:
  >
  > 1. [parker.chen@example.com](mailto:parker.chen@example.com) (home)
  > 2. [parker.chen@example.org](mailto:parker.chen@example.org) (work)
  >
  > Which one should I use to share the spreadsheet? And what access level — viewer, commenter, or editor?

  ***

  **Turn 4 — User:**

  > Use [parker.chen@example.org](mailto:parker.chen@example.org) (work) please — can you share it as viewer access? Also after you do that, can you check my calendar for any 30-min free slots next week for a follow-up call and then draft an email to Parker proposing a couple options (show me the draft before sending).

  **Assistant:**

  **Step 1** — Share file and check calendar in parallel:

  *I can do the share and find free time in parallel since they're independent. I'll do those first, then draft the email based on the free slots found.*

  ```json theme={null}
  domain_drive_share_file({"account_id": "acct_00000087", "file_id": "sh_00007036",
    "role": "reader", "permission_type": "user", "email_address": "parker.chen@example.org"})
  domain_calendar_find_free_time({"account_id": "acct_00000087",
    "time_min": "2025-03-03T09:00", "time_max": "2025-03-07T17:00", "duration_minutes": 30})
  ```

  **Tool results:**

  ```json theme={null}
  {"permission_id": "perm_00000011", "file_id": "sh_00007036"}

  {"free_slots": [{"start": "2025-03-03T09:00", "end": "2025-03-07T17:00"}],
    "count": 1, "busy_periods": 0}
  ```

  **Step 2** — Final answer:

  *The spreadsheet has been shared. The calendar is wide open next week — no busy periods at all. I'll pick two reasonable 30-min slots and draft an email to Parker proposing those times. I need to show the draft to the user before sending.*

  > Done — the Holiday Budget Planner is now shared with [parker.chen@example.org](mailto:parker.chen@example.org) as a viewer!
  >
  > Your calendar looks wide open next week, so I picked two convenient 30-minute slots:
  >
  > Option 1: Tuesday, March 4 at 10:00–10:30 AM
  > Option 2: Thursday, March 6 at 2:00–2:30 PM
  >
  > Here's the draft email for your review:
  >
  > To: [parker.chen@example.org](mailto:parker.chen@example.org)
  > Subject: Follow-Up Call — Holiday Budget Planner
  >
  > Hi Parker,
  > I wanted to follow up on the Holiday Budget Planner I just shared with you. Could we hop on a quick 30-minute call to review the numbers? Here are two options...

  ***

  **Turn 5 — User:**

  > Looks good — can you tweak the subject to "Follow-up call next week" and change the first line to "I just shared the Holiday Budget Planner with you — could we do a quick 30-min call to review?" Everything else can stay. Then go ahead and send it.

  **Assistant:**

  **Step 1** — Send the email:

  *User approved the draft with two tweaks: subject changed to 'Follow-up call next week' and first line changed. I'll send the email now with the updated content.*

  ```json theme={null}
  domain_gmail_send_message({"account_id": "acct_00000087", "to": "parker.chen@example.org",
    "subject": "Follow-up call next week",
    "body": "Hi Parker,\n\nI just shared the Holiday Budget Planner with you — could we do a quick 30-min call to review?..."})
  ```

  **Tool result:**

  ```json theme={null}
  {"message_id": "msg_00000033", "thread_id": "thr_00000008"}
  ```

  **Step 2** — Final answer:

  *The email was sent successfully. I should confirm this to the user and let them know everything is done.*

  > Email sent! Parker will receive it at [parker.chen@example.org](mailto:parker.chen@example.org) with the updated subject and opening line. All done — the spreadsheet is shared, the calendar slots are identified, and the email is on its way!
</Accordion>
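
In Turn 4 the free-time tool returned one continuous window for the whole week, and the assistant chose two slots inside it. A minimal check that a proposed slot fits inside a returned window could look like the sketch below. `slot_fits` is a hypothetical helper, and it deliberately ignores working hours, which the tool result does not encode.

```python
from datetime import datetime, timedelta


def slot_fits(window, start, minutes):
    """True if the slot starting at `start` and lasting `minutes` lies inside the window."""
    window_start = datetime.fromisoformat(window["start"])
    window_end = datetime.fromisoformat(window["end"])
    slot_start = datetime.fromisoformat(start)
    slot_end = slot_start + timedelta(minutes=minutes)
    return window_start <= slot_start and slot_end <= window_end


# Window copied from the find_free_time result in Turn 4.
free_window = {"start": "2025-03-03T09:00", "end": "2025-03-07T17:00"}

option_1 = slot_fits(free_window, "2025-03-04T10:00", 30)  # Tuesday 10:00
option_2 = slot_fits(free_window, "2025-03-06T14:00", 30)  # Thursday 14:00
too_late = slot_fits(free_window, "2025-03-07T16:45", 30)  # would end after 17:00
print(option_1, option_2, too_late)
```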

<Accordion title="Evaluator excerpt (~950 lines Python)">
  The multi-turn evaluator uses the same two-axis evaluation as the single-turn one, with criteria specific to multi-turn dialogs:

  **Part A — Environment state comparison:** identical deep-compare of final MCP server state against reference.

  **Part B — LLM trajectory judge** with task-specific criteria:

  ```python theme={null}
  ## EVALUATION CRITERIA
  # 1. Goal achievement: Did the agent accomplish everything the user asked
  #    for across all turns?
  # 2. Completeness: Were all required steps taken?
  #    - Account lookup, Gmail search, Slides update, Sheets read,
  #      contact lookup, file share, calendar check, email send
  # 3. Final state validity: Does the final state match the reference for
  #    key mutations (new drive_permission, new gmail_message, slides
  #    text box created)?
  # 4. Correctness of key actions: Were the correct IDs, emails, roles,
  #    and parameters used?
  # 5. Alternative validity: Is the evaluation trajectory an equally valid
  #    alternative approach?
  ```

  The combined verdict is computed the same way as in the single-turn evaluator:

  ```python theme={null}
  def evaluate_submission(reference_payload, eval_payload, llm_client) -> dict:
      """Returns config_match, config_reports, llm_judgment, overall_pass, debug_info."""
      # ... (elided in this excerpt: builds `result`, `ref_snapshots`, and `eval_snapshots`)
      # Part A: environment state comparison
      config_match, config_reports = compare_configs(ref_snapshots, eval_snapshots, llm_client)
      # Part B: LLM trajectory judge
      llm_judgment = trajectory_llm_judge(reference_payload, eval_payload, llm_client)
      # Overall: both must pass
      result["overall_pass"] = config_match and llm_judgment.get("passed", False)
      return result
</Accordion>
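
Criterion 3 above looks for key mutations such as a new `drive_permission` or `gmail_message` row. Since the reference payloads hold initial and final snapshots, one way to surface those mutations is to diff the two by primary key. The sketch below assumes each snapshot is a dict of table name to row list and that the key field names match those seen in the tool results; both are assumptions about the payload format.

```python
def new_rows(initial, final, key_fields):
    """Return {table: [rows present in final but not in initial]}.

    key_fields maps table name -> primary-key field (assumed names).
    """
    added = {}
    for table, key in key_fields.items():
        seen = {row[key] for row in initial.get(table, [])}
        fresh = [row for row in final.get(table, []) if row[key] not in seen]
        if fresh:
            added[table] = fresh
    return added


# IDs taken from the tool results in the multi-turn dialog.
initial = {"drive_permission": [], "gmail_message": []}
final = {
    "drive_permission": [{"permission_id": "perm_00000011", "file_id": "sh_00007036"}],
    "gmail_message": [{"message_id": "msg_00000033", "thread_id": "thr_00000008"}],
}
keys = {"drive_permission": "permission_id", "gmail_message": "message_id",
        "slides_object": "object_id"}

added = new_rows(initial, final, keys)
print(sorted(added))
```

Tables with no new rows, such as `slides_object` in this toy snapshot, are left out of the result.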

## Download

```bash theme={null}
# Download all Google Workspace data
hf download jindidi/eigendata-demo-data --repo-type dataset --include "google_workspace/*"

# Download only single-turn tasks
hf download jindidi/eigendata-demo-data --repo-type dataset --include "google_workspace/google-workspace_single-turn/*"
```
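
The same downloads can be scripted from Python with `huggingface_hub.snapshot_download`, using `allow_patterns` in place of `--include`. This is a sketch under two assumptions: the `huggingface_hub` package is installed, and the multi-turn folder follows the same naming as the single-turn one (only the single-turn pattern appears in the commands above). `include_patterns` and `download` are hypothetical helper names.

```python
def include_patterns(fmt=None):
    """Glob patterns equivalent to the --include arguments above."""
    if fmt is None:
        return ["google_workspace/*"]
    if fmt not in ("single-turn", "multi-turn"):
        raise ValueError(f"unknown format: {fmt}")
    # The multi-turn pattern is inferred by analogy with single-turn.
    return [f"google_workspace/google-workspace_{fmt}/*"]


def download(fmt=None):
    """Fetch the matching files and return the local snapshot path."""
    # Imported lazily so the pattern helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="jindidi/eigendata-demo-data",
        repo_type="dataset",
        allow_patterns=include_patterns(fmt),
    )


print(include_patterns("single-turn"))
```

Calling `download("single-turn")` performs the network fetch; the example only prints the pattern so it can run offline.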

<Card title="Browse on Hugging Face" icon="face-smile" href="https://huggingface.co/datasets/jindidi/eigendata-demo-data/tree/main/google_workspace">
  View Google Workspace files
</Card>
