Tiller-sheets-export - Unofficial tool to fetch Tiller data from Google Sheets as Arrow, DuckDB, or Parquet

Reminder to users considering executing code-based workflows:

  1. Malicious code can exfiltrate your data or worse
  2. Tiller cannot screen or protect you when installing third-party software — its presence in this community is not a security endorsement
  3. Thoroughly review all code before installing and executing it

Overview

Unofficial tool to fetch Tiller financial data from Google Sheets as Arrow, DuckDB, or Parquet.

I have a few different projects where I needed to fetch Tiller sheet data in a more structured way in Python, for things like analysis, auditing, or running checks while importing data from an institution Tiller does not support. Instead of repeating the fetching and parsing logic every time, I split it into a small library and CLI tool, and I’m publishing it in case it’s useful to others.

This is more on the advanced side and is mostly useful to software engineers and data analysts.

What it does

The CLI tool:

  • After authentication, you pass a Google Sheets URL and it exports the Transactions and Categories sheets as Parquet files. The output can be queried by anything that reads Parquet, for example pandas, DuckDB, Polars, and BI tools.

The Python Library:

  • After authentication, you pass a Google Sheets URL and get the Transactions and Categories sheets as PyArrow tables or DuckDB relations. From there you can save to disk, convert to DataFrames, or run SQL: anything PyArrow or DuckDB supports.

Both:

  • Basic validation runs automatically and logs warnings for things like type mismatches, missing categories, and empty cells in critical fields.
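
As a rough illustration of the kind of check described above (this is not the tool's actual code; the column names, sample rows, and warning messages below are made up), invalid values can be coerced to NULL while logging a warning:

```python
import pandas as pd

# Stand-in rows resembling a Tiller Transactions sheet (illustrative only).
raw = pd.DataFrame({
    "Date": ["2024-01-05", "not a date", "2024-02-10"],
    "Amount": ["-42.10", "", "13.37"],
})

dates = pd.to_datetime(raw["Date"], errors="coerce")     # invalid -> NaT (NULL)
amounts = pd.to_numeric(raw["Amount"], errors="coerce")  # empty -> NaN (NULL)

for col, series in [("Date", dates), ("Amount", amounts)]:
    n_bad = int(series.isna().sum())
    if n_bad:
        print(f"warning: {n_bad} value(s) in '{col}' coerced to NULL")
```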

Installation

Authentication

The CLI and library use Google Application Default Credentials (ADC) to access Google Sheets, so they can run locally with gcloud or in environments like Google Colab. To use gcloud you’ll need a GCP account/project.

Important note about access

If you authenticate as your user with gcloud auth application-default login and grant the Sheets scope, your local credentials can be used to read any Google Sheet that your Google account has access to. This tool only reads the sheet URL you provide, but the credential itself is not limited to a single spreadsheet.

If you want to limit access to just one spreadsheet, you can:

  • Create a Google Cloud service account

  • Share only the target spreadsheet with that service account

  • Authenticate as that service account either by:

    • Using a service account key file and setting GOOGLE_APPLICATION_CREDENTIALS env var, or

    • Using gcloud to impersonate the service account

As always, read the code before using it. That said, this is a pure Python tool and does not send your data anywhere.

  1. Login to Google Cloud
    1. gcloud auth login
  2. Set up credentials with the Sheets scope
    1. gcloud auth application-default login --scopes=https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/spreadsheets.readonly
  3. Set your project (replace <PROJECT_ID> with your actual GCP project ID)
    1. gcloud auth application-default set-quota-project <PROJECT_ID>
    2. gcloud services enable sheets.googleapis.com --project=<PROJECT_ID>

CLI Linux/macOS permanent install with uvx.sh (installs uv + tiller-sheets-export):

curl -LsSf uvx.sh/tiller-sheets-export/install.sh | sh
tiller-sheets-export "https://docs.google.com/spreadsheets/d/YOUR_ID/edit"

Or, if you already have uv installed:

uvx tiller-sheets-export "https://docs.google.com/spreadsheets/d/YOUR_ID/edit"

Library installation

uv add tiller-sheets-export

In Google Colab

from google.colab import auth
auth.authenticate_user()

!pip install tiller-sheets-export

Usage

CLI:

tiller-sheets-export "https://docs.google.com/spreadsheets/d/YOUR_ID/edit"

This generates:

  • data/processed/transactions.parquet
  • data/processed/categories.parquet

Library:

from tiller_sheets_export import TillerData

data = TillerData.fetch(spreadsheet_url="https://docs.google.com/spreadsheets/d/YOUR_ID/edit")

# To Arrow
data.transactions.to_arrow()
data.categories.to_arrow()

# To pandas
data.transactions.to_arrow().to_pandas()

# To duckdb (con is optional)
import duckdb

con = duckdb.connect()
transactions = data.transactions.to_duckdb(con=con)
categories = data.categories.to_duckdb(con=con)

con.sql("""
    SELECT t.date, t.description, t.amount, c.group, c.type
    FROM transactions t
    LEFT JOIN categories c ON t.category = c.category
    ORDER BY t.date DESC
""").show()

In Google Colab:

from google.colab import auth
auth.authenticate_user()

!pip install tiller-sheets-export

from tiller_sheets_export import TillerData

data = TillerData.fetch(spreadsheet_url="https://docs.google.com/spreadsheets/d/YOUR_ID/edit")
data.transactions.to_arrow().to_pandas()

License

Fully open source, MIT licensed.

FAQ

  • Automatic validation logs warnings for type mismatches, missing categories, and empty critical fields. Invalid values are coerced to NULL.

Thanks for this well-documented share, @eoneoy9Pei0p.
It looks like a cool unlock for all sorts of new data pipelines.

To all: please review and carefully consider code-based workflows.
