I have some thoughts on the process. I spent some time looking under the hood yesterday at how the initial load categorized my transactions, and I noticed a fascinating pattern regarding how metadata is being parsed.
Only 15 of my new direct fills were handled by a mix of direct match and AI Suggest (which spot-checked at about 1-for-2 on accuracy). However, looking closely at the backend columns, the upstream aggregator actually passed down highly accurate Category Hints that unfortunately got overridden or flattened by the standard pipeline.
Specific Examples:
AES Student Loan: The aggregator cleanly passed down a Category Hint of Loans, but the default matching engine flattened it to a generic Transfer. (I manually updated this to Education).
CVS: The aggregator passed down a highly accurate hint of Healthcare/Medical, but the system still required a manual correction to map it to my specific Medical/Dental bucket. (The system's AI Suggest chose Shopping).
Verizon: The aggregator explicitly passed down Cable/Satellite/Telecom, which is incredibly granular, but the system default-mapped it to the much broader Utilities and Bills.
Rocket & Figure (Mortgages): Rocket was left entirely blank instead of using a loans category. Figure was passed as a Transfer by the aggregator, which was actually incorrect in this specific case.
The Product Opportunity
As a data consumer, it seems like the most accurate, context-aware piece of data in the entire stream is actually that baseline Category Hint from the aggregator. If the Tiller pipeline could leverage or prioritize that hint field to dynamically map or suggest categories, it would instantly solve a lot of the initial-load messiness and text-matching limitations before the user ever has to intervene.
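To make the idea concrete, here is a rough sketch of what mapping a hint onto a user's own categories could look like. This is not how Tiller works internally (I haven't seen that code); the `HINT_TO_CATEGORY` table and the `suggest_from_hint` function are names I made up, and the two mappings are just the ones from my examples above.

```python
from typing import Optional

# Hypothetical table: aggregator Category Hint -> the user's own category.
# The two entries mirror the AES and CVS examples from this post.
HINT_TO_CATEGORY = {
    "Loans": "Education",
    "Healthcare/Medical": "Medical/Dental",
}


def suggest_from_hint(hint: Optional[str]) -> Optional[str]:
    """Return the user's category for a hint, or None if there is no mapping."""
    if not hint or not hint.strip():
        return None  # the aggregator sent nothing usable
    return HINT_TO_CATEGORY.get(hint.strip())


print(suggest_from_hint("Healthcare/Medical"))
print(suggest_from_hint("Cable/Satellite/Telecom"))
```

A hint with no mapping (like the Verizon one here) simply returns nothing, so the pipeline could fall back to whatever it does today.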
(Note: AI Suggest was initially turned on by default when I loaded my Google Sheet for the first time, which contributed to two categorizations).
My Ideal Categorization Hierarchy
To fix this initial-load messiness and account for the occasional aggregator error (like Figure), the data pipeline should follow this specific priority logic:
1. AutoCat Rules (Should always take ultimate precedence)
2. Aggregator Category Hints (The priority default if no AutoCat rule exists)
3. Nulls / "Uncategorized" (If the aggregator provides no hint)
4. Manual Spreadsheet Intervention (Final cleanup by the user after viewing in a database or manually on their spreadsheet by sorting the categorization column)
Utilizing the aggregator's category hints as a core pillar of the logic before hitting manual cleanup would significantly streamline the onboarding experience.
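For anyone who thinks better in code, steps 1 through 3 of the hierarchy could be sketched roughly like this. It is only an illustration under my own assumptions: the `categorize` function is hypothetical, and I am treating an AutoCat rule as a simple "description contains this text" check, which may not match how AutoCat really matches. Step 4 (manual cleanup) happens in the sheet afterward, so it is not in the code.

```python
from typing import Mapping, Optional


def categorize(description: str,
               hint: Optional[str],
               autocat_rules: Mapping[str, str]) -> str:
    """Pick a category using the proposed priority order.

    autocat_rules is a hypothetical {text to look for: category} mapping.
    """
    # 1. AutoCat rules always win, even over an aggregator hint.
    lowered = description.lower()
    for needle, category in autocat_rules.items():
        if needle.lower() in lowered:
            return category

    # 2. Otherwise fall back to the aggregator's Category Hint, if any.
    if hint and hint.strip():
        return hint.strip()

    # 3. No rule and no hint: leave it for manual cleanup.
    return "Uncategorized"


rules = {"figure": "Loans"}  # user rule that corrects the bad Transfer hint
print(categorize("FIGURE LENDING PMT", "Transfer", rules))
print(categorize("VERIZON WIRELESS", "Cable/Satellite/Telecom", rules))
print(categorize("ROCKET MORTGAGE", None, rules))
```

Putting the rules first is what covers cases like Figure, where the hint itself is wrong: one user rule overrides it, and everything else still benefits from the hints.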
Since I am unfamiliar with the actual description-matching logic, this suggested approach, although not perfect, would help with initial loads. -David
Edit: Maybe add a toggle switch for aggregator hints?