Stripping prefixes from transactions

Kind of an out-of-left field question, but I’ll give it a try: is there any way to automate stripping out prefixes from transactions? I find it annoying that every transaction done through Square or Toast gets a prefix (sq* or tst*) when it downloads to the transaction sheet from my credit card. I don’t know much of anything about how to program Google sheets, but would there be some way to automate removing those prefixes? It’s not really important to me what credit card processor was used. No, this isn’t a huge problem, but I figured I’d ask if I’m missing an easy way to do this.

1 Like

Individually they could be renamed using AutoCat. I don’t believe there is a way to just strip the ‘*sq’ off and keep other unique descriptors, but you could rename everything containing ‘*sq’ to “Square” or something like that.

2 Likes

Yeah i’ve ran into a similar situation before, as jpfieber said, aside from running a separate macros to split it out:
use auto-cat with RegEx to find the actual store name after the sq and rename that way, or
if you want to name those items even further is to remove the "ignore items if already auto-cat"ed toggle that way on first pass it removes the sq and then it names it to what you want

I haven’t tried the second method as i don’t trust myself to no have unintentional conflicts. So most stores that use square that I visit i go often enough that I just create a separate rule.

1 Like

I’m trying to understand this suggestion better.

Are you referring to this AutoCat setting?
image

And what would a two-pass auto-cat look like? It seems like it would all have to happen in one pass. How do you remove the SQ* on the first pass without renaming it the same pass?

If Full Description is:
SQ* SWEET DOZEN BAKERY

Category Description Contains Description
Food SWEET DOZEN SWEET DOZEN

vs.

Category Description Contains Description
Food SQ* ???
Food ??? ???

Yes, it’s been a while, so i forgot if it can do it all in one sitting. but with the “All transactions” you’ll first search for " `SQ*" in Description Contains then in Description Regex something like “(?<=.{4}).+” which will skip the first four chars
and then in another line in Description Contains “SWEET DOZEN BAKERY” and in Description “Sweet Dozen”
I think for this example though you could go strait to the second autocat line, ie must be a reason I ended up not splitting it.

Like this?

Category Description Contains Description Regex Description
Food SQ* (?<=.{4}).+
Food SWEET DOZEN BAKERY Sweet Dozen

Okay, yeah, that’s what I was thinking, only the second line is needed.

Would be nice to be able to do something like this search/replace, though, what @dmetiller was asking about (strip out the prefix). And then one line covers all merchants. This intends to also strip out the business type suffix (i.e. BAKERY), assuming type is always one word.

Category Description Contains Description Regex Description
Food ^SQ\*\s+(.*)\s+\S+\s*$ $1
3 Likes

Thanks for this feedback. We do already have the Full Description column that gets edited down to what you see in the Description column. I wonder if there’s some more tweaking our dev team can do to improve those descriptions more. You could add a feature request so this gets on their radar and we can see how many other Tiller users agree.

1 Like

Hi @dmetiller - I like this post and I learned something that I should’ve realized long ago in that AutoCat for Google Sheets has an advanced rule builder allowing updates to many columns. I was oblivious and I watched the nice walk-through vid that @heather created here.

It’s a good suggestion that you posted as a Feature Request - it doesn’t seem possible so far because the way it looks as the others said is that the Rule builder filters using Regex which is like a REGEXMATCH function but it seems that there is no corresponding REGEXREPLACE built into that automation. While we await a possible enhancement, I’ve got a workaround if you want to try it out.

In my Transactions sheet, I added a column at the far right and entered the following formula (it assumes the Description is in Column C):

=ARRAYFORMULA(IF(ROW(C1:C)=1,
"Prefix Investigator" &CHAR(10) & ARRAYFORMULA(SUM(IF(REGEXMATCH(C1:C, "^[A-Za-z ]{3,5}\* ?"), 1, 0))),
IF(REGEXMATCH(C1:C, "^[A-Za-z ]{3,5}\* ?"), REGEXREPLACE(C1:C, "^[A-Za-z ]{3,5}\* ?", ),)))

This one is called the Prefix Investigator because it looks for all of these types of transactions that come from payment processors and it strips the prefix. This particular Regex narrows it down to those that are between 3 and 5 characters. One interesting finding was that while the spaces and asterisk are consistent within an individual payment processor, it is not the case amongst different ones (ex. TST* - no space before asterisk, vs SQ *).

This Regex may cast a bit too wide of a net, so you can create an additional column or replace the first one if you like to focus just on the prefixes you care about. For example the three main ones for me were Square, Toast, and Intuit. For that I pasted the following formula in Row 1:

=ARRAYFORMULA(IF(ROW(C1:C)=1,
"Prefix Stripper" &CHAR(10) & 
IFNA(QUERY(C1:C,"select count(Col1) where LOWER(Col1) starts with 'tst* ' or LOWER(Col1) starts with 'sq *' or LOWER(Col1) starts with 'in *' label count(Col1) ''",1),0),
PROPER(IFNA(IFS(ISNUMBER(SEARCH("tst~* ",LOWER(C1:C),1)),RIGHT(C1:C,LEN(C1:C)-5),
ISNUMBER(SEARCH("in ~*",LOWER(C1:C),1)),RIGHT(C1:C,LEN(C1:C)-4),
ISNUMBER(SEARCH("sq ~*",LOWER(C1:C),1)),RIGHT(C1:C,LEN(C1:C)-4)),""))))

This one I call the Prefix Stripper and it defines more exactly the three that I encounter most frequently.

After that you can place a temporary filter on your data and filter out the blank rows. Or you can create a reusable filter view. The counter in the header row can be used to verify that the strings entered into the SEARCH function are correct and that you indeed have all of the rows being captured.

Finally there’s just a manual step to copy/paste the values over to the corresponding row in the Description column. Once you do that, the value disappears in the Prefix Stripper column and the counter reduces until you hit zero.

I will caution that if you load up a lot of ARRAYFORMULAs and run them against a large dataset, you could start to see some slower performance from the sheet. You could reduce any performance hit by changing all the range references C1:C which cover the entire column to C1:C100 as an example. After all, once you run it the first time you will not really need to keep looking back. I am running it initially against 6000+ rows without any noticeable issue.

Unfortunately I don’t have any screenshots because I don’t have a non-personal dataset with these prefixes.

I hope it’s helpful to you or someone else that may want to try it out. Feel free to let me know if any questions!

5 Likes

Wow. Thanks for this, @KyleT. Still a bit of a newbie to Google Sheets formulas, so I may have to build up some courage to try this, but I very much appreciate the post and the work that went into this. Very cool.

2 Likes

@dmetiller If you want a clean and tidy looking sheet – I’d use the Description override feature in the advanced autocat features. This will clean up anything that you’re creating rules for. You’ll have to manually do any that you’re not creating a rule for.

Another idea would be to create a new column called something like “Description for Display”, and write a formula that would use the right trim function and the length of the description string to trim strings if they contain one of the specified prefixes, but I don’t think this will go very far in having a neat outcome.

For some of my sheets I don’t bother, but the ones that have lower transaction volume and are intended to share with others, I do go to the trouble of adding a description override when creating the autocat rule. Often the description I’m using to filter by is the same description I want to override with, so it’s not much extra work to copy that over into the additional column.

2 Likes

This is very helpful. Thanks. As I get more comfortable with making my own modifications and tweeks to the sheets, I will keep this one in mind.

1 Like

Try my AI AutoCat script – does this for you plus a lot more.

3 Likes

I saw this in that thread and thought it was immensely cool. Working with scripts still makes me a little nervous, so I just need to building up the courage to give it a try. Thanks!

1 Like

I have on my list to try to package this as an Add On so it’s easier to install, but Google doesn’t make that super easy unfortunately. Just have to learn all the steps to do it and I will update that thread.

Best would be for Tiller to integrate something like this into their extension!

1 Like

Yep. At least one of Tiller’s competitors is proudly promoting its “AI-powered spending categorization,” which I suspect just means that they are running something very similar to this under the hood.

It’s not very good, I’ve tried it. :wink:

What I think makes mine work pretty well is that it’s based on your category list and how you have previously categorized (and described) things so it learns from you.

3 Likes

@dmetiller If your unsure of scripting or complex changes you can try a real easy feature that is built into Google Sheets called Find and Replace. Its located under Edit option in toolbar and you can Find SQ * and replace with nothing to delete. Instead of replacing all you can just find and replace one at a time until your comfortable. Just a thought on how to easily accomplish the mission.

3 Likes

This is a great discussion and we do expect to continue improving the quality of the data and/or the customization capabilities.

1 Like