Assorted geeky things, reality tv, and bragging about my kids

I love awk. It's one of the first tools I go to when I need to do something like figure out the average value of a column of numbers:

awk '{ total += $1; count++ } END { if (count) print total/count }' file

Or the total of one column for each distinct value in another:

awk '{ table[$1] += $2 } END { for (i in table) print i, table[i] }' file

But variable fields bug me. That is, when you set up something like '.' or ' ' as your field delimiter, but then one row has this: A.B.C.D and one has A.B..C.D for whatever reason. With '.' as the delimiter, the doubled dot creates an empty field, so C slides from field 3 to field 4. You can no longer say "field #3" and be sure that you are getting what you wanted.
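Here's a quick sketch of the problem, using the two sample rows above. The function name third_field is just something I made up for illustration:

```shell
# Show the field count and field 3 for each dot-delimited line on stdin.
# third_field is a hypothetical helper name, not a standard tool.
third_field() {
    awk -F. '{ print NF ": [" $3 "]" }'
}

printf 'A.B.C.D\nA.B..C.D\n' | third_field
# 4: [C]
# 5: []
```

The second row has five fields instead of four, and field 3 is the empty string between the two dots.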

That's where the NF variable comes in, for Number of Fields. awk sets it for every record, so $NF is always the last field and $(NF-1) is the one before it. If you figure that somewhere in the middle you're going to lose count, you can start counting from the opposite end of the record.
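Counting from the end looks something like this. It's a minimal sketch on the same kind of dot-delimited rows; last_two is a made-up name:

```shell
# Print the second-to-last and last dot-delimited fields of each line.
# last_two is a hypothetical helper name; lines with fewer than
# two fields are skipped so $(NF-1) never refers to field 0.
last_two() {
    awk -F. 'NF >= 2 { print $(NF-1), $NF }'
}

printf 'A.B.C.D\nA.B..C.D\n' | last_two
# C D
# C D
```

Both rows give the same answer, even though they don't have the same number of fields.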

In the case I'm working on I have an "author" field which may or may not contain the following: email, delimiter, location, where location is often but not always "city, state". After that comes a pipe delimiter, and then an index field. I need the email (field 1) and the index (field NF), and I don't care what's in the middle. Very handy indeed.
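Roughly, that comes out like the sketch below. The sample rows are invented, and I'm assuming fields are split on runs of spaces or pipes, which may not match the real data exactly:

```shell
# Print the first field (email) and the last field (index) of each line.
# Assumption: fields are separated by runs of spaces and/or pipes.
# email_and_index and the sample rows are hypothetical.
email_and_index() {
    awk -F'[ |]+' '{ print $1, $NF }'
}

printf 'alice@example.com - Boston, MA|42\nbob@example.com|7\n' | email_and_index
# alice@example.com 42
# bob@example.com 7
```

The first row has five fields and the second has two, but $1 and $NF land on the right values either way.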

More awk (and other Unix utility) tricks at my other, more geeky blog, Duane's Brain.
