Programmers, please take five minutes to provide some data for an experiment

April 19, 2012 Programming, Unix 30 comments

Whenever people talk about ack, there’s always a discussion of whether ack is faster than grep, and how much faster, and people provide data points that show “I searched this tree with find+grep in 8.3 seconds, and it took ack 11.5 seconds”. Thing is, that doesn’t take into account the amount of time it takes to type the command.

How much faster is it to type an ack command line vs. a find+xargs line? I wanted to time myself.

Inspired by this tweet by @climagic, I used time read to see how long it would take me to type three different command lines.

The three command lines are:
A: ack --perl foo
B: find . -name '*.php' | xargs grep foo
C: find . -name '*.pl' -o -name '*.pm' | xargs grep foo

Note that time read doesn’t actually execute the command. It only measures how long it takes me to type the line and hit Enter.

$ time read
find . -name '*.pl' -o -name '*.pm' | xargs grep foo

real    0m8.648s
user    0m0.000s
sys     0m0.000s

My timings averaged about 1.4s for A, 6.1s for B and 8.6s for C. That was with practice. I also found that it is nearly impossible for me to type the punctuation-heavy B and C lines without making typos and having to correct them.
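
To put rough numbers on "punctuation-heavy", here is a minimal sketch that counts the total characters and the punctuation characters in each of the three command lines. The describe function is a hypothetical helper, not part of the experiment, and it only measures the text to be typed, not typing speed or Shift-key presses.

```shell
#!/bin/sh
# Count total characters and punctuation characters in a command line.
# "describe" is a hypothetical helper name used only for this illustration.
describe() {
    label=$1
    cmd=$2
    # wc -c pads its output with spaces on some systems, so strip them.
    total=$(printf '%s' "$cmd" | wc -c | tr -d ' ')
    # tr -cd deletes everything that is NOT punctuation, leaving only punctuation.
    punct=$(printf '%s' "$cmd" | tr -cd '[:punct:]' | wc -c | tr -d ' ')
    printf '%s: %s chars, %s punctuation\n' "$label" "$total" "$punct"
}

describe A "ack --perl foo"
describe B "find . -name '*.php' | xargs grep foo"
describe C "find . -name '*.pl' -o -name '*.pm' | xargs grep foo"
```

By this count, A is 14 characters with 2 punctuation marks, B is 37 with 7, and C is 52 with 13.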

So I ask of you, dear readers, would you please try this little experiment yourself, and then post your results in the comments? Just give me numbers for A, B and C and then also include the name of your favorite Beatle so I know you actually read this. Also, if you have any insights as to why you think your results came out the way they did, please let me know.

At this point I’m just collecting data. It’s imperfect, but I’m OK with that.

  • Yes, I’m sure there’s another way I could do this timing. It might even be “better”, for some values of “better”.
  • Yes, I know that I’m asking people to report their own data and there may be observational bias.
  • Yes, I know I’m excluding Windows users from my sample.
  • Yes, I know it’s possible to create shell aliases for long command lines.
  • Yes, I know that the find command lines should be using find -print0 and xargs -0.
  • Yes, I know that some shells have globbing like **/*.{pl,pm}.
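
For anyone who does want the safer form of command line C mentioned in the list above, here is a minimal sketch. The search_perl function name is hypothetical. It assumes a find and xargs that support -print0 and -0, which the GNU and BSD versions do.

```shell
#!/bin/sh
# Search Perl files for a pattern, coping with spaces in file names.
# "search_perl" is a hypothetical helper name, not an existing tool.
search_perl() {
    pattern=$1
    dir=${2:-.}
    # The \( ... \) grouping matters once an explicit action is added:
    # without it, -print0 would apply only to the second -name test,
    # because the implicit "and" binds tighter than -o.
    find "$dir" \( -name '*.pl' -o -name '*.pm' \) -print0 |
        xargs -0 grep -- "$pattern"
}

# Usage (not run here): search_perl foo .
```

Note that this makes the line even longer to type, which is rather the point of the experiment.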

Note: I’ve heard from a zsh user that time doesn’t work for this because it’s a shell function, but /usr/bin/time does work.

Thanks for your help! I’ll report on results in a future post.

The world’s two worst variable names

April 18, 2012 Programming 71 comments

As programmers, assigning names makes up a big part of our jobs. Phil Karlton said “There are only two hard things in Computer Science: cache invalidation and naming things.” It’s a hard problem, and it’s something we deal with every time we write a line of code. Whether it’s a variable or a table or a column in that table or a file on the filesystem, or what we call our projects and products, naming is a big deal.

Bad variable naming is everywhere. Maybe you’ll find variables that are too short to be adequately descriptive. The programmer might as well have been working in TRS-80 BASIC, where only the first two characters of variable names were significant, and we had to keep a handwritten lookup chart of names in a spiral notebook next to the keyboard.

Sometimes you’ll find variables where all vowels have been removed as a shortening technique, instead of simple truncation, so you have $cstmr instead of $cust. I sure hope you don’t have to distinguish the customers from costumers! Worse, $cstmr is harder to type because of the lack of vowels, and is no longer pronounceable in conversation.

There are also intentionally bad variable names, where the writer was more interested in being funny than clear. I’ve seen $crap as a loop variable, and a colleague tells of overhauling old code with a function called THE_LONE_RANGER_RIDES_AGAIN(). That’s not the type of bad variable name I mean.

While I’m well aware that variable naming conventions can often turn into a religious war, I’m entirely confident when I declare The World’s Worst Variable Name is $data.

Of course it’s data! That’s what variables contain! That’s all they ever contain. It’s like if you were packing up your belongings in moving boxes, and on the side you labeled the box “matter.”

Variable names should say what type of data they hold. Asking the question “what kind” is an easy way to enhance your variable naming. I once saw $data used when reading a record from a database table. The code was something like:

$data = read_record();
print "ID = ", $data["CUSTOMER_ID"];

Asking the question “what kind of $data?” turns up immediate ideas for renaming. $record would be a good start. $customer_record would be better still.

Vague names are the worst, but right behind them are naming related objects with nearly identical names that do not distinguish them. Therefore the World’s Second Worst Variable Name is: $data2.

More generally, any variable that relies on a numeral to distinguish it from a similar variable needs to be refactored, immediately. Usually, you’ll see it like this:

$total = $price * $qty;
$total2 = $total - $discount;
$total2 += $total2 * $taxrate;

$total3 = $purchase_order_value + $available_credit;
if ( $total2 < $total3 ) {
    print "You can't afford this order.";
}

You can see this as an archaeological dig through the code. At one point, the code only figured out the total cost of the order, $total. If that’s all the code does, then $total is a fine name. Unfortunately, someone came along later, added code for handling discounts and tax rate, and took the lazy way out by putting it in $total2. Finally, someone added some checking against the total that the user can pay and named it $total3.

The real killer in this chunk of code is that if statement:

if ( $total2 < $total3 )

You can’t read that without going back to figure out how it was calculated. You have to look back up above to keep track of what’s what.

If you’re faced with naming something $total2, change the existing name to something more specific. Spend the five minutes to name the variables appropriately. This level of refactoring is one of the easiest, cheapest and safest forms of refactoring you can do, especially if the naming is confined to a single subroutine.

Let’s do a simple search-and-replace on the coding horror above:

$order_total = $price * $qty;
$payable_total = $order_total - $discount;
$payable_total += $payable_total * $taxrate;

$available_funds = $purchase_order_value + $available_credit;
if ( $payable_total < $available_funds ) {
    print "You can't afford this order.";
}

The only thing that changed was the variable names, and already it's much easier to read. Now there’s no ambiguity as to what each of the _total variables means. And look what we found: The comparison in the if statement was reversed. Effective naming makes it obvious.

There is one exception to the rule that all variables ending with numerals are bad. If the entity itself is named with a number, then keep that as part of the name. It's fine to use $sha1 for a variable that holds a SHA-1 hash. It helps no one to rename it to $sha_one.

After I wrote the first version of this article, I created two policies for Perl::Critic to check for these naming problems. They are ProhibitVagueNames and ProhibitNumberedNames, and both are included in my add-on module Perl::Critic::Bangs.
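
Where Perl::Critic isn’t set up, a crude first pass for numbered names can be done with grep. This is only a sketch under stated assumptions: numbered_vars is a hypothetical helper, it matches text rather than parsing Perl, and it assumes a grep that supports -o and -h, which the GNU and BSD versions do. It is not how the Perl::Critic policies work.

```shell
#!/bin/sh
# List variable names that end in a digit, such as $total2 or $data2.
# "numbered_vars" is a hypothetical helper. It is a plain text match, so
# it also flags legitimate names like $sha1 and looks inside strings
# and comments.
numbered_vars() {
    # First pull out every $name token, then keep the ones ending in a digit.
    grep -ohE '\$[A-Za-z_][A-Za-z0-9_]*' "$@" | grep -E '[0-9]$' | sort -u
}

# Usage (not run here; the path is hypothetical): numbered_vars lib/Order.pm
```

Each hit still needs a human to decide whether it is a $total2 or a $sha1.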

What other naming sins drive you crazy? Have you created automated ways to detect them?

Undecided if something should go on your resume? Add more detail for guidance.

April 11, 2012 Resumes 2 comments

Conventional wisdom has it that resumes have to be written in the most clipped, stilted business-speak possible.  It’s not true.  Thinking that way is a disservice to our resumes and our job prospects.

A poster on Reddit asked how proficient he should be in German before listing it on his resume. You can see where he’s coming from. He’s wondering if he can add a “Languages spoken: German” bullet point to his resume, and that’s good. The problem is that the clipped business-speak mentality has him thinking that that’s all he can say.

You can and should add detail to your resume. The more detail you add, the less chance there is for misinterpretation, and it helps you think more about your skills and how you can sell them to the reader.

I suggest that instead of putting an overly terse “Languages spoken: German”, you add a sentence giving details. This might be, for example:

  • I am fluent in written and spoken German, and have been for the past 20 years.
  • I have conversational fluency with spoken German.
  • I know some German words I picked up from my Grandma.

If in the process of writing the details of your skill you find that it sounds silly, then you’ve answered your question as to whether it should be on your resume.  To be clear, that last bullet item isn’t worth putting on your resume.

This process works with any item you want to put on a resume.  As you add detail, does it still sound like it’s worth putting on there?  If not, leave it off.  If it is, work with that detail to grab the reader’s attention.

Programmers struggle with this all the time.  “How much Ruby do I have to know before I can put it on my resume?”  Add detail to answer your own question.  If you’re not going to be comfortable answering the question “How have you used Ruby?” in the interview, then don’t put it on a resume.

Finally, always remember why you have a resume: A resume exists to get the reader to call you in for an interview.  If something isn’t going to make the reader say “We need to get her in here ASAP”, then leave it off.

The nameless “they” and the Facebook & job interview trend that isn’t a trend

April 6, 2012 Interviews 19 comments

“I’m never eating there again,” he told me. “You know what they do?”

I was standing around at a party twenty years ago, and conversation got around to what our first jobs were. I said that my first job was at the McDonald’s, and someone in the circle looked stricken. “You couldn’t pay me to eat there. You know what they do there?” he asked. “I knew a guy who worked at McDonald’s, and he saw this other guy drop a hamburger patty on the floor by mistake, and he picked it up and put it on a burger and they served it. I’m never eating there again.”

The guy at the party had invoked the nameless “they,” as if McDonald’s tells its kitchen workers it’s OK to observe the five-second rule. Maybe he meant “they” to mean there was a secret cabal of grill workers who create Big Macs with special seasonings from the floor. He took the actions of one worker at one time to be an indicator of a trend. He nursed his horror and made sure everyone else knew about it.

But what if this tale of the dirty burger got on the news? Maybe the story would spread like wildfire across the country, with outraged citizens letting everyone else know about this horror. Maybe pundits would come out with columns excoriating the stupid practice of picking up hamburgers dropped on the floor, and why it’s bad for business. Maybe opportunistic politicians could beat their chests and call for a Justice Department inquest into this alarming trend.

Absurd, right? But that’s exactly what this non-trend of “job interviewers demand your Facebook password” is.

Over the past week, blogs and message boards and, of course, Facebook have been burning up with outrage at this non-trend. People commiserate and shake their heads grimly, imagining being stuck between the rock of having an employer snoop in our Facebook accounts and the hard place of not having a job. People turn on Internet Tough Guy mode and imagine their defiance at the scenario, or give their theories as to the legalities of the practice. Business pundits weigh in on why it’s a bad idea.

The original AP news story that sparked this hullabaloo named one candidate, Justin Bassett, citing one interview at one unnamed company. That’s it. Still, it’s been rerun over and over and over. Every article has a similarly declarative headline like “Job seekers get asked in interviews to provide Facebook logins.” That’s as absurd as saying “McDonald’s serves burgers off the floor” because of the story the guy at the party told.

The news media have added non-facts, with one headline calling it a “growing trend”. The follow-on news stories didn’t help. News media and bloggers snowballed it without doing further research. Even NPR, smarting from Mike Daisey’s fabrications, ran the story saying that “some companies” are asking for Facebook passwords. “Some companies” has as much to back it up as “they,” but it doesn’t sound so bad.

Senators have called on the DOJ and EEOC to launch investigations. (Also disturbing to me is Schumer’s assertion that in the job-seeking process, “all the power is on one side of the fence,” which only helps reinforce that incorrect idea.)

Is it plausible that this practice is widespread, and getting more so? Sure, it’s plausible. Our privacy erodes every day, and millions of us do it through Facebook willingly. The story has the feel of truthiness. Doesn’t it just seem like the thing that Big Business would do to us? We already piss in cups to prove that we’re drug-free so that we can come in and shuffle paper.

To be sure, there are cited cases in that AP story of employers requiring access to candidates’ Facebook accounts. As Matthew Kauffman points out in his excellent probing of this story, those cases are of law enforcement and corrections departments, where greater scrutiny of candidates is common and expected. “In many of those cases, of course, applicants are also subjected to a full-on psychological evaluation,” Kauffman points out.

Kauffman’s article isn’t alone in being sensible. An article on CNN.com says “The reason you haven’t come across any job interviewers asking for your Facebook password is that the practice is pretty rare.”

But how did this non-story get to this point? You got suckered in and the media ran with it.

When you heard this story, did you even question it? Or did you just forward it and post it as if it was an important life-saving story about there are these gang initiations and how “they” will kill anyone who flashes their lights?

It’s 2012, and we are the media. When we fan the flames of non-issues like this, we become the media that we should seek to leave behind.

Finally, in my job as blogger about employment and job interviews, I would be remiss in not addressing how to deal with a request for your Facebook credentials. I’ve read plenty of comments in threads suggesting walking out of the interview, or lying to the interviewer and saying you don’t have a Facebook account.

Walking out may feel good, as righteous indignation so often does, but it doesn’t help your situation. You give up any chance you had of getting the job. Lying is easily disproven, and, worst of all, requires you to lie.

The best answer is to calmly and respectfully say “I believe it’s best for business to keep business and personal life separate. That’s why I keep my private life private.” You may not get the job, but at least you’ll have been turned down while keeping a strong sense of ethics about you… which is more than you can say for companies that would ask to snoop in your private life.

Eight items to leave off your resume

March 30, 2012 Resumes 9 comments

Here’s a quick list of things that should never appear on your resume. Unfortunately, I see them all the time.

A photo
Unless you’re applying for a position as a model or actor.
A list of references
You’ll be asked for them at the right point in the process. If you want the company to be impressed by who you know or who you’ve worked with, then put that in the cover letter.
“References available upon request”
This is assumed. The reader will not think “This guy has no references available, so toss his resume.”
An objective
Objectives are summaries of what you want to get from the company. It doesn’t make sense to start selling yourself by telling the reader what you hope to get out of him. Replace your objective with a 3-4 bullet summary of the rest of the resume. (See more posts about objectives)
Salary information
Disclosing your salary history weakens your position when negotiating a salary. It’s also irrelevant on your resume.
An unprofessional email address
Email accounts are free from Gmail, so there’s no reason to use your “cubs_fan_1969@yourisp.com” account for professional correspondence.
Meaningless self-assessments like “I’m a hard worker” or “I work well on a team.”
Everyone says those things, so they have no meaning. Instead, the bullets for each position on your resume should give examples and evidence of these assertions. (See more posts about self-assessments)
Hobbies that don’t relate to the job
Everyone likes to read and listen to music and spend time with their families. The exception is if the hobby somehow ties to the job or company. If you play guitar and you’re applying to be an accountant for Guitar Center’s corporate office, then mention that you play, even though your job won’t involve guitar-playing directly.

What else do you see on resumes that should never be there?

Stand during phone interviews

March 7, 2012 Interviews 1 comment

Best advice I’ve ever heard about how to handle phone interviews is to stand during the call.

Standing will keep you more alert and focused on the interview. In phone interviews, it’s easy to forget that there’s someone else there because you can’t see them.

Don’t walk around or pace, either. Keep focused on the task of selling yourself to the interviewer and listening and learning about the company.

How long should it take for an interviewer to get back to me?

February 24, 2012 Job hunting 1 comment

Every few days in the /r/jobs subreddit, someone will ask “It’s been N days since my interview, and I haven’t heard back. When can I follow up?  How long does it usually take?”

Two big lessons here:  1) there is no such thing as “usual” in the job process, and 2) the time to ask about timeframes is before you leave the interview.

Here’s an excerpt from chapter 8 of my book, Land The Tech Job You Love:

[After specifically stating you want the job,] ask about follow-up. Ask about what the next steps in the process are and when you can expect them to happen. It can be very simple.

You: So, what are our next steps? What timeframe are we looking at?

Interviewer: Well, we’ve got another week of interviews, and then we look at them as a group, so probably the next two weeks you should hear from us.

You: That sounds fine. If I don’t hear back by the 18th, may I call you? Is this number on your card best?

This part is purely for your benefit, so you may omit it if you don’t really care about waiting. However, if you’re like most people, after a while you’ll wonder “Have they forgotten me? Are they just taking a long time?” There’s no such thing as a “usual” amount of time it takes to hear back, so it’s up to you to ask before you leave. This is also a good time to ask for a business card, if you haven’t already been offered one, to make sure you have all the contact information you need.

Be sure to get a specific day, rather than “a couple of days.” As I posted last week, “a couple of days” may mean very different things to you than to the interviewer. Leaving it at “couple of days” is too vague, and leaves you wondering “How many days did he mean?”

Have you ever been asked “What is your biggest weakness?”

February 20, 2012 Interviews 18 comments

It’s become a bit of a joke by now, being asked in a job interview “What is your biggest weakness?” Numerous books and blog posts talk about how to answer the question, turning a negative into a positive, without sounding glib. I discuss it in the “Tough Questions” chapter of my book. It’s been parodied in this movie:

It’s a pretty bad question to ask. Presumably it’s asked to find out how self-aware the candidate is of where they have room for improvement, but there are better ways to find that out. For example, I’ve asked it directly in interviews, “Where do you see room for improvement in your skillset, and what are you doing to make that happen?”

Watching the “biggest weakness” movie above, I realized that I don’t think I’ve ever actually been asked the question in a real job interview. I know that if an interviewer did ask me, my opinion of him would drop considerably. I would wonder if he just got it out of a stock list of questions to ask.

I know what my answer would be, if I was ever asked this live: “I don’t know JavaScript as well as I should. I know enough to do basic form validation and graphic mouseovers, but as far as applications being written with tools like jQuery, I just haven’t gotten into that, and I should because that’s clearly where much of the web is headed.”

What about you? Have you, personally, ever been asked the question in a job interview? How long ago was it? What year? How did you answer, and how did the interviewer take your answer? How was the rest of the interview?

Or is “What is your biggest weakness” almost a sort of urban legend of interview questions, the one that you hear about other people getting asked, but never yourself?

Today’s PostgreSQL indexing gotcha

February 16, 2012 Programming No comments

At work, I have a big 14M-row production table with a bunch of indexes on it.  One of the indexes was bloated, so I built a new version of the index, and dropped the old bloated index.  Got back a gig of space on the filesystem.  Excellent.

Now, from what I understand, that should be all I have to do.  Postgres doesn’t need an ANALYZE on the table to use the new index.  All the column stats for the table in pg_stats are still there, so the query planner can use the index, and it should all Just Work.

Except that all of a sudden slow queries started showing up in the server log, and we were doing sequential scans. The planner wasn’t using the newly built index.

So I did an ANALYZE on the table, and suddenly the planner started using the index. Why was this?

This goes against what I knew. On this page, Robert Treat, Pg guru, says:

When adding indexes, it is not necessary to re-analyze the table so that postgres will “know” about the index; simply creating the index is enough for postgres to know about it.

So why didn’t it work for me? Turns out it was because the index I rebuilt was a functional index.

Apparently, Pg doesn’t know about the functional index unless there’s an ANALYZE to make the planner know about it. I’m guessing that somewhere there’s a pg_stats equivalent that has functional index histograms in it, too.
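
For what it’s worth, the PostgreSQL documentation for the pg_statistic catalog describes it as holding one entry per analyzed table column and one per index expression, which fits the guess above: an expression index would have no statistics of its own until an ANALYZE runs after it is built. Below is a minimal sketch of the rebuild sequence with that ANALYZE included. It is a shell function that only prints SQL, and the function name, table, index names and expression are all hypothetical, not the ones from my table.

```shell
#!/bin/sh
# Print the SQL for rebuilding an expression ("functional") index and then
# refreshing planner statistics. "rebuild_index_sql" and every name passed
# to it are hypothetical; nothing here connects to a database.
rebuild_index_sql() {
    table=$1
    old_index=$2
    new_index=$3
    # A bare function call such as lower(email) is fine as-is; any other
    # expression needs its own extra pair of parentheses.
    expression=$4
    cat <<SQL
CREATE INDEX CONCURRENTLY $new_index ON $table ($expression);
DROP INDEX $old_index;
ANALYZE $table;
SQL
}

rebuild_index_sql customers customers_email_lower_old customers_email_lower_new 'lower(email)'
```

The output could be reviewed and then piped into psql. CREATE INDEX CONCURRENTLY can’t run inside a transaction block, so the statements should be sent without wrapping them in BEGIN and COMMIT.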

If you have further insight on this, please let me know in the comments.

Clarify user expectations to the minute to eliminate frustration and extra work

February 9, 2012 Career, People, Work life 1 comment

Vague timeframes like “ASAP” or “in a few days” are a sure way to get sorrow into your work day.  You’ll likely spend too much effort getting something earlier than the customer wanted, or later than he expected, leaving him frustrated.

Consider this simple request: “Can you get me the number of widgets we sold in 2011 ASAP?”  What exactly does “ASAP” mean?  Always ask for clarification.  “When exactly do you need this?  In ten minutes?”  You might get an answer of “Within half an hour. Jim has a conference call to London at 11:00.”  Or you might get an answer like “Oh, no, by the end of Tuesday is fine.”

This is the same approach to take when someone asks “How long will it take for you to do X?”  She doesn’t really want to know how long it will take, but rather if you can do it in the timeframe she wants.  Therefore, don’t answer the question, but instead find out what the user actually wants by asking “When do you want it by?”

Make sure you always get time requirements down to the minute, not the day.  For instance, if a user says “Can you email me those numbers by Wednesday?” when exactly does that mean?  You might take that to mean “some time on Wednesday”, but she might mean “Wednesday at 8am because that’s when I come in and will want to incorporate them into a report.”  When 8am comes and no numbers are in her mailbox, you look like a chump.  If it’s the other way around and you get her numbers sooner than necessary, you’ve prioritized her work higher than other tasks that need to get done first.

There are all sorts of vague terms to clarify.  “End of business” usually means “I really want it the following morning”, and “by lunchtime” probably means “when I get back from lunch”.  In all cases, clarify to eliminate misunderstanding.

Finally, close the conversation by reiterating your commitment and include your understanding of the time frame.  “I’ll email you an Excel document with those numbers by 4pm tomorrow.”  This makes everything clear and gives one more chance for potential misunderstandings to be made explicit.