On the importance of being a terminology pedant

January 15, 2018
Picking a suitable and consistent terminology makes your software easy to read and maintain. Here is an explanation of why it is so important with a few examples.

Introduction

A common task faced by every development team at the start of a project is to pick its terminology. Should you call the entities doctors or medical professionals? Should we say that doctors have a speciality, or should we call it a department? Are the terms equivalent, so that we can pick either, or is there a cost to choosing one over the other? These are the questions the development team has to answer while the requirements are being fleshed out.

Of course, the domain experts, who are most familiar with the market and customers, typically already have an evolved terminology. But developers have to go one step further: resolve any ambiguities, fill the gaps and, in effect, develop a glossary of terms. They also have to keep it updated as the project evolves.

Picking a good terminology makes the code simple to read. It also makes it easier to have conversations about the code within the development team, as well as with other stakeholders outside the team.

If the project follows a consistent terminology, someone reading the code can make safe assumptions about what a class does or what a variable contains without reading every line of code.

An example

Let us say we are designing a quiz system. The database will have a large question bank of multiple-choice questions. A user will be presented with a set of 5 random questions out of the question bank. The user can answer the quiz within a fixed time. After they answer, they get evaluated and receive a score. We can then later rank the high scorers or award a prize or something like that.

Once we get started on the implementation, one of the first things we have to do is create a DB table:
 
# A simple DB table to store quiz questions
create_table "questions" do |t|
  # columns for storing questions text, options and
  # correct answer go here..
end

So, the first decision we face is what to name this database table. If we call it questions, then a single record, i.e. a model, will be called Question. We will write code like the following:

@question = Question.first

Next, we have to decide what to call the question text. We could call it question_text. We will then write something like this:

@question.question_text

That might look too verbose, though. Could we name it qtext?

@question.qtext

But then we will have to explain to someone unfamiliar with the code what qtext really means. What if we rename the table to something else, say quizzes, so that we can use the term question for the text of the question instead of question_text? These are some of the issues the software engineering team should tackle upfront on any project. Using common names like quiz and question can also be problematic, since people come with a preconceived idea of what a question really represents. A line of code like:

@quiz.question

might leave someone puzzling over whether it refers to the question text of a single question, or whether @quiz is a collection of questions and @quiz.question is one of the questions within that collection.
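
One way to reduce that ambiguity, sketched here in plain Ruby rather than with database-backed models, is to use a plural name for the collection and a distinct attribute name for the wording. The Quiz and Question structs and the text, options and correct_option attributes are hypothetical names picked for illustration, not a decision taken in the design above.

```ruby
# Hypothetical plain-Ruby stand-ins for the models; all names are assumptions.
Question = Struct.new(:text, :options, :correct_option)
Quiz = Struct.new(:questions)

quiz = Quiz.new([
  Question.new("What is 2 + 2?", ["3", "4", "5"], "4"),
  Question.new("Which is a Ruby keyword?", ["def", "func", "fn"], "def")
])

# The plural says "collection"; `text` says "the wording of one question".
first_question = quiz.questions.first
puts first_question.text
```

With these names, quiz.questions can only be read as a collection, and first_question.text can only be read as the wording of a single question.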

Ultimately, whatever name you pick, it is important to be consistent and use it across the codebase, in database tables, in emails and while talking to team members. A notable exception: when the context is clear, as in a loop, you can get away with a really short name, such as q for a question object:

@questions.sort_by do |q|
  q.last_used_at
end

Philosophy

Philosophically speaking, there are three parts to a name: sign, referent and sense. The sign is what we call something. The referent is the object the sign refers to. The sense is the implied meaning or context around that name, i.e. the chosen sign.

@q = Question.active.order("RANDOM()").first

Here @q is the sign. The Question object, i.e. the row fetched from the database, is the referent. The fact that the selected question @q is currently active and was randomly picked is the sense. We could have chosen the name @random_active_question instead of @q to be more precise. But as long as the surrounding code makes the context clear, we can get away with being terse and picking a simpler name like @q.
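
The same trade-off can be sketched without a database. In this minimal example, the Question struct, its active attribute and the random_active_question method are all hypothetical stand-ins. The idea is that when a method name carries the sense, the variable holding the result can stay short.

```ruby
# Hypothetical stand-in for the Question model; `active` is an assumed attribute.
Question = Struct.new(:text, :active)

question_bank = [
  Question.new("Retired question", false),
  Question.new("Current question A", true),
  Question.new("Current question B", true)
]

# The method name carries the sense (random, active), so the caller's
# local variable can stay short.
def random_active_question(questions)
  questions.select(&:active).sample
end

q = random_active_question(question_bank)
puts q.text
```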

Takeaways

👉 The selected name should describe the purpose rather than the implementation, since the purpose is closer to the sense than the implementation is.

👉 Names should reflect the terminology used by actual users and domain experts as far as possible. For example, use Doctor instead of Medical Professional if you know you are going to be dealing with doctors only. But Medical Professional may be better if you know some of the users are going to be nurses, students and technicians as well.

👉 Names should be used consistently: the database table name, class name and local variable names should all use the same terminology. Sometimes a shorter version can be used in loops. However, aliases should be avoided. For example, article, post and content can look pretty much interchangeable initially but could evolve to mean different things in the future.

👉 Well-established languages and libraries also make this mistake. Ruby added a method called yield_self in 2.5, then aliased it as then in 2.6 to improve readability. So if you realize you have not been careful in picking names in your project, take your time and straighten it out.
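
To make the Ruby example concrete, here is a small sketch, assuming Ruby 2.6 or later, showing that the two names do the same thing. The variable names are only illustrative.

```ruby
# Requires Ruby 2.6 or later, where Kernel#then is an alias of Kernel#yield_self.
with_yield_self = 5.yield_self { |n| n * 2 }
with_then       = 5.then { |n| n * 2 }

# Both calls pass the receiver to the block and return the block's result.
puts with_yield_self == with_then
```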