How to Name Things Properly

Perhaps it is not news to everyone that there are two hard things in Computer Science.

I previously explained why I felt programming is not boring because there is some art in doing it. I will not spend any time here arguing whether naming things is an art or not.

Instead, I will go straight into how to name things properly. All the examples below use the Julia language, even though the same principles can be applied to other languages.

Why?

To me, it is extremely important to write code that is readable and easy to understand. If I write it properly, my colleagues will be able to read it, support it, and enhance it going forward. If I write it badly, I may even curse at myself in the future.

I will not cover naming conventions such as CamelCase and snake_case, as they are fairly well documented and understood. Rather, I will focus on the semantics.

As our journey begins...

What are the things that we need to give names to? In Julia, the main things are modules, data types, functions, constants and variables. The very first decision is to choose between using a verb, a noun, or an adjective.

This is actually not too difficult. One reasonable approach is to adopt the following convention:

Naming modules

A good module name typically encompasses a group of concepts and functionalities provided by the module. It needs to be broad enough to cover all functionalities but specific enough to distinguish itself from other modules.

For example, I have a CircularList module that provides an implementation of a circular linked list. I did not name it LinkedList, nor would I name it List, for those concepts are too broad.

Sometimes you want to think bigger and name the module a little broader than what is currently implemented. I think it is appropriate as long as there is a plan to develop the package further, and it does not cause any confusion for any user of the package. But, it should not miss the mark.

Additional guidance can be found at the Julia package naming guidelines documentation.
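
To make the point concrete, here is a minimal sketch of a module whose name says exactly what it provides. The type and function names (Node, insert_after!, traverse) are assumptions made up for illustration; they are not the API of the actual CircularList package.

```julia
# Hypothetical sketch: the module name promises a circular list and nothing more.
module CircularList

export Node, insert_after!, traverse

# A node in a circular singly linked list.
mutable struct Node{T}
    value::T
    next::Node{T}
    # A freshly created node forms a circle of one: it points to itself.
    function Node{T}(value) where {T}
        node = new{T}(value)
        node.next = node
        return node
    end
end

# Convenience constructor that infers T from the value.
Node(value::T) where {T} = Node{T}(value)

"Insert `value` right after `node` and return the new node."
function insert_after!(node::Node{T}, value) where {T}
    new_node = Node{T}(value)
    new_node.next = node.next
    node.next = new_node
    return new_node
end

"Collect the values by walking the circle once, starting at `head`."
function traverse(head::Node{T}) where {T}
    values = T[head.value]
    node = head.next
    while node !== head
        push!(values, node.value)
        node = node.next
    end
    return values
end

end # module

head = CircularList.Node(1)
second = CircularList.insert_after!(head, 2)
CircularList.insert_after!(second, 3)
CircularList.traverse(head)  # walks 1, 2, 3 and stops when it is back at the head
```

Calling this module List would suggest it also covers arrays or doubly linked lists, which it does not.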

Naming data types

Data types are used to represent either abstract concepts (abstract types) or concrete data structures (concrete types).

That's why using nouns is appropriate. When data types represent behaviors or traits, it is appropriate to use adjectives such as Iterable and Pretty.

A general rule of thumb is to come up with names that are not too long. I would say any name that is composed of more than three words is a code smell.

So, ProductCatalog is perfectly fine and even ElectronicProductCatalog is acceptable, but ElectronicDepartmentProductCatalog would be too long.

Why is it bad? Long names are harder to read and spelling mistakes in them are hard to spot, causing headaches during debugging.
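
As a small sketch of the noun/adjective split: nouns name the abstract concept and the concrete structure, while adjectives name a trait. The Catalog hierarchy, the Searchable trait, and the describe function are hypothetical names chosen for illustration.

```julia
# Nouns for an abstract concept and a concrete data structure.
abstract type Catalog end

struct ProductCatalog <: Catalog
    products::Vector{String}
end

# Adjectives for a trait (the "Holy traits" pattern).
abstract type Searchability end
struct Searchable <: Searchability end
struct NotSearchable <: Searchability end

# By default a type is not searchable; ProductCatalog opts in.
Searchability(::Type) = NotSearchable()
Searchability(::Type{ProductCatalog}) = Searchable()

# Dispatch on the trait rather than on the concrete type.
describe(x::T) where {T} = describe(Searchability(T), x)
describe(::Searchable, x) = "can be searched"
describe(::NotSearchable, x) = "cannot be searched"

describe(ProductCatalog(["laptop"]))  # takes the Searchable branch
describe(42)                          # takes the NotSearchable branch
```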

Naming functions

For functions, when should I use verbs? And when should I use nouns?

There is no hard-and-fast rule. Based on my experience, I have developed an intuition for making this kind of judgment. Let me illustrate my thought process with an example.

How a noun looks better than a verb...

Let's say I am working on an e-commerce site and I have a Product data type. I want to define a function that determines the price of the product based upon the current market condition. I can think of two names: current_price and calculate_price.

Now, consider the code below:

for product in products
    if current_price(product) in price_range
        add(search_result, product)
    end
end

Does it look better or worse if I choose calculate_price instead? I bet it would look worse!

The reason why current_price looks better is that it's easier to read. I can read the code as "if current price of the product is in the price range..." But I cannot read the code as "if calculate price of the product is in the price range...". Get it?

When I use current_price, I'm emphasizing the thing that is being returned from the function rather than how it does its job.

At this point, one may argue that the readability issue can be resolved with a temporary variable. Let's assign the result of the function call to a price variable:

for product in products
    price = calculate_price(product)
    if price in price_range
        add(search_result, product)
    end
end

Sure, that could work but obviously it limits the way you write code, so I can't say it's optimal.
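
Here is a runnable sketch of why the noun form composes more freely. The fields of Product, the 10% markup standing in for "market condition", and the sample data are all assumptions for illustration.

```julia
# Hypothetical Product type; prices are kept in cents so comparisons stay exact.
struct Product
    name::String
    base_cents::Int
end

# Noun-style name: the call reads as the value it returns.
# The 10% markup is a placeholder for a real market-condition rule.
current_price(product::Product) = product.base_cents + product.base_cents ÷ 10

products = [Product("keyboard", 5_000), Product("monitor", 20_000)]
price_range = 4_000:10_000

# No temporary variable is needed; the expression still reads naturally.
search_result = [p.name for p in products if current_price(p) in price_range]
```

With calculate_price, the same comprehension would read awkwardly unless the call were first pulled out into a named variable.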

How a verb looks better than a noun...

Next, let's consider the case where I need a new function that finds the best price of the product using several different pricing algorithms. So, the pricing function needs to take an algorithm argument.

Let's write the function signature with its doc string above it:

"Return the current price of the product calculated using the specified algorithm."
current_price(product::Product, algorithm::Algorithm)

At this point, it starts to feel just a little more awkward to read. Let's try to use a verb:

"Calculate the price of the product using the specified algorithm."
calculate_price(product::Product, algorithm::Algorithm)

Using a verb makes it more natural when you need to emphasize the action being taken to get the job done. In this example, the use of calculate goes together with the argument algorithm quite well.

So, my point is: what do you want to emphasize when calling the function? Is it the thing that it returns, or the process that it goes through?

You should be able to make a better decision when you can answer the above question.
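
The two styles can also live side by side, as in the sketch below. The Algorithm subtypes, their pricing rules, and the choice of "lowest price" as the meaning of "best" are assumptions made for illustration.

```julia
# Same hypothetical Product type as before, with prices in cents.
struct Product
    name::String
    base_cents::Int
end

# Hypothetical pricing algorithms, modeled as types so dispatch selects the rule.
abstract type Algorithm end
struct FlatMarkup <: Algorithm end
struct Clearance <: Algorithm end

"Calculate the price of the product using the specified algorithm."
calculate_price(product::Product, ::FlatMarkup) = product.base_cents + product.base_cents ÷ 10
calculate_price(product::Product, ::Clearance) = product.base_cents ÷ 2

# Verb where the process matters, noun where the result matters.
# "Best" is assumed here to mean the lowest price across the algorithms.
best_price(product::Product, algorithms) =
    minimum(calculate_price(product, a) for a in algorithms)

keyboard = Product("keyboard", 5_000)
best_price(keyboard, (FlatMarkup(), Clearance()))
```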

Naming Constants/Variables

I believe the best code should be readable. More often than not, I have heard from other programming experts about using meaningful words for all variables in the code. Religiously. I can agree with that most of the time, until the verbosity kicks in and kills readability.

I can think of two examples where it makes sense to use terse names.

Doing Math

The first case is code that performs heavy-lifting mathematical calculations. It is usually more natural to use math symbols than English words. Julia has full support for Unicode and this is an area where it shines. Here's an example from my BoxCoxTrans.jl package:

function transform(𝐱, λ; α = 0, scaled = false, kwargs...) 
    if α != 0
        𝐱 .+= α
    end
    any(𝐱 .<= 0) && throw(DomainError("Data must be positive and ideally greater than 1.  You may specify α argument(shift). "))
    if scaled
        gm = geomean(𝐱)
        @. λ ≈ 0 ? gm * log(𝐱) : (𝐱 ^ λ - 1) / (λ * gm ^ (λ - 1))
    else
        @. λ ≈ 0 ? log(𝐱) : (𝐱 ^ λ - 1) / λ
    end
end

In its current state, the code looks fairly clean and maybe even resembles the formula found in a math research paper. By contrast, it would look very ugly if I had to replace all symbols with names like alpha, lambda, and time_series.
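
For a side-by-side comparison, here is the unscaled transform written both ways. Both functions are simplified sketches with hypothetical names; neither is the BoxCoxTrans.jl implementation shown above.

```julia
# Symbol version: close to how the formula appears on paper.
function boxcox(𝐱, λ)
    return λ ≈ 0 ? log.(𝐱) : (𝐱 .^ λ .- 1) ./ λ
end

# Word version: same computation, but the formula is harder to see.
function box_cox_transform(time_series, lambda)
    return lambda ≈ 0 ? log.(time_series) : (time_series .^ lambda .- 1) ./ lambda
end

data = [1.0, 4.0]
boxcox(data, 0.5) == box_cox_transform(data, 0.5)  # identical arithmetic, so identical results
```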

Indexing arrays

The second case relates to the argument about giving indexing variables descriptive names such as index. I generally oppose this idea. For simple code, it does not matter. For something more complex, it works against you and hurts readability.

Consider this code that I found in the LinearAlgebra package:

## Swap rows i and j and columns i and j in X
function rcswap!(i::Integer, j::Integer, X::StridedMatrix{<:Number})
    for k = 1:size(X,1)
        X[k,i], X[k,j] = X[k,j], X[k,i]
    end
    for k = 1:size(X,2)
        X[i,k], X[j,k] = X[j,k], X[i,k]
    end
end

As an exercise, let's see how it looks if we replace those short variable names with longer, more meaningful ones:

function rcswap!(index1::Integer, index2::Integer, matrix::StridedMatrix{<:Number})
    for position = 1:size(matrix,1)
        matrix[position,index1], matrix[position,index2] = 
            matrix[position,index2], matrix[position,index1]
    end
    for position = 1:size(matrix,2)
        matrix[index1,position], matrix[index2,position] = 
            matrix[index2,position], matrix[index1,position]
    end
end

If the purpose of using longer variable names is to improve readability, then this is doing more harm than good.
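
To see the short-name version at work, here is a small usage sketch. The function is copied locally so that the snippet is self-contained, and the sample matrix is made up.

```julia
# Swap rows i and j and columns i and j in X (local copy of the function above).
function rcswap!(i::Integer, j::Integer, X::StridedMatrix{<:Number})
    for k = 1:size(X, 1)
        X[k, i], X[k, j] = X[k, j], X[k, i]   # swap columns i and j
    end
    for k = 1:size(X, 2)
        X[i, k], X[j, k] = X[j, k], X[i, k]   # swap rows i and j
    end
end

X = [1 2 3; 4 5 6; 7 8 9]
rcswap!(1, 3, X)
X  # now [9 8 7; 6 5 4; 3 2 1]
```

The i, j, k in the body line up directly with the usual matrix notation, which is what the longer names obscure.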

By contrast, using single-letter variable names is not too bad. In fact, single-letter variable names are commonly used for indexing arrays in tutorials for many programming languages (e.g. Python, Java, JavaScript, and Rust).

In addition, whenever I read code, my brain already translates single-letter names to read like index. To me, they all look the same.

Wrapping up...

Is naming things actually difficult? Definitely, maybe.

You may think that I always name things properly? Wrong!

I am definitely not perfect. Having written a book and a couple of blog posts, I have come to realize that writing code is no different from other types of writing in general. You must keep revising (refactoring) to make it right.

I do believe that practice makes perfect. I would encourage you to ask someone else to read your code and give you feedback.

P.S. For more tips on writing good code in Julia, consider picking up my book Hands-on Design Patterns and Best Practices with Julia.