Golang Sudoku Solver for Hard Sudoku Puzzles

Motivation

After completing this excellent hands-on course on golang, I wanted to get my hands dirty by solving something non-trivial. I also wanted to satiate the inner child artist in me by building something for myself for fun.

What started as a fun project soon became an obsession! I ended up building a mini sudoku-solver that uses goroutines and recursion to efficiently solve any valid sudoku fast.

The project is hosted here for folks who want to head to code directly.

Results

Here's the result of a solver run executed on arguably the world’s hardest sudoku ever.

Solution Approach

The solver employs two different sub-approaches to solve a sudoku.

The solver first maps the set of eligible numbers (that can be filled) for each cell of sudoku.
It also fills the cells which have just a single eligible number. This is a map and reduce step performed on each cell.

If each iteration of a sudoku scan keeps reducing the number of unfilled cells, the solver keeps repeating map and reduce until the sudoku is solved.

Of course, the above technique solves only baby sudoku puzzles.

When the number of unfilled cells stops reducing, the solver switches to a trial and error approach.

This is where concurrency and goroutines come into play.

This approach works as follows:

The solver identifies the cell with the least number of eligible numbers.
The solver then fires a goroutine as a recursive call to itself for each of these eligible numbers, after creating a copy of sudoku with the eligible number filled at this cell.
In the next iteration, the solver picks up the next cell with least number of eligible numbers, and fires the next set of goroutines... and so on and so forth.
This keeps repeating until either the sudoku is solved or a total of ten million (10, 000,000) iterations are exhausted.

This approach results in a mega decision tree with each guess on a cell as a branch node. There is only one path which correctly solves the sudoku. When the solver hits this path, it returns the solved sudoku.

Each thread of the solver works on a distinct copy of a sudoku to avoid data race.

Key Learning

Always use a channel (chan) or a struct with a chan to communicate with a go routine.

Use wait groups to block execution until all the goroutines are completed.

The below is a common pattern for firing multiple goroutines (as anonymous functions) and waiting for them to complete execution.

func Solve(sudokuIn Sudoku) (Sudoku, bool, int, error) {

	// Other parts of the code come here
	
	chanSudokuSolve := make(chan Channel)
	wg := new(sync.WaitGroup)

	// fire a goroutine for each eligible nunber
	for _, eligNum := range cellToEvaluate.EligibleNumbers.GetList() {

		wg.Add(1)

		// Call the solve function recursively, but as a go routine thread so that it executes asynchronously
		go func(sudokuIn Sudoku, rowID int, colID int, fillVal int, wg *sync.WaitGroup, c *chan Channel) {
			defer wg.Done()

			_SudokuOut := sudokuIn.Copy()

			_SudokuOut.Fill(rowID, colID, fillVal)

			sudokuInter, _solved, _, _err := Solve(_SudokuOut)
			*c <- Channel{Intermediate: sudokuInter, Solved: _solved, Err: _err}
		}(SudokuOut, cellToEvaluate.RowID, cellToEvaluate.ColID, eligNum, wg, &chanSudokuSolve)
	}

	// wait for the threads to be done & close channel once all threads are done
	go func(wg *sync.WaitGroup, c chan Channel) {
		wg.Wait()
		close(c)
	}(wg, chanSudokuSolve)

	// collect the results and look for the right guess
	for r := range chanSudokuSolve {
		
	}
}

Note that we do not have special wrapper function for Solve that deals with channels as a separate function parameter etc. All that complexity is handled in the anonymous block.

This way of invoking goroutines keeps the function signature and code clean.

Always run the code with race detector flag on to ensure that there is no data race.

golang has a powerful race detector to detect data race conditions (i.e. scenarios when more than one goroutine in concurrently accessing the same memory address).

I have modelled the sudoku as a slice of int slices. In the initial stages of coding, I was incorrectly copying the sudoku structure, which led to data race conditions and the solver was behaving super weird.

_sudokuOut := make(sudoku, len(sudokuIn))
copy(_sudokuOut, sudokuIn)

Once I detected the issue with race detector, I wrote a deep copy function for copying each element of a multi-dimensional slice, as follows:

// Copy is a deep copy solution for Sudoku structure which is array of array of int
func (s Sudoku) Copy() Sudoku {
	mySudoku := make(Sudoku, 0)
	done := make(chan struct{})

	go func() {
		for _, _Row := range s {
			myRow := make(Row, 0)
			for _, _col := range _Row {
				myRow = append(myRow, _col)
			}
			mySudoku = append(mySudoku, myRow)
		}
		done <- struct{}{}
	}()
	<-done
	return mySudoku
}

Of course, there are situations when a global variable must be accessed by multiple goroutines concurrently.

A counter is a common example of this situation.

In such situations, you can one of the following approaches:

Use sync/atomic package and implement atomic counters. See here.
For more complex management of state, use mutex. See here.

I have used a simple atomic counter to count the number of iterations across all goroutines, as follows:

var globalCounter = new(int32)

// Solve takes an unsolved sudoku and solves it.
func Solve(sudokuIn Sudoku) (Sudoku, bool, int, error) {

	atomic.AddInt32(globalCounter, 1)
}

Use receiver functions on custom data types to model operations on the custom types.

Here's an example of a receiver function that "receives" a Sudoku and validates whether it is solved or not.

func (s Sudoku) Solved() bool {
	myColumns := make(map[int]Row)
	myBoundedBoxes := make(map[int]Row)

	// traverse the Sudoku once to collect Rows, columns and bounded boxes
	for rowID, Row := range s {

		if !(Row.complete() && Row.nonRepeating()) {
			return false
		}

		for colID, col := range Row {

			// collect column values belonging to the same column id in a separate Row
			myColumns[colID] = append(myColumns[colID], col)

			// collect column values belonging to the same bounded box into a separate Row
			bbID := _getBoundedBoxIndex(rowID, colID)
			myBoundedBoxes[bbID] = append(myBoundedBoxes[bbID], col)
		}
	}

	if len(myColumns) > 0 {
		for _, Row := range myColumns {
			if !(Row.complete() && Row.nonRepeating()) {
				return false
			}
		}
	}

	if len(myBoundedBoxes) > 0 {
		for _, Row := range myBoundedBoxes {
			if !(Row.complete() && Row.nonRepeating()) {
				return false
			}
		}
	}

	return true
}

At the risk of sounding pithy, receiver functions make golang "Classy". There are a number of excellent resources that explain receiver functions in depth. I especially liked this and this.

Conclusion

I was able to understand a number of intricate topics on golang through this tiny yet practical project.

As a compiled, concurrent, garbage-collected, statically typed language, golang allows you to write efficient code fast and in the process have fun too!

Thanks for reading!