pills all over there

GO Programming Pills

GO Programming Pills

#1: Forcing to use field names when instantiating structs

In GO, when instantiating a struct type you can use:

named fields: you name the fields you are assigning a value to, or
positional fields: you let the position of the instantiation values define which field they correspond to

For example:

type position struct {
	x int
	y int
}

func main() {
	p1 := position{y: 5, x: 10} // named instantiation
	p2 := position{10, 5}       // positional instantiation

	fmt.Printf("%+v == %+v", p1, p2)
}

The output of executing the above code is:

{x:10 y:5} == {x:10 y:5}

Each initialization approach has its pros and cons and is up to you to decide which one to use when instantiating a struct type.

In cases like the above example, using positional initialization can lead to subtle bugs because the type of the two fields is the same, thus we can mix their values inadvertently when doing the instantiation. As the developer of the type you might avoid instantiation errors by forcing users to use named fields… but how?!

Well there is a little hack: add a field with a blank (_) identifier at the beginning of the struct

type position struct {
	_ struct{}  // just to force named fields in instantiation
	x int
	y int
}

Now, if you try to run the code you will get the following compilation error for the expression position{10, 5}:

cannot use 10 (untyped int constant) as struct{} value in struct literal
too few values in struct literal

That does the trick and users will need (or prefer) to use named fields when instantiating your struct.

Anyway, do not forget that it is a hack and that the compile error might be a little cryptic for some people. Personally I prefer to use constructors even if they are not idiomatic in GO.

#2: Are your struct tags OK?

In GO, struct tags are used to annotate struct fields with meta-information. Usually the meta-information helps data transformation libraries to marshall/unmarshall structs to/from some data representation format like JSON, XML, YAML, …

For example, the following struct is annotated with meta-information for JSON and XML transformations:

type User struct {
	ID   string `json:"id" xml:"identifier"`
	Name string `json:"name" xml:"name"`
}

Struct tags are one of the few ugliness of GO. Tags are strings interpreted at runtime by using reflection. Consequence: the compiler is unable to catch errors in struct tags

One very frequent error is JSON marshaling tags including inline option. For example:

type UserContainer struct {
	apiv1.Container `json:",inline" protobuf:"bytes,1,opt,name=container"`
	// ...
}

(Argo)

type ResourceMeta struct {
	TypeMeta `json:",inline" yaml:",inline"`
	// ...
}

(Kustomize)

The problem: inline option is not recognized by the standard JSON library, thus using it means nothing!

Another common problem in struct tags are typos. Can you spot the problem in the code below?

type LogicalProcessor struct {
        LpIndex     uint32 `json:"LogicalProcessorCount,omitempty"`
        NodeNumber  uint8  `json:"NodeNumber,omitempty"`
        PackageId   uint32 `json:"PackageId,omitemty"`
        CoreId      uint32 `json:"CoreId,omitempty"`
        RootVpIndex int32  `json:"RootVpIndex,omitempty"`
}

Did you see it? The omitem[p]ty at the third field?

And to make things worst, using spaces in tags is tricky:

`json:"metadata,omitempty"`  // OK
`json: "metadata,omitempty"` // Not OK
`json:"metadata, omitempty"` // Not OK

(here you can see these problems in action)

As said before, the GO compiler will not help us with that.

We can use static code analysis tools (like Revive) to detect these problems.

#3: Multiple multiples

GO loves multiples. It has multiple return values, multiple variable assignment, multiple variable declaration.

The less uncommon feature is multiple variable declaration:

var a, b, c int

declares (and initializes to the int zero value) three variables. Nothing extraordinary.

Multiple variable assignment allows assigning multiple variables in a single assignment statement

var a, b, c int
a, b, c = 1, 2, 3

of course the right hand of the assignment can be any expression and variable types do not need to be the same

var a int
var b string
a, b = 1+25, fmt.Sprintf("the value of a is %d", a)

(can you guess the value of b after executing the assignment?)

When you combine multiple variable declaration and multiple assignment you can write things like

var a, b, c = 1, 2, 3  	// declares three int variables 
		// with initial values 1, 2, and 3 respectively

var n, s = 4, "text"	// declares an int variable and a string variable
		// with initial values 4 and "text"

You can also use the := operator to declare and assign at the same time

a, b, c := 1, 2, 3	
n, s := 4, "text"

The third, and most interesting, multiple of GO is multiple return values. In GO functions can return multiple values.

func randomCoords() (int, int) { ... }

x, y := randomCoords()

We can then write something like

func newPoint(x, y int) point { return point{x, y} }

func randomCoords() (int, int) { ... }

func main() {
	x, y := randomCoords()
	p := newPoint(x, y)
	// ...
}

Further, GO lets us compose newPoint and randomCoords and replace the sequence:

	x, y := randomCoords()
	p := newPoint(x, y)

	p := newPoint(randomCoords()) // look mum... no temporary vars!

Yes, GO verifies randomCoords return values match in number and type with newPoint arguments and it allows the function composition.

Ugliness?: function composition works only if all the expected parameters are provided by a single function call used as argument. For example:

func new3DPoint(x, y, z int) point { ... }
func randomCoords() (int, int) { ... }
p3d := new3DPoint(0, randomCoords()) 	// does not compile
p3d := new3DPoint(randomCoords(),0) 	// does not compile

Ugliness: multiple value expressions can not be used when instantiating structs. For example:

type point struct {
	x int
	y int
}
func randomCoords() (int, int) { ... }

p := point{randomCoords()}	// does not compile

Above examples are somewhat artificial because the concept of a function returning more than one result does not match the mathematical concept of a function. (functions returning more than one value might be the symptom of a design problem) In GO multiple return values are mainly used to handle errors.

f, err := os.ReadFile("/tmp/dat") // returns a file content and an error

A function returning a value and an error is a clearly a GO idiom:

v, err := functionReturningAValueAndAnError(...) 

Well… half of the idiom because its full version is

v, err := functionReturningAValueAndAnError() 
if err != nil { 
	// handle the error
}

The verbosity of the error handling idiom is one of the negative remarks people makes about GO.

Personally, I do not find the error handling idiom verbose. My main concern about function returning and extra error value is how it hinders function composition. Let’s modify a little bit our example of points and random coords. Let’s say the function randomCoords can also return an error

func randomCoords() (int, int, error) { ... }

Now our composition newPoint(randomCoords()) does not compile any more: newPoint expects two parameters but randomCoords returns three values.

We will see in the next pill how to solve this problem.

There are some language constructions, other than functions, that evaluate to more than one value. For example accessing an entry of a map returns two values:

m := map[int]string{1: "one", 2: "two", 3: "three"}
v, exists := m[2]
fmt.Printf("%q %v\n", v, exists) // "two" true
v, exists = m[5]
fmt.Printf("%q %v\n", v, exists) // "" false

Ugliness: the second value is somewhat “optional”

v = m[2]
fmt.Printf("%q\n", v)    	// "two"
fmt.Printf("%q\n", m[2]) 	// "two"
fmt.Printf("%d %q\n", m[2])	// %!d(string=two) %!q(MISSING)
v = m[5]
fmt.Printf("%q\n", v)    	// ""

Other GO construction returning more than one value is for-range

m := map[int]string{1: "one", 2: "two", 3: "three"}
for key, value := range m {
	println(key, value)
						// 1 one
						// 2 two
						// 3 three
}

Ugliness: again, the second value is “optional”

for key := range m {
	println(key)
				// 1
				// 2
				// 3
}

#4: Adapter functions (with a touch of generics)

In the previous pill we saw how to compose functions like in

func newPoint(x, y int) point { return point{x, y} }
func randomCoords() (int, int) { ... }
...
newPoint(randomCoords())

We also saw how multiple return values are used mainly to return errors like in

f, err := os.Open("/tmp/dat")

And how these error values prevent easily composing function calls. For example:

_, err:= io.Copy(
			os.Create("/tmp/dst.txt"),
			os.Open("/tmp/src.txt")
		)

will not compile. To make it work we could add some temp variables to write something like

source, _ := os.Open("/tmp/src.txt")
destination, _ := os.Create("/tmp/dst.txt")
io.Copy(destination,source)

But ignoring errors (the _ in the left-hand-side of the assignment) is not nice. So we end with something like:

source, err := os.Open("/tmp/src.txt")
if err != nil { ... }
destination, err := os.Create("/tmp/dst.txt")
if err != nil { ... }
_, err = io.Copy(destination,source)
if err != nil { ... }

That is why people says error handling is verbose in GO. In the previous pill I’ve said I’m comfortable with the GO’s approach with error handling. Of course, others are not. The subject was at the center of debates in the GO community, and some proposals to change error handling were made but rejected by the GO team.

Returning to our initial problem. We want to call

io.Copy(os.Create("/tmp/dst.txt"),os.Open("/tmp/src.txt"))

but it does not work because the expected parameters of io.Copy do not match with those we are providing and we get a nice compiler error:

multiple-value os.Create("/tmp/dst.txt") (value of type (*os.File, error)) in single-value context
multiple-value os.Open("/tmp/src.txt") (value of type (*os.File, error)) in single-value context

in other words: you are providing two values where the function expects only one

If we make disappear those error return values then expected and provided parameters will match. But how? We will introduce an adapter function that will… eh… adapt the results of os.Create and os.Open to match those expected by io.Copy. Because all we need to do is remove the error return value, our adapter function will be named noErr and its definition is:

func noErr(f *os.File, err error) *os.File {
	return f
}

The adapter takes two arguments, an *os.File and an error to only return the first.

Now we can do:

io.Copy(
	noErr(os.Create("/tmp/dst.txt")), 
	noErr(os.Open("/tmp/src.txt"))
)

Nice! :happy:…

eh…

Nice? Not nice at all. We are dropping the error and that, as said before, is ugly, very ugly.

We could handle the error in the adapter!

func try(f *os.File, err error) *os.File {
	if err != nil {
		panic(err)
	}

	return f
}

(I’ve renamed the adapter, I think try is now a better name)

Now we can write:

io.Copy(
	try(os.Create("/tmp/dst.txt")), 
	try(os.Open("/tmp/src.txt"))
)

And errors are no more dropped. If when creating or opening a file we get an error, the program will panic. (panic-ing might not be so cool, I will talk about that in a future pill)

In fact what we will write is:

_, err := io.Copy(
		try(os.Create("/tmp/dst.txt")), 
		try(os.Open("/tmp/src.txt"))
	)
if err != nil { ... }

Could we use the try function again and write:

try(io.Copy(
	try(os.Create("/tmp/dst.txt")), 
	try(os.Open("/tmp/src.txt"))
))

No because io.Copy returns values (int64, error) not compatible with those expected by try (*os.File, error) Wait a minute! Are we saying our try adapter only adapts functions returning *os.File and error? Yes, that is the signature of our adapter. So, if we want to adapt other types we need to declare new adapters? Yes. For example the try version for io.Copy could be:

func try2(v int64, err error) *os.File {
	if err != nil {
		panic(err)
	}

	return v
}

Beyond needing to find a new name for the new try function I’m sure you agree with me on the fact that defining a new version of try for each combination of a type and an error does not scale well.

There is still hope!

Re-reading me… “defining a new version of _try for each combination of a type and an error“… “_combination of a type and an error“… “type and an error”

We can use generics!

With generics we can define a, well, generic function that can be parametrized with the type we need to adapt.

func try[T any](v T, err error) T {
	if err != nil {
		panic(err)
	}

	return v
}

Now try accepts a value of some type T (we do not know yet) and an error. Thus the following calls are now valid:

try(
	io.Copy(
		try(os.Create("/tmp/dst.txt")), 
		try(os.Open("/tmp/src.txt"))
	)
)

And others like

jsn := try(json.Marshal(myStruct)) // Marshal returns ([]byte, error)
t := try(http.ParseTime(text)) // ParseTime returns (t time.Time, err error)
r := try(httpClient.Get(url)) // Get returns (resp *Response, err error)

I do not use this hack and I do not recommend using it. I used error handling as an example, adapter functions have other, more morally acceptable uses.

#5: Method = function+receiver

In GO, methods are functions attached to structs.

// Point models a 2D point
type Point struct {
	x, y int
}

// Methods
// X yields the x of the point
func (p Point) X() int { return p.x }
// Y yields the y of the point
func (p Point) Y() int { return p.y }

In the above example, the expressions (p Point) in the methods declaration is called the receiver. Here p is the equivalent of this or self in other languages. So we can write

func (this Point) Y() int { return this.y }

but it is not idiomatic in GO.

Once you have a struct instance you can call its methods

myPoint := Point{1, 2}
x := myPoint.X() // 1
y := myPoint.Y() // 2

The syntax

myPoint.X()
// structInstance.MethodName(parameters)

Is just syntax sugar , we could also write

Point.X(myPoint)
// structName.MethodName(methodInstance, parameters)

but is more verbose, thus we prefer the sugared version.

In the body of the method the receiver is a copy of the original instance (by value) or a pointer to it (by reference). You decide which one when declaring the receiver of the method. For example, in the following Shift method, the receiver p is a pointer (*) to the struct instance.

// Shift shifts the point
func (p *Point) Shift(x, y int) { p.x, p.y = p.x+x, p.y+y }

myPoint := Point{1, 2}
fmt.Printf("%+v\n", myPoint) // {x:1 y:2}
myPoint.Shift(5, 5)
fmt.Printf("%+v\n", myPoint) // {x:6 y:7}

If we opt to use a by value receiver for Shift we get

// Shift shifts the point
func (p Point) Shift(x, y int) { p.x, p.y = p.x+x, p.y+y }

myPoint := Point{1, 2}
fmt.Printf("%+v\n", myPoint) // {x:1 y:2}
myPoint.Shift(5, 5)
fmt.Printf("%+v\n", myPoint) // {x:1 y:1}  DID NOT CHANGE

Notice the state of myPoint remains the same after executing Shift. That is because the method modifies x and y of a copy of the original instance and the copy is not reachable out of the scope of the method.

So, be careful when defining receivers of your methods. If you want to modify the state of the struct instance then use a by reference receiver, otherwise, if you just want to access the state of the struct instance, you can use a by value receiver.

Because using by value receivers makes a copy of the struct instance you might be tempted to use them to implement struct instance copying…

func (p *Point) Shift(x, y int) { p.x, p.y = p.x+x, p.y+y }
// Copy yields a copy of the point
func (p Point) Copy() Point     { return p }

myPoint := Point{1, 2}
itsCopy := myPoint.Copy()
fmt.Printf("%+v\n", myPoint) // {x:1 y:2}
fmt.Printf("%+v\n", itsCopy) // {x:1 y:2}
myPoint.Shift(5, 5)
fmt.Printf("%+v\n", myPoint) // {x:6 y:7}
fmt.Printf("%+v\n", itsCopy) // {x:1 y:2}

Cool! A very cheap and simple implementation of struct instances copying. Yes but you need to take into account one thing: the instance copy is a shallow copy, i.e. it’s a copy of just the struct instance, not of their full (deep) contents. For example, if we add a new field next to the struct Point and a method to set it: LinkTo

type Point struct {
	x, y int
	next *Point
}

func (p *Point) LinkTo(other *Point) {
	p.next = other
}

a by value method receiver will be a shallow copy of the original struct, then the value of the field next, a memory address, will be the same of that of the original.

	pointA := Point{0, 0, nil}
	pointB := Point{10, 10, nil}
	pointA.LinkTo(&pointB)
	aCopy := pointA.Copy()
	fmt.Printf("%+v\n", pointA) // {x:0 y:0 next:0xc00000c030}
	fmt.Printf("%+v\n", aCopy)	// {x:0 y:0 next:0xc00000c030}

notice pointA.next and aCopy.next point to the same address (0xc00000c030) That means if you modify the structure referenced by pointA.next you are also modifying that of aCopy.next

//  ...
pointA.next.x, pointA.next.y = -1, -1
fmt.Printf("%+v\n", pointA.next)	// &{x:-1 y:-1 next:<nil>}
fmt.Printf("%+v\n", aCopy.next)		// &{x:-1 y:-1 next:<nil>}

The receiver does not need to be referred in the method. For example you can write:

func (_ Point) foo() { ... }

Notice the receiver is named _ to indicate it’s not referred in method’s body. This is just a GO convention.

Why defining a method which does not make reference (use) the struct instance? Well, for example, in some situations we might need to write utility functions and to avoid name collisions with other functions in the same package we can attach them as methods of the concerned struct. For example, in the same package you can have

func (_ Point) foo() { ... }
func (_ Point3D) foo() { ... }
func foo() { ... }

without risking name collisions. That is, you can use struct types as something like namespaces…

type myNamespace struct {}
var MyNamespace myNamespace

func (_ myNamespace) foo() { ... }
func (_ myNamespace) bar() { ... }

I’ve said “you can”, not “you should”. Use packages to organize your code.

#6: GO concurrency? Warn: dataraces ahead

GO makes concurrent programming very easy.

Goroutines and channels let you create (maintain and understand)concurrent applications without the cognitive load imposed by other programming languages like C++ and Java.

The concurrency model of GO is rooted on that described by Tony Hoare in his famous Communicating Sequential Processes paper.

So concurrent programming in GO is easy, that’s good news. The bad one: it is equally easy to do it wrong.

You know, GO has its proverbs, and the first one is:

Don’t communicate by sharing memory, share memory by communicating

Well, it is not by accident this is the first of 19 proverbs. In concurrent applications, shared memory is the root of all evil.

Enter dataraces.

A datarace in GO is… well when two or more goroutines access the same memory location concurrently.

Let’s see a (very artificial) example:

a := 1
go func() {
	for {
		a++
		print("+")
	}
}()

for a = 1; a < 100; a++ {
	print(a, " ") // it never prints something >= 100... never?
}

In the above code we have 2 goroutines, the main one and the one we start in the second line with the keyword go. Both goroutines share (by writing and reading from it) the memory location represented by a, then stranger things will happen. For example, even if the last print is in a loop where we might think a will never be greater or equals to 100 we can get the following output:

3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ++++++++++29 31 32 33 34 35 ++37
40 41 42 43 44 45 46 47 48 49 50 +++++++++59 61 62 63 64 65 66 67 68 69 
71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 
96 97 98 +++++++++++++++113 

113, we printed 113!

Then if for example, we change print("+") into print("") we will more frequently get outputs like:

1 3 4 5 6 7 8 9

We were not supposed to go until 99?

You can run the code yourself here.

As you can see, this kind of code is very difficult to understand and it usually produces hard to find bugs.

A real life example from the Prometheus code base:

 1|func (db *DB) loadWAL(r *wal.Reader, multiRef map[chunks.HeadSeriesRef]chunks.HeadSeriesRef) (err error) {
 2|	// ...
 3|	go func() {
 4|		defer close(decoded)
 5|		for r.Next() {
 6|			rec := r.Record()
 7|			switch dec.Type(rec) {
 8|			case record.Series:
 9|				series := seriesPool.Get().([]record.RefSeries)[:0]
			series, err = dec.Series(rec, series)
			if err != nil {
				// ...
			}
			decoded <- series
		case record.Samples:
			samples := samplesPool.Get().([]record.RefSample)[:0]
			samples, err = dec.Samples(rec, samples)
			if err != nil {
				// ...
			}
			decoded <- samples
		case record.Tombstones, record.Exemplars:
			// ... 
		default:
			// ...
		}
	}
}()

// ...
	
select {
case err := <-errCh:
	return err
default:
	// ...
	return nil
}
}

Take some minutes to analyze the code.

Did you spot the problem?

It is a very subtle bug due to a datarace. In line 3 a new goroutine is created and it executes the code going from line 3 to line 28. In line 11, thus in the goroutine, there is a condition on the variable err. The variable is supposed to be set in the previous line… but… err was declared as a return value in the function’s signature (line 1) therefore the scope of err is the whole function; then any modification of err, even after the creation of the goroutine, will affect the goroutine’s execution.

These, unwanted, interactions between goroutines may lead to funky bugs. For example, because err is a named return value, the return statements at lines 32 and 35 imply an assignment to err of the returned expression (e.g. return nil is compiled to err = nil; return err) Then, say line 10 (in the goroutine) is executed and assigns some error to err, just after that, line 35 (in the main goroutine) is executed thus now err is nil and then when the condition at line 11 is evaluated, err does not contain the error returned by the function call at line 10

(Did you notice the same problem is also present at lines 17 and 18?)

Free variables captured by-reference in closures are one of the main source of dataraces in GO (there are others we will talk about in a future pill)

So, what can we do to minimize dataraces in our code? A couple of things:

Avoid capturing free variables in goroutines by explicitly passing them as arguments:

a := 1
// prefer
go func(a int) {  ... a ... }(a)
// over
go func() { ... a ... } // captured by-reference

Execute your tests with the -race flag:
```
go test -race
```
At integration and pre-prod stages, build your code with -race flag:
```
go build -race
```
Dataraces manifest themselves randomly, so you may need to execute several times your tests or your code to find them… if you can, do something like:
```
untilfails go test -race
```
or run you application until it fails with
```
untilfails myapp
```
The bash code of untilfails could be something like:
```
while "$@"; do :; done
```
Use a static analysis tool (aka linter) to spot potential dataraces. revive is a good candidate.

#7: Do it! But later…

GO has a very useful instruction: defer With it you can delay the execution of a function until the control flow leaves the surrounding function. For example:

func main() {
	defer println(" World")
	
	print("Hello")
}

will print Hello World

You may wonder what defer can be used for in real world programming. In fact defer has shown itself as an elegant and simple mean for executing cleaning-up actions. Imagine you need to open a file, work on it, and then close it. You can write something like:

f, err := os.Open("filen.ame")
if err != nil {
    // handle err
}

// work on the file

f.Close()

This works fine only under the assumption that // work on the file code segment is unable to take the control flow out of the function (i.e. there is no return until f.Close()) Even if that might be the case, nothing prevents future editions of the code to introduce modifications that will break the assumption.

defer to the rescue. In GO it is idiomatic to write code like:

f, err := os.Open("filen.ame")
if err != nil {
    // handle err
}
defer f.Close()

// work on the file

Just after the file is successfully opened, we schedule its closing to happen at the exit of the function. If you know Java or Python, you can see defer as a kind of finally block but more simple and powerful.

Some things you need to know when using defer:

(of course) You can have multiple defer in the same function:
```
defer println(" World")
defer print("Hello")
```
prints the expected Hello World. Notice deferred functions are executed in the opposite order of their apparition in the source code: deferred functions are stacked and then executed in the order they appear at the top of the stack.
Arguments of the deferred function are evaluated when the defer is evaluated (v.s. evaluated at function execution time).
```
a := 1
defer println("1st deferred print:", a)
a = 2
defer println("2nd deferred print:", a)
a = 3
println("a = ", a)
```
will print:
```
a =  3
2nd deferred print: 2
1st deferred print: 1
```
as you can see, the argument a at each deferred println it’s evaluated at the moment the defer statement is executed. This is very important to understand, otherwise it can lead to errors. For example, if we try to print function’s execution time with something like:
```
func foo() {
  start := time.Now()
  defer println(time.Since(start))
  // ... do something ...
}
```
we will always get an execution time of 0. Why? Simply because the arguments of the deferred function, time.Since(start), are evaluated when the defer statement is executed. Therefore the time elapse between the assignment to start and the evaluation of time.Since(start) is always near to 0. How to fix that? Easy:
```
func foo() {
  start := time.Now()
  defer func() { println(time.Since(start)) }()
}
```
we put the print statement inside a parameter-less lambda, and we defer the execution of that lambda. The value of start is captured inside the lambda and time.Since will be executed at the exit of the foo function.
You can modify the return values of a function in the deferred code.
```
func randomInt() (r int) {
  defer func() { r = -1 }()

  return rand.Intn(1000)
}
```
The above function seems to return a random integer between 0 and 1000 but it always returns -1. That is so a hack that I think I’m not doing the right thing by showing you how to do it.