Preface

If you have never written a piece of code before, or if you’ve only had an LLM write code for you (and you have been mildly successful at it) then this first bit will be for you.

If you are reading this and you’re not interested in scientific programming, e.g. writing code for computational fluid dynamics, weather simulation, electronic structure theory, or orbital mechanics, then you might find this a bit boring? But maybe you’ll find something fun.

Getting started with programming

Programming is just how we solve a problem in a reliable, reproducible, portable, and hopefully fast way.

Computers are not smart. They are very good at doing what you tell them to do, but only at that: they will do exactly what you tell them, even if it is wrong. A computer by itself does not understand natural language, so you cannot write a text file that says “Solve the 2D heat equation and plot the answer” and expect it to run. Right now you could ask an LLM to write that program for you and, depending on the programming language, it might do a very good job at it.

However, even with the advent of LLMs and “vibe coding”, you still need to think about how a program should be designed and how it will be maintained over time.

Therefore, in order to be a programmer you have to master the idea of problem decomposition. This is simply taking a complex task and breaking it down into the core components that make it up. Think of it as a recipe for cookies or a nice risotto. The goal is to bake cookies, but to bake them you need to look for the ingredients, grab the ingredients, grab the measuring devices, measure the ingredients, put the measured ingredients into a container, turn on the oven, get the cookie sheet, and so on. You get the gist of it.
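As a rough sketch of what breaking a task into steps can look like in code, here is a tiny Python example that splits "simulate heat spreading along a rod" into four small functions. Everything in it is an assumption made for illustration: the function names, the one-dimensional rod, and the simple explicit update rule are not taken from the text, and `alpha` is assumed to be at most 0.5 so that the update stays stable.

```python
# Hypothetical example: names and problem setup are illustrative assumptions.

def make_initial_temperature(n_points, hot_index, hot_value=100.0):
    """Step 1: set up the rod, cold everywhere except one hot point."""
    temperature = [0.0] * n_points
    temperature[hot_index] = hot_value
    return temperature


def take_one_step(temperature, alpha):
    """Step 2: advance one explicit time step; the two end points stay fixed.

    alpha is assumed to be <= 0.5 so that the scheme remains stable.
    """
    new = list(temperature)
    for i in range(1, len(temperature) - 1):
        new[i] = temperature[i] + alpha * (
            temperature[i - 1] - 2.0 * temperature[i] + temperature[i + 1]
        )
    return new


def run_simulation(temperature, alpha, n_steps):
    """Step 3: repeat the single step as many times as requested."""
    for _ in range(n_steps):
        temperature = take_one_step(temperature, alpha)
    return temperature


def summarize(temperature):
    """Step 4: reduce the result to something a human can read."""
    return f"max = {max(temperature):.2f}, total = {sum(temperature):.2f}"


if __name__ == "__main__":
    rod = make_initial_temperature(n_points=11, hot_index=5)
    rod = run_simulation(rod, alpha=0.25, n_steps=10)
    print(summarize(rod))
```

Each function is one line of the recipe: you can read, test, or replace any of them without touching the others.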

Solving a scientific problem is similar to writing a recipe for a new cookie flavor - there is a chance that no one has written this recipe before; maybe you are tasked with making the recipe better. Additionally, you might be working to add this recipe to a larger cookbook which you do not own. You are the baker/chef in this case and you need to make sure that the recipe you write is findable, accessible, reusable, and simple to use. You would like other people to be able to build upon your recipe, maybe take it and make it into a cookie cake. Without a good way to use your recipe, they would need to rewrite or completely repurpose it, making the next baker lose a gigantic amount of time.

Software is very similar. I like to classify programs based on two distinguishing traits:

  • Is your code a big black box that takes an input and produces a result based on that input? Then your code is “monolithic”.
  • Is your code standalone software that is used by other software to produce a result? Then your code is a “library”.

In an ideal case, a big monolithic piece of software is composed of many individual libraries, each of which does a certain task. For example, following our cooking/baking analogy: in a professional kitchen (or at home if you force your flatmates, significant other, etc. to work) you have stations. You have someone cutting and prepping the food, you have a dishwasher, you have a saucier, and so on: individual entities, independent of each other, that are each given a specific task. In a home kitchen, by contrast, there is only you acting as the prep person, the chef, the saucier, the dishwasher, and of course, the guest.

Each one of these chefs is a library which was designed to do a certain task. In the kitchen these chefs connect to each other via verbal communication; in the software world you connect via an interface, which is basically the “way” through which two pieces of code communicate with each other.

The idea behind a library, and behind a program that is composed of a set of libraries, is the concept of modularity. This means that the different pieces of code in your program (your chefs) don’t need to know about the others unless they are exchanging specific information or tasks. For example, the dishwasher does not need to know how to make a Hollandaise but will interact with the saucier when they give the dishwasher the dirty plates from making said Hollandaise.
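To make the kitchen picture a little more concrete, here is a minimal Python sketch of two independent pieces of code that only meet through a small interface. All names (`make_hollandaise`, `wash`, `run_kitchen`) are hypothetical and simply mirror the analogy; this is one possible way to lay it out, not a prescribed design.

```python
# Hypothetical names throughout: this mirrors the kitchen analogy only.

def make_hollandaise():
    """The saucier: returns the sauce and the dirty dishes it produced."""
    sauce = "hollandaise"
    dirty_dishes = ["whisk", "saucepan", "bowl"]
    return sauce, dirty_dishes


def wash(dirty_dishes):
    """The dishwasher: takes any list of dirty dishes, returns them clean.

    It knows nothing about sauces; the list of dishes is the whole interface.
    """
    return [f"clean {dish}" for dish in dirty_dishes]


def run_kitchen():
    """The 'monolithic' program: it only wires the two pieces together."""
    sauce, dirty_dishes = make_hollandaise()
    clean_dishes = wash(dirty_dishes)
    return sauce, clean_dishes
```

Here `wash` plays the role of a library: another "kitchen" could call `wash(["plate"])` without ever hearing about Hollandaise, because the only thing the two sides share is a list of dishes.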

Why is this good? Your dishwasher is specialized: they are the best in the business, and because they are an individual, anyone can poach them and have them do the exact same task equally well in another kitchen. The analogy here is not as good, since the original kitchen would lose its dishwasher. In the software world you just make a copy of this library and use it in your code. You don’t need to deprive the other program of its dishwasher.

A great example of modular software is the LLVM set of libraries, which are used around the world to develop new compilers and a multitude of other tools.

In the end, all it takes is to distill down your program into a set of simple “things” that need to happen in order for your problem to be solved. Think “how do I bake this cake?”

Designing your program

Before you even lay down a single line of code (which we haven’t gone into yet) I want to highlight the importance of design and of thinking about the future. In academia it is a bit difficult to do this, since you are incentivized to do things as fast as possible (but still well) in order to achieve your main goal: getting the grant, publishing the paper, graduating, etc. This complicates scientific software development a bit.

But a little bit of thinking can save you (and future collaborators) a lot of woes regarding the code you and anyone else writes. Why?

Have you ever tried to read the old recipe book your family members left behind? The paper may have been degraded by humidity, the handwriting is difficult to read, there are no measures for how much of each thing to add, etc. This is mostly because the recipe was originally written for the person who was cooking it every now and then. When they wrote it down, they didn’t think “I’d better write down exactly how many carrots I need”, because in their mind they always knew how many carrots they needed per person in the recipe. Maybe the single comment in the recipe will be “this feeds 4 people”. The “interface” between this recipe and you is far from ideal. How would you like to see a recipe? My main wish list is:

  • I don’t want a lengthy introduction of why this was your mum’s favourite recipe
  • I want quantities and how much this recipe makes
  • I want to know what the caveats are; what if my oven does not work, can I use the stove?
  • I want to know quickly how long it will take

These same ideas apply to how an interface between two programs has to be written. It has to be easy to read, interpret, and use, and its limitations have to be quick to spot. Can I bake cookies with this recipe? No. Ok, let’s move on.
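One way the recipe wish list can translate to code is a short description attached to the function itself. The sketch below is only an illustration: the function `mean_temperature` and the headings used in its description are made up for this example, not a standard.

```python
def mean_temperature(readings_celsius):
    """Return the arithmetic mean of a list of temperatures.

    Quantities: `readings_celsius` is a non-empty list of numbers,
        all in degrees Celsius.
    Yield: one float, also in degrees Celsius.
    Caveats: raises ValueError on an empty list; missing values are
        not skipped, so clean the data first.
    Time: a single pass over the list.
    """
    if not readings_celsius:
        raise ValueError("need at least one reading")
    return sum(readings_celsius) / len(readings_celsius)
```

Someone skimming this can tell in a few seconds what goes in, what comes out, and what the function will refuse to do, which is the "Can I bake cookies with this recipe?" check.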

So before we go into some programming essentials, think of it this way: you are writing code that has the potential to be used by anyone on the planet. Make it easy to use.