Programming is something everyone using a computer really should be able to do, and in my opinion it's a small crime that it's been made so difficult on modern devices. It's also much easier than you're often told it is; I learned when I was about 12 years old. Degrees are more for dealing with HR and getting dragged through ideas you think are unimportant, and hopefully this document can be a substitute for the latter. It's based on years of teaching children and adults to program from scratch, helping junior engineers contribute to projects I've been working on, and things I've learned in the course of my own career.
Note on books
If you need a book that isn't available for free online and you can't afford it or borrow it, I'd recommend going to libgen (the domain changes so you'll have to google it) rather than doing without. This is a federal crime in the US and it's better to have a hard copy IMO, but you can wait until you have a good job for that if you need to. Chances are though I'll have the book you're looking for or something similar (if I haven't already lent it out to someone else). The O'Reilly "animal books" are very pricey but they're worth every penny. It's rare for one of them to be a dud and they're usually the best thing you'll find for learning a new language, tool, or framework.
The Basics
In my experience beginner programmers usually get completely blocked from learning by a few things that seem trivial to people who have been programming for a while. These are:
- Setting up a suitable environment
- Programming language syntax
- Problem decomposition and debugging
Some very simple stuff to get you started
I've found that Scheme, because of its extreme minimalism, tends to make the first two easy to overcome. The very popular book "Structure and Interpretation of Computer Programs" can help you begin with the third. You don't have to read the whole book; if you make it through chapter 1 or maybe 2 that should be enough to get you going. Scheme (like most Lisps) is very rarely used for anything other than teaching, but it's exceptionally good for that in my opinion, just because it lets you remove all the distractions more complex languages create.
So to get started:
- Get an environment: either BiwaScheme, which you can use without installing anything (defocus the text input to execute your code), or MIT/GNU Scheme if you want something local. MIT/GNU Scheme also includes a somewhat bizarre Emacs-like editor if you want to get started down that rabbit hole.
- Get the book, which is available for free in HTML on MIT's website, or you can borrow my copy. I tend to loan it out a lot but I should have a copy somewhere around my house.
- Now just go through chapter one and maybe two. The goal here is to get you used to thinking about and manipulating procedures and data structures. Once you have internalized that you can move on to a more useful language like Python or C. The book covers, at least at a high level, almost every concept you'd see in a normal CS degree, so it's not a bad idea to at least skim the rest of it, but don't feel bad moving on to something else when you're ready. Also don't get too hung up on Scheme; it's an unusual language.
All of that isn't to say Scheme is useless, even for back end programming. My two favorite programming forums, Textboard and Hacker News, are both written in a Scheme dialect. The latter is written in PG's personal Scheme-like language, Arc, which is specifically intended for web development.
Debugging tips
When you inevitably get stuck in Scheme or any other language remember these rules:
- You're in control of the behavior of your program. If you want to see the value of a variable at some point, don't be afraid to just cram print/display calls into it. This is the fastest and most useful form of debugging, called "printf debugging."
- Sometimes when working with new algorithms you can get hung up on the logic. It can be really helpful to just talk through what you're thinking to yourself (or an imaginary person.) This is called "rubber duck" debugging because some people like to talk to a rubber duck when they do it.
- For very difficult bugs in large programs you often need to resort to something called a "binary search." To perform a binary search:
  - Find a determining question that splits your problem space roughly in half and tells you which half the bug is in
  - In the remaining half repeat the search until your problem is obvious by inspection
  This is something beginners often struggle with and the ability to do it is something most people watch for in interviews. It's one of the most useful ways to debug something.
- Don't be afraid to ask for help. You can use my phone number or email address (swiley@swiley.net). They both go to the same place and you're not the only person asking for help learning this stuff. This is another thing people at all skill levels (including me) tend to avoid longer than they should.
- Also don't underestimate the utility of modern large language models. GPT davinci, used in the "gptchat" service, is very good at answering simple questions and putting together some "MVP/tutorial" code to help you get started with new libraries/languages/projects. If you want models you can self-host, personally I keep a copy of CodeGen loaded on my large machine and can share a copy of my server code for this. It's not nearly as good but can often generate some of the more annoying boilerplate for you. GPT-NeoX can be self-hosted but the low parameter models are kind of crap and the high parameter models still can't do zero-shot learning (ZSL) and need upwards of 80GB of RAM just to run forward.
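The binary search rule above can be sketched in Python. This is a minimal illustration and not a general debugging tool: `first_bad` and `is_bad` are made-up names, and it assumes the candidates are ordered so that everything before the culprit is fine and everything from the culprit onward fails (for example a list of changes made in order). The print call inside the loop is the "printf debugging" from the first rule.

```python
def first_bad(candidates, is_bad):
    """Return the index of the first candidate for which is_bad(...) is true.

    Assumes at least one candidate is bad and that once a candidate is bad
    every later one is bad too.
    """
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        # printf debugging: show how the search space shrinks on each step
        print(f"checking index {mid}, remaining range {lo}..{hi}")
        if is_bad(candidates[mid]):
            hi = mid          # the culprit is at mid or earlier
        else:
            lo = mid + 1      # the culprit is after mid
    return lo


# Hypothetical example: 100 numbered changes, the bug arrived with change 37.
changes = list(range(100))
culprit = first_bad(changes, lambda change: change >= 37)
print("first bad change:", changes[culprit])
```

Each pass throws away half of what's left, so even 100 candidates only need about seven checks.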
Some slightly more useful stuff
If you're already comfortable with recursion and Turing complete languages you might try Python. Many people erroneously think of this as a beginner's language but I would argue it has some of the most complex and confusing syntax, so I usually tell people to try Scheme first just to internalize the basics of programming. Once you have a good intuition for this, though, you can write complex software very quickly in it since the language runtime does so much for you. The general culture that's developed around the language is very healthy and people almost everywhere use it for all kinds of things. It's a good language to know and one of the best for experimenting with new ideas.
- "Python Cookbook" is a very commonly recommended intro to Python book. I have a copy but a friend in Lynchburg is currently borrowing it. When she gives it back I can lend it to you if you want.
- The official tutorial looks almost as good as I remember the book being. Definitely give that a look.
- The Getting started page looks like it will walk you through installing Python, which can be complicated on Windows, so you might want to have a look at that too. If you're on a Debian based Linux OS just run
apt install python3
and on Alpine or iSH you can run apk add python3
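Once Python is installed, something like the following makes a reasonable first program. It's only a sketch: the function names and the sample sentence are made up for illustration. The first function is the kind of recursive procedure chapter one of SICP has you write in Scheme, and the second leans on a built-in dictionary, which is the sort of thing the runtime does for you.

```python
def factorial(n):
    """Recursive factorial, the same shape as the Scheme version in SICP chapter 1."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)


def word_counts(text):
    """Count how often each word appears using a built-in dict."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


print(factorial(5))
print(word_counts("the quick fox jumps over the lazy dog"))
```

Save it as a file and run it with python3, or paste it into the interactive prompt a piece at a time.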
Another language that's nice when you're starting out is C. It doesn't have many built-in data structures like Python does but the simplicity can help focus you on the underlying algorithms. It also runs much faster and on more hardware than Python. C used to be one of the more common video game development languages and it's one of the first I learned. If you remember the Wii game I wrote around when I first came to Evergreen and installed on Gabe's Wii, that was written in C.
- If you're on Windows grab Msys2 to get the GNU tools you need for producing binaries. If you're on Linux just run
apt install gcc
- I learned from the book "C in easy steps." I no longer have a copy since I lent it out and it was never returned. It's kind of shallow but extremely well organized and great for beginners. I paid $7 for it at Barnes & Noble back in 2006 or so.
- The canonical book is "The C Programming Language" which served as the spec until ANSI formalized it. This one's more dense but more complete, and it also introduces you to some of the culture around the language, which influences almost all of the system software available today.
You may also want to have a look at either C# or Java. I did a lot of C# when I was freelancing but don't have a good tutorial to give you. Let me know if you want more info on this.
SQL
No matter which language you pick you'll usually want a database for anything non-trivial. I'm extremely partial to SQLite. Not only is it fast, trivially easy to set up, trivially easy to work with, portable, and popular, many languages have built-in support for it (including Python). This tutorial looks decent. Most people find the SQL basics pretty easy to pick up; let me know if you need more resources for that though.
You'll want to install git. This not only gives you a version control and collaboration tool but also provides you with a Unix shell (bash), which is good to get acquainted with.
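To give a feel for how little setup SQLite needs, here is a minimal sketch using the sqlite3 module from Python's standard library. The `books` table and its rows are invented for the example.

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a filename to store it on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, lent_out INTEGER)"
)

# Use "?" placeholders rather than building SQL strings by hand.
conn.executemany(
    "INSERT INTO books (title, lent_out) VALUES (?, ?)",
    [("SICP", 0), ("Python Cookbook", 1), ("C in easy steps", 1)],
)
conn.commit()

lent = conn.execute(
    "SELECT title FROM books WHERE lent_out = 1 ORDER BY title"
).fetchall()
print(lent)
conn.close()
```

There's no server to install or configure; the whole database is either that one connection or a single file on disk.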
General Software Engineering Resources
The most common problem decomposition technique used in corporate software is called "Object Oriented" decomposition. When you're designing software this way you try to break down your problem into data structures called "objects" that function by "sending messages" to each other. These objects have a list of internal "fields" that may or may not be visible to each other and "methods" which are functions that operate on object instances via a "receiver." That's a very poor explanation, other people have much better ones.
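As a rough illustration of those terms in Python (the `Book` class is a made-up example, not taken from any particular codebase): the object has fields, one of which is hidden by convention, and methods that operate on an instance through the receiver, which Python spells `self`.

```python
class Book:
    """An object with two internal fields and methods that operate on them."""

    def __init__(self, title):
        self.title = title        # field visible to other objects
        self._borrower = None     # leading underscore: private by convention

    def lend_to(self, name):
        # "self" is the receiver: the instance this message was sent to
        if self._borrower is not None:
            raise ValueError(f"{self.title} is already lent to {self._borrower}")
        self._borrower = name

    def give_back(self):
        self._borrower = None

    def is_available(self):
        return self._borrower is None


book = Book("SICP")
book.lend_to("a friend")       # send the "lend_to" message to the book object
print(book.is_available())
book.give_back()
print(book.is_available())
```

Other code never touches `_borrower` directly; it only sends messages like `lend_to` and `give_back`, which is the point of the decomposition.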
You'll also want to be able to work with databases.
Here is a short list of what I would consider to be the most important books on software engineering
- Martin Fowler's "UML Distilled"
- The Practice of Programming. This is a great introduction to problem decomposition.
- You also need to know SQL and normalization. I really don't have a good resource for this off the top of my head. The book they gave us at the university was crap. Martin Fowler has some good stuff on it if you check his blog and I'm sure there are some good OCW lectures if you search on YouTube. Also the Wikipedia article isn't bad. The relational algebra article is really rough so I'll try and find something better, but this is already getting pretty long.
Extra Academic Stuff
Don't be scared by the math, CS people tend to be crap at it and it's all pretty easy.
- If there's one book in this list you should read, it's the data structures and algorithms book. This stuff comes up in interviews all the time and going through it will make you a way better programmer.
- Linear Algebra is just good to know. Pick any applied book on it. I went through the whole MIT OCW lectures in addition to taking an advanced class on it. It's extremely useful and at the very least the online lectures are great sleep aids! 3blue1brown also has a fantastic playlist that focuses on the intuition. If you do any kind of ML, data modeling, graphics, robotics, controls etc you'll want this. I used it all the time in the robotics club.
- The Art of Computer Programming (I think I have the complete set so far, if you want to borrow part of it). It's intended to essentially be an encyclopedia of abstract CS topics.
Some specifics
Web backend
You can write web backend code in just about any language. Most of the backend code I've written for my personal projects has been in the Unix shell language "bash" or in C but both of these are somewhat unusual. Some people like to use Python either via the "Flask" framework or "WSGI."
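For a sense of what the WSGI route looks like, here is a minimal sketch using only the wsgiref module from Python's standard library. The names `application` and `serve` and port 8000 are arbitrary choices for the example, and wsgiref's server is meant for development rather than production.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """A WSGI app is a callable that takes the request and returns the body."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(port=8000):
    """Run the app on the standard library's development server."""
    with make_server("", port, application) as httpd:
        httpd.serve_forever()
```

Calling `serve()` and opening http://localhost:8000/ in a browser should show the greeting. Frameworks like Flask give you routing and templates on top of this same interface.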
Most languages besides Go will need a web server to forward requests to your application. Here are some useful resources:
- One of the most popular web servers is the Apache HTTP Server. Here is their Getting started guide. On Debian you should be able to run
apt install apache2
and get a moderately sane default config. On Alpine or iSH run apk add apache2
- Alternatively you can use the server I use, thttpd. I prefer it because there's much less to configure but you'll probably need to run it from Linux. On Debian it's again just
apt install thttpd
- CGI programming using shell (you can use any language though such as Python)
- Web programming using Go (another common "back end" language, I wrote my final project in this)
A warning about contemporary server side JavaScript (node or npm) and PHP
Abstraction is one of the most powerful tools a programmer has and there's a common tendency to abuse it. For whatever reason this seems to happen most often among JavaScript and PHP programmers. The results are catastrophic. The large dependency graphs expose projects to "supply chain" attacks where malicious behavior is added to one of the many libraries they're consuming. This is a particular case of a more general problem they face where their dependencies change behavior out from underneath them.
Personally I have a strict rule of not working in PHP (you can see this on my "hire me" page on my main website) and generally I try to avoid large JavaScript projects. Don't let my cynicism turn you off of it if you're feeling brave; just be prepared for what you're signing up for. I know people who have been blindsided by this.
Systems Programming
Everything has to sit on top of a machine at some point. The people who deal with the problems this creates are called "systems programmers." It might sound niche but some of the most lucrative contracts I got while freelancing came from this space. I would think of it as two separate but related things: "embedded systems" where a computer is physically part of some application (such as a controller in an industrial automation system) and more traditional systems (such as end user workstations or servers). As computers get cheaper these two become more and more similar. Modern embedded system firmware is now often generated using the same tools used to generate OS images for servers. Below are some resources for understanding these:
- Arduino is an increasingly common and very easy to use C++ IDE for programming cheap microcontroller boards for embedded systems. If you want a cheap board I can give you one; I have piles of hardware and have built half a dozen small robots with them over the years. It's also a fun way to get into C++ if you need some motivation.
- Plan 9 is an extremely bizarre operating system but the code is very easy to read and the abstractions are usually very well chosen. While Nemo struggles with English, his book is how I was introduced to operating system design and it is IMO one of the better introductions.
- You can build a complete Linux based operating system with just BusyBox and the Linux kernel itself. Special purpose Linux images were an extremely common and easy request I got while freelancing. It seems to be common and lucrative enough that many (most?) of the BusyBox contributors are themselves freelancing embedded systems consultants. The BusyBox source code is also very easy to read; I'd recommend it if you want a good understanding of the Linux userspace.
- Buildroot will generate Linux images entirely from end to end with just the make tool. It can be useful if you need more utilities than busybox provides or need to build more complex services such as Xorg and EUDev.
Deployment Automation
Now that everyone is moving to hosted ephemeral VMs for services the new hot thing is something called "deployment automation." The idea here is that you can have all of the underlying infrastructure for your application, along with installation instructions, written out in code. This way if something goes wrong you can just run a program and spin up a new instance, you can spin up a private instance for testing new features, and you can easily discuss and approve changes using your normal code review process. You can even set up automations to rebuild your entire testing environment as your team develops new features. This lets you catch bugs early.
- Terraform lets you describe the resources you want and will mutate the world to make it match.
- Ansible is a more procedural way to do something similar. It tends to constrain you less without giving up the idempotency.
- Puppet is actual garbage but lots of people like it and it can be good to have on your resume.
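The "describe what you want and make the world match" idea, and the idempotency that comes with it, can be sketched in a few lines of Python. This is a toy model and not how Terraform or Ansible are actually implemented: `reconcile`, the dictionaries, and the instance names are all made up for illustration. You describe the state you want, compare it against what exists, and apply only the difference, so running it a second time does nothing.

```python
def reconcile(desired, actual):
    """Mutate `actual` until it matches `desired`; return the actions taken."""
    actions = []
    for name, size in desired.items():
        if name not in actual:
            actions.append(f"create {name} ({size})")
        elif actual[name] != size:
            actions.append(f"resize {name} ({actual[name]} -> {size})")
        actual[name] = size
    for name in list(actual):
        if name not in desired:
            actions.append(f"destroy {name}")
            del actual[name]
    return actions


# Hypothetical infrastructure: instance name -> size
desired = {"web": "small", "db": "large"}
world = {"web": "tiny", "old-test-box": "small"}

first_run = reconcile(desired, world)
second_run = reconcile(desired, world)   # idempotent: nothing left to change
print(first_run)
print(second_run)
```

The second run returning an empty list is the property that makes it safe to rerun these tools whenever you like.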
Notes about Computer Hardware, Programming Environments and "soft skills"
Text Editors
Your text editor will become familiar enough that it almost becomes an extension of your body. Because of this most programmers tend to have extremely strong opinions about their editor. I'd recommend trying a few, keeping an open mind, and then getting very good at a couple. Keep in mind that the people you work with will often push their favorite editor on you, so you may be forced to learn a new one.
In the past, tools for auto complete, syntax highlighting, and symbol lookup were coupled with particular editors, but that era has almost completely ended. Most editors speak common protocols/formats for these via language servers and tags files, making it easy to move between editors without having to give up language support.
Here is a list of common text editors:
- Vim is the editor I use, mostly because dad installed it on my machine when I was young. It has a large array of motion commands, a powerful plugin and scripting system, and decent syntax highlighting. It has both a graphical version and a VTE version so you can easily use it on remote machines. It's more or less the de facto Unix text editor.
- Emacs is another common programming editor that shares some ideas with Vim. It was written around the same time but is built around a special dialect of Lisp.
- Nano is a minimal editor (a clone of Pico) that's really not much more capable than Notepad.exe on Windows. Linus Torvalds famously uses a different minimal editor, a MicroEMACS variant, for his work on Linux.
- Visual Studio Code (not to be confused with Visual Studio) is a web based editor that runs in a special version of Chromium. It's essentially Emacs but with JavaScript and a slightly more intuitive UI.
Hardware and Environments
Via the Church-Turing thesis we know any computer can simulate any other computer provided it has enough time and RAM. Modern computers are usually even more compatible than this would suggest, so in practice computer hardware doesn't really matter. I'm personally using a refurbished "thinkpad 11e" right now that I think I paid maybe $80 for (including shipping).
Nonetheless there are some things that can make it easier and I'll list below what I've found to be most helpful:
- If you insist on doing everything on an iOS device you can use iSH. This is a rewrite of the Linux kernel API and an x86 emulator for the Alpine userspace that gives you access to most Linux tools on iOS. It can be very slow though.
- Run Linux. Windows is the result of a pretty nasty consumer oriented "software market" vision Bill Gates had and the result is something that makes controlling your machine and developing software extraordinarily difficult.
- And a second less abstract reason to run Linux: Windows absolutely abuses the hardware. Compared with a decent Linux based environment, for the same task you might need twice the RAM, many times the CPU, and way way way more I/O bandwidth.
- Avoid "non-free" software in general. By "non-free" I mean software that is shared without source code. For more information see FSF. This includes Apple software, especially now that the visionaries there seem to have been driven out by accountants.
- Try to avoid mouse driven interfaces. They're slow, difficult to communicate and reason about, and they tend to cause "RSI" (repetitive strain injury) much faster than keyboard driven user interfaces. Personally I use an editor called "VIM" for all my text editing/programming. The learning curve for it is extremely steep but it's more than worth it.
- Use version control, especially distributed version control such as git or mercurial. Almost everyone these days uses git and you'll almost certainly be expected to use it or something similar.
- Get familiar with a Unix shell (bash, ash, fish etc.) It's everywhere and your life will be much easier if you can automate things.
- Similar to the last, learn regex. They're just super useful in all kinds of weird places.
- I like having a separate physical space (preferably a separate machine as well) for when I'm trying to be productive as opposed to goofing off. Try not to cross contaminate the two different behaviors, as managing your emotional state is really the biggest challenge when you're doing knowledge work like this.
- Don't be afraid to create and destroy VMs, especially if you're on Windows. VirtualBox is a great tool for that on Windows (since Qemu doesn't work well there) and the UI is super intuitive.
- It can be super useful to have an always on Linux computer with a public IP address. If you have a good ISP and know how to configure your home network you can use a spare computer for this (vpn.swiley.net points to my house for example) but it can also be nice to just have someone else set one up (the machine you're reading this on, mail.swiley.net, is one such machine). I personally use Linode for this; my dad likes Digital Ocean. Sometimes you can find cheaper places. Stay away from AWS and Oracle unless you want to wake up one day with no money for some confusing reason.
- TLDP has probably the best bash tutorial available. You'll likely run into bash at some point and should know it.
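As a small taste of the regex item above, here is a sketch using Python's built-in re module. The log lines are invented for the example; the same pattern syntax mostly carries over to grep, sed, and most editors.

```python
import re

# Hypothetical log lines, just for illustration.
log = """\
2024-01-05 ERROR disk full on /dev/sda1
2024-01-05 INFO backup finished
2024-01-06 ERROR cannot reach mail.example.org
"""

# Named groups: an ISO-style date, a level word, then the rest of the line.
line_pattern = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.*)$"
)

errors = []
for line in log.splitlines():
    match = line_pattern.match(line)
    if match and match.group("level") == "ERROR":
        errors.append((match.group("date"), match.group("message")))

print(errors)

# Substitution: collapse runs of whitespace into a single space.
cleaned = re.sub(r"\s+", " ", "too    many   spaces")
print(cleaned)
```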
Some Linux Distros
Here are some Linux distros you can run in VirtualBox or probably install directly on your machine. If you need a thumb drive with an image on it I can probably burn you one or you can use Rufus. I've never used it but I've heard good things from people who have. If you do this make sure you back up everything on your computer.
There isn't just one "Linux" OS because Linux on its own isn't an OS and so these are really large collaborations between many projects. On most Linux distros both the application and system software is really thought of as a single unit distributed and packaged together by some "maintainers."
- Mint is a very popular "just works" distro. I think the people maintaining it are a bit sloppy so I don't use it but it's probably not bad if you're just starting out.
- Alpine. This is what I use on most devices. It's a bit goofy but I like the ideas the people behind it have.
- Debian is what a lot of popular distros are based on. Beginners don't like it because it doesn't include non-free WiFi firmware for ideological reasons.
- Devuan is a fork of Debian that was caused by some political issues (mostly brought about by IBM/Microsoft). I'm currently moving a lot of my machines to it but it can be a bit hairy for new people.
Soft skills
- You should know the git feature branch workflow. This is the most common method of intra-organizational source code change management. The other common DVCS workflow is the "patch email" workflow which can be good to know but isn't something junior engineers use often unless they're contributing to some open source projects.
- It's a good idea to be familiar with "scrum/kanban" style collaboration. You can read the documentation included in my kanban implementation and I'm sure there are much better explanations. The main idea is you write "tickets" when people decide work needs to get done. During planning/grooming the team might assign "points" to these, which are a relative guess at the difficulty (usually picked from the Fibonacci numbers). Every two weeks people get assigned tickets, putting them in a "TODO" state. Every morning during a status update meeting people talk about what they're doing with the tickets in the "doing" state. When the code for a ticket is tested and merged it's moved to the "done" state. That's pretty much all there is to it; some people overthink it.
- You should know how to write flow charts, ERDs and UML class diagrams. All of these are very useful for communicating with other engineers, managers, and clients. When I was freelancing having the project modeled in UML was a requirement. Either the client had to bring me their own documentation or I'd bill them hourly to do it for them.
Quick list of shortcuts
The things that made the biggest changes for me were:
- Trying to make at least one meaningful contribution to my side projects every day
- Using Linux as my home OS and doing as much as possible in bash
- Learning database normalization and object oriented decomposition
- The data structures and algorithms course I took
- Test driven development
- Having a separate work space that I don't mentally contaminate with goofing off
Don't hesitate to ask questions!