AVR is a family of 8-bit RISC microcontrollers developed by Atmel, well known for being used on many Arduinos. AVR is also the name of the instruction set and architecture used by these microcontrollers. The goal here is to write a program in AVR assembly, assemble it into a .hex file, and upload it using AVRDUDE to an Arduino Uno with an ATmega328P microcontroller. I hope that this post provides a quick introduction to the AVR architecture and assembly language, as well as some of what goes on inside an assembler.
AVR Assembly Programming
To start out, I researched how to program an Arduino using only AVR assembly, and it turns out it isn’t too hard.
Atmel provides an assembler (documentation, GitHub) that converts assembly into .hex files, and the .hex files can be uploaded to the Arduino using a utility called AVRDUDE.
“AVRDUDE is a utility to download/upload/manipulate the ROM and EEPROM contents of AVR microcontrollers using the in-system programming technique (ISP).”
The AVR assembler and AVRDUDE are generally available as the avra and avrdude packages on Linux.
The first step when writing assembly is to find an include file for the microcontroller you want to program.
One for the atmega328p is available here.
This gives names to the various registers that control I/O, timers, and other features.
It isn’t required, but the alternative is reading through the microcontroller datasheet.
Say we want to turn on the Arduino’s LED.
On the Arduino Uno, an LED is connected to pin 13, which is connected to the PB5 pin on the microcontroller.
First, we need to configure the pin to be an output.
This is done by writing a 1 to bit 5 of the DDRB register. Writing a 1 to the same bit of the PortB register then drives the pin high, which turns the LED on.
Then, to keep the program counter from running on past the end of the program, we just jump to the same place over and over again. The resulting program is as follows:
.include "./m328Pdef.inc"
ldi r16,0b00100000
out DDRB,r16 ; Set data direction to out
out PortB,r16 ; Set pins high or low
Start:
rjmp Start
Making the LED blink requires some more code. It can be done by counting up to a certain number, turning on the LED, resetting the counter, counting up again, and then turning off the LED. This doesn’t require accessing RAM, so most of the instructions take one clock cycle; the exception is rjmp, which takes two (not that it matters for a counter like this). The microcontroller runs at around 16 MHz, so if the loop takes 7 cycles it needs to be able to count to a little over 2,000,000 to toggle about once every second.
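The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 16 MHz clock and the 7-cycle loop are the figures from the paragraph above, and the names F_CPU, CYCLES_PER_LOOP, and loops_per_second are made up for illustration.

```python
# Rough timing estimate for a busy-wait counter loop.
F_CPU = 16_000_000     # clock frequency in Hz (about 16 MHz on the Uno)
CYCLES_PER_LOOP = 7    # assumed cost of one pass through the counting loop


def loops_per_second(f_cpu=F_CPU, cycles=CYCLES_PER_LOOP):
    """Number of loop iterations that fit into one second."""
    return f_cpu // cycles


count = loops_per_second()
print(count)               # iterations needed for a one-second delay
print(count.bit_length())  # bits needed to hold that count
```

The bit length is what decides how many 8-bit registers the counter has to be spread over.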
To make this counter we will use several instructions that rely on the status register. This register stores 8 flags:
Bit | Description |
---|---|
0 | C Carry – also the borrow flag |
1 | Z Zero – 1 when the result is zero. |
2 | N Negative – MSB of result |
3 | V Overflow – two’s complement overflow |
4 | S Sign – N XOR V. For signed tests |
5 | H Half-carry – Internal carry, for BCD |
6 | T Bit copy – used in bit store/load instructions |
7 | I Interrupt – Set if interrupts are enabled |
The carry flag is set when an add or sub instruction carries or borrows out of the most significant bit. The adc and sbc instructions add or subtract the two operands and also include the carry bit. Comparisons are done with the cp instruction, which modifies all of the flags except T and I. There are many branch and skip instructions that test these flags.
We need roughly a 22-bit number to count past 2 million, so it has to be spread over three of the 8-bit registers. When the upper 8 bits reach a certain value, the program switches the LED, resets all three counter registers, and starts over.
.include "./m328Pdef.inc"
.equ ledMask = 0b00100000
.def zero = r1 ; r1 should always be zero
.def numLow = r16
.def numMid = r17
.def numUpper = r18
.def one = r19
.def temp = r20
ldi one, 1 ; initialize our "one" register
ldi temp, 0 ; make sure zero register is 0
mov zero, temp
ldi temp, ledMask
out DDRB, temp ; Set led pin data direction to out
out PortB, temp ; Set led pin to high
Increment:
add numLow, one ; increment lower by one
adc numMid, zero ; add zero + carry bit to mid
adc numUpper, zero ; same for upper
cpi numUpper, 0b00011111 ; compare upper byte with 31
breq TurnOn ; branch if numUpper is equal to the value in the previous instruction
rjmp Increment ; else increment again
TurnOn:
ldi temp, ledMask
out PortB, temp ; Set pin to high
clr numLow
clr numMid
clr numUpper
Increment2:
add numLow, one ; increment lower by one
adc numMid, zero ; add zero + carry bit to mid
adc numUpper, zero ; same for upper
cpi numUpper, 0b00011111 ; compare upper byte with 31
breq TurnOff ; branch if numUpper is equal to the value in the previous instruction
rjmp Increment2 ; else increment again
TurnOff:
mov temp, zero
out PortB, temp ; Set pin to low
clr numLow
clr numMid
clr numUpper
rjmp Increment ; start incrementing again
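To see how the add/adc chain ripples the carry through the three registers, here is a small Python simulation. It is a sketch rather than a cycle-accurate model: add8 and iterations_until are hypothetical helper names, and the timing line reuses the 7-cycle, 16 MHz estimate from earlier.

```python
def add8(a, b, carry_in=0):
    """8-bit add with carry, like AVR add/adc: returns (result, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def iterations_until(upper_target):
    """Count loop passes until the upper byte equals upper_target."""
    low = mid = upper = 0
    passes = 0
    while True:
        low, carry = add8(low, 1)             # add numLow, one
        mid, carry = add8(mid, 0, carry)      # adc numMid, zero
        upper, carry = add8(upper, 0, carry)  # adc numUpper, zero
        passes += 1
        if upper == upper_target:             # cpi followed by breq
            return passes


print(iterations_until(1))  # the upper byte first becomes 1 after 65536 passes

# Closed form for the program's target of 0b00011111 (simulating it also works,
# it just takes a couple of million passes):
passes = 0b00011111 << 16
print(passes, passes * 7 / 16_000_000)  # number of passes and approximate seconds
```

With the target used in the program, the LED changes state a little more often than once per second.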
This felt like quite a bit of assembly to do something so simple, so I was curious about what the classic Arduino blink program compiles to.
It turns out it compiles to an assembly program with 385 instructions. A lot of this is overhead from using function calls to set pin outputs and do delays. The delay function might also set up a timer or counter of some sort. There are quite a few places where it stores and reads registers to and from RAM, which takes two clock cycles instead of the usual one. I don’t know exactly why it’s so long, but it shows how writing assembly can bring big performance and size improvements. Of course, in this case none of that matters because these delays are supposed to be slow.
After the assembly is written or generated, it is assembled into a file in the Intel HEX format that defines what memory in the microcontroller should be set to. Each line (record) starts with a colon, followed by a byte giving the number of data bytes, then a two-byte address for where the data should start in memory. Next is a record type that says whether the record holds data, an address extension, or the end of file. For my assembler, I only need data records holding instructions. Next is the data field, which holds each instruction in little-endian byte order. Finally, there is a checksum that is calculated by adding up all the preceding bytes of the record (byte count, address, record type, and data), taking the two’s complement of that, and keeping only the last byte.
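The record layout can be sketched in Python as follows. hex_record is a hypothetical helper, not the function used in my assembler, and the data bytes are the ledOn program that is assembled step by step later in this post.

```python
def hex_record(address, data, record_type=0x00):
    """Build one Intel HEX record (one line of the .hex file)."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]) + data
    checksum = (-sum(body)) & 0xFF  # two's complement of the sum, low byte only
    return ":" + (body + bytes([checksum])).hex().upper()


# Four 16-bit instructions, already in little-endian byte order.
led_on = bytes.fromhex("00E204B905B9FFCF")
print(hex_record(0x0000, led_on))     # data record
print(hex_record(0x0000, b"", 0x01))  # end-of-file record
```

Note that the checksum covers the byte count, address, and record type as well as the data, which is why the end-of-file record ends in FF.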
Custom Assembler
Atmel provides a nice assembler, but it doesn’t support multiline comments. I decided I had to have multiline comments and that I don’t care about most of the instructions, so I made my own AVR assembler. I did this using Python as the programming language, regular expressions for parsing, and JSON to define instruction formats. This assembler is available here on GitHub.
The Python program is split into several steps. I’ll explain them here without really going into many of the coding specifics. First, the JSON file with instruction formats is parsed using the json library. Then the assembly file is read in one line at a time and run through the preprocessor step. If a file is included, it is treated as if it were part of the original file. The preprocessor step also saves label values, strips out comments, trims whitespace, and converts everything to lowercase. This is the line that strips out single line comments:
line = re.sub(r";.*|\/\/.*|\/\*.*\*\/", "", line)
This is a regular expression that matches “;” or “//” and anything after it, or “/*” and “*/” with anything in between, and replaces the match with an empty string. AVR assembly is case-insensitive, so everything is converted to lowercase. The processed instructions are stored as instances of my Instruction class in an array. To give an example of this process, the simple ledOn program:
.include "./m328Pdef.inc"
ldi r16,0b00100000
out DDRB,r16 ; Set data direction to out
out PortB,r16 ; Set pins high or low
Start:
rjmp Start
is converted to
program[] = |
---|
ldi r16,0b00100000 |
out ddrb,r16 |
out portb,r16 |
rjmp Start |

labels[] =
Key | Value |
---|---|
ddrb | 0x04 |
portb | 0x05 |
start | 4 |
tons of stuff from the .inc file | |
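A stripped-down version of the preprocessor step might look like the sketch below. It reuses the comment regex quoted above, but everything else is an assumption for illustration: the preprocess function, the LABEL_RE pattern, and the choice to number a label by the index of the next instruction (counting from 0) are mine and may not match how the real tool numbers labels. The .include handling is left out entirely.

```python
import re

COMMENT_RE = re.compile(r";.*|\/\/.*|\/\*.*\*\/")  # pattern quoted in the post
LABEL_RE = re.compile(r"^(\w+):\s*(.*)$")          # hypothetical label syntax


def preprocess(lines):
    """Return (program, labels) for a list of raw assembly lines."""
    program, labels = [], {}
    for raw in lines:
        line = COMMENT_RE.sub("", raw).strip().lower()
        match = LABEL_RE.match(line)
        if match:
            # Assumption: a label names the index of the next instruction.
            labels[match.group(1)] = len(program)
            line = match.group(2)
        if line and not line.startswith("."):  # directives are skipped here
            program.append(line)
    return program, labels


source = [
    "ldi r16,0b00100000",
    "out DDRB,r16 ; Set data direction to out",
    "out PortB,r16 ; Set pins high or low",
    "Start:",
    "    rjmp Start",
]
program, labels = preprocess(source)
print(program)
print(labels)
```

Lowercasing happens after the comment is removed, so comment text never needs to be touched.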
Next, the program loops through all the lines in program and generates their bytecodes.
This takes several steps for each instruction.
First, the operation (ldi, out, add, etc.) is parsed and looked up in a JSON file.
This defines which format to use and can define bits to go in certain fields.
For example, the ldi instruction is defined like this:
"ldi": {
"format": "register-immediate",
"opcode": "0b1110"
}
Next, it uses the “format” field to look up the instruction format in the JSON file. This is separate because many instructions can use the same format:
"register-immediate": {
"regex": ".*\\s(?P<Rd>\\w+),\\s*(?P<K>\\w+)",
"fields": {
"opcode": {
"bits": [15,14,13,12]
},
"Rd": {
"bits": [7,6,5,4]
},
"K": {
"bits": [11,10,9,8,3,2,1,0]
}
}
}
This format definition provides regex with named match groups to parse the rest of the instruction.
It also defines what the fields in the bytecode are and which bits they fill.
AVR assembly is a lot more annoying than MIPS when it comes to instruction formats and fields.
MIPS has three basic instruction formats (R, I, and J), whereas AVR has about 6 main formats and a lot of formats unique to one or two instructions.
Most instructions are 16 bits long, but several are 32 bits.
Many instructions also have split fields.
For example, the ldi (load immediate) instruction has the format ldi Rd,K where K is an 8-bit number and Rd is a 4-bit register address.
These are stored in the bytecode in this format:
1110 | KKKK | dddd | KKKK |
Another weird instruction that took me a while to figure out is clr (clear register). It puts the 5-bit register address in the bytecode twice. The documentation showed this format as
0010 | 01dd | dddd | dddd |
So I assumed it was the 5-bit address followed by the 5-bit address again, but clr Rd is really eor Rd,Rd, so the two copies are interleaved the same way as the Rd and Rr fields of eor (the second copy is marked r here):
0010 | 01rd | dddd | rrrr |
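The AVR instruction set lists clr Rd as equivalent to eor Rd,Rd, whose encoding is 0010 01rd dddd rrrr. A quick Python sketch of that layout makes the interleaving easier to see; encode_eor and encode_clr are hypothetical helpers written for this post, not part of my assembler.

```python
def encode_eor(rd, rr):
    """Encode eor Rd,Rr as 0010 01rd dddd rrrr."""
    if not (0 <= rd <= 31 and 0 <= rr <= 31):
        raise ValueError("register numbers must be in the range 0-31")
    return (
        (0b001001 << 10)      # fixed opcode bits 15..10
        | ((rr & 0x10) << 5)  # top bit of Rr goes to bit 9
        | (rd << 4)           # all five bits of Rd go to bits 8..4
        | (rr & 0x0F)         # low four bits of Rr go to bits 3..0
    )


def encode_clr(rd):
    """clr Rd is eor Rd,Rd, so the same register number is used twice."""
    return encode_eor(rd, rd)


print(f"{encode_clr(16):016b}")
print(f"{encode_clr(5):016b}")
```

For r5 the two copies of the address are visible as the repeated 0101 in the low byte.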
Using the format information and the instruction ldi r16,0b00100000, a fieldValues object is created that looks like this:
fieldValues = {
opcode: "0b1110"
Rd: "r16"
K: "0b00100000"
}
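As a rough illustration of this parsing step, the snippet below applies a register-immediate regex to the ldi line. It assumes the named groups are called Rd and K to match the field names in the format definition; parse_fields is a hypothetical helper, not the code from the repository.

```python
import re

# Assumed form of the "register-immediate" regex, with groups named after the fields.
REGISTER_IMMEDIATE = re.compile(r".*\s(?P<Rd>\w+),\s*(?P<K>\w+)")


def parse_fields(line, opcode):
    """Split an instruction line into raw (string) field values."""
    match = REGISTER_IMMEDIATE.match(line)
    if match is None:
        raise ValueError(f"cannot parse: {line!r}")
    return {"opcode": opcode, **match.groupdict()}


field_values = parse_fields("ldi r16,0b00100000", "0b1110")
print(field_values)
```

At this point every value is still a string; turning them into numbers is a separate step.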
These fields are then run through an evalFields function that tries to convert them into integers.
Binary, hex, and decimal strings are converted directly to integers, but if that fails, it checks if they are labels or general purpose registers.
If they are labels, it replaces the string with the label value and repeats the process because the value of a label could be another label or string.
If it is a general purpose register (r0-r31), the address bits can vary depending on the instruction.
Instructions can have register addresses stored as 3, 4, or 5 bit numbers, which can access registers r16-23, r16-31, and r0-31 respectively.
So if the field was 4 bits long, r16 would be converted to 0, but if it was 5 bits long, it would be converted to 16.
In the case of our ldi instruction, the fields evaluate to this:
fieldValues = {
opcode: 14
Rd: 0
K: 32
}
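The evaluation rules described above could be sketched like this. eval_field is a simplified stand-in for the real evalFields function: the name, the signature, and the limit on label chains are all assumptions made for illustration.

```python
import re

REGISTER_RE = re.compile(r"^r(\d+)$")


def eval_field(value, width, labels):
    """Turn a field string into an integer (hypothetical helper)."""
    value = value.lower()
    for _ in range(16):  # guard against labels that refer to each other
        try:
            return int(value, 0)  # handles 0b..., 0x..., and plain decimal
        except ValueError:
            pass
        if value in labels:
            # A label's value may itself be a number, a register, or another label.
            value = str(labels[value]).lower()
            continue
        match = REGISTER_RE.match(value)
        if match:
            number = int(match.group(1))
            # 3- and 4-bit register fields are offsets from r16.
            offset = number if width == 5 else number - 16
            if not 0 <= offset < (1 << width):
                raise ValueError(f"{value} does not fit a {width}-bit register field")
            return offset
        raise ValueError(f"cannot evaluate {value!r}")
    raise ValueError("label chain too deep")


print(eval_field("r16", 4, {}), eval_field("r16", 5, {}))
print(eval_field("ddrb", 6, {"ddrb": "0x04"}))
```

Passing the field width in is what lets the same register name evaluate differently depending on the instruction.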
Now that we know the instruction format and the values that will go in each field, we can generate the bytecode.
The Intel HEX file expects each instruction in little-endian byte order, so I swap the two halves of the instruction.
This gives 0b0000000011100010, or 0x00E2.
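Here is a minimal sketch of how the bit positions from a format definition can be used to build the instruction word and then swap its bytes. pack_fields and LDI_LAYOUT are illustrative names; the layout is copied from the register-immediate format shown earlier, and Rd is given as the 4-bit offset for r16.

```python
def pack_fields(values, layout):
    """Place each field's bits at the positions listed in layout (MSB first)."""
    word = 0
    for name, positions in layout.items():
        value = values[name]
        for i, bit in enumerate(positions):
            shift = len(positions) - 1 - i  # positions[0] holds the field's MSB
            if (value >> shift) & 1:
                word |= 1 << bit
    return word


LDI_LAYOUT = {
    "opcode": [15, 14, 13, 12],
    "Rd": [7, 6, 5, 4],
    "K": [11, 10, 9, 8, 3, 2, 1, 0],
}

word = pack_fields({"opcode": 0b1110, "Rd": 0, "K": 0b00100000}, LDI_LAYOUT)
little_endian = word.to_bytes(2, "little")  # low byte first
print(f"{word:#06x}", little_endian.hex().upper())
```

Because only as many bits as the field has positions are copied, a split field like K needs no special handling.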
All the instructions are combined (0x00E204B905B9FFCF), the number of bytes is calculated (0x08), the checksum is calculated using the technique mentioned before (0xCD), and it is all combined into a line in the hex file.
I also add the end-of-file record (0x00000001FF) and write everything to a file:
:0800000000E204B905B9FFCFCD
:00000001FF
This is then uploaded to the Arduino using this command: avrdude -p m328p -c arduino -P /dev/ttyUSB0 -b 115200 -U flash:w:ledOn.hex
And now we have successfully reinvented the wheel. This is definitely a lot of work for something as simple as making an Arduino blink. The final project is available on GitHub at https://github.com/CalebJ2/avr-assembler. It supports around 40 instructions, multi-line comments, and several preprocessor directives, although it has a lot of limitations compared to Atmel’s assembler. Some assembly languages/assemblers, like those for MIPS, are very straightforward and only support a few preprocessor features like macros and .include, while AVR has a long list of preprocessor directives and supports things like math operations.
In conclusion, this project was interesting because I learned a lot about AVR assembly and the assembly process, and hopefully anyone who reads this did too. I’m sure there are better ways of writing an assembler, so take that part with a grain of salt. It was interesting seeing how AVR works, and I was happy to have made an assembler that supports basic directives like .include and enough instructions to be useful.
References
- https://github.com/CalebJ2/avr-assembler
- https://www.microchip.com/webdoc/avrassembler/
- https://en.wikipedia.org/wiki/Atmel_AVR_instruction_set
- https://en.wikipedia.org/wiki/Intel_HEX
- http://www.nongnu.org/avrdude/
- https://www.sparkfun.com/datasheets/Components/SMD/ATMega328.pdf
- https://github.com/lpodkalicki/blog/blob/master/avr/asm/include/m328Pdef.inc
- https://learn.sparkfun.com/tutorials/how-to-install-ch340-drivers/all
- https://www.arduino.cc/en/uploads/Main/Arduino_Uno_Rev3-schematic.pdf
- https://regex101.com
- https://pythonhosted.org/bitstring/
- https://www.instructables.com/id/Command-Line-AVR-Tutorials/
- https://www.codeproject.com/Articles/712610/AVR-Assembler
- http://nuft.github.io/avr/2015/08/02/avr-hex-programming.html. This is an excellent post on converting assembly to hex.