Chapter 3 Instructions : Language of the Computer¶
回忆起了被bhh支配的恐惧
Abstract
- Introduction
- Operations of the Computer Hardware
- Operands of the Computer Hardware
- Representing Instructions in the Computer
- Logical Operation
- Instructions for Making Decisions
- Supporting Procedures in Computer Hardware
- Communicating with People
- MIPS Addressing for 32 Bit Immediates and Addresses
Introduction¶
Instruction Characteristics
- Language of the machine
- Instructions (Statement)
- Instruction Set (Syntax)
指令集基本的结构:Operation 操作; Operand 操作数
- 不同指令集,指令的编码可以不同。如 000 表示加法,这也叫指令的 Encoding.
- 操作数位宽可以不同,可以是立即数/寄存器/内存。
冯诺依曼架构:指令由数字方式被存储,可以进行读写。
Operations of the Computer Hardware¶
Design Principle
- 简单源于规整
- 更少则更快
- 加快常用指令
- 优秀的设计需要适当的折中
Operands of the Computer Hardware¶
- Arithmetic instructions operands must be registers or immediate
- 32 registers in RISC-V
- 64 bits for each register in RISC-V
- Design Principle 2
- Smaller is faster(越少越快,寄存器个数一般不超过32个)
-
RISC-v register operand
- Size is 64 bits, which named doubleword
Name Register Name Usage Preserved or call? x0 0 The constant value 0常量0 n.a. x1(ra) 1 Return address(link register)返回地址 yes x2(sp) 2 Stack pointer栈指针 yes x3(gp) 3 Global pointer全局指针(可以任意指向栈上存储的变量) yes x4(tp) 4 Thread pointer yes x5-x7 5-7 Temporaries no x8-x9 8-9 Saved yes x10-x17 10-17 Arguments/results no x18-x27 18-27 Saved yes x28-x31 28-31 Temporaries now
为什么寄存器 x0
一直为 0
Make the common fast. 因为经常有 0 参与计算,将其存在一个寄存器中,便于计算。
Memory Operands¶
- Advantage
- Could save much more data
- Save complex data structures
- Arrays and structures
-
Data transfer instructions
- Load: from memory to register; load doubleword ( ld )
- Store: from register to memory; store doubleword( sd )
-
Memory is byte addressed.
- RISC-V is Little Endian
- RISC-V dose not require words to be aligned in memory (节省空间)
- words align: 一个字是 4 字节,我们要求字的起始地址一定要是 4 的倍数。
Big/Little endian
- Big endian:
- 数据的高字节存放在低地址
- 数据的低字节存放在高地址
- PowerPC
- Little endian:
- 数据的高字节存放在高地址
- 数据的低字节存放在低地址
- RISC-V
Little vs Big Endian
Memory Alignment
第一个是对齐的,第二个是不对齐的。
Memory Operand Example
(默认数组是双字的, h inx21
, base address of A in x22
) 翻译为 RISC-V 代码得到
地址是以 byte 为单位,所以要偏移 \(8\times 8=64\) bytes.load
和 store
是唯二可以访问存储器的指令。
how to represent g = h + A[i]
Registers vs. Memory¶
- Registers are faster to access than memory
- Operating on memory data requires loads and stores
- Compiler must use registers for variables as much as possible
编译器尽量使用寄存器存变量。只有在寄存器不够用时,才会把不太用的值放回内存。
Constant or Immediate Operands(立即数)¶
Immediate: Other method for adding constant
- Avoids the load instruction
- Offer versions of the instruction
e.g.addi x22, x22, 4
Summary
- 为什么内存是 \(2^{61}\) 个 doublewords?
可以表示的地址有这么多,因为我们以 64 位寄存器为基址,可以表示的双字就是 \(2^{64}/2^3=2^{61}\) (这里 \(2^3\) 表示 8 个字节,即双字). 即我们的load
指令可以访问的范围有这么大。
Non-Leaf Procedures¶
- Procedures that call other procedures
- For nested call, caller needs to save on the stack:
- Its return address
- Any arguments and temporaries needed after the call
- Restore from the stack after the call
Signed and Unsigned Number¶
本部分已在chapter 2中记录,此处不再赘述
Representing Instructions in the Computer¶
- All information in computer consists of binary bits.
- Instructions are encoded in binary
called machine code (机器码) - Mapping registers into numbers
0 for registerx0
, 31 for registerx31
. e.t.c. - RISC-V instructions
32 位指令编码。所有指令都是规则化的,即一部分是 opcode, 一部分是 operands 等等。
Translating assembly into machine instruction
Summary of RISC-V architecture
R-format¶
- opcode: operaion code操作码
- rd: destination register number目的操作数寄存器,用于存储操作结果
- funct3: 3-bit function code(additional opcode)额外的操作码字段 例如,我们加法减法可以做成一个 opcode, 然后利用 funct 进行选择。
- rs1/rs2: the first/second source register number
-
funct7: 7-bit function code(additional opcode)额外的操作码字段
-
All instructions in RISC-V have the same length
- Conflict:same length←--→single instruction
I-format¶
- Immediate arithmetic and load instructions
e.g.addi
,ld
- rs1: source or base address register number
- immediate: constant operand, or offset added to base
address
将 rs2, funct7 合并了,得到 12 位立即数
S-format¶
- rs1: base address register number
- rs2: source opearand register number
- immediate:
- Split so that rs1 and rs2 fields always in the same place.
Example
Stored Program Computer
- Instructions represented in binary, like data.
- Instructions and data stored in memory.
- Programs can operate on programs. e.g. compiplers, linkers.
- Binary compatibility allows compiled programs to work on different computers
summary of R-, I-, S-type instruction format¶
这里可以很清楚的看到三种不同的指令尽量把共性的部分都对齐了
Logical Operations¶
逻辑操作可以用于提取字内的字段
Operation | C | Java | RISC-V |
---|---|---|---|
Shift left | << | << | slli |
Shift right | >> | >>> | srli |
Bit-by-by AND | & | & | and, andi |
Bit-by-by OR | | | | | or, ori |
Bit-by-by XOR | ^ | ^ | xor, xori |
Bit-by-by NOT | ~ | ~ | - |
bitwise NOT 要用异或实现(与全 F 异或)
Shift¶
- I 型指令
- 为什么还有
funct6
移位不需要这么多立即数,只要六位 (\(2^6=64\)) 即可。 - 左移 i 位相当于乘 \(2^i\), 右移 i 位相当于除 \(2^i\).
AND¶
OR¶
XOR¶
Instructions for making decisions(用于决策的指令)¶
Branch instructions¶
beq reg1, reg2, Label
相等则跳转bne reg1, reg2, Label
不相等则跳转
Example
store 的立即数是作为数据的地址, beq 的立即数是作为运算的地址(加到 PC 上)因此二者的指令类型不同。
跳转的范围有限制,因为立即数只有 12 位。(PC 相对寻址,以当前程序位置为基准前后跳)
Loop
slt instruction¶
set on if less than.
slt x2, x3, x4 # x2=1 if x3 < x4
R 型指令
Example
More Conditional Operations¶
blt rs1, rs2, L1
若rs1<rs2
则跳到 L1bge rs1, rs2, L1
若rs1>=rs2
则跳到 L1
slt(Set on less than)instruction¶
- set on less than -slt
- If the first reg. is less than second reg. then sets third reg to 1
slt x2, x3, x4
:x2=1
ifx3 < x4
- Example: Compiling a less than test
Signed vs. Unsigned¶
默认是有符号数进行比较
- Signed comparison:
blt
,bge
- Unsigned comparison:
bltu
,bgeu
Case/Switch¶
Compiling a switch using jump address table
\(x_6\) 是跳转表的基地址,\(x_7\leftarrow x_6+8*k\)
jalr x1, 0(x7)
把下一条指令的地址 PC+4放入
x1寄存器,随后跳向
[x7] + 0的地方。
这里我们
jalr x0, ...因为我们不能改变
x0` 寄存器,所以这里仅用作占位,并没有实际存储返回地址。
I 型指令
A basic block is a sequence of instructions with
- No embedded branches (except at end)
- No branch targets (except at beginning)
Supporting Procedures in Computer Hardware¶
Procedure/function --- be used to structure programs
为了完成特定任务。易于理解,可以复用。
调用函数的步骤
- Place Parameters in a place where the procedure can access them (in registers
x10~x17
)
传参 - Transfer control to the procedure
控制权给子程序 - Acquire the storage resources needed for the procedure
- Perform the desired task
- Place the result value in a place where the calling program can access it
- Return control to the point of origin (address in
x1
)
Procedure Call Instructions¶
- Procedure call: jump and link
jal x1, ProcedureLabel
- Address of following instruction put in
x1
- Jumps to target address
- Address of following instruction put in
- Procedure return: jump and link register
jalr x0, 0(x1)
- Like jal, but jumps to
0 + address in x1
- Use
x0
as rd (x0
cannot be changed) - Can also be used for computed jump
- Like jal, but jumps to
不能用 jal
跳回来,跳进函数的地址的是固定的, Label 一定。但是跳回来的地址不一定,要用 x1
存储才能跳回。
Using More Registers¶
- Registers for procedure calling
x10~x17
(a0~a7
): eight argument registers to pass parameters or return values
用来传参的x1
: one return address register to return to origin point
- Stack:Ideal data structure for spilling registers
- Push, pop
- Stack pointer (
sp
):x2
指向最栈顶,即最后一个有效数据所在的位置
- Stack grow from higher address to lower address
- Push:
sp = sp - 8
- Pop:
sp = sp + 8
- Push:
Compiling a leaf procedure
Name | Register Name | Usage | Preserved or call? |
---|---|---|---|
x0(zero) | 0 | The constant value 0 | n.a. |
x1(ra) | 1 | Return address(link register) | yes |
x2(sp) | 2 | Stack pointer | yes |
x3(gp) | 3 | Global pointer | yes |
x4(tp) | 4 | Thread pointer | yes |
x5-x7(t0-t2) | 5-7 | Temporaries | no |
x8(s0/fp) | 8 | Saved/frame pointer | yes |
x9(s1) | 9 | Saved | yes |
x10-x17(a0-a7) | 10-17 | Arguments/results | no |
x18-x27(s2-s11) | 18-27 | Saved | yes |
x28-x31(t3-t6) | 28-31 | Temporaries | no |
PC | - | Auipc(Add Upper Immediate to PC) | yes |
t0~t7
临时寄存器,不需要在函数中保存 编译器会优先调用s0~s11
saved registers
标有 Preserved 表明我们需要在函数开始时保存该寄存器的值,并在离开函数前恢复寄存器的值。
- 一些编译器使用
fp
和x8
指向过程帧的第一个字 - 我们可以通过维护稳定的
sp
来减少对fp
的使用,仅在进入和退出才调整栈。
Nested Procedure(嵌套过程)¶
- 不调用其他过程的过程称为叶子过程(leaf)
Example
Preserved or not
寄存器一般靠堆栈保存, sp 靠加减保存。
Local Data on the Stack
Communicating with People¶
- Byte-encoded character sets
e.g. ASCII, Latin-1 - Unicode: 32-bit character set
e.g. UTF-8, UTF-16
编码中有不同长度的数据,因此也需要有不同长度的 load 和 store.
-
Load byte/halfword/word: Sign extend to 64 bits in rd
我们的寄存器是 64 位的,因此需要扩充。lb rd, offset(rs1)
lh rd, offset(rs1)
lw rd, offset(rs1)
ld rd, offset(rs1)
Example
同样是取 A[4] 的值,不同的数据类型 offset 不同。
char
为 4,short int
为 8,int
为 16. -
Load byte/halfword/word unsigned: 0 extend to 64 bits in rd
lbu rd, offset(rs1)
lhu rd, offset(rs1)
lwu rd, offset(rs1)
- Store byte/halfword/word: Store rightmost 8/16/32 bits
sb rs2, offset(rs1)
sh rs2, offset(rs1)
sw rs2, offset(rs1)
存储就不需要考虑扩充问题,我们不做运算,只是把对应部分放到对应位置。
offset 可以是 3. 因为 RISC-V 是可以不对齐的。(实际上 sh offset 一般是 2 的倍数, sw 是 4 的倍数)
Compiling a string copy procedure
i 不应该分配给 s3, 分配给一个临时寄存器,就可以不用堆栈保存 s3 了。
对于一个 leaf procedure(不再调用其他 procedure) 编译器要尽可能用完所有的临时寄存器,再去用其他的寄存器。
为什么强调 leaf procedure? - 因为对于非 leaf 的函数,可能临时变量会被调用后的函数改变,这样就不能继续用了。
RISC-V Addressing for 32-Bit Immediate and Addresses¶
Wide Bit Immediate addressing¶
如何将一个寄存器初始化为一个任意的立即数。
lui reg, imm
可以把 20 位的常数放到寄存器中。(U-type)
注意这里,我们会把立即数放入寄存器的 [31:12] 位,低位会填充为 0.
Loading a 32-bit constant
我们最终想放入寄存器的值是 32 位常数 0x003D0
. 先利用 lui
将高 20 位 976 放入寄存器中,随后利用加法指令加上 低 12 位,即 2304.
跳转指令的复杂马德老师认为无法解释,在第三章先记忆了化简的
Branch Addressing(分支寻址)¶
SB-type
- PC-relative addressing
\(Target\ address = PC + Branch\ offset = PC + immediate \times 2\)
这里低位会默认补 0. 这样可以把地址范围扩大一倍。
Jump Addressing(跳转寻址)¶
UJ-type(只有 jal
指令)
20-bit immediate for larger range, 低位默认补 0, 故实际表示立即数为 [20:0] 共 21 位。
Example
RISC-V 直接用 PC 算 offset, 而非 PC+4.
bne
等跳转的范围是4k,jal
是\(2^20\),但实际上,这仍然不够
- All RISC-V instructions are 4 bytes long
- PC-relative addressing refers to the number of halfwords
While branch target is far away
* Inserts an unconditional jump to target
Invert the condition so that the branch decides whether to skip the jump.
Branching far away
RISC-V Addressing Summary¶
寻址方式是指令集的核心区别。
- 立即数寻址
addi x5, x6, 4
- 寄存器寻址
add x5, x6, x7
- 基址寻址
ld x5,100(x6)
- PC相对寻址
beq x5,x6,L1
Example
RISC-V Disassembly¶
把机器码翻译为汇编指令。
- opcode
先看 opcode, 确定是哪类指令,随后就可以进行具体划分了。
Example
Synchronization in RISC-V¶
- Two processors sharing an area of memory
- P1 writes, then P2 reads
- Data race if P1 and P2 don’t synchronize
- Result depends of order of accesses
- Hardware support required
- Atomic read/write memory operation
- No other access to the location allowed between the read and write
- Could be a single instruction
- e.g. atomic swap* of register ↔ memory
- Or an atomic pair of instructions
Load reserved: lr.d rd,(rs1)
把地址 rs1 的值放到寄存器 rd 中,同时
Store conditional: sc.d rd,(rs1),rs2
把寄存器 rs2 的值放入地址 rs1.
如果成功那么 rd 里面是 0. 如果上条指令 load 后,这个地方的值被改变了,那么就失败了,返回 0.
atomic swap
lock
地址 x20 放的是锁,如果锁为 0, 说明我们现在可以存入数据,则我们获得锁随后存入,并释放锁。否则需要等锁释放了才能存。
Translating and starting a program¶
Object file¶
- 对象文件头(其它部分的大小和位置)
- 文本段
- 静态数据段和动态变量
- 当程序加载到存储器中时,重定位信息指令和数据字的绝对地址
- 符号表
- 调试信息
Producing an Object Module¶
Provides information for building a complete program from the pieces(Header).
- Text segment: translated instructions
- Static data segment: data allocated for the life of the program
- Relocation info: for contents that depend on absolute location of loaded program
- Symbol table: global definitions and external refs
- Debug info: for associating with source cod
Link¶
Object modules(including library routine) \(\rightarrow\) executable program
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both the internal and external references (Address of invoke)
Loading a Program¶
Load from image file on disk into memory
- Read header to determine segment sizes
- Create virtual address space
- Copy text and initialized data into memory
Or set page table entries so they can be faulted in - Set up arguments on stack
- Initialize registers (including sp, fp, gp)
- Jump to startup routine
Copies arguments to x10, … and calls main pWhen main returns, do exit syscall
Dynamic Linking¶
Only link/load library procedure when it is called.
静态链接已经编入文件了,动态链接是在运行时链接,可以用到最新的代码
- Requires procedure code to be relocatable
- Avoids image bloat caused by static linking of all (transitively) referenced libraries
- Automatically picks up new library versions
Arrays versus Pointers¶
指针是可以改变的,但是数组首地址不能改变,因此翻译成汇编的结果也有所不同。
Clearing an Array
Real Stuff: MIPS Instructions¶
MIPS: commercial predecessor to RISC-V
- Similar basic set of instructions
- 32-bit instructions
- 32 general purpose registers, register 0 is always 0
- 32 floating-point registers
- Memory accessed only by load/store instructions
- Consistent use of addressing modes for all data sizes
Different conditional branches
- For <, <=, >, >=
- RISC-V:
blt, bge, bltu, bgeu
- MIPS:
slt, sltu
(set less than, result is 0 or 1) - Then use
beq
, bne to complete the branch
Real Stuff: The Intel x86 ISA 发展历史 了解即可
Other RISC-V Instructions¶
- Base integer instructions (RV64I)
- Those previously described, plus
auipc rd, immed
// rd = (imm<<12) + pc- follow by
jalr
(adds 12-bit immed) for long jump slt, sltu, slti, sltui
: set less than (like MIPS)addw, subw, addiw
: 32-bit add/subsllw, srlw, srlw, slliw, srliw, sraiw
: 32-bit shift
- 32-bit variant: RV32I
registers are 32-bits wide, 32-bit operations
RV64I 是最基本的。编译器需要知道当前执行是基于多少位的处理器。
Instruction Set Extensions
- M: integer multiply, divide, remainder
- A: atomic memory operations
- F: single-precision floating point
- D: double-precision floating point
- C: compressed instructions
16 位的指令,用于低成本产品(嵌入式)
Fallacies (谬误)
- Powerful instruction \(\Rightarrow\) higher performance
- Use assembly code for high performance
- Backward compatibility \(\Rightarrow\) instruction set doesn’t change
Pifalls (陷阱)
- Sequential words are not at sequential addresses (应该 +4)
- Keeping a pointer to an automatic variable after procedure returns