x86-64
Assembly Language
Programming
with
Ubuntu
Ed Jorgensen, Ph.D.
Version 1.1.40
January 2020
Cover image:
Top view of an Intel central processing unit Core i7 Skylake type core,
model 6700K, released in June 2015.
Source: Eric Gaba, https://commons.wikimedia.org/wiki/File:Intel_CPU_Core_i7_6700K_Skylake_top.jpg
Cover background:
By Benjamint444 (Own work)
Source: http://commons.wikimedia.org/wiki/File%3ASwirly_belt444.jpg
Table of Contents
1.0 Introduction
  1.1 Prerequisites
  1.2 What is Assembly Language
  1.3 Why Learn Assembly Language
    1.3.1 Gain a Better Understanding of Architecture Issues
    1.3.2 Understanding the Tool Chain
    1.3.3 Improve Algorithm Development Skills
    1.3.4 Improve Understanding of Functions/Procedures
    1.3.5 Gain an Understanding of I/O Buffering
    1.3.6 Understand Compiler Scope
    1.3.7 Introduction Multi-processing Concepts
    1.3.8 Introduction Interrupt Processing Concepts
  1.4 Additional References
    1.4.1 Ubuntu References
    1.4.2 BASH Command Line References
    1.4.3 Architecture References
    1.4.4 Tool Chain References
      1.4.4.1 YASM References
      1.4.4.2 DDD Debugger References
2.0 Architecture Overview
  2.1 Architecture Overview
  2.2 Data Storage Sizes
  2.3 Central Processing Unit
    2.3.1 CPU Registers
      2.3.1.1 General Purpose Registers (GPRs)
      2.3.1.2 Stack Pointer Register (RSP)
      2.3.1.3 Base Pointer Register (RBP)
      2.3.1.4 Instruction Pointer Register (RIP)
      2.3.1.5 Flag Register (rFlags)
      2.3.1.6 XMM Registers
    2.3.2 Cache Memory
  2.4 Main Memory
  2.5 Memory Layout
Illustration Index
Illustration 1: Computer Architecture
Illustration 2: CPU Block Diagram
Illustration 3: Little-Endian Data Layout
Illustration 4: General Memory Layout
Illustration 5: Memory Hierarchy
Illustration 6: Overview: Assemble, Link, Load
Illustration 7: Little-Endian, Multiple Variable Data Layout
Illustration 8: Linking Multiple Files
Illustration 9: Initial Debugger Screen
Illustration 10: Debugger Screen with Breakpoint Set
Illustration 11: Debugger Screen with Green Arrow
Illustration 12: DDD Command Bar
Illustration 13: Register Window
Illustration 14: MOV Instruction Overview
Illustration 15: Integer Multiplication Overview
Illustration 16: Integer Division Overview
Illustration 17: Logical Operations
Illustration 18: Logical Shift Overview
Illustration 19: Logical Shift Operations
Illustration 20: Arithmetic Left Shift
Illustration 21: Arithmetic Right Shift
Illustration 22: Process Memory Layout
Illustration 23: Process Memory Layout Example
Illustration 24: Stack Frame Layout
Illustration 25: Stack Frame Layout with Red Zone
Illustration 26: Stack Call Frame Example
Illustration 27: Stack Call Frame Corruption
Illustration 28: Argument Vector Layout
Illustration 29: Privilege Levels
Illustration 30: Interrupt Processing Overview
Chapter 1
If you give someone a program, you will frustrate them for a day; if you teach them
to program, you will frustrate them for a lifetime.
1.0 Introduction
The purpose of this text is to provide a reference for university-level assembly language
and systems programming courses. Specifically, this text addresses the x86-64
instruction set for the popular x86-64 class of processors using the Ubuntu 64-bit
Operating System (OS). While the provided code and various examples should work
under any Linux-based 64-bit OS, they have only been tested under Ubuntu 14.04 LTS
(64-bit).
The x86-64 is a Complex Instruction Set Computing (CISC) CPU design. This refers
to the internal processor design philosophy. CISC processors typically include a wide
variety of instructions (sometimes overlapping), varying instruction sizes, and a wide
range of addressing modes. The term was retroactively coined in contrast to Reduced
Instruction Set Computer (RISC).
1.1 Prerequisites
It must be noted that the text is not geared toward learning how to program. It is
assumed that the reader has already become proficient in a high-level programming
language. Specifically, the text is generally geared toward a compiled, C-based high-
level language such as C, C++, or Java. Many of the explanations and examples assume
the reader is already familiar with programming concepts such as declarations,
arithmetic operations, control structures, iteration, function calls, functions, indirection
(i.e., pointers), and variable scoping issues.
Additionally, the reader should be comfortable using a Linux-based operating system
including using the command line. If the reader is new to Linux, the Additional
References section has links to some useful documentation.
Chapter 1.0 ◄ Introduction
1.3.1 Gain a Better Understanding of Architecture Issues
Learning and spending some time working at the assembly language level provides a
richer understanding of the underlying computer architecture. This includes the basic
instruction set, processor registers, memory addressing, hardware interfacing, and Input/
Output. Since ultimately all programs execute at this level, knowing the capabilities of
assembly language provides useful insights into what is possible, what is easy, and what
might be more difficult or slower.
1.4.1 Ubuntu References
There is significant documentation available for the Ubuntu OS. The principal user
guides are as follows:
◦ Ubuntu Community Wiki
◦ Getting Started with Ubuntu 16.04
In addition, there are many other sites dedicated to providing help using Ubuntu (or
other Linux-based OSes).
Chapter 2
Warning, keyboard not found. Press enter to continue.
[Illustration 1: Computer Architecture. Block diagram showing the CPU and Primary Storage (Random Access Memory, RAM) connected by the BUS (Interconnection).]
Chapter 2.0 ◄ Architecture Overview
For example, C/C++ declarations are mapped as follows:
C/C++ Declaration   Storage       Size (bits)   Size (bytes)
char                Byte          8-bits        1 byte
short               Word          16-bits       2 bytes
int                 Double-word   32-bits       4 bytes
unsigned int        Double-word   32-bits       4 bytes
long⁵               Quadword      64-bits       8 bytes
long long           Quadword      64-bits       8 bytes
char *              Quadword      64-bits       8 bytes
int *               Quadword      64-bits       8 bytes
float               Double-word   32-bits       4 bytes
double              Quadword      64-bits       8 bytes
The asterisk indicates an address variable. For example, int * means the address of
an integer. Other high-level languages typically have similar mappings.
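The mapping in the table can be checked on a given machine. The following is a minimal sketch, assuming Python's standard ctypes module is available; ctypes reports the sizes used by the platform's C compiler, so the values for long and for the address types match the table only on a 64-bit Linux system such as the one assumed in this text. The names C_TYPES and size_in_bytes are illustrative and are not part of the original text.

```python
import ctypes

# Map each C/C++ declaration from the table to the matching ctypes type.
# The sizes reported for 'long' and for pointers are platform dependent;
# the table values assume a 64-bit Linux system.
C_TYPES = {
    "char": ctypes.c_char,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "unsigned int": ctypes.c_uint,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
    "char *": ctypes.c_char_p,
    "int *": ctypes.POINTER(ctypes.c_int),
    "float": ctypes.c_float,
    "double": ctypes.c_double,
}


def size_in_bytes(declaration):
    """Return the storage size, in bytes, of a C declaration from the table."""
    return ctypes.sizeof(C_TYPES[declaration])


if __name__ == "__main__":
    for name in C_TYPES:
        size = size_in_bytes(name)
        print(f"{name:<14} {size * 8:>2}-bits  {size} byte(s)")
```

Running the sketch prints one line per declaration, which can be compared against the table row by row.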
5 Note, the 'long' type declaration is compiler dependent. Type shown is for gcc and g++ compilers.
6 For more information, refer to: http://en.wikipedia.org/wiki/Central_processing_unit
7 For more information, refer to: http://en.wikipedia.org/wiki/Die_(integrated_circuit)
8 For more information, refer to: http://en.wikipedia.org/wiki/Arithmetic_logic_unit
9 For more information, refer to: http://en.wikipedia.org/wiki/Processor_register
10 For more information, refer to: http://en.wikipedia.org/wiki/Cache_(computing)
Additionally, some of the GPR registers are used for dedicated purposes as described in
the later sections.
When using data element sizes less than 64-bits (i.e., 32-bit, 16-bit, or 8-bit), the lower
portion of the register can be accessed by using a different register name as shown in the
table.
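For example, when accessing the lower portions of the 64-bit rax register, the layout is
as follows:
|<----------------------- rax (64 bits) ------------------------>|
                                |<------- eax (32 bits) -------->|
                                                |< ax (16 bits) >|
                                                |  ah   |   al   |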
As shown in the diagram, the first four registers, rax, rbx, rcx, and rdx also allow the
bits 8-15 to be accessed with the ah, bh, ch, and dh register names. With the exception
of ah, these are provided for legacy support and will not be used in this text.
The ability to access portions of the register means that, if the quadword rax register is
set to 50,000,000,000₁₀ (fifty billion), the rax register would contain the following value
in hex.
rax = 0000 000B A43B 7400
If a subsequent operation sets the word ax register to 50,000₁₀ (fifty thousand, which is
C350₁₆), the rax register would contain the following value in hex.
rax = 0000 000B A43B C350
In this case, when the lower 16-bit ax portion of the 64-bit rax register is set, the upper
48-bits are unaffected. Note the change in AX (from 7400₁₆ to C350₁₆).
If a subsequent operation sets the byte sized al register to 50₁₀ (fifty, which is 32₁₆), the
rax register would contain the following value in hex.
rax = 0000 000B A43B C332
When the lower 8-bit al portion of the 64-bit rax register is set, the upper 56-bits are
unaffected. Note the change in AL (from 50₁₆ to 32₁₆).
For 32-bit register operations, the upper 32-bits are cleared (set to zero). Generally, this
is not an issue since operations on 32-bit registers do not use the upper 32-bits of the
register. For unsigned values, this can be useful to convert from 32-bits to 64-bits.
However, this will not work for signed conversions from 32-bit to 64-bit values.
Specifically, it will potentially provide incorrect results for negative values. Refer to
Chapter 3, Data Representation for additional information regarding the representation
of signed values.
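The partial register behavior described above can be modeled with ordinary integer masks. The following is a minimal sketch in Python rather than assembly; it only imitates the effect of writing to al, ax, or eax on the 64-bit rax value, and the function names write_partial and show are illustrative assumptions, not part of the original text.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF  # a register holds exactly 64 bits


def write_partial(rax, value, width):
    """Model writing `value` into the low `width` bits (8, 16, or 32) of rax."""
    low_mask = (1 << width) - 1
    if width == 32:
        # 32-bit writes clear the upper 32 bits of the 64-bit register.
        return value & low_mask
    # 8-bit and 16-bit writes leave the upper bits unaffected.
    return (rax & (MASK64 ^ low_mask)) | (value & low_mask)


def show(rax):
    """Format a 64-bit value as four groups of four hex digits."""
    digits = f"{rax:016X}"
    return " ".join(digits[i:i + 4] for i in range(0, 16, 4))


rax = 50_000_000_000
print("rax =", show(rax))              # 0000 000B A43B 7400
rax = write_partial(rax, 50_000, 16)   # set ax to fifty thousand
print("rax =", show(rax))              # 0000 000B A43B C350
rax = write_partial(rax, 50, 8)        # set al to fifty
print("rax =", show(rax))              # 0000 000B A43B C332
rax = write_partial(rax, -1, 32)       # set eax to -1 (FFFF FFFF as 32 bits)
print("rax =", show(rax))              # 0000 0000 FFFF FFFF
```

The last step illustrates the signed conversion problem: the 32-bit pattern for -1 becomes 4,294,967,295 when the cleared upper half is read as part of a 64-bit value.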
Name        Symbol   Bit   Use
Sign        SF       7     Used to indicate if the result of the previous
                           operation resulted in a 1 in the most significant
                           bit (indicating negative in the context of signed
                           data).
Direction   DF       10    Used to specify the direction (increment or
                           decrement) for some string operations.
Overflow    OF       11    Used to indicate if the previous operation
                           resulted in an overflow.
There are a number of additional bits not specified in this text. More information can be
obtained from the additional references noted in Chapter 1, Introduction.
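Since each flag is a single bit at a fixed position, a flag can be read by shifting and masking. The following is a minimal sketch in Python using only the three bit positions listed above; the sample rFlags value 0x0880 and the names FLAG_BITS and flag_is_set are assumptions chosen for illustration.

```python
# Bit positions taken from the table above.
FLAG_BITS = {"SF": 7, "DF": 10, "OF": 11}


def flag_is_set(rflags, name):
    """Return True if the named flag bit is 1 in the given rFlags value."""
    return bool((rflags >> FLAG_BITS[name]) & 1)


# Hypothetical register contents: bits 7 and 11 set, bit 10 clear.
sample = 0x0880
for flag in FLAG_BITS:
    print(flag, flag_is_set(sample, flag))
```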
xmm13
xmm14
xmm15
Note, some of the more recent x86-64 processors support 256-bit registers (the YMM
registers, whose lower 128 bits are the XMM registers). This will not be an issue for the
programs in this text.
Additionally, the XMM registers are used to support the Streaming SIMD Extensions
(SSE). The SSE instructions are out of the scope of this text. More information can be
obtained from the Intel references (as noted in Chapter 1, Introduction).
A block diagram of a typical CPU chip configuration is as follows:
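[Illustration 2: CPU Block Diagram. A CPU chip containing Core 0 and Core 1, each with its own L1 cache, above a shared L2 cache that connects to the BUS.]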
Current chip designs typically include an L1 cache per core and a shared L2 cache.
Many of the newer CPU chips will have an additional L3 cache.
As can be noted from the diagram, all memory accesses travel through each level of
cache. As such, there is a potential for multiple, duplicate copies of the value (CPU
register, L1 cache, L2 cache, and main memory). This complication is managed by the
CPU and is not something the programmer can change. Understanding the cache and
associated performance gain is useful in understanding how a computer works.
For a double-word (32-bits), the MSB and LSB are allocated as shown below.
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
MSB LSB
[Illustration 3: Little-Endian Data Layout (excerpt)]
          value   address
var1 →    40      01001008
          ?       01001007
Based on the little-endian architecture, the LSB is stored in the lowest memory address
and the MSB is stored in the highest memory location.
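The byte ordering can be reproduced with Python's built-in integer conversion. The following is a minimal sketch; the value 5,000,000 (004C4B40 in hex) is an assumption chosen so that its least significant byte is 40, and the base address 01001008 is used only as an example starting location. The function name little_endian_layout is illustrative.

```python
def little_endian_layout(value, base_address, size=4):
    """Return (address, byte) pairs for `value` stored in little-endian order."""
    data = value.to_bytes(size, byteorder="little")
    return [(base_address + offset, byte) for offset, byte in enumerate(data)]


# Hypothetical double-word variable and starting address.
for address, byte in little_endian_layout(5_000_000, 0x01001008):
    print(f"{address:08X}  {byte:02X}")
```

The first line printed is the lowest address, which holds the LSB; the last line is the highest address, which holds the MSB.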
2.5 Memory Layout
The general memory layout for a program is as shown:
The reserved section is not available to user programs. The text (or code) section is
where the machine language11 (i.e., the 1's and 0's that represent the code) is stored. The
data section is where the initialized data is stored. This includes declared variables that
have been provided an initial value at assemble-time. The uninitialized data section,
typically called the BSS section, is where declared variables that have not been provided
an initial value are stored. If accessed before being set, the value will not be meaningful.
The heap is where dynamically allocated data will be stored (if requested). The stack
starts in high memory and grows downward.
Later sections will provide additional detail for the text and data sections.
Drives (SSD's) are larger, slower, and less expensive. The overall goal is to balance
performance with cost.
An overview of the memory hierarchy is as follows:
[Illustration 5: Memory Hierarchy. A triangle with the following levels, from top to bottom:]
CPU Registers
Cache
Primary Storage: Main Memory (RAM)
Secondary Storage (disk drives, SSD's, etc.)
Tertiary Storage (remote storage, optical, backups, etc.)
The top of the triangle represents the fastest, smallest, and most expensive
memory. As we move down levels, the memory becomes slower, larger, and less
expensive. The goal is to use an effective balance between the small, fast, expensive
memory and the large, slower, and cheaper memory.
Some typical performance and size characteristics are as follows:
Based on this table, a primary storage access at 100 nanoseconds (100 × 10⁻⁹ seconds) is
30,000 times faster than a secondary storage access at 3 milliseconds (3 × 10⁻³ seconds).
The typical speeds improve over time (and these are already out of date). The key point
is the relative difference between each memory unit is significant. This difference
between the memory units applies even as newer, faster SSDs are being utilized.
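The 30,000 figure follows directly from expressing both access times in the same unit. The following is a minimal sketch of that arithmetic in Python, using the two access times quoted above; the names are illustrative only.

```python
NANOSECONDS_PER_MILLISECOND = 1_000_000


def times_faster(fast_ns, slow_ns):
    """Return how many times faster the first access time is than the second."""
    return slow_ns / fast_ns


primary_ns = 100                                 # 100 nanoseconds
secondary_ns = 3 * NANOSECONDS_PER_MILLISECOND   # 3 milliseconds
print(times_faster(primary_ns, secondary_ns))    # 30000.0
```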
2.7 Exercises
Below are some questions based on this chapter.
Chapter 3
There are 10 types of people in the world; those that understand binary and those that
don't.
Chapter 3.0 ◄ Data Representation
As you may recall from C, C++, or Java, an integer declaration (e.g., int <variable>) is
a single double-word which can be used to represent values between −2³¹
(−2,147,483,648) and +2³¹ − 1 (+2,147,483,647).
The following table shows the ranges associated with typical sizes:
In order to determine if a value can be represented, you will need to know the size of the
storage element (byte, word, double-word, quadword, etc.) being used and if the values
are signed or unsigned.
• For representing unsigned values within the range of a given storage size,
standard binary is used.
• For representing signed values within the range, two's complement is used.
Specifically, the two's complement encoding process applies to the values in the
negative range. For values within the positive range, standard binary is used.
For example, the unsigned byte range can be represented using a number line as follows:
0 255
For example, the signed byte range can also be represented using a number line as
follows:
-128 0 +127
The same concept applies to halfwords and words which have larger ranges.
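The range for any storage size follows from the number of bits. The following is a minimal sketch in Python that computes the unsigned and signed (two's complement) ranges for the byte, word, double-word, and quadword sizes; the function names are illustrative assumptions.

```python
def unsigned_range(bits):
    """Smallest and largest unsigned values that fit in `bits` bits."""
    return 0, 2**bits - 1


def signed_range(bits):
    """Smallest and largest two's complement values that fit in `bits` bits."""
    return -(2**(bits - 1)), 2**(bits - 1) - 1


SIZES = {"byte": 8, "word": 16, "double-word": 32, "quadword": 64}
for name, bits in SIZES.items():
    print(name, unsigned_range(bits), signed_range(bits))
```

To decide whether a value can be represented, compare it against the pair returned for the chosen size and signedness.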
Since unsigned values have a different, positive only, range than signed values, there is
overlap between the values. This can be very confusing when examining variables in
memory (with the debugger).
For example, when the unsigned and signed values are within the overlapping positive
range (0 to +127):
• A signed byte representation of 12₁₀ is 0x0C₁₆
• An unsigned byte representation of 12₁₀ is also 0x0C₁₆
When the unsigned and signed values are outside the overlapping range:
• A signed byte representation of -15₁₀ is 0xF1₁₆
• An unsigned byte representation of 241₁₀ is also 0xF1₁₆
This overlap can cause confusion unless the data types are clearly and correctly defined.
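The overlap can be demonstrated by encoding and decoding byte values under both interpretations. The following is a minimal sketch in Python, assuming the usual two's complement rule that a negative value is stored as the value plus 2 raised to the number of bits; the function names encode and decode are illustrative.

```python
def encode(value, bits=8, signed=False):
    """Return the bit pattern storing `value` in `bits` bits."""
    if signed:
        low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        low, high = 0, (1 << bits) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking maps negative values to their two's complement pattern.
    return value & ((1 << bits) - 1)


def decode(pattern, bits=8, signed=False):
    """Interpret a bit pattern as an unsigned or signed value."""
    if signed and pattern & (1 << (bits - 1)):
        return pattern - (1 << bits)
    return pattern


print(hex(encode(12, signed=True)), hex(encode(12)))     # same pattern
print(hex(encode(-15, signed=True)), hex(encode(241)))   # same pattern
print(decode(0xF1, signed=True), decode(0xF1))           # two meanings
```

The same stored byte yields -15 or 241 depending only on which interpretation the program applies.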
Note, all bits for the given size, byte in this example, must be specified.
Note, all bits for the given size, words in these examples, must be specified.
248 = F8 -8 = F8
The final result of 0xF8 may be interpreted as 248 for unsigned representation and -8 for
a signed representation. Additionally, 0xF8₁₆ is the ° (degree symbol) in some extended
ASCII tables (for example, code page 437).
As such, it is very important to have a clear definition of the sizes (byte, halfword, word,
etc.) and types (signed, unsigned) of data for the operations being performed.
3.3.1 IEEE 32-bit Representation
The IEEE 754 32-bit floating-point standard is defined as follows:
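31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
s (bit 31) | biased exponent (bits 30-23) | fraction (bits 22-0)
Where s is the sign (0 => positive and 1 => negative). More formally, this can be
written as:
$$N = (-1)^{S} \times 1.F \times 2^{E-127}$$
where S is the sign bit, F is the stored fraction, and E is the biased exponent.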
When representing floating-point values, the first step is to convert the floating-point
value into binary. The following table provides a brief reminder of how binary handles
fractional components:
For example, 100.101₂ would be 4.625₁₀. For repeating decimals, calculating the binary
value can be time consuming. However, there is a limit since computers have finite
storage sizes (32-bits in this example).
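Each binary digit to the right of the point is worth a negative power of two (one half, one quarter, one eighth, and so on). The following is a minimal sketch in Python that applies this rule to a binary string; the function name binary_to_decimal is an illustrative assumption.

```python
def binary_to_decimal(text):
    """Convert a binary string such as '100.101' to its decimal value."""
    whole, _, fraction = text.partition(".")
    value = float(int(whole, 2)) if whole else 0.0
    for position, digit in enumerate(fraction, start=1):
        if digit not in "01":
            raise ValueError(f"invalid binary digit: {digit!r}")
        # The digit at position p after the point is worth 2 to the power -p.
        value += int(digit) * 2.0 ** -position
    return value


print(binary_to_decimal("100.101"))   # 4.625
```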
The next step is to show the value in normalized scientific notation in binary. This
means that the number should have a single, non-zero leading digit to the left of the
decimal point. For example, 8.125₁₀ is 1000.001₂ (or 1000.001₂ × 2⁰) and in binary
normalized scientific notation that would be written as 1.000001 × 2³ (since the decimal
point was moved three places to the left). Of course, if the number was 0.125₁₀ the
binary would be 0.001₂ (or 0.001₂ × 2⁰) and the normalized scientific notation would be
1.0 × 2⁻³ (since the decimal point was moved three places to the right). The numbers
after the leading 1, not including the leading 1, are stored left-justified in the fraction
portion of the double-word.
The next step is to calculate the biased exponent, which is the exponent from the
normalized scientific notation plus the bias. The bias for the IEEE 754 32-bit floating-
point standard is 127₁₀. The result should be converted to a byte (8-bits) and stored in
the biased exponent portion of the word.
Note, converting from the IEEE 754 32-bit floating-point representation to the decimal
value is done in reverse; however, the leading 1 must be added back (as it is not stored in
the word). Additionally, the bias is subtracted (instead of added).
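The reverse conversion can be sketched by splitting the 32-bit word into its three fields and applying the formula. The following is a minimal sketch in Python, assuming a normalized value (it does not handle zero, denormalized values, infinities, or NaN); the function names are illustrative, and the standard struct module is used only as a cross-check.

```python
import struct


def decode_ieee32(word):
    """Decode a normalized IEEE 754 32-bit pattern into its decimal value."""
    sign = (word >> 31) & 0x1
    biased_exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    # Add back the leading 1 that is not stored in the word.
    significand = 1 + fraction / 2**23
    # Subtract the bias of 127 to recover the true exponent.
    return (-1) ** sign * significand * 2.0 ** (biased_exponent - 127)


def decode_with_struct(word):
    """Cross-check using the standard library's IEEE 754 support."""
    return struct.unpack(">f", word.to_bytes(4, byteorder="big"))[0]


print(decode_ieee32(0xC0F80000), decode_with_struct(0xC0F80000))
```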
Example 1: -7.75
• determine sign: -7.75 => 1 (since negative)
• convert to binary: -7.75 = -0111.11₂
• normalized scientific notation: = 1.1111 × 2²
• compute biased exponent: 2₁₀ + 127₁₀ = 129₁₀
  ◦ and convert to binary: = 10000001₂
• write components in binary:
  sign  exponent  mantissa
  1     10000001  11110000000000000000000
• convert to hex (split into groups of 4)
  11000000111110000000000000000000
  1100 0000 1111 1000 0000 0000 0000 0000
  C    0    F    8    0    0    0    0
• final result: C0F8 0000₁₆
Example 2: -0.125
• determine sign: -0.125 => 1 (since negative)
• convert to binary: -0.125 = -0.001₂
• normalized scientific notation: = 1.0 × 2⁻³
• compute biased exponent: -3₁₀ + 127₁₀ = 124₁₀
  ◦ and convert to binary: = 01111100₂
• write components in binary:
  sign  exponent  mantissa
  1     01111100  00000000000000000000000
• convert to hex (split into groups of 4)
  10111110000000000000000000000000
  1011 1110 0000 0000 0000 0000 0000 0000
  B    E    0    0    0    0    0    0
• final result: BE00 0000₁₆
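The steps used in the two examples can be sketched as a short routine. The following is a minimal sketch in Python, assuming a finite, non-zero value in the normalized range; it truncates any fraction bits beyond the 23 stored (a simplification, since the standard's default is to round), so it should only be relied on for values that are exactly representable, such as the two examples. The function names are illustrative, and the struct module serves as a cross-check.

```python
import math
import struct


def encode_ieee32(value):
    """Build the IEEE 754 32-bit pattern for a normalized value, step by step."""
    if value == 0 or not math.isfinite(value):
        raise ValueError("only finite, non-zero values are handled")
    sign = 1 if value < 0 else 0
    magnitude = abs(value)
    exponent = 0
    # Normalize so that exactly one non-zero digit is left of the point.
    while magnitude >= 2.0:
        magnitude /= 2.0
        exponent += 1
    while magnitude < 1.0:
        magnitude *= 2.0
        exponent -= 1
    biased_exponent = exponent + 127
    if not 1 <= biased_exponent <= 254:
        raise ValueError("value is outside the normalized range")
    # Drop the leading 1 and keep 23 fraction bits (truncating the rest).
    fraction = int((magnitude - 1.0) * 2**23)
    return (sign << 31) | (biased_exponent << 23) | fraction


def encode_with_struct(value):
    """Cross-check using the standard library's IEEE 754 support."""
    return int.from_bytes(struct.pack(">f", value), byteorder="big")


for number in (-7.75, -0.125):
    print(number, f"{encode_ieee32(number):08X}")
```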