This article covers the basics of the print family of functions in C. These concepts are the foundation for understanding print-related vulnerabilities such as format string vulnerabilities.
Format functions
A later section discusses how format strings are used with format functions. First, here is a list of commonly used format functions.
fprintf – Writes formatted output to a file (stream)
printf – Outputs a formatted string to stdout
sprintf – Prints into a string
snprintf – Prints into a string, checking the length
vfprintf – Prints a va_list to a file (stream)
vprintf – Prints a va_list to stdout
vsprintf – Prints a va_list to a string
vsnprintf – Prints a va_list to a string, checking the length
Understanding printf
To better understand format string vulnerabilities, let us first look at how the print family of functions works, taking the printf function in C as an example. Consider the following C program.

test1.c

    #include <stdio.h>

    int main(void)
    {
        int a = 100;
        float b = 2.3;
        int *c;
        c = &a;
        printf("%d, %f, %p\n", a, b, (void *)c);
    }

The above C program contains one printf call with multiple format specifiers, namely %d, %f and %p. When printf is executed with a format specifier, it prints data as specified by that format specifier. For example, consider a call such as printf("%d", num). When it executes, it prints the value of the variable num as an integer, since %d is for signed integers in decimal. Note that printf retrieves this data from its argument list, which lives on the stack on 32-bit x86 (on x86-64, the first few arguments are passed in registers). Similarly, if the format specifier is changed to %p, the data is printed as a hexadecimal value.

Following are some of the commonly used format specifiers:

%d – used for signed integers in decimal
%f – used for float values
%c – used for a single character
%s – used for printing the string data pointed to by an address
%x – used for hexadecimal representation
%p – used for pointers

This should give a basic understanding of how format specifiers are used in printf when printing data. Now, let us understand what happens when the printf call in test1.c gets executed. The following events occur:
The data in variable a (on the stack) replaces the format specifier %d, and the integer value 100 is printed.
The data in variable b (on the stack) replaces the format specifier %f, and the float value is printed as 2.300000.
The data in variable c (on the stack) replaces the format specifier %p, and the address of variable a is printed, since c is a pointer to the data stored in a.
Let us test this by compiling and executing the program. On a Linux machine, the program can be compiled with a gcc command such as gcc test1.c -o test1. Running the output binary test1 should print 100, then 2.300000, then an address (the exact address varies between systems and runs). As expected, the variables a and b are printed as decimal and float values respectively, and variable c is printed as a pointer, which is an address pointing to the value of a.

Some format specifiers give the programmer granular control over the format of the data being printed. For example, the value of the variable b is displayed as 2.300000. If we want to print 2.3 instead, we can update the format specifier for variable b as shown in the following code snippet.

test2.c

    #include <stdio.h>

    int main(void)
    {
        int a = 100;
        float b = 2.3;
        int *c;
        c = &a;
        printf("%d, %1.1f, %p\n", a, b, (void *)c);
    }

The program can be compiled with a gcc command such as gcc test2.c -o test2. Running the output binary test2 produces the float value 2.3 instead of 2.300000.
Printing hexadecimal instead of decimal
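Variable a contains the value 100 in decimal. Format specifiers also let us print its hexadecimal equivalent without explicitly converting it in the program. The following example shows how 0x64 is printed instead of decimal 100 just by changing the format specifier from %d to %p.

test3.c

    #include <stdio.h>

    int main(void)
    {
        int a = 100;
        float b = 2.3;
        int *c;
        c = &a;
        printf("%p, %1.1f, %p\n", a, b, (void *)c);
    }

The program can be compiled with a gcc command such as gcc test3.c -o test3. Running the output binary test3 prints the hexadecimal value 0x64 instead of decimal 100. Note that %p formally expects a pointer (void *) argument, so passing the int a here is undefined behavior in standard C, and gcc warns about the mismatch when -Wall is enabled. The output 0x64 is what typical Linux/gcc builds produce. The specifier %x, or %#x to include the 0x prefix, is the portable way to print an int in hexadecimal.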
Printing strings and their addresses
So far, we have explored how numbers are printed using printf. Now let us add two string variables to our program and understand how strings are printed using the printf function. Following is the program.

test4.c

    #include <stdio.h>

    int main(void)
    {
        int a = 100;
        float b = 2.3;
        int *c;
        c = &a;
        char d[] = "demo";
        char *e = d;
        printf("%d, %1.1f, %p, %s, %s\n", a, b, (void *)c, d, e);
    }

The preceding program contains a character array d, which holds the string demo. We then create another variable named e, which is a pointer to the character array. Essentially, both variables should yield the same string value when printed. As we can see in the printf statement, %s is used as the format specifier to print these string values. The program can be compiled with a gcc command such as gcc test4.c -o test4. Run the output binary test4 and we should see the string demo printed twice.

It is also possible to print the addresses of these strings just by changing the format specifier from %s to %p. We can update the C program to print these pointers as shown below.

test5.c

    #include <stdio.h>

    int main(void)
    {
        int a = 100;
        float b = 2.3;
        int *c;
        c = &a;
        char d[] = "demo";
        char *e = d;
        printf("%d, %1.1f, %p, %p, %p\n", a, b, (void *)c, (void *)d, (void *)e);
    }

The program can be compiled with a gcc command such as gcc test5.c -o test5. Run the output binary test5 and we should see the addresses of the two string variables d and e. Both addresses are the same, because the array d is passed as the address of its first element and e holds that same address.
Conclusion
This article has explained how format strings, through the format specifiers passed to the printf function, control how data is printed. Understanding the printf function is the foundation for understanding format string vulnerabilities, which will be discussed in upcoming articles.