Prelab 11.2 Completed

# Name: Son Pham
# Class: CSCI 315
# Prof: Luiz Felipe Perrone
# prelab.txt Lab11
[2.1] ./fdump hexdump.c 1000 128
I see part of the file's contents, which looks like the C source code of hexdump.c.
[2.2] ./fdump fdump 500 128
I see random characters with a lot of dots.
[2.3] It is clear that the information in both files is in binary encoding (they are
stored on disk, after all).
However, the first file is a text file, so each byte we read encodes one
character of the source code. Overall, the bytes collected are part of code that
was written by a human and are thus readable.
In the second file, the bytes we read are machine code, which is instructions for
the machine, generated by a machine. This machine code is not readable as text.
Therefore, if we translate each byte into a character, the result usually does not
mean anything.
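A minimal sketch in C of why the dump in [2.2] is mostly dots, assuming fdump
(like most hex dump tools) prints '.' for every byte that is not a printable
character. The helper name render_printable is hypothetical and is not taken
from fdump or hexdump.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from fdump): copy n bytes from src into dst,
 * replacing every non-printable byte with '.', then NUL-terminate dst.
 * dst must have room for at least n + 1 bytes. */
void render_printable(const unsigned char *src, size_t n, char *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        /* isprint() is true for letters, digits, punctuation and space */
        dst[i] = isprint(src[i]) ? (char)src[i] : '.';
    }
    dst[n] = '\0';
}
```

Bytes taken from C source pass through unchanged, while the bytes
7f 45 4c 46 02 01 00 (the start of a typical 64-bit ELF file) render as ".ELF...".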
[2.4] /usr/bin/file: A 64-bit executable with dynamically linked libraries for Linux 2.6.18
~cs315/Labs/Lab11/work: A PDF document, compatible with version 1.3
~cs315/Labs/Lab11/beauty: JPEG image data with JFIF standard 1.0.1
hexdump.o: A 64-bit relocatable object.
hexdump.h: An ASCII C program text
How does file figure out all this information?
It figures this out with three sets of tests: filesystem tests, magic tests, and language tests.
filesystem tests:
file first checks whether the file actually exists, and then whether it is empty or a special
file (such as a directory, a device, or a FIFO) rather than an ordinary file.
It does so by examining the result of calling stat(2) on the file.
magic tests:
Then file performs a "magic test": it checks for a magic number to determine what kind of format
the file is in. Usually the magic number is placed at a fixed offset near the beginning of the file.
The bytes found there are compared against a database of known entries (the magic file) available
on the system. If the file does not match any of those entries, file then tries to determine
whether it is a text file and which encoding it uses (for example ASCII or Unicode).
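A minimal sketch in C of a magic test, assuming only three hard-coded signatures
(ELF, PDF, and JPEG, matching the files in [2.4]). The real file command reads
its patterns from the magic database instead of hard-coding them; the function
name magic_test and the returned strings are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare the first bytes of a file against a few
 * well-known signatures. Returns a description, or NULL if none match. */
const char *magic_test(const unsigned char *buf, size_t n)
{
    assert(buf != NULL);
    /* ELF: 0x7f followed by "ELF" (literal split so \x7f is not read as \x7fE) */
    if (n >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0)
        return "ELF";
    /* PDF: files start with "%PDF" followed by the version */
    if (n >= 4 && memcmp(buf, "%PDF", 4) == 0)
        return "PDF document";
    /* JPEG: start-of-image marker FF D8 followed by another FF marker */
    if (n >= 3 && buf[0] == 0xFF && buf[1] == 0xD8 && buf[2] == 0xFF)
        return "JPEG image data";
    return NULL;  /* no match: fall through to the text/encoding check */
}
```

The buffer would hold the first few bytes read from the file, for example with
fread(3). A NULL result corresponds to the case above where no entry matches.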
language tests:
If the file is a text file, file then tries the language test to determine what language the
file is written in. It looks for particular strings in the first few blocks of the file that are
characteristic of a given language. This test is usually less reliable.
Source: file(1) man page
