In the last chapter, we concluded that writing computer programs in machine language (machine code) is tedious, time-consuming, error-prone, and hard to understand.
Programmers mostly use a simpler way to write computer programs: they write what is called source code, using programming languages. A few popular ones are Java, C#, Python, and C++.
Source code is more human-friendly. The instructions are written line by line in the language-specific syntax using any text editor, such as Notepad on Windows or TextEdit on Mac, depending on the programmer's personal preference.
Because machine language deals directly with the CPU (that is, it has almost no abstraction from the hardware), it falls in the category of low-level languages. It is also sometimes referred to as 1GL, or first-generation language.
Programming languages such as those listed above are more human-readable and offer strong abstraction from the computer hardware. Hence, they fall in the category of high-level languages, often referred to as 3GL or third-generation languages.
Source code written in any high-level programming language needs to be converted to a low-level language, ultimately machine code, at some point.
Let's take a look at how it is done…
The CPU does not understand source code written in a high-level programming language. Such source code must first be translated into machine code by a language processor.
Language processing is typically done by special software, which can be categorized into three types: compilers, assemblers, and interpreters.
A compiler, in the simplest terms, is a program that converts human-readable source code into machine-readable code (object code) in a single compilation process. For some languages, such as C# and Java, this generated code is still not pure machine language; in that case it is referred to as IL code, or intermediate language code.
Compiled languages: C, C++, Java, C#, Objective-C, Swift, etc.
The compiler takes the whole file containing the source code, processes every instruction in it, and writes out a new file containing the translated code. This new file that the compiler generates is often called an executable.
Let's say I wrote a computer program in a programming language called C# using a simple text editor (Notepad) on my personal laptop.
I can then compile my source code using a C# compiler, which generates an executable. In the case of the C# compiler, the executable is generally a .exe file.
Later on, I can give the executable file to my friend Ted, who can then run it on his laptop.
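To make the "translate everything first, run later" idea concrete, here is a minimal sketch in Python. It is only an analogy: Python's built-in compile() produces an in-memory code object rather than a .exe file, and the three-line program and the helper name compiles_cleanly are made up for this illustration.

```python
# Hypothetical little program, kept as a string for this sketch.
source = """
total = 0
for n in range(1, 5):
    total += n
"""

# Step 1: translate the whole program up front. Nothing runs yet.
code_object = compile(source, "<example>", "exec")

# Step 2: run the translated form. It could be run again and again
# without translating the source a second time.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])  # 1 + 2 + 3 + 4 = 10


def compiles_cleanly(text):
    """Return True if the whole text translates without a syntax error."""
    try:
        compile(text, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# A mistake on the last line is reported before the first line ever runs.
print(compiles_cleanly("print('hi')\nx = = 1"))  # False
```

Notice that the broken second program never prints 'hi': because the whole text is translated before anything is executed, the error is found first.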
An assembler is a program that converts assembly language code into machine code. Some compilers also perform the task of the assembler and generate machine code directly. The output of an assembler is called object code.
Assembly language is a low-level language that is sometimes called 2GL, or second-generation language.
Assembly language is specific to a particular computer's architecture and sometimes to an operating system.
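The job of an assembler can be sketched as a simple lookup from instruction names to numbers. The instruction set below (LOAD, ADD, STORE, HALT and their opcodes) is entirely hypothetical and does not match any real CPU; a real assembler also handles labels, addressing modes, and much more.

```python
# Hypothetical instruction set: each mnemonic maps to a one-byte opcode.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}


def assemble(assembly_text):
    """Translate toy assembly text into a bytes object of numeric codes."""
    output = bytearray()
    for line in assembly_text.splitlines():
        line = line.split(";")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        mnemonic, operands = parts[0].upper(), parts[1:]
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown instruction: {mnemonic}")
        output.append(OPCODES[mnemonic])
        for operand in operands:
            output.append(int(operand) & 0xFF)  # operands are small integers here
    return bytes(output)


program = """
LOAD 7      ; put 7 in the accumulator
ADD 5       ; add 5 to it
STORE 0     ; write the result to memory cell 0
HALT
"""
machine_code = assemble(program)
print(machine_code.hex(" "))  # 01 07 02 05 03 00 ff
```

Each line of assembly maps almost one-to-one onto a few bytes, which is why assembly is considered only a thin layer above machine code.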
An interpreter is a program that translates and executes human-readable code one line at a time. Unlike a compiler, an interpreter has to do the translation each time you run the program. Most of the time no object code is generated, because each line is translated and executed directly.
Because this line-by-line translation happens at execution time, interpreted programs generally run slower than compiled ones.
However, development with an interpreter can be faster than with a compiler when working incrementally on small sections of the source code, because the interpreter provides immediate output, which makes running and testing functionality easier.
This time, instead of compiling the source code and giving an executable file to my friend Ted, I have to give him the source code itself.
Instead of the conversion to machine-readable code happening on my laptop, it now happens on Ted's laptop.
Luckily, he may not need to install an interpreter separately, as interpreters for some languages come bundled with a web browser or the operating system.
All Ted needs to do now is load the file in a web browser, and the language translation happens while the source code is being executed by the browser.
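The line-at-a-time behaviour of an interpreter can be sketched with a toy example. The tiny language below (set, add, print) is invented for this illustration, and real interpreters are far more sophisticated, but the loop shows the key point: each line is translated and executed before the next one is even looked at.

```python
def interpret(source_text):
    """Run a tiny hypothetical language one line at a time.

    Supported lines: 'set NAME NUMBER', 'add NAME NUMBER', 'print NAME'.
    """
    variables = {}
    output = []
    for line_number, line in enumerate(source_text.splitlines(), start=1):
        words = line.split()
        if not words:
            continue
        command = words[0]
        # Each line is translated and executed before the next is read.
        if command == "set":
            variables[words[1]] = int(words[2])
        elif command == "add":
            variables[words[1]] += int(words[2])
        elif command == "print":
            output.append(variables[words[1]])
        else:
            output.append(f"error on line {line_number}: unknown command {command!r}")
            break
    return output


script = """set x 10
add x 5
print x
multiply x 2
print x"""
print(interpret(script))  # [15, "error on line 4: unknown command 'multiply'"]
```

The first three lines run and produce output before the mistake on line 4 is even noticed. A compiler, which reads the whole file first, would typically have rejected the program before running any of it.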
The Just-In-Time (JIT) Compiler
Some modern programming languages, such as Java and C#, make use of just-in-time compilation (on-the-fly compilation). JIT compilation happens at run time, after the program has started executing.
Unlike the conventional compilation process, where the source code is converted directly to machine code, a few modern programming languages are first compiled to intermediate language code. In the case of Java, the IL code is called bytecode. The conversion to machine code finally happens only when the program is executed.
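Java bytecode is not easy to show in a few lines, but Python also compiles functions to a bytecode of its own, so it can serve as a rough stand-in for what an intermediate form looks like. This sketch uses Python's standard dis module; the function add is made up for the example, and it shows only the intermediate code, not the later conversion to machine code.

```python
import dis


def add(a, b):
    return a + b


# The function body in Python's intermediate form: a sequence of bytes.
raw_bytecode = add.__code__.co_code
print(type(raw_bytecode).__name__, len(raw_bytecode))

# Human-readable names for those instructions. The exact names vary
# between Python versions, so no specific listing is shown here.
instruction_names = [ins.opname for ins in dis.get_instructions(add)]
print(instruction_names)
```

The instructions are neither the original source text nor code for any particular CPU, which is exactly what makes an intermediate form portable between machines.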
But what is the use of this two-step compilation?
JIT compilers are highly advanced, performance-oriented compilers that have access to dynamic runtime information. They perform specialized tasks such as monitoring the running program and optimizing its code for the particular CPU on which it runs.
Some of their features are:
- global code optimization
- statistical data analysis of the program
- CPU-targeted compilation, analysis and recompilation
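The "monitor, then recompile" idea can be sketched with a toy in Python. This is not how a real JIT works internally: a real JIT emits machine code for the CPU, while this sketch only generates specialised Python code at run time. The names ToyJit, generic_power and HOT_THRESHOLD are invented for the illustration.

```python
HOT_THRESHOLD = 3  # hypothetical: calls before a specialised version is built


def generic_power(base, exponent):
    """Slow general path: loops at run time for any exponent."""
    result = 1
    for _ in range(exponent):
        result *= base
    return result


class ToyJit:
    def __init__(self):
        self.call_counts = {}  # exponent -> number of slow-path calls seen
        self.specialised = {}  # exponent -> function generated at run time

    def _build(self, exponent):
        # Generate source with the loop unrolled for this one exponent,
        # then compile it while the program is already running.
        expression = " * ".join(["base"] * exponent) or "1"
        source = f"def power_{exponent}(base):\n    return {expression}\n"
        namespace = {}
        exec(compile(source, "<toy-jit>", "exec"), namespace)
        return namespace[f"power_{exponent}"]

    def power(self, base, exponent):
        fast = self.specialised.get(exponent)
        if fast is not None:
            return fast(base)  # hot path: use the specialised version
        count = self.call_counts.get(exponent, 0) + 1
        self.call_counts[exponent] = count
        if count >= HOT_THRESHOLD:
            self.specialised[exponent] = self._build(exponent)
        return generic_power(base, exponent)


jit = ToyJit()
results = [jit.power(2, 5) for _ in range(5)]
print(results)                  # [32, 32, 32, 32, 32]
print(sorted(jit.specialised))  # [5]
```

The first three calls take the slow general path while the call counter gathers statistics; once the exponent 5 proves to be "hot", a specialised version is generated and the remaining calls use it.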