Python: Compiled or Interpreted

A frequently asked question in interviews and vivas

Sun Mar 5, 2023

Under the hood

"Whether you think you can or you think you can't, you are right." — Henry Ford

The key question is whether Python is a compiled language or an interpreted one. It is compiled, though not in the way C or C++ is. C and C++ source code is converted into machine-specific code, whereas Python source code is compiled into a lower-level intermediate form called bytecode. The CPU can only understand machine code, and interpreting the raw source text directly, statement by statement, would be too inefficient. So the Python interpreter first reads the source code and translates it into this intermediate code.
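As a rough illustration of this compile step, the sketch below uses the built-in compile() function and the standard dis module to turn a small snippet into bytecode and print it. The snippet and the "<example>" file name are made up for the demonstration, and the exact instructions listed differ between Python versions.

```python
import dis

# A tiny made-up program, held in a string instead of a .py file.
source = "x = 1 + 2\nprint(x)\n"

# compile() performs the source-to-bytecode translation without running anything.
code_obj = compile(source, "<example>", "exec")

# The result is a code object; its co_code attribute holds the raw bytecode bytes.
print(type(code_obj).__name__)          # code
print(type(code_obj.co_code).__name__)  # bytes

# dis prints a human-readable listing of the bytecode instructions.
dis.dis(code_obj)
```

Nothing in the snippet is executed here; print(x) only runs if the code object is later passed to exec().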

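This bytecode is then passed to a virtual machine for interpretation. There are different implementations of Python, such as CPython and Jython (formerly called JPython). CPython is the standard implementation, written in C, and runs its bytecode on the Python Virtual Machine (PVM). Jython compiles Python source to Java bytecode and runs it on the Java Virtual Machine (JVM).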
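To check which implementation is running a script, the standard platform module can be asked directly. This is only a small sketch; the value printed depends on the interpreter you run it with.

```python
import platform

# Returns a name such as "CPython", "Jython" or "PyPy".
implementation = platform.python_implementation()
print(implementation)
```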

The steps involved when you run your Python code:

  • The compiler receives the source code.
  • The compiler checks the syntax of the whole source file before anything is executed.
  • If the compiler encounters an error, it halts the translation process with an error message (syntax error).
  • Otherwise, if the source is well formed, the compiler translates it to bytecode.
  • The bytecode is sent to the Python Virtual Machine (PVM).
  • The bytecode, along with the inputs and library modules, is given as the input to the PVM.
  • The PVM executes the bytecode, and if any error occurs, it displays an error message (runtime error).
  • Otherwise, if there is no error in execution, it results in the output.
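The two kinds of error in the steps above can be seen by separating the phases by hand. The helper below is a hypothetical function written for this illustration (run is not part of Python itself); it compiles a source string first and executes it only if compilation succeeds.

```python
def run(source):
    """Compile, then execute, reporting which phase failed (illustrative helper)."""
    try:
        # Phase 1: compilation. A syntax error stops everything here.
        code_obj = compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return "syntax error: " + str(exc.msg)
    try:
        # Phase 2: execution of the bytecode by the virtual machine.
        exec(code_obj, {})
    except Exception as exc:
        return "runtime error: " + type(exc).__name__
    return "ok"


print(run("print('hello')"))
# Missing closing parenthesis: rejected at compile time, so 'before' is never printed.
print(run("print('before')\nprint('hello'"))
# Valid syntax: 'before' is printed, then the division fails while running.
print(run("print('before')\nprint(1 / 0)"))
```

In the second call no line runs at all, because the whole source is rejected before execution starts. In the third call the first line runs and the error appears only when the faulty instruction is reached.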

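Note that the virtual machine here is not a physical machine; it is a piece of software. In CPython it does not generate native machine code. It reads the bytecode instructions one at a time and carries out each one using its own precompiled C code. Some other runtimes, such as PyPy and the JVM, additionally include a just-in-time compiler that does produce machine code.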

The Java compiler (javac) produces a .class file, which is in bytecode format, from a .java file. The java launcher then loads this bytecode into the JVM for execution.

The same thing happens with Python: the source code is first converted into bytecode, which is then fed into the Python virtual machine. So the PVM is nothing but a piece of software, an interpreter, that executes the bytecode on the given machine and operating system. The PVM is also called the Python interpreter, and this is the reason many programmers call Python an interpreted language.

Note that bytecode is not binary machine code; it is low-level, platform-independent code. When bytecode is cached on disk, the file has a .pyc extension. The only difference from Java is that the compilation happens behind the scenes. Note that CPython takes responsibility for both compilation and execution. The compilation is done by CPython's own built-in compiler. Cython, despite the similar name, is a separate project that translates Python-like code into C extension modules.

If the source file has not changed, Python loads the .pyc file and skips the compilation phase. By default, Python compares the source file's last-modified timestamp with the one recorded in the bytecode file to know when it must recompile. If we re-save the source code, the bytecode is automatically regenerated the next time the module is loaded. In Python 3, these .pyc files are kept in a __pycache__ directory next to the source.
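The caching described above normally happens automatically on import, but it can be triggered by hand to see where the file ends up. The sketch below uses the standard py_compile and importlib.util modules; the module name hello.py and the temporary directory are assumptions made purely for the demonstration.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Write a throwaway module (hypothetical name) into a temporary directory.
    source_path = os.path.join(tmp, "hello.py")
    with open(source_path, "w") as f:
        f.write("print('hello')\n")

    # Where the import system would look for the cached bytecode.
    expected_pyc = importlib.util.cache_from_source(source_path)

    # Compile explicitly; importing the module would normally do this for us.
    pyc_path = py_compile.compile(source_path)

    # Typically something like __pycache__/hello.cpython-XY.pyc
    print(os.path.relpath(pyc_path, tmp))
    pyc_exists = os.path.exists(pyc_path)
```

The file name includes a tag for the interpreter and its version, which is why bytecode compiled by one Python version is not reused by another.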

It is the Python shell, which works with us interactively, that leads many people to regard Python as an interpreted language. That is partly true, since C and C++ have no such facility as standard, and Java gained one (JShell) only in Java 9.

Summary

Let's summarize what happens behind the scenes. When Python executes a program, it reads the .py file into memory, parses and compiles it to bytecode, and then goes on to execute it. For each module imported by the program, Python first checks whether there is a precompiled bytecode version in a .pyc file (or, before Python 3.5, a .pyo file for optimized bytecode) whose recorded timestamp corresponds to that of the .py file. If there is, Python uses the bytecode version. Otherwise, it parses the module's .py file, saves the result into a .pyc file, and uses the bytecode it just created.

Bhalchandra Gholkar
A trainer, Programmer, Traveller, a trainee Chef and a wannabe artist