{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lecture 1: Introduction To Programming Languages\n", "\n", "Originally by Sriram Sankaranarayanan \n", "\n", "Modified by Ravi Mangal \n", "\n", "*Last Edited :* Jan 26, 2025.\n", "\n", "\n", "In this lecture, we will look at the concepts to be learned in this class at a high level. This course is titled _Principles of Programming Languages_. We all know what programming languages are. A programming language is an _interface_ between the programmer and the computer. Computers natively execute machine instructions, but writing\n", "complex systems in _machine language_ is highly time consuming, if not just plain impossible. High level programming languages were designed to be\n", "ultimately translated into a series of machine instructions, either through a _compiler_ or an _interpreter_.\n", "\n", "\n", "Everyone in this class is probably familiar with at least a couple of programming languages in the set _{C/C++, Java, Python}_. Having worked with these languages for at least two years now, you should have noticed numerous differences between them: some are superficial matters of syntax, whereas others are deep underlying differences in these languages.\n", "\n", "Let us get the superficial differences out of the way.\n", "\n", "## Syntax \n", "\n", "The first and most noticeable difference between programming languages is the _syntax_. 
Syntax includes aspects of the languages such as reserved keywords, what is considered a valid identifier, operators, comments, \n", "delineation of blocks, separation between statements and so on.\n", "\n", "Here is a C program that prints `Hello:` followed by the invoked command.\n", "\n", "```\n", "#include <stdio.h> // This imports the printf function declaration.\n", "// C programs need to have a main function\n", "int main(int argc, char *argv[]) { \n", "    if (argc >= 1) { // the open curly brace begins a block\n", "        printf(\"Hello: %s \\n\", argv[0]); // Indentation is ignored by the compiler.\n", "    } // the close curly brace ends a block\n", "    return 1;\n", "} \n", "```\n", "\n", "Here is a Python program that roughly implements the same.\n", "\n", "```\n", "import sys # This imports the sys module, which provides the argv command line arguments\n", "if len(sys.argv) >= 1: # : in Python begins a block\n", "    print('Hello: %s ' % (sys.argv[0])) # Failing to indent produces a syntax error\n", "sys.exit(1) # Provide a specific return value \n", "```\n", "\n", "You can notice many differences in syntax, including how strings are written, how external modules are imported and referenced, how code blocks are specified and so on.\n", "\n", "\n", "## Beyond Syntax\n", "\n", "However, syntactic differences are just the first ones that meet the eye. C and Python are different in many ways that are more fundamental. Let us list as many differences as we can think of here.\n", "\n", " - **Compiled vs Interpreted:** C programs are *compiled*, whereas Python programs are *interpreted*: for most computer scientists, this is a substantial difference between the two languages.\n", " \n", " - **Memory Model:** The C language allows pointers to memory along with the ability to create a pointer to virtually any memory cell in the computer. Of course, operating systems and page protection mechanisms will disallow us from reading many of these memory locations. Python, on the other hand, is supposedly *memory safe*. 
We are restricted in what memory cells are accessible. The concept of pointers is replaced by *references*, which are more restrictive. This is a good feature for perhaps 99.9% of applications but a problem for certain specialized ones. Being able to address any memory cell is useful for systems programming, embedded programming and other *low level* applications. \n", " \n", " - **Memory Management:** C has a very basic built-in memory management that is restricted to dynamic memory allocations (`malloc`/`free`). In other words, the programmer is responsible for allocating and freeing memory as the program executes. Programmers often make mistakes leading to bugs such as memory leaks or out of bounds accesses that can lead to crashes in the best case or exploitable security holes in the worst case. Python, on the other hand, manages memory behind the scenes using a _garbage collector_ that collects inaccessible memory cells and frees them.\n", " \n", " \n", " - **Type System:** C is type unsafe (more or less): you can force a value of any type into almost any other type. Inside a C program you can add an integer to a string or, with explicit casts, even divide one string by another. Of course, such a typecast can result in garbage if done carelessly. Python is type safe: it implements runtime type checking. Thus adding an integer to a string is possible in C (though it will end up adding an integer and a pointer) but will lead to a type error in Python.\n", " \n", " - **Object System:** Python is object oriented: it supports classes and inheritance. C is not object oriented. The language C++ is built on top of C and is object oriented. But the details of how inheritance works in C++ and Python are quite different.\n", " \n", " - **Inbuilt Datatypes:** Python has inbuilt data types such as lists, sets and dictionaries (maps). C has no comparable inbuilt collection types beyond fixed-size arrays and structs.\n", " \n", " - **Higher-Order Functions:** Python treats functions as *first class values*. 
In other words, you can have a procedure that creates a new function and returns it. Something similar can be done in C, in that pointers to functions can be passed around. But there are many features such as *lambdas* and *comprehensions* that are natively supported in Python and quite cumbersome to implement in C.\n", " \n", "These are just a few of the many differences between C and Python.\n", "\n", "\n", "## The Tower of Babel for Programming Languages\n", "\n", "Following other fields such as biology or linguistics, it is tempting to \n", "place programming languages into silos such as *imperative languages*, *object oriented languages*, \n", "*interpreted languages*, *compiled languages*, *functional languages*, *scripting languages* and so on.\n", "These categories are often used in the same way biologists would describe animals as belonging to classes such as\n", "insects (insecta), fishes (pisces), birds (aves) and mammals (mammalia). Closer to home, linguists describe *Sino-Tibetan*, *Indo-European* or *Australasian* languages. \n", "\n", "~~~\n", "Such categorizations of programming languages are at best pseudo-scientific, and they can be highly misleading.\n", "~~~\n", "\n", "Many languages defy such characterizations. A language can be imperative, object oriented,\n", "functional and useful for scripting (e.g., Python). Furthermore, it can have both an interpreter and a compiler (e.g., OCaml, Scala, Python). The latest version of C++ supports many functional programming features. And yet, it\n", "is less \"functional\" than a \"pure functional\" language such as Haskell.\n", "\n", "\n", "## How To Understand Programming Languages?\n", "\n", "Rather than categorize programming languages into exclusive categories, it is important to understand that there are numerous, often tightly inter-related, aspects to the design of programming languages. 
A few of these aspects are given below in no specific order:\n", " - How are functions treated in the language: are functions _first class values_ on par with integers, floats or strings, or are they _subroutines_ that merely encapsulate a bunch of statements?\n", " - What are the control structures in the language?\n", " - What are the native data types in the language?\n", " - How are classes, inheritance, interfaces, virtual functions and polymorphism handled?\n", " - How are types specified and checked? Are type declarations compulsory, optional or disallowed?\n", " - Does the language allow for concurrent threads/processes? Does it support asynchronous control flow?\n", " - Is the language extensible?\n", " - Are there special inbuilt constructs that are defined to make it easier to program certain classes of applications (e.g., scripting, data processing, asynchronous event handling, graphics, 3D printing, controlling physical systems)?\n", " \n", " \n", "However, it does not end here. Programming languages must be supported by a compiler or interpreter that takes in programs that are simply text files and _brings them to life_ (so to speak) by compiling/interpreting the programmer's text. Additionally, there is a _runtime system_ consisting of a set of libraries that support the execution of programs written in the language. A holistic view of programming languages should consider the design of interpreters and the runtime system as well.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Structure of the Course\n", "\n", "The primary motivation of this course is to understand how to design our own languages and _bring them to life_. Naturally, it is not enough just to understand; we must also put our understanding into practice. However, most realistic programming languages are _complex_ and supported by interpreters and runtime systems that can easily\n", "run into millions of lines of code. 
Within a single course, it is infeasible to design a full scale language and write an interpreter for it. Rather, we will design a few _small languages_ with just enough features to illustrate the point.\n", "We will consider many design choices for these languages. The purpose is not to pretend that these small languages are practical ones but to use them as a means to understand what goes on under the hood of a larger language. \n", "\n", "That said, many languages that programmers build are *bespoke* (custom designed) and relatively small *domain specific languages* (DSLs) that are built as part of a larger system. For example, a small DSL can be built for a large system to support human readable and editable configuration files. JavaScript, especially in its earlier incarnations, was a DSL that was interpreted inside a larger system (the web browser) specifically to allow interactive web pages, forms and so on. However, since it also happens to be a feature rich language, it has evolved (somewhat haphazardly), and is now used in applications outside of web browsers (e.g., server-side JavaScript with Node.js). Popular frameworks such as TensorFlow or PyTorch are built on top of \n", "existing languages (Python in this instance). They help us express models such as neural networks and automate the process\n", "of training them on the data to a large extent through a useful framework API.\n", "\n", "~~~\n", "In your career as a computer scientist, you will encounter several languages as your own skills and the tastes of the community evolve. At the same time, many of you will design and implement frameworks or DSLs, along with interpreters and runtime systems.\n", "~~~\n", "\n", "### Scala (the language used to program)\n", "\n", "The primary focus language for this course will be _Scala_. Scala was designed by Martin Odersky, a computer scientist at EPFL in Lausanne, Switzerland. 
It supports many modern language features such as first class functions, objects, a strong type system, interoperability with Java (since it primarily targets Java bytecode), generics, abstract classes, traits, combinator libraries, ways to extend the core language to build DSLs on top and a rich collection of runtime libraries, including direct use of the Java runtime library.\n", "\n", "_All programming will be done in Scala, unless otherwise noted_. Instructions for setting up and running Scala programs will be provided.\n", "\n", "### Languages that we will design\n", "\n", "We will design many small languages and keep adding features to them. These could be arithmetic expression languages, a music specification language, a graphics language and/or a physical modeling language. The languages will be small to begin with and grow in size/complexity as we explore design choices for many of the aspects of PL previously mentioned. \n", "\n", "\n", "# Basic Axioms for This Class\n", "\n", "Past experience suggests that the study of programming languages can cause **unnecessary** panic and frustration amongst students. \n", "The language constructs we study may be opaque, and it may seem like a lot of effort to define and bring a simple toy language to life. There are some basic expectations that every student of this subject should have.\n", "\n", " - __No Mysteries__ Ideally, there should be nothing mysterious or ambiguous about any aspect of a programming language. Mysteries and ambiguities are bad. They confuse not just the *dilettante* but also the \"seasoned practitioner\". Everything is built on simple underlying principles that should be understood, nay grokked. The light of this understanding shall magically transform the firebreathing, human crunching and gargantuan monsters into tiny and unthreatening minnows. 
To this end, the student should always seek help from the *cognoscenti* and remember not to panic.\n", " \n", " - __Seek and Read Documentation__ Languages like Scala have a loyal following and are well documented online. You should get into the habit of looking up this documentation. Once again, the instructor and course staff will point you to helpful sources upon request, and do so with a friendly smile on their faces.\n", " \n", " - __Knowing is Doing__ As in every class, it is one thing to watch your instructor and course staff program, walk through concepts and do proofs. However, until your fingers try out the code, play with it, write new code, reason about it in your own words and write that reasoning down, you will not _own_ the knowledge. In the process you will make mistakes and may lose your way but once again, the instructor and the course staff will support you in this endeavor.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Scala", "language": "scala", "name": "scala" }, "language_info": { "codemirror_mode": "text/x-scala", "file_extension": ".sc", "mimetype": "text/x-scala", "name": "scala", "nbconvert_exporter": "script", "version": "2.13.14" } }, "nbformat": 4, "nbformat_minor": 4 }