Sunday, March 01, 2015

The Spherical Cow in System Science - Teaching Distributed System Course (Part I)

I. What Do We Want to Teach Students in System Courses?

I have been the teaching assistant for the Distributed System Project course, one of the compulsory courses for every master's student specializing in systems and networking, for over five years at the University of Helsinki. This is my last year handling the course before I move to the University of Cambridge. Over these years, I have learnt quite a lot by interacting with different students in the class. Since I designed most of the exercises for the course and graded all the students' solutions, I gained first-hand experience of how the students thought about the teaching and the content. I feel obliged to write down some of my thoughts and share the experience with current and future teaching assistants and staff.

The Distributed System Project was the first course I ever taught. I taught it independently in 2013 while my PhD supervisor was on sabbatical in the U.S. Because it is a project course whose core philosophy is "learn by doing", we do not need to prepare slides or lecture the students in class. Instead, we let the students gain hands-on experience by completing small projects. The course assignments therefore need to be carefully designed so that the students can apply the abstract theories (i.e., the spherical cow mentioned in the title) they have learnt in the distributed systems course to more concrete scenarios. You can find all the information about the course via the following links:

The exercises generally fall into two categories. The first category covers the classic distributed algorithms (e.g., Lamport clocks, vector clocks, election algorithms, etc.), which are fairly routine and do not take me much effort to design. The second category is much more challenging, since we try to make the students take a system-level view when solving the problems. Some of the questions were very successful; some were not. In this series of posts, I will choose some examples and discuss them one by one.

As for the title, assuming you know the joke, I think my purpose is quite obvious. So let me skip the explanation and simply start with our first example.

II. Example I - The Migration of Complexity in the Distributed System Design

We gave this exercise in both 2011 and 2012. The exercise is very straightforward; the detailed assignment description can be found [here]. In short, we ask the students to implement a naive calculator application that can plot the sine function using the client/server model. We chose to use a web browser, AJAX and a RESTful API so that the students did not have to develop everything from scratch. We also specified that every request to the service can carry only one operator and two operands.
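To make the "one operator and two operands" constraint concrete, here is a minimal sketch of what a single request could look like. It is not the code from the assignment: the query-string layout, the field names `op`, `a` and `b`, and the helper names `build_query` and `handle_query` are all assumptions made for illustration, and the operator set is limited to the four arithmetic operators.

```python
import operator
from urllib.parse import parse_qs, urlencode

# Hypothetical operator table; the real assignment may have allowed others.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def build_query(op, a, b):
    """Client side: encode one operator and two operands as a query string."""
    # urlencode percent-encodes "+" as %2B, so it is not mistaken for a space.
    return urlencode({"op": op, "a": a, "b": b})


def handle_query(query):
    """Server side: accept exactly one operator and two operands."""
    fields = parse_qs(query, strict_parsing=True)
    if set(fields) != {"op", "a", "b"} or any(len(v) != 1 for v in fields.values()):
        raise ValueError("a request must carry exactly one operator and two operands")
    op = fields["op"][0]
    if op not in OPERATIONS:
        raise ValueError(f"unsupported operator: {op!r}")
    return OPERATIONS[op](float(fields["a"][0]), float(fields["b"][0]))
```

For example, `handle_query(build_query("*", 2, 3.5))` evaluates a single multiplication; anything more complex has to be split into several such requests.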

However, the story was not quite that simple. We asked the students to implement three versions of the calculator, one for each of the following cases:

  • The server is smart enough to know how to calculate the sine function. The client only plots the figure.
  • The server is stupid and only plots the figure for the given points. The client is smart enough to know how to calculate the sine function.
  • Both the server and the client are stupid. The server only knows the four arithmetic operations +, -, *, /, and the client only knows how to plot.

You have probably already got the idea. Cases 1 and 2 represent the evolution of our computation model over the past several decades. In the early days of computer systems, we had a really "powerful" server (powerful in a relative sense) that took care of all the computation. A user only needed a terminal to connect to the server. A terminal, in some sense, is just a bare and stupid client that is responsible for submitting jobs and displaying the results. In technical terms, we call this the thin-client, fat-server model. Our first case reflects this model.

As time went by, functionality started shifting to the client side. CPUs became much faster, storage grew much bigger, and prices dropped a lot. More and more features were then added to the client side. As the PC became more powerful, applications also grew in complexity, and more of them started running on the client side to improve the user experience. This shift in usage patterns eventually led to the fat-client, thin-server model. Nowadays, even a moderate smartphone is more powerful than a mainframe from a few decades ago. Our second case tries to capture this model.

It is very interesting to notice that the much-hyped cloud computing is shifting computation back to a centralized entity again. Note that the entity here can refer to a cluster of computers instead of a standalone server. The key enabler of this trend is virtualization technology. People realized that horizontal scalability is a more feasible solution from both an economic and an engineering perspective.

III. Do Not Forget the Communication Channel Is Also An Integral Part of A Distributed System!

However, our third case tries to capture another (maybe a bit unfortunate) situation wherein both the server and the client are stupid. In such a case, how can a client collaborate with a server that supports only the four basic arithmetic operations +, -, *, / to calculate the sine function? One solution is to use a Taylor series to approximate the sine function. Because we explicitly required the solution to reach a certain precision, the students had to figure out the minimal degree needed in the polynomial. When solving this case, we can clearly see that the complexity migrates to the communication channel. Namely, the client and server need to collaborate over many rounds in order to finish the computation task.
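The third case can be sketched roughly as follows. This is only an illustration under stated assumptions, not a reference solution: `ArithmeticServer` is a local stand-in for the remote service (no real network is involved), the names `remote_sine` and `terms_needed` are made up for this post, and the integer bookkeeping for the loop counter and the denominators is done on the client for brevity. `math.sin` is used only to check the precision, not to compute the result.

```python
import math


class ArithmeticServer:
    """Local stand-in for the 'stupid' server: one operator, two operands."""

    def __init__(self):
        self.requests = 0  # number of round trips made so far

    def request(self, op, a, b):
        self.requests += 1
        if op == "+":
            return a + b
        if op == "-":
            return a - b
        if op == "*":
            return a * b
        if op == "/":
            return a / b
        raise ValueError(f"unsupported operator: {op!r}")


def remote_sine(server, x, terms):
    """Sum the first `terms` non-zero Taylor terms of sin(x) around 0.

    Every floating-point operation is delegated to the server.
    """
    x2 = server.request("*", x, x)
    term = x   # magnitude part of the current term: x^(2k+1) / (2k+1)!
    total = x
    for k in range(1, terms):
        # Next term = previous term * x^2 / ((2k) * (2k + 1)).
        term = server.request("*", term, x2)
        term = server.request("/", term, (2 * k) * (2 * k + 1))
        # Signs alternate: subtract for odd k, add for even k.
        total = server.request("-" if k % 2 else "+", total, term)
    return total


def terms_needed(x, tol, max_terms=20):
    """Return (terms, requests) for the smallest term count with error < tol."""
    for n in range(1, max_terms + 1):
        server = ArithmeticServer()
        approx = remote_sine(server, x, n)
        if abs(approx - math.sin(x)) < tol:  # reference check only
            return n, server.requests
    raise ValueError("tolerance not reached within max_terms")
```

Under these assumptions, reaching an error below 1e-6 at x = pi/2 takes six terms (a degree-11 polynomial) and 16 requests for a single point on the curve, whereas a server that knows the sine function could answer the same point in one request.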

I am pretty sure you have already got the big picture of this assignment. In the first case, the server takes care of the (computational) complexity, while in the second case the client handles it. However, if neither can deal with the complexity, the complexity needed to finish the computation has to go somewhere, and in our third case it goes to the communication channel. The exercise reminds the students that the communication channel (usually our network) is also an integral part of a distributed system design.

IV. Is This Really A Good Exercise?

The comments on this exercise from my colleagues were quite positive. However, the actual feedback from the students in the class was rather bad. Most students failed to grasp the core idea of the exercise and simply thought that we were asking them to develop a naive AJAX application. I even got an angry complaint from a student, who said "I am not coming here to become a web developer!" Oh, well, you know what? The web is actually the biggest distributed computer system in the world nowadays (excluding the mobile devices ;-) ). Besides, being a web developer is not a bad idea at all in the first place :D

All in all, we finally realized that, without an explicit explanation of the purpose of this exercise, the students could not really grasp its idea. However, if we spell everything out up front, there is simply no fun left in solving the problem. Eventually, we dropped this exercise after its two trials in 2011 and 2012, which was really sad in my opinion!

Read this series: The Spherical Cow in System Science - Part I, Part II, Part III
