Twilight of the Empire (2): SQL is the most powerful language in the world

2020-08-03 09:54:13

Relational algebra was born in 1969, and Oracle was born in 1978. A lot happened in between, so this series will need at least several more chapters before Oracle takes the stage.

Because this history goes back a long way, and material from before I was born is hard to verify, this series has been very slow to write.

My title says exactly what I want to say, and I am sure countless people disagree with me. But there is nothing to be done about it: I work on database systems, and my admiration for SQL flows like a surging river. Besides, I am the one writing the article, so naturally I can pick whatever title I like.

IBM's attitude toward Edgar Frank Codd's relational model was very ambiguous: it neither rejected nor opposed the idea, but it would not pay for a system either. Looking back now, the reason is that IBM was afraid of hurting the money it was already making from IMS, its hierarchical database.

However, Codd was also a very tenacious person. He went to IBM's major customers and persuaded them that relational databases were the future and hierarchical databases were the past. Once converted, the big customers all believed in the gospel of relational algebra, turned to IBM, and demanded that it hurry up and build them a relational database.

IBM was not afraid of Codd, but it could not withstand repeated requests from the customers who paid its bills, so it added a new research topic to its Future System project: System R. Future System was a large-scale research project that IBM ran around 1970 to develop revolutionary new software and hardware. At that time IBM was thriving and spending money freely.

System R is a landmark system in database history, and we will discuss it in detail later. The System R team was founded in 1973. It included many people who later became famous in database circles, among them the future Turing Award winner Jim Gray. For reasons I do not know, IBM kept the System R team isolated from Codd.

In 1974, Donald Chamberlin and Raymond Boyce published a paper: SEQUEL: A structured English query language. To show you what this paper looks like, I searched for it in the ACM database and took a screenshot:

[Screenshot: the paper's entry in the ACM database]


Then why did SEQUEL become SQL? Because IBM discovered that SEQUEL was a registered trademark of a British company, so it had to change the name. Later, in order to compete with Ingres (which I will talk about later), IBM preemptively submitted SQL to the standards committee. So the full name of SQL was quietly swapped for Standard Query Language, a much more imposing name.

I think most programmers, DBAs, data scientists, data engineers, and so on in this world have written SQL queries at one time or another: SELECT ... FROM ... WHERE .... SQL was born in 1974 and is still this widely used, so I maintain that it is the most powerful language in the world.

When Codd proposed the relational model, his paper included a query language called Alpha. But because the System R team was isolated from him, they invented SQL instead. Is Alpha better, or is SQL better?

Michael Stonebraker, another future Turing Award winner, used a query language similar to Alpha in his system Ingres, so some people think it was foolish of IBM to put out SQL on top of that.

I have written only three articles so far, and three future Turing Award winners have already shown up. The database field has four Turing Award winners in total, and they will appear again and again in this series.

From my personal point of view, SQL is an easy language to get started with, but writing complex queries is a chasm that is very hard to cross. Whether such a language is well designed is therefore a matter of opinion.

But SQL has a problem: its naming is inconsistent with relational algebra. SQL's SELECT is PROJECT in relational algebra, while SELECT in relational algebra corresponds to SQL's WHERE and HAVING. Such inconsistencies are confusing for beginners.
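
To make the naming clash concrete, here is a minimal sketch in Python using the standard sqlite3 module. The employee table and its rows are made up purely for illustration; they do not come from any real system.

```python
import sqlite3

# Hypothetical table and rows, invented for this example only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", "db", 120), ("Bob", "db", 90), ("Cid", "os", 100)],
)

# Relational-algebra PROJECT (choose columns) is SQL's SELECT list.
projected = conn.execute(
    "SELECT name, dept FROM employee ORDER BY name"
).fetchall()

# Relational-algebra SELECT (choose rows) is SQL's WHERE clause.
selected = conn.execute(
    "SELECT * FROM employee WHERE salary >= 100 ORDER BY name"
).fetchall()

# HAVING is also a row filter, applied to groups after aggregation.
big_depts = conn.execute(
    "SELECT dept, COUNT(*) FROM employee GROUP BY dept HAVING COUNT(*) > 1"
).fetchall()

print(projected)
print(selected)
print(big_depts)
```

The first query narrows the columns, while the second and third narrow the rows, even though only the first one is what relational algebra would call a projection.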

SQL also inherits the biggest pitfall of relational algebra: NULL. Simply put, relational algebra uses three-valued logic (TRUE, FALSE, and NULL) rather than the usual two-valued logic. Introducing NULL brings with it a whole series of complicated rule changes, and it is one of the biggest pitfalls in SQL.
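
A few of these rule changes can be seen in a small sketch, again in Python with the standard sqlite3 module. The table t and its three rows are hypothetical, and the behavior shown is what SQLite does; other databases follow the same three-valued rules for these cases as far as I know, but that is an assumption worth checking on your own system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with NULL is not TRUE; the result is NULL (None in Python).
(eq_null,) = conn.execute("SELECT NULL = NULL").fetchone()

# Hypothetical table with one NULL row, invented for this example only.
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (None,)])


def count(where: str) -> int:
    """Count the rows of t for which the given predicate is TRUE."""
    return conn.execute(f"SELECT COUNT(*) FROM t WHERE {where}").fetchone()[0]


total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# WHERE keeps only rows whose predicate is TRUE, so the NULL row
# matches neither x = 1 nor x <> 1 and silently disappears.
equal_one = count("x = 1")
not_equal_one = count("x <> 1")

# NOT IN against a list containing NULL can never evaluate to TRUE.
not_in = count("x NOT IN (3, NULL)")

# IS NULL is the only reliable way to find the NULL row.
is_null = count("x IS NULL")

print(eq_null, total, equal_one, not_equal_one, not_in, is_null)
```

Note that equal_one plus not_equal_one is smaller than total, which is exactly the kind of surprise that two-valued intuition does not prepare you for.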

As a database person, if you have never been driven to despair while fixing NULL-related bugs, if you do not know how nasty NULL is, you are not yet qualified.

Raymond Boyce died of an aneurysm the year he published his paper, leaving Donald Chamberlin to enjoy the glory of SQL alone. He has won numerous awards for SQL and became an ACM Fellow, an IEEE Fellow, an IBM Fellow, a member of the US National Academy of Engineering, and so on. I met Donald when I went to IBM for an internship in 2008. When the man from the photos appeared in the flesh, I really had the urge to kneel. He is a living treasure.

