Today a handful of us who work heavily in various reporting and other related quantitative data tasks had a meeting that was scheduled for an hour, and went slightly overtime, where we enjoyed some food courtesy of our company from Great Harvest Bread Company and heard a seminar from one of our employees on the basics of the language, model, and mathematics behind SQL and Relational Database Management Systems in general. As is my habit, I thought it worthwhile to record my own minutes and reflections on the lunch . As far as the food was concerned, it was a good lunch. I had a turkey sandwich with no mayo, mustard, or tomatoes, a chicken noodle soup, and an oatmeal raisin cookie with plain chips, and enjoyed it. Concerning the talk, while the subject was admittedly technical, it had ramifications that extended far beyond the merely technical, and so while IT was moving my desk yet again , we had an enjoyable discussion that went far beyond the mere technical. Thankfully, the presenter sent her slides as well, which will make for some worthwhile studying as well, especially as far as the references are concerned.
The talk began with an introduction that discussed some of the higher level features of SQL, including its status as a highly typed language that started with some basic standards as well as various proprietary connections by Microsoft and Oracle, among others, the relational database model it is based on, namely Codd’s 13 rules, that test whether a given system is truly relational, and some of the basic aspects of the mathematics that SQL uses, namely set theory, first order logic, three-valued logic (including null as well as true and false), as well as relational algebra. After this we discussed at some length the expansions provided by T-SQL above and beyond that contained in the basic ANSI/ISO standards for SQL, functions like ISNULL, Convert, != (is not equal to), Money, Identity (1,1), and the decreased use of semicolons that is allowed. We also discussed the comparison between a declarative language like SQL that works with a set as a whole and deals in interpretation with the point that one does not micromanage a language like SQL, which generally knows what it is doing and follows a particular pattern, whereas with many imperative programming languages, one specifies a procedure precisely. I, for one, consider myself more suited to a declarative language than an imperative language. Let me know what you want, and let me deal with the how. Despite the somewhat rudimentary level of most of us in terms of our SQL skills, we are all familiar with the issues with float and integer data types contained in the data we have worked with, and the limits as to what functions one can use based upon the types of data that one is dealing with. One aspect of the talk that was particularly interesting to me was the discussion of Codd’s 13 rules (which include a Rule 0, so it is usually referred to as 12 rules) for relational databases. The rules themselves require rigorous, systematic, consistent, structured, universal, set-based procedures that require a great deal of independence between the data and the user of the data to avoid corruption.
After this discussion of these rules, we then touched briefly on the difference between sets, bags, and sequences, namely that sets (as well as subsets and supersets) are unordered and distinct, while bags are unordered but not distinct, and sequences are ordered. This led into a discussion of why the order by aspect of a query is done at the end. There were also discussions of unions, intersections, complements, and Cartesian products (like joins). In looking at first order logic, dealing with logical operators (like not, and, or, if/then, and if and only if), Boolean logic, and grouping, we had some jokes about how logicians behave by comparing normal language with logical language. This led to a discussion of the fact that a failure to speak logically and to specify assumptions and order of operations is responsible for much of the ambiguity and miscommunication that occurs between people, because what is perfectly logical to different people rests on different, and even wildly contradictory, assumptions. Our discussion of nulls, and the fact that in SQL the nulls include both known unknowns that are not applicable as well as unknown unknowns that are missing but applicable are indistinguishable, and that in forcing a given value if something is null masks what is missing in the original data, which we all agreed was a pretty meta discussion. By this point the lunch was nearing its end, and we briefly discussed the query engine as well as some resources on how to improve our SQL skills, which I will no doubt take advantage of as I have in the past .
 See, for example:
 See, for example: