List of commits:
Subject Hash Author Date (UTC)
Few patches to prepare for Fall 2019. 545561b3d47dfe9bff2436f8f2b7e32572cd808a aubert@math.cnrs.fr 2019-08-07 17:12:26
Editing the cycle of design picture 9bd1c3190a71202b88011b831161cf9d6a365237 aubert@math.cnrs.fr 2019-08-07 16:13:03
Test c742ba7590d59e2941f9fa197c9d009d5c642e76 au 2019-07-31 23:56:14
Minor fix in Known bugs and contrib. 19f3adfa1f88e6d12e94868f4eab0b5838d05eb6 aubert@math.cnrs.fr 2019-07-29 20:51:14
Commit 2ed1a8532cbdae46d3feb99cf52f23f2afa144ec aubert@math.cnrs.fr 2019-07-29 19:01:36
Adding the final to the notes. 0e8848a3e67ec49bab5b767e25d6099fd218e417 aubert@math.cnrs.fr 2019-05-20 14:44:54
Added the solution to one of the problem from Exam #2. 625545a335c5dc7b7303355088a3abba999fdf40 aubert@math.cnrs.fr 2019-05-06 16:59:50
Added a solution to one of the problem of Exam #2, and added the last quiz. 4e0ee25d509412e5e455d9ad4c98f7aad8c5a354 aubert@math.cnrs.fr 2019-04-19 15:14:10
A few glitches in the first java programs patched, added the alternative version to the source. 48611244d21515269bae041dc2535b7e08fb5877 aubert@math.cnrs.fr 2019-04-04 22:05:03
patching a few glitches with the java code and the mysql code used in the demo for the first program. 0fcb73dcc60aef476899c4efea092c99dd6027db aubert@math.cnrs.fr 2019-04-04 21:55:55
Tidying up a program. dd409ab3fd6736af4b8adaac284ee101b9770f60 aubert@math.cnrs.fr 2019-04-04 19:02:21
Correcting the chapter on db programming. 484c4a6372056106bae77e48b397c1174738fc26 aubert@math.cnrs.fr 2019-04-04 18:57:50
Added solution to first problem of Exam 2. 45881a9b8ab1e5a0071b58d72362059593365d52 aubert@math.cnrs.fr 2019-04-03 18:38:48
Quick note in Java programming. 625282d4fece1f6a7a617a8c9bcf7c3c856185b3 aubert@math.cnrs.fr 2019-04-02 18:42:48
Added exam #2 0aa6788451b34cb6371fed5a0f719a4ac52cf0c8 aubert@math.cnrs.fr 2019-04-01 18:37:43
Fixing a typo. 502262b742ac86d7b3d5e95a0f9fcd7ca3a6e586 aubert@math.cnrs.fr 2019-03-22 17:35:43
Forgot one figure. 1212e57a5f2e7b2f104f432feb4dce0669cdb452 aubert@math.cnrs.fr 2019-03-22 17:32:36
Working on Ch. 4, adding the content of the weekly announcements to the core of the document, added illustrations on the conversion E.R. -> Rel. model, fixed various typos. 1995837f41c1508ae809d5a91065bdefd5bd6c03 aubert@math.cnrs.fr 2019-03-22 17:27:51
Fixing exercise and normalization in Chapter 4. 7aa4b9a581daf7144dd485ba791b7707585088b8 aubert@math.cnrs.fr 2019-03-20 18:55:42
Quick fix on UML diagrams. 86a52b2f6ce92750bdb2c33eaa4dbc4497322e1c aubert@math.cnrs.fr 2019-03-18 18:37:24
Commit 545561b3d47dfe9bff2436f8f2b7e32572cd808a - Few patches to prepare for Fall 2019.
Author: aubert@math.cnrs.fr
Author date (UTC): 2019-08-07 17:12
Committer name: aubert@math.cnrs.fr
Committer date (UTC): 2019-08-07 17:12
Parent(s): 9bd1c3190a71202b88011b831161cf9d6a365237
Signing key:
Tree: ae6d7ac395ee7ff6248853b245fef50fb08bdebd
File Lines added Lines deleted
notes/lectures_notes.md 51 25
File notes/lectures_notes.md changed (mode: 100644) (index 9f6da94..15d6c5c)
... ... For `DISTINCT`, `ALL` and `UNION`, cf. [@Textbook6, 4.3.4] or [@Textbook7, 6.3.
1705 1705 For `ORDER BY` , cf. [@Textbook6, 4.3.6] or [@Textbook7, 6.3.6]. For `ORDER BY` , cf. [@Textbook6, 4.3.6] or [@Textbook7, 6.3.6].
1706 1706 For aggregate functions, cf. [@Textbook6, 5.1.7] or [@Textbook7, 7.1.7]. For aggregate functions, cf. [@Textbook6, 5.1.7] or [@Textbook7, 7.1.7].
1707 1707
1708 ## AUTO_INCREMENT
1708 ### AUTO_INCREMENT
1709 1709
1710 1710 Something that is not exactly a constraint, but that can be used to "qualify" domains, is the `AUTO_INCREMENT`{.sqlmysql} feature of MySQL. Something that is not exactly a constraint, but that can be used to "qualify" domains, is the `AUTO_INCREMENT`{.sqlmysql} feature of MySQL.
1711 1711 Cf. <https://dev.mysql.com/doc/refman/8.0/en/example-auto-increment.html>, you can have MySQL increment a particular attribute (most probably intended to be your primary key) for you. Cf. <https://dev.mysql.com/doc/refman/8.0/en/example-auto-increment.html>, you can have MySQL increment a particular attribute (most probably intended to be your primary key) for you.
 
... ... The following links could be useful:
1985 1985 #. Save the "mysql-installer-web-community-XXX.msi" file, and open it. If there is an updated version of the installer available, agree to download it. Accept the license term. #. Save the "mysql-installer-web-community-XXX.msi" file, and open it. If there is an updated version of the installer available, agree to download it. Accept the license term.
1986 1986 #. We will now install the various components needed for this class, leaving all the choices by defaults. This means that you need to do the following: #. We will now install the various components needed for this class, leaving all the choices by defaults. This means that you need to do the following:
1987 1987 #. Leave the first option on "Developer Default" and click on "Next", or click on "Custom", and select the following: #. Leave the first option on "Developer Default" and click on "Next", or click on "Custom", and select the following:
1988 ![](img/mysql_install.png){width=90%}
1988
1989 ![](img/mysql_install.png){width=90%}
1990
1989 1991 #. Click on "Next" even if you don't meet all the requirements #. Click on "Next" even if you don't meet all the requirements
1990 1992 #. Click on "Execute". The system will download and install several softwares (this may take some time). #. Click on "Execute". The system will download and install several softwares (this may take some time).
1991 1993 #. Click on "Next" twice, leave "Type and Networking" on "Standalone MySQL Server / Classic MySQL Replication" and click "Next", and leave the next options as they are (unless you know what you do and want to change the port, for instance) and click on "Next". #. Click on "Next" twice, leave "Type and Networking" on "Standalone MySQL Server / Classic MySQL Replication" and click "Next", and leave the next options as they are (unless you know what you do and want to change the port, for instance) and click on "Next".
 
... ... Note that:
4390 4392 - **Reflexivity**: If $Y$ is a subset of $X$, then $X → Y$ - **Reflexivity**: If $Y$ is a subset of $X$, then $X → Y$
4391 4393 - **Augmentation**: If $X → Y$, then $\{X, Z\} → Y$ - **Augmentation**: If $X → Y$, then $\{X, Z\} → Y$
4392 4394 - **Transitivity**: If $X → Y$ and $Y → Z$, then $X → Z$ - **Transitivity**: If $X → Y$ and $Y → Z$, then $X → Z$
4393 We will assume that the consequence of those axioms always hold ("closure under those rules"), but will generaly not write them explicitely.
4395
4396 We will assume that the consequence of those axioms always hold ("closure under those rules"), but will generaly not write them explicitely, since they don't carry any new or additional information.
4394 4397
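To make "closure under those rules" concrete, here is a minimal sketch that computes the closure of a set of attributes under a list of functional dependencies. The class `FdClosure`, the nested class `Fd` and the helper `attrs` are hypothetical names introduced only for this illustration; they do not come from the notes or from any library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class FdClosure {
    /** One functional dependency lhs → rhs (hypothetical helper type). */
    static final class Fd {
        final Set<String> lhs;
        final Set<String> rhs;

        Fd(Set<String> lhs, Set<String> rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
        }
    }

    /** Builds a sorted set of attribute names. */
    static Set<String> attrs(String... names) {
        return new TreeSet<>(Arrays.asList(names));
    }

    /** Returns every attribute determined by `start` under `fds`. */
    static Set<String> closure(Set<String> start, List<Fd> fds) {
        // Reflexivity: the starting attributes determine themselves.
        Set<String> result = new TreeSet<>(start);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Fd fd : fds) {
                // Augmentation and transitivity: once the whole left-hand side
                // is reached, the right-hand side is reached as well.
                if (result.containsAll(fd.lhs) && result.addAll(fd.rhs)) {
                    changed = true;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Fd> fds = Arrays.asList(
            new Fd(attrs("A"), attrs("B")),
            new Fd(attrs("B"), attrs("C")));
        System.out.println(closure(attrs("A"), fds)); // [A, B, C]
        System.out.println(closure(attrs("C"), fds)); // [C]
    }
}
```

With $A → B$ and $B → C$, the closure of $\{A\}$ contains $C$ even though $A → C$ was never written down, which is why the derived dependencies can safely be left implicit.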
4395 4398 #### Definitions #### Definitions
4396 4399
 
... ... We now have a formal definition.
4400 4403 In one particular relation $R(A_1, …, A_n)$, In one particular relation $R(A_1, …, A_n)$,
4401 4404
4402 4405 - If $\{A_1, …, A_n\} → Y$ for all attribute $Y$, then $\{A_1, …, A_n\}$ is a superkey. - If $\{A_1, …, A_n\} → Y$ for all attribute $Y$, then $\{A_1, …, A_n\}$ is a superkey.
4403 - If $\{A_1, …, A_n\} \setminus A_i$ is not a superkey anymore for all $A_i$, then $\{A_1, …, A_n\}$ is a key.
4406 - If $\{A_1, …, A_n\} / A_i$ is not a superkey anymore for all $A_i$, then $\{A_1, …, A_n\}$ is a key.
4404 4407 - We will often discard candidate keys and focus on one primary key. <!-- try to list all the candidates key, keep all the options open. --> - We will often discard candidate keys and focus on one primary key. <!-- try to list all the candidates key, keep all the options open. -->
4405 4408 - If $A_i$ is a member of some candidate key of $R$, it is a **prime attribute** of $R$. - If $A_i$ is a member of some candidate key of $R$, it is a **prime attribute** of $R$.
4406 4409 It is a **non-prime attribute** otherwise. It is a **non-prime attribute** otherwise.
4407 4410
4408 4411 Given a FD $\{A_1, …, A_n\} → Y$, Given a FD $\{A_1, …, A_n\} → Y$,
4409 4412
4410 - It is a **full functional dependency** if for all $A_i$, \{A_1, …, A_n\} \setminus A_i → Y$, does not hold.
4413 - It is a **full functional dependency** if for all $A_i$, $\{A_1, …, A_n\} / A_i → Y$, does not hold.
4411 4414 - It is a **partial dependency** otherwise. - It is a **partial dependency** otherwise.
4412 4415
4413 4416 A FD : $X → Y$ is a **transivive dependency** if there exist a set of attribute $B$ s.t. A FD : $X → Y$ is a **transivive dependency** if there exist a set of attribute $B$ s.t.
 
... ... A FD : $X → Y$ is a **transivive dependency** if there exist a set of attribut
4417 4420 - $B$ is not a subset of any candidate key, - $B$ is not a subset of any candidate key,
4418 4421 - $X → B$ and $B → Y$ hold - $X → B$ and $B → Y$ hold
4419 4422
4423 <!--
4420 4424 **Examples on lecture 17's note to incorporate?** **Examples on lecture 17's note to incorporate?**
4425 -->
4421 4426
4422 4427 --- ---
4423 4428
 
... ... Problem (Normal form of the MESSAGE relation) +.#NormalizeMessage
5542 5547 #. Write each of the following business statement as a functional dependency: #. Write each of the following business statement as a functional dependency:
5543 5548 #. The length of a message can be computed from its content. #. The length of a message can be computed from its content.
5544 5549 #. The content and attachment determines the size of a message. #. The content and attachment determines the size of a message.
5545 #. A sender can send the same content and attachment to multiple receivers at the exact same time and date, but cannot send two different content and attachment at the exact same time and date. \vspace{5em}
5546 #. Assuming all the functional dependencies you identified at the previous step hold, determine a suitable primary key for this relation. \vspace{8em}
5547 #. Taking the primary key you identified at the previous step, what is the degree of normality of this relation? Justify your answer.\vspace{10em}
5550 #. A sender can send the same content and attachment to multiple receivers at the exact same time and date, but cannot send two different content and attachment at the exact same time and date.
5551 #. Assuming all the functional dependencies you identified at the previous step hold, determine a suitable primary key for this relation.
5552 #. Taking the primary key you identified at the previous step, what is the degree of normality of this relation? Justify your answer.
5548 5553 #. If needed, normalize this relation to the third normal form. #. If needed, normalize this relation to the third normal form.
5549 5554
5550 5555 --- ---
 
... ... Solution to [%D %n (%T)](#problem:movie)
5685 5690
5686 5691 Solution to [%D %n (%T)](#problem:car-insurance) Solution to [%D %n (%T)](#problem:car-insurance)
5687 5692 ~ ~
5693 Two possible solutions are
5688 5694
5689 5695 ![](img/p) ![](img/p)
5690
5691 OR
5696
5697 and
5692 5698
5693 5699 ![](fig/er/Accident) ![](fig/er/Accident)
5694 5700 \ \
 
... ... Solution to [%D %n (%T)](#problem:schedule)
5818 5824 Solution to [%D %n (%T)](#problem:bike) Solution to [%D %n (%T)](#problem:bike)
5819 5825 ~ ~
5820 5826
5821 -
5827 - The functional dependencies we obtain are:
5822 5828 #. \{ Manufacturer, Serial\_no \} → \{ Model, Batch, Wheel\_size, Retailer\} #. \{ Manufacturer, Serial\_no \} → \{ Model, Batch, Wheel\_size, Retailer\}
5823 5829 #. Model → Manufacturer #. Model → Manufacturer
5824 5830 #. Batch → Model #. Batch → Model
 
... ... Java actually uses
5936 5942 And the routine is a bit more complex: And the routine is a bit more complex:
5937 5943
5938 5944 #. Import library #. Import library
5939 #. Load driver (can also be done at execution time)
5945 #. Load driver (done at execution time)
5940 5946 #. Open connection (create `Connection` and `Statement` objects) #. Open connection (create `Connection` and `Statement` objects)
5941 5947 #. Interactc with DB (use `Statement` object) #. Interactc with DB (use `Statement` object)
5942 5948 #. Close connection #. Close connection
 
... ... The records selected are:
6136 6142 `BOOLEAN` | `boolean` `BOOLEAN` | `boolean`
6137 6143 `BIT(1)` | `byte` `BIT(1)` | `byte`
6138 6144
6145 (`DECIMAL(t,d)` was not previously introduced: the `t` stands for the number of digits, the `d` for the precision.)
6146
6139 6147 We cannot always have that correspondance: what would correspond to a reference variable? We cannot always have that correspondance: what would correspond to a reference variable?
6140 6148 To a private attribute? To a private attribute?
6141 6149 This series of problems is called "object-relational impedance mismatch", it can be overcomed, but at a cost. This series of problems is called "object-relational impedance mismatch", it can be overcomed, but at a cost.
 
... ... Problem (Advanced Java Programming) +.#Advanced_java
6469 6477 - Flow control (prevent indirect access) - Flow control (prevent indirect access)
6470 6478 - Encryption (salting + encrypting, can be a legal obligation): password + salt -> hashed. - Encryption (salting + encrypting, can be a legal obligation): password + salt -> hashed.
6471 6479
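The "password + salt -> hashed" step above can be sketched as follows. This is only an illustration under stated assumptions: it uses PBKDF2 (a key-derivation function shipped with the JDK) as the hashing step, and the class name `SaltedPassword`, the 16-byte salt and the iteration count are choices made for this example, not values taken from the notes.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class SaltedPassword {
    /** Returns a fresh random salt; each stored password gets its own. */
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Derives a 256-bit hash from the password and the salt. */
    static byte[] hash(char[] password, byte[] salt) {
        try {
            // 100_000 iterations is an assumed value, to be tuned per deployment.
            PBEKeySpec spec = new PBEKeySpec(password, salt, 100_000, 256);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        }
    }

    /** Re-hashes the candidate with the stored salt and compares the results. */
    static boolean matches(char[] candidate, byte[] salt, byte[] storedHash) {
        return MessageDigest.isEqual(hash(candidate, salt), storedHash);
    }

    public static void main(String[] args) {
        byte[] salt = newSalt();
        byte[] stored = hash("s3cret".toCharArray(), salt);
        System.out.println(matches("s3cret".toCharArray(), salt, stored)); // true
        System.out.println(matches("guess".toCharArray(), salt, stored));  // false
    }
}
```

Only the salt and the hash would be stored in the database: the password itself is never written anywhere, and two users with the same password end up with different hashes because their salts differ.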
6480 ### How to recover?
6481
6482 - Have a plan.
6483
6484 <!--
6472 6485 **Insert short intro. to salting, cryptography.** **Insert short intro. to salting, cryptography.**
6473 6486
6487 + Document, log.
6488 -->
6489
6474 6490 ## Specificities Of Databases ## Specificities Of Databases
6475 6491
6476 6492 ### Attack ### Attack
 
... ... Can also be used for DBMS fingerprinting.
6499 6515 ~~~{.bash} ~~~{.bash}
6500 6516 mysqldump --all-databases - u testuser -p password - h localhost > dump.sql mysqldump --all-databases - u testuser -p password - h localhost > dump.sql
6501 6517 ~~~ ~~~
6502
6503 #. Prepared Statemets (a.k.a. stored procedures)
6504 #. White list input validation
6505 #. Escaping
6518 #. Possible protections from sql injections (-like):
6519 #. Prepared Statemets (a.k.a. stored procedures)
6520 #. White list input validation
6521 #. Escaping
6506 6522 #. Be up-to-date, desactivate the options you are not using, read newsfeeds, #. Be up-to-date, desactivate the options you are not using, read newsfeeds,
6507 6523
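Two of the protections listed above, white list input validation and prepared statements, can be sketched in Java as follows. The sketch uses JDBC's `PreparedStatement`, which is a different mechanism from a stored procedure. The class `SafeLookup`, the table `USERS`, its attribute `Name` and the validation pattern are all hypothetical, and `countUsers` assumes an already opened `Connection`.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.regex.Pattern;

final class SafeLookup {
    // White list: only short alphanumeric names are accepted (an assumed policy).
    private static final Pattern NAME = Pattern.compile("[A-Za-z0-9_]{1,30}");

    static boolean isValidName(String input) {
        return input != null && NAME.matcher(input).matches();
    }

    /**
     * Counts the rows of the hypothetical table USERS whose Name equals `name`.
     * The value is bound to the `?` placeholder, never concatenated into the query.
     */
    static int countUsers(Connection conn, String name) throws SQLException {
        if (!isValidName(name)) {
            throw new IllegalArgumentException("rejected input");
        }
        String sql = "SELECT COUNT(*) FROM USERS WHERE Name = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, name);
            try (ResultSet rs = stmt.executeQuery()) {
                rs.next();
                return rs.getInt(1);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidName("alice"));        // true
        System.out.println(isValidName("x' OR '1'='1")); // false
    }
}
```

The `main` method only exercises the validation, since running `countUsers` requires a database and a JDBC driver. Either protection alone would stop the classic `' OR '1'='1` input; combining them is a defense-in-depth choice.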
6508 6524 # Presentation of NoSQL # Presentation of NoSQL
6509 6525
6510 6526 ## Resources {-} ## Resources {-}
6511 6527
6512 [@NoSQLDistilled], <https://en.wikipedia.org/wiki/NoSQL>
6513 [@Sullivan2015], [@Textbook7, Chapter 24], [@DBLP:journals/sigmod/PavloA16]
6514 - <http://delivery.acm.org/10.1145/1780000/1773922/p35-lakshman.pdf?ip=134.224.220.1&id=1773922&acc=ACTIVE%20SERVICE&key=A79D83B43E50B5B8%2EA1A26A3EF7ED82C5%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&__acm__=1524060110_1b69882dcd91c4186c3613d6cebf5549> and <https://docs.datastax.com/en/articles/cassandra/cassandrathenandnow.html>
6528 To write this chapter, I used
6529
6530 - [@NoSQLDistilled],
6531 -<https://en.wikipedia.org/wiki/NoSQL>
6532 - [@Sullivan2015],
6533 - [@Textbook7, Chapter 24],
6534 - [@DBLP:journals/sigmod/PavloA16]
6535 - <http://delivery.acm.org/10.1145/1780000/1773922/p35-lakshman.pdf?ip=134.224.220.1&id=1773922&acc=ACTIVE%20SERVICE&key=A79D83B43E50B5B8%2EA1A26A3EF7ED82C5%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&__acm__=1524060110_1b69882dcd91c4186c3613d6cebf5549> and
6536 - <https://docs.datastax.com/en/articles/cassandra/cassandrathenandnow.html>
6515 6537
6516 6538
6517 6539 ## A Bit of History ## A Bit of History
 
... ... When you write a DB application, you have two options:
6525 6547 #. One database for many softwares #. One database for many softwares
6526 6548 #. One database for each softwares #. One database for each softwares
6527 6549
6528 Option a. can cause severe impacts on the efficiency of your database: since maintening the integrity of the database is a requirement, a lot of synchronization is needed.
6550 Option 1. can cause severe impacts on the efficiency of your database: since maintening the integrity of the database is a requirement, a lot of synchronization is needed.
6529 6551 With option b., you develop an "application database", and you have more freedom of choice: since only a program interact with a database, you can chose whatever data management you want. With option b., you develop an "application database", and you have more freedom of choice: since only a program interact with a database, you can chose whatever data management you want.
6530 6552
6531 6553 But people were attached to SQL and kept using it. But people were attached to SQL and kept using it.
 
... ... Increase in everything (traffic, size of data, number of clients, etc.) meant "u
6537 6559 #. Bigger machines #. Bigger machines
6538 6560 #. More machines #. More machines
6539 6561
6540 Option b. was generally less expensive, but came with two drawbacks w.r.t. databases:
6562 Option 2. was generally less expensive, but came with two drawbacks w.r.t. databases:
6541 6563
6542 6564 #. Cost of licences, #. Cost of licences,
6543 6565 #. Force to perform "unnatural acts": relational model are really not made to be distributed #. Force to perform "unnatural acts": relational model are really not made to be distributed
 
... ... Today, no official definition, but NoSQL often implies the followig:
6580 6602 - Not using `SQL`. Some still have a query language, and it ressembles `SQL` (to minimize learning cost), for instance Cassandra's CQL. - Not using `SQL`. Some still have a query language, and it ressembles `SQL` (to minimize learning cost), for instance Cassandra's CQL.
6581 6603 - Run well on clusters - Run well on clusters
6582 6604 - Schemaless: you can add records without having to define a change in the structure first. - Schemaless: you can add records without having to define a change in the structure first.
6583 - Open source
6605 - "Open source" (even if recents changes makes their licence [not really open source](https://opensource.org/LicenseReview122018)).
6584 6606
6585 6607 Most importantly: polyglot persistence, "using different data storage technologies to handle varying data storage needs." Most importantly: polyglot persistence, "using different data storage technologies to handle varying data storage needs."
6586 6608
 
... ... MongoDB announced that it would have more and more of the ACID properties! <http
6596 6618 Also, a really great use of NoSQL is to adopt it at an early stage of the development, when it isn't clear what the schemas should be. Also, a really great use of NoSQL is to adopt it at an early stage of the development, when it isn't clear what the schemas should be.
6597 6619 When the schemas are final, then you can shift to relational DBMS! When the schemas are final, then you can shift to relational DBMS!
6598 6620
6621 The retro-acronym "Not Only SQL" emphasizes that `SQL` will still be one of the principal actor, but that developer should be aware of other solutions for other needs.
6622
6599 6623 ## Comparison ## Comparison
6600 6624
6601 6625 ### Overview ### Overview
 
... ... When the schemas are final, then you can shift to relational DBMS!
6613 6637 Vs Vs
6614 6638
6615 6639 - Immediate data consistency - Immediate data consistency
6616 - Powerfull query language (join is missing from SQL, has to be implemented on the application-side)
6640 - Powerfull query language (for instance, join is often missing in NoSQL, has to be implemented on the application-side)
6617 6641 - Structured data storage (can be too restrictive) - Structured data storage (can be too restrictive)
6618 6642
6619 ### ACID Vs CAP {#sec:AcidVsCAP}
6643 ### ACID Vs CAP Vs BASE {#sec:AcidVsCAP}
6620 6644
6621 6645 ACID is the guarantee of validity even in the event of errors, power failures, etc. ACID is the guarantee of validity even in the event of errors, power failures, etc.
6622 6646
 
... ... ACID is the guarantee of validity even in the event of errors, power failures, e
6625 6649 - Isolation → Executing two transactions in parallel or one after the other would have the same result - Isolation → Executing two transactions in parallel or one after the other would have the same result
6626 6650 - Durability → Once a transaction has been commited, it is stored in non-volatile memory. - Durability → Once a transaction has been commited, it is stored in non-volatile memory.
6627 6651
6628 CAP (a.k.a. Brewer's theorem): Roughly, "In a distributed system, one has to choose between consistency (every read receives the most recent write or an error) and availability (every request receives a (non-error) response, without guarantee that it contains the most recent write)" (the P. standing for "Partition tolerance").
6652 CAP (a.k.a. Brewer's theorem): Roughly, "In a distributed system, one has to choose between consistency (every read receives the most recent write or an error) and availability (every request receives a (non-error) response, without guarantee that it contains the most recent write)" (the P. standing for "Partition tolerance", a guarantee of availability).
6653
6654 BASE is Basic Availability, Soft state, Eventual consistency.
6629 6655
6630 6656 ## Categories of NoSQL Systems ## Categories of NoSQL Systems
6631 6657
Hints:
Before your first commit, do not forget to set up your git environment:
git config --global user.name "your_name_here"
git config --global user.email "your@email_here"

Clone this repository using HTTP(S):
git clone https://rocketgit.com/user/caubert/CSCI_3410

Clone this repository using ssh (do not forget to upload a key first):
git clone ssh://rocketgit@ssh.rocketgit.com/user/caubert/CSCI_3410

Clone this repository using git:
git clone git://git.rocketgit.com/user/caubert/CSCI_3410

You are allowed to anonymously push to this repository.
This means that your pushed commits will automatically be transformed into a merge request:
... clone the repository ...
... make some changes and some commits ...
git push origin main