File notes/lectures_notes.md changed (mode: 100644) (index 6178d98..2974fcf) |
... |
... |
_I would like here to include some of the softwares I believe could make your se |
274 |
274 |
|
|
275 |
275 |
There is a good chance that any programming language you can think of is [Turing complete](https://en.wikipedia.org/wiki/Turing-completeness). |
There is a good chance that any programming language you can think of is [Turing complete](https://en.wikipedia.org/wiki/Turing-completeness). |
276 |
276 |
Actually, even some of the extremely basic tools you may be using [may be Turing complete](https://www.gwern.net/Turing-complete). |
Actually, even some of the extremely basic tools you may be using [may be Turing complete](https://www.gwern.net/Turing-complete). |
277 |
|
However, being complete does not mean being good at any task: it just mean that any computable problem can be solved, but does not imply anything in terms of efficiency, comfort, or usability. |
|
|
277 |
|
However, being complete does not mean being good at any task: it just means that any computable problem can be solved, but does not imply anything in terms of efficiency, comfort, or usability. |
278 |
278 |
|
|
279 |
279 |
In theory, pretty much any programming language can be used to |
In theory, pretty much any programming language can be used to |
280 |
280 |
|
|
|
... |
... |
In theory, pretty much any programming language can be used to |
282 |
282 |
- Have accessible catalog describing the metadata, |
- Have accessible catalog describing the metadata, |
283 |
283 |
- Support transactions and concurrency, |
- Support transactions and concurrency, |
284 |
284 |
- Support authorization of access and update of data, |
- Support authorization of access and update of data, |
285 |
|
- Enforce constraints, |
|
|
285 |
|
- Enforce constraints. |
286 |
286 |
|
|
287 |
287 |
But to obtain a system that is fast in reading and writing on the disk, convenient to search in the data, and that provides as many "built-in" tools as possible, one should use a specialized tool. |
But to obtain a system that is fast in reading and writing on the disk, convenient to search in the data, and that provides as many "built-in" tools as possible, one should use a specialized tool. |
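
As a minimal sketch of what such "built-in" tools look like, the example below uses Python's standard `sqlite3` module with an in-memory database. The `ACCOUNT` table, its columns, and the `transfer` helper are hypothetical names chosen only for illustration. It shows a queryable catalog, a constraint enforced by the system, and a transaction that is rolled back as a whole when one of its steps fails. Authorization is not shown, since SQLite does not manage user accounts.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK clause is a constraint enforced by the system.
conn.execute(
    "CREATE TABLE ACCOUNT ("
    " Owner TEXT PRIMARY KEY,"
    " Balance INTEGER NOT NULL CHECK (Balance >= 0))"
)
conn.execute("INSERT INTO ACCOUNT VALUES ('Alice', 100), ('Bob', 0)")
conn.commit()

# The catalog is itself queryable: table names, then column names.
tables = [row[0] for row in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
columns = [row[1] for row in conn.execute("PRAGMA table_info(ACCOUNT)")]


def transfer(src, dst, amount):
    """Move `amount` from `src` to `dst`; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE ACCOUNT SET Balance = Balance + ? WHERE Owner = ?",
                (amount, dst))
            conn.execute(
                "UPDATE ACCOUNT SET Balance = Balance - ? WHERE Owner = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


def balances():
    """Return the current content of ACCOUNT as a dictionary."""
    return dict(conn.execute("SELECT Owner, Balance FROM ACCOUNT"))
```

Calling `transfer('Alice', 'Bob', 500)` violates the constraint, so the credit already applied to Bob's row is undone as well. Obtaining the same guarantee with plain files would require writing that logic by hand.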
288 |
288 |
|
|
289 |
|
In this lecture notes, we will introduce one of this tool--the `SQL` programming language-- and the theory underneath it--the relational model--. |
|
|
289 |
|
In these lecture notes, we will introduce one of these tools (the `SQL` programming language) and the theory underneath it (the relational model). |
290 |
290 |
We will also observe that a careful design is a mandatory step before implementing a catalog, and that how good a catalog is can be assessed, and introduce the tools to do so. |
We will also observe that a careful design is a mandatory step before implementing a catalog, and that how good a catalog is can be assessed, and introduce the tools to do so. |
291 |
|
Finally, we will discuss how an application interacting with a database can be implemented and secured, and the alternatives to `SQL` offered by the NoSQL approach, as well as the limitations of both models. |
|
|
291 |
|
Finally, we will discuss how an application interacting with a database can be implemented and secured, and the alternatives to `SQL` offered by the NoSQL approach, as well as the limitations and highlights of both models. |
292 |
292 |
|
|
293 |
293 |
## Database |
## Database |
294 |
294 |
|
|
295 |
295 |
A database (DB) is **a collection of related data**. |
A database (DB) is **a collection of related data**. |
296 |
296 |
Data (= information, can be anything, really) + management (= logical organization of the data), through **D**ata**b**ase **M**anagement **S**ystem. |
Data (= information, can be anything, really) + management (= logical organization of the data), through **D**ata**b**ase **M**anagement **S**ystem. |
297 |
297 |
|
|
298 |
|
#. Represent a mini-world, a Universe of Disclosure (UoD). |
|
|
298 |
|
#. Represent a mini-world, a "Universe of Discourse" (UoD). |
299 |
299 |
#. Logically coherent, with a meaning. |
#. Logically coherent, with a meaning. |
300 |
300 |
#. Populated for a purpose. |
#. Populated for a purpose. |
301 |
301 |
|
|
|
... |
... |
A DBMS has multiple components, as follows: |
307 |
307 |
|
|
308 |
308 |
Note that |
Note that |
309 |
309 |
|
|
310 |
|
- The program can be written in any language, be a web interface, etc. It is sometimes part of the software shipped with the DBMS, but not necessarily. |
|
311 |
|
- Most DBMS software include a Command-Line Interface (C.L.I.). |
|
312 |
|
- The catalog (or schema, meta-data^[The term "meta-data" has numerous definition ("data about the data"): we use it here to refer to the description of the organization of the data, and not e.g. to statistical data about our data.]) contains the description of how the data is stored, i.e., the datatypes, nature of the attributes, etc. |
|
|
310 |
|
- The program can be written in any language, be a web interface, etc. It is sometimes part of the software shipped with the DBMS, but not necessarily (you can, and we will, develop your own program to interact with the DBMS). |
|
311 |
|
- Most DBMS software includes a Command-Line Interface (CLI). |
|
312 |
|
- The catalog (or schema, meta-data^[The term "meta-data" has numerous definitions ("data about the data"): we use it here to refer to the description of the organization of the data, and not e.g. to statistical data about the data.]) contains the description of how the data is stored, i.e., the datatypes, nature of the attributes, etc. |
313 |
313 |
- Sometimes, catalog and data are closer than pictured (you can have "self-describing meta-data", that is, they cannot be distinguished). |
- Sometimes, catalog and data are closer than pictured (you can have "self-describing meta-data", that is, they cannot be distinguished). |
314 |
314 |
|
|
315 |
315 |
## Database Management System (DBMS) |
## Database Management System (DBMS) |
|
... |
... |
A DBMS contains a *general purpose* software that is used to |
323 |
323 |
|
|
324 |
324 |
You can think of a tool to |
You can think of a tool to |
325 |
325 |
|
|
326 |
|
#. specify a storage unit, |
|
327 |
|
#. fill it, |
|
328 |
|
#. allow to change its content, as well as its organization, |
|
329 |
|
#. allow multiple persons to access all or parts of it at the same time. |
|
|
326 |
|
#. Specify a storage unit, |
|
327 |
|
#. Fill it, |
|
328 |
|
#. Allow changing its content, as well as its organization, |
|
329 |
|
#. Allow multiple persons to access all or parts of it at the same time. |
330 |
330 |
|
|
331 |
331 |
## Subtasks |
## Subtasks |
332 |
332 |
|
|
333 |
333 |
Exactly like a program can have |
Exactly like a program can have |
334 |
334 |
|
|
335 |
|
- a client, that specify the requirements, |
|
|
335 |
|
- clients, that specify the requirements, |
336 |
336 |
- designers, that define the overall architecture of a program, |
- designers, that define the overall architecture of a program, |
337 |
337 |
- programmers, that implement the details of the program, |
- programmers, that implement the details of the program, |
338 |
338 |
- testers, that make sure the program is free of bugs, and |
- testers, that make sure the program is free of bugs, and |
|
... |
... |
Exactly like a program can have |
340 |
340 |
|
|
341 |
341 |
a DBMS offers multiple (sub)tasks and can be interacted with by different persons with different roles. |
a DBMS offers multiple (sub)tasks and can be interacted with by different persons with different roles. |
342 |
342 |
|
|
343 |
|
<!-- |
|
344 |
|
can be used by multiple users for different reasons, and different tasks can be assigned to different members of a project. |
|
345 |
|
--> |
|
346 |
343 |
|
|
347 |
344 |
| Role | Task | |
| Role | Task | |
348 |
345 |
| --- | ---------- | |
| --- | ---------- | |
|
... |
... |
can be used by multiple users for different reasons, and different tasks can be |
352 |
349 |
| Programmer | Implement the database, work on the programs that will interface with it | |
| Programmer | Implement the database, work on the programs that will interface with it | |
353 |
350 |
| User | Provide, search, and edit the data (usually) | |
| User | Provide, search, and edit the data (usually) | |
354 |
351 |
|
|
355 |
|
<!-- |
|
356 |
|
#. Maintenance (DB administrator) |
|
357 |
|
#. Organization (DB designer) |
|
358 |
|
#. Modification, retrieval (end-user) |
|
359 |
|
#. Software engineer, web developer, programers, … |
|
360 |
|
--> |
|
361 |
|
|
|
362 |
|
In those lecture notes, the main focus will be on design, but we will have to do a little bit of everything, without forgetting which role we are currently playing. |
|
|
352 |
|
In these lecture notes, the main focus will be on design and implementation, but we will have to do a little bit of everything, without forgetting which role we are currently playing. |
363 |
353 |
|
|
364 |
354 |
## Life of a Project |
## Life of a Project |
365 |
355 |
|
|
|
... |
... |
You can describe the structure as a collection of relations, and a collection of |
448 |
438 |
|
|
449 |
439 |
### Interactions |
### Interactions |
450 |
440 |
|
|
451 |
|
- This organization will allow some interactions. For instance, we can obtain the answer to questions like "What is the name of the course whose number is 1301?", "What courses is Kate teaching this semester?", "Does Bob meets the pre-requisite for 2910?", etc. Note that this last query is a bit different, as it forces us to look up information in multiple relations. |
|
|
441 |
|
- This organization will allow some interactions. For instance, we can obtain the answer to questions like |
|
442 |
|
|
|
443 |
|
> "What is the name of the course whose number is 1301?", |
|
444 |
|
> "What courses is Kate teaching this semester?", |
|
445 |
|
> "Does Bob meet the prerequisites for 2910?" |
|
446 |
|
|
|
447 |
|
Note that this last query is a bit different, as it forces us to look up information in multiple relations. |
452 |
448 |
- We should also be able to perform updates, removal, addition of records in an efficient way (using auxiliary files (indexes), optimization). |
- We should also be able to perform updates, removal, addition of records in an efficient way (using auxiliary files (indexes), optimization). |
453 |
449 |
- Finally, selection (for any operation) requires care: do we want all the records, some of them, exactly one? |
- Finally, selection (for any operation) requires care: do we want all the records, some of them, exactly one? |
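
The questions quoted above can be sketched as queries, here using Python's standard `sqlite3` module. The tables `COURSE`, `PREREQUISITE`, and `PASSED`, their columns, the course names, and the two helper functions are all assumptions made for this illustration and are not the schema used in these notes. The first question reads a single relation, while the prerequisite question has to combine two of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical relations and sample rows, for illustration only.
conn.executescript("""
    CREATE TABLE COURSE (Number INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PREREQUISITE (Course INTEGER, Required INTEGER);
    CREATE TABLE PASSED (Student TEXT, Course INTEGER);
    INSERT INTO COURSE VALUES (1301, 'Course A'), (2910, 'Course B');
    INSERT INTO PREREQUISITE VALUES (2910, 1301);
    INSERT INTO PASSED VALUES ('Bob', 1301);
""")


def course_name(number):
    """Single-relation lookup: the name of the course with this number."""
    row = conn.execute(
        "SELECT Name FROM COURSE WHERE Number = ?", (number,)).fetchone()
    return row[0] if row else None


def meets_prerequisites(student, course):
    """Two-relation query: count the required courses not yet passed."""
    missing = conn.execute(
        """
        SELECT COUNT(*) FROM PREREQUISITE
        WHERE Course = ?
          AND Required NOT IN (SELECT Course FROM PASSED WHERE Student = ?)
        """,
        (course, student)).fetchone()[0]
    return missing == 0
```

For instance, `meets_prerequisites('Bob', 2910)` looks up the requirements of course 2910 in one relation and Bob's record in another. The `WHERE` clauses also make explicit whether we expect exactly one record (`course_name`) or a set of them.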
454 |
450 |
|
|
|
... |
... |
You can describe the structure as a collection of relations, and a collection of |
456 |
452 |
|
|
457 |
453 |
Why are the files separated like that? |
Why are the files separated like that? |
458 |
454 |
Why don't we store the section with the course with the students? |
Why don't we store the section with the course with the students? |
|
455 |
|
For multiple reasons: |
459 |
456 |
|
|
460 |
|
- **Avoiding redundancy** ("data normalization"), or having it controlled, |
|
461 |
|
- **Levels of access** (multiple user interface), |
|
462 |
|
- And we still have the same usability! |
|
|
457 |
|
- To **avoid redundancy** ("data normalization"), or having it controlled, |
|
458 |
|
- To control multiple **levels of access** (multiple user interface), |
|
459 |
|
- Without sacrificing the **usability**! |
463 |
460 |
|
|
464 |
|
But need to be careful about **consistency** / **referential integrity**. |
|
|
461 |
|
In separating the data, we also need to remember to be careful about **consistency** and **referential integrity**, which is a topic we will discuss in detail. |
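
To give a first intuition of referential integrity, here is a minimal sketch using Python's standard `sqlite3` module. The `COURSE` and `SECTION` tables, their columns, and the `try_insert_section` helper are hypothetical. Once sections are stored apart from courses, the system can be asked to reject a section that refers to a course that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
# Hypothetical relations: every SECTION must refer to an existing COURSE.
conn.executescript("""
    CREATE TABLE COURSE (Number INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE SECTION (
        Id INTEGER PRIMARY KEY,
        Course INTEGER NOT NULL REFERENCES COURSE(Number)
    );
    INSERT INTO COURSE VALUES (1301, 'Course A');
    INSERT INTO SECTION VALUES (1, 1301);
""")


def try_insert_section(section_id, course_number):
    """Return True if the section was stored, False if it was rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO SECTION VALUES (?, ?)",
                (section_id, course_number))
        return True
    except sqlite3.IntegrityError:
        # The referenced course does not exist (or the Id is already used).
        return False
```

With this sample data, `try_insert_section(2, 1301)` succeeds while `try_insert_section(3, 9999)` is refused, so the two relations cannot drift apart silently.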
465 |
462 |
|
|
466 |
463 |
### How Is a Database Conceived? |
### How Is a Database Conceived? |
467 |
464 |
|
|
|
... |
... |
But need to be careful about **consistency** / **referential integrity**. |
470 |
467 |
#. Logical design |
#. Logical design |
471 |
468 |
#. Physical design |
#. Physical design |
472 |
469 |
|
|
473 |
|
Gradation, from really abstract specification that is easy to modify, to more solidified description of what needs to be coded. |
|
|
470 |
|
There is a gradation, from really abstract specification that is easy to modify, to more solidified description of what needs to be coded. |
474 |
471 |
When we discuss high-level models, we will [come back to those notions](#interest-for-high-level-design). |
When we discuss high-level models, we will [come back to those notions](#interest-for-high-level-design). |
475 |
472 |
The global idea is that it is easier to move things around early in the conception, and harder once everything is implemented. |
The global idea is that it is easier to move things around early in the conception, and harder once everything is implemented. |
476 |
473 |
|
|
|
... |
... |
Wide, powerful, but also intimidating. |
5810 |
5807 |
|
|
5811 |
5808 |
You know UML from object-oriented programming languages: |
You know UML from object-oriented programming languages: |
5812 |
5809 |
|
|
5813 |
|
{width=90%} |
|
|
5810 |
|
 |
5814 |
5811 |
|
|
5815 |
|
That's a class diagram, there are other types of diagrams, they are not unrelated! |
|
|
5812 |
|
That is an example of a class diagram (with class names, attributes and operations, as well as a particular way to represent that a class extends another). There are other types of diagrams, and they are not unrelated! |
5816 |
5813 |
For instance, using communication diagrams, deployment diagrams, and state chart diagrams, you can collect the requirements needed to draw a class diagram! |
For instance, using communication diagrams, deployment diagrams, and state chart diagrams, you can collect the requirements needed to draw a class diagram! |
5817 |
5814 |
They each offer a viewpoint on a piece of software that will help you make sure the various pieces fit together: UML is a tool commonly used in software engineering, and useful in database design. |
They each offer a viewpoint on a piece of software that will help you make sure the various pieces fit together: UML is a tool commonly used in software engineering, and useful in database design. |
5818 |
5815 |
|
|