4 Implementation
4.1 Execution
Start psql in src folder.
cd src
psql -d <database> username
Create tables and populate them with :
\i XXX_YYY_tables.sql
4.2 Code Details
We ignore most redudancy errors using ON CONFLICT DO NOTHING
.
4.2.1 Populate Transports
Raw transport line :
"Métro -> 1 : Bastille (Paris) (304m)
Bus -> 29618791 : Lyon / Daumesnil - Ledru Rollin (Paris) (246m)
Vélib -> Lacuée - Lyon (121.91m)
<a href=""https://www.geovelo.fr/paris/route?to=2.370458543300628,48.849268481958404"">Calculez votre itinéraire sur GéoVélo</a>"
- List \(\to\) Row
- Split over line jump using :
unnest(string_to_array(transport, E'\n')) AS transport
- Split over line jump using :
- Remove unnecessary HTML content
WHERE transport NOT LIKE '<%'
- Extract (transport_type, transport_line, station + distance)
- use
split_part
over->
and:
- use
Vélib
case- If station is empty, it means it is a
Vélib
, and the station was wrongly splited into transport_line. Then we swapstation
andtransport_line
- If station is empty, it means it is a
- Extract distance
- It can be done using a combination of
substring
,reverse
andlength
functions. - Distance information is always between the last occurence of
(
andm)
.
- It can be done using a combination of
4.2.2 Populate Sub-Events
Raw sub event line example:
"Rencontre avec Jørn Lier Horst (https://www.paris.fr/evenements/rencontre-avec-jorn-lier-horst-52356);Raconte-moi une histoire (https://www.paris.fr/evenements/raconte-moi-une-histoire-55682);Numéri’kids (https://www.paris.fr/evenements/numeri-kids-55668);Créa'kids ""Fabrique ton kit de détective"" (https://www.paris.fr/evenements/crea-kids-fabrique-ton-kit-de-detective-55664);Samedi des tout-petits (https://www.paris.fr/evenements/samedi-des-tout-petits-55660);Spectacle de conte ""Caillou !!!"" par Sylvie Mombo (https://www.paris.fr/evenements/spectacle-de-conte-caillou-par-sylvie-mombo-55656);Atelier ""Scène de crime"" avec les Petits Débrouillards (https://www.paris.fr/evenements/atelier-scene-de-crime-avec-les-petits-debrouillards-55665);Ciné'mômes (https://www.paris.fr/evenements/cine-momes-55662);Spectacle « L'enlèvement de la bibliothécaire » (https://www.paris.fr/evenements/spectacle-l-enlevement-de-la-bibliothecaire-52082);Jeux et énigmes (https://www.paris.fr/evenements/jeux-et-enigmes-55657)"
- List \(\to\) Row
- Split over
;
delimiters using :unnest(string_to_array(children, ';'))
- Split over
- Extract id
- The id is always at the end of the url. It can be extracted knowing it’s always between the last
-
and)
of the string.
- The id is always at the end of the url. It can be extracted knowing it’s always between the last