# Computational graph

The ETL is organised as a computational graph: a directed acyclic graph (DAG) that describes the dependencies between datasets.

The following diagram shows the structure of the computational graph. Each step (or dataset) is represented by a node in the graph, and their dependencies are shown with edges:

```mermaid
flowchart LR

id00((____)):::node --> id11((____)):::node
id01((____)):::node --> id11((____)):::node --> id21((____)):::node  --> id31((____)):::node
id02((____)):::node --> id12((____)):::node --> id22((____)):::node
id12((____)):::node --> id21((____)):::node
id03((____)):::node --> id13((____)):::node --> id22((____)):::node  --> id32((____)):::node
classDef node fill:#002147,color:#002147
```

Whenever a node changes (red node), all nodes that depend on it, directly or indirectly (yellow nodes), will be updated.

```mermaid
flowchart LR
id00((____)):::node --> id11((____)):::node
id01((____)):::node --> id11((____)):::node --> id21((____)):::node_deps  --> id31((____)):::node_deps
id02((____)):::node_change --> id12((____)):::node_deps --> id22((____)):::node_deps
id12((____)):::node_deps --> id21((____)):::node_deps
id03((____)):::node --> id13((____)):::node --> id22((____)):::node_deps  --> id32((____)):::node_deps

classDef node fill:#002147,color:#002147
classDef node_change fill:#ce261e,color:#ce261e,stroke-width:3px;
classDef node_deps fill:#f7c020,color:#f7c020,stroke-width:3px;
```
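The update rule illustrated by the diagram can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the ETL's actual implementation: the `DEPENDENCIES` mapping and the `steps_to_update` function are hypothetical names, and the node ids are read off the diagram (left to right, top to bottom).

```python
from collections import deque

# Hypothetical encoding of the diagram: each step maps to the steps it depends on.
DEPENDENCIES = {
    "id11": {"id00", "id01"},
    "id12": {"id02"},
    "id13": {"id03"},
    "id21": {"id11", "id12"},
    "id22": {"id12", "id13"},
    "id31": {"id21"},
    "id32": {"id22"},
}


def steps_to_update(changed, dependencies):
    """Return every step that depends, directly or indirectly, on `changed`."""
    # Invert the mapping: for each step, which steps list it as a dependency?
    dependents = {}
    for step, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(step)

    # Breadth-first walk along the edges, starting from the changed step.
    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


# A change in "id02" (the red node) marks the five yellow nodes for update.
print(sorted(steps_to_update("id02", DEPENDENCIES)))
# ['id12', 'id21', 'id22', 'id31', 'id32']
```

Because the graph is acyclic, the walk always terminates, and each affected step is visited only once.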

The computational graph is summarised in our DAG files, which list all the steps and their dependencies.

## Why use a graph?

Graphs offer a structure in which, unlike in matrices or tables, the order of the elements matters little: information is stored as nodes and the links between them. Graphs can also enable distributed computing for large problems with many elements, since steps that do not depend on each other can run in parallel, which reduces the overall running time.

A graph is also a simple way to communicate the relationships between our datasets, which makes our processes more transparent and comprehensible to the public.

It is also an efficient way of organising the data, so that the dependency map for a step can be obtained quickly.
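As a rough sketch of what obtaining the dependency map for a step could look like, the snippet below collects everything a step depends on, directly or indirectly. The `DEPENDENCIES` mapping and the `all_dependencies` function are hypothetical names chosen for illustration; the node ids mirror the first diagram on this page and do not correspond to real ETL steps.

```python
# Hypothetical encoding of the graph: each step maps to its direct dependencies.
DEPENDENCIES = {
    "id11": {"id00", "id01"},
    "id12": {"id02"},
    "id13": {"id03"},
    "id21": {"id11", "id12"},
    "id22": {"id12", "id13"},
    "id31": {"id21"},
    "id32": {"id22"},
}


def all_dependencies(step, dependencies):
    """Collect the direct and indirect dependencies of `step`."""
    found = set()
    stack = [step]
    while stack:
        current = stack.pop()
        # Steps without an entry (such as "id00") have no dependencies.
        for dep in dependencies.get(current, ()):
            if dep not in found:
                found.add(dep)
                stack.append(dep)
    return found


print(sorted(all_dependencies("id31", DEPENDENCIES)))
# ['id00', 'id01', 'id02', 'id11', 'id12', 'id21']
```

Each node and edge in the relevant part of the graph is visited at most once, so the lookup cost grows with the size of that subgraph rather than with the size of the whole graph.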

## The nodes and edges

An edge in the computational graph has a simple meaning: one node depends on another. For instance, the following graph states that node `B` has node `A` as a dependency:

```mermaid
flowchart LR
id00((A)):::node --> id11((B)):::node

classDef node fill:#002147,color:#ddd;