Quickstart
Causal Graphs
A causal graph consists of nodes (variables) and edges (the relationships between them). For instance, a causal graph containing a directed edge -> between node A and node B implies that variable A is a causal driver of variable B, but not the other way around.
You can easily add nodes and edges to the CausalGraph class by using the add_node / add_nodes_from and add_edge methods, as shown below.
from cai_causal_graph import CausalGraph
# Construct the causal graph object
causal_graph = CausalGraph()
# Add a single node to the causal graph
causal_graph.add_node('A')
# Add several nodes at once to the causal graph
causal_graph.add_nodes_from(['B', 'C', 'D'])
# Add edges to the causal graph
causal_graph.add_edge('A', 'B') # this adds a directed edge (i.e., an edge from A to B) by default
causal_graph.add_edge('B', 'E') # if the node does not exist, it gets added automatically
Any node added to a causal graph will, by default, have an unspecified variable type. You can, however, specify a different variable type via the variable_type argument. For a full list of variable types, see NodeVariableType. For instance, you can add a binary node F, as shown below.
from cai_causal_graph import NodeVariableType
causal_graph.add_node('F', variable_type=NodeVariableType.BINARY)
Any edge added to a causal graph will, by default, be a directed edge. You can, however, specify a different edge type via the edge_type argument. For a full list of edge types, see EdgeType. For instance, you can add an undirected edge A -- C, as shown below.
from cai_causal_graph import EdgeType
# Add an undirected edge between A and C
causal_graph.add_edge('A', 'C', edge_type=EdgeType.UNDIRECTED_EDGE)
Time Series Causal Graphs
The TimeSeriesCausalGraph class extends the CausalGraph class to represent time series causal graphs. The main differences with respect to the CausalGraph class are:
- each node is associated with a variable_name and a time_lag attribute.
- a new from_adjacency_matrices method is provided to create a TimeSeriesCausalGraph from a dictionary of adjacency matrices where the keys are the time lags.
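To make the "dictionary of adjacency matrices" input more concrete, the sketch below spells out one plausible layout in plain Python and lists the lagged edges it encodes. It does not call the library. The helper name edges_from_adjacency and the convention that entry [i][j] marks an edge from variable i at the keyed lag to variable j at time t are assumptions made for illustration; check the API reference for the exact convention before relying on it.

```python
from typing import Dict, List, Tuple

# Variable order shared by the rows and columns of every matrix.
variable_names = ['X1', 'X2']

# Keys are time lags. ASSUMPTION: entry [i][j] == 1 means that variable i
# at the keyed lag is a causal driver of variable j at time t.
adjacency_matrices: Dict[int, List[List[int]]] = {
    -1: [
        [0, 1],  # X1 at t-1 -> X2 at t
        [0, 1],  # X2 at t-1 -> X2 at t
    ],
}


def edges_from_adjacency(
    matrices: Dict[int, List[List[int]]], names: List[str]
) -> List[Tuple[str, int, str, int]]:
    """Hypothetical helper: list the lagged edges encoded by the matrices.

    Each edge is (source variable, source lag, destination variable, 0).
    """
    edges = []
    for lag, matrix in sorted(matrices.items()):
        for i, row in enumerate(matrix):
            for j, entry in enumerate(row):
                if entry:
                    edges.append((names[i], lag, names[j], 0))
    return edges


print(edges_from_adjacency(adjacency_matrices, variable_names))
# [('X1', -1, 'X2', 0), ('X2', -1, 'X2', 0)]
```

Each tuple has the form (source variable, source lag, destination variable, destination lag), so the two entries describe X1 at time t-1 driving X2 at time t, and X2 at time t-1 driving X2 at time t.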
You can define a TimeSeriesCausalGraph as shown below.
from cai_causal_graph import TimeSeriesCausalGraph
# Construct the time series causal graph object
ts_causal_graph = TimeSeriesCausalGraph()
# Add edges to the causal graph; this will also add the nodes if they are not present in the graph yet.
ts_causal_graph.add_edge('X1 lag(n=1)', 'X2') # this adds a directed edge (i.e., an edge from X1 lag(n=1) to X2) by default
ts_causal_graph.add_edge('X2 lag(n=1)', 'X2') # this adds a directed edge (i.e., an edge from X2 lag(n=1) to X2) by default
This is equivalent to the following:
from cai_causal_graph import TimeSeriesCausalGraph
ts_causal_graph = TimeSeriesCausalGraph()
# add edges to the causal graph
ts_causal_graph.add_time_edge('X1', -1, 'X2', 0) # this is a directed edge by default, unless specified otherwise
ts_causal_graph.add_time_edge('X2', -1, 'X2', 0) # this is a directed edge by default, unless specified otherwise
TimeSeriesCausalGraph is aware of the time lags of the variables and can be extended backwards and forward in time.
When you define the node X1 lag(n=1), you are actually defining a node with variable_name='X1' and time_lag=-1. Thus, the edge X1 lag(n=1) -> X2 means that X1 at time t-1 is a causal driver of X2 at time t.
The method add_time_edge is a convenience method to add edges between nodes with different time lags without having to specify them in the node names.
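The relationship between a node name and its variable_name / time_lag pair can be illustrated with a small standalone sketch. The helper split_node_name below is hypothetical and is not part of cai_causal_graph; it only mirrors the naming convention described above, under the assumption that a name without a lag suffix refers to time t.

```python
import re
from typing import Tuple

# Hypothetical helper for illustration only; not part of cai_causal_graph.
_LAG_PATTERN = re.compile(r'^(?P<name>.+) lag\(n=(?P<lag>\d+)\)$')


def split_node_name(node_name: str) -> Tuple[str, int]:
    """Return (variable_name, time_lag) for a name such as 'X1 lag(n=1)'."""
    match = _LAG_PATTERN.match(node_name)
    if match is None:
        # No lag suffix: the node refers to the variable at time t.
        return node_name, 0
    # 'lag(n=k)' refers to time t-k, so the time lag is -k.
    return match.group('name'), -int(match.group('lag'))


print(split_node_name('X1 lag(n=1)'))  # ('X1', -1)
print(split_node_name('X2'))           # ('X2', 0)
```

Read this way, add_edge('X1 lag(n=1)', 'X2') and add_time_edge('X1', -1, 'X2', 0) describe the same edge: the first encodes the lag in the node name, the second passes it as a separate argument.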
Similar to CausalGraph, any node added to the time series causal graph will, by default, have an unspecified variable type. You can, however, specify a different variable type via the variable_type argument. For a full list of variable types, see NodeVariableType.
Similar to CausalGraph, any edge added to the time series causal graph will, by default, be a directed edge. You can, however, specify a different edge type via the edge_type argument. For a full list of edge types, see EdgeType. It is NOT possible to add edges that go backwards in time.
from cai_causal_graph import EdgeType
# Add an undirected edge between X1 at time t-2 and X3
ts_causal_graph.add_edge('X1 lag(n=2)', 'X3', edge_type=EdgeType.UNDIRECTED_EDGE)
If you want to convert a CausalGraph to a TimeSeriesCausalGraph, you can use the cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.from_causal_graph method, as shown below.
from cai_causal_graph import CausalGraph, TimeSeriesCausalGraph
# Instantiate an empty causal graph
causal_graph = CausalGraph()
# Add edges to the causal graph with lagged nodes
causal_graph.add_edge('X1 lag(n=1)', 'X2')
causal_graph.add_edge('X2 lag(n=1)', 'X2')
# Convert the causal graph to a time series causal graph
ts_causal_graph = TimeSeriesCausalGraph.from_causal_graph(causal_graph)
# Can also just construct the TimeSeriesCausalGraph directly
ts_causal_graph = TimeSeriesCausalGraph()
# Add edges. Could also use add_time_edge instead, as shown above.
ts_causal_graph.add_edge('X1 lag(n=1)', 'X2')
ts_causal_graph.add_edge('X2 lag(n=1)', 'X2')
The difference between the CausalGraph and the TimeSeriesCausalGraph is that the TimeSeriesCausalGraph is aware of the time lags of the nodes and understands that 'X2 lag(n=1)' and 'X2' refer to the same variable.
Moreover, TimeSeriesCausalGraph can extend the minimal graph backwards and forwards in time using the extend_graph method. For instance, to extend the graph backwards in time to a time lag of -2, you can do the following:
# Using the time series causal graph defined above.
ts_causal_graph.extend_graph(backward_steps=2)
# The graph now contains the following nodes
# X1 lag(n=1), X1 lag(n=2), X2 lag(n=1), X2 lag(n=2), X2
# and the following edges
# X1 lag(n=1) -> X2, X2 lag(n=1) -> X2, X1 lag(n=2) -> X2 lag(n=1), X2 lag(n=2) -> X2 lag(n=1)
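The extended edge list above can be reproduced by hand with a short standalone sketch. It does not use the library and is not how extend_graph is implemented; it simply shifts each edge of the minimal graph back in time and keeps the copies whose nodes stay within the requested number of backward steps. The names extend_backwards and format_node, and the (variable_name, time_lag) tuple representation, are assumptions made for illustration.

```python
from typing import List, Tuple

# A lagged node is modelled as (variable_name, time_lag). This is an
# illustrative representation, not the library's internal one.
Node = Tuple[str, int]
Edge = Tuple[Node, Node]

# Minimal graph from the example: X1 lag(n=1) -> X2 and X2 lag(n=1) -> X2.
minimal_edges: List[Edge] = [
    (('X1', -1), ('X2', 0)),
    (('X2', -1), ('X2', 0)),
]


def extend_backwards(edges: List[Edge], backward_steps: int) -> List[Edge]:
    """Shift every edge back in time, keeping copies within the window."""
    extended = []
    for shift in range(backward_steps + 1):
        for (src_name, src_lag), (dst_name, dst_lag) in edges:
            new_src = (src_name, src_lag - shift)
            new_dst = (dst_name, dst_lag - shift)
            # Drop copies that would reach further back than requested.
            if min(new_src[1], new_dst[1]) >= -backward_steps:
                extended.append((new_src, new_dst))
    return extended


def format_node(node: Node) -> str:
    """Render a node using the 'X lag(n=k)' naming convention."""
    name, lag = node
    return name if lag == 0 else f'{name} lag(n={-lag})'


for src, dst in extend_backwards(minimal_edges, backward_steps=2):
    print(f'{format_node(src)} -> {format_node(dst)}')
# X1 lag(n=1) -> X2
# X2 lag(n=1) -> X2
# X1 lag(n=2) -> X2 lag(n=1)
# X2 lag(n=2) -> X2 lag(n=1)
```

Copies shifted by two steps would need source nodes at time t-3, which lies outside the requested window, so they are dropped. This is why only one shifted copy of each minimal edge appears.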