Changelog
0.5.3
- cai_causal_graph.utils.get_variable_name_and_lag now allows new lines in the names of variables.
- Passing input_list, output_list, and fully_connected as arguments to the constructor of a CausalGraph is now deprecated, and will be removed in future versions. To maintain this behavior, call add_fully_connected_nodes after construction.
- Upgraded poetry version from 1.8.2 to 1.8.3 in the GitHub workflows.
0.5.2
- Removed caching from get_minimal_graph, get_stationary_graph and get_summary_graph because the caching would not account for changes in node/edge metadata.
0.5.1
- Improved caching of results for TimeSeriesCausalGraph class, including:
- Adding caching to the following methods: get_minimal_graph, is_minimal_graph, is_stationary_graph.
- Fixed a bug in get_stationary_graph where the same cached graph would be returned multiple times. Instead now, a deepcopied version of this graph is returned.
- Fixed a bug where cached attributes would not be reset when adding edges to TimeSeriesCausalGraph, which can lead to erroneous stationary graphs being returned.
- Added caching of is_dag, adjacency_matrix and to_networkx methods.
- Unified cached attribute resetting between CausalGraph and TimeSeriesCausalGraph by introducing a cai_causal_graph.causal_graph.CausalGraph._reset_cached_attributes method and cai_causal_graph.causal_graph.reset_cached_attributes_decorator decorator.
- Fixed a bug where calling the variables method would return the variable list by reference, rather than returning a copy.
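The cache-resetting pattern and the copy-on-return fix described for this release can be illustrated with a minimal sketch. It is not the library's implementation: TinyGraph, reset_cache_after, and edge_count are hypothetical names chosen for illustration, and the toy class only mimics the idea of clearing cached values after a mutating call.

```python
import functools


def reset_cache_after(method):
    """Hypothetical decorator: clear the owner's cache after a mutating call."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        self._reset_cache()
        return result

    return wrapper


class TinyGraph:
    """Toy graph used for illustration only; not the CausalGraph class."""

    def __init__(self):
        self._edges = set()
        self._variables = []
        self._cache = {}

    def _reset_cache(self):
        self._cache.clear()

    @reset_cache_after
    def add_edge(self, source, destination):
        for name in (source, destination):
            if name not in self._variables:
                self._variables.append(name)
        self._edges.add((source, destination))

    @property
    def variables(self):
        # Return a copy so callers cannot mutate the internal list.
        return list(self._variables)

    def edge_count(self):
        # Cached derived value; it would go stale without the reset on mutation.
        if "edge_count" not in self._cache:
            self._cache["edge_count"] = len(self._edges)
        return self._cache["edge_count"]


graph = TinyGraph()
graph.add_edge("a", "b")
first_count = graph.edge_count()
graph.add_edge("b", "c")
second_count = graph.edge_count()

names = graph.variables
names.append("not_in_graph")  # Only the returned copy is modified.
```

Routing every mutating method through one decorator keeps the reset logic in a single place, which is the kind of unification the entry above describes.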
0.5.0
NOTE: Backwards compatibility warning! Global metadata was added to TimeSeriesCausalGraph and CausalGraph, which means their serialized state has changed.
NOTE: Backwards compatibility warning! The default value of the include_all_parents argument to extend_graph has been changed to True (from False).
- The default value of the include_all_parents argument to extend_graph has been changed to True (from False).
- Added metadata handling system to the HasMetadata class. All extending classes should use this system to parse and set their metadata.
- Added TimeSeriesEdge class, which is used as an edge class by TimeSeriesCausalGraph.
- Added metadata to TimeSeriesCausalGraph and CausalGraph.
- Ensured consistent metadata handling. Metadata passed at construction to TimeSeriesNode, Node, TimeSeriesEdge, Edge, TimeSeriesCausalGraph and CausalGraph is shallow-copied. Any metadata is deepcopied when constructing TimeSeriesCausalGraph and CausalGraph from dictionary.
- Ensured that replace_node performs similarly to replace_node, meaning that if any additional information (such as metadata) is specified, it is used to overwrite corresponding information in the constructed node.
- Generalized __eq__ to check for the class of the instance itself, enabling extending classes to reuse this method.
- Added has_non_serializable_metadata method, which returns False by default.
- Extended the string representation (repr) of CausalGraph to include whether the graph instance is a directed acyclic graph (DAG).
- Dropped support for python 3.8 as it is approaching end of life.
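The difference between shallow-copied and deep-copied metadata mentioned in this release can be seen with the standard copy module alone. The sketch below uses a plain dictionary with made-up keys; it does not call the library and only shows the general Python semantics the entry relies on.

```python
import copy

# Illustrative metadata with a nested, mutable value.
metadata = {"units": "kg", "tags": ["measured"]}

shallow = copy.copy(metadata)
deep = copy.deepcopy(metadata)

# Top-level keys are independent in both copies.
shallow["units"] = "g"

# Nested objects are shared by the shallow copy, but not by the deep copy.
metadata["tags"].append("validated")
```

After these statements, the shallow copy sees the appended tag because it shares the nested list with the original, while the deep copy does not.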
0.4.10
- Improved documentation by cleaning up a few syntax issues to support the new docs building process.
0.4.9
- Fixed a bug where metadata on floating nodes would not be correctly carried over to the minimal graph when calling get_minimal_graph. This also fixes an issue where is_minimal_graph could return False on minimal graphs that contain floating nodes.
0.4.8
- Upgraded docs-builder dependency to "~0.2.1" in the Makefile and updated syntax to support newer poetry.
0.4.7
- Added the cai_causal_graph.causal_graph.identify_utils.identity_colliders utility function, which allows you to identify a list of colliders in a CausalGraph.
- Improved efficiency of get_nodes_at_lag and get_nodes_for_variable_name by adding caching.
- Updated networkx dependency from ">=3.0.0, <4.0.0" to ">=3.0.0, <3.3.0", as networkx claims to have fixed a bug with simple paths and other related functions, but the fix now introduces paths of length 1, which we do not want. Also updated pyproject.toml to allow different networkx versions for python 3.8 and 3.9-3.12, as networkx 3.1 is the last one that supports 3.8.
- Upgraded poetry version from 1.7.1 to 1.8.2 in the GitHub workflows.
0.4.6
- Added the Boolean keyword argument include_all_parents to extend_graph. When True, extra nodes may be added. Nodes and edges will be added as far back as needed so that all nodes up to backward_steps in the past have all their parents and inbound edges. This means that the extended graph may now have nodes at lags further back than backward_steps. Default is False, meaning the default behavior of the method has not changed.
- Added the Boolean keyword argument construct_minimal to cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.from_adjacency_matrices to allow the user to specify whether to construct a minimal graph from adjacency matrices. Default is True, making the change backwards compatible.
- Fixed a bug in the cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.from_adjacency_matrices method of TimeSeriesCausalGraph that was introduced in the previous release. The method was not properly handling undirected edges in the adjacency matrices.
- Fixed a bug in the to_numpy_by_lag method of TimeSeriesCausalGraph where it was not always returning the correct variable names.
- Fixed a bug in the adjacency_matrices property of TimeSeriesCausalGraph where it was not looking at the variable names of the minimal graph, as expected, but the variable names of the full graph.
- Removed the return_minimal argument from the adjacency_matrices property of TimeSeriesCausalGraph as this was never working.
0.4.5
- Fixed a bug in cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.from_adjacency_matrices for TimeSeriesCausalGraph where the method was not properly handling undirected edges in the adjacency matrices.
- Added mypy-extensions dependency as "^1.0.0" and moved mypy back to dev dependency (setting it as "^1.8.0").
0.4.4
- Relaxed mypy dependency to "*".
0.4.3
- Added the replace_edge method to the CausalGraph class, which allows you to replace an existing edge with a new one.
- Added the get_nodes_for_variable_name method to the TimeSeriesCausalGraph class.
- Added the is_source_node and is_sink_node methods to the Node, indicating whether the node is a source node or sink node respectively.
- Changed the following methods in CausalGraph from static to class methods:
  - cai_causal_graph.causal_graph.CausalGraph.from_skeleton
  - cai_causal_graph.causal_graph.CausalGraph.from_networkx
  - cai_causal_graph.causal_graph.CausalGraph.from_gml_string
  - Note that cai_causal_graph.causal_graph.CausalGraph.from_dict and cai_causal_graph.causal_graph.CausalGraph.from_adjacency_matrix were already class methods. Now all the from_ methods are consistent. This change is transparent to the user.
  - This allowed us to remove them from TimeSeriesCausalGraph. This change is transparent to the user.
- Updated Skeleton such that its nodes match the class type of its graph.
  - Added ability to pass graph_class to the following methods in Skeleton such that the node classes will match accordingly when the new Skeleton is constructed:
    - cai_causal_graph.causal_graph.Skeleton.from_dict
    - cai_causal_graph.causal_graph.Skeleton.from_adjacency_matrix
    - cai_causal_graph.causal_graph.Skeleton.from_networkx
    - cai_causal_graph.causal_graph.Skeleton.from_gml_string
    - Note that cai_causal_graph.causal_graph.Skeleton.from_dict was already a class method but the other three were also changed to class methods for consistency. This change is transparent to the user.
- Fixed a bug where the NodeVariableType was not synced between the CausalGraph and Skeleton.
- Changed cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.from_adjacency_matrices from a static method to a class method to align with other from_ methods.
- Updated numpy dependency from "^1.18.0" to "^1.20.0" and allowed different numpy versions in the poetry.lock for python 3.8 and 3.9-3.12. This is to allow specific versions for 3.8 and the others as there is no version of numpy that supports them all.
- Upgraded poetry version from 1.4.2 to 1.7.1 in the GitHub workflows.
- Moved mypy from a dev dependency to a dependency as it is now used in source code.
- Updated actions in GitHub workflows to transition from Node 16 to Node 20.
0.4.2
- Added the validate flag to add_edge, add_edges_from, add_edges_from_paths, add_edge_by_pair, cai_causal_graph.causal_graph.CausalGraph.from_dict, cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.add_edge, and add_time_edge which, if set to False, will disable validation checks. Currently, this disables cyclicity checks when adding edges. Default is True, making the change backwards compatible. There are no guarantees about the behavior of the resulting graph if this is disabled specifically to introduce cycles. This should only be used to speed up this method in situations where it is known the resulting graph is still valid, for example when copying a graph.
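To make the trade-off behind a validate flag concrete, here is a minimal sketch of a cyclicity check that can be skipped on insertion. TinyDag is a hypothetical toy class, not the library's CausalGraph, and the depth-first search is just one simple way to detect that a new edge would close a cycle.

```python
class TinyDag:
    """Toy directed graph with an optional acyclicity check on insertion."""

    def __init__(self):
        self._children = {}

    def _reaches(self, start, target):
        # Iterative depth-first search from start, looking for target.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self._children.get(node, ()))
        return False

    def add_edge(self, source, destination, validate=True):
        # source -> destination closes a cycle if destination already reaches source.
        if validate and self._reaches(destination, source):
            raise ValueError(f"Edge {source} -> {destination} would create a cycle.")
        self._children.setdefault(source, set()).add(destination)
        self._children.setdefault(destination, set())

    def is_dag(self):
        return not any(
            self._reaches(child, node)
            for node, children in self._children.items()
            for child in children
        )


dag = TinyDag()
dag.add_edge("a", "b")
dag.add_edge("b", "c")
was_dag_before = dag.is_dag()

try:
    dag.add_edge("c", "a")
    rejected = False
except ValueError:
    rejected = True

# Skipping validation is faster but lets an invalid edge through.
dag.add_edge("c", "a", validate=False)
```

The check costs a graph traversal per inserted edge, which is why skipping it only makes sense when the edges are already known to be valid, as when copying an existing graph.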
0.4.0
- Removed **kwargs from the add_node, add_edge, and add_edge_by_pair methods in CausalGraph and its subclasses, as none of them use it.
- Improved the get_minimal_graph method in TimeSeriesCausalGraph to avoid a recursion issue.
- Added add_edges_from_paths convenience method to the CausalGraph class, in order to add edges from paths.
- The get_minimal_graph method in TimeSeriesCausalGraph will now always return a graph of the same class type as the current graph instance.
- The returned graph type from the get_summary_graph method in TimeSeriesCausalGraph can now be specified in subclasses of TimeSeriesCausalGraph using the _SummaryGraphCls attribute.
- Fixed a bug where the __getitem__ method of CausalGraph would work when specifying an invalid node or edge query, for example a whole path. Instead, a TypeError is now raised.
- Fixed a bug where delete_edge would not support passing source and destination as Node.
- Added support for passing source and destination as Node to remove_edge.
- The cai_causal_graph.causal_graph.CausalGraph.from_adjacency_matrix is now a class method (rather than being a static method) which has been generalized to return an instance of the class on which it has been called (e.g. enabling returning instances of classes inheriting from CausalGraph).
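The practical effect of moving a factory from a static method to a class method can be shown with plain Python. BaseGraph, TimeAwareGraph, and the two factory names below are hypothetical stand-ins used only for illustration; they are not part of the library.

```python
class BaseGraph:
    def __init__(self, edges):
        self.edges = list(edges)

    @staticmethod
    def from_pairs_static(pairs):
        # A static factory hard-codes the class it builds.
        return BaseGraph(pairs)

    @classmethod
    def from_pairs(cls, pairs):
        # A class-method factory builds whichever class it was called on.
        return cls(pairs)


class TimeAwareGraph(BaseGraph):
    pass


static_result = TimeAwareGraph.from_pairs_static([("a", "b")])
class_result = TimeAwareGraph.from_pairs([("a", "b")])
```

Here static_result is a BaseGraph even though the call went through the subclass, whereas class_result is a TimeAwareGraph, so subclasses do not need to override the factory.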
0.3.14
- Fixed a bug in cai_causal_graph.identify_utils.identify_confounders where an empty confounding set would be returned in the edge case where all causal paths from the true confounders to node_1 were blocked by ancestors of node_2, or vice versa. This comes at a slight performance cost.
0.3.13
- Improved the get_topological_order method in TimeSeriesCausalGraph to improve performance. Added a new keyword argument respect_time_ordering to allow the user to specify whether the topological order must respect the time ordering of the nodes. If respect_time_ordering=True, the topological order will respect the time ordering, otherwise it may not. For example, if the graph is 'Y lag(n=1)' -> 'Y' <- 'X', then ['X', 'Y lag(n=1)', 'Y'] and ['Y lag(n=1)', 'X', 'Y'] are both valid topological orders. However, only the second one would respect time ordering. If both return_all and respect_time_ordering are True, then only the topological orders that respect time are returned, not all valid topological orders. The default is respect_time_ordering=True, matching previous behavior.
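The example graph from this entry is small enough to check by brute force. The sketch below does not use the library: it enumerates all permutations of the three nodes with itertools, keeps the valid topological orders, and then filters the ones that respect time. The time_lag mapping is written by hand for illustration.

```python
from itertools import permutations

# Node name -> time lag (0 is the present, -1 is one step in the past).
time_lag = {"X": 0, "Y lag(n=1)": -1, "Y": 0}
edges = [("Y lag(n=1)", "Y"), ("X", "Y")]


def is_topological(order):
    # Every edge must point from an earlier position to a later one.
    position = {node: index for index, node in enumerate(order)}
    return all(position[src] < position[dst] for src, dst in edges)


def respects_time(order):
    # Nodes further in the past must never come after more recent nodes.
    lags = [time_lag[node] for node in order]
    return lags == sorted(lags)


all_orders = [list(p) for p in permutations(time_lag) if is_topological(p)]
time_orders = [order for order in all_orders if respects_time(order)]
```

Enumerating permutations is only practical for tiny graphs, but it confirms the statement above: two orders are topologically valid and only ['Y lag(n=1)', 'X', 'Y'] respects time.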
0.3.12
- Improved efficiency of cai_causal_graph.identify_utils.identify_confounders by performing all operations directly using networkx, removing the need to copy graphs and improving recursive logic, such that only the minimal confounders of the specified nodes are calculated (rather than recursively calculating minimal confounders of each parent). This results in significant speedups (in the order of hundreds of times in some cases).
0.3.11
- Fixed a bug in the extend_graph method in TimeSeriesCausalGraph where the method was not adding nodes correctly with particular graph configurations.
0.3.10
- Fixed a bug in the get_minimal_graph method in TimeSeriesCausalGraph where floating nodes were not added correctly. This also impacted the extend_graph method in TimeSeriesCausalGraph; it is also fixed now for floating nodes.
0.3.9
- Fixed a bug in the get_topological_order method in TimeSeriesCausalGraph.
0.3.8
- Fixed a bug in cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.from_causal_graph for TimeSeriesCausalGraph where the method was not adding floating nodes correctly.
- Added max_backward_lag and max_forward_lag properties to TimeSeriesCausalGraph to return the absolute maximum backward and forward time lag of the graph, respectively.
- Fixed a bug with the property maxlag in TimeSeriesCausalGraph since it could give wrong information if the future was included in the graph.
0.3.7
- Fixed a bug in cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.from_adjacency_matrices for TimeSeriesCausalGraph where the method was not adding floating nodes to the graph. Now any floating nodes at time lag 0 will be added.
0.3.6
- Added __iter__ to Skeleton.
0.3.5
- Added the cai_causal_graph.identify_utils.identify_markov_boundary utility function, which allows you to identify the Markov boundary of a node in a CausalGraph or in a Skeleton.
- Added get_neighbor_nodes and get_neighbors methods to CausalGraph and Skeleton. get_neighbor_nodes returns the nodes neighboring the specified node while get_neighbors returns the identifiers of the neighboring nodes. Note: For a CausalGraph, it does not matter what the edge type is, as long as there is an edge between the specified node and another node, that other node is considered its neighbor.
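The neighbor rule in the note above (any edge counts, regardless of type or direction) can be sketched without the library. The edge list, the edge-type strings, and the neighbors_of helper are all illustrative assumptions.

```python
# Each edge is (source, destination, edge type); the type strings are illustrative.
edges = [("a", "b", "->"), ("c", "b", "<>"), ("b", "d", "--"), ("d", "e", "->")]


def neighbors_of(node):
    """Return identifiers of all nodes sharing an edge with node, ignoring edge type."""
    found = set()
    for source, destination, _edge_type in edges:
        if source == node:
            found.add(destination)
        elif destination == node:
            found.add(source)
    return sorted(found)


b_neighbors = neighbors_of("b")
```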
0.3.4
- Extended documentation to provide further information regarding the types of mixed graphs that can be defined in a CausalGraph.
0.3.3
- Fixed a bug in cai_causal_graph.identify_utils.identify_instruments and cai_causal_graph.identify_utils.identify_mediators, where an unclear error was raised if the source node was a descendant of the destination node. Instead, these methods now return an empty list in that case.
- Extended the quickstart documentation to describe how to set the variable_type when adding a Node / TimeSeriesNode to a CausalGraph / TimeSeriesCausalGraph, respectively.
0.3.2
- Improved documentation.
0.3.1
- Fixed a bug for forward extension in the extend_graph method in TimeSeriesCausalGraph.
0.3.0
NOTE: Backwards compatibility warning! The definition of the EdgeConstraint enumeration has changed.
- Updated the EdgeConstraint enumeration to simplify the enumeration members and exposed it at the root level, so it can be imported as from cai_causal_graph import EdgeConstraint. EdgeConstraint is not used by the cai-causal-graph package, but any packages that rely on its definition must be updated to reflect the new members.
0.2.17
- Fixed the docstrings of cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.from_adjacency_matrices, such that the examples render properly.
0.2.16
- Fixed a bug where cai_causal_graph.utils.get_variable_name_and_lag would not match variable names with non-alphanumeric characters, and would not match variable names with the string lag or future in them.
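The kind of name parsing this fix concerns can be sketched with a regular expression. This is a hypothetical reimplementation, not the library's code: parse_name_and_lag is an invented name, the 'lag(n=k)' suffix follows the identifiers shown elsewhere in this changelog, and the 'future(n=k)' suffix is an assumption made by analogy.

```python
import re

# The name is matched lazily so that only a trailing suffix is treated as the lag.
# re.DOTALL lets "." match newlines, so names may span several lines.
_PATTERN = re.compile(
    r"(?P<name>.+?)(?: (?P<kind>lag|future)\(n=(?P<steps>\d+)\))?",
    re.DOTALL,
)


def parse_name_and_lag(identifier):
    """Split an identifier into (variable name, signed time lag)."""
    match = _PATTERN.fullmatch(identifier)
    if match is None:
        raise ValueError(f"Cannot parse identifier: {identifier!r}")
    if match["kind"] is None:
        return match["name"], 0
    steps = int(match["steps"])
    # Lags point into the past (negative); the assumed future suffix is positive.
    return match["name"], -steps if match["kind"] == "lag" else steps
```

Because only the trailing suffix is interpreted, a name such as 'lag of future' is returned unchanged with a lag of 0, and names containing punctuation or new lines are accepted.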
0.2.15
- Improved performance of checking for cycles when adding edges by avoiding repeated checks.
0.2.14
- Modified cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.add_edge in TimeSeriesCausalGraph to always order source and destination nodes by time_lag even when the edge is not directed. For example, add_edge('c', 'a (lag=1)', edge_type=EdgeType.UNDIRECTED_EDGE) will add an undirected edge a (lag=1) -- c instead of c -- a (lag=1). This is done just for convenience and to avoid confusion.
0.2.13
- Improved documentation.
0.2.12
- Added node_name property to Node as an alias of identifier.
- Added get_parent_nodes and get_children_nodes to CausalGraph. These return a list of the parent and children Node objects, respectively. This is to supplement the get_parents and get_children methods, which only return the node identifiers.
0.2.11
- Improved the get_topological_order method for a TimeSeriesCausalGraph to better account for time.
- Fixed a bug in CausalGraph and TimeSeriesCausalGraph that prevented get_minimal_graph, get_summary_graph and extend_graph from working properly as it did not maintain the correct extra information such as node variable types.
0.2.10
- Added the get_nodes_at_lag and get_contemporaneous_nodes methods to the TimeSeriesCausalGraph class to get the nodes at a given lag and the contemporaneous nodes of the provided node, respectively.
- General improvements to several from_* methods in the TimeSeriesCausalGraph class.
0.2.9
- Added support for python version 3.12.
0.2.8
- Changed the order of the documentation in the sidebar to ensure Quickstart is at the top.
0.2.7
- Improved internal logic for how an Edge is instantiated from a dictionary.
0.2.6
- Improved the copy method in CausalGraph such that it is more general and preserves the subclass type. As such, the .copy method was removed from the TimeSeriesCausalGraph class.
- Extended equality methods for the Skeleton, CausalGraph, and TimeSeriesCausalGraph classes. A new keyword parameter deep has been added. If deep=True, deep equality checks are also done on all nodes and edges in the graphs. To use it, call graph_1.__eq__(graph_2, deep=True), as graph_1 == graph_2 still matches previous behavior.
0.2.5
- Added the cai_causal_graph.identify_utils.identify_confounders utility function, which allows you to identify a list of confounders between two nodes in a CausalGraph.
- Added the cai_causal_graph.identify_utils.identify_instruments utility function, which allows you to identify a list of instrumental variables between two nodes in a CausalGraph.
- Added the cai_causal_graph.identify_utils.identify_mediators utility function, which allows you to identify a list of mediators between two nodes in a CausalGraph.
0.2.4
- Fixed formatting in the documentation.
0.2.3
- Added the deserialization method from_dict to the following classes: Node, TimeSeriesNode, and Edge.
- Added the serialization method to_dict to TimeSeriesNode. Node and Edge already had it.
- Changed the behavior of the add_node method in TimeSeriesCausalGraph when both identifier and (time_lag, variable_name) are provided. Now, if all are provided, the method will raise an error only if identifier is not equal to get_name_with_lag(time_lag, variable_name), that is, the correct name.
- Extended equality methods for the Node, Edge and TimeSeriesNode classes. A new keyword parameter deep has been added. If deep=True, additional class attributes are also checked; see the docstrings for additional information. To use it, call node_1.__eq__(node_2, deep=True), as node_1 == node_2 still matches previous behavior.
- Added edge_type property to the Edge class.
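The reason deep equality has to be requested through an explicit __eq__ call is that the == operator cannot pass keyword arguments. A minimal sketch with a hypothetical TinyNode class follows; which attributes the deep check compares here (variable_type and metadata) is an assumption for illustration, not a statement about the library's classes.

```python
class TinyNode:
    """Toy node used for illustration only; not the library's Node class."""

    def __init__(self, identifier, variable_type="unspecified", metadata=None):
        self.identifier = identifier
        self.variable_type = variable_type
        self.metadata = dict(metadata or {})

    def __hash__(self):
        return hash(self.identifier)

    def __eq__(self, other, deep=False):
        if not isinstance(other, TinyNode):
            return False
        if self.identifier != other.identifier:
            return False
        if not deep:
            return True
        # The deep check also compares the additional attributes.
        return self.variable_type == other.variable_type and self.metadata == other.metadata


node_1 = TinyNode("a", variable_type="continuous")
node_2 = TinyNode("a", variable_type="binary")

shallow_equal = node_1 == node_2  # The operator always uses deep=False.
deep_equal = node_1.__eq__(node_2, deep=True)
```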
0.2.2
- Fixed typo in the quickstart documentation.
0.2.1
- Fixed repr bug in the TimeSeriesCausalGraph class.
- Added to_numpy_by_lag method to convert the TimeSeriesCausalGraph to a dictionary of adjacency matrices where the keys are the time lags with the values being the adjacency matrices with respect to the variables.
- Changed the extend_graph method in TimeSeriesCausalGraph to work with a non-negative backward_steps and forward_steps instead of strictly positive.
- Fixed edge type in the extend_graph method in TimeSeriesCausalGraph.
- Added add_time_edge method in TimeSeriesCausalGraph to add a time edge between two nodes. This method allows you to specify the time lag for source and destination variables. This avoids having to create the corresponding node name manually or using the utility function cai_causal_graph.utils.get_name_with_lag.
- Added equality method for the TimeSeriesNode class.
- Extended unit tests for the TimeSeriesCausalGraph and TimeSeriesNode classes.
- Documentation:
  - Added a documentation page for the TimeSeriesCausalGraph class.
  - Changed quickstart to start from a TimeSeriesCausalGraph instead of a CausalGraph.
0.2.0
- Added the TimeSeriesCausalGraph class to represent a time series causal graph. TimeSeriesCausalGraph is aware of the time relationships between the nodes in the graph while CausalGraph is not. Moreover, the TimeSeriesCausalGraph class has three new representations:
  - The minimal graph, which can be obtained via the get_minimal_graph method, defines the graph with the minimal number of nodes and edges that is required to capture all the information encoded in the original graph. This is because a time series causal graph may contain a lot of repetitive information. For example, if the original graph is x(t-2) -> x(t-1) -> x(t), then the minimal graph would be x(t-1) -> x(t). In other words, it is a graph in which every edge has its destination at time 0.
  - The summary graph, which can be obtained via the get_summary_graph method, defines the graph collapsed in time so there is a single node per variable. For example, if the original graph is z(t-2) -> x(t) <- y(t) <- y(t-1), then the summary graph would be z -> x <- y. Note, it is possible to have cycles in the summary graph. For example, a graph with edges y(t-1) -> x(t) and x(t-1) -> y(t) would have a summary graph of x <-> y.
  - The extended graph, which can be obtained via the extend_graph method, defines the graph obtained by extending backward and forward in time via the arguments backward_steps and forward_steps, respectively. This graph may contain lots of redundant information. For example, if the original graph is x(t-1) -> x(t) and backward_steps=2 and forward_steps=1, then the extended graph would be x(t-2) -> x(t-1) -> x(t) -> x(t+1).
- The cai_causal_graph.time_series_causal_graph.TimeSeriesCausalGraph.from_adjacency_matrices method was added to instantiate an instance of TimeSeriesCausalGraph from a dictionary of adjacency matrices where the keys are the time lags.
- Added the TimeSeriesNode class to extend the Node class to represent time information on the node. The following properties were added:
  - variable_name: The variable name of the time series node. For example, if the identifier of the time series node is 'X1 lag(n=1)', i.e., it is a lagged version of the variable 'X1', then variable_name would be 'X1'.
  - time_lag: The time lag of the time series node. For example, if the identifier of the time series node is 'X1 lag(n=1)', i.e., it is a lagged version of the variable 'X1', then time_lag would be -1.
- Renamed EdgeTypeEnum to EdgeType.
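The three representations described for TimeSeriesCausalGraph in this release can be reproduced on the changelog's own examples with a few set comprehensions. This is a simplified sketch under stated assumptions, not the library's algorithm: nodes are (variable, lag) tuples, only directed edges are handled, self-loops are dropped from the summary, and the extension keeps only edges whose source stays within backward_steps. The three function names are hypothetical.

```python
def minimal_edges(edges):
    """Shift every edge so its destination sits at lag 0, then deduplicate."""
    return {
        ((src, src_lag - dst_lag), (dst, 0))
        for (src, src_lag), (dst, dst_lag) in edges
    }


def summary_edges(edges):
    """Collapse time: keep one edge per ordered pair of distinct variables."""
    return {(src, dst) for (src, _), (dst, _) in edges if src != dst}


def extended_edges(edges, backward_steps, forward_steps):
    """Repeat the minimal edges at every destination lag inside the window."""
    result = set()
    for (src, src_lag), (dst, _) in minimal_edges(edges):
        for shift in range(-backward_steps, forward_steps + 1):
            if src_lag + shift >= -backward_steps:
                result.add(((src, src_lag + shift), (dst, shift)))
    return result


# x(t-2) -> x(t-1) -> x(t)
chain = [(("x", -2), ("x", -1)), (("x", -1), ("x", 0))]
# z(t-2) -> x(t) <- y(t) <- y(t-1)
mixed = [(("z", -2), ("x", 0)), (("y", 0), ("x", 0)), (("y", -1), ("y", 0))]

minimal = minimal_edges(chain)
summary = summary_edges(mixed)
extended = extended_edges([(("x", -1), ("x", 0))], backward_steps=2, forward_steps=1)
```

On these inputs the sketch yields the single edge x(t-1) -> x(t) for the minimal graph, the edges z -> x and y -> x for the summary graph, and the chain from x(t-2) to x(t+1) for the extended graph, matching the examples above.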
0.1.3
- Fixed a syntax error in the docstring for the get_bidirected_edges method that was preventing the reference docs from being built.
0.1.2
- Improved
README
links so images appear on PyPI. - Upgraded
poetry
version from1.2.2
to1.4.2
in the GitHub workflows.
0.1.1
- Added security linting checks of source code using
bandit
. - Improved documentation packaging and publishing.
0.1.0
- Initial release of the
cai-causal-graph
package with the CausalGraph class and component classes: Skeleton, Node, and Edge.