API Reference¶
ctrlnmod.integrators¶
- class ctrlnmod.integrators.RK45Simulator(ss_model, ts)¶
Bases:
Simulator
- clone()¶
- classmethod discretize(A, h)¶
Discretize matrix A using the RK45 method and return the Ad matrix.
- forward(u_batch, x0_batch=tensor([0.]), d_batch=None)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class ctrlnmod.integrators.RK4Simulator(ss_model, ts)¶
Bases:
Simulator
- clone()¶
- classmethod discretize(A, h)¶
Discretize matrix A using the RK4 method and return the Ad matrix.
- forward(u_batch, x0_batch=tensor([0.]), d_batch=None)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
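Example (a sketch, not taken from the library docs): both discretize classmethods take a continuous-time state matrix A and a step size h, so for a linear system the result can be compared against the exact matrix exponential. The 2-state system below is an arbitrary stable example.
import torch
from ctrlnmod.integrators import RK4Simulator, RK45Simulator

# Continuous-time state matrix of a stable 2-state system
A = torch.tensor([[0.0, 1.0],
                  [-2.0, -3.0]])
h = 0.01  # integration step

Ad_rk4 = RK4Simulator.discretize(A, h)    # 4th-order Runge-Kutta discretization
Ad_rk45 = RK45Simulator.discretize(A, h)  # RK45 discretization
print(torch.linalg.matrix_exp(A * h))     # exact discretization, for comparison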
- class ctrlnmod.integrators.Sim_discrete(ss_model, ts=1)¶
Bases:
Simulator
- clone()¶
- forward(u_batch, x0_batch=tensor([0.]))¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
ctrlnmod.utils¶
- class ctrlnmod.utils.Experiment(u, y, ts=1, x=None, nx=None, x_trainable=False, d=None)
Bases:
Dataset
This class represents a single experiment with inputs (u), outputs (y), and optionally states (x) and disturbances (d). It is designed to handle time-series data, where each experiment can have a different number of samples.
- u
Input data of shape (n_samples, nu).
- Type:
Tensor
- y
Output data of shape (n_samples, ny).
- Type:
Tensor
- ts
Sampling time.
- Type:
float
- x
State data of shape (n_samples, nx).
- Type:
Tensor, optional
- nu
Number of inputs.
- Type:
int
- ny
Number of outputs.
- Type:
int
- nx
Number of states.
- Type:
int, optional
- n_samples
Number of samples in the experiment.
- Type:
int
- x_trainable
Whether the state vector is trainable.
- Type:
bool
- d
Disturbance data of shape (n_samples, nd).
- Type:
Tensor, optional
- __getitem__(idx, seq_len)
Returns a tuple of (u, y, x, x0) for the given index and sequence length.
- __len__()
Returns the number of samples in the experiment.
- denormalize(u=None, y=None, x=None, scaler=None)
Denormalizes the data if a scaler is provided.
- get_data(idx=None, unscaled=False, scaler=None)
Returns the experiment data up to the specified index.
- plot(idx=None, unscaled=False, scaler=None)
Plots the experiment data.
- Parameters:
u (np.ndarray) – Input data of shape (n_samples, nu).
y (np.ndarray) – Output data of shape (n_samples, ny).
ts (float) – Sampling time.
x (np.ndarray, optional) – State data of shape (n_samples, nx). Defaults to None.
nx (int, optional) – Number of states. Must match x.shape[1] if x is provided. Defaults to None.
x_trainable (bool) – Whether the state vector is trainable. Defaults to False.
d (np.ndarray, optional) – Disturbance data of shape (n_samples, nd). Defaults to None.
- Raises:
ValueError – If u and y do not have the same number of samples.
ValueError – If x is provided but x_trainable is True.
ValueError – If x is None and nx is not provided.
ValueError – If seq_len is invalid.
- denormalize(u=None, y=None, x=None, d=None, scaler=None)
Denormalizes the data if a scaler is provided.
- get_data(idx=None, unscaled=False, scaler=None)
Return the experiment values up to the idx index if idx is not None.
- Parameters:
idx – Optional[int] - Index up to which the data is retrieved
unscaled – bool - If True and a scaler is provided, return the denormalized data
scaler – Optional[BaseScaler] - Scaler used for denormalization
- Returns:
Tuple[Tensor, Tensor, Tensor] - (u, y, x), normalized or not
- plot(idx=None, unscaled=False, scaler=None)
Plots the experiment data up to index idx.
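Example (a construction sketch using synthetic data; the array arguments follow the parameter list above, and the returned tuple layout follows get_data()):
import numpy as np
from ctrlnmod.utils import Experiment

n_samples, nu, ny, nx = 500, 2, 1, 3
u = np.random.randn(n_samples, nu)
y = np.random.randn(n_samples, ny)

# Either pass a state trajectory x, or give nx and let the state be trainable.
exp = Experiment(u, y, ts=0.1, nx=nx, x_trainable=True)
print(len(exp))                 # number of samples in the experiment
u_t, y_t, x_t = exp.get_data()  # full (u, y, x) data as tensors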
- class ctrlnmod.utils.ExperimentsDataset(exps, seq_len=1, scaler=None)
Bases:
Dataset
This class implements methods to handle sequences potentially coming from several experiments of different lengths.
- append(exp)
Appends a new experiment to the dataset.
- Return type:
None
- plot(figsize=(15, 10), max_exp_to_plot=4, unscaled=False)
- set_seq_len(seq_len)
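Example (a sketch of combining experiments of different lengths and drawing fixed-length sub-sequences; consumption through a standard DataLoader is an assumption):
import numpy as np
from torch.utils.data import DataLoader
from ctrlnmod.utils import Experiment, ExperimentsDataset

def make_exp(n):
    u = np.random.randn(n, 2)
    y = np.random.randn(n, 1)
    return Experiment(u, y, ts=0.1, nx=3, x_trainable=True)

dataset = ExperimentsDataset([make_exp(300), make_exp(450)], seq_len=50)
dataset.append(make_exp(200))   # experiments may have different lengths
dataset.set_seq_len(100)        # change the sub-sequence length afterwards

loader = DataLoader(dataset, batch_size=16, shuffle=True)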
- class ctrlnmod.utils.FrameCacheManager
Bases:
object
- cache_frame()
Context manager to store parameterization results
- register_child(child_cache)
- ctrlnmod.utils.find_module(model, target_class)
- ctrlnmod.utils.is_legal(v)
- Return type:
bool
- ctrlnmod.utils.parse_act_f(act_f)
Parse the activation function from a string or a torch module.
- Parameters:
act_f (Union[str, torch.nn.Module]) – The activation function as a string or a torch module.
- Returns:
The corresponding activation function module.
- Return type:
torch.nn.Module
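Example (the accepted string spellings are an assumption; an nn.Module instance is passed through unchanged):
import torch.nn as nn
from ctrlnmod.utils import parse_act_f

act = parse_act_f("tanh")       # string name resolved to a torch.nn.Module
same = parse_act_f(nn.Tanh())   # a module instance is accepted directly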
ctrlnmod.layers¶
- class ctrlnmod.layers.BetaLayer(n_in, n_out, hidden_layers, act_f=Tanh(), func='softplus', tol=0.01, scale=1.0, use_residual=False)
Bases:
Module
- clone()
Create a clone of the BetaLayer with the same parameters.
- Return type:
BetaLayer
- forward(x)
Compute the matrix-valued function beta(x).
- Parameters:
x (Tensor) – Input tensor of shape (…, n_in)
- Returns:
Output tensor of shape (…, n_out, n_out)
- Return type:
Tensor
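Example (a shape-checking sketch; hidden_layers given as a list of widths is an assumption):
import torch
from ctrlnmod.layers import BetaLayer

layer = BetaLayer(n_in=4, n_out=3, hidden_layers=[16, 16])
x = torch.randn(8, 4)   # batch of inputs of size n_in
beta = layer(x)         # matrix-valued output
print(beta.shape)       # expected: torch.Size([8, 3, 3]), i.e. (..., n_out, n_out)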
- class ctrlnmod.layers.CustomSoftplus(beta=1.0, threshold=20.0, margin=0.01)
Bases:
Module
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class ctrlnmod.layers.SandwichLayer(in_features, out_features, scale=1.0, act_f=ReLU(), param='expm', bias=True, AB=True)
Bases:
Linear
A specific Sandwich layer with Lipschitz constant equal to scale
\[h_{out} = \sqrt{2} A^T \Psi \sigma \left( \sqrt{2} \Psi^{-1} B h_{in} + b \right)\]
- alpha
Tensor scaling parameter for computation
- scale
float | Tensor scaling parameter to define Lipschitz constant
- AB
bool If true the product of A and B matrices is computed instead of just B.
- act_f
activation function for the sandwich layer
- param
str ‘expm’ or ‘cayley’ way to parameterize the matrices on the Stiefel manifold
- scale
float the input tensor is multiplied by scale
References
This module includes some bounded Lipschitz layers. See https://github.com/acfr/LBDN for more details.
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class ctrlnmod.layers.SandwichLinear(in_features, out_features, scale=1.0, param='expm', bias=True, AB=False)
Bases:
Linear
A specific linear layer with Lipschitz constant bounded by scale.
\[h_{out} = \sqrt{2} A^T \Psi \sigma \left( \sqrt{2} \Psi^{-1} B h_{in} + b \right)\]
- alpha
Scaling parameter for computation.
- Type:
torch.Tensor
- scale
Scaling parameter to define the Lipschitz constant.
- Type:
float
- AB
If True, the product of A and B matrices is computed instead of just B.
- Type:
bool
- param
Method to parameterize the matrices on the Stiefel manifold, either ‘expm’ or ‘cayley’.
- Type:
str
- scale
The input tensor is multiplied by this scale.
- Type:
float
References
This module includes some bounded Lipschitz layers. See https://github.com/acfr/lbdn for more details.
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
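Example (a sketch of a small Lipschitz-bounded network in the LBDN style referenced above; stacking the layers this way and reading the overall bound from the per-layer scales is an assumption, not a documented guarantee):
import torch
import torch.nn as nn
from ctrlnmod.layers import SandwichLayer, SandwichLinear

lip_net = nn.Sequential(
    SandwichLayer(4, 32, scale=2.0, act_f=nn.ReLU(), param='expm'),
    SandwichLayer(32, 32, act_f=nn.ReLU(), param='expm'),
    SandwichLinear(32, 1, param='expm'),
)

x = torch.randn(16, 4)
y = lip_net(x)
print(y.shape)   # torch.Size([16, 1])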
- class ctrlnmod.layers.ScaledSoftmax(scale=1.0)
Bases:
Softmax
- forward(input)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type:
Tensor
- ctrlnmod.layers.softplus_epsilon(x, epsilon=1e-06)
ctrlnmod.linalg¶
- class ctrlnmod.linalg.InvSoftmaxEta(eta=1.0, epsilon=1e-06)
Bases:
Module
- forward(s)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class ctrlnmod.linalg.Logm(*args, **kwargs)
Bases:
Function
Computes the matrix logarithm of a given square matrix.
- static backward(ctx, G)
Define a formula for differentiating the operation with backward mode automatic differentiation.
This function is to be overridden by all subclasses. (Defining this function is equivalent to defining the vjp function.)
It must accept a context ctx as the first argument, followed by as many outputs as the forward() returned (None will be passed in for non tensor outputs of the forward function), and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor or is a Tensor not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
- static forward(ctx, A)
Define the forward of the custom autograd Function.
This function is to be overridden by all subclasses. There are two ways to define forward:
Usage 1 (Combined forward and ctx):
@staticmethod
def forward(ctx: Any, *args: Any, **kwargs: Any) -> Any:
    pass
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
See combining-forward-context for more details
Usage 2 (Separate forward and ctx):
@staticmethod
def forward(*args: Any, **kwargs: Any) -> Any:
    pass

@staticmethod
def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
    pass
The forward no longer accepts a ctx argument.
Instead, you must also override the torch.autograd.Function.setup_context() staticmethod to handle setting up the ctx object. output is the output of the forward, inputs are a Tuple of inputs to the forward. See extending-autograd for more details
The context can be used to store arbitrary data that can be then retrieved during the backward pass. Tensors should not be stored directly on ctx (though this is not currently enforced for backward compatibility). Instead, tensors should be saved either with ctx.save_for_backward() if they are intended to be used in backward (equivalently, vjp) or ctx.save_for_forward() if they are intended to be used for jvp.
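Example (a usage sketch; as a torch.autograd.Function it is typically invoked through .apply, and a symmetric positive definite input keeps the real matrix logarithm well defined):
import torch
from ctrlnmod.linalg import Logm

M = torch.randn(4, 4, dtype=torch.float64)
A = (M @ M.T + 4 * torch.eye(4, dtype=torch.float64)).requires_grad_(True)

L = Logm.apply(A)   # matrix logarithm of A
loss = L.trace()    # trace(logm(A)) equals log(det(A)) for SPD matrices
loss.backward()     # gradients flow through the custom backward
print(torch.allclose(loss.detach(), torch.logdet(A.detach())))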
- class ctrlnmod.linalg.MatrixSquareRoot(*args, **kwargs)
Bases:
Function
Square root of a positive definite matrix.
- NOTE: the matrix square root is not differentiable for matrices with zero eigenvalues.
- static backward(ctx, grad_output)
Define a formula for differentiating the operation with backward mode automatic differentiation.
This function is to be overridden by all subclasses. (Defining this function is equivalent to defining the vjp function.)
It must accept a context ctx as the first argument, followed by as many outputs as the forward() returned (None will be passed in for non tensor outputs of the forward function), and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor or is a Tensor not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
- Return type:
Optional[Tensor]
- static forward(ctx, input)
Define the forward of the custom autograd Function.
This function is to be overridden by all subclasses. There are two ways to define forward:
Usage 1 (Combined forward and ctx):
@staticmethod
def forward(ctx: Any, *args: Any, **kwargs: Any) -> Any:
    pass
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
See combining-forward-context for more details
Usage 2 (Separate forward and ctx):
@staticmethod
def forward(*args: Any, **kwargs: Any) -> Any:
    pass

@staticmethod
def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
    pass
The forward no longer accepts a ctx argument.
Instead, you must also override the torch.autograd.Function.setup_context() staticmethod to handle setting up the ctx object. output is the output of the forward, inputs are a Tuple of inputs to the forward. See extending-autograd for more details
The context can be used to store arbitrary data that can be then retrieved during the backward pass. Tensors should not be stored directly on ctx (though this is not currently enforced for backward compatibility). Instead, tensors should be saved either with ctx.save_for_backward() if they are intended to be used in backward (equivalently, vjp) or ctx.save_for_forward() if they are intended to be used for jvp.
- Return type:
Tensor
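Example (a usage sketch through .apply; the input is kept positive definite because of the note on zero eigenvalues above):
import torch
from ctrlnmod.linalg import MatrixSquareRoot

sqrtm = MatrixSquareRoot.apply

M = torch.randn(5, 5, dtype=torch.float64)
P = M @ M.T + torch.eye(5, dtype=torch.float64)   # positive definite
S = sqrtm(P)
print(torch.allclose(S @ S, P, atol=1e-6))        # S is the principal square root of P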
- class ctrlnmod.linalg.SoftmaxEta(eta=1.0, epsilon=1e-06)
Bases:
Module
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- ctrlnmod.linalg.adjoint(A, E, f)
- ctrlnmod.linalg.block_diag(arr_list)
Create a block diagonal matrix from a list of cvxpy matrices.
- ctrlnmod.linalg.cayley(W)
Perform the Cayley transform of a rectangular matrix, adapted from https://github.com/locuslab/orthogonal-convolutions
- ctrlnmod.linalg.check_controllability(A, B, tol=1e-10)
Check the controllability of a system defined by matrices A and B.
- Parameters:
A (Tensor) – State transition matrix of shape (n, n).
B (Tensor) – Input matrix of shape (n, m).
tol (float) – Tolerance for numerical stability.
- Return type:
bool
- Returns:
True if the system is controllable, False otherwise.
- ctrlnmod.linalg.check_observability(A, C, tol=1e-10)
Check the observability of a system defined by matrices A and C.
- Parameters:
A (Tensor) – State transition matrix of shape (n, n).
C (Tensor) – Output matrix of shape (m, n).
tol (float) – Tolerance for numerical stability.
- Return type:
bool
- Returns:
True if the system is observable, False otherwise.
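Example (a double integrator, which is both controllable from its input and observable from its position output):
import torch
from ctrlnmod.linalg import check_controllability, check_observability

A = torch.tensor([[0.0, 1.0],
                  [0.0, 0.0]])
B = torch.tensor([[0.0],
                  [1.0]])
C = torch.tensor([[1.0, 0.0]])

print(check_controllability(A, B))   # True: [B, AB] has full rank
print(check_observability(A, C))     # True: [C; CA] has full rank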
- ctrlnmod.linalg.getEigenvalues(L)
Return the eigenvalues of a given Tensor L.
- Returns:
torch.Tensor: the eigenvalue vector of L
- ctrlnmod.linalg.is_alpha_stable(A, alpha)
Check whether all eigenvalues of A have real parts that are negative and lower than -alpha.
- ctrlnmod.linalg.is_positive_definite(L, tol=0.001)
Check if a Tensor is positive definite up to a fixed tolerance.
- Returns:
True if the matrix is positive definite with a maximum deviation from symmetry up to tol, False otherwise.
- Return type:
bool
- ctrlnmod.linalg.project_onto_stiefel(A)
Project a matrix onto the Stiefel manifold.
The Stiefel manifold \(\mathrm{St}(n, p)\) is the set of all \(n \times p\) matrices with orthonormal columns. This function projects the input matrix \(A \in \mathbb{R}^{n \times p}\) onto the Stiefel manifold using polar decomposition:
\[A = U H, \quad \text{with } U \in \mathrm{St}(n, p)\]
- Parameters:
A (torch.Tensor) – A 2D tensor of shape (n, p) representing the matrix to be projected.
- Returns:
A matrix of shape (n, p) with orthonormal columns, lying on the Stiefel manifold.
- Return type:
numpy.ndarray
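Example (a quick check of the orthonormal-columns property; the documented return type above is numpy.ndarray, which the check assumes):
import numpy as np
import torch
from ctrlnmod.linalg import project_onto_stiefel

A = torch.randn(6, 3)
U = np.asarray(project_onto_stiefel(A))   # shape (6, 3), orthonormal columns
print(np.allclose(U.T @ U, np.eye(3), atol=1e-6))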
ctrlnmod.lmis¶
- class ctrlnmod.lmis.AbsoluteStableLFT(model=None, extract_lmi_matrices=None, A=None, B1=None, C1=None, D11=None, Lambda_vec=None, P=None, alpha=tensor([0.]), mu=tensor([1.]))
Bases:
LMI
- forward()
Returns M as a positive definite matrix.
- Returns:
A positive definite matrix.
- Return type:
Tensor
- classmethod solve(A, B1, C1, D11, alpha, mu=tensor([1.]), solver='MOSEK', tol=1e-06, Lambda=None)
Returns the LMI and the corresponding bounds and certificates for evaluation.
- Parameters:
tensors (Tuple[Tensor]) – A tuple of tensors required for solving the LMI.
solver (str) – The solver to be used for LMI.
tol (float) – The tolerance for the solver.
- Returns:
The solution of the LMI and the corresponding bounds.
- Return type:
Tuple[Tensor, Tensor]
- update_matrices(*args)
- class ctrlnmod.lmis.HInfCont(module, extract_matrices, solver='MOSEK')
Bases:
HInfBase
- forward()
Returns M as a positive definite matrix.
- Returns:
A positive definite matrix.
- Return type:
Tensor
- init_(A=None, B=None, C=None, epsilon=0.0001, solver='MOSEK')
- classmethod solve(A, B, C, D, alpha=0.0, solver='MOSEK', tol=1e-08)
Returns the LMI and the corresponding bounds and certificates for evaluation.
- Parameters:
tensors (Tuple[Tensor]) – A tuple of tensors required for solving the LMI.
solver (str) – The solver to be used for LMI.
tol (float) – The tolerance for the solver.
- Returns:
The solution of the LMI and the corresponding bounds.
- Return type:
Tuple[Tensor, Tensor]
- class ctrlnmod.lmis.HInfDisc(*args, **kwargs)
Bases:
HInfBase
- forward()
Returns M as a positive definite matrix.
- Returns:
A positive definite matrix.
- Return type:
Tensor
- classmethod solve(A, B, C, D, alpha, solver='MOSEK', tol=1e-06)
Returns the LMI and the corresponding bounds and certificates for evaluation.
- Parameters:
tensors (Tuple[Tensor]) – A tuple of tensors required for solving the LMI.
solver (str) – The solver to be used for LMI.
tol (float) – The tolerance for the solver.
- Returns:
The solution of the LMI and the corresponding bounds.
- Return type:
Tuple[Tensor, Tensor]
- class ctrlnmod.lmis.LMI(module, extract_matrices)
Bases:
ABC
,Module
Base class for all Linear Matrix Inequalities (LMI). An LMI is built from the weights of a PyTorch Module. Subclasses must implement the specific attributes/submatrices they need to compute the forward method.
- check_(tol=1e-09)
Checks if the matrix is positive semidefinite within a user-defined tolerance.
- Parameters:
tol (float) – The tolerance for checking positive semidefiniteness.
- Returns:
True if the matrix is positive semidefinite within the given tolerance, False otherwise.
- Return type:
bool
- forward()
Returns M as a positive definite matrix.
- Returns:
A positive definite matrix.
- Return type:
Tensor
- classmethod solve(*args, **kwargs)
Returns the LMI and the corresponding bounds and certificates for evaluation.
- Parameters:
tensors (Tuple[Tensor]) – A tuple of tensors required for solving the LMI.
solver (str) – The solver to be used for LMI.
tol (float) – The tolerance for the solver.
- Returns:
The solution of the LMI and the corresponding bounds.
- Return type:
Tuple[Tensor, Tensor]
- class ctrlnmod.lmis.Lipschitz(module, extract_matrices, beta=1.0, epsilon=1e-06, solver='MOSEK')
Bases:
LMI
This class computes an upper bound on the Lipschitz constant for a neural network that can be put into the standard form
\[\begin{split}z = Az + Bx \\ w = \Sigma(z) \\ y = Cw\end{split}\]
with \(\Sigma\) being a diagonal operator collecting all activation functions. It is assumed that the activation functions in \(\Sigma\) are slope-restricted in \([0, \beta]\).
Note
This is an algebraic equation, so for this form to be well-posed, a sufficient condition is that A is strictly lower triangular, but other conditions exist. See for example https://arxiv.org/abs/2104.05942
- A
the interconnection matrix of size (nz x nz)
- Type:
Tensor
- B
the input matrix of size (nz x n_in)
- Type:
Tensor
- C
the output matrix of size (n_out x nz)
- Type:
Tensor
- extract_matrices
a method provided by the model to extract the submatrices for LMI
- Type:
Callable
- beta
maximum admitted slope for the activation functions. Defaults to 1.
- Type:
float
- lip
lipschitz constant for the network
- Type:
float
- Lambda_vec
1-D vector representing the diagonal matrix certificate for LMI feasibility
- Type:
Tensor
- ExtractMatricesFn
alias of Callable[[], Tuple[Tensor, …]]
- forward()
Returns M as a positive definite matrix.
- Returns:
A positive definite matrix.
- Return type:
Tensor
- init_(A=None, B=None, C=None, beta=1.0, solver='MOSEK')
- classmethod solve(A, B, C, beta=1, solver='MOSEK', tol=1e-08)
This class computes an upper bound on the Lipschitz constant for a neural network that can be put into the standard form:
\[\begin{split}z = Az + Bx \\ w = \Sigma(z) \\ y = Cw\end{split}\]where \(\Sigma\) is a diagonal operator collecting all activation functions.
It is assumed that the activation functions in \(\Sigma\) are slope-restricted in \([0, \beta]\).
Note: This is an algebraic equation. For the formulation to be well-posed, a sufficient condition is that \(A\) is strictly lower triangular, although other conditions may apply.
For more details, see:
TODO: For larger networks, solving the LMI can become difficult — this is particularly true for the LBDN architecture. This requires further investigation.
- Return type:
Tuple[Tensor, Tensor, Tensor]
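Example (a sketch of solving the LMI directly for a tiny network in the standard form above, with A strictly lower triangular for well-posedness; the content of the returned triple is not detailed here, and the default MOSEK solver must be installed):
import torch
from ctrlnmod.lmis import Lipschitz

nz, n_in, n_out = 3, 2, 1
A = torch.tril(torch.randn(nz, nz), diagonal=-1)   # strictly lower triangular
B = torch.randn(nz, n_in)
C = torch.randn(n_out, nz)

result = Lipschitz.solve(A, B, C, beta=1.0, solver='MOSEK')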
- class ctrlnmod.lmis.LyapunovContinuous(A, alpha)
Bases:
LMI
Lyapunov LMI for continuous-time linear systems.
- forward()
Compute the LMI for the continuous-time Lyapunov equation.
- Returns:
The positive definite LMI matrix.
- Return type:
Tensor
- classmethod solve(A, alpha, tol=1e-09, solver='MOSEK', Q=None)
Solve the continuous-time Lyapunov LMI using cvxpy.
- Parameters:
A (Tensor) – The system matrix A.
alpha (float) – The decay rate alpha for alpha-stability.
tol (float) – The tolerance for the solver.
solver (str) – The solver to use for cvxpy.
- Returns:
The positive definite solution matrix P and the corresponding bounds.
- Return type:
Tuple[Tensor, Tensor]
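Example (a sketch certifying alpha-stability of a stable A; interpreting the returned tuple as (P, bounds) follows the description above, and the standard alpha-stability condition A^T P + P A + 2*alpha*P < 0 with P > 0 is an assumption about the exact LMI form used):
import torch
from ctrlnmod.lmis import LyapunovContinuous

A = torch.tensor([[-1.0, 0.5],
                  [0.0, -2.0]])   # eigenvalues -1 and -2, so alpha = 0.5 is feasible
P, bounds = LyapunovContinuous.solve(A, alpha=0.5, solver='MOSEK')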
- class ctrlnmod.lmis.LyapunovDiscrete(A, alpha)
Bases:
LMI
Lyapunov LMI for discrete-time linear systems.
- forward()
Compute the LMI for the discrete-time Lyapunov equation.
- Returns:
The positive definite LMI matrix.
- Return type:
Tensor
- classmethod solve(A, alpha, tol=1e-09, solver='MOSEK')
Solve the discrete-time Lyapunov LMI using cvxpy.
- Parameters:
A (Tensor) – The system matrix A.
alpha (float) – The decay rate alpha for alpha-stability.
tol (float) – The tolerance for the solver.
solver (str) – The solver to use for cvxpy.
- Returns:
The positive definite solution matrix P and the corresponding bounds.
- Return type:
Tuple[Tensor, Tensor]
ctrlnmod.losses¶
- class ctrlnmod.losses.BaseLoss(regularizers=None)
Bases:
ABC
- add_regularization(loss, **kwargs)
Adds regularization terms to the given loss.
- Parameters:
loss (Tensor) – The base loss.
**kwargs – Additional keyword arguments for StateRegularization.
- Returns:
The loss with regularization terms added.
- Return type:
Tensor
- update()
- class ctrlnmod.losses.FitPercentLoss(regularizers=None)
Bases:
BaseLoss
- class ctrlnmod.losses.MSELoss(regularizers=None)
Bases:
BaseLoss
- class ctrlnmod.losses.NMSELoss(regularizers=None)
Bases:
BaseLoss
- class ctrlnmod.losses.NRMSELoss(regularizers=None)
Bases:
BaseLoss
- class ctrlnmod.losses.RMSELoss(regularizers=None)
Bases:
BaseLoss
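Example (a minimal sketch using only the constructor and the add_regularization() method documented above; with no regularizers registered, the loss is expected to come back unchanged):
import torch
from ctrlnmod.losses import NMSELoss

criterion = NMSELoss()   # no regularizers registered
base_loss = torch.mean((torch.randn(16, 1) - torch.randn(16, 1)) ** 2)
total_loss = criterion.add_regularization(base_loss)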
ctrlnmod.optim¶
- class ctrlnmod.optim.BackTrackOptimizer(optimizer, module, condition_fn, beta=0.5, max_iter=20)
Bases:
object
Wrapper optimizer implementing backtracking line search to enforce a condition on updated parameters.
This optimizer wraps a PyTorch optimizer and performs a backtracking line search after each step. After performing the step, it checks a user-provided condition function on the updated model parameters. If the condition is not satisfied, it rolls back the parameters and reduces the learning rate by multiplying it by beta, then retries the step. This process repeats until the condition is met or the maximum number of backtracking iterations is reached.
- Parameters:
optimizer (torch.optim.Optimizer) – The base optimizer to wrap.
module (nn.Module) – The model whose parameters are being optimized.
condition_fn (Callable[[nn.Module], bool]) – A callable that receives the model and returns True if the updated parameters satisfy the acceptance condition.
beta (float, optional) – Multiplicative factor to decrease learning rate during backtracking (default: 0.5).
max_iter (int, optional) – Maximum number of backtracking iterations (default: 20).
- n_backtrack_iter
The number of backtracking iterations performed.
- Type:
int
Example
>>> import torch
>>> import torch.nn as nn
>>> import torch.optim as optim
>>>
>>> model = MyModel()
>>> criterion = nn.MSELoss()
>>> base_optimizer = optim.SGD(model.parameters(), lr=0.1)
>>>
>>> def condition_fn(mod):
...     # Example condition: loss decreases after step
...     output = mod(input)
...     loss = criterion(output, target)
...     return loss.item() < old_loss
>>>
>>> optimizer = BackTrackOptimizer(base_optimizer, model, condition_fn)
>>>
>>> def closure():
...     optimizer.zero_grad()
...     output = model(input)
...     loss = criterion(output, target)
...     loss.backward()
...     return loss
>>>
>>> for input, target in data_loader:
...     old_loss = float('inf')
...     optimizer.step(closure)
- step(closure)
Performs a single optimization step with backtracking line search.
- Parameters:
closure (callable) – A closure that reevaluates the model and returns the loss.
- Returns:
The loss returned by the closure after the accepted step.
- zero_grad()
Clears the gradients of all optimized parameters.
- class ctrlnmod.optim.ProjectedOptimizer(optimizer, project, model, modules=None)
Bases:
Optimizer
Optimizer wrapper that applies a projection function on specified model parameters after each optimization step.
This optimizer delegates optimization to a wrapped PyTorch optimizer and then applies a user-defined projection function (e.g. projection onto positive definite matrices) on selected parameters to enforce constraints.
- Parameters:
optimizer (torch.optim.Optimizer) – The base optimizer to wrap.
project (Callable[[torch.Tensor], torch.Tensor]) – Projection function applied to the selected parameters after each step.
model (torch.nn.Module) – The model whose parameters are being optimized.
modules (Optional[List[Union[str, torch.nn.Module]]], optional) – List of parameter names or submodules to which projection is applied. If None, no projection is applied. Defaults to None.
Example
>>> import torch.nn as nn
>>> import torch.optim as optim
>>>
>>> model = MyModel()
>>> base_optimizer = optim.Adam(model.parameters(), lr=0.01)
>>>
>>> # Project parameters with names 'layer1.weight' and 'layer2.weight' to positive definite matrices
>>> proj_opt = ProjectedOptimizer(
...     base_optimizer,
...     project=project_to_pos_def,
...     model=model,
...     modules=['layer1.weight', 'layer2.weight']
... )
>>>
>>> def closure():
...     proj_opt.zero_grad()
...     output = model(input)
...     loss = criterion(output, target)
...     loss.backward()
...     return loss
>>>
>>> for input, target in data_loader:
...     proj_opt.step(closure)
- add_param_group(param_group)
Add a param group to the Optimizer's param_groups.
This can be useful when fine tuning a pre-trained network as frozen layers can be made trainable and added to the Optimizer as training progresses.
- Parameters:
param_group (dict) – Specifies what Tensors should be optimized along with group specific optimization options.
- property param_groups
- property state
- step(closure=None)
Performs a single optimization step, then applies the projection function to the selected parameters.
- Parameters:
closure (callable, optional) – A closure that reevaluates the model and returns the loss.
- Returns:
The loss returned by the closure, if any.
- zero_grad()
Clears the gradients of all optimized parameters.
- ctrlnmod.optim.project_to_pos_def(matrix)
Projects a symmetric matrix to the closest positive definite matrix.
This function symmetrizes the input matrix, then performs an eigen-decomposition, clips the eigenvalues to a minimum positive threshold (1e-6), and reconstructs the matrix to ensure positive definiteness.
- Parameters:
matrix (torch.Tensor) – A square matrix (assumed symmetric or nearly symmetric).
- Returns:
The closest positive definite matrix.
- Return type:
torch.Tensor
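Example (the snippet below just illustrates the documented behaviour on an indefinite matrix):
import torch
from ctrlnmod.optim import project_to_pos_def

M = torch.tensor([[1.0, 2.0],
                  [2.0, -3.0]])   # symmetric but indefinite
P = project_to_pos_def(M)
print(torch.linalg.eigvalsh(P))   # all eigenvalues clipped to at least 1e-6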
ctrlnmod.preprocessing¶
ctrlnmod.regularizations¶
- class ctrlnmod.regularizations.DDRegularization(lmi, lambda_dd, update_factor, actf='lse', updatable=True, verbose=False, e=0.1)
Bases:
Regularization
- update()
- class ctrlnmod.regularizations.L1Regularization(model, lambda_l1, update_factor, updatable=True, verbose=False)
Bases:
Regularization
- update()
- Return type:
None
- class ctrlnmod.regularizations.L2Regularization(model, lambda_l2, update_factor, updatable=True, verbose=False)
Bases:
Regularization
- update()
- Return type:
None
- class ctrlnmod.regularizations.LogdetRegularization(lmi, lambda_logdet, update_factor, updatable=True, min_weight=1e-06, verbose=False)
Bases:
Regularization
- update()
- Return type:
None
- class ctrlnmod.regularizations.Regularization(model, factor, updatable=True, verbose=False)
Bases:
ABC
,Module
- abstractmethod update()
- Return type:
None
- class ctrlnmod.regularizations.StateRegularization(model, lambda_state, update_factor, updatable=True, verbose=False)
Bases:
Regularization
- update()
- Return type:
None
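Example (a construction sketch using the signatures listed above; passing the regularizers to a loss as a list, and update() adjusting the weight by update_factor, are assumptions):
import torch.nn as nn
from ctrlnmod.regularizations import L2Regularization
from ctrlnmod.losses import MSELoss

model = nn.Linear(4, 2)   # placeholder model for illustration
l2 = L2Regularization(model, lambda_l2=1e-3, update_factor=0.5)

criterion = MSELoss(regularizers=[l2])
l2.update()               # adjust the regularization weight during training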
ctrlnmod.train¶
- class ctrlnmod.train.LitNode(model, criterion, val_criterion, lr, patience_soft=30, use_backtracking=False, use_projection=False, condition_fn=None, custom_logging_fn=None, log_gradient_norms=False, val_idx_max=None)
Bases:
LightningModule
- configure_optimizers()
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of these 6 options.
Single optimizer.
List or Tuple of optimizers.
Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).
Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.
None - Fit will run without any optimizer.
The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.
lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after a optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}
When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.
Metrics can be made available to monitor by simply logging it using self.log('metric_to_track', metric_val) in your LightningModule.
Note
Some things to know:
Lightning calls .backward() and .step() automatically in case of automatic optimization.
If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler's .step() method automatically in case of automatic optimization.
If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
If you use multiple optimizers, you will have to switch to 'manual optimization' mode and step them yourself.
If you need to control how often the optimizer steps, override the optimizer_step() hook.
- forward(u, x0, d=None)
Same as torch.nn.Module.forward().
- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- get_res_dir()
- classmethod load_model(checkpoint_path, model, criterion, val_criterion)
Helper method to load the model more easily.
- on_after_backward()
Called after loss.backward() and before optimizers are stepped.
Note
If using native AMP, the gradients will not be unscaled at this point. Use the on_before_optimizer_step if you need the unscaled gradients.
- on_train_epoch_end()
Called in the training loop at the very end of the epoch.
To access all batch outputs at the end of the epoch, you can cache step outputs as an attribute of the LightningModule and access them in this hook:
class MyLightningModule(L.LightningModule):
    def __init__(self):
        super().__init__()
        self.training_step_outputs = []

    def training_step(self):
        loss = ...
        self.training_step_outputs.append(loss)
        return loss

    def on_train_epoch_end(self):
        # do something with all training_step outputs, for example:
        epoch_mean = torch.stack(self.training_step_outputs).mean()
        self.log("training_epoch_mean", epoch_mean)
        # free up the memory
        self.training_step_outputs.clear()
- on_train_start()
Called at the beginning of training after sanity check.
- on_validation_epoch_end()
Called in the validation loop at the very end of the epoch.
- training_step(train_batch, batch_idx)
Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)
- Returns:
Tensor - The loss tensor
dict - A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.
None - In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.
In this step you'd normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to 'manual optimization' and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()
    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
- validation_step(batch, batch_idx)
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest like accuracy.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)
- Returns:
Tensor - The loss tensor
dict - A dictionary. Can include any keys, but must include the key 'loss'.
None - Skip to the next batch.
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...


# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.
# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don't need to validate you don't need to implement this method.
Note
When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- ctrlnmod.train.train_model(lit_model, data_module, logger, epochs, patience=100)
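Example (a high-level wiring sketch; the model and data module are placeholders for a ctrlnmod state-space network and a LightningDataModule built on ExperimentsDataset, and the logger import path depends on the installed Lightning package):
from lightning.pytorch.loggers import TensorBoardLogger   # or pytorch_lightning.loggers
from ctrlnmod.losses import NMSELoss, FitPercentLoss
from ctrlnmod.train import LitNode, train_model

model = ...        # a ctrlnmod simulator / state-space model
data_module = ...  # a LightningDataModule wrapping ExperimentsDataset

lit_model = LitNode(
    model,
    criterion=NMSELoss(),
    val_criterion=FitPercentLoss(),
    lr=1e-3,
    patience_soft=30,
)
logger = TensorBoardLogger("logs", name="ctrlnmod_run")
train_model(lit_model, data_module, logger, epochs=200, patience=100)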