quantbullet.research.jump_model
Module for statistical jump models
Module Contents
Classes
| DiscreteJumpModel | Statistical Jump Model with Discrete States |
| ContinuousJumpModel | Continuous Jump Model with Soft State Assignments |
| FeatureGenerator | Enrich univariate time series with features |
| SimulationGenerator | Generate simulated returns that follow a Hidden Markov process. |
| TestingUtils | Parameters and plotting functions for testing |
Attributes
- quantbullet.research.jump_model.logger
- class quantbullet.research.jump_model.DiscreteJumpModel
- Statistical Jump Model with Discrete States
 - fixed_states_optimize(y, s, k=2)
- Optimize the parameters of a discrete jump model with the state sequence held fixed.
- Parameters:
- y (np.ndarray) – Observed data of shape (T x n_features). 
- s (np.ndarray) – State sequence of shape (T x 1). 
- theta_guess (np.ndarray) – Initial guess for theta of shape (k x n_features). Listed in the docstring but absent from the signature shown above. 
- k (int) – Number of states. 
 
- Returns:
- np.ndarray: Optimized parameters of shape (k x n_features). 
- float: Optimal value of the objective function. 
 
- Return type:
- tuple 
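
As a minimal sketch of this step, assuming a squared-Euclidean loss (an assumption; the loss function is not stated on this page), the optimal parameters for a fixed state sequence reduce to per-state means. The name `fit_theta_given_states` is hypothetical and not part of quantbullet.

```python
import numpy as np

def fit_theta_given_states(y, s, k=2):
    """Per-state means and total squared loss (assumes squared-Euclidean loss)."""
    y = np.asarray(y, dtype=float)
    s = np.asarray(s).ravel().astype(int)
    theta = np.zeros((k, y.shape[1]))
    for j in range(k):
        mask = s == j
        if mask.any():  # states with no observations keep a zero row
            theta[j] = y[mask].mean(axis=0)
    # Objective value: sum of squared distances to the assigned state's mean.
    loss = float(((y - theta[s]) ** 2).sum())
    return theta, loss

y = np.array([[0.0], [2.0], [10.0], [12.0]])
s = np.array([0, 0, 1, 1])
theta, loss = fit_theta_given_states(y, s, k=2)
```

Under a different loss the minimizer would generally not be the mean, so this only illustrates the shape of the inputs and outputs.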
 
 - generate_loss_matrix(y, theta)
- Generate the loss matrix for a discrete jump model with theta held fixed.
- Parameters:
- y (np.ndarray) – observed data (T x n_features) 
- theta (np.ndarray) – parameters (k x n_features) 
- k (int) – number of states 
 
- Returns:
- loss (np.ndarray): loss matrix (T x k) 
- Return type:
- np.ndarray 
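
The (T x k) layout can be illustrated with a short sketch. It assumes the loss is the squared Euclidean distance between each observation and each state's parameter vector; the library may scale or define the loss differently. The helper `loss_matrix` is hypothetical.

```python
import numpy as np

def loss_matrix(y, theta):
    """Return a (T x k) matrix of squared distances between rows of y and rows of theta."""
    y = np.asarray(y, dtype=float)
    theta = np.asarray(theta, dtype=float)
    diff = y[:, None, :] - theta[None, :, :]  # shape (T, k, n_features)
    return (diff ** 2).sum(axis=2)

y = np.array([[0.0, 0.0], [3.0, 4.0]])
theta = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]])
L = loss_matrix(y, theta)  # entry [t, j] is the loss of assigning observation t to state j
```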
 
 - fixed_theta_optimize(lossMatrix, lambda_)
- Optimize the state sequence of a discrete jump model with the parameters held fixed.
- Parameters:
- lossMatrix (np.ndarray) – loss matrix (T x k) 
- lambda_ (float) – regularization parameter 
 
- Returns:
- s (np.ndarray): optimal state sequence (T,) 
- v (float): optimal value of the objective function 
- Return type:
- tuple 
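
One way to solve this step is dynamic programming. The sketch below assumes the objective is the sum of per-step losses plus `lam` for every change of state between consecutive time steps; that form is an assumption, and `optimal_states` is a hypothetical helper, not the library's implementation.

```python
import numpy as np

def optimal_states(loss, lam):
    """Minimize sum_t loss[t, s_t] + lam * (number of state changes) by dynamic programming."""
    loss = np.asarray(loss, dtype=float)
    T, k = loss.shape
    penalty = lam * (1.0 - np.eye(k))      # penalty[i, j]: cost of moving from state i to j
    value = np.zeros((T, k))               # value[t, j]: best cost of a path ending in j at t
    back = np.zeros((T, k), dtype=int)     # back[t, j]: best predecessor of j at time t
    value[0] = loss[0]
    for t in range(1, T):
        total = value[t - 1][:, None] + penalty   # rows: previous state, columns: current state
        back[t] = total.argmin(axis=0)
        value[t] = loss[t] + total.min(axis=0)
    # Trace the optimal path backwards from the cheapest final state.
    s = np.zeros(T, dtype=int)
    s[-1] = value[-1].argmin()
    for t in range(T - 1, 0, -1):
        s[t - 1] = back[t, s[t]]
    return s, float(value[-1].min())

loss = np.array([[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
s_small, v_small = optimal_states(loss, lam=0.1)   # cheap jumps: follows the per-step minimum
s_large, v_large = optimal_states(loss, lam=10.0)  # expensive jumps: stays in one state
```

A larger penalty produces a more persistent state sequence, which is the role the regularization parameter plays in this sketch.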
 
 - initialize_kmeans_plusplus(data, k)
- Initialize the centroids using the k-means++ method.
- Parameters:
- data – ndarray of shape (n_samples, n_features) 
- k – number of clusters 
 
- Returns:
- centroids: ndarray of shape (k, n_features) 
- Return type:
- ndarray 
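
For reference, standard k-means++ seeding can be sketched as follows. The function name and the `seed` argument are hypothetical additions for reproducibility; the library's own routine may differ in details.

```python
import numpy as np

def kmeans_plusplus_init(data, k, seed=0):
    """Pick k rows of data as centroids using k-means++ seeding."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    # First centroid: drawn uniformly at random from the data.
    centroids = [data[rng.integers(n)]]
    for _ in range(1, k):
        chosen = np.array(centroids)
        # Squared distance from each point to its nearest chosen centroid.
        d2 = ((data[:, None, :] - chosen[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        # Next centroid: sampled with probability proportional to that distance.
        centroids.append(data[rng.choice(n, p=d2 / d2.sum())])
    return np.array(centroids)

data = np.array([[0.0, 0.0], [0.0, 0.0], [10.0, 10.0], [10.0, 10.0]])
centroids = kmeans_plusplus_init(data, k=2)
```

The sketch does not guard against the degenerate case where all points coincide, in which the sampling weights would be undefined.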
 
 - classify_data_to_states(data, centroids)
- Classify data points to the states based on the centroids.
- Parameters:
- data – ndarray of shape (n_samples, n_features) 
- centroids – centroids or means of the states, ndarray of shape (k, n_features) 
 
- Returns:
- state_assignments: ndarray of shape (n_samples,), indices of the states to which each data point is assigned 
- Return type:
- ndarray 
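
A nearest-centroid assignment is one natural reading of this method. The sketch below assumes Euclidean distance; `assign_to_nearest` is a hypothetical name.

```python
import numpy as np

def assign_to_nearest(data, centroids):
    """Return, for each row of data, the index of the closest centroid."""
    data = np.asarray(data, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)  # (n_samples, k)
    return d2.argmin(axis=1)

data = np.array([[0.1], [0.9], [5.2], [4.8]])
centroids = np.array([[0.0], [5.0]])
states = assign_to_nearest(data, centroids)
```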
 
 - infer_states_stats(ts_returns, states)
- Compute the mean and standard deviation of returns for each state.
- Parameters:
- ts_returns (np.ndarray) – observed returns (T x 1) 
- states (np.ndarray) – state sequence (T x 1) 
 
- Returns:
- state_features (dict): mean and standard deviation of returns for each state 
- Return type:
- dict 
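
The per-state statistics can be computed with boolean masks. In this sketch the dictionary layout (state index mapped to a `(mean, std)` tuple) and the use of the population standard deviation are assumptions; the actual structure of `state_features` is not documented here.

```python
import numpy as np

def state_stats(ts_returns, states):
    """Map each state index to (mean, std) of the returns observed in that state."""
    r = np.asarray(ts_returns, dtype=float).ravel()
    s = np.asarray(states).ravel()
    return {
        int(j): (float(r[s == j].mean()), float(r[s == j].std()))  # std uses ddof=0
        for j in np.unique(s)
    }

ts_returns = np.array([0.01, 0.03, -0.02, -0.04])
states = np.array([0, 0, 1, 1])
stats = state_stats(ts_returns, states)
```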
 
 - remapResults(optimized_s, optimized_theta, ts_returns)
- Remap the results of the optimization.
- The states are reordered by increasing volatility of returns, because volatility has smaller variance than returns. A warning is triggered if the states identified by volatility and by returns differ. 
 - cleanResults(raw_result, ts_returns, rearrange=False)
- Clean the results of the optimization.
- This extracts the best result from the trials (ten by default), based on the loss. 
 - single_run(y, k, lambda_)
- Run a single trial of the optimization. Each trial uses a different initialization of the centroids.
- Parameters:
- y (np.ndarray) – observed data (T x n_features) 
- k (int) – number of states 
- lambda_ (float) – regularization parameter 
 
- Returns:
- cur_s (np.ndarray): optimal state sequence (T x 1) 
- loss (float): optimal value of the objective function 
- cur_theta (np.ndarray): optimal parameters (k x n_features) 
- Return type:
- tuple 
 
 - fit(y, k=2, lambda_=100, rearrange=False, n_trials=10)
- Fit the discrete jump model.
- Note
- A multiprocessing implementation is used to speed up the optimization. By default, ten trials with k-means++ initialization are run.
- Parameters:
- y (np.ndarray) – observed data (T x n_features) 
- k (int) – number of states 
- lambda_ (float) – regularization parameter 
- rearrange (bool) – whether to rearrange the states in increasing order of volatility 
- n_trials (int) – number of trials, each with its own k-means++ initialization 
 
- Returns:
- best_s (np.ndarray): optimal state sequence (T x 1) 
- best_loss (float): optimal value of the objective function 
- best_theta (np.ndarray): optimal parameters (k x n_features) 
- optimized_s (list): state sequences from all trials (10 x T) 
- optimized_loss (list): objective function values from all trials (10 x 1) 
- optimized_theta (list): parameters from all trials (10 x k x n_features) 
- Return type:
- tuple 
 
 - evaluate(true, pred, plot=False)
- Evaluate the model using the balanced accuracy score.
- Parameters:
- true (np.ndarray) – true state sequence (T x 1) 
- pred (np.ndarray) – predicted state sequence (T x 1) 
- plot (bool) – whether to plot the true and predicted state sequences 
 
- Returns:
- res (dict): evaluation results 
- Return type:
- dict 
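
Balanced accuracy is the average of the per-class recalls, which keeps a rarely visited state from being swamped by a dominant one. The sketch below computes it with NumPy only; `balanced_accuracy` is a hypothetical helper, and the contents of the `res` dictionary returned by `evaluate` are not reproduced here.

```python
import numpy as np

def balanced_accuracy(true, pred):
    """Mean of per-class recall over the classes present in `true`."""
    true = np.asarray(true).ravel()
    pred = np.asarray(pred).ravel()
    recalls = [np.mean(pred[true == c] == c) for c in np.unique(true)]
    return float(np.mean(recalls))

true = np.array([0, 0, 0, 0, 1, 1])
pred = np.array([0, 0, 0, 1, 1, 0])
score = balanced_accuracy(true, pred)  # recall is 3/4 for state 0 and 1/2 for state 1
```

Note that state labels from an unsupervised fit are arbitrary, so predicted labels may need to be matched to the true labels before scoring.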
 
 
- class quantbullet.research.jump_model.ContinuousJumpModel
- Bases: DiscreteJumpModel
- Continuous Jump Model with Soft State Assignments
 - fixed_states_optimize(y, s, k=None)
- Optimize theta given fixed states.
- Parameters:
- y – (T, n_features) array of observations 
- s – (T, k) array of state assignments 
 
- Returns:
- theta: (k, n_features) array of optimal parameters 
- Return type:
- np.ndarray 
 - Note
- Each row of s is assumed to sum to 1. 
 - generate_C(k, grid_size=0.05)
- Uniformly sample states distributed on a grid.
- Parameters:
- k (int) – number of states 
- Returns:
- matrix (np.ndarray): K x N matrix of states 
- Return type:
- np.ndarray 
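
One plausible construction of such a grid, offered as an assumption rather than the library's actual method, enumerates every probability vector over k states whose entries are multiples of `grid_size`. Each column is then a candidate soft state assignment. The name `simplex_grid` is hypothetical.

```python
import itertools
import numpy as np

def simplex_grid(k, grid_size=0.05):
    """Return a (k x N) matrix whose columns are grid points on the probability simplex."""
    n = int(round(1.0 / grid_size))  # number of grid steps per unit of probability
    columns = []
    # Choose the first k-1 entries freely; the last entry takes the remaining mass.
    for head in itertools.product(range(n + 1), repeat=k - 1):
        if sum(head) <= n:
            columns.append(list(head) + [n - sum(head)])
    return np.array(columns, dtype=float).T / n

C = simplex_grid(2, grid_size=0.25)
C3 = simplex_grid(3, grid_size=0.25)
```

The number of columns grows quickly with k and with a finer grid, which is worth keeping in mind when choosing `grid_size`.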
 
 - fixed_theta_optimize(lossMatrix, lambda_, C)
- Optimize the state sequence of a continuous jump model with the parameters held fixed.
- Parameters:
- lossMatrix (np.ndarray) – loss matrix (T x K) 
- C (np.ndarray) – K x N matrix of states 
- lambda_ (float) – regularization parameter 
 
- Returns:
- s_hat (np.ndarray): optimal state sequence, given as a probability distribution over states at each step (T x K) 
- v_hat (float): loss value 
- Return type:
- tuple 
 
 - fit(y, k=2, lambda_=100, rearrange=False, n_trials=10, max_iter=20)
- Fit the continuous jump model.
- Note
- A multiprocessing implementation is used to speed up the optimization. By default, ten trials with k-means++ initialization are run.
- Parameters:
- y (np.ndarray) – observed data (T x n_features) 
- k (int) – number of states 
- lambda_ (float) – regularization parameter 
- rearrange (bool) – whether to rearrange the states in increasing order of volatility 
 
- Returns:
- best_s (np.ndarray): optimal state sequence (T x 1) 
- best_loss (float): optimal value of the objective function 
- best_theta (np.ndarray): optimal parameters (k x n_features) 
- optimized_s (list): state sequences from all trials (10 x T) 
- optimized_loss (list): objective function values from all trials (10 x 1) 
- optimized_theta (list): parameters from all trials (10 x k x n_features) 
- Return type:
- tuple 
 
 
- class quantbullet.research.jump_model.FeatureGenerator
- Enrich univariate time series with features 
- class quantbullet.research.jump_model.SimulationGenerator
- Generate simulated returns that follow a Hidden Markov process.
 - stationary_distribution(transition_matrix)
- Computes the stationary distribution for a given Markov transition matrix.
- Parameters:
- transition_matrix (numpy array) – The Markov transition matrix. 
- Returns:
- The stationary distribution. 
- Return type:
- numpy array 
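
The stationary distribution π satisfies π P = π, so it is the left eigenvector of the transition matrix for eigenvalue 1, normalized to sum to 1. A minimal sketch, assuming a row-stochastic matrix with a unique stationary distribution (the helper name `stationary` is hypothetical):

```python
import numpy as np

def stationary(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalized to sum to 1."""
    P = np.asarray(P, dtype=float)
    vals, vecs = np.linalg.eig(P.T)  # right eigenvectors of P.T are left eigenvectors of P
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

P = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = stationary(P)  # long-run fraction of time spent in each state
```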
 
 - simulate_markov(transition_matrix, initial_distribution, steps)
- Simulates a Markov process.
- Parameters:
- transition_matrix (numpy array) – The Markov transition matrix. 
- initial_distribution (numpy array) – The initial state distribution. 
- steps (int) – The number of steps to simulate. 
 
- Returns:
- states (list): The states at each step. 
- Return type:
- list 
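
A Markov chain can be simulated by drawing the first state from the initial distribution and each later state from the row of the transition matrix indexed by the current state. In this sketch `steps` is taken to be the number of states returned, which is an assumption, and the `seed` argument is a hypothetical addition for reproducibility.

```python
import numpy as np

def simulate_chain(transition_matrix, initial_distribution, steps, seed=0):
    """Return a list of `steps` states sampled from the Markov chain."""
    rng = np.random.default_rng(seed)
    P = np.asarray(transition_matrix, dtype=float)
    k = P.shape[0]
    states = [int(rng.choice(k, p=initial_distribution))]
    for _ in range(steps - 1):
        # The next state depends only on the current one.
        states.append(int(rng.choice(k, p=P[states[-1]])))
    return states

P = np.array([[0.95, 0.05], [0.10, 0.90]])
states = simulate_chain(P, [0.5, 0.5], steps=200)
```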
 
 - generate_conditional_data(states, parameters)
- Generate data using normal distribution conditional on the states.
- Parameters:
- states (list) – The list of states 
- parameters (dict) – Parameters for each state with means and standard deviations 
 
- Returns:
- data (list): Simulated data conditional on the states. 
- Return type:
- list 
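
Conditional sampling draws one normal variate per time step, using the mean and standard deviation of the state active at that step. The layout of `parameters` below (state mapped to a `(mean, std)` tuple) is an assumption; the dictionary format expected by the library is not specified on this page.

```python
import numpy as np

def conditional_normal(states, parameters, seed=0):
    """Draw one normal sample per state, using that state's (mean, std)."""
    rng = np.random.default_rng(seed)
    return [float(rng.normal(*parameters[s])) for s in states]

states = [0, 0, 1, 1, 0]
parameters = {0: (0.001, 0.01), 1: (-0.002, 0.03)}  # hypothetical (mean, std) per state
data = conditional_normal(states, parameters)
```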
 
 - run(steps, transition_matrix, norm_params)
- Run the simulation and return the simulated states and the conditional data.
- Note
- The simulated sequence must visit every state; if it does not, the simulation is re-run.
- Parameters:
- steps (int) – number of steps to simulate 
- transition_matrix (np.ndarray) – transition matrix (k x k) 
- norm_params (dict) – parameters for the normal distribution for each state 
 
- Returns:
- simulated_states (list): simulated states 
- simulated_data (list): simulated data conditional on states 
- Return type:
- tuple 
 
 
- class quantbullet.research.jump_model.TestingUtils
- Parameters and plotting functions for testing
 - plot_returns(returns, shade_list=None)
- Plot both the cumulative returns and returns on separate subplots sharing the x-axis.
- Parameters:
- returns (np.ndarray) – An array of returns.