Generate a dataset — generate_dataset

This function runs the complete pipeline for generating a dataset with dyngen. For more control over how the dataset is generated, run each of the steps in this function separately.
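The separate steps mentioned above can be sketched as follows. This is a minimal sketch, not the function's actual source: it assumes the step functions exported by dyngen (generate_tf_network(), generate_feature_network(), generate_kinetics(), generate_gold_standard(), generate_cells(), generate_experiment()) and the as_list() converter, so check the names against your installed version before relying on it.

```r
library(dyngen)

# Default-sized model; the simulation steps can take a while at this size.
model <- initialise_model(backbone = backbone_bifurcating())

# Run the pipeline one step at a time so that intermediate results
# can be inspected or modified between steps.
model <- generate_tf_network(model)
model <- generate_feature_network(model)
model <- generate_kinetics(model)
model <- generate_gold_standard(model)
model <- generate_cells(model)
model <- generate_experiment(model)

# Convert the finished model to an output format (here: a plain list).
# Assumption: as_list() is the converter behind format = "list".
dataset <- as_list(model)
```

Running the steps individually makes it possible to stop after any stage, for example to inspect the generated network before starting the more expensive simulations.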

generate_dataset(
  model,
  format = c("list", "dyno", "sce", "seurat", "anndata", "none"),
  output_dir = NULL,
  make_plots = FALSE,
  store_dimred = model$simulation_params$compute_dimred,
  store_cellwise_grn = model$simulation_params$compute_cellwise_grn,
  store_rna_velocity = model$simulation_params$compute_rna_velocity
)

Arguments

model: A dyngen initial model created with initialise_model().
format: Which output format to use. Must be one of 'dyno' (requires dynwrap), 'sce' (requires SingleCellExperiment), 'seurat' (requires Seurat), 'anndata' (requires anndata), 'list' or 'none'.
output_dir: If not NULL, then the generated model and dynwrap dataset will be written to files in this directory.
make_plots: Whether or not to generate an overview of the dataset.
store_dimred: Whether or not to store the dimensionality reduction constructed on the true counts.
store_cellwise_grn: Whether or not to also store cellwise GRN information.
store_rna_velocity: Whether or not to store the log propensity ratios.
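Because several formats depend on an optional package, it can help to check availability before calling generate_dataset(). The sketch below uses only base R. The helper format_is_available() and the lookup vector format_requirements are hypothetical names introduced for illustration and are not part of dyngen. The package names are taken from the format description above.

```r
# Packages needed by each output format, as listed in the `format` argument.
# "list" and "none" are absent because they need no extra package.
format_requirements <- c(
  dyno = "dynwrap",
  sce = "SingleCellExperiment",
  seurat = "Seurat",
  anndata = "anndata"
)

# Hypothetical helper (not part of dyngen): returns TRUE when the package
# backing `format` is installed, or when the format needs no package.
format_is_available <- function(format) {
  format <- match.arg(
    format,
    c("list", "dyno", "sce", "seurat", "anndata", "none")
  )
  pkg <- unname(format_requirements[format])
  if (is.na(pkg)) {
    return(TRUE)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Fall back to the plain list format when the preferred one is unavailable.
fmt <- "dyno"
if (!format_is_available(fmt)) {
  fmt <- "list"
}
# out <- generate_dataset(model, format = fmt)
```

The final call is left commented out so the snippet runs without a model. With a model from initialise_model(), uncomment it to generate the dataset in the selected format.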

Value

A list containing a dyngen model (li$model) and a dynwrap dataset (li$dataset).

Examples

model <- 
  initialise_model(
    backbone = backbone_bifurcating()
  )
# \dontshow{
# actually use a smaller example 
# to reduce execution time during
# testing of the examples
model <- initialise_model(
  backbone = model$backbone,
  num_cells = 5,
  num_targets = 0,
  num_hks = 0,
  gold_standard_params = gold_standard_default(census_interval = 1, tau = 0.1),
  simulation_params = simulation_default(
    burn_time = 10,
    total_time = 10,
    census_interval = 1,
    ssa_algorithm = ssa_etl(tau = 0.1),
    experiment_params = simulation_type_wild_type(num_simulations = 1)
  )
)
# }
# \donttest{
# generate dataset and output as a list format
# please note other output formats exist: "dyno", "sce", "seurat", "anndata"
out <- generate_dataset(model, format = "list")
#> Generating TF network
#> Sampling feature network from real network
#> Generating kinetics for 36 features
#> Generating formulae
#> Generating gold standard mod changes
#> Precompiling reactions for gold standard
#> Running gold simulations
#>   |==================================================| 100% elapsed=00s, remaining~00s
#> Precompiling reactions for simulations
#> Running 1 simulations
#> Mapping simulations to gold standard
#> Warning: Simulation does not contain all gold standard edges. This simulation likely suffers from bad kinetics; choose a different seed and rerun.
#> Performing dimred
#> Simulating experiment
#> Warning: Certain backbone segments are not covered by any of the simulations. If this is intentional, please ignore this warning.
#>   Otherwise, increase the number of simulations (see `?simulation_default`) or decreasing the census interval (see `?simulation_default`).
#> Wrapping dataset as list

model <- out$model
dataset <- out$dataset
# }