Automated Annotation Workflow¶
This workflow uses the auto_annot tools from besca to annotate a new scRNA-seq dataset based on one or more pre-annotated datasets. Ideally, these datasets come from a similar tissue and condition. If multiple training datasets are used, performance depends on all datasets being annotated at the same resolution and with broadly similar cell types.
We use supervised machine learning methods, such as support vector machines (SVMs) or logistic regression, to annotate each individual cell.
First, the training dataset(s) and the testing dataset are loaded from h5ad files or made available as AnnData objects. Next, the training and testing datasets are corrected using scanorama, and the training datasets are merged into one AnnData object. Then, the classifier is trained on the merged training data. Finally, the classifier is applied to the testing dataset to predict the cell types. If the testing dataset is already annotated (to test the algorithm), a report including confusion matrices can be generated.
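In condensed form, the steps below look roughly like the following sketch; it mirrors the calls used later in this tutorial, and all parameters are explained in the sections that follow.
import besca as bc
# load a pre-annotated training dataset and the dataset to be annotated
adata_train_list = [bc.datasets.Granja2019_processed()]
adata_test = bc.datasets.pbmc3k_processed()
adata_test_orig = adata_test.copy()
# batch-correct (scanorama) and merge the training data, aligned to the test data
adata_train, adata_test_corrected = bc.tl.auto_annot.merge_data(
    adata_train_list, adata_test, genes_to_use="all", merge="scanorama"
)
# fit a classifier on the merged training data and predict cell types
classifier, scaler = bc.tl.auto_annot.fit(adata_train, "logistic_regression", "dblabel")
adata_predicted = bc.tl.auto_annot.adata_predict(
    classifier=classifier, scaler=scaler,
    adata_pred=adata_test_corrected, adata_orig=adata_test_orig, threshold=0.1,
)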
[1]:
import besca as bc
Parameter specification¶
Give your analysis a name.
[2]:
analysis_name = (
"auto_annot_tutorial" # The analysis name will be used to name the output files
)
Specify the column name of the cell type annotation you want to train on.
[3]:
celltype = "dblabel" # This needs to be a column in the .obs of the training datasets (and test dataset if you want to generate a report)
Choose a method (a rough scikit-learn correspondence is sketched after this list):
linear: Support Vector Machine with linear kernel
sgd: Support Vector Machine with linear kernel, fitted using Stochastic Gradient Descent
rbf: Support Vector Machine with radial basis function kernel. Very time intensive; use only on small datasets.
logistic_regression: Standard logistic classifier with multinomial loss.
logistic_regression_ovr: Logistic Regression with one-versus-rest classification.
logistic_regression_elastic: Logistic Regression with elastic-net penalty; cross-validates over multiple L1 ratios.
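For orientation, these method strings roughly correspond to the following scikit-learn estimators. This is only a sketch of likely equivalents; besca's internal hyperparameters may differ.
from sklearn.svm import SVC, LinearSVC
from sklearn.linear_model import SGDClassifier, LogisticRegression, LogisticRegressionCV
# rough scikit-learn counterparts of the method strings above (illustrative only)
sklearn_equivalents = {
    "linear": LinearSVC(),                # linear-kernel SVM
    "sgd": SGDClassifier(loss="hinge"),   # linear SVM fitted with stochastic gradient descent
    "rbf": SVC(kernel="rbf"),             # RBF-kernel SVM (slow on large datasets)
    "logistic_regression": LogisticRegression(multi_class="multinomial"),
    "logistic_regression_ovr": LogisticRegression(multi_class="ovr"),
    "logistic_regression_elastic": LogisticRegressionCV(
        penalty="elasticnet", solver="saga", l1_ratios=[0.2, 0.5, 0.8]
    ),
}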
[4]:
method = "logistic_regression"
Specify the merge method. It needs to be either scanorama or naive.
[5]:
merge = "scanorama" # We recommend to use scanorama here
Decide whether you want to use the raw data or only the highly variable genes. Using raw increases computation time and does not necessarily improve predictions.
[6]:
use_raw = False # We recommend using False here
You can choose to consider only a subset of genes from a signature set, or to use all genes.
[7]:
genes_to_use = "all" # We suggest using all genes here, but runtime improves considerably if you select an appropriate gene set (see the gene set example at the end of this tutorial)
Data loading¶
Read in all training sets and the testing set.¶
We will use a publicly available PBMC dataset of ~3000 cells as the testing dataset. For training we use two other PBMC datasets that are delivered with besca. The training datasets are from:
Granja JM, Klemm S, McGinnis LM, et al. Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia. Nat Biotechnol. 2019;37(12):1458-1465. doi:10.1038/s41587-019-0332-7
and
Kotliarov Y, Sparks R, Martins AJ, et al. Broad immune activation underlies shared set point signatures for vaccine responsiveness in healthy individuals and disease activity in patients with lupus. Nat Med. 2020;26(4):618-629. doi:10.1038/s41591-020-0769-8
A detailed analysis of the annotation performed below can be found in Figure 3 of the besca publication.
[8]:
adata_test = (
bc.datasets.pbmc3k_processed()
) # Dataset to be annotated (it's already annotated in this case for testing)
adata_test_orig = (
adata_test.copy()
) # Make a copy of the testing data, which will be annotated in .obs, but not further modified
adata_train_list = [
bc.datasets.Granja2019_processed(),
bc.datasets.Kotliarov2020_processed(),
] # List of annotated training datasets
You can also load your own datasets from the file system. Specify the folders where the .h5ad files are found and their file names; these datasets must already be annotated and will be used for training. If you use only one training dataset, please still pass a list of one. These load functions are particularly useful when the datasets have been processed with the besca standard pipeline. Alternatively, datasets can be loaded in any other fashion, if more suitable (see the scanpy-based sketch after the commented cells below).
[9]:
# test_dataset_path = '/path/to/test/dataset/folder'
# test_dataset = 'testdataset.h5ad'
#
# train_dataset_paths = ['/path/to/train/dataset/folder1', '/path/to/train/dataset/folder2']
# train_datasets = ['traindataset1.h5ad', 'traindataset2.h5ad']
[10]:
# adata_train_list, adata_test, adata_test_orig = bc.tl.auto_annot.read_data(train_paths = train_dataset_paths, train_datasets= train_datasets, test_path= test_dataset_path, test_dataset= test_dataset, use_raw = use_raw)
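If your files were not produced by the besca standard pipeline, plain scanpy works just as well; the sketch below uses the same placeholder paths as the commented cells above.
import scanpy as sc
# load the test dataset and the annotated training datasets from h5ad files
adata_test = sc.read_h5ad("/path/to/test/dataset/folder/testdataset.h5ad")
adata_test_orig = adata_test.copy()
adata_train_list = [
    sc.read_h5ad("/path/to/train/dataset/folder1/traindataset1.h5ad"),
    sc.read_h5ad("/path/to/train/dataset/folder2/traindataset2.h5ad"),
]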
One training dataset’s label column is not named “dblabel”. We need to create a column so that all datasets contain a column that is named according to the previously specified variable celltype and that contains the labels.
[11]:
adata_train_list[1].obs["dblabel"] = adata_train_list[1].obs["celltype3"]
In this case the testing dataset is already annotated to demonstrate the methodology. All datasets adhere to the same naming convention.
[12]:
adata_test.obs.dblabel.unique()
[12]:
['naive thymus-derived CD8-positive, alpha-beta..., 'naive B cell', 'central memory CD4-positive, alpha-beta T cell', 'classical monocyte', 'IL7R-max CD8-positive, alpha-beta cytotoxic T..., 'non-classical monocyte', 'naive thymus-derived CD4-positive, alpha-beta..., 'CD8-positive, alpha-beta cytotoxic T cell', 'cytotoxic CD56-dim natural killer cell', 'CD1c-positive myeloid dendritic cell']
Categories (10, object): ['CD1c-positive myeloid dendritic cell', 'CD8-positive, alpha-beta cytotoxic T cell', 'IL7R-max CD8-positive, alpha-beta cytotoxic T..., 'central memory CD4-positive, alpha-beta T cell', ..., 'naive B cell', 'naive thymus-derived CD4-positive, alpha-beta..., 'naive thymus-derived CD8-positive, alpha-beta..., 'non-classical monocyte']
[13]:
adata_train_list[0].obs.dblabel.unique()
[13]:
['naive thymus-derived CD4-positive, alpha-beta..., 'classical monocyte', 'naive B cell', 'lymphocyte of B lineage', 'naive thymus-derived CD8-positive, alpha-beta..., ..., 'IL7R-max CD8-positive, alpha-beta cytotoxic T..., 'hematopoietic multipotent progenitor cell', 'myeloid leukocyte', 'basophil', 'plasma cell']
Length: 25
Categories (25, object): ['naive thymus-derived CD4-positive, alpha-beta..., 'naive thymus-derived CD8-positive, alpha-beta..., 'naive B cell', 'IL7R-max CD8-positive, alpha-beta cytotoxic T..., ..., 'common lymphoid progenitor', 'basophil', 'plasma cell', 'erythrocyte']
[14]:
adata_train_list[1].obs.dblabel.unique()
[14]:
['cytotoxic CD56-dim natural killer cell', 'naive thymus-derived CD8-positive, alpha-beta..., 'naive thymus-derived CD4-positive, alpha-beta..., 'classical monocyte', 'CD8-positive, alpha-beta cytotoxic T cell', ..., 'regulatory T cell', 'CD1c-positive myeloid dendritic cell', 'plasmacytoid dendritic cell', 'erythrocyte', 'plasma cell']
Length: 14
Categories (14, object): ['cytotoxic CD56-dim natural killer cell', 'naive B cell', 'IL7R-max CD8-positive, alpha-beta cytotoxic T..., 'memory B cell', ..., 'plasmacytoid dendritic cell', 'erythrocyte', 'plasma cell', 'naive thymus-derived CD8-positive, alpha-beta...]
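Since the classifier can only predict labels it has seen during training, it can be worth checking how the test labels overlap with the training labels before fitting. A minimal sanity check could look like this:
# compare the cell type labels of the test data with those of all training datasets
test_labels = set(adata_test.obs["dblabel"])
train_labels = set.union(*(set(ad.obs["dblabel"]) for ad in adata_train_list))
print("test labels missing from the training data:", sorted(test_labels - train_labels))
print("labels only present in the training data:", sorted(train_labels - test_labels))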
Correct datasets (e.g. using scanorama) and merge training datasets¶
This function merges the training datasets, removes unwanted genes and, if scanorama is used, corrects for batch effects between the datasets.
[15]:
adata_train, adata_test_corrected = bc.tl.auto_annot.merge_data(
adata_train_list, adata_test, genes_to_use=genes_to_use, merge=merge
)
merging with scanorama
using scanorama rn
Found 207 genes among all datasets
[[0. 0.69861833 0.47963259]
[0. 0. 0.98881789]
[0. 0. 0. ]]
Processing datasets (1, 2)
Processing datasets (0, 1)
Processing datasets (0, 2)
integrating training set
Train the classifier¶
The returned scaler is fitted on the training dataset (centering to zero mean and scaling to unit variance). The same scaling is then applied to the counts in the testing dataset before the classifier is applied to the scaled testing data (see the next step, adata_predict()). This function runs multiple jobs in parallel if logistic regression was specified as the method.
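Conceptually, the fit/scale/predict mechanics resemble the following plain scikit-learn sketch with toy data; bc.tl.auto_annot.fit() handles this for you, and its exact solver settings may differ.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
# toy stand-ins for the (cells x genes) matrices and training labels
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 50)), rng.integers(0, 3, 200)
X_test = rng.normal(size=(30, 50))
scaler = StandardScaler().fit(X_train)  # fitted on the training counts only
clf = LogisticRegression(n_jobs=-1).fit(scaler.transform(X_train), y_train)
# the same scaler is applied to the test counts before predicting
y_pred = clf.predict(scaler.transform(X_test))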
[16]:
classifier, scaler = bc.tl.auto_annot.fit(adata_train, method, celltype, njobs=14)
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 5200 M = 10
At X0 0 variables are exactly at the bounds
At iterate 0 f= 2.12039D+05 |proj g|= 2.86381D+04
(... verbose L-BFGS-B solver output from the parallel logistic regression fits truncated; each fit ends with either CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH or STOP: TOTAL NO. of ITERATIONS REACHED LIMIT ...)
Prediction¶
Use the fitted model to predict cell types in adata_pred (adata_test_corrected in our case). For logistic regression, the threshold specifies the probability that needs to be reached to assign a cell type; cells that do not reach it are annotated as “unknown”. The threshold should be set to 0 or left out for SVMs.
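The thresholding itself is simple. The following numpy sketch (illustrative, not besca's exact code) shows the idea: a cell keeps its most probable class only if that probability reaches the threshold, otherwise it becomes "unknown".
import numpy as np
# toy probability matrix: 3 cells x 4 classes (rows sum to 1)
probs = np.array([[0.70, 0.10, 0.10, 0.10],
                  [0.30, 0.28, 0.22, 0.20],
                  [0.05, 0.05, 0.05, 0.85]])
classes = np.array(["T cell", "B cell", "NK cell", "monocyte"])
threshold = 0.5
best = probs.argmax(axis=1)
labels = np.where(probs.max(axis=1) >= threshold, classes[best], "unknown")
print(labels)  # ['T cell' 'unknown' 'monocyte']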
[17]:
adata_predicted = bc.tl.auto_annot.adata_predict(
classifier=classifier,
scaler=scaler,
adata_pred=adata_test_corrected,
adata_orig=adata_test_orig,
threshold=0.1,
)
The prediction is added to a new column called ‘auto_annot’.
If, in addition to the most likely class, you would like all class probabilities returned, use the following function. (This is only a sensible choice when using logistic regression.)
[18]:
adata_predicted = bc.tl.auto_annot.adata_pred_prob(
classifier=classifier,
scaler=scaler,
adata_pred=adata_test_corrected,
adata_orig=adata_test_orig,
threshold=0.1,
)
Output¶
The adata object that includes the predicted cell type annotation can be written out as an h5ad file.
[19]:
adata_predicted.write("/tmp/adata_predicted.h5ad")
If the testing dataset already included a cell type annotation, a report can be generated and written, which includes metrics, confusion matrices and comparative UMAP plots. The report creates standardised overview figures; for more detailed control of the figure layout, other besca and scanpy plotting functions should be used (see the scikit-learn sketch below for computing the metrics directly).
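If you prefer to compute the metrics yourself or to style the figures differently, the underlying comparison can be reproduced with a few lines of scikit-learn. The sketch below assumes, as in this tutorial, that the true labels are in .obs['dblabel'] and the predictions in .obs['auto_annot'].
import pandas as pd
from sklearn.metrics import classification_report, confusion_matrix
y_true = adata_predicted.obs["dblabel"]
y_pred = adata_predicted.obs["auto_annot"]
print(classification_report(y_true, y_pred, zero_division=0))
# confusion matrix as a labelled DataFrame, ready for custom plotting
labels = sorted(set(y_true) | set(y_pred))
cm = pd.DataFrame(confusion_matrix(y_true, y_pred, labels=labels), index=labels, columns=labels)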
[20]:
%matplotlib inline
bc.tl.auto_annot.report(adata_predicted, celltype, method, analysis_name, False, merge, use_raw, genes_to_use, clustering = 'leiden')
besca.tl.auto_annot.report(...) is deprecated( besca > 2.3); please use besca.tl.report(...)
WARNING: saving figure to file figures/umap.ondata_auto_annot_tutorial.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_dblabel.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_auto_annot.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_leiden.png
(Interactive Plotly figure from the report; not rendered in this static export.)
If we change the threshold, we can vary the number of cells labelled as unknown, depending on the confidence required from the predictions. Let’s try 0.7.
[21]:
adata_predicted = bc.tl.auto_annot.adata_predict(
classifier=classifier,
scaler=scaler,
adata_pred=adata_test_corrected,
adata_orig=adata_test_orig,
threshold=0.7,
)
We observe that many more cells, for which the assignment is ambiguous, are now labelled as unknown.
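This can be quantified directly from the prediction column, assuming the unassigned cells carry the literal label "unknown" as described above:
# fraction of cells labelled as unknown at the current threshold
counts = adata_predicted.obs["auto_annot"].value_counts()
print(counts.get("unknown", 0) / len(adata_predicted), "of cells are labelled 'unknown'")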
[22]:
%matplotlib inline
bc.tl.auto_annot.report(adata_predicted, celltype, method, analysis_name, False, merge, use_raw, genes_to_use, clustering = 'leiden')
besca.tl.auto_annot.report(...) is deprecated( besca > 2.3); please use besca.tl.report(...)
WARNING: saving figure to file figures/umap.ondata_auto_annot_tutorial.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_dblabel.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_auto_annot.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_leiden.png
(Interactive Plotly figure from the report; not rendered in this static export.)
Let’s try another classifier (SVM)¶
Especially when thresholds are not used, SVMs can be tried as alternative classifiers.
[23]:
analysis_name = "auto_annot_tutorial_svm"
celltype = "dblabel" # column name of celltype of interest
method = "linear" # rbf or linear or sgd, rbf extremely slow cannot recommend, logistic_regression recommended, as you can get probabilities, random_forest is a fast but not very powerful option (in current implementation)
merge = "scanorama"
use_raw = False
genes_to_use = "all"
[24]:
adata_test = adata_test_orig.copy()
[25]:
adata_train, adata_test_corrected = bc.tl.auto_annot.merge_data(
adata_train_list, adata_test, genes_to_use=genes_to_use, merge=merge
)
merging with scanorama
using scanorama rn
Found 207 genes among all datasets
[[0. 0.7 0.48]
[0. 0. 0.99]
[0. 0. 0. ]]
Processing datasets (1, 2)
Processing datasets (0, 1)
Processing datasets (0, 2)
integrating training set
[26]:
classifier, scaler = bc.tl.auto_annot.fit(
adata_train=adata_train, method=method, celltype=celltype, njobs=14
)
[27]:
adata_predicted = bc.tl.auto_annot.adata_predict(
classifier=classifier,
scaler=scaler,
adata_pred=adata_test_corrected,
adata_orig=adata_test_orig,
)
[28]:
adata_predicted.write("/tmp/adata_predicted_svm.h5ad")
[29]:
%matplotlib inline
bc.tl.auto_annot.report(adata_predicted, celltype, method, analysis_name, False,
merge, use_raw, genes_to_use, clustering = 'leiden')
besca.tl.auto_annot.report(...) is deprecated( besca > 2.3); please use besca.tl.report(...)
WARNING: saving figure to file figures/umap.ondata_auto_annot_tutorial_svm.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_svm_dblabel.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_svm_auto_annot.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_svm_leiden.png
(Interactive Plotly figure from the report; not rendered in this static export.)
Let’s try with a specified gene set¶
Gene sets from GeMS or user-specified gene sets can, if carefully chosen, significantly decrease computation time without degrading performance.
[30]:
analysis_name = "auto_annot_tutorial_geneset"
celltype = "dblabel" # column name of celltype of interest
method = "linear" # rbf or linear or sgd, rbf extremely slow cannot recommend, logistic_regression recommended, as you can get probabilities, random_forest is a fast but not very powerful option (in current implementation)
merge = "scanorama"
use_raw = False
[31]:
annotSigns = bc.datasets.load_immune_signatures()
# flatten the signature dictionary into a unique list of genes
genes_to_use = list({gene for genes in annotSigns.values() for gene in genes})
display(genes_to_use[:10])
['MLANA',
'LEF1',
'MMRN1',
'GZMA',
'MCM2',
'FGFBP2',
'KRT18',
'PCNA',
'TRDC',
'FABP4']
[32]:
adata_test = adata_test_orig.copy()
[33]:
adata_train, adata_test_corrected = bc.tl.auto_annot.merge_data(
adata_train_list, adata_test, genes_to_use=genes_to_use, merge=merge
)
merging with scanorama
using scanorama rn
Found 77 genes among all datasets
[[0. 0.69 0.52]
[0. 0. 0.98]
[0. 0. 0. ]]
Processing datasets (1, 2)
Processing datasets (0, 1)
Processing datasets (0, 2)
integrating training set
[34]:
classifier, scaler = bc.tl.auto_annot.fit(adata_train, method, celltype)
[35]:
adata_predicted = bc.tl.auto_annot.adata_predict(
classifier=classifier,
scaler=scaler,
adata_pred=adata_test_corrected,
adata_orig=adata_test_orig,
)
[36]:
adata_predicted.write("/tmp/adata_predicted_geneset.h5ad")
[37]:
%matplotlib inline
bc.tl.auto_annot.report(adata_predicted, celltype, method, analysis_name,
False, merge, use_raw, genes_to_use, clustering = 'leiden')
besca.tl.auto_annot.report(...) is deprecated( besca > 2.3); please use besca.tl.report(...)
WARNING: saving figure to file figures/umap.ondata_auto_annot_tutorial_geneset.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_geneset_dblabel.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_geneset_auto_annot.png
WARNING: saving figure to file figures/umap.auto_annot_tutorial_geneset_leiden.png
(Interactive Plotly figure from the report; not rendered in this static export.)