Logistic Regression · Random Forest · Gradient Boosting · MLP · Residual MLP · Attention MLP
LSTM · Stacked LSTM · Bidirectional LSTM · LSTM+Attention · CNN-LSTM
Trained and evaluated on a Chinese AP cohort (n=722) with 5-fold stratified cross-validation
All models evaluated with 5-fold Stratified Cross-Validation | Ground truth: SAP label (Atlanta 2012) | n=722 (585 severe / 137 mild)
| Model | Type | AUC | F1 | Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Logistic Regression | ML | 0.817 | 0.907 | 0.575 | 549 | 77 | 36 | 60 | 93.8% | 43.8% | 87.7% |
| Random Forest ★ | ML | 0.877 | 0.917 | 0.535 | 566 | 84 | 19 | 53 | 96.8% | 38.7% | 87.1% |
| Gradient Boosting | ML | 0.874 | 0.918 | 0.350 | 568 | 85 | 17 | 52 | 97.1% | 38.0% | 87.0% |
| MLP (3-layer) | DL | 0.836 | 0.909 | 0.282 | 567 | 103 | 18 | 34 | 96.9% | 24.8% | 84.6% |
| Residual MLP | DL | 0.804 | 0.912 | 0.203 | 572 | 98 | 13 | 39 | 97.8% | 28.5% | 85.4% |
| Attention MLP | DL | 0.784 | 0.909 | 0.418 | 575 | 105 | 10 | 32 | 98.3% | 23.4% | 84.6% |
| LSTM | LSTM | 0.699 | 0.895 | 0.300 | 585 | 137 | 0 | 0 | 100.0% | 0.0% | 81.0% |
| Stacked LSTM | LSTM | 0.705 | 0.898 | 0.489 | 575 | 121 | 10 | 16 | 98.3% | 11.7% | 82.6% |
| Bidirectional LSTM | LSTM | 0.715 | 0.898 | 0.318 | 582 | 129 | 3 | 8 | 99.5% | 5.8% | 81.9% |
| LSTM + Attention | LSTM | 0.684 | 0.896 | 0.172 | 585 | 136 | 0 | 1 | 100.0% | 0.7% | 81.1% |
| CNN-LSTM ★ | LSTM | 0.777 | 0.897 | 0.082 | 585 | 135 | 0 | 2 | 100.0% | 1.5% | 81.2% |
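The rate columns in the table follow directly from the four confusion-matrix counts. The sketch below is a minimal illustration (the function name `confusion_metrics` is hypothetical, not taken from the project code); it reproduces the Logistic Regression row from its TP/FP/FN/TN values.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Derive sensitivity, specificity, PPV and F1 from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),       # recall on severe cases
        "specificity": tn / (tn + fp),       # recall on mild cases
        "ppv": tp / (tp + fp),               # precision of a "severe" call
        "f1": 2 * tp / (2 * tp + fp + fn),   # harmonic mean of PPV and sensitivity
    }


# Counts from the Logistic Regression row of the summary table.
lr = confusion_metrics(tp=549, fp=77, fn=36, tn=60)
print({name: round(value, 3) for name, value in lr.items()})
# {'sensitivity': 0.938, 'specificity': 0.438, 'ppv': 0.877, 'f1': 0.907}
```

With 585 of 722 patients in the positive class, a model that labels everyone severe already reaches F1 = 0.895 (the LSTM row), so F1 should be read alongside specificity.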
Each tab shows TP/FP/FN/TN across thresholds 0.10–0.80 | Ground truth: 1 = severe AP (SAP), 0 = mild AP

Logistic Regression

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 582 | 126 | 3 | 11 | 99.5% | 8.0% | 82.2% | 0.900 |
| 0.20 | 574 | 118 | 11 | 19 | 98.1% | 13.9% | 82.9% | 0.899 |
| 0.30 | 572 | 107 | 13 | 30 | 97.8% | 21.9% | 84.2% | 0.905 |
| 0.40 | 563 | 101 | 22 | 36 | 96.2% | 26.3% | 84.8% | 0.902 |
| 0.50 | 556 | 90 | 29 | 47 | 95.0% | 34.3% | 86.1% | 0.903 |
| 0.575 ★ | 549 | 77 | 36 | 60 | 93.8% | 43.8% | 87.7% | 0.907 |
| 0.60 | 542 | 76 | 43 | 61 | 92.6% | 44.5% | 87.7% | 0.901 |
| 0.70 | 516 | 64 | 69 | 73 | 88.2% | 53.3% | 89.0% | 0.886 |
| 0.80 | 467 | 44 | 118 | 93 | 79.8% | 67.9% | 91.4% | 0.852 |

Random Forest

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 585 | 137 | 0 | 0 | 100.0% | 0.0% | 81.0% | 0.895 |
| 0.20 | 585 | 137 | 0 | 0 | 100.0% | 0.0% | 81.0% | 0.895 |
| 0.30 | 584 | 124 | 1 | 13 | 99.8% | 9.5% | 82.5% | 0.903 |
| 0.40 | 582 | 110 | 3 | 27 | 99.5% | 19.7% | 84.1% | 0.912 |
| 0.50 | 571 | 94 | 14 | 43 | 97.6% | 31.4% | 85.9% | 0.914 |
| 0.535 ★ | 566 | 84 | 19 | 53 | 96.8% | 38.7% | 87.1% | 0.917 |
| 0.60 | 547 | 67 | 38 | 70 | 93.5% | 51.1% | 89.1% | 0.912 |
| 0.70 | 520 | 44 | 65 | 93 | 88.9% | 67.9% | 92.2% | 0.905 |
| 0.80 | 455 | 22 | 130 | 115 | 77.8% | 83.9% | 95.4% | 0.857 |

Gradient Boosting

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 583 | 115 | 2 | 22 | 99.7% | 16.1% | 83.5% | 0.909 |
| 0.20 | 576 | 106 | 9 | 31 | 98.5% | 22.6% | 84.5% | 0.909 |
| 0.30 | 570 | 91 | 15 | 46 | 97.4% | 33.6% | 86.2% | 0.915 |
| 0.350 ★ | 568 | 85 | 17 | 52 | 97.1% | 38.0% | 87.0% | 0.918 |
| 0.40 | 561 | 81 | 24 | 56 | 95.9% | 40.9% | 87.4% | 0.914 |
| 0.50 | 550 | 71 | 35 | 66 | 94.0% | 48.2% | 88.6% | 0.912 |
| 0.60 | 537 | 63 | 48 | 74 | 91.8% | 54.0% | 89.5% | 0.906 |
| 0.70 | 523 | 51 | 62 | 86 | 89.4% | 62.8% | 91.1% | 0.903 |
| 0.80 | 492 | 39 | 93 | 98 | 84.1% | 71.5% | 92.7% | 0.882 |

MLP (3-layer)

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 584 | 125 | 1 | 12 | 99.8% | 8.8% | 82.4% | 0.903 |
| 0.20 | 578 | 116 | 7 | 21 | 98.8% | 15.3% | 83.3% | 0.904 |
| 0.281 ★ | 576 | 106 | 9 | 31 | 98.5% | 22.6% | 84.5% | 0.909 |
| 0.30 | 570 | 105 | 15 | 32 | 97.4% | 23.4% | 84.4% | 0.905 |
| 0.40 | 557 | 96 | 28 | 41 | 95.2% | 29.9% | 85.3% | 0.900 |
| 0.50 | 537 | 81 | 48 | 56 | 91.8% | 40.9% | 86.9% | 0.893 |
| 0.60 | 510 | 61 | 75 | 76 | 87.2% | 55.5% | 89.3% | 0.882 |
| 0.70 | 471 | 43 | 114 | 94 | 80.5% | 68.6% | 91.6% | 0.857 |
| 0.80 | 423 | 30 | 162 | 107 | 72.3% | 78.1% | 93.4% | 0.815 |

Residual MLP

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 578 | 127 | 7 | 10 | 98.8% | 7.3% | 82.0% | 0.896 |
| 0.20 | 571 | 114 | 14 | 23 | 97.6% | 16.8% | 83.4% | 0.899 |
| 0.276 ★ | 567 | 106 | 18 | 31 | 96.9% | 22.6% | 84.3% | 0.901 |
| 0.30 | 564 | 105 | 21 | 32 | 96.4% | 23.4% | 84.3% | 0.900 |
| 0.40 | 549 | 94 | 36 | 43 | 93.8% | 31.4% | 85.4% | 0.894 |
| 0.50 | 535 | 83 | 50 | 54 | 91.5% | 39.4% | 86.6% | 0.889 |
| 0.60 | 505 | 72 | 80 | 65 | 86.3% | 47.4% | 87.5% | 0.869 |
| 0.70 | 475 | 53 | 110 | 84 | 81.2% | 61.3% | 90.0% | 0.854 |
| 0.80 | 414 | 31 | 171 | 106 | 70.8% | 77.4% | 93.0% | 0.804 |

Attention MLP

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 584 | 131 | 1 | 6 | 99.8% | 4.4% | 81.7% | 0.898 |
| 0.20 | 579 | 125 | 6 | 12 | 99.0% | 8.8% | 82.2% | 0.898 |
| 0.30 | 575 | 123 | 10 | 14 | 98.3% | 10.2% | 82.4% | 0.896 |
| 0.40 | 574 | 109 | 11 | 28 | 98.1% | 20.4% | 84.0% | 0.905 |
| 0.50 | 572 | 101 | 13 | 36 | 97.8% | 26.3% | 85.0% | 0.909 |
| 0.602 ★ | 566 | 85 | 19 | 52 | 96.8% | 38.0% | 86.9% | 0.916 |
| 0.70 | 540 | 66 | 45 | 71 | 92.3% | 51.8% | 89.1% | 0.907 |
| 0.80 | 472 | 49 | 113 | 88 | 80.7% | 64.2% | 90.6% | 0.854 |

LSTM

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 585 | 137 | 0 | 0 | 100.0% | 0.0% | 81.0% | 0.895 |
| 0.20 | 585 | 137 | 0 | 0 | 100.0% | 0.0% | 81.0% | 0.895 |
| 0.300 ★ | 585 | 137 | 0 | 0 | 100.0% | 0.0% | 81.0% | 0.895 |
| 0.40 | 580 | 132 | 5 | 5 | 99.1% | 3.6% | 81.5% | 0.894 |
| 0.50 | 569 | 122 | 16 | 15 | 97.3% | 10.9% | 82.3% | 0.892 |
| 0.60 | 545 | 115 | 40 | 22 | 93.2% | 16.1% | 82.6% | 0.876 |
| 0.70 | 498 | 82 | 87 | 55 | 85.1% | 40.1% | 85.9% | 0.855 |
| 0.80 | 390 | 47 | 195 | 90 | 66.7% | 65.7% | 89.2% | 0.763 |

Stacked LSTM

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 585 | 137 | 0 | 0 | 100.0% | 0.0% | 81.0% | 0.895 |
| 0.20 | 585 | 137 | 0 | 0 | 100.0% | 0.0% | 81.0% | 0.895 |
| 0.30 | 582 | 137 | 3 | 0 | 99.5% | 0.0% | 80.9% | 0.893 |
| 0.40 | 578 | 127 | 7 | 10 | 98.8% | 7.3% | 82.0% | 0.896 |
| 0.489 ★ | 575 | 121 | 10 | 16 | 98.3% | 11.7% | 82.6% | 0.898 |
| 0.50 | 573 | 121 | 12 | 16 | 97.9% | 11.7% | 82.6% | 0.896 |
| 0.60 | 564 | 111 | 21 | 26 | 96.4% | 19.0% | 83.6% | 0.895 |
| 0.70 | 539 | 102 | 46 | 35 | 92.1% | 25.5% | 84.1% | 0.879 |
| 0.80 | 395 | 51 | 190 | 86 | 67.5% | 62.8% | 88.6% | 0.766 |

Bidirectional LSTM

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 582 | 133 | 3 | 4 | 99.5% | 2.9% | 81.4% | 0.895 |
| 0.20 | 582 | 131 | 3 | 6 | 99.5% | 4.4% | 81.6% | 0.897 |
| 0.30 | 582 | 129 | 3 | 8 | 99.5% | 5.8% | 81.9% | 0.898 |
| 0.318 ★ | 582 | 129 | 3 | 8 | 99.5% | 5.8% | 81.9% | 0.898 |
| 0.40 | 580 | 127 | 5 | 10 | 99.1% | 7.3% | 82.0% | 0.898 |
| 0.50 | 576 | 125 | 9 | 12 | 98.5% | 8.8% | 82.2% | 0.896 |
| 0.60 | 556 | 106 | 29 | 31 | 95.0% | 22.6% | 84.0% | 0.892 |
| 0.70 | 324 | 43 | 261 | 94 | 55.4% | 68.6% | 88.3% | 0.681 |
| 0.80 | 211 | 22 | 374 | 115 | 36.1% | 83.9% | 90.6% | 0.516 |

LSTM + Attention

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 585 | 137 | 0 | 0 | 100.0% | 0.0% | 81.0% | 0.895 |
| 0.172 ★ | 585 | 136 | 0 | 1 | 100.0% | 0.7% | 81.1% | 0.896 |
| 0.20 | 584 | 136 | 1 | 1 | 99.8% | 0.7% | 81.1% | 0.895 |
| 0.30 | 582 | 136 | 3 | 1 | 99.5% | 0.7% | 81.1% | 0.893 |
| 0.40 | 581 | 135 | 4 | 2 | 99.3% | 1.5% | 81.1% | 0.893 |
| 0.50 | 576 | 130 | 9 | 7 | 98.5% | 5.1% | 81.6% | 0.892 |
| 0.60 | 536 | 111 | 49 | 26 | 91.6% | 19.0% | 82.8% | 0.870 |
| 0.70 | 479 | 89 | 106 | 48 | 81.9% | 35.0% | 84.3% | 0.831 |
| 0.80 | 409 | 56 | 176 | 81 | 69.9% | 59.1% | 88.0% | 0.779 |

CNN-LSTM

| Threshold | TP | FP | FN | TN | Sensitivity | Specificity | PPV | F1 |
|---|---|---|---|---|---|---|---|---|
| 0.082 ★ | 585 | 135 | 0 | 2 | 100.0% | 1.5% | 81.2% | 0.897 |
| 0.10 | 582 | 134 | 3 | 3 | 99.5% | 2.2% | 81.3% | 0.895 |
| 0.20 | 573 | 125 | 12 | 12 | 97.9% | 8.8% | 82.1% | 0.893 |
| 0.30 | 563 | 120 | 22 | 17 | 96.2% | 12.4% | 82.4% | 0.888 |
| 0.40 | 554 | 113 | 31 | 24 | 94.7% | 17.5% | 83.1% | 0.885 |
| 0.50 | 542 | 100 | 43 | 37 | 92.6% | 27.0% | 84.4% | 0.883 |
| 0.60 | 527 | 78 | 58 | 59 | 90.1% | 43.1% | 87.1% | 0.886 |
| 0.70 | 498 | 61 | 87 | 76 | 85.1% | 55.5% | 89.1% | 0.871 |
| 0.80 | 446 | 50 | 139 | 87 | 76.2% | 63.5% | 89.9% | 0.825 |
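
Each per-threshold table can be regenerated from a model's predicted probabilities by counting outcomes at every cut-off and keeping the row with the highest F1. The sketch below is a minimal illustration on toy values, not the cohort data. It assumes a patient is called severe when the probability is greater than or equal to the threshold, which the dashboard does not state; the function names are hypothetical.

```python
def sweep_thresholds(y_true, y_prob, thresholds):
    """Count TP/FP/FN/TN and compute F1 at each threshold."""
    rows = []
    for t in thresholds:
        tp = fp = fn = tn = 0
        for label, prob in zip(y_true, y_prob):
            predicted_severe = prob >= t      # assumption: ">=" rather than ">"
            if predicted_severe and label == 1:
                tp += 1
            elif predicted_severe and label == 0:
                fp += 1
            elif label == 1:
                fn += 1
            else:
                tn += 1
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        rows.append({"threshold": t, "tp": tp, "fp": fp, "fn": fn, "tn": tn, "f1": f1})
    return rows


def best_by_f1(rows):
    """Return the row with the highest F1 (first one wins on ties)."""
    return max(rows, key=lambda row: row["f1"])


# Toy labels and probabilities, for illustration only.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.95, 0.90, 0.80, 0.70, 0.55, 0.35, 0.60, 0.40, 0.30, 0.10]
rows = sweep_thresholds(y_true, y_prob, [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
best = best_by_f1(rows)
print(best["threshold"], round(best["f1"], 3))  # 0.5 0.833
```

A fixed grid can miss the true optimum between grid points, which is why starred thresholds such as 0.575 or 0.082 fall off the 0.10 grid; sweeping over the distinct predicted probabilities instead would find them.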
5-fold cross-validation out-of-fold predictions | X axis: False Positive Rate · Y axis: True Positive Rate
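
Out-of-fold means each patient's score comes from the fold in which that patient was held out, so the pooled scores give one ROC curve per model. The following sketch shows how the curve's points and its area can be computed from such scores; it uses toy values and hypothetical function names, and is not the dashboard's own plotting code.

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr) lists with one point per distinct score."""
    pairs = sorted(zip(y_score, y_true), reverse=True)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


# Toy out-of-fold scores, for illustration only.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.95, 0.90, 0.80, 0.70, 0.55, 0.35, 0.60, 0.40, 0.30, 0.10]
fpr, tpr = roc_points(y_true, y_score)
print(round(auc_trapezoid(fpr, tpr), 3))  # 0.875
```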
Top-10 features per classical ML model | LR: |coefficient| · RF/GB: feature_importance score
Deep learning models take all 106 features as input; no global single-feature ranking is reported for them.
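
The two ranking rules differ only in whether the sign is discarded: logistic-regression coefficients are ranked by absolute value, while tree importances are already non-negative. A minimal sketch, with a hypothetical helper name and made-up scores on generic feature names:

```python
def top_k_features(names, scores, k=10, use_abs=False):
    """Rank features by score (or by |score| when use_abs is True)."""
    def magnitude(score):
        return abs(score) if use_abs else score

    ranked = sorted(zip(names, scores), key=lambda pair: magnitude(pair[1]), reverse=True)
    return ranked[:k]


# Made-up values on generic feature names, for illustration only.
names = ["f0", "f1", "f2", "f3"]
lr_coefficients = [0.2, -0.9, 0.5, -0.1]     # signed: rank by |coefficient|
rf_importances = [0.10, 0.40, 0.35, 0.15]    # non-negative: rank as-is

print(top_k_features(names, lr_coefficients, k=2, use_abs=True))
print(top_k_features(names, rf_importances, k=2))
```

Ranking coefficients by magnitude is only meaningful when the inputs share a scale, which is why the logistic regression is fitted on standardised features.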
Logistic Regression: linear model with L2 regularisation (C=0.5); features are scaled with StandardScaler. The optimal threshold (0.575) was selected by maximum F1.
Key features: Lymphocytes, Calcium, Creatinine, Lactate, LDH. Fully interpretable: because the inputs are standardised, each coefficient gives the change in the log-odds of SAP per one-standard-deviation change in that feature.
Random Forest: ensemble of 200 decision trees (max_depth=6, min_samples_leaf=5) that averages predicted probabilities across trees. Highest overall AUC (0.877).
Robust to missing values; no normalisation required. Key features: Calcium, D-dimer, LDH, Lactate, Hematocrit.
Gradient Boosting: 150 shallow boosted trees (max_depth=3, learning_rate=0.05). Highest F1 score (0.918). At its low optimal threshold (0.350) the model reaches 97.1% sensitivity with 38.0% specificity.
Key features: D-dimer, Calcium, LDH, Lactate, Hematocrit. These are the same five features as Random Forest in a slightly different order, which suggests the ranking is stable across tree ensembles.
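
The three classical configurations above map onto scikit-learn estimators as sketched below. This is an illustration under assumptions, not the project's training script: the use of scikit-learn is inferred from the mention of StandardScaler, the data is a synthetic stand-in generated with `make_classification`, and `max_iter`, `shuffle` and the random seeds are choices made here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the cohort (assumption: real data not shown).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=6, weights=[0.2, 0.8], random_state=0
)

models = {
    # L2 is LogisticRegression's default penalty; scaling happens inside each fold.
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(C=0.5, max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(
        n_estimators=200, max_depth=6, min_samples_leaf=5, random_state=0
    ),
    "Gradient Boosting": GradientBoostingClassifier(
        n_estimators=150, max_depth=3, learning_rate=0.05, random_state=0
    ),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
oof_auc = {}
for name, model in models.items():
    # Out-of-fold probability of the positive class for every sample.
    oof = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    oof_auc[name] = roc_auc_score(y, oof)
    print(f"{name}: out-of-fold AUC = {oof_auc[name]:.3f}")
```

Wrapping the scaler and the classifier in one pipeline keeps the scaling statistics from leaking across folds, since the scaler is refitted on each training split.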
MLP (3-layer): architecture 256 → 128 → 64 → 1 (sigmoid output). Each hidden layer is followed by BatchNormalization and Dropout (0.35/0.30/0.20), with L2 regularisation on the dense layers.
Early stopping on val_AUC (patience=8) ended training after ~11 epochs per fold (range 8–16). AUC 0.836.
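
Early stopping with a patience of 8 halts training once the validation AUC has gone 8 consecutive epochs without beating its best value. The sketch below illustrates that rule on a made-up curve with a shorter patience; the function name is hypothetical, and it assumes ties do not count as improvement, a detail the dashboard does not specify.

```python
def early_stopping_epoch(val_auc_per_epoch, patience=8):
    """Return (stopped_epoch, best_epoch), both 1-indexed."""
    best_auc = float("-inf")
    best_epoch = 0
    for epoch, auc in enumerate(val_auc_per_epoch, start=1):
        if auc > best_auc:                      # strict improvement only
            best_auc, best_epoch = auc, epoch
        elif epoch - best_epoch >= patience:    # patience exhausted
            return epoch, best_epoch
    return len(val_auc_per_epoch), best_epoch   # ran out of epochs first


# Made-up validation curve, for illustration only.
curve = [0.70, 0.75, 0.80, 0.79, 0.78, 0.80, 0.77]
print(early_stopping_epoch(curve, patience=3))  # (6, 3)
```

The stopping epoch always trails the best epoch by the patience value, so a reported epoch count means different things depending on which of the two it refers to.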
Residual MLP: projection layer (128 units) followed by two residual blocks, each Dense→BN→Dropout→Dense with a skip connection. A lower learning rate (8e-4) was chosen to stabilise skip-connection learning.
Early stopping ended training after ~10 epochs per fold (range 4–17). The wider spread across folds may reflect the sensitivity of the residual blocks to initialisation. AUC 0.804.
Attention MLP: a feature-wise sigmoid attention gate (Dense 106→106) scales each input before the downstream MLP (256→128→64→1). The gate weights can be inspected per sample.
Fastest convergence: early stopping ended training after ~7 epochs per fold (range 2–14), which may indicate that the gate learns to select features quickly. AUC 0.784.
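
The gate can be read as a per-sample soft mask over the 106 inputs. The NumPy sketch below shows the forward pass only, assuming the gate is `sigmoid(xW + b)` multiplied elementwise into the input; the weights are random placeholders rather than trained values, and the function name is hypothetical.

```python
import numpy as np


def feature_gate(x, W, b):
    """Scale each input feature by a sigmoid gate computed from the whole input.

    x: (n_samples, n_features), W: (n_features, n_features), b: (n_features,)
    Returns the gated input and the gate itself, both (n_samples, n_features).
    """
    gate = 1.0 / (1.0 + np.exp(-(x @ W + b)))   # every entry lies in (0, 1)
    return x * gate, gate


rng = np.random.default_rng(0)
n_features = 106
x = rng.normal(size=(4, n_features))                       # 4 standardised samples
W = rng.normal(scale=0.1, size=(n_features, n_features))   # random, not trained
b = np.zeros(n_features)

gated, gate = feature_gate(x, W, b)
print(gated.shape, gate.shape)  # (4, 106) (4, 106)
```

Because the gate depends on the input, it yields one weight vector per patient; turning it into a cohort-level ranking would need an extra aggregation step such as averaging over samples.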
LSTM: vanilla single-layer LSTM (64 units) followed by Dropout(0.3) and a Dense(32, relu) head. Features are reshaped to (106, 1), so each lab value is treated as one time step.
Early stopping on val_AUC ended training after ~19 epochs per fold (range 10–33). AUC 0.699.
Stacked LSTM: two-layer stack, LSTM(64, return_sequences=True) → Dropout(0.25) → LSTM(32) → Dropout(0.25) → Dense(32). A lower learning rate (5e-4) was chosen to stabilise gradient flow through the deeper network.
Training ended after ~17 epochs per fold (range 4–25). AUC 0.705.
Bidirectional LSTM: BiLSTM(64) processes the 106-step sequence in both forward and backward directions and concatenates the outputs (128-dim), followed by BatchNormalization and Dropout(0.3).
Highest AUC among the vanilla LSTM variants (0.715). Training ended after ~22 epochs per fold (range 6–43). The high variance across folds is likely due to the small dataset.
LSTM + Attention: Bahdanau-style additive attention. LSTM(64, return_sequences=True) produces (106, 64) hidden states; Dense(1, tanh) scores each step; Softmax(axis=1) normalises the scores across steps; the weighted sum of the states collapses to a 64-dim context vector.
Training ended after ~27 epochs per fold (range 14–53). AUC 0.684.
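
The pooling step described above reduces 106 hidden states to one vector. A NumPy sketch for a single sample follows, assuming the scoring layer is one tanh unit as stated; the hidden states and weights are random placeholders, and the function name is hypothetical.

```python
import numpy as np


def attention_pool(H, w, b):
    """Collapse hidden states into one context vector.

    H: (steps, units) hidden states, w: (units,) scoring weights, b: scalar bias.
    Returns the context vector (units,) and the attention weights (steps,).
    """
    scores = np.tanh(H @ w + b)            # Dense(1, tanh): one score per step
    exp_scores = np.exp(scores - scores.max())
    alpha = exp_scores / exp_scores.sum()  # softmax over the step axis
    context = alpha @ H                    # attention-weighted sum of states
    return context, alpha


rng = np.random.default_rng(0)
H = rng.normal(size=(106, 64))   # placeholder for the LSTM outputs of one sample
w = rng.normal(size=64)          # random, not trained
context, alpha = attention_pool(H, w, b=0.0)
print(context.shape, alpha.shape)  # (64,) (106,)
```

Since tanh bounds every score to the interval (-1, 1), the largest attention weight can exceed the smallest by a factor of at most e² ≈ 7.4, so this form of attention cannot concentrate sharply on a few steps.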
CNN-LSTM: Conv1D(32, k=5, same) extracts local feature patterns → MaxPool1D(2) halves the sequence → Conv1D(64, k=3, same) abstracts higher-level motifs → LSTM(64) captures temporal order → Dropout(0.3) → Dense(32, relu).
Highest AUC among all LSTM variants (0.777). Training ended after ~27 epochs per fold (range 12–43).
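
Tracing the tensor shape through the convolutional front end shows what the LSTM actually receives. The sketch below assumes stride 1 for the convolutions and a pooling stride equal to the pool size (the Keras defaults); neither is stated on the dashboard, and the helper names are hypothetical.

```python
def conv1d_same(steps, filters):
    """'same' padding with stride 1 keeps the step count and sets the channels."""
    return steps, filters


def maxpool1d(steps, channels, pool=2):
    """Non-overlapping pooling without padding divides the step count by pool."""
    return steps // pool, channels


def trace_cnn_front_end(n_features=106):
    """Return the (steps, channels) shape after each layer before the LSTM."""
    shapes = [("input", (n_features, 1))]
    steps, channels = conv1d_same(n_features, 32)
    shapes.append(("Conv1D(32, k=5)", (steps, channels)))
    steps, channels = maxpool1d(steps, channels, pool=2)
    shapes.append(("MaxPool1D(2)", (steps, channels)))
    steps, channels = conv1d_same(steps, 64)
    shapes.append(("Conv1D(64, k=3)", (steps, channels)))
    return shapes


for layer, shape in trace_cnn_front_end():
    print(f"{layer:16s} {shape}")
# The last line printed is: Conv1D(64, k=3)  (53, 64)
```

Under these assumptions the LSTM sees 53 steps of 64 channels rather than 106 steps of one value, which halves the recurrent sequence length relative to the plain LSTM variants.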