ISSN: 2161-038X
Research Article - (2017) Volume 0, Issue 0
Background: 20-25% of patients affected by prostate cancer relapse within the first 5 years after radical prostatectomy. Risk assessment is normally based on consolidated parameters describing the peri-operative tumor characteristics, namely tumor staging, nodal involvement, positive margins, pathological Gleason Score and pre-surgery PSA values.
Methods: Based on the EUREKA-1 database, which collected clinical data from a large cohort of prostatectomized Italian patients, we validated three different models, which differ in the splitting of the pGS=7 patients in 3+4 and 4+3 cases and in the inclusion of the first post-surgery PSA value.
Results: Differences in the area under the ROC curve (AUC) were detected among the models; the AUC is highest (73%) when the first post-surgical PSA is accounted for in the evaluation of the risk of tumor recurrence.
Conclusions: Early post-surgical PSA evaluation, besides being the starting point for long-term monitoring and a very sensitive ‘alarm bell’ when the biochemical recurrence threshold is approached, is therefore a valuable co-parameter for the post-surgical assessment of the risk of prostate tumor recurrence.
Keywords: Prostate cancer; Nomogram; PSA; Prostatectomy; Cancer recurrence
Predicting the probability of recurrence of Prostate Cancer (PCa) after Radical Prostatectomy (RP) is one of the main goals of research in this field. Roughly speaking, there are two useful kinds of information: the peri-operative tumor characteristics (Gleason Score, tumor stage, surgical details and so on), which relate to the time of surgery and are the basis of the so-called ‘static models’, and the postoperative tumor dynamics, mainly based on the progression of the PSA concentration in serum (‘dynamic models’).
Dynamic models, already proposed by [1] and independently by [2], and further developed by [3,4], focus on estimating PSA velocity and doubling time from serial PSA measurements; these quantities are expected to mirror tumor proliferation.
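As an illustration of the doubling-time idea, the following minimal sketch assumes that PSA grows exponentially, so that ln(PSA) is linear in time and the doubling time is ln(2) divided by the fitted slope. This log-linear fit is a common convention and is not necessarily the method used in [1-4]; the function name and the sample values are hypothetical.

```python
import math

def psa_doubling_time(times_months, psa_values):
    """Estimate PSA doubling time (months) from a log-linear least-squares fit.

    Assumes PSA(t) = PSA0 * exp(k * t), so ln(PSA) has slope k and the
    doubling time is ln(2) / k. All PSA values must be strictly positive.
    """
    if len(times_months) != len(psa_values) or len(times_months) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log(p) for p in psa_values]
    n = len(times_months)
    mean_t = sum(times_months) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_months)
    if sxx == 0:
        raise ValueError("measurement times must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_months, logs))
    slope = sxy / sxx  # growth rate k, in 1/month
    if slope <= 0:
        return math.inf  # PSA flat or falling: no finite doubling time
    return math.log(2) / slope

# Hypothetical serial measurements: months after surgery, PSA in ng/ml
times = [0, 6, 12, 18]
psa = [0.05, 0.10, 0.20, 0.40]
doubling_time = psa_doubling_time(times, psa)
```

In this made-up series the PSA doubles at every 6-month visit, so the fitted doubling time is about 6 months.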
Static models, on which we concentrate in the present paper, are normally validated on large clinical databases, and aim at producing simple and reliable tools for guiding therapeutic decisions based on the concept of classes of risk. Very popular nomograms have been proposed, starting from the first model of [5], the GPSM (Gleason Score, PSA, Seminal Vesicle invasion and Margin Status) proposed by [6], the nomogram of [7] and all their updated versions.
The definition of risk class obviously depends on the choice of the clinical endpoint: most of the available nomograms assume cancer-specific survival as the primary endpoint. However, biochemical disease-free survival is sometimes a more sensible choice, because the so-called ‘biochemical recurrence’ is well defined and more easily assessed than PCa-related death in a population of elderly patients presenting various age-related co-morbidities.
The most popular models take into consideration a few parameters, which are described below. The first is the Prostate Specific Antigen (PSA), which is secreted in the human body exclusively by the prostatic tissue and can be measured in serum at concentrations of a few ng/ml; it is therefore an easily evaluated and cost-effective blood marker. Normally, the last value before surgery (initial PSA or iPSA) is taken into account, although other values and PSA kinetics could also be of interest, especially if metastases are suspected.
The traditional TNM stage, either assigned at diagnosis through clinical and radiological exams (clinical staging, cT) or, better, confirmed after surgery by pathological examination (pathological staging, pT), is taken into account as well; staging is performed according to the American Joint Committee on Cancer (AJCC) 2010, Seventh Edition [8].
Histological data are normally based on the Gleason Score system proposed almost 50 years ago, which can be assessed either on the needle biopsy specimen (bioptic Gleason Score, bGS) or, following radical prostatectomy, on the whole prostate specimen (pathologic Gleason Score, pGS), using the classification proposed by the ISUP Modified Gleason System in 2005 [9]. Due to the intrinsic randomness of biopsy sampling and the frequent occurrence of post-surgical re-staging, pGS is normally preferred. The GS is normally composed of its primary grade, assigned to the dominant (>50%) pattern, and its secondary grade, assigned to the next most frequent pattern.
Moreover, a novel way of grouping Gleason grades has been recently proposed by [10]. Nine potential Gleason scores (2-10) were grouped into five groups: Grade group 1 (Gleason score 2-6); Grade group 2 (Gleason score 3+4=7); Grade group 3 (Gleason score 4+3=7); Grade group 4 (Gleason score 8); and Grade group 5 (Gleason score 9-10) to accurately reflect prognosis of prostate cancer.
These new groups correlated strongly with the risk of biochemical recurrence after surgery. According to [10], recurrence rates differed between Gleason 3+4 and 4+3. The new grading system was endorsed by ISUP and accepted by the World Health Organization for the 2016 edition of the genitourinary pathology blue book [11].
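The grouping described above can be written as a small lookup. The sketch below only encodes the five groups as listed in the text; the function name is hypothetical, and a score of 7 is accepted only as 3+4 or 4+3 because those are the only splits defined here.

```python
def grade_group(primary, secondary):
    """Map primary and secondary Gleason grades to grade groups 1-5."""
    score = primary + secondary
    if not 2 <= score <= 10:
        raise ValueError("Gleason score must be between 2 and 10")
    if score <= 6:
        return 1
    if score == 7:
        if (primary, secondary) == (3, 4):
            return 2
        if (primary, secondary) == (4, 3):
            return 3
        # Other ways of reaching 7 are not covered by the grouping in the text
        raise ValueError("score 7 is only defined as 3+4 or 4+3")
    if score == 8:
        return 4
    return 5  # Gleason score 9-10

groups = [grade_group(3, 3), grade_group(3, 4), grade_group(4, 3),
          grade_group(4, 4), grade_group(5, 4)]
```

The point of the split is visible in the second and third entries: both patients have a Gleason score of 7 but fall into different groups.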
Finally, the detection of the tumor in the regional lymph nodes after lymphadenectomy (pN) and the presence of positive margins (i.e. tumor extending to the surgical margin, evidenced by an inked surface at the pathological exam) are normally accounted for.
The aim of our work is to determine how important the ‘traditional’ variables (iPSA, pGS, pT, pN and surgical margins) are, and whether and how adding further parameters, such as the new classification of the GS and the first PSA value after surgery (Post Surgery PSA or PS-PSA), may improve the model predictions.
The assumed clinical endpoint is the occurrence of biochemical recurrence (which is clearly and unambiguously determined in our database) instead of disease-free survival. In particular, we aimed at assessing the patients' risk of relapse within 5 years.
Data are presented in Section 2, statistical analysis is shown in Section 3, a discussion and the conclusions are presented in Section 4.
Following direct invitation, most of the Piedmont Urology divisions with the largest case series agreed to collaborate in a large clinical data collection named the EUREKA-1 study, which was approved by the FPO-IRCCS Ethical Committee as lead clinical center. Since the study was aimed at evaluating and validating the modelling activities promoted within the Computational Horizon in Cancer (CHIC) European project (grant agreement no. 600841), the data were available for further studies. The centers participating in the EUREKA-1 study were: San Luigi Gonzaga Hospital (Orbassano), Giovanni Bosco Hospital (Torino), Città della Salute Hospital (Torino), Maggiore Hospital (Novara), ASL TO4 Ivrea-Cirié (Ivrea-Cirié), Gradenigo Hospital (Torino), Santa Croce Hospital (Cuneo), Regionale Hospital (Aosta), Cardinal Massaia Hospital (Asti), Mauriziano Hospital (Torino), Maria Vittoria Hospital (Torino) and Fondazione Edo Tempia (Biella).
We consider 1862 patients belonging to the historic cohort EUREKA-1. Patients were selected with the following criteria:
• They underwent radical prostatectomy between 1991 and 2014 in one of the hospitals participating in the EUREKA-1 study [12]
• They had the first PSA assessment after surgery and its dosage was below the relapse threshold (0.2 ng/ml)
• They were not treated with adjuvant therapies
The collected data (obtained after written informed consent) refer to pre-surgical information (e.g. the clinical stage of the tumor) and histological results. Patients are anonymous: they are identified by their ID code only, which cannot in any way be traced back to their identity.
In particular, we concentrated on:
• Pathological Gleason Score (pGS): histologic scoring of the definitive RP sample, ranging between 2 and 10.
• Pathological T Stage (pT): it designates the size and invasiveness of the tumor.
• Initial PSA (iPSA): PSA concentration in serum just before surgery
• First PSA after surgery (PS-PSA): the first PSA dosage in serum collected between 1 and 12 months after RP
• pN: pathologic presence in lymph nodes of the tumor after pelvic lymphadenectomy. N0 means that the cancer has not spread to any lymph nodes, N1 means the cancer has spread to nearby lymph nodes in the pelvis.
• Positive margins: tumor extending to the surgical margin (inked surface at the pathological exam).
Generally, pN and positive margins are very important for assessing the risk of relapse or death, and they are included in most nomograms (e.g. Partin) to account for the possibility of residual disease after RP. Indeed, pN=1 and/or positive margins indicate a high risk of dying from metastatic PCa. In our sample a very small percentage of patients is positive for one of them. However, we included them in the analysis because of their importance.
We extracted a subgroup of 273 patients using the SURVEYSELECT procedure (SAS) so as to preserve the representativeness of each variable in the original cohort. This subgroup, which we will call the control group, is excluded from the analysis shown in the following section and is used only to validate the model (see Section 3.5).
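The idea of a representative hold-out group can be illustrated with proportional stratified sampling. The sketch below is not the SAS SURVEYSELECT procedure and, as a simplifying assumption, stratifies on a single variable (margin status, with the counts of Table 1) instead of on every variable; all names and the synthetic cohort are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Draw roughly `fraction` of the records from every stratum given by `key`."""
    rng = random.Random(seed)  # fixed seed for a reproducible draw
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)  # proportional allocation
        sample.extend(rng.sample(members, k))
    return sample

# Synthetic cohort: 1862 patients, 370 of them with positive margins
cohort = [{"id": i, "margins": i < 370} for i in range(1862)]
control = stratified_sample(cohort, key=lambda r: r["margins"],
                            fraction=273 / 1862)
positive_in_control = sum(r["margins"] for r in control)
```

With these counts the draw contains 273 patients, 54 of whom have positive margins, so the proportion of positive margins in the subgroup (about 20%) matches that of the full cohort.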
Both parametric and non-parametric statistical analyses were performed. In particular:
• Kaplan-Meier curves are shown, which indicate the percentage of non-relapsed patients over time (months) after surgery, according to the values of each variable considered in the model.
• Generalized Linear Models (GLMs) were used in order to weight the information given by each variable. Indeed, a GLM provides a score for each patient, calculated from his clinical parameters, and links this score to the probability of relapse within 5 years.
• Receiver Operating Characteristic (ROC) curves show how reliable the relationship is between the outcome (relapsed vs. not relapsed), considered as the gold standard, and the score obtained from the GLMs. ROC curves for each model are produced and compared.
In Section 3.5 we discuss the best cut-off for the score considering the balance between sensitivity (the proportion of positives that are correctly identified as such) and specificity (the proportion of negatives that are correctly identified as such).
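The two definitions above translate directly into counts of true and false positives and negatives. The following minimal sketch assumes that a patient is flagged as ‘at risk’ when his score is greater than or equal to the cut-off (whether the comparison is strict is an assumption); the scores and outcomes are invented for illustration.

```python
def sensitivity_specificity(scores, relapsed, cutoff):
    """Flag score >= cutoff as predicted relapse and compare with outcomes."""
    pairs = list(zip(scores, relapsed))
    tp = sum(1 for s, r in pairs if r and s >= cutoff)      # relapsed, flagged
    fn = sum(1 for s, r in pairs if r and s < cutoff)       # relapsed, missed
    tn = sum(1 for s, r in pairs if not r and s < cutoff)   # not relapsed, cleared
    fp = sum(1 for s, r in pairs if not r and s >= cutoff)  # not relapsed, flagged
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical scores and 5-year outcomes (True = relapsed)
scores = [0, 6, 55, 61, 68, 72, 57, 50]
relapsed = [False, False, False, True, True, True, False, True]
sens, spec = sensitivity_specificity(scores, relapsed, cutoff=55)
```

Here three of the four relapsed patients are flagged (sensitivity 0.75) and two of the four non-relapsed patients are cleared (specificity 0.5); raising the cut-off trades sensitivity for specificity.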
All the above analyses were performed with SAS/STAT® software.
Descriptive statistics
In Table 1 we summarize the principal characteristics of our cohort.
Variable | Value | n (%) |
---|---|---|
pGS | <7 | 839 (45) |
pGS | 3+4 | 616 (33) |
pGS | 4+3 | 284 (15.2) |
pGS | >7 | 122 (6.5) |
pGS | NA | 1 (0.3) |
pT | ≤ pT2 | 1423 (76.4) |
pT | ≥ pT3 | 292 (15.8) |
pT | NA | 147 (7.8) |
iPSA (ng/ml) | <20 | 1808 (97) |
iPSA (ng/ml) | 20-50 | 52 (2.7) |
iPSA (ng/ml) | >50 | 2 (0.3) |
PS-PSA (ng/ml) | PS-PSA=0 | 295 (15.8) |
PS-PSA (ng/ml) | 0 < PS-PSA < 0.1 | 1376 (73.9) |
PS-PSA (ng/ml) | PS-PSA ≥ 0.1 | 191 (10.3) |
pN | pN0 | 1850 (99.3) |
pN | pN1 | 12 (0.7) |
Positive margins | No | 1492 (80.1) |
Positive margins | Yes | 370 (19.9) |
Table 1: Basic statistics of the sample.
Note that the majority of patients have a GS
Kaplan-Meier curves
We first consider each variable separately, in order to analyze its impact on disease progression. The Kaplan-Meier curves are shown in Figure 1. Differences between classes are always significant (p-values of Log-Rank tests
Figure 1a refers to the pathological GS. A high pGS is a bad prognostic factor for recurrence: indeed, less than 60% of pGS>7 are free from recurrence after 5 years. On the contrary, more than the 90% of pGS10, [13], we split the pGS=7 cohort in two sub-cohorts: 3+4 (straight dark curve) and 4+3 (dotted light curve). The difference between these two groups becomes particularly important after five years, since more than 80% of patients with pGS=3+4 had no relapse in 5 years while their percentage lowers to 70% for the 4+3 ones. Note that we did not consider separately pGS=8 and pGS>8 but only pGS>7 because of the small number of patients with pGS>8 (36 patients).
Figure 1b refers to pT. As expected from the literature and clinical practice, the pT2 stages (pT2, pT2a, pT2b and pT2c) carry a better prognosis than higher stages such as pT3 and pT4. In particular, after 60 months (5 years) more than 80% of pT2 patients were free from the disease, while less than 70% of pT3 patients had no recurrence.
Figure 1c refers to the iPSA. We divide the continuous iPSA values into 3 classes: less than 20, 20-50 and greater than 50. The first class indicates a good prognosis: the percentage of patients without recurrence was around 80% not only at 5 years but also for longer observation periods. On the contrary, only 60% of patients whose iPSA was between 20 and 50 had no recurrence in 5 years. The third class (iPSA >50) leads to a very bad prognosis: all the patients relapsed within 5 years.
In Figure 1d we consider the first PSA value after surgery (PS-PSA), measured between 1 and 12 months after RP (within 1-6 months in 63.6% and within 1-4 months in 52.2% of our cohort). Indeed, as shown by many authors (see for instance [1,14,15]), the PSA values after surgery are very important for assessing the risk of relapse.
We divided the PS-PSA into three classes: equal to 0, greater than 0 but less than 0.1, and at least 0.1 but below the threshold value (0.2 ng/ml). As shown in Figure 1d, PS-PSA=0 led to a very good prognosis: more than 90% of this group had no relapse in 5 years. On the contrary, a PS-PSA value of 0.1 or more is an ‘alarm bell’ for relapse, since only 60% of patients in this class had not relapsed after 5 years.
Figure 1e refers to the presence of cancer cells in the regional lymph nodes. pN is very important for discriminating between good and bad prognosis. In the first two years, more than 40% of pN1 patients had a relapse, while 80% of pN0 patients were free from relapse in the first 5 years.
Figure 1f refers to positive margins, which led to a poor prognosis: less than 60% of patients with positive margins were free from relapse after 5 years, compared with more than 80% of those with negative margins.
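The recurrence-free percentages quoted above are read off Kaplan-Meier curves. A minimal sketch of the product-limit estimator behind such curves is given below; it is a plain-Python illustration with invented follow-up data, not the SAS procedure used for Figure 1.

```python
def kaplan_meier(durations, events):
    """Return [(time, recurrence-free fraction)] by the product-limit method.

    durations: follow-up time of each patient (months)
    events:    True if relapse was observed at that time, False if censored
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        relapses = sum(1 for d, e in zip(durations, events) if d == t and e)
        leaving = sum(1 for d in durations if d == t)  # relapsed or censored at t
        if relapses:
            survival *= 1 - relapses / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up of six patients
durations = [12, 24, 24, 36, 60, 60]
events = [True, False, True, True, False, False]
curve = kaplan_meier(durations, events)
```

Censored patients (for instance those lost to follow-up) leave the risk set without lowering the curve, which is why the estimate can be used even when follow-up lengths differ widely, as in a cohort operated on between 1991 and 2014.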
The GLM analysis
Three models were considered: the first one similar to the Partin table [5], the second using the new classification of the Gleason Score, and the last one also including the PS-PSA parameter.
For each model, a GLM analysis was performed. Namely, the actual time of biochemical relapse of each patient was taken as the ‘gold standard’, and pGS, pT, margins, pN and iPSA were taken as independent variables, which are supposed to be correlated with the gold standard (relapse within 5 years or more). Accordingly, the correlation strength was taken as a ‘weight’ representing how important the value of the variable is in determining the outcome, and defining the partial score of that variable. Summing the partial scores of all the single variables gives a total score for each patient, which can be used as a predictor of the patient's outcome. Examples are shown in Section 3.5.
Starting from the first model, we took into account, as independent variables, only iPSA (continuous variable), pGS divided into three classes (<7, =7 and >7), pT as ≤ pT2 and ≥ pT3, pN as pN0 and pN1, and positive margins. In the GLM analysis, only pGS and positive margins were significant (p-value <0.05). Note that pN remains an important factor in the prognosis of the patient, as shown in Figure 1e; the above result comes from the fact that the pN1 sample is too small to carry weight in the analysis (12 pN1 patients versus 1850 pN0). A similar situation occurs for the iPSA value: 1808 patients had iPSA <20 ng/ml, whereas only 2 had iPSA >50 ng/ml (Table 1). As concerns pT, only 15% (292) of patients were ≥ pT3 and 7.8% (147) were missing.
We then compared the above model with a similar one obtained by splitting pGS=7 into 3+4 and 4+3, as proposed by [10]. Also in this case only pGS and positive margins were significant.
Finally, we also inserted the PS-PSA value in the model, finding that it is the most significant prognostic factor among the considered variables, while pGS and positive margins remain important factors.
The ROC curves
After the GLM analysis, we compared the three models, looking for the most reliable one. We therefore computed the ROC curve of each model (Figure 2) and calculated the Area Under the Curve (AUC), which expresses the probability that a relapsed patient obtains a higher score than a non-relapsed one.
The last model, including the PS-PSA (two-dashed line in Figure 2), obtains an AUC of 0.7344 (95% CI 0.7294-0.7494), which is much better than the previous ones. In the next section we propose both a graphical and a numerical tool to calculate the probability of relapse in 5 years using this last model.
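The probabilistic reading of the AUC can be checked by direct counting: over all pairs made of one relapsed and one non-relapsed patient, the AUC is the fraction of pairs in which the relapsed patient has the higher score, with ties counted as one half. The sketch below uses invented scores and a hypothetical function name; it illustrates the definition and does not reproduce the values of Figure 2.

```python
def auc_pairwise(scores, relapsed):
    """AUC as the fraction of (relapsed, non-relapsed) pairs ranked correctly."""
    pos = [s for s, r in zip(scores, relapsed) if r]
    neg = [s for s, r in zip(scores, relapsed) if not r]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    # A tie contributes 0.5, matching the area under the empirical ROC curve
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores and 5-year outcomes (True = relapsed)
scores = [0, 6, 55, 61, 68, 72, 57, 50]
relapsed = [False, False, False, True, True, True, False, True]
auc = auc_pairwise(scores, relapsed)
```

In this toy sample 14 of the 16 pairs are ranked correctly, giving an AUC of 0.875; a model that ranked patients at random would score about 0.5.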
The Nomogram
A nomogram is a simple, graphical way to calculate the total risk of relapse from selected factors, using as scores the beta values estimated by the GLM analysis (PROC GLM in SAS software) as explained in Section 3.3. Here we suggest a nomogram scheme (Figure 3) and the underlying numerical values (Table 2) to assess the score of a single patient. By summing the scores, the probability of relapse in 5 years can be easily calculated. As an example, a patient with pGS=3+4, PS-PSA=0.05 and positive margins has a score of 6+55+7=68, which is larger than 55, a possible threshold value. This patient actually relapsed 41 months after surgery.
Variable | Value | Score |
---|---|---|
pGS | ≤ 6 | 0 |
pGS | 3+4 | 6 |
pGS | 4+3 | 15 |
pGS | ≥ 8 | 17 |
PS-PSA | 0 | 0 |
PS-PSA | 0-0.1 | 55 |
PS-PSA | ≥ 0.1 | 50 |
Positive margins | No | 0 |
Positive margins | Yes | 7 |
Table 2: How to calculate the scores.
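The table of scores lends itself to a direct implementation. The sketch below transcribes the partial scores of Table 2 and sums them; the function and dictionary names are hypothetical, and treating a PS-PSA of exactly 0.1 ng/ml as belonging to the ‘≥ 0.1’ class follows the table labels.

```python
# Partial scores transcribed from Table 2
PGS_SCORE = {"<=6": 0, "3+4": 6, "4+3": 15, ">=8": 17}
MARGIN_SCORE = {False: 0, True: 7}

def ps_psa_score(ps_psa):
    """Partial score for the first post-surgery PSA value (ng/ml)."""
    if ps_psa < 0:
        raise ValueError("PSA concentration cannot be negative")
    if ps_psa == 0:
        return 0
    if ps_psa < 0.1:
        return 55
    return 50

def total_score(pgs, ps_psa, positive_margins):
    """Sum the partial scores of pGS, PS-PSA and margin status."""
    return PGS_SCORE[pgs] + ps_psa_score(ps_psa) + MARGIN_SCORE[positive_margins]

# Worked example from the text: pGS=3+4, PS-PSA=0.05, positive margins
score = total_score("3+4", 0.05, True)
at_risk = score >= 55  # 55 is one possible cut-off, not a fixed rule
```

For the example patient the sum is 6+55+7=68, which is above the cut-off of 55.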
To select a meaningful cut-off score to assess the occurrence of relapse in the first 5 years, different strategies can be followed. Choosing a cut-off equal to 55, the sensitivity is high (93%), i.e. 93% of relapsed patients can be identified by the model. However, the specificity is 42%, i.e. 58% of patients who are not going to relapse could be alarmed unnecessarily. Choosing a cut-off equal to 58, the specificity is higher (70%), but the sensitivity decreases (61%).
We also applied the nomogram to the control group of 273 patients (see Section 2). In this sample, for the cut-off equal to 55 the sensitivity is 95% and the specificity 43%, which are consistent with those previously found. The nomogram was therefore validated on both the control group and the whole cohort.
This work focuses on the risk of biochemical recurrence within 5 years after radical prostatectomy, since PSA data in our database are reported in detail, which makes the occurrence of ‘biochemical recurrence’ a very reliable piece of information. On the contrary, elderly patients are progressively lost at follow-up and are often affected by comorbidities, making the actual prostate cancer related death sometimes difficult to assess. Unfortunately, most of the other available nomograms, such as EMPaCT [13-16] and Partin [5], assume PCa-related death as the clinical endpoint, so no direct comparison can be performed and discussed.
Another important aspect is that the study refers to a specific population (Northern Italy) and therefore reflects both the lifestyle and the state of the art of surgery, which may be geo-dependent, as stressed by [17].
While splitting the pGS into 3+4 and 4+3 has almost no impact on the results, the novelty of the present approach is the addition of the PS-PSA, which greatly improves the reliability of the model predictions. This finding is not surprising, since relapse may follow microscopic incomplete resection at the prostate bed or nodal/distant metastases already present but hardly detectable before RP. Their contribution to the large PSA value mainly produced by the primary tumor was somehow masked, but post-surgery PSA measurements now make it apparent.
As concerns PS-PSA, our study confirms that PSA values smaller than 0.2 ng/ml can identify a significant fraction of patients who are normally considered in range (but who actually are at risk). As a matter of fact, the inclusion of this parameter greatly improves the predictive ability of the model.
A possible objection is that waiting for a post-surgical PSA assessment (at least one month after RP, to avoid artefactual results related to the acute response) may improperly delay clinical decisions. As a matter of fact, PS-PSA is normally sampled in the first 1-6 months, but it can be brought forward in order to run the nomogram as early as possible (e.g. within 3 months from RP), especially if metastases are suspected and supplementary therapy may be needed.
Another point of caution refers to the fact that the patients entering this study underwent PSA dosage assessment using different techniques and, especially for the older ones, PSA measures may be less accurate and affected by significant systematic errors.
In conclusion, clinical decisions following RP become more reliable if the PSA serum concentration measured as early as possible (between 1 and 3 months) after prostatectomy is added to the other peri-surgical parameters, namely pGS and surgical margins.
This model is easily applicable as a graphical nomogram or as a table of scores. In both cases, clinicians can choose the best cut-off value depending on their institution's guidelines, derived from ethical and practical considerations.
Moreover, the model not only identifies the patients whose residual disease already produces PSA concentrations above the biochemical recurrence threshold, who urgently need to be treated, but also detects other ‘at risk’ patients who should be carefully and frequently monitored.
Besides the definition of classes of risk based on the peri-surgical evaluation of the tumor severity, the lifetime collection of PSA samples remains indeed the best way to monitor the cancer progression after surgery [4].
This study has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 600841.