CRC 2000 - colorectal cancer - data structure and protocols
CRC 2000 data preparation: brief written protocol
Strict confidentiality of trial results is
observed. Information is held in the Clinical Trial Service Unit computers
in a form which can be accessed only by known individuals.
All patient records are converted into
'green form' format (described below) if not already supplied in it. Results
received as tables are converted into sets of synthetic 'green form' records.
The following routine checks (where appropriate) are performed on every
'green form' compilation:
- Duplicate patient entries
- Patient identifier missing
- Randomisation date missing
- Treatment allocation missing
- Surgery date missing
- Tumour site missing
- Tumour stage missing
- Gender missing
- Randomisation age missing
- Recurrence date missing
- Recurrence type missing
- Survival status missing
- Death date missing
- Randomisation date wrong, before 1945 or out of range
- Surgery date wrong or out of range
- Recurrence date wrong or out of range
- Last follow-up or death date wrong or out of range
- Treatment allocation code unknown
- Tumour site code unknown
- Gender code unknown
- Randomisation age not in range 20-98
- Recurrence type code unknown
- Survival status code unknown
- Tumour stage incompatible with metastatic disease status
- Recurrence flag error
- Recurrence type given without event
- Cause of death given when alive
- Died of colorectal cancer without recurrence
- Died of cause other than colorectal cancer but with recurrence
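A few of the checks listed above might be sketched as follows. This is an illustration only, not the Unit's actual software: the dictionary layout, the key names and the functions `check_record` and `duplicate_ids` are hypothetical, and treating any date after today as 'out of range' is an assumption. The numeric codes follow the 'green form' specification (recurrence 2 = yes; state 1 = alive, 2 = dead; cause of death 11 = colorectal cancer), with zero or None standing for a missing item.

```python
from datetime import date

def check_record(rec, today=None):
    """Return a list of problems found in one record (hypothetical dict layout)."""
    today = today or date.today()
    problems = []
    # Missing-item checks: zero, empty string or None all count as missing.
    for field in ("patient", "entry_date", "treatment"):
        if not rec.get(field):
            problems.append(f"{field} missing")
    # Range checks (the upper limit of 'today' is an assumption).
    entry = rec.get("entry_date")
    if entry and (entry.year < 1945 or entry > today):
        problems.append("entry_date out of range")
    age = rec.get("age")
    if age and not 20 <= age <= 98:
        problems.append("age not in range 20-98")
    # Consistency checks between recurrence, survival status and cause of death.
    recurred = rec.get("recurrence") == 2
    if rec.get("recurrence_type") and not recurred:
        problems.append("recurrence type given without event")
    if rec.get("cause_of_death") and rec.get("state") == 1:
        problems.append("cause of death given when alive")
    if rec.get("state") == 2 and rec.get("cause_of_death") == 11 and not recurred:
        problems.append("died of colorectal cancer without recurrence")
    return problems

def duplicate_ids(records):
    """Return the patient identifiers that occur more than once."""
    seen, duplicates = set(), set()
    for rec in records:
        pid = rec.get("patient")
        if pid in seen:
            duplicates.add(pid)
        seen.add(pid)
    return sorted(duplicates)
```

Returning a list of messages, rather than stopping at the first problem, lets every failed check for a record be reported together.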
The total numbers of patients and the distributions
of randomisation age, tumour site, tumour stage and gender are checked
for any significant imbalance between treatment groups. These four distributions
are compared as follows. Patients are grouped into four categories according
to randomisation age (below 50 years; 50 - 64 years or unknown; 65 - 74
years; 75 years or above) and a chi-squared test is applied to the
numbers of patients in the four categories found in each treatment group. Similarly,
three categories are formed for tumour site (colon; (colon plus rectum)
or unknown; rectum), five for tumour stage ('other' or unknown; A; B; C;
D) and three for gender (male; unknown; female) and these are tested in
the same way as the categories formed for randomisation ages.
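The age comparison described above could be sketched like this. The text does not say which form of the chi-squared test is used, so the Pearson statistic is assumed here; the function names and the input layout (pairs of treatment group number and age) are hypothetical. Only the statistic and its degrees of freedom are computed; the significance level would be looked up separately.

```python
def age_category(age):
    """Map randomisation age to one of four categories; 0 or None means unknown."""
    if not age:
        return 1              # unknown is pooled with 50 - 64 years
    if age < 50:
        return 0
    if age < 65:
        return 1
    if age < 75:
        return 2
    return 3

def age_table(patients, n_groups):
    """Count patients by treatment group (1-based) and age category."""
    table = [[0] * 4 for _ in range(n_groups)]
    for treatment, age in patients:
        table[treatment - 1][age_category(age)] += 1
    return table

def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of
    counts (rows = treatment groups, columns = categories)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    # Empty rows or columns contribute nothing and are not counted.
    rows = sum(1 for t in row_totals if t)
    cols = sum(1 for t in col_totals if t)
    return stat, (rows - 1) * (cols - 1)
```

The same `chi_squared` function would serve for the tumour site, tumour stage and gender tables, given a matching categorisation function for each.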
If an event such as the recurrence of disease
is reported at a date later than the quoted last follow-up date, the last
follow-up date is automatically changed to the later date. The completeness
of follow-up is then calculated for the end of each calendar year. The
distributions of randomisation dates, randomisation ages and time elapsed
since last follow-up are checked for any significant imbalance between
treatment groups in two ways as follows. Firstly, a t-test is applied
to the difference between the mean value of each distribution for patients
in each group and the corresponding mean for patients in the remainder.
Secondly, an F-ratio is calculated for each distribution by comparing
the variance between the groups with the variance within the groups. The
distribution of time elapsed since last follow-up is also checked in these
two ways for any significant imbalance between those patients with and
those patients without a recorded recurrence of disease. Finally, the distribution
of time elapsed since last follow-up is checked in the same two ways for
any significant imbalance between patients in two categories of tumour
site (colon; rectum), two categories of tumour stage (A/B; C/D) and two
categories of gender (male; female).
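The follow-up adjustment and the two comparisons described above might look like this in outline. The sketch assumes that dates have already been converted to numbers (for example days since a fixed origin), that the t-test is the pooled-variance Student form, and that the F-ratio is the usual one-way analysis of variance; the text does not confirm these details, and the function names are hypothetical.

```python
from statistics import mean

def adjust_last_follow_up(last_follow_up, event_dates):
    """Move the last follow-up date forward to the latest reported event, if later."""
    known = [d for d in event_dates if d is not None]
    return max([last_follow_up, *known])

def f_ratio(groups):
    """Variance between the groups divided by the variance within the groups."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n - k)
    return between / within

def t_group_vs_rest(groups, index):
    """Pooled-variance t statistic comparing one group with all the others."""
    a = groups[index]
    b = [v for i, g in enumerate(groups) if i != index for v in g]
    ss = sum((v - mean(a)) ** 2 for v in a) + sum((v - mean(b)) ** 2 for v in b)
    pooled = ss / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / len(a) + 1 / len(b))) ** 0.5
```

With exactly two groups the two statistics carry the same information, since the F-ratio then equals the square of the t statistic.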
Where patient serial numbers form an obvious
sequence, the sequence is checked for missing numbers.
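A minimal sketch of such a check, assuming the serial numbers are plain integers (the function name `missing_serials` is hypothetical):

```python
def missing_serials(serials):
    """Return the numbers absent from the run min(serials)..max(serials)."""
    present = set(serials)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

For example, `missing_serials([1, 2, 4, 7])` returns `[3, 5, 6]`.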
A tabulated breakdown of variables is produced
for each trial, together (where relevant) with lists of patients in 'problematical'
categories such as those with lapsed follow-up, uncertain death cause or
second malignancy site. Graphs of accrual date and the proportion of living
patients still on follow-up as a function of time from randomisation by
treatment allocation are also produced, together with Kaplan-Meier life-table
curves. Before trial data are finally incorporated into the overview, the
analyses described above are sent to the participating trialist(s) for
checking and approval.
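For readers unfamiliar with Kaplan-Meier curves, the product-limit calculation behind them can be sketched as below. This is a textbook illustration, not the Unit's plotting program; the function name and the input layout (a follow-up time and a death flag for each patient) are assumptions.

```python
def kaplan_meier(times, died):
    """Product-limit estimate: (time, survival probability) at each death time.

    times: follow-up time for each patient; died: True if that patient died
    at that time, False if the patient was censored (alive or lost).
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, d in zip(times, died) if ti == t and d)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving      # deaths and censored patients both leave the risk set
    return curve
```

Running it separately for each treatment allocation would give one curve per group.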
Contact
Please address inquiries concerning data preparation
and checking to:
Vaughan Evans
CCC Secretariat
C.T.S.U.
Radcliffe Infirmary
Oxford OX2 6HE
England
Tel. U.K.(44)-Oxford(1865)-557241; FAX: U.K.(44)-Oxford(1865)-558817
Specification of CRC 2000 'green form' format
Item | Description | FORTRAN | Columns | Details | Abbreviation
0 | Trial/stratum identifying code | I6 | 1 - 6 | | Trial
1 | Patient identifier (or sequence number) | A12 | 8 - 19 | | Patient
2 | Randomisation date | I8 | 21 - 28 | DDMMYYYY | 'Entry Date'
3 | Treatment group allocated (as on master list) | I1 | 30 | | Trt. Grp
4 | Surgery date | I8 | 32 - 39 | (values below) | Surg. Date
    Value | Description | Abbreviation
    DDMMYYYY | |
    -1 | No surgery |
    -2 | No surgery, NOT on account of disease stage |
    -3 | Surgery but date unknown |
    -4 | Too ill for surgery |
5 | (Not used) | | | |
6 | Tumour site | I1 | 43 | (values below) | Tum. Site
    Value | Description | Abbreviation
    1 | Colon | Colon
    2 | Rectum | Rectum
    3 | Colon and rectum | Col+Rect
7 | Tumour stage | A2 | 45 - 47 | (values below) | Tumour Stage
    Value | Description | Abbreviation
    A | A | A
    B1 | B1 | B1
    B | B | B
    B2 | B2 | B2
    B3 | B3 | B3
    C | C | C
    C1 | C1 | C1
    C2 | C2 | C2
    C3 | C3 | C3
    D | Metastatic disease | D
    D? | Metastatic disease | D?
    N | Not colorectal cancer | N
    W | 'Advanced/metastatic disease' | W
    X | Benign tumour | X
    Y | Inoperable disease | Y
    Y? | Inoperable disease | Y?
    Z | Malignant tumour (unclassified) | Z
    O | Other | O
8 | Gender | I1 | 48 | (values below) | Gender
    Value | Description | Abbreviation
    1 | Male | Male
    2 | Female | Female
9 | Randomisation age | I2 | 50 - 51 | years | Age
10 | Recurrence | I1 | 53 | (values below) | Any/Rec
    Value | Description | Abbreviation
    1 | No | No
    2 | Yes | Yes
11 | Date of first recurrence | I8 | 55 - 62 | DDMMYYYY | Rec. Date
12 | Type of first recurrence | I2 | 63 - 64 | (values below) | Type Rec
    Value | Description | Abbreviation
    1 | Local only | Local
    2 | Local and distant, liver unknown | L+D,?Hep.
    3 | Distant only, including liver | Dist+Hep
    4 | Distant only, excluding liver | Dist,NoHep
    5 | Distant only, liver unknown | Dist,?Hep
    6 | Distant, but local unknown | Dist,?Loc
    7 | Local and distant, including liver | L+D+Hep.
    8 | Local and distant, excluding liver | L+D,NoHep
    9 | Local, but distant unknown | Loc+?Dis
    10 | Unknown, liver sometime | ??+Hep.
    11 | Unknown, but not liver | ??,NoHep
    12 | Unknown | ??
13 | State when last traced | I1 | 66 | (values below) | State
    Value | Description | Abbreviation
    1 | Alive | Alive
    2 | Dead | Dead
    3 | Lost | Lost
14 | Date died or last traced | I8 | 68 - 75 | DDMMYYYY | L.F.U.
15 | Cause of death (extra category) | I2 | 76 - 77 | (values below) | Death
    Value | Description | Abbreviation
    1 | Acute iatrogenic |
    2 | Infective |
    3 | Leukaemia, lymphoma or myeloma |
    4 | Other second neoplasm |
    5 | Cardiovascular |
    6 | Venous embolism |
    7 | Cerebrovascular |
    8 | Extraneous cause |
    9 | Not 1-8,13-18 or colorectal cancer |
    10 | Unspecified non-colo.-ca. cause |
    11 | Colorectal cancer or its mets. |
    12 | Unascertainable cause |
    13 | Renal failure |
    14 | Bowel fistula / ulcer |
    15 | Intestinal obstruction |
    16 | Probably not colorectal cancer |
    17 | Liver failure |
    18 | Gastrointestinal haemorrhage |
    19 | Second primary colorectal cancer |
16 | Name (if given) and comments | A | 79 - end | | Name
Missing or unknown items are left blank or
set to zero.
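As an illustration of the fixed-column layout, one 'green form' record could be read as sketched below. The column positions are taken as listed in the specification (1-based, inclusive); the key names, the `FIELDS` table and the function `parse_green_form` are hypothetical, and blank integer items are returned as zero in line with the note on missing items.

```python
# (first column, last column, type) for each item, as listed in the specification.
FIELDS = {
    "trial":           (1, 6, int),
    "patient":         (8, 19, str),
    "entry_date":      (21, 28, int),   # DDMMYYYY
    "treatment":       (30, 30, int),
    "surgery_date":    (32, 39, int),   # DDMMYYYY, or -1 to -4
    "site":            (43, 43, int),
    "stage":           (45, 47, str),
    "gender":          (48, 48, int),
    "age":             (50, 51, int),
    "recurrence":      (53, 53, int),
    "recurrence_date": (55, 62, int),   # DDMMYYYY
    "recurrence_type": (63, 64, int),
    "state":           (66, 66, int),
    "last_date":       (68, 75, int),   # DDMMYYYY
    "cause_of_death":  (76, 77, int),
}

def parse_green_form(line):
    """Split one fixed-column record into a dict; blank integers become 0."""
    rec = {}
    for name, (first, last, kind) in FIELDS.items():
        text = line[first - 1:last].strip()
        if kind is int:
            rec[name] = int(text) if text else 0
        else:
            rec[name] = text
    rec["name"] = line[78:].strip()     # column 79 to end of line
    return rec
```

Keeping the column positions in one table makes it easy to compare the reader against the specification item by item.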
CRC 2000 data form rubric
GUARANTEE OF CONFIDENTIALITY OF DATA
ANY INFORMATION PROVIDED OVERLEAF TO THE
CCC/LIMAG SECRETARIAT WILL BE HELD SECURELY AND IN STRICT CONFIDENCE.
NOTES ON FORMAT OF DATA REQUESTED OVERLEAF
Special coding conventions
Please accompany these forms by an explanatory
letter about any special coding conventions (e.g. on tumour site,
tumour staging or cause of death) you have used, plus notes on any special
features of the study(s) to which you wish to draw attention.
Dates that are not (or not yet) known exactly
Either leave DAY blank, and give (approximate or provisional) month and year;
or leave DAY and MONTH blank, and just give approximate year.
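Once keyed as a DDMMYYYY integer, a partly known date could be taken apart as sketched here. The sketch assumes that a blank DAY or MONTH is stored as zeros and that zero or negative values are not dates at all; the function name `split_date` is hypothetical.

```python
def split_date(value):
    """Split a DDMMYYYY integer into (day, month, year).

    Unknown day or month (stored as zero) is returned as None; a zero or
    negative value means there is no date and gives None overall.
    """
    if value <= 0:
        return None
    day = value // 1000000
    month = value // 10000 % 100
    year = value % 10000
    return (day or None, month or None, year)
```

For example, `split_date(31995)` gives `(None, 3, 1995)`, i.e. March 1995 with the day unknown.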
BASELINE DATA
Patient identifier
Any convenient convention you wish, in
case any correspondence becomes necessary. (If reporting several trials,
please try to use a system that implicitly specifies both the trial and
the patient.)
Date randomised
Please describe ALL patients EVER randomised, including even lost,
ineligible or withdrawn patients, and ignore all non-randomised
patients.
Trt. gp. allocated
Treatment group number: 1 or 2 only, for
2-group trials, or a wider range for trials with more arms, as defined
by you at the top of the form. N.B: even if, in reality, some quite
different (or even opposite!) treatment was inadvertently given, what is
wanted is the originally-allocated treatment. (For patients erroneously
entered more than once, give only the first allocation.)
Date of surgery
See note above on approximate dates.
Tumour site
0 = unspecified; 1 = colon; 2 = rectum;
3 = colon and rectum. If you prefer to use your own classification of tumour
site (e.g. in order to code sigmoid tumours separately) please do
so, and send us details of it.
Tumour stage
Please use your own classification and
send us details of it, or use the Dukes classification (A = lesion confined
to muscularis propria; B = lesion extends through muscularis propria with
negative nodes; C = positive nodes), or any other standard system (e.g.
Astler-Coller modification, TNM etc). Extra codes: D = metastatic
disease; X = benign tumour (e.g. adenoma); and Y = inoperable disease.
Gender
Entry age
FOLLOW-UP DATA
Recur?
CRC 2000 data preparation diktats
1 Local spread found at surgery
Nothing special, unless excision is incomplete.
2 Incomplete excision / residual tumour after surgery
Set recurrence = 'Yes', set recurrence
date = surgery date, set recurrence type = 'local' (or as appropriate,
if other site(s) already given). If stage is 'A', 'B', 'C' or unknown,
set it to 'Y?' and if 'A', 'B' or 'C', check the change with the trialist.
(N.B: formerly, known stages were not
changed and unknown stage was set to 'D?'; later, 'Y' was used instead
of 'Y?').
3 Local recurrence at surgery
Set stage to 'Y?' and if 'A', 'B' or 'C',
check the change with the trialist.
(N.B: formerly, known stages were not
changed and unknown stage was set to 'D?')
4 Local recurrence reported on days 1-30 after surgery
Put on hold and refer case to RG.
(N.B: formerly, procedure adopted was
as 'local recurrence at surgery'; later, scope was reduced to days 1-5
after surgery)
5 Metastases at surgery
Set recurrence = 'Yes', set recurrence
date = surgery date (if not given), set recurrence type = 'distant' (or
as appropriate, if other site(s) already given). If stage is 'A', 'B',
'C' or unknown, set it to 'D?' and if 'A', 'B' or 'C', check the change
with the trialist.
(N.B: formerly, known stages were not
changed and unknown stage was set to 'D?'; later, only the alterations
from 'A' and 'B' were checked with the trialist)
6 Metastases reported on days 1-30 after surgery
Put on hold and refer case to RG.
(N.B: formerly, procedure adopted was
as 'metastases at surgery'; later, scope was reduced to days 1-5 after
surgery)
7 Recurrence at unknown site at surgery
If stage is 'A', 'B', 'C' or unknown,
set it to 'D?' and if 'A', 'B' or 'C', check the change with the trialist.
8 Recurrence at unknown site reported on days 1-30 after surgery
Put on hold and refer case to RG.
9 Conversion into Dukes system
T | N | M | Dukes
Tis | 0 | 0 | X
1-2 | 0 | 0 | A
3-4 | 0 | 0 | B
Any | >0 | 0 | C
Any | Any | >0 | D
Adenoma | | | X ('benign')
N 'unknown' to be treated as 'N0' (RG)
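Two of the diktats above, the TNM conversion (9) and the handling of metastases at surgery (5), might be sketched as follows. The function names and the record layout are hypothetical. The table does not say how an unknown M should be handled, so the sketch returns an empty stage in that case; it also leaves an existing recurrence type untouched rather than merging sites, and it treats only the literal stages 'A', 'B' and 'C' as needing a check with the trialist. All three are simplifying assumptions.

```python
def tnm_to_dukes(t, n, m):
    """Convert TNM to Dukes stage. t: 'is', 1-4 or 'adenoma'; n, m: int or None."""
    if t == "adenoma":
        return "X"                  # benign
    if m is None:
        return ""                   # M unknown: not covered by the table (assumption)
    n = 0 if n is None else n       # N 'unknown' is treated as N0
    if m > 0:
        return "D"
    if n > 0:
        return "C"
    if t == "is":
        return "X"
    if t in (1, 2):
        return "A"
    if t in (3, 4):
        return "B"
    return ""                       # combination not covered by the table

def apply_metastases_at_surgery(rec):
    """Apply diktat 5 to a record dict in place.

    Returns True when the stage change should be checked with the trialist.
    """
    rec["recurrence"] = 2                           # 'Yes'
    if not rec.get("recurrence_date"):
        rec["recurrence_date"] = rec["surgery_date"]
    if not rec.get("recurrence_type"):
        rec["recurrence_type"] = "distant"
    stage = rec.get("stage")
    needs_check = stage in ("A", "B", "C")
    if needs_check or not stage:
        rec["stage"] = "D?"
    return needs_check
```

The cases that the diktats refer to RG (events reported on days 1-30 after surgery) are deliberately left out, since they call for a manual decision.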
Note on simultaneity of events
Events within 30 days are taken as simultaneous, e.g.
'local recurrence' followed by 'hepatic metastases' 15 days later is counted
as 'local + distant including liver'.
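The 30-day rule might be applied as in the sketch below, which gathers every site reported within the window of the first event. Counting day 30 itself as 'within 30 days' is an assumption, as are the function name and the site labels.

```python
from datetime import date

def first_recurrence_sites(events, window_days=30):
    """Return (date of first event, set of sites counted as simultaneous).

    events: non-empty list of (date, site) pairs in any order.
    """
    events = sorted(events)
    first = events[0][0]
    sites = {site for when, site in events if (when - first).days <= window_days}
    return first, sites
```

With a local recurrence on 1 January 1996 and liver metastases on 16 January 1996, the result is the set of both sites dated 1 January, which corresponds to 'local + distant including liver' in the example above.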
Note on coding of relapse sites
There are up to 65 logical possibilities,
depending on coding in the data presented. These have been collapsed to
the 12 categories listed in the current form definition.
[End of document, updated to 29 November 2000]