Section III: Detailed Analysis

Hierarchical cluster analysis of diagnostic cases using all genes that passed the variation filter
Two-dimensional hierarchical clustering was performed using the Pearson correlation coefficient as the similarity measure and the unweighted pair group method with arithmetic averages (UPGMA), as implemented in GeneMaths, version 1.5. The results of hierarchical clustering of the 327 diagnostic samples using genes selected by a variety of metrics are shown below.
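As an illustration only, the clustering step described above can be sketched in plain Python. This is not the GeneMaths implementation; the function names `pearson_distance` and `upgma` and the toy profiles are assumptions introduced here, and the distance is taken to be 1 minus the Pearson correlation. The sketch clusters one axis (for example, samples); running it on the transposed matrix would cluster the other axis.

```python
from math import sqrt

def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (otherwise the correlation is undefined).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def upgma(profiles):
    """Average-linkage (UPGMA) clustering of a list of profiles.

    Returns the merges in order as (members_a, members_b, distance),
    where members are tuples of indices into `profiles`.
    """
    clusters = [(i,) for i in range(len(profiles))]
    dist = {}
    for a in range(len(profiles)):
        for b in range(a + 1, len(profiles)):
            dist[frozenset(((a,), (b,)))] = pearson_distance(profiles[a], profiles[b])
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        pair = min(
            (frozenset((p, q)) for i, p in enumerate(clusters) for q in clusters[i + 1:]),
            key=dist.get,
        )
        p, q = sorted(pair)
        d = dist[pair]
        merged = p + q
        clusters.remove(p)
        clusters.remove(q)
        for other in clusters:
            # UPGMA update: size-weighted mean of the two previous distances.
            dist[frozenset((merged, other))] = (
                len(p) * dist[frozenset((p, other))]
                + len(q) * dist[frozenset((q, other))]
            ) / len(merged)
        clusters.append(merged)
        merges.append((p, q, d))
    return merges

# Toy data (hypothetical): profiles 0 and 1 rise together, 2 and 3 fall together.
toy_profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [8, 6, 4, 1]]
toy_merges = upgma(toy_profiles)
```

With the toy data, the two rising profiles merge first, the two falling profiles merge second, and the resulting pair of clusters is joined last at a distance close to 2, reflecting their strong negative correlation.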

Methods for gene selection
Discriminating genes for the various leukemia subtypes were selected using a variety of statistical metrics. The individual metrics used and the lists of selected probe sets and corresponding genes are given below. Genes selected by Chi-square, T-statistics, Wilkins', and CFS were chosen using the decision tree format shown in Figure 19 below. In this process, genes were selected that distinguished a class from all classes listed below it in the decision tree structure. For SOM/DAV, genes were selected that distinguished the class from all others. The degree of overlap between the lists of genes selected for each genetic subtype by the various metrics is discussed below.

Chi-square
The Chi-square method evaluates each gene individually by measuring its Chi-square statistic with respect to the classes. The method first discretizes the observed expression values of the gene into several intervals using an entropy-based discretization method. The Chi-square statistic of a gene is then calculated as $\chi^2 = \sum_{i=1}^{m} \sum_{j=1}^{k} (A_{ij} - E_{ij})^2 / E_{ij}$, summing over intervals $i = 1..m$ and classes $j = 1..k$. $A_{ij}$ is the number of samples in the $i$th interval that are of the $j$th class. $E_{ij}$ is the expected frequency of $A_{ij}$ and is calculated as $E_{ij} = R_i C_j / N$, where $R_i$ is the number of samples in the $i$th interval, $C_j$ is the number of samples in the $j$th class, and $N$ is the total number of samples. The genes are then sorted according to their Chi-square statistics: the larger the Chi-square statistic, the more important the gene. The 40 genes with the highest Chi-square statistics in each subtype are listed in Table 11. Generally, using anywhere from the top 20 to 40 genes did not result in significant differences in subtype prediction accuracy. Therefore, we used only the top 20 genes in subtype prediction, unless noted otherwise.
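The statistic defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the expression values have already been discretized into interval labels (the entropy-based discretization step is not shown), and the names `chi_square_statistic`, `rank_genes`, and the toy inputs are hypothetical.

```python
from collections import Counter

def chi_square_statistic(intervals, classes):
    """Chi-square statistic of one gene.

    intervals[s] is the (already discretized) expression interval of sample s,
    classes[s] is the class label of sample s.
    """
    if len(intervals) != len(classes):
        raise ValueError("intervals and classes must have the same length")
    n = len(classes)                              # N
    observed = Counter(zip(intervals, classes))   # A_ij
    row_totals = Counter(intervals)               # R_i
    col_totals = Counter(classes)                 # C_j
    chi2 = 0.0
    for i, r_i in row_totals.items():
        for j, c_j in col_totals.items():
            expected = r_i * c_j / n              # E_ij = R_i * C_j / N
            chi2 += (observed[(i, j)] - expected) ** 2 / expected
    return chi2

def rank_genes(discretized, classes, top=20):
    """Return the `top` gene identifiers with the largest Chi-square statistic."""
    scores = {g: chi_square_statistic(iv, classes) for g, iv in discretized.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]

# Toy example (hypothetical data): four samples, two classes.
labels = ["A", "A", "B", "B"]
genes = {
    "informative": ["low", "low", "high", "high"],    # separates A from B
    "uninformative": ["low", "high", "low", "high"],  # independent of class
}
ranking = rank_genes(genes, labels, top=1)
```

In the toy example, the gene whose intervals coincide with the class labels scores 4.0 and the gene whose intervals are independent of the class scores 0.0, so the first is ranked on top.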


Table 11. Genes selected by Chi-square


No. | Affymetrix number | Gene name | Gene symbol | Reference number | Chi-square value | Above/below mean


BCR-ABL

1 | 1637_at | mitogen-activated protein kinase-activated protein kinase 3 | MAPKAPK3 | U09578 | 62.75 | Above
2 | 36650_at | cyclin D2 | CCND2 | D13639 | 59.79 | Above
3 | 40196_at | HYA22 protein | HYA22 | D88153 | 54.79 | Above
4 | 1635_at | proto-oncogene tyrosine-protein kinase ABL gene | ABL | U07563 | 54.77 | Above
5 | 33775_s_at | caspase 8 apoptosis-related cysteine protease | CASP8 | X98176 | 49.70 | Above
6 | 1636_g_at | proto-oncogene tyrosine-protein kinase ABL gene | ABL | U07563 | 48.29 | Above
7 | 41295_at | GTT1 protein | GTT1 | AL041780 | 42.60 | Above
8 | 37600_at | extracellular matrix protein 1 | ECM1 | U68186 | 42.60 | Above
9 | 37012_at | capping protein actin filament muscle Z-line beta | CAPZB | U03271 | 38.46 | Above
10 | 39225_at | alkylglycerone phosphate synthase | AGPS | Y09443 | 38.46 | Above
11 | 1326_at | caspase 10 apoptosis-related cysteine protease | CASP10 | U60519 | 37.83 | Above
12 | 34362_at | solute carrier family 2 facilitated glucose transporter member 5 | SLC2A5 | M55531 | 37.54 | Above
13 | 33150_at | disrupter of silencing 10 | SAS10 | AI126004 | 36.95 | Above
14 | 40051_at | TRAM-like protein | KIAA0057 | D31762 | 36.95 | Above
15 | 39061_at | bone marrow stromal cell antigen 2 | BST2 | D28137 | 36.95 | Above
16 | 33172_at | hypothetical protein FLJ10849 | FLJ10849 | T75292 | 36.95 | Above
17 | 37399_at | aldo-keto reductase family 1 member C3 3-alpha hydroxysteroid dehydrogenase type II | AKR1C3 | D17793 | 36.95 | Above
18 | 317_at | protease cysteine 1 legumain | PRSC1 | D55696 | 36.95 | Above
19 | 40953_at | calponin 3 acidic | CNN3 | S80562 | 33.94 | Above
20 | 330_s_at | tubulin, alpha 1, isoform 44 | TUBA1 | HG2259-HT2348 | 33.32 | Above
21 | 40504_at | paraoxonase 2 | PON2 | AF001601 | 31.46 | Above
22 | 38578_at | tumor necrosis factor receptor superfamily member 7 | TNFRSF7 | M63928 | 30.47 | Above
23 | 39044_s_at | diacylglycerol kinase delta 130kD | DGKD | D73409 | 29.59 | Below
24 | 36634_at | BTG family member 2 | BTG2 | U72649 | 29.16 | Below
25 | 38119_at | glycophorin C Gerbich blood group | GYPC | X12496 | 29.16 | Above
26 | 32562_at | endoglin Osler-Rendu-Weber syndrome 1 | ENG | X72012 | 27.96 | Above
27 | 33228_g_at | interleukin 10 receptor beta | IL10RB | AI984234 | 27.70 | Below
28 | 37006_at | step II splicing factor SLU7 | SLU7 | AI660656 | 27.15 | Above
29 | 38641_at | Homo sapiens mRNA for TSC-22-like protein | | AJ133115 | 27.15 | Above
30 | 38220_at | dihydropyrimidine dehydrogenase | DPYD | U20938 | 27.15 | Above
31 | 1211_s_at | CASP2 and RIPK1 domain containing adaptor with death domain | CRADD | U84388 | 26.46 | Above
32 | 39730_at | v-abl Abelson murine leukemia viral oncogene homolog 1 | ABL1 | X16416 | 25.90 | Above
33 | 36591_at | tubulin alpha 1 testis specific | TUBA1 | X06956 | 25.90 | Above
34 | 36035_at | anchor attachment protein 1 Gaa1p yeast homolog | GPAA1 | AB002135 | 25.34 | Above
35 | 980_at | Niemann-Pick disease type C1 | NPC1 | AF002020 | 25.29 | Above
36 | 671_at | secreted protein acidic cysteine-rich osteonectin | SPARC | J03040 | 25.29 | Above
37 | 40698_at | C-type calcium dependent carbohydrate-recognition domain lectin superfamily member 2 activation-induced | CLECSF2 | X96719 | 23.80 | Above
38 | 39330_s_at | actinin alpha 1 | ACTN1 | M95178 | 23.70 | Above
39 | 1983_at | cyclin D2 | CCND2 | X68452 | 23.70 | Above
40 | 2001_g_at | ataxia telangiectasia mutated | ATM | U26455 | 22.60 | Above

         
 

E2A-PBX1

1 | 41146_at | ADP-ribosyltransferase NAD poly ADP-ribose polymerase | ADPRT | J03473 | 187.00 | Above
2 | 1287_at | ADP-ribosyltransferase NAD poly ADP-ribose polymerase | ADPRT | J03473 | 187.00 | Above
3 | 32063_at | pre-B-cell leukemia transcription factor 1 | PBX1 | M86546 | 187.00 | Above
4 | 33355_at | Homo sapiens cDNA FLJ12900 fis clone NT2RP2004321 (by CELERA search of target sequence = PBX1) | PBX1 | AL049381 | 187.00 | Above
5 | 430_at | nucleoside phosphorylase | NP | X00737 | 187.00 | Above
6 | 40454_at | FAT tumor suppressor Drosophila homolog | FAT | X87241 | 176.11 | Above
7 | 753_at | nidogen 2 | NID2 | D86425 | 164.28 | Above
8 | 33821_at | Human DNA sequence from clone RP3-483K16 on chromosome 6p12.1-21.1 | HELO1 | AL034374 | 155.00 | Above
9 | 39614_at | KIAA0802 protein | KIAA0802 | AB018345 | 153.46 | Above
10 | 38340_at | huntingtin interacting protein-1-related | KIAA0655 | AB014555 | 143.85 | Above
11 | 1786_at | c-mer proto-oncogene tyrosine kinase | MERTK | U08023 | 142.34 | Above
12 | 39929_at | KIAA0922 protein | KIAA0922 | AB023139 | 139.97 | Above
13 | 39379_at | Homo sapiens mRNA cDNA DKFZp586C1019 from clone DKFZp586C1019 | | AL049397 | 139.49 | Above
14 | 717_at | GS3955 protein | GS3955 | D87119 | 135.24 | Above
15 | 362_at | protein kinase C zeta | PRKCZ | Z15108 | 131.36 | Above
16 | 33513_at | signaling lymphocytic activation molecule | SLAM | U33017 | 131.36 | Above
17 | 37225_at | KIAA0172 protein | KIAA0172 | D79994 | 131.36 | Above
18 | 854_at | B lymphoid tyrosine kinase | BLK | S76617 | 130.95 | Above
19 | 35974_at | lymphoid-restricted membrane protein | LRMP | U10485 | 123.33 | Above
20 | 36452_at | synaptopodin | KIAA1029 | AB028952 | 123.33 | Above
21 | 40648_at | c-mer proto-oncogene tyrosine kinase | MERTK | U08023 | 120.51 | Above
22 | 38393_at | KIAA0247 gene product | KIAA0247 | D87434 | 120.51 | Above
23 | 38994_at | STAT induced STAT inhibitor-2 | STATI2 | AF037989 | 118.58 | Below
24 | 34861_at | golgi autoantigen golgin subfamily a 3 | GOLGA3 | D63997 | 116.80 | Above
25 | 38748_at | adenosine deaminase RNA-specific B1 homolog of rat RED1 | ADARB1 | U76421 | 114.13 | Above
26 | 40113_at | GS3955 protein | GS3955 | D87119 | 114.13 | Above
27 | 36179_at | mitogen-activated protein kinase-activated protein kinase 2 | MAPKAPK2 | U12779 | 113.43 | Above
28 | 37493_at | colony stimulating factor 2 receptor beta low-affinity granulocyte-macrophage | CSF2RB | H04668 | 113.04 | Above
29 | 578_at | Human recombination activating protein (RAG2) gene | RAG2 | M94633 | 111.32 | Above
30 | 41017_at | myosin-binding protein H | MYBPH | U27266 | 109.73 | Above
31 | 37625_at | interferon regulatory factor 4 | IRF4 | U52682 | 108.51 | Above
32 | 38679_g_at | small nuclear ribonucleoprotein polypeptide E | SNRPE | AA733050 | 106.02 | Above
33 | 1389_at | membrane metallo-endopeptidase neutral endopeptidase enkephalinase CALLA CD10 | MME | J03779 | 105.65 | Below
34 | 34783_s_at | BUB3 budding uninhibited by benzimidazoles 3 yeast homolog | BUB3 | AF047473 | 103.87 | Above
35 | 36959_at | ubiquitin-conjugating enzyme E2 variant 1 | UBE2V1 | U49278 | 103.87 | Above
36 | 39864_at | cold inducible RNA-binding protein | CIRBP | D78134 | 99.76 | Below
37 | 41862_at | KIAA0056 protein | KIAA0056 | D29954 | 99.76 | Above
38 | 41425_at | Friend leukemia virus integration 1 | FLI1 | M98833 | 96.47 | Above
39 | 37177_at | CD58 antigen lymphocyte function-associated antigen 3 | CD58 | Y00636 | 93.84 | Above
40 | 37485_at | fatty-acid-Coenzyme A ligase very long-chain 1 | FACVL1 | D88308 | 93.17 | Above

         
 

Hyperdiploid >50

1 | 36620_at | superoxide dismutase 1 soluble amyotrophic lateral sclerosis 1 adult | SOD1 | X02317 | 52.43 | Above
2 | 37350_at | Human DNA sequence from clone 889N15 on chromosome Xq22.1-22.3. | PSMD10 | AL031177 | 48.71 | Above
3 | 171_at | von Hippel-Lindau binding protein 1 | VBP1 | U56833 | 45.80 | Above
4 | 37677_at | phosphoglycerate kinase 1 | PGK1 | V00572 | 45.80 | Above
5 | 41724_at | accessory proteins BAP31/BAP29 | DXS1357E | X81109 | 45.58 | Above
6 | 32207_at | membrane protein palmitoylated 1 55kD | MPP1 | M64925 | 44.07 | Above
7 | 38738_at | SMT3 suppressor of mif two 3 yeast homolog 1 | SMT3H1 | X99584 | 43.57 | Above
8 | 40480_s_at | FYN oncogene related to SRC FGR YES | FYN | M14333 | 43.57 | Above
9 | 38518_at | sex comb on midleg Drosophila like 2 | SCML2 | Y18004 | 43.20 | Above
10 | 41132_r_at | heterogeneous nuclear ribonucleoprotein H2 H | HNRPH2 | U01923 | 43.15 | Above
11 | 31492_at | muscle specific gene | M9 | AB019392 | 43.01 | Below
12 | 38317_at | transcription elongation factor A SII like 1 | TCEAL1 | M99701 | 41.10 | Above
13 | 40998_at | trinucleotide repeat containing 11 THR-associated protein 230 kDa subunit | TNRC11 | AF071309 | 40.88 | Above
14 | 35688_g_at | mature T-cell proliferation 1 | MTCP1 | Z24459 | 40.52 | Above
15 | 40903_at | ATPase H transporting lysosomal vacuolar proton pump membrane sector associated protein M8-9 | APT6M8-9 | AL049929 | 40.33 | Above
16 | 36489_at | phosphoribosyl pyrophosphate synthetase 1 | PRPS1 | D00860 | 40.33 | Above
17 | 1520_s_at | interleukin 1 beta | IL1B | X04500 | 40.29 | Above
18 | 35939_s_at | POU domain class 4 transcription factor 1 | POU4F1 | L20433 | 38.74 | Above
19 | 38604_at | neuropeptide Y | NPY | AI198311 | 38.26 | Above
20 | 31863_at | KIAA0179 protein | KIAA0179 | D80001 | 38.26 | Above
21 | 890_at | ubiquitin-conjugating enzyme E2A RAD6 homolog | UBE2A | M74524 | 37.99 | Above
22 | 39402_at | interleukin 1 beta | IL1B | M15330 | 37.92 | Above
23 | 41490_at | phosphoribosyl pyrophosphate synthetase 2 | PRPS2 | Y00971 | 37.72 | Above
24 | 34753_at | synaptobrevin-like 1 | SYBL1 | X92396 | 37.72 | Above
25 | 40891_f_at | DNA segment on chromosome X unique 9879 expressed sequence | DXS9879E | X92896 | 37.15 | Above
26 | 306_s_at | high-mobility group nonhistone chromosomal protein 14 | HMG14 | J02621 | 37.15 | Above
27 | 37640_at | hypoxanthine phosphoribosyltransferase 1 Lesch-Nyhan syndrome | HPRT1 | M31642 | 37.15 | Above
28 | 34829_at | dyskeratosis congenita 1 dyskerin | DKC1 | U59151 | 36.48 | Above
29 | 36169_at | NADH dehydrogenase ubiquinone 1 alpha subcomplex 1 7.5kD MWFE | NDUFA1 | N47307 | 36.48 | Above
30 | 38968_at | SH3-domain binding protein 5 BTK-associated | SH3BP5 | AB005047 | 35.95 | Above
31 | 36128_at | transmembrane trafficking protein | TMP21 | L40397 | 35.88 | Above

32

37014_at

myxovirus influenza resistance 1 homolog of murine interfer