
Table 1 Term selection by filtering

From: Mining microarray expression data by literature profiling

 

Occurrence in abstracts (%)

Term            Baseline    AK3     H2A     IRF7    ISG15
A                 92.6      85.7    84.6    100     100
Active             5.9      14.3     7.7     28.6     0
Cell-free          0.6       0       0        0       0
Histone            1.4       0      92.3      0       0
Infected           1.2       0       0       28.7    28.7
Interestingly      2.7       0       0       14.3     0
Interferon         1.1       0       0       78.6    71.4
Levels            13.9       0       7.7      7.1    42.9
Protein           41.2      28.6    38.5     57.1   100
Signaling          5.3       0       0        7.1     0
  1. This sample was extracted from a table (see Additional data files) containing occurrence values for nearly 25,000 terms for each of the 50 genes used in our example. It illustrates how successive rounds of filtering select terms. Baseline occurrence levels are calculated by averaging the occurrence values determined for 250 randomly chosen genes. The occurrence values for four of the genes included in the analysis are shown here: H2A (histone 2A), IRF7 (interferon regulatory factor 7), AK3 (adenylate kinase 3) and ISG15 (interferon-stimulated protein, 15 kDa). The first filter removes terms with high baseline occurrence levels (shown in italics). The second filter selects terms whose occurrence values exceed baseline by at least 25% (bold). Only terms meeting this criterion for at least two genes (in this case 'interferon' and 'infected') are retained.
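
The filtering rounds described in this footnote can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation. The occurrence values are copied from the table, but the names `select_terms`, `MAX_BASELINE`, `MIN_EXCESS` and `MIN_GENES` are hypothetical, and the baseline cutoff of 10 is an assumed value, since the footnote does not state the threshold used for "high baseline". The sketch also assumes that "over baseline by at least 25%" means an absolute difference of 25 percentage points, which is the reading under which only 'interferon' and 'infected' survive for these four genes.

```python
# Occurrence values (% of abstracts) copied from Table 1.
GENES = ["AK3", "H2A", "IRF7", "ISG15"]
TABLE = {
    # term: (baseline, [AK3, H2A, IRF7, ISG15])
    "A": (92.6, [85.7, 84.6, 100, 100]),
    "Active": (5.9, [14.3, 7.7, 28.6, 0]),
    "Cell-free": (0.6, [0, 0, 0, 0]),
    "Histone": (1.4, [0, 92.3, 0, 0]),
    "Infected": (1.2, [0, 0, 28.7, 28.7]),
    "Interestingly": (2.7, [0, 0, 14.3, 0]),
    "Interferon": (1.1, [0, 0, 78.6, 71.4]),
    "Levels": (13.9, [0, 7.7, 7.1, 42.9]),
    "Protein": (41.2, [28.6, 38.5, 57.1, 100]),
    "Signaling": (5.3, [0, 0, 7.1, 0]),
}

MAX_BASELINE = 10.0  # ASSUMPTION: cutoff for "high baseline" is not given in the source
MIN_EXCESS = 25.0    # ASSUMPTION: "25%" read as 25 percentage points over baseline
MIN_GENES = 2        # from the footnote: criterion must hold for at least two genes


def select_terms(table, max_baseline=MAX_BASELINE,
                 min_excess=MIN_EXCESS, min_genes=MIN_GENES):
    """Return {term: [genes]} for terms that pass all filtering rounds."""
    retained = {}
    for term, (baseline, values) in table.items():
        # First filter: drop terms that are common in the literature at large.
        if baseline > max_baseline:
            continue
        # Second filter: genes whose occurrence exceeds baseline by the margin.
        hits = [gene for gene, value in zip(GENES, values)
                if value - baseline >= min_excess]
        # Final criterion: keep the term only if enough genes pass.
        if len(hits) >= min_genes:
            retained[term] = hits
    return retained


if __name__ == "__main__":
    for term, genes in select_terms(TABLE).items():
        print(term, genes)
```

With these assumed thresholds the sketch keeps 'Infected' and 'Interferon', each supported by IRF7 and ISG15. 'Histone' passes the second filter for H2A only and is therefore dropped by the two-gene requirement.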