American Journal of Epidemiology
Published by the Johns Hopkins Bloomberg School of Public Health 2007.
Vol. 166, No. 11
DOI: 10.1093/aje/kwm212
Advance Access publication August 28, 2007
Practice of Epidemiology
When Epidemiology Meets the Internet: Web-based Surveys in the Millennium
Cohort Study
Besa Smith1, Tyler C. Smith1, Gregory C. Gray2, and Margaret A. K. Ryan1 for the Millennium
Cohort Study Team
1 Department of Defense Center for Deployment Health Research, Naval Health Research Center, San Diego, CA.
2 College of Public Health, University of Iowa, Iowa City, IA.
Received for publication March 1, 2007; accepted for publication June 19, 2007.
Almost 60% of American households were connected to the Internet in 2001, when the Millennium Cohort Study,
the largest longitudinal study ever undertaken by the Department of Defense, was launched. To facilitate survey
completion, increase data integrity, and encourage cohort retention while maintaining the highest standards of
participant privacy, an online questionnaire was made available on the World Wide Web in addition to a traditional
paper questionnaire sent via US mail. Over 50% of 77,047 participants chose to enroll in the study via the Web,
affording substantial cost savings to the project. Using multivariable logistic regression, the authors compared the
demographic and health characteristics of Web responders with those of paper responders. Web responders were
slightly more likely to be male, to be younger, to have a high school diploma or college degree, and to work in
information technology or another technical occupation. Web responders were more likely to be obese and to
smoke more cigarettes and were less likely to be problem alcohol drinkers and to report occupational exposures.
Question completion rates were 98.3%, on average, for both Web and paper responders. Web responders provided more complete contact information, including their e-mail addresses. These results demonstrate the value of
survey research conducted over the Internet in concert with traditional mail survey strategies.
data collection; epidemiologic methods; health surveys; Internet; longitudinal studies; military personnel;
questionnaires
Abbreviations: CI, confidence interval; OR, odds ratio.
Researchers interested in data collection through survey
implementation have exciting new and unique opportunities
to conduct research using the World Wide Web. This new
medium for data collection has become available through an
explosion in both the growth of the Web and the use of
personal computers. Almost 60 percent of US households
were connected to the Internet in 2001 (1), with this number
jumping to 75 percent by 2004 (2). In light of this growth, it
is evident that researchers will begin to use and investigate
the Web as a tool, learning the benefits as well as pitfalls of
using the Web for data collection. It is highly anticipated
that these methods will become a more viable way of implementing future surveys, providing a number of advantages over traditional methods, including convenience for
the participant, potentially large cost savings for the researcher, efficiency in data collection, higher data quality,
a degree of perceived anonymity for the participant, and the
ability to increase response rates (3–9).
A cautious tone is apparent in the literature, however,
when addressing survey implementation on the Web. There
are many unknowns regarding survey construction, implementation, and Web security, prompting some authors to
express concerns associated with Web-based surveys (3, 6,
8, 10). These concerns include sampling problems, lack of
Correspondence to Dr. Besa Smith, Department of Defense Center for Deployment Health Research, Naval Health Research Center,
P.O. Box 85122, San Diego, CA 92186-5122 (e-mail: Besa.Smith@med.navy.mil).
Am J Epidemiol 2007;166:1345–1354
participant access to computers with Internet connections,
Internet privacy concerns, and response inconsistency
across different media. Still, others maintain that the bias
associated with collecting information over the Web is no
greater than that introduced by traditional paper methods
(11). More research is necessary to better quantify the risks
and benefits of conducting survey research over the Internet.
In 2001, the largest Department of Defense longitudinal
study to date was launched. The Millennium Cohort Study,
a prospective study of more than 100,000 military personnel, will survey participants every 3 years over a 21-year
period (12, 13). Originally, the cohort was to receive the
survey in traditional paper form via the US Postal Service.
However, the initial steering committee and focus groups
encouraged investigators to additionally offer submission
of the survey via the Web. A mixed-mode strategy was developed to facilitate survey completion and increase data
integrity, while continuing to maintain the highest level of
participant privacy.
The combination of Web and paper data submission
presented an opportunity to investigate and compare demographic, occupational, and self-reported exposure and health
characteristics between participants choosing Web survey
submission and those choosing paper survey submission.
Our objective in this analysis was to evaluate the use of a
mixed-mode survey approach in a large 21-year prospective
study. We tested the hypothesis that there were no differences in demographic, military, and occupational characteristics or in self-reported behavioral, health, and exposure
characteristics between Web and paper responders.
MATERIALS AND METHODS
Study population
The methods used for the Millennium Cohort Study have
been described in detail elsewhere (13). In brief, persons
invited to participate in the Millennium Cohort Study were
derived from a stratified random sample of US military
personnel representing approximately 11.3 percent of the
2.2 million men and women on active service rosters as
of October 1, 2000. To ensure adequate power in small
subgroups, Reserve and National Guard members, women,
and past deployers were oversampled. Using a modified
Dillman approach (14, 15), Web and US mail-based enrollment began in July 2001. All invitees were contacted by
traditional US mail with postcard reminders and survey instruments. For those from whom an e-mail address could be
obtained, an e-mail invitation was also sent. Participants
were allowed to submit a survey by paper or online via
the Web site, according to their preference. All correspondence, paper or electronic, invited potential respondents to
complete the survey via the Web site. Because of the potential to increase data quality and reduce study costs, a nominal
cost-savings initiative was offered to persons who completed the survey online. Enrollment ended on June 30,
2003, with 77,047 consenting participants (a 37 percent
response rate). Previous analyses have demonstrated that
Millennium Cohort Study participants well represent the
US military, as measured by demographic and health characteristics and reliable health and exposure reporting
(13, 16–19).
Demographic, occupational, and military-specific data
included date of birth, marital status, gender, race/ethnicity,
occupation, service branch (Army, Navy, Coast Guard, Air
Force, or Marine Corps), service component (active duty or
Reserve/National Guard), highest educational level, pay
grade, and past deployment to Southwest Asia, Bosnia, or
Kosovo (between January 1, 1998 and September 1, 2000).
For this study, missing demographic data for marital status,
occupation, education, and pay grade were supplemented
with self-reported data from the survey when possible.
Survey instrument
The Millennium Cohort Study questionnaire consists of
more than 450 questions regarding diagnosed medical conditions, reported symptoms, psychosocial history, physical
status, functional status, problem alcohol use, tobacco use,
occupation, and basic demographic and contact data (12, 13).
Contact information was obtained to assist in participant
tracking in this 21-year longitudinal effort. Standardized
scoring instruments were used for established reliability
and validity and for the ability to compare results with those
from other populations. They included the PRIME-MD Patient Health Questionnaire (20–22), the Medical Outcomes
Study 36-Item Short Form Health Survey for Veterans (23),
a Department of Veterans Affairs Gulf War survey (24, 25),
the Posttraumatic Stress Disorder Symptom Checklist–
Civilian Version (26, 27), and the CAGE questionnaire for
the detection of problem drinking behavior (28). Additionally, free-text fields were available to allow participants to
report conditions, problems, concerns, and exposures not
listed on the survey. The paper survey was created, scanned,
and verified using mark-sense TeleForm Elite software
(Cardiff Software, Inc., Vista, California).
Web site
The Web site was originally designed as a place for
participants to learn more about the study and to obtain
research findings as they became available. Informational
Web pages, including useful links, study contact information, documentation of annual study review processes, and
displays of signed endorsements from leaders in the military
community were all important components included to establish a personal relationship with each and every participant and to emphasize the legitimacy and need for the study.
During pilot testing, however, the importance of including
the opportunity to submit the questionnaire via the Internet
became apparent.
Web metrics
Web-based submission also presented us with information not available in traditional paper survey research. Time
stamps allowed establishment of the exact dates and times at
which participants began and completed their surveys; therefore, the average time to completion of the questionnaire
could be calculated in comparison with estimates initially
made on the basis of focus groups. Time stamps were also
logged to reflect time of modification for persons who elected
to save their responses and return later to complete the survey.
Additionally, Web metrics allowed for the exploration of individual page submissions to identify lengthy or difficult-to-answer questions for future survey redesign. Finally, by
evaluating time stamps for days of the week and times of
the day in which participants most frequently submitted their
online surveys, we were able to develop optimal mailing
schedules to disseminate invitations.
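The time-stamp calculations described above can be sketched in Python. This is an illustrative sketch only, not the study's processing code: the session records, the function names, and the restriction to sessions completed on the day of log-on are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime
from statistics import median, quantiles

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical (log-on, submission) time stamps for four Web sessions.
SESSIONS = [
    (datetime(2001, 7, 9, 19, 0), datetime(2001, 7, 9, 19, 24)),
    (datetime(2001, 7, 10, 12, 5), datetime(2001, 7, 10, 12, 35)),
    (datetime(2001, 7, 14, 8, 30), datetime(2001, 7, 14, 9, 10)),
    (datetime(2001, 7, 15, 21, 0), datetime(2001, 7, 18, 21, 20)),  # saved, finished later
]


def same_day_minutes(sessions):
    """Minutes from log-on to submission, for sessions finished the same day."""
    return [
        (end - start).total_seconds() / 60
        for start, end in sessions
        if start.date() == end.date()
    ]


def completion_summary(sessions):
    """Return (median, interquartile range) of same-day completion minutes."""
    minutes = same_day_minutes(sessions)
    q1, _, q3 = quantiles(minutes, n=4)
    return median(minutes), q3 - q1


def submissions_by_weekday(sessions):
    """Count submissions by day of the week, to inform mailing schedules."""
    return Counter(DAY_NAMES[end.weekday()] for _, end in sessions)
```

Calling `completion_summary(SESSIONS)` summarizes the three same-day sessions, and `submissions_by_weekday(SESSIONS)` tallies all four submissions by weekday.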
Statistical analyses of Web responders versus paper
responders
Initial investigation of population characteristics included
univariate analyses with chi-square tests of association to
assess significant differences in the demographic, military,
behavioral, and health characteristics of persons submitting
the survey via the Web when compared with those submitting
paper surveys. We conducted a multivariable exploratory
model analysis to assess multicollinearity, significant associations, and possible confounding, while simultaneously
adjusting for all other covariates in the model. We used multivariable logistic regression to compare the differences in
adjusted odds of Web submission while controlling for possible confounders.
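As a rough illustration of the univariate step, the following Python sketch computes a crude (unadjusted) odds ratio with a Wald 95 percent confidence interval and a Pearson chi-square statistic for one 2 x 2 table, using the sex-by-mode counts reported in table 1. It is not the authors' SAS code, the function names are hypothetical, and the crude estimate (about 1.39) is not expected to equal the adjusted odds ratio from the multivariable model.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval for a 2 x 2 table.

    a, b: group 1 with and without the outcome
    c, d: group 2 (reference) with and without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def pearson_chi_square(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for a 2 x 2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts from table 1: men (Web, paper) and women (Web, paper).
MEN_WEB, MEN_PAPER = 32_076, 24_283
WOMEN_WEB, WOMEN_PAPER = 10_051, 10_550

crude_or, ci_low, ci_high = odds_ratio_wald(MEN_WEB, MEN_PAPER, WOMEN_WEB, WOMEN_PAPER)
chi2 = pearson_chi_square(MEN_WEB, MEN_PAPER, WOMEN_WEB, WOMEN_PAPER)
print(f"crude OR = {crude_or:.2f} (95% CI: {ci_low:.2f}, {ci_high:.2f}); chi-square = {chi2:.1f}")
```

The adjusted estimates in the tables additionally condition on the other covariates, which a single 2 x 2 table cannot do.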
A graphical investigation of question completion rates
was conducted for assessment of differences in mode of
survey submission and differences in fatigue between the
Web and paper surveys. All analyses were completed using
SAS software (version 9.1; SAS Institute, Inc., Cary, North
Carolina).
This research was approved by the institutional review
board at the Naval Health Research Center and was conducted in compliance with all applicable federal regulations
governing the protection of human subjects in research.
RESULTS
Of the 77,047 persons who responded during 2001–2003,
42,168 (54.7 percent) completed the questionnaire online,
while 34,879 (45.3 percent) completed a paper questionnaire. Demographic data were missing for 87 persons, leaving 76,960 (99.9 percent) for further statistical analyses.
Table 1 shows proportional differences by method of
response, as well as adjusted odds ratios and 95 percent
confidence intervals calculated by multivariable logistic regression. After adjustment for age, education, marital status,
race/ethnicity, past deployment history, military rank, service component, service branch, and occupation, men were
1.4 times more likely to complete the survey online than
were women (odds ratio (OR) = 1.37, 95 percent confidence
interval (CI): 1.32, 1.42). Participants in middle age ranges
and middle educational levels were more likely to fill out the
questionnaire on the Web. Web responders were also more
likely to be married (OR = 1.18, 95 percent CI: 1.14, 1.22),
to be on active duty (OR = 1.73, 95 percent CI: 1.67, 1.79),
and to have an occupation related to information technology
or another technical specialty. Finally, Army (OR = 1.39, 95
percent CI: 1.33, 1.45) and Air Force (OR = 1.39, 95 percent CI: 1.33, 1.46) personnel were individually 1.4 times
more likely to respond using the Web than were Navy and
Coast Guard personnel.
Potentially traumatic exposures, including knowledge of
or witnessing physical abuse, dead and/or decomposing
bodies, prisoners of war/refugees, chemical or biological
warfare agents, other medical countermeasures for exposure
to a chemical or biological warfare agent, and alarms necessitating the wearing of protective gear, were self-reported
significantly less often by Web responders than by paper
responders, after adjustment for sex, age, education, marital
status, race/ethnicity, deployment status, military pay grade,
service component, service branch, and occupation (table 2).
Only anthrax vaccination was reported more often among
Web responders (OR = 1.05, 95 percent CI: 1.02, 1.09).
Web responders also self-reported less exposure to occupational hazards requiring protective equipment and
less routine contact with paint/solvents, microwaves, and
pesticides.
Self-reported behavioral health characteristics are displayed in table 3 by mode of survey response. After adjustment for sex, age, education, marital status, race/ethnicity,
deployment history, military rank, service component, service branch, and occupation, Web responders were more
likely to self-report their health as fair, good, or very good,
were 1.2 times more likely to be overweight (OR = 1.17, 95
percent CI: 1.13, 1.21) or obese (OR = 1.20, 95 percent CI:
1.14, 1.26), and were less likely to be classified as problem
drinkers (OR = 0.81, 95 percent CI: 0.78, 0.84). Web responders were less likely to report smoking cigars (OR =
0.88, 95 percent CI: 0.85, 0.92) but slightly more likely to
report smoking cigarettes (OR = 1.08, 95 percent CI: 1.04,
1.10). Among participants who reported smoking at least
100 cigarettes in their lifetime, Web responders also had
a slightly higher mean number of pack-years than paper
responders (5.1 and 4.9, respectively; p = 0.04).
Figure 1 shows the overall completion percentages for
individual questions, which were found to be similar by
mode of response, although some questions were found to
have a higher percentage of completion by Web responders.
Of the 42,127 Web participants, the average completion
percentage for all survey questions was 98.3, with a maximum of 100 and a minimum of 87.6. Among Web responders, 87 percent provided a personal e-mail address, 68
percent provided a personal telephone number, and 51 percent provided a phone number for an alternative contact.
Of the 34,833 paper survey participants, the average completion percentage was 98.3, with a maximum of 99.7 and
a minimum of 79.0. Among paper responders, 32 percent
provided a personal e-mail address, 36 percent provided
a personal phone number, and 26 percent provided a phone
number for an alternative contact.
There were 48,958 persons who logged onto the Web
survey at some point during enrollment and, of those, 87.3
percent (n = 42,747) submitted their survey via the Web. Of
the 6,211 who logged on but did not submit a survey, 814
(13 percent) ultimately submitted a paper copy of the survey. The average number of days from initial Web log-on to
survey submission was 7 (standard deviation, 40.6); however, 89 percent of Web participants completed the survey
the same day they initially logged on. Among those who
submitted a survey the same day that they logged on, the
median amount of time it took to complete the survey was
29 minutes (interquartile range, 16.0).

TABLE 1. Characteristics of participants† by mode of survey response (World Wide Web or mailed
questionnaire), Millennium Cohort Study, 2001–2003

| Characteristic‡ | Web (n = 42,127): No. | Web: % | Paper (n = 34,833): No. | Paper: % | Adjusted odds ratio§ | 95% confidence interval |
|---|---|---|---|---|---|---|
| Sex | | | | | | |
| Female¶ | 10,051 | 23.9 | 10,550 | 30.3 | 1.00 | |
| Male | 32,076 | 76.1 | 24,283 | 69.7 | 1.37* | 1.32, 1.42 |
| Birth cohort | | | | | | |
| Pre-1960¶ | 7,217 | 17.1 | 9,420 | 27.0 | 1.00 | |
| 1960–1969 | 16,806 | 39.9 | 12,332 | 35.4 | 1.53* | 1.47, 1.60 |
| 1970–1979 | 15,834 | 37.6 | 10,817 | 31.1 | 1.66* | 1.58, 1.74 |
| 1980 or later | 2,270 | 5.4 | 2,264 | 6.5 | 1.37* | 1.27, 1.49 |
| Education | | | | | | |
| Master’s or doctoral degree¶ | 3,359 | 8.0 | 3,610 | 10.4 | 1.00 | |
| Bachelor’s degree | 6,695 | 15.9 | 6,002 | 17.2 | 1.18* | 1.10, 1.25 |
| Some college | 11,353 | 27.0 | 8,294 | 23.8 | 1.33* | 1.23, 1.43 |
| High school diploma/equivalent | 18,498 | 43.9 | 14,430 | 41.4 | 1.32* | 1.22, 1.43 |
| No high school diploma | 2,222 | 5.3 | 2,497 | 7.2 | 1.18* | 1.07, 1.30 |
| Marital status | | | | | | |
| Single¶ | 12,214 | 29.0 | 10,930 | 31.4 | 1.00 | |
| Married | 27,215 | 64.6 | 21,333 | 61.2 | 1.18* | 1.14, 1.22 |
| Divorced | 2,698 | 6.4 | 2,570 | 7.4 | 1.13* | 1.06, 1.21 |
| Race/ethnicity | | | | | | |
| White, non-Hispanic¶ | 29,047 | 70.0 | 24,520 | 70.4 | 1.00 | |
| Black, non-Hispanic | 5,782 | 13.7 | 4,817 | 13.8 | 0.96 | 0.92, 1.01 |
| Asian/Pacific Islander | 3,615 | 8.6 | 2,464 | 7.1 | 1.04 | 0.98, 1.11 |
| Hispanic | 2,687 | 6.4 | 2,258 | 6.5 | 0.96 | 0.90, 1.02 |
| Other | 996 | 2.4 | 774 | 2.2 | 1.00 | 0.91, 1.10 |
| Deployment history# | | | | | | |
| Non-deployed¶ | 28,011 | 66.5 | 25,724 | 73.9 | 1.00 | |
| Deployed | 14,116 | 33.5 | 9,109 | 26.2 | 1.05* | 1.01, 1.09 |

Table continues
DISCUSSION
Cohort study investigators often rely heavily on survey
data to evaluate exposures and outcomes at multiple points
in time. Sophisticated methods of manual and electronic
data collection in such studies continue to evolve. The dramatic advances in the development of Web-based data collection methods are an important part of this evolution. In
this study, the benefits and feasibility of employing both
Web- and paper-based data collection techniques were investigated. In this report, we have described the consistency
of question reporting as well as a lack of survey-length
fatigue bias among both paper and Web responders. Further
demographic, behavioral, and health characteristics that distinguished Web and paper responders in a large population-based cohort have been reported and discussed.
The hybrid data submission system, using both postal and
Internet capabilities, provided more opportunities for participants to take the survey and easier access for those who
may have been on deployment or stationed abroad. Offering
the survey both via the Web and on paper provided respondents with a choice as to how to participate, appealing both
to those who may have been concerned about informational
privacy on the Internet and to those concerned about sending
personal information by US mail. The majority of persons
who went to the survey on the Internet completed the survey
and submitted it online (87 percent). Web responders represented more than half of the cohort (55 percent) when enrollment closed. In the current study, the median amount of
time it took Web responders to complete the questionnaire
online (29 minutes) accurately reflected the 30-minute estimate included in the instructions of the questionnaire based
on focus group testing.

TABLE 1. Continued

| Characteristic‡ | Web (n = 42,127): No. | Web: % | Paper (n = 34,833): No. | Paper: % | Adjusted odds ratio§ | 95% confidence interval |
|---|---|---|---|---|---|---|
| Military rank | | | | | | |
| Enlisted¶ | 32,934 | 78.2 | 26,338 | 75.6 | 1.00 | |
| Officer | 9,193 | 21.8 | 8,495 | 24.4 | 1.06 | 1.00, 1.13 |
| Service component | | | | | | |
| Reserve/National Guard¶ | 15,140 | 35.9 | 17,971 | 51.6 | 1.00 | |
| Active duty | 26,987 | 64.1 | 16,862 | 48.4 | 1.73* | 1.67, 1.79 |
| Branch of service | | | | | | |
| Navy/Coast Guard¶ | 7,262 | 17.2 | 6,945 | 19.9 | 1.00 | |
| Army | 19,912 | 47.3 | 16,559 | 47.5 | 1.39* | 1.33, 1.45 |
| Air Force | 12,734 | 30.2 | 9,623 | 27.6 | 1.39* | 1.33, 1.46 |
| Marines | 2,219 | 5.3 | 1,706 | 4.9 | 1.11* | 1.03, 1.20 |
| Occupational category | | | | | | |
| Health care¶ | 3,821 | 9.1 | 4,176 | 12.0 | 1.00 | |
| Combat specialists | 8,378 | 19.9 | 7,031 | 20.2 | 1.00 | 0.95, 1.06 |
| Electronic equipment repair | 4,207 | 10.0 | 2,574 | 7.4 | 1.41* | 1.31, 1.51 |
| Communications/intelligence | 3,175 | 7.5 | 2,252 | 6.5 | 1.20* | 1.11, 1.29 |
| Other technical specialties | 1,123 | 2.7 | 849 | 2.4 | 1.16* | 1.05, 1.29 |
| Functional support specialists | 8,604 | 20.4 | 6,797 | 19.5 | 1.30* | 1.23, 1.38 |
| Electrical/mechanical repair | 6,354 | 15.1 | 5,028 | 14.4 | 1.04 | 0.97, 1.11 |
| Craft workers | 1,165 | 2.8 | 1,221 | 3.5 | 0.91 | 0.83, 1.00 |
| Service support | 3,564 | 8.5 | 3,120 | 9.0 | 1.07 | 1.00, 1.15 |
| Trainees, other | 1,736 | 4.1 | 1,785 | 5.1 | 1.04 | 0.96, 1.14 |

* p < 0.05.
† Only participants with complete data on covariates (99.6%) were used in the analyses.
‡ All univariate analyses based on Pearson chi-square statistics were statistically significant (p < 0.001).
§ Odds of Web response versus paper response based on multivariable logistic regression, adjusted for sex, age,
education, marital status, race/ethnicity, deployment status, military pay grade, service component, service branch,
and occupation.
¶ Reference category.
# Deployed to Southwest Asia, Bosnia, or Kosovo during 1998–2000.
There are benefits and drawbacks to both means of submission. Online survey submission included electronic consent as part of the Web site log-on process. This allowed for
easy documentation of consent and mitigated the need for
subsequent participant contact, as is necessary when obtaining missing consent forms for paper responders. Data quality can be enhanced by electronic skip patterns that
automatically skip irrelevant questions on the Web survey.
However, these techniques rely on more advanced computer
resources for the responder, and their absence may preclude
some persons from accessing the electronic questionnaire.
Paper surveys cannot prevent a respondent from entering
multiple responses to a single-response question. This can
actually provide researchers with unsolicited yet useful additional information. However, deciding on the most appropriate answer when a participant selects multiple answers
can be complex and burdensome, quite often resulting in
a loss of data. Certain programming techniques in Web-based questionnaires can permit only one answer to be selected, thereby preventing misinterpretation of multiple answers and reducing missing data. Electronic free-text fields
offer keystroke survey responses instead of the handwritten
responses given on paper surveys, which can be difficult to
decipher and costly to transfer to a database. Finally, though
one might hypothesize that individual question completion
rates and survey fatigue might differ by mode of response,
we found comparable results (figure 1).
TABLE 2. Self-reported exposures of participants† by mode of survey response (World Wide Web or mailed questionnaire),
Millennium Cohort Study, 2001–2003

| Exposure | Web (n = 42,127): No. | Web: % | Paper (n = 34,833): No. | Paper: % | Adjusted odds ratio‡ | 95% confidence interval |
|---|---|---|---|---|---|---|
| Marked “yes” for ever having been exposed to: | | | | | | |
| Witnessing a person’s death due to war, disaster, or tragic event | 10,612 | 25.5 | 8,985 | 26.0 | 0.97 | 0.94, 1.01 |
| Having knowledge of or witnessing instances of physical abuse (torture, beating, rape) | 8,054 | 19.4 | 7,096 | 20.5 | 0.96 | 0.93, 1.00 |
| Dead and/or decomposing bodies | 13,591 | 32.7 | 12,357 | 35.7 | 0.89* | 0.86, 0.92 |
| Maimed soldiers/civilians | 7,258 | 17.5 | 6,099 | 17.7 | 0.99 | 0.95, 1.03 |
| Prisoners of war/refugees | 4,690 | 11.3 | 3,951 | 11.4 | 0.91* | 0.87, 0.95 |
| Chemical or biological warfare agents | 2,184 | 5.3 | 1,988 | 5.8 | 0.93* | 0.87, 0.99 |
| Other medical countermeasures for exposure to a chemical or biological warfare agent | 2,642 | 6.4 | 2,414 | 7.0 | 0.86* | 0.81, 0.92 |
| Alarms necessitating the wearing of chemical/biological warfare protective gear | 6,600 | 15.9 | 5,748 | 16.7 | 0.88* | 0.85, 0.92 |
| Anthrax vaccine | 14,608 | 35.0 | 10,080 | 29.4 | 1.05* | 1.02, 1.09 |
| Marked “yes” for having been exposed within the past 3 years to: | | | | | | |
| Occupational hazards requiring protective equipment, such as respirators or hearing protection | 22,466 | 53.9 | 18,932 | 54.7 | 0.81* | 0.78, 0.84 |
| Routine skin contact with paints, solvents, and other similar substances | 11,149 | 26.9 | 9,958 | 28.8 | 0.81* | 0.78, 0.84 |
| Depleted uranium | 1,636 | 4.0 | 1,187 | 3.5 | 0.94 | 0.87, 1.02 |
| Microwaves (excluding small microwave ovens) | 7,132 | 17.2 | 7,107 | 20.6 | 0.73* | 0.70, 0.75 |
| Pesticides, including creams, sprays, or uniform treatments | 10,666 | 25.8 | 9,073 | 26.4 | 0.92* | 0.89, 0.95 |
| Pesticides applied in the environment or around living facilities | 11,370 | 27.5 | 10,349 | 30.0 | 0.87* | 0.84, 0.90 |
| Any exposure, physical or psychological, during a military deployment that had a significant impact on health | 2,771 | 6.7 | 2,406 | 7.1 | 0.90* | 0.85, 0.95 |

* p < 0.05.
† Only participants with complete data on covariates (99.6%) were used in the analyses. Analyses were based on different sample sizes
because of missing exposure data.
‡ Odds of Web response versus paper response based on multivariable logistic regression, adjusted for sex, age, education, marital status,
race/ethnicity, military pay grade, service component, service branch, occupation, and deployment to Southwest Asia, Bosnia, or Kosovo during
1998–2000.

The cost-effectiveness of Web submission can be measured in terms of data quality as well as direct financial
savings. In the Millennium Cohort Study, there were significant initial costs of establishing the Web capability, such as
purchasing identical servers to run simultaneously in the
event of a primary failure. Gaining team expertise in the
form of hiring and training for Web site and database construction, data transfer techniques, and implementation,
which included significant understanding of security certificates, was necessary for the online survey. After the initial
start-up costs, money for sustaining team expertise and periodic Web site maintenance and security upgrades was
budgeted as an ongoing expense, even for non-survey-cycle
years. However, these costs are much lower than the labor-intensive costs of compiling, scanning, verifying, sorting,
and filing thousands of paper submissions. Additional costs
associated with mailed surveys include the logistic and financial challenges associated with securely storing thousands of surveys and consent forms as required by human
subjects guidelines. Unforeseen expenses included costs involved in remailing potential participants when several
thousand surveys were returned without signed consent
statements. These extra costs associated with mailed surveys, in addition to the significant mailing costs of outgoing
and return postage for survey packets, resulted in substantial
cost savings associated with Web-based submission. For this
reason, a modest incentive (a T-shirt or a 60-minute phone
card valued at approximately $5.00) was offered to persons
choosing to complete the questionnaire online. Even after
factoring in the free gift, it was conservatively estimated that
each participant who elected to complete the questionnaire
online rather than by paper saved the study approximately
$50.00. The cost savings to the project, to date, have been
estimated to be as high as $2 million. Given that the current
project is expected to continue for more than 20 years and
the costs of Web surveys and traditional mail surveys are not
static, the full savings of using Web-based technology have
yet to be realized.
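The order of magnitude of the savings estimate can be checked with simple arithmetic. The short Python sketch below multiplies the number of participants who completed the questionnaire online (42,168, from the Results) by the approximate per-participant saving of $50.00 quoted above. The constant and function names are illustrative, and the calculation ignores start-up and maintenance costs, so it is only a rough consistency check on the $2 million figure.

```python
WEB_COMPLETIONS = 42_168          # participants who completed the questionnaire online
SAVING_PER_WEB_RESPONSE = 50.00   # approximate saving per online response, US dollars


def estimated_savings(n_web, saving_each):
    """Total estimated saving from online rather than paper submission."""
    return n_web * saving_each


total_savings = estimated_savings(WEB_COMPLETIONS, SAVING_PER_WEB_RESPONSE)
print(f"Estimated savings: ${total_savings:,.0f}")
```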
The finding that age influenced the mode of submission
has been reported previously (29). In this case, persons in
the birth cohort of 1960–1969 and subsequent cohorts were
more likely to use the Web than older participants, and this
may reflect greater technological savvy among recent generations. Persons with high school diplomas or college degrees were more likely to use the Web than those with lower
or higher educational levels. Age and education, however,
appeared to be independent in multivariable adjusted models. The paradoxical education relation (i.e., the middle-educated differing from persons with lower or higher
educational levels) has been suggested in other studies of
Web use (30, 31).

TABLE 3. Self-reported health characteristics of participants† by mode of survey response (World Wide
Web or mailed questionnaire), Millennium Cohort Study, 2001–2003

| Characteristic | Web (n = 42,127): No. | Web: % | Paper (n = 34,833): No. | Paper: % | Adjusted odds ratio‡ | 95% confidence interval |
|---|---|---|---|---|---|---|
| General health | | | | | | |
| Poor | 268 | 0.6 | 292 | 0.8 | 1.00 | |
| Fair | 2,911 | 6.9 | 2,411 | 6.9 | 1.37* | 1.14, 1.63 |
| Good | 13,223 | 31.4 | 10,133 | 29.1 | 1.48* | 1.25, 1.75 |
| Very good | 16,410 | 39.0 | 13,992 | 40.2 | 1.38* | 1.16, 1.63 |
| Excellent | 7,377 | 17.5 | 7,647 | 22.0 | 1.18 | 1.00, 1.41 |
| Body mass index§ | | | | | | |
| Underweight (<18.5) | 256 | 0.6 | 379 | 1.1 | 0.67* | 0.57, 0.79 |
| Normal weight (18.5–24.9) | 14,358 | 34.1 | 13,361 | 38.4 | 1.00 | |
| Overweight (25.0–29.9) | 22,235 | 52.8 | 16,921 | 48.6 | 1.17* | 1.13, 1.21 |
| Obese (≥30.0) | 4,737 | 11.2 | 3,651 | 10.5 | 1.20* | 1.14, 1.26 |
| Unknown | 541 | 1.3 | 521 | 1.5 | | |
| Problem drinking¶ | | | | | | |
| No | 34,884 | 82.8 | 27,831 | 79.9 | 1.00 | |
| Yes | 7,243 | 17.2 | 7,002 | 20.1 | 0.81* | 0.78, 0.84 |
| Smoking# | | | | | | |
| Cigarette smoker | 18,007 | 42.7 | 13,423 | 38.5 | 1.08* | 1.04, 1.10 |
| Cigar smoker | 7,360 | 17.5 | 6,353 | 18.2 | 0.88* | 0.85, 0.92 |
| Pipe smoker | 816 | 1.9 | 666 | 1.9 | 0.98 | 0.88, 1.08 |
| Smokeless tobacco user | 5,446 | 12.9 | 4,031 | 11.6 | 0.97 | 0.93, 1.02 |

Average pack-years** for cigarette smokers: Web, 5.1; paper, 4.9; p = 0.0379.

* p < 0.05.
† Only participants with complete data on covariates (99.6%) were used in the analyses. Analyses were based on
different sample sizes because of missing data on health characteristics.
‡ Odds of Web response versus paper response based on multivariable logistic regression, adjusted for sex, age,
education, marital status, race/ethnicity, military pay grade, service component, service branch, occupation, and
deployment to Southwest Asia, Bosnia, or Kosovo during 1998–2000.
§ Weight (kg)/height (m)².
¶ Problem drinkers were defined as those who responded “Yes” to any one of the CAGE screening questions (felt
the need to cut back (C), felt annoyed (A) at anyone suggesting you cut back, felt guilty (G) about drinking, or felt the
need for an eye-opener (E) or early-morning drink).
# Cigarette smoking was defined as having smoked at least 100 cigarettes (five packs) in one’s lifetime. Cigar,
pipe, and smokeless tobacco use were defined as any use in the past year.
** Adjusted mean value (n = 18,007 Web responders; n = 13,423 paper responders).
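The body mass index categories and the CAGE rule for problem drinking given in table 3 and its footnotes can be expressed as a short Python sketch. The function names are hypothetical, and the handling of values that fall between the printed category bounds (for example, 24.95) is an assumption: the sketch treats each category as a half-open interval.

```python
def body_mass_index(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def bmi_category(bmi):
    """Map a body mass index value to the categories used in table 3.

    Assumption: categories are half-open intervals, so 24.95 counts as
    normal weight and 30.0 counts as obese.
    """
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Normal weight"
    if bmi < 30.0:
        return "Overweight"
    return "Obese"


def is_problem_drinker(cut_back, annoyed, guilty, eye_opener):
    """CAGE rule from the table 3 footnote: any one 'Yes' response qualifies."""
    return any((cut_back, annoyed, guilty, eye_opener))
```

For example, a participant weighing 70 kg at 1.75 m has a body mass index of about 22.9 and is classified as normal weight.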
The finding that men enroll via the Web more often than
women is interesting and has been noted previously (32);
however, this may be a population-specific occurrence, as
another study has suggested that women enroll via the Web
with higher frequency (11). Married and previously married
participants used the Web more often than unmarried participants, suggesting that computer use may be related to
family environment. The increased adjusted odds of Web
response among persons in occupations related to information technology and other technical specialties may have
been due to increased computer access and/or familiarity
and comfort with computer use. Additionally, less Web response among Navy and Coast Guard members may have
reflected differential access to computers while at sea.
Of particular concern to researchers who base exposure
measurement on self-reports was the relatively consistent
lower reporting of occupational exposures among Web responders. Because these measures are adjusted for the differences in population composition, it is difficult to surmise
why Web responders were at lower odds for these self-reported exposures, with the exception of having received
the anthrax vaccine. With the growing concerns about
compromised electronic data, as well as identity theft, some
people may feel that their data are not as secure via submission on the Web. While the Millennium Cohort Study
team has taken extraordinary care to ensure the protection of
these data and to convey our dedication to the privacy protection to our members, feelings of insecurity may persist.
More research into these findings is warranted.
FIGURE 1. Percentage complete for each individual question in the Millennium Cohort Study survey, by mode of response (World Wide Web or
mailed questionnaire). Percentages shown incorporate skip patterns. A: 6% of paper responders and 2% of Web responders skipped the question,
“Are you a twin?” B: 18% of paper responders and 8% of Web responders skipped “other” on the question, “Has your doctor or other health
professional ever told you that you have any of the following conditions?” C: 21% of paper responders and 10% of Web responders skipped “other”
on the question, “During the last 12 months, have you had persistent or recurring problems with any of the following conditions?” D: Of persons
indicating a possible eating disorder, 14% of paper responders and 8% of Web responders skipped the frequency question. E: Of persons who
indicated functional health problems, 10% of paper responders and 8% of Web responders skipped a query quantifying the degree of challenge in
“doing work, taking care of things at home, or getting along with other people.” F: 7% of paper responders and 2% of Web responders skipped
a query quantifying their degree of limitation in “bending, kneeling, or stooping.” G: 17% of paper responders and 12% of Web responders skipped
a query on military occupational coding.
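As an illustration of what "incorporating skip patterns" means for the per-question completion percentages in Figure 1, the following minimal sketch computes completion among only those responders to whom a question applied. It is not the study's actual code: the function name percent_complete, the eligibility rule, and the sample records are hypothetical and serve only to show why a conditional item needs a restricted denominator.

```python
from typing import Callable, Mapping, Optional, Sequence

# One responder's answers; None marks an unanswered item.
Response = Mapping[str, Optional[str]]


def percent_complete(
    responses: Sequence[Response],
    question: str,
    eligible: Callable[[Response], bool] = lambda r: True,
) -> float:
    """Percentage of eligible responders who answered `question`.

    Responders routed past the question by a skip pattern are
    excluded from the denominator (hypothetical helper).
    """
    asked = [r for r in responses if eligible(r)]
    if not asked:
        return float("nan")
    answered = sum(1 for r in asked if r.get(question) is not None)
    return 100.0 * answered / len(asked)


# Hypothetical records: the frequency item applies only to
# responders whose screening answer was "yes".
sample = [
    {"screen": "yes", "frequency": "weekly"},
    {"screen": "yes", "frequency": None},  # eligible, but left blank
    {"screen": "no", "frequency": None},   # legitimately skipped
    {"screen": "no", "frequency": None},   # legitimately skipped
]

# Ignoring the skip pattern counts legitimate skips as missing.
naive = percent_complete(sample, "frequency")

# Respecting the skip pattern restricts the denominator.
adjusted = percent_complete(
    sample, "frequency", lambda r: r.get("screen") == "yes"
)
```

In this toy example, ignoring the skip pattern would understate completion of the conditional item, because responders who were correctly routed past it would be counted as nonresponders.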
The finding that Web responders were more likely to report a weight in the overweight or obese range is interesting.
This may reflect subtle occupational differences, such as
being employed in a sedentary work environment. However,
all military members are required to meet standards for
weight and physical fitness, regardless of occupation. Outside of work, these findings may indicate a more inactive
personal lifestyle among Web responders, with more individual time being allocated to computer use. Interpretation
of health outcome results may need to be met with caution in
research where only one survey mode, traditional paper or
the Web, is utilized. Given that excess weight is so strongly
associated with many long-term health problems, this finding is important and may highlight the need to offer both
traditional paper surveys and Web surveys in any study of
chronic health outcomes.
Web responders were significantly less likely to be classified as problem drinkers, but they were significantly more
likely to report smoking cigarettes. These were surprising
results given that tobacco and alcohol use are often closely
associated (33, 34). Although differences in these behaviors were small, the finding may be consistent with other
characteristics of technologically savvy persons and
underscores the importance of exploring all differences
between study participants who opt for different modes of
participation.
Limitations of this study should be noted. The rate of
response to the Millennium Cohort Study enrollment invitation was 37 percent, and therefore participants may not be
representative of the US military in general (13). However,
investigation of possible biases suggested a representative
sample of military personnel as measured by demographic
and health characteristics and reliable health and exposure
reporting (13, 16–19). With the robust sample size and the
unique and abundant characteristics available for study,
these data suggest subpopulations that may be more or less
inclined to respond to a Web-based survey when both Web
and traditional paper modes are offered.
In summary, the Millennium Cohort Study has demonstrated successful implementation of multimodal survey
data collection using the Web in concert with postal mailed
questionnaires. Web response was associated with more
complete data and marked cost savings at a minimal risk
of enrolling a nonrepresentative group. Web responders
were very comparable to paper responders with regard to
most demographic and health metrics, but subtle differences
were observed. These differences may be consistent with
important health challenges, such as obesity, that distinguish
a growing generation of computer users. Such differences
may become less distinct as Internet use becomes more
global over time. Only a large, prospective study like the
21-year Millennium Cohort Study will be able to fully assess changing demographic and health characteristics over
time. In the meantime, it remains important to offer multiple
modes of participation when a diverse population is sought,
and equally important to understand response biases and
differences among cohort study participants.
ACKNOWLEDGMENTS
This article is report 07-06, supported by the Department
of Defense, under work unit no. 60002. This work was also
supported by the Henry M. Jackson Foundation for the
Advancement of Military Medicine (Rockville, Maryland).
The authors thank Scott L. Seggerman from the Management Information Division of the Defense Manpower
Data Center (Seaside, California) and Dr. Karl E. Friedl
from the US Army Medical Research and Materiel
Command (Fort Detrick, Maryland). Additionally, they
thank Lacy Farnell, Gia Gumbs, Isabel Jacobson, Cynthia
Leard, Travis Leleu, Robert Reed, Steven Spiegel, Damika
Webb, Kari Welch, and James Whitmer from the Department
of Defense Center for Deployment Health Research (San
Diego, California) and Michelle Stoia from the Naval Health
Research Center (San Diego, California).
In addition to the authors, the Millennium Cohort Study
Team comprises Drs. Paul J. Amoroso, Edward J. Boyko,
Gary D. Gackstetter, Tomoko I. Hooper, James R. Riddle,
and Timothy S. Wells.
The views expressed in this article are those of the authors
and do not reflect the official policy or position of the
Department of the Navy, the Department of the Army, the
Department of the Air Force, the Department of Defense,
the Department of Veterans Affairs, or the US government.
This research was conducted in compliance with all applicable federal regulations governing the protection of human
subjects in research (protocol NHRC.2000.007).
Conflict of interest: none declared.
REFERENCES
1. Nielsen//NetRatings. Internet penetration reaches 60 percent
in the U.S., according to Nielsen//NetRatings. New York, NY:
NetRatings, Inc, 2001:February 28. (http://www.nielsennetratings.com/pr/pr_010228.pdf).
2. Nielsen//NetRatings. Three out of four Americans have access
to the Internet, according to Nielsen//NetRatings. New York,
NY: NetRatings, Inc, 2004:March 18. (http://www.nielsennetratings.com/pr/pr_040318.pdf).
3. Best SJ, Krueger B, Hubbard C, et al. An assessment of the
generalizability of Internet surveys. Soc Sci Comput Rev
2001;19:131–45.
4. Hewson CM, Laurent D, Vogel CM. Proper methodologies for
psychological and sociological studies conducted via the
Internet. Behav Res Methods Instrum Comput 1996;28:
186–91.
5. Krantz JH, Ballard J, Scher J. Comparing the results of
laboratory and World Wide Web samples of the determinants
of female attractiveness. Behav Res Methods Instrum Comput
1997;29:264–9.
6. Schaefer DR, Dillman DA. Development of standard e-mail
methodology: results of an experiment. Public Opin Q 1998;
62:378–97.
7. Schmidt WC. World-Wide Web survey research: benefits,
potential problems, and solutions. Behav Res Methods Instrum
Comput 1997;29:274–9.
8. Smith MA, Leigh B. Virtual subjects: using the Internet as an
alternative source of subjects and research environment.
Behav Res Methods Instrum Comput 1997;29:496–505.
9. Shettle C, Mooney G. Monetary incentives in U.S. government
surveys. J Off Stat 1999;15:231–50.
10. Swoboda WJ, Mühlberger N, Weitkunat R, et al. Internet
surveys by direct mailing. Soc Sci Comput Rev 1997;15:
242–55.
11. Ekman A, Dickman PW, Klint A, et al. Feasibility of using
web-based questionnaires in large population-based epidemiological studies. Eur J Epidemiol 2006;21:103–11.
12. Gray GC, Chesbrough KB, Ryan MA, et al. The Millennium
Cohort Study: a 21-year prospective cohort study of 140,000
military personnel. Mil Med 2002;167:483–8.
13. Ryan MA, Smith TC, Smith B, et al. Millennium Cohort:
enrollment begins a 21-year contribution to understanding the
impact of military service. J Clin Epidemiol 2007;60:181–91.
14. Dillman DA. Mail and telephone surveys: the total design
method. New York, NY: John Wiley and Sons, Inc, 1978.
15. Dillman DA. Mail and internet surveys: the tailored design
method. New York, NY: John Wiley and Sons, Inc, 2000.
16. Chretien JP, Chu LK, Smith TC, et al. Demographic and
occupational predictors of early response to a mailed invitation
to enroll in a longitudinal health study. BMC Med Res
Methodol 2007;7:6. (Electronic article).
17. Smith B, Leard CA, Smith TC, et al. Anthrax vaccination in
the Millennium Cohort: validation and measures of health.
Am J Prev Med 2007;32:347–53.
18. Smith TC, Jacobson IG, Smith B, et al. The occupational role
of women in military service: validation of occupation and
prevalence of exposures in the Millennium Cohort Study.
Int J Environ Health Res 2007;17:271–84.
19. Smith TC, Smith B, Jacobson IG, et al. Reliability of standard
health assessment instruments in a large, population-based
cohort study. Ann Epidemiol 2007;17:525–32.
20. Spitzer RL, Williams JB, Kroenke K, et al. Utility of a new
procedure for diagnosing mental disorders in primary
care. The PRIME-MD 1000 Study. JAMA 1994;272:
1749–56.
21. Spitzer RL, Kroenke K, Williams JB, et al. Validation and
utility of a self-report version of PRIME-MD: The PHQ
Primary Care Study. JAMA 1999;282:1737–44.
22. Spitzer RL, Williams JB, Kroenke K, et al. Validity and utility
of the PRIME-MD patient health questionnaire in assessment
of 3000 obstetric-gynecologic patients: the PRIME-MD
Patient Health Questionnaire Obstetrics-Gynecology Study.
Am J Obstet Gynecol 2000;183:759–69.
23. Ware N, Kleinman A. Culture and somatic experience: the
social course of illness in neurasthenia and chronic fatigue
syndrome. Psychosom Med 1992;54:546–60.
24. Gray GC, Reed RJ, Kaiser KS, et al. The Seabee Health Study:
self-reported multi-symptom conditions are common and
strongly associated among Gulf War veterans. Am J Epidemiol
2002;155:1033–44.
25. Kang HK, Mahan CM, Lee KY, et al. Illnesses among United
States veterans of the Gulf War: a population-based survey of
30,000 veterans. J Occup Environ Med 2000;42:491–501.
26. Weathers F, Huska J, Keane T. The PTSD Checklist Military
Version (PCL-M). Boston, MA: National Center for PTSD, 1991.
27. Lang AJ, Laffaye C, Satz LE, et al. Sensitivity and specificity
of the PTSD checklist in detecting PTSD in female veterans in
primary care. J Trauma Stress 2003;16:257–64.
28. Ewing JA. Detecting alcoholism. The CAGE questionnaire.
JAMA 1984;252:1905–7.
29. Morrell RW, Mayhorn CB, Bennett J. A survey of World Wide
Web use in middle-aged and older adults. Hum Factors 2000;
42:175–82.
30. Leece P, Bhandari M, Sprague S, et al. Internet versus mailed
questionnaires: a controlled comparison. J Med Internet Res
2004;6:e39. (Electronic article).
31. McCabe SE. Comparison of web and mail surveys in
collecting illicit drug use data: a randomized experiment.
J Drug Educ 2004;34:61–72.
32. McCabe SE, Diez A, Boyd CJ, et al. Comparing web and mail
responses in a mixed mode survey in college alcohol use
research. Addict Behav 2006;31:1619–27.
33. Jensen MK, Sorensen TI, Andersen AT, et al. A prospective
study of the association between smoking and later alcohol
drinking in the general population. Addiction 2003;98:
355–63.
34. Amit Z, Weiss S, Smith BR, et al. Use of caffeine-based
products and tobacco in relation to the consumption of
alcohol. An examination of putative relationships in
a group of alcoholics in Israel. Eur Addict Res 2004;10:
22–8.