



The purpose of the Essentials of Bayley Scales of Infant Development-II Assessment is to highlight and expand the information provided in the Bayley Scales of Infant Development-II (BSID-II) manual. This book is designed to supplement the test manual and to facilitate the administration and interpretation of the test. It provides a discussion of the foundations of infant assessment as well as the development of the Bayley Scales. Additionally, the book uses "Rapid Reference," "Caution," and "Don't Forget" boxes to make the information most relevant to the user more readily accessible. The Essentials of Bayley Scales of Infant Development-II Assessment is intended to address the needs of clinicians and researchers who are assessing the cognitive, motor, and behavioral development of infants and young children.


The study of infant and child development, or developmental psychology as it is known today, dates back to the naturalistic observations of Johann Heinrich Pestalozzi in the 18th century. By the late 1800s and early 1900s, a contingent of psychologists around the world (Alfred Binet in France; Wilhelm Preyer in Germany; G. Stanley Hall, James Mark Baldwin, and John B. Watson in the United States) acknowledged that development took place from conception and continued throughout life, with increasing complexity (Cairns, 1983). Some of the earliest work in developmental psychology grew from the study of biological systems, and to that end Wilhelm Preyer is credited with a landmark publication, Die Seele des Kindes (The Mind of the Child), in 1882. He approached the development of the human child from a background in physiology and comparative development, and proposed an




objective, methodological study of children through rigorous observation and an ecological approach. Preyer addressed the development of children's perceptions, motivation, and "intellect" (i.e., language and social cognition). G. Stanley Hall is often regarded as the founder of developmental psychology. Not only was he the first president of the American Psychological Association in 1892, but he also founded the first scientific journal that addressed psychological issues as applied to children. His systematic studies of development included the entire life span, with scholarly writings about early childhood, adolescence, and old age. By the beginning of the 20th century there was growing concern in the fields of medicine, social work, and education for the welfare of children (Cairns, 1983). As more attention was being paid to child development, funding for research grew, and in the 1920s research institutes began to spring up across the United States. The topic of study at the institutes ranged from intelligence, memory, perception, emotion, personality, and motivation to motor development (Sears, Maccoby, & Levin, 1957 ). These were topics of interest in the biological or medical and educational fields that were influencing the study of developmental psychology in the 19th century and early part of the 20th century. The origins of many other constructs studied in child development in the latter part of the first half of the 20th century, such as attachment, dependency, aggression, sibling relations, gender-role development, achievement motivation, and the influences of child-rearing, may be traced to psychoanalytic theory. Theorists such as Sigmund Freud, Erik Erikson, and Alfred Adler, while building their theories primarily on retrospective accounts of development, assigned significance to the roles of parents in children's development. By the 1930s and 1940s theories of child development were broadened to include social learning theory. 
During the latter part of the 20th century, theories of child development were broadened further to incorporate the transactional effects that children and caregivers have on one another, together with an ecological perspective that incorporates influences from the community and culture (Bronfenbrenner, 1979; Sameroff & Chandler, 1975). In contrast, there was only fleeting interest in infant development prior to the mid-20th century, primarily because both scientists and the lay public regarded infants as helpless, dependent organisms whose perceptions were undifferentiated and whose movements were dominated by reflexes. However, as scientists learned to use systematic observation, they recognized both



common patterns and individual differences in infants' abilities to process sensory information and to organize their responses. Not only could infants differentiate within sensory modalities, but they could also demonstrate choices through voluntary movements, such as suck patterns, visual gaze, or turning to sound. During the past several decades, infant research has flourished. The expansion of conceptual models to include the processing capabilities of infants, together with technological advances in brain imaging, has led to a more sophisticated understanding of the impact of early experiences on both the structure of the brain and early child development (Gunnar, 1998). The first 3 years of life are a period of rapid brain growth that provides a "window of opportunity" for early learning. Using longitudinal data from the Berkeley Growth Study, Bayley and Schaefer (1964) reported on the mental and physical development of individuals over 36 years. There was a great deal of individual variability in scores, particularly during the first 3 years of life. However, there were complex but consistent patterns linking behavior during the first 3 years of life with cognitive performance at 18 and 36 years of age. In addition, children whose mothers were nurturant and understanding during the first 3 years had better cognitive development as adults. In more recent research, Hart and Risley (1995) demonstrated a link between early parent communication and later school performance, and they concluded that infants who are not exposed to an enriching environment may miss an important developmental opportunity. Findings such as these illustrate the importance of the early years and the critical influence that the caregiving environment has on children's early development.


The origins of infant assessments are often traced to the work of Arnold Gesell, a physician and psychologist at the Yale Clinic of Child Development. Gesell had been influenced by Charles Darwin's work--not only in his comparative studies of animals but also in his interest in the growth and development of children. In the early 1920s Gesell compiled a schedule of tasks for infants ages 4, 6, 9, 12, and 18 months of age and 2, 3, 4, and 5 years of age (Gesell, 1925). By identifying predictable stages of development for the brain and visual and motor systems, Gesell hoped to document maturation and to



include assessments of behavior and development into children's well-child exams so that physicians could provide families with prescriptive and predictive recommendations (Gesell, 1948). Infant assessments such as the Gesell System of Developmental Diagnosis (Gesell, 1925) and Cattell Infant Intelligence Test (Cattell, 1940), as well as the Bayley Scales of Infant Development (Bayley, 1969), originated in the child development centers around the United States that were established in the early part of the 20th century. These early assessments were designed to catalog an infant's level of development at various ages and to establish normative data. Toward the latter half of the 20th century and particularly in the last few decades, infant assessment has focused on the need to evaluate infants at risk for developmental delay. With recent advances in medical technology, many premature and medically challenged infants are surviving. However, rates of cerebral palsy and developmental disabilities have increased (Hack et al., 1994). Infant assessments are needed to determine whether infants are developing at an expected rate, whether they are eligible for early intervention services, and whether the early intervention is effective in improving their rate of development. Within the last 20 years, society has also become increasingly concerned about individuals with disabilities. The concern encompasses the rights of all individuals to an equal opportunity for an education and occupation. Although school-aged children were the first group to be targeted for educational and social services, the age has been extended to birth. The Education of the Handicapped Act (EHA) was passed in 1975 and went into effect in 1977 to ensure special education and related services to children with disabilities (based on specific criteria and implemented according to state laws). In 1986 Public Law 99-457 was passed, which included Part H, the Handicapped Infants and Toddlers Program. 
Part H addressed the needs of infants from birth through the third birthday. This law mandates the identification and provision of services for infants suspected of being at risk for developmental delay. Eligibility for services usually requires an infant assessment that documents delays (criteria vary across states) in cognitive, physical, communication, social or emotional, or adaptive development, or the infant must have a diagnosed physical or mental condition that is likely to result in developmental delay. The Individuals with Disabilities Education Act (IDEA) extends the



provisions of previous amendments to provide additional services to children with disabilities. Information on the criteria for eligibility is available through the Office of Special Education Programs.


There are a number of infant developmental assessments on the market today. The following is a brief overview of commonly used instruments that assess children within the first 3 years of life. More extensive reviews that critique the psychometrics of the instruments are provided by Aylward (1994), Kamphaus (1993), and Sattler (1992). The quality of the normative data on these assessments is variable. Although some tests have normative data based on nationally representative samples of children, in other cases there are no normative data or the normative data are out of date. Before selecting an infant test, readers are encouraged to consult the manuals that accompany the tests and to review the procedures used to develop each test, including the establishment of norms. Commonly used infant assessments include:

Battelle Developmental Inventory (Newborg, Stock, Wnek, Guidubaldi, & Svinicki, 1984)
Bayley Infant Neurodevelopmental Screener (Aylward, 1995)
Bayley Scales of Infant Development (Bayley, 1969, 1993)
Brazelton Neonatal Behavioral Assessment Scale (Brazelton, 1973, 1984; Brazelton & Nugent, 1995)
Cattell Infant Intelligence Test (Cattell, 1940)
Clinical Adaptive Test/Clinical Linguistic Auditory Milestone Scale (Capute & Accardo, 1996a, 1996b)
Denver Developmental Screening Test (Frankenburg & Dodds, 1967; Frankenburg, Goldstein, & Camp, 1971; Frankenburg, Fandal, Sciarillo, & Burgess, 1981)
Gesell System of Developmental Diagnosis (Gesell, 1925)
Griffiths Developmental Scale (Griffiths, 1967)
Infant Psychological Development Scale (Uzgiris & Hunt, 1975)
Milani-Comparetti Neurodevelopmental Screening Examination (Milani-Comparetti & Gidoni, 1967)



The Gesell Developmental Schedules and the Cattell Infant Intelligence Test are the oldest assessments on the list. Although they have historical significance, they are rarely used today because they have been replaced by newer assessments. The Griffiths Developmental Scale was developed and published in Britain over 30 years ago. Infant researchers have been attracted to the Griffiths Scale because it yields scores in six areas of functioning: Locomotor, Hearing and Speech, Eye and Hand Coordination, Performance, Practical Reasoning, and Personal-Social. However, item coverage is very limited and norms are dated. The Brazelton Neonatal Behavioral Assessment Scale (NBAS) is a newborn assessment that is used by physicians, psychologists, nurses, and physical and occupational therapists to describe the individual differences in information processing and regulation displayed by newborns. Administration is limited to the 1st month of life. The NBAS has been widely used in research and practice and has contributed significantly to our understanding of newborn behavior. The Denver Developmental Screening Test (DDST) is widely used in primary care settings to screen the development of infants and young children (birth through age 6 ). It has been revised several times; the most recent revision is the Denver II (Frankenburg, Dodds, Archer & Bresnick, 1990, 1992). One of the strengths of the DDST (and the Denver II) is the one-page record form that highlights the infant's successes and failures, providing a summary of the child's skills at a glance. The Denver II is a screening test, not a diagnostic test. The test has been found to have high sensitivity (proportion of children with developmental problems who are identified), and low specificity (proportion of children without developmental problems who are categorized as normal) (Glascoe, Bryne, Ashford, Johnson, Chang, & Strickland, 1992). 
Although the Denver II does an excellent job of identifying children who are at risk for developmental delay, the rate of false positives is high. Thus children with developmental delays are not missed, but many children who are developing within normal limits may be identified as suspect. The Denver II is usually followed by a more comprehensive test of infant functioning, such as the BSID-II. The Clinical Adaptive Test/Clinical Linguistic Auditory Milestone Scale (CAT/CLAMS) (Capute & Accardo, 1996a, 1996b) is a screening test of language, problem-solving abilities, and visual-motor skills for children under 36 months of age. The test has been found to have variable rates of sensitivity (5% to 67%) and high rates of specificity (95% to 100%) (Macias et al., 1998; Rossman et al., 1994). In contrast to the Denver II, the CAT/CLAMS may miss children who have developmental problems, but the test is unlikely to identify a child who is developing within normal limits as having a developmental problem. The Bayley Infant Neurodevelopmental Screener (BINS; Aylward, 1992) examines the neuropsychological development of infants from 3 to 24 months of age. It includes items that have been extracted from existing tests and requires approximately 10 minutes to administer. Initial comparisons with other measures of infant development suggest that the BINS has high sensitivity, meaning that it recognizes infants who have developmental delays (Macias et al., 1998). The Infant Psychological Development Scale is based on Piaget's sensorimotor stage of development (Uzgiris & Hunt, 1975). Unlike many of the other infant tests, it is theoretically based and not norm referenced. It provides a description of the infant's progress in eight areas (Object Permanence, Use of Objects as Means, Learning and Foresight, Development of Schemata, Development of an Understanding of Causality, Conception of Objects in Space, Vocal Imitation, and Gestural Imitation) and is useful in planning interventions. The Milani-Comparetti Neurodevelopmental Screening Examination (Milani-Comparetti & Gidoni, 1967) examines neuromotor function from birth through 24 months. It is a brief test that can be administered in about 5 minutes and incorporated into a physical exam. The Battelle Developmental Inventory (Newborg et al., 1984) evaluates children's development in five areas (Cognitive, Communication, Motor, Adaptive, and Personal-Social).
The test extends from birth through 8 years of age and includes structured assessment, observation, and caregiver report. The Battelle has become a popular test because it addresses the developmental areas for assessment required by IDEA, includes standardized adaptations for children with disabilities, and includes a screening test that requires 10 to 30 minutes. A summary of these tests is provided in Rapid Reference 1.1.
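The sensitivity and specificity figures quoted above for the Denver II and the CAT/CLAMS come from a simple two-by-two count of screening results against a reference diagnosis. The short Python sketch below shows that arithmetic. The counts are hypothetical, chosen only to illustrate a screener with high sensitivity and low specificity; they are not data from any of the studies cited.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of children with developmental problems whom the screen identifies."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of children without developmental problems whom the screen classifies as normal."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (assumed for illustration, not taken from the literature).
true_pos, false_neg = 18, 2    # 20 children with a confirmed delay
true_neg, false_pos = 40, 60   # 100 children developing within normal limits

sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these assumed counts, few delayed children are missed, but more than half of the typically developing children are flagged as suspect, which is the pattern described for the Denver II.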



Rapid Reference

1.1 Summary of Infant Assessments

Battelle Developmental Inventory
   Source: Newborg et al. (1984)
   Age Range: 1 month-8 years
   Content Areas: Cognitive, Personal-Social, Adaptive, Motor, Communication

Bayley Infant Neurodevelopmental Screener
   Source: Aylward (1992)
   Age Range: 3-24 months
   Content Areas: Neurological, Receptive, and Expressive Functions, Processing, and Mental Activity

Bayley Scales of Infant Development
   Source: Bayley (1969, 1993)
   Age Range: BSID: 2-30 months; BSID-II: 1-42 months
   Content Areas: Mental, Motor, and Behavior

Brazelton Neonatal Behavioral Assessment Scale
   Source: Brazelton (1973, 1984); Brazelton and Nugent (1995)
   Age Range: Birth-1 month
   Content Areas: Habituation, Orientation, Motor Performance, Range of State, State Regulation, Autonomic Regulation, Abnormal Reflexes

Cattell Infant Intelligence Test
   Source: Cattell (1940)
   Age Range: 2-30 months
   Content Areas: Cognitive

Clinical Adaptive Test/Clinical Linguistic Auditory Milestone Scale (CAT/CLAMS)
   Source: Capute and Accardo (1996a, 1996b)
   Age Range: Birth-36 months
   Content Areas: Language, Problem Solving, and Visual-Motor Skills

Denver Developmental Screening Test
   Source: Frankenburg (1967); Frankenburg, Dodds, Archer, and Bresnick (1990); Frankenburg, Fandal, Sciarillo, and Burgess (1981); Frankenburg, Goldstein, and Camp (1971)
   Age Range: Birth-6 years
   Content Areas: Gross Motor, Language, Fine Motor-Adaptive, Personal-Social, Behavior

Gesell Developmental Schedules
   Source: Gesell (1925); Gesell and Amatruda (1941); Knobloch, Stevens, and Malone (1980)
   Age Range: Original: 4-60 months; Revised: 1 week-36 months
   Content Areas: Adaptive, Gross Motor, Fine Motor, Language, Personal-Social

Griffiths Developmental Scale
   Source: Griffiths (1967)
   Age Range: 1-60 months
   Content Areas: Locomotor, Hearing and Speech, Eye and Hand Coordination, Performance, Practical Reasoning, Personal-Social

Infant Psychological Development Scale
   Source: Uzgiris and Hunt (1975)
   Age Range: 2 weeks-2 years
   Content Areas: Object Permanence, Use of Objects as Means, Learning and Foresight, Development of Schemata, Development of an Understanding of Causality, Conception of Objects in Space, Vocal Imitation, Gestural Imitation

Milani-Comparetti Neurodevelopmental Screening Examination
   Source: Milani-Comparetti and Gidoni (1967)
   Age Range: Birth-24 months
   Content Areas: Neuromotor Function


The BSID represents the lifetime work of Dr. Nancy Bayley and has its basis in the California First-Year Mental Scale (Bayley, 1934), the California Preschool Mental Scale (Jaffa, 1934), and the California Infant Scale of Motor Development (Bayley, 1936). These instruments were developed as standardized assessments of infant development that would produce a score. When the BSID was published, it sampled the widest array of mental and motor abilities of any developmental assessment at the time. Bayley included the Infant Behavior Record to account for behavioral aspects of the infant that affect cognitive performance, such as the infant's motivation and quality of interaction with others. Bayley began work on the California First-Year Mental Scale at the Institute of Child Welfare at the University of California at Berkeley in the 1920s. She incorporated many items from Gesell's assessment and from the work of others, such as Kuhlmann's (1922) Handbook of Mental Tests and Preyer's (1882) Die Seele des Kindes. She also developed new items (Bayley, 1933). The BSID was considered to be a theoretically eclectic assessment that borrowed from different areas of research and examined higher functioning (Aylward, 1997).

History and Development of the BSID

The BSID was published in 1969 by a major test publisher, the Psychological Corporation. The test assesses infants between 2 and 30 months of age. The items are arranged in ordinal sequence of increasing difficulty, representing the maturation of abilities in cognitive and motor development. Raw scores are converted to standardized scores (mean = 100, standard deviation = 16) through tables, yielding a Mental Development Index (MDI) score from the Mental Scale and a Psychomotor Development Index (PDI) score from the Motor Scale. The Infant Behavior Record provides a description of the infant's behavior in reference to behavior expected of same-age infants. Items from the three California scales, along with newly created items, were piloted in research studies from 1958 to 1960. These studies were funded by the National Institute of Neurological Diseases and Blindness. A second wave of research was conducted by the National Institutes of Health, along with other agencies, beginning in 1960. The first revision of the BSID, the BSID-II (Bayley, 1993), was designed to update the normative data, to expand the age range to 1 to 42 months, to incorporate research-based items that demonstrate predictive validity, to update the stimulus materials, to conduct reliability and validity studies, to report data from clinical populations of children, and to ensure a standardized assessment of children's mental and motor performance (Rapid Reference 1.2). The BSID-II maintains the same structure as the BSID with Mental and Motor Scales and a rating of the child's behavior, the Behavior Rating Scale (BRS) (Don't Forget 1.1). Like the BSID, the BSID-II Mental Development Index and Psychomotor Development Index have a mean of 100. The standard deviation was changed to 15 (instead of 16 as it is for the BSID) in keeping with most other assessments of cognitive performance.

DON'T FORGET

1.1 Scales of the BSID-II

· Mental Scale
· Motor Scale
· Behavior Rating Scale
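To make the change in standard deviation concrete, the sketch below places a score on each index scale for the same distance from the mean. It assumes a simple linear transformation purely for illustration; in practice, raw scores are converted to index scores through the tables in the test manuals, and the function names here are hypothetical.

```python
def index_score(z, mean=100.0, sd=15.0):
    """Index score that lies z standard deviations from the mean."""
    return mean + z * sd


def z_from_index(score, mean=100.0, sd=15.0):
    """Distance of an index score from the mean, in standard deviation units."""
    return (score - mean) / sd


# A score 2 standard deviations below the mean on each scale.
z = -2.0
bsid_score = index_score(z, sd=16.0)    # BSID scale (SD = 16)
bsid2_score = index_score(z, sd=15.0)   # BSID-II scale (SD = 15)
print(bsid_score, bsid2_score)          # 68.0 70.0
```

Under this assumption, the same relative standing corresponds to 68 on the BSID scale and 70 on the BSID-II scale, so cutoffs expressed in standard deviation units do not translate to the same index score on both editions.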



Rapid Reference

1.2 Goals of the Revision of the BSID

· Update the normative data.
· Expand the age range to 1 to 42 months of age.
· Improve content coverage using research-based items with demonstrated predictive validity.
· Modernize items using materials that facilitate infection control, reduce gender and racial bias, and are attractive to young children.
· Conduct reliability and validity studies and explore the factor structure of the Mental and Motor Scales and the BRS.
· Collect data on clinical populations of children.
· Maintain the primary structure and purpose of the BSID, which is to provide a standardized assessment of infant mental and motor performance based on the infant's response to a structured set of stimulus materials.

The BSID-II, like the BSID, provides overall standard scores for mental and motor development. Although some attempts were made in the revision to provide more comprehensive coverage of all mandated areas of assessment (cognitive, language, social, self-help, and motor), the BSID-II was not designed to provide reliable, valid scores in all five of these areas. The notes made in Caution 1.1 should be heeded, or the BSID-II may disappoint the examiner and referral source if more diagnostic information is sought. A revision of the norms on the BSID was needed because over time there had been an upward drift of approximately 11 points on the Mental Scale and 10 points on the Motor Scale (Campbell, Siegel, Parr, & Ramey, 1986 ). This pattern, sometimes referred to as the Flynn effect (Flynn, 1999), has been demonstrated in other cognitive assessments for children and may reflect improvements in nutrition, environmental conditions, and family relations, as well as in our understanding of the determinants of early development. A primary reason for the first revision of the Bayley Scales was to update the norms so they were representative of American children at the end of the 20th century. Therefore scores obtained on the BSID-II are usually lower than scores obtained for the same children on the BSID. This phenomenon should be clarified with parents and colleagues who may misinterpret a child's lower scores on the BSID-II to mean a decline in developmental skills. With the updated norms on the BSID-II, more children




CAUTION

1.1 Design and Limitations of the BSID-II

Mental Scale
   Purpose: Produce a standardized score for overall cognitive development; assess higher-order mental processing
   Limitation: Does not provide standard scores for facets or separate domains; does not provide diagnostic information; does not provide standard scores < 50

Motor Scale
   Purpose: Produce a standardized score for overall motor development
   Limitation: Does not provide standard scores for facets or separate domains; does not provide diagnostic information; does not provide standard scores < 50

Behavior Rating Scale
   Purpose: Produce a percentile score for comparison to a nonclinical population; assess behavior related to test-taking session
   Limitation: Does not provide diagnostic information

should qualify for early intervention services. Remember that the norms for the BSID (and for other developmental assessments that do not have recent norms) are inflated and no longer accurate. The changes in the mean scores between the BSID and the BSID-II appear in Table 1.1.

The Mental and Motor Scales

The theoretical foundation of the BSID-II remains as eclectic as that of the BSID. The project staff at the Psychological Corporation, along with various content experts, critiqued the content of the BSID and identified pertinent areas of infant development based on research in the areas of cognitive, language, motor, and personal-social development to be included on the BSID-II. New items were developed to tap visual and auditory habituation and visual preference in younger infants. In addition, items were added that assess problem-solving abilities including object permanence, perspective taking, and following multistep directions. Many of these items represent higher order cognitive processes, involving reasoning, memory, and the integration of these processes (Aylward, 1997).



Table 1.1 Comparison of the BSID and BSID-II Means for the MDI and PDI

          BSID(a)            BSID-II(b)         Difference
          Mean     SD        Mean     SD        Mean     SD
MDI       111.6    17.2      99.8     14.9      11.8     2.3
PDI       110.5    15.3      100.4    16.2      10.1     0.9

(a) SD = 16. (b) SD = 15.


Note. From the Manual for the Bayley Scales of Infant Development: Second Edition. Copyright © 1993 by The Psychological Corporation. Reproduced by permission. All rights reserved.

The percentage of language items was increased in the BSID-II because language is a higher order cognitive process that plays a central role in children's cognitive development. The detection of language delay can signal neurological impairment, oral-motor impairment, general cognitive delay, or environmental deprivation. The language items in the BSID-II assess expressive and receptive language as well as grammar usage at the older ages. Early number concepts and prewriting skills, as well as other items that assess school readiness, are included on the Mental and Motor Scales of the BSID-II. A child over 2 years of age is asked to count and exhibit stable number order, one-to-one correspondence, and an understanding of cardinality. Prewriting skills include the ability to rotate the wrist, grasp a pencil, manipulate the pencil, and hold it at the nearest end to draw. Other school-readiness concepts assessed are color identification and discrimination of shape, size, and mass. Visual perception is assessed by several item types on the Mental Scale with children 2 years of age and older. Tasks vary by the children's age and include matching colors, matching pictures, differentiating objects by size, and discriminating shapes and pictures. Perceptual-motor integration is also assessed with children over 2 years of age. Children are asked to imitate the examiner's hand movements and body postures on the Motor Scale, and to copy block designs on the Mental Scale. In the BSID, the item content and coverage of the Motor Scale were particularly weak. Items were added to the BSID-II to assess muscle tone, dynamic and static balance, and perceptual-motor development among the older infants. Items for the younger infants include an assessment of movement symmetry and antigravity movement (Thrusts Arms in Play, Thrusts Legs in Play, Lifts Head When Held at Shoulder, Holds Legs up for 2 Seconds, Balances Head). At the older ages, items assess motor planning and coordination (Swings Leg to Kick Ball, Stops From a Full Run).

Behavior Rating Scale

The BRS is a critical dimension of the Bayley Scales because an infant's state, orientation toward the environment and engagement with people, and motivation may partially explain variations in individual performance on the Mental and Motor Scales. Arnold Sameroff and Ronald Seifer made a significant contribution to the conceptualization and development of the BRS. The first two items on the BRS are applicable to all infants and are not included in the factors. They represent the caregiver's interpretation of the infant's performance: how typical the infant's behavior was and whether the test was an adequate measure of the infant's skills. These items are very important in the examiner's interpretation of the infant's performance. For example, the infant who has experienced a recent illness, loss, or traumatic event may be more lethargic, less motivated, less cooperative, or have difficulty concentrating and thus may obtain lower scores than under other circumstances. Items on the BRS are rated on a 5-point scale with behavioral anchors. The items have been factor-analyzed to obtain summed scores for conceptually similar items. The factor structure is somewhat different for the three age groups (1 to 5 months, 6 to 12 months, and 13 to 42 months). In the first age group (1 to 5 months), the BRS assesses Attention/Arousal and Motor Quality. In the second and third age groups (6 to 12 and 13 to 42 months, respectively), infants are assessed on Orientation/Engagement, Emotional Regulation, and Motor Quality.
Attention/Arousal includes an assessment of the infant's state, affect, energy, interest, exploration, and responsiveness to the examiner. Orientation/Engagement is used for infants 6 months of age and older and includes many of the Attention/Arousal items, along with additional items that assess aspects of the infant's behavior toward the materials. Emotional Regulation is an assessment of the infant's range of affect and emotional response to both success and failure on the assessment. Motor Quality refers to the quality of the infant's movements, including tone and control. Raw scores are converted to percentiles for each factor within each



age group. A total raw score can also be converted to a percentile by age group to provide an overall assessment of the infant's behavior.

Item Development


Once new items were written, and some of the remaining BSID items were rewritten for clarification, the items went through three pilot studies, tryout, and standardization. During the pilot studies and tryout, data were collected from approximately 350 and 643 infants, respectively. The developmental sensitivity of each item (i.e., item difficulty according to age of the infant) was evaluated from the data. Ease of administration and appeal to infants were evaluated from examiner feedback. During the pilot studies, items were rewritten to clarify administration or scoring, stimulus materials were modified, some items were dropped, and new items were added. After tryout, items were dropped if they were redundant with other items or difficult to administer. Items (along with the stimulus materials) were also revised to reduce racial/ethnic and sex biases. Don't Forget 1.2 lists the types of changes made to the items to facilitate a comparison for veteran Bayley users.

DON'T FORGET

1.2 Changes in Items Between the First and Second Editions of the Bayley Scales

· Old items were dropped (roughly 30%).
· New items were added (roughly 50%).
· Some old items were rewritten (in some cases to include a different stimulus, or different administration or scoring directions).(a)

(a) The BSID-II manual lists each new and deleted item. Care should be taken by veteran BSID examiners to note that the instructions for some items that appeared on the BSID have been changed or clarified for the BSID-II.


The standardization data were collected from a sample of 1,700 infants, aged 1 to 42 months. One hundred infants were in each of 17 age groups (50 females and 50 males in each age group). The ages sampled were at monthly intervals (plus or minus 1 week) through 6 months of age, 2-month intervals from 8 to 12 months of age (plus or minus 2 weeks), 3-month intervals from 15 through 30 months of age (plus or minus 3 weeks), and a 6-month interval between 36 and 42 months of age. The sample was stratified according to the 1988 update of the U.S. census by race/ethnicity, parent education, and geographic region. To be included in the normative sample, infants had to be full term (36 to 42 weeks gestation) with birth weight appropriate for gestational age, have no significant medical complications or disabilities, and not be receiving treatment or intervention for disabilities.
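The sampling scheme described above can be tallied as a quick consistency check. The following is only an illustrative sketch: the variable names are hypothetical, and the ages are simply those listed in the text.

```python
# Ages (in months) sampled for the standardization, as listed in the text.
monthly = list(range(1, 7))           # 1 through 6 months, monthly intervals
two_month = list(range(8, 13, 2))     # 8, 10, 12 months
three_month = list(range(15, 31, 3))  # 15, 18, 21, 24, 27, 30 months
six_month = [36, 42]                  # 36 and 42 months

age_groups = monthly + two_month + three_month + six_month
INFANTS_PER_GROUP = 100               # 50 females and 50 males per group

total_infants = len(age_groups) * INFANTS_PER_GROUP
print(len(age_groups), total_infants)  # prints: 17 1700
```

The tally reproduces the 17 age groups and the total of 1,700 infants reported for the standardization sample.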

Item Set Development

Unlike the BSID, the BSID-II has circumscribed item sets. The introduction of item sets is a major conceptual change in the BSID-II and one that has raised significant concern among examiners (Ross & Lawson, 1997). Rather than thinking of individual items, examiners must think in terms of item sets. Along with the change to item sets came a change in the basal and ceiling rules (Don't Forget 1.3). A major complaint about the BSID was the time that was sometimes required to establish a basal and ceiling of 10 consecutive passes and failures, respectively. In addition, it was difficult for examiners to know where to begin testing. Therefore, the scores for infants of the same age were not necessarily based on the same series of items. Item sets were constructed in the BSID-II to address these complaints. Item sets are organized by chronological age and include a series of items that increase in difficulty. They overlap such that the item set for age 9 months includes the most difficult items in the 8-month item set and the easiest items in the 10-month item set. Therefore, movement among item sets is straightforward and, in most cases, introduces only a few additional items. The item sets were designed to be broad enough that in most cases an examiner could establish the basal and ceiling within one item set. Each item set includes scores that are approximately ±1.5 standard deviations from the mean (approximately 78 to 122). When testing infants with scattered abilities or severe delays, clinical judgment is necessary to determine the initial item set. The use of item sets has been a major criticism of the BSID-II and will be addressed more fully in Chapter 5.
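The score range quoted for an item set follows from simple arithmetic. The sketch below assumes the index scores have a mean of 100 and a standard deviation of 15; those two values are not stated in this passage and are labeled as assumptions in the code, and the function name is hypothetical.

```python
ASSUMED_MEAN = 100.0  # assumption: index-score mean (not given in this passage)
ASSUMED_SD = 15.0     # assumption: index-score standard deviation

def index_range(k, mean=ASSUMED_MEAN, sd=ASSUMED_SD):
    """Return the (low, high) scores lying k standard deviations from the mean."""
    return (mean - k * sd, mean + k * sd)

low, high = index_range(1.5)
print(low, high)  # prints: 77.5 122.5
```

Under these assumptions the bounds are 77.5 and 122.5, which is consistent with the "approximately 78 to 122" given in the text.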


There have been hundreds of published studies that have used the Bayley Scales with both nonclinical and clinical samples of children, in clinical and research settings, and as an outcome as well as a predictor measure. Numerous investigators have examined the psychometric properties of the BSID or compared it with other infant assessments (e.g., Burns, Burns, & Kabacoff, 1992; Costarides & Schulman, 1998; Gerken, Eliason, & Arthur, 1994; LeTendre, Spiker, Scott, & Constantine, 1992). The BSID has been used with typically developing populations to describe variations in development (e.g., Kopp & McCall, 1982) and with at-risk populations to describe the impact of biological or environmental challenges on children's development (e.g., Arendt, Singer, Angelopoulos, Bass-Busdiecker, & Mascia, 1998; Russell et al., 1998). The Bayley has been translated into multiple languages and adapted for use throughout the world (e.g., Chung, Rhee, & Park, 1993; Godbole, Barve, & Chaudhari, 1997; Phatak, 1993). Caution is warranted in using the BSID in populations that differ from the standardization sample because most investigators have not conducted normative studies on the populations they are using. Therefore some examiners who use the BSID for research purposes report raw scores, rather than relying on U.S. norms (e.g., Sigman, Neumann, Carter, Cattle, D'Souza, & Bwibo, 1988).

Don't Forget

1.3 Item Sets and Basal and Ceiling Rules

· There are 22 item sets each for the Mental and Motor Scales designated by infant's age.
· Item sets for the Mental Scale have an average of 27 items with a range of 20 to 36 items. The Motor Scale has an average of 17 items with a range of 14 to 21 items.
· Item sets are listed on page 42 of the BSID-II manual and are demarcated on the Record Forms.
· The Mental Scale basal is achieved when credit is received for at least five items in an item set and the ceiling is achieved when no credit is received for at least three items in an item set.
· The Motor Scale basal is achieved when credit is received for at least four items in an item set and the ceiling is achieved when no credit is received for at least two items in an item set.
· If a basal is not achieved within the first item set administered, the examiner must go to the previous item set in an attempt to achieve a basal. The examiner continues in this fashion until a basal is reached.
· If a ceiling is not reached within the first item set, the examiner must proceed to the next higher item set in an attempt to reach a ceiling. The examiner continues in this fashion until a ceiling is reached.
· For most infants the basal and ceiling occur within the same item set, but they may occur in different item sets.
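The basal and ceiling rules listed in Don't Forget 1.3 can be expressed as a small decision sketch. This is an illustration only, not a scoring tool: the function names and the representation of an item set as a list of credit/no-credit values are hypothetical, and the sketch evaluates a single item set at a time. Standard administration and scoring follow the BSID-II manual.

```python
# Minimum counts from Don't Forget 1.3:
# (items with credit needed for a basal, items with no credit needed for a ceiling)
RULES = {"mental": (5, 3), "motor": (4, 2)}

def check_item_set(scores, scale):
    """Return (basal_met, ceiling_met) for one administered item set.

    scores: list of bools, True = credit received (hypothetical representation).
    scale:  "mental" or "motor".
    """
    basal_min, ceiling_min = RULES[scale]
    credits = sum(scores)
    no_credits = len(scores) - credits
    return credits >= basal_min, no_credits >= ceiling_min

def next_steps(scores, scale):
    """List what the rules call for after one item set has been administered."""
    basal_met, ceiling_met = check_item_set(scores, scale)
    steps = []
    if not basal_met:
        steps.append("administer previous item set")
    if not ceiling_met:
        steps.append("administer next item set")
    return steps or ["basal and ceiling established"]

# Example: a Motor Scale item set in which every item received credit.
print(next_steps([True] * 14, "motor"))  # prints: ['administer next item set']
```

In the example, the basal is met but no ceiling is reached, so the rules direct the examiner to the next higher item set.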


The response to the BSID-II in the research literature has been generally positive, and there is an emerging methodological literature on the BSID-II. The introduction of the BSID-II was met by a series of methodological comments regarding changes in the administration procedures, particularly as applied to premature infants and infants with developmental delays. Nellis and Gridley (1994) provide a comprehensive review of the changes in the BSID-II, together with recommendations for examiners and for subsequent revisions. For example, they suggest that examiners laminate the Cue Sheets so they can be used repeatedly. They also suggest that the manufacturer provide separate technical and administration manuals, rather than combining them into one manual; that there be better empirical support for the facet scores; and that norms be developed to describe development among children with index scores below 50.

The use of item sets has raised many questions. In a study involving 12-month-old infants who had been exposed to cocaine prenatally, Gauthier, Bauer, Messinger, and Closius (1999) illustrate how scores vary depending on the item set administered. Most infants (94%) met basal and ceiling criteria in the 11-, 12-, and 13-month item sets. Regardless of their chronological age, infants who received the 13-month item set achieved higher MDI and PDI scores than infants who were tested on the 11- or 12-month item sets. The authors recommend that examiners apply consistent rules (as recommended in the BSID-II manual) of starting to test with the chronological age item set. Other investigators have also raised concerns about the use of item sets with premature infants (Ross & Lawson, 1997) and infants with developmental delays (Washington, Scott, Johnson, Wendel, & Hay, 1998). Matula, Gyurke, and Aylward (1997) discuss the concerns raised by examiners and emphasize that the BSID-II norms apply only when the examiner adheres to standard administration procedures. Yet examiners can also use the BSID-II to test beyond the item sets to describe the infant's strengths and weaknesses.

Several investigators have compared the BSID and the BSID-II (Goldstein, Fogle, Wieber, & O'Shea, 1995; Tasbihsazan, Nettelbeck, & Kirby, 1997). Goldstein et al. tested premature infants at 12 months of age and Tasbihsazan et al. tested healthy infants from 18 months to 27 months of age. Both teams of investigators found that, as expected, mean scores on the BSID-II were lower than those on the BSID, suggesting that increased numbers of infants may be eligible for early intervention services.

There has been limited work on the BRS. Thompson, Wasserman, and Matula (1996) used two samples and three age groups to examine the factor structure of the BRS. Additional research is needed on the relationship between the BRS and other measures of development. The Infant Behavior Record of the BSID has been a useful clinical indicator of infant behavior (Wolf & Lozoff, 1985) and, with the strong psychometric properties of the BRS, it should be a useful clinical measure. Rapid Reference 1.3 provides basic information on the BSID-II and its publisher.

Rapid Reference

1.3 Bayley Scales of Infant Development–Second Edition

Author: Nancy Bayley
Publication date: 1993
Scales: Mental Scale, Motor Scale, Behavior Rating Scale
Age range: 1 to 42 months
Administration time: 30 to 60 minutes (depending on age of infant)
Qualifications of examiners: Examiners should have training and experience administering and interpreting standardized assessments with infants (from birth through 3.5 years of age). Test administration is more complex than with other standardized assessments because the examiner alters the sequence of items in response to the infant's behavior and performance. Test interpretation is also complex and requires training in infant development, atypical development, and factors that influence behavior and development. Typically examiners have training at the master's or doctoral level and supervised experience, in accordance with guidelines from the American Psychological Association.
Publisher: The Psychological Corporation, 555 Academic Court, San Antonio, TX 78204-2498; 800-211-8378 (to order by phone)
Price: $838




1. What is the age range for the normative data on the BSID-II?

2. What are the three scales that constitute the BSID-II?

3. Which of the following was not a major goal of the BSID revision?

(a) to develop additional subtests
(b) to expand the age range
(c) to provide more recent normative data
(d) to improve stimulus materials

4. Which of the following is a major strength of the BSID-II?

(a) Separate standardized scores are provided for cognitive, language, and motor abilities.
(b) Standardized scores are based on a large sample that is representative of the U.S. population.
(c) The normative data include a clinical sample.
(d) Item sets alleviated the necessity for basal and ceiling rules.

Answers: 1. 1 to 42 months; 2. Mental, Motor, and Behavior Rating Scales; 3. a; 4. b

