Research

Family Measurement, Methodology

FAMILY MEASUREMENT Murray A. Straus, Susan M. Ross

METHODOLOGY Alan Acock, Yoshie Sano

Advantages of Multiple-Item Measures

Multiple-item measures are emphasized in this entry because they are more likely to be valid than single-item measures. Although one good question or observation may be enough and thirty bad ones are useless, there are reasons why multiple-item measures are more likely to be valid. One reason is that most phenomena of interest to family researchers have multiple facets that can be adequately represented only by use of multiple items. A single question, for example, is unlikely to represent the multiple facets of marital satisfaction adequately.

A second reason for greater confidence in multiple-item measures is the inevitable risk of error in selecting items. If a single item is used and there is a conceptual error in formulating or scoring it, hypotheses that are tested by using that measure will not be supported even if they are true. However, when a multiple-item test is used, the adverse effect of a single invalid item is limited to a relatively small reduction in validity (Straus and Baron 1990). In a fifteen-item scale, for example, a defective item is only about 6.7 percent of the total, so the findings would closely parallel those obtained if all fifteen items were correct.

Multiple items are also desirable because measures of internal consistency reliability are based on the number of items in the measure and the correlation between them. Given a certain average correlation between items, the more items, the higher the reliability. If only three items are used, it is rarely possible to achieve a high level of reliability. Reliability needs to be high because it sets an upper limit on validity.
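The relationship between the number of items and reliability can be illustrated with the standardized alpha (Spearman-Brown) formula, which is not given in this entry and is used here as an assumption about how internal consistency is computed. The average inter-item correlation of .30 is a hypothetical value chosen only for illustration.

```python
def standardized_alpha(n_items: int, mean_r: float) -> float:
    """Standardized alpha for n_items whose average inter-item correlation is mean_r."""
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    return (n_items * mean_r) / (1 + (n_items - 1) * mean_r)


# Hypothetical average inter-item correlation of .30 (an assumed value).
for k in (3, 5, 10, 15):
    print(k, round(standardized_alpha(k, 0.30), 2))
```

With the average correlation held constant, the coefficient rises as items are added, which is why a three-item measure rarely reaches a high level of reliability.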

Status and Trends in Family Measurement

To investigate the quality of measurement in family research, all empirical studies published in two major U.S. family journals (Journal of Marriage and the Family and Journal of Family Psychology) were examined. To determine trends in the Journal of Marriage and the Family, issues from 1982 and 1992 were compared. For the Journal of Family Psychology, issues from 1987 (the year the journal was founded) and 1992 were compared. Of the 161 empirical research articles reviewed, slightly fewer than two-thirds used a multiple-item measure. This proportion increased from 46.9 percent initially to 68.1 percent in 1992. A typical article used more than one such instrument, so that a total of 219 multiple-item measures were used in these 161 articles. Reliability was reported in 79.4 percent of these articles. Reliability reporting increased from 53.3 percent initially to 90.6 percent in 1992. Six percent of the articles had as their main purpose describing a new measurement instrument or presenting data concerning an existing instrument.

How one interprets these statistics depends on the standard of comparison. Articles in sociology journals and child psychology and clinical psychology journals are appropriate comparisons because these are the disciplines closest to family studies and in which many family researchers were trained. For sociology, the findings listed above can be compared to those reported in a study by Murray A. Straus and Barbara Wauchope (1992), in which they examined empirical articles from the 1979 and 1989 issues of American Sociological Review, American Journal of Sociology, and Sociological Methods and Research. This comparison shows that articles in family journals pay considerably more attention to measurement than articles in leading sociological journals. None of the 185 articles in sociology journals was on a specific measure, whereas 6 percent of the articles in the family journals were devoted to describing or evaluating an instrument. This portends well for family research because it is an investment in tools for future research. Only one-third of the articles in the sociology journals used a multiple-item measure, compared to more than two-thirds (68 percent) of articles in the family journals. The record of family researchers also exceeds that of sociologists in respect to reporting reliability. Only about 10 percent of the articles in sociology journals, compared to 80 percent of the articles in family journals, reported the reliability of the instruments. The main problem area is validity; only 12.4 percent of the articles in family journals described or referenced evidence of validity. The fact that this is three times more than in sociology is not much consolation because 12 percent is still a small percentage. Moreover, reporting or citing information on validity did not increase from the base period. Since validity is probably the most crucial quality of an instrument, the low percentage and the lack of growth indicate that more attention needs to be paid to measurement in family research.

There is no comparable study of measures in child or clinical psychological journals.

Reasons for Underdevelopment of Measures

The limited production of standard and validated measures of family characteristics is probably the result of a number of causes. Conventional wisdom attributes it to a lack of time and other resources for instrument development and validation. This is not an adequate explanation because it is true of all the social sciences. Why do psychologists devote the most resources to developing and validating tests, sociologists the least, and family researchers fall in between?

One likely reason is a difference in rewards for measurement research. A related reason is a difference in the opportunities and constraints. In psychology, there are journals devoted to psychological measures in whole or in part, such as Educational and Psychological Measurement and Journal of Consulting and Clinical Psychology. There are no such journals in sociology or family studies. Moreover, there is a large market for psychological tests, and several major firms specialize in publishing tests. It is a multimillion-dollar industry, and authors of tests can earn substantial royalties. By contrast, sociology lacks the symbolic and economic reward system that underlies the institutionalization of test development as a major specialization in psychology. The field of family studies lies in between. In principle there should be a demand for tests because of the large number of family therapists, but few family therapists actually use tests.

A second explanation for the differences among psychology, family studies, and sociology in attention to measurement is a situational constraint inherent in the type of research done. A considerable amount of family research is done by survey methods—for example, the National Survey of Families and Households. Surveys of this type usually include measures of many variables in a single thirty- to sixty-minute interview. Clinical psychologists, on the other hand, often can use longer and therefore more reliable tests, because their clients have a greater stake in providing adequate data and will tolerate undergoing two or more hours of testing.

Third, most tests are developed for a specific study and there is rarely a place in the project budget for adequate measure development—test/retest reliability, concurrent and construct validity, and construction of normative tables. Even when the author of a measure does the psychometric research needed to enable others to evaluate whether the measure might be suitable for their research, family journals rarely allow enough space to present that material.

Fourth, the optimum procedure is for the author to write a paper describing the test, the theory underlying the test, the empirical procedures used to develop the test, reliability and validity evidence, and norms. This rarely occurs because of the lack of resources indicated above. In addition, most investigators are more interested in the substantive issues for which the project was funded.

Another reason why standardized tests are less frequently used in family research is that many studies are based on cases from agencies. A researcher studying child abuse who draws the cases from child protective services might not need a method of measuring child abuse. However, standardized tests are still needed because an adequate understanding of child abuse cannot depend solely on officially identified cases. It is important also to do research on cases that are not known to agencies, because such cases are much more numerous than cases known to agencies and because general population cases typically differ in important ways from the cases known to agencies (Straus 1990b).

The Future of Family Research Measures

There are grounds for optimism and grounds for concern about the future of family tests. The grounds for concern are, first, that in survey research on the family, concepts are often measured by a single interview question. Second, even when a multiple-item test is used, it is rarely on the basis of empirical evidence of reliability and validity. Third, the typical measure developed for use in a family study is never used in another study. One can speculate that this hiatus in the cumulative nature of research occurs because of the lack of evidence of reliability and validity and because authors rarely provide sufficient information to facilitate use of the instrument by others.

The grounds for optimism are to be found in the sizable and slowly growing number of standardized instruments, as listed in compendiums (e.g., Grotevant and Carlson 1989; Fredman and Sherman 1987; Touliatos, Perlmutter, and Straus 1990). A second ground for optimism is the rapid growth in the number of psychologists doing family research, because psychologists bring to family research an established tradition of test development. Similarly, the explosive growth of family therapy is grounds for optimism, because it is likely that more tests will gradually begin to be used for intake diagnosis. A third ground for optimism is the increasing use of some family measures in cultures other than those in which the measures were initially developed. For example, David H. Olson's Family Adaptability and Cohesion Evaluation Scales (FACES) (1993) have been used to research Chinese families (Philips, West, Shen, and Zheng 1998; Tang and Chung 1997; Wang, Zhang, Li, and Zhao 1998; Zhang et al. 1995), immigrants to Israel (Ben-David and Gilbar 1997; Gilbar 1997), and Ethiopian migrants (Ben-David and Erez-Darvish 1997). The cross-cultural use of measures allows for assessments of validity and reliability outside of the background assumptions of the cultures in which they were developed.

There is a certain irony in the second source of optimism, because basic researchers usually believe that they, not clinicians, represent quality in science. In respect to measurement, clinicians tend to demand instruments of higher quality than do basic researchers because the consequences of using an inadequate measure are more serious. When a basic researcher uses an instrument with low reliability or validity, it can lead to a Type II error—that is, failing to accept a true hypothesis. This may result in theoretical confusion or a paper not being published. But when a practitioner uses an invalid or unreliable instrument, the worst-case scenario can involve injury to a client. Consequently, clinicians need to demand more evidence of reliability and validity than do researchers. As a result, clinically oriented family researchers tend to produce and make available more adequate measures. Hubert M. Blalock (1982) argued that inconsistent findings and failure to find empirical support for sound theories may be due to lack of reliable and valid means of operationalizing concepts in the theories being tested. It follows that research will be on a sounder footing if researchers devote more attention to developing reliable and valid measures of family characteristics.

Bibliography

Ben-David, A., and Erez-Darvish, T. (1997). "The Effect of the Family on the Emotional Life of Ethiopian Immigrant Adolescents in Boarding Schools in Israel." Residential Treatment for Children and Youth 15(2):39–50.

Ben-David, A., and Gilbar, O. (1997). "Family, Migration, and Psychosocial Adjustment to Illness." Social Work in Health Care 26(2):53–67.

Blalock, H. M. (1982). Conceptualization and Measurement in the Social Sciences. Newbury Park, CA: Sage.

Burgess, E. W., and Cottrell, L. S. (1939). Predicting Success or Failure in Marriage. Englewood Cliffs, NJ: Prentice Hall.

Cronbach, L. J. (1970). Essentials of Psychological Testing. New York: Harper & Row.

Draper, T. W., and Marcos, A. C. (1990). Family Variables: Conceptualization, Measurement, and Use. Newbury Park, CA: Sage.

Fredman, N., and Sherman, R. (1987). Handbook of Measurements for Marriage and Family Therapy. New York: Brunner/Mazel.

Gilbar, O. (1997). "The Impact of Immigration Status and Family Function on the Psychosocial Adjustment of Cancer Patients." Families, Systems and Health 15(4):405–412.

Gottman, J. M. (1994). What Predicts Divorce? The Relationship Between Marital Process and Marital Outcome. Hillsdale, NJ: Erlbaum.

Grotevant, H. D., and Carlson, C. I. (1989). Family Assessment: A Guide to Methods and Measures. New York: Guilford.

Nie, N. H.; Hull, C. H.; Jenkins, J. G.; Steinbrenner, K.; and Bent, D. H. (1978). SPSS: Statistical Package for the Social Sciences. New York: McGraw-Hill.

Olson, D. H. (1993). "Circumplex Model of Marital and Family Systems: Assessing Family Functioning." In Normal Family Process, ed. F. Walsh. New York: Guilford Press.

Olson, D. H.; Russell, C. S.; and Sprenkle, D. H. (1989). Circumplex Model: Systemic Assessment and Treatment of Families. New York: Haworth Press.

Patterson, G. R., ed. (1982). Coercive Family Processes: A Social Learning Approach. Eugene, OR: Castalia.

Philips, M. R.; West, C. L.; Shen, Q.; and Zheng, Y. (1998). "Comparison of Schizophrenic Patients' Families and Normal Families in China, Using Chinese Versions of FACES-II and the Family Environment Scales." Family Process 37:95–106.

Spanier, G. B. (1976). "Measuring Dyadic Adjustment: The Quality of Marriage and Similar Dyads." Journal of Marriage and the Family 38:15–28.

Straus, M. A. (1964). "Measuring Families." In Handbook of Marriage and the Family, ed. H. T. Christensen. Chicago: Rand McNally.

Straus, M. A. (1990a). "The Conflict Tactics Scales and Its Critics: An Evaluation and New Data on Validity and Reliability." In Physical Violence in American Families: Risk Factors and Adaptations to Violence in 8,145 Families, ed. M. A. Straus and R. J. Gelles. New Brunswick, NJ: Transaction.

Straus, M. A. (1990b). "Injury and Frequency of Assault and the 'Representative Sample Fallacy' in Measuring Wife Beating and Child Abuse." In Physical Violence in American Families: Risk Factors and Adaptations to Violence in 8,145 Families, ed. M. A. Straus and R. J. Gelles. New Brunswick, NJ: Transaction.

Straus, M. A. (1992). "Measurement Instruments in Child Abuse Research." Paper prepared for the National Academy of Sciences panel of child abuse research. Durham, NH: Family Research Laboratory, University of New Hampshire.

Straus, M. A., and Baron, L. (1990). "The Strength of Weak Indicators: A Response to Gilles, Brown, Geletta, and Dalecki." Sociological Quarterly 31:619–624.

Straus, M. A., and Brown, B. W. (1978). Family Measurement Techniques, 2nd edition. Minneapolis: University of Minnesota Press.

Straus, M. A., and Wauchope, B. (1992). "Measurement Instruments." In Encyclopedia of Sociology, ed. E. F. Borgatta and M. L. Borgatta. New York: Macmillan.

Tang, C. S., and Chung, T. K. H. (1997). "Psychosexual Adjustment Following Sterilization: A Prospective Study on Chinese Women." Journal of Psychosomatic Research 42(2):187–196.

Touliatos, J.; Perlmutter, D.; and Straus, M. A. (2001). Handbook of Family Measurement Techniques, 4th edition. Thousand Oaks, CA: Sage.

Wampler, K. S., and Halverson, C. F., Jr. (1993). "Quantitative Measurement in Family Research." In Source Book of Family Theories and Methods: A Contextual Approach, ed. P. G. Boss, W. J. Doherty, R. LaRossa, W. R. Schumm, and S. K. Steinmetz. New York: Plenum.

Wang, Z.; Zhang, X.; Li, G.; and Zhao, Z. (1998). "A Study of Family Environment, Cohesion, and Adaptability in Heroin Addicts." Chinese Journal of Clinical Psychology 6(1):32–34.

Zhang, J.; Weng, Z.; Liu, Q.; Li, H.; Zhao, S.; Xu, Z.; Chen, W.; and Ran, H. (1995). "The Relationship of Depression of Family Members and Family Functions." Chinese Journal of Clinical Psychology 3(4):225–229.

MURRAY A. STRAUS (1995)
SUSAN M. ROSS (1995)
REVISED BY JAMES M. WHITE

Strategies for Data Collection

Data are the empirical information researchers use for drawing conclusions. Often researchers use a cross-sectional design, in which data are collected only once. This is a snapshot of how things are at a single time. Less common are longitudinal designs, in which the data are collected at least twice. Although each collection point provides a snapshot, it is possible to make inferences about changes. Time-series designs provide many snapshots, often more than thirty data collection points.

Cross-sectional design. A cross-sectional design can be used in a survey, experiment, in-depth interview, or observational study. The justification for this design is usually cost.

Suppose researchers are interested in the effects of divorce on children. A cross-sectional design could take a large sample of children and measure their well-being. The children would be divided by whether they experienced divorce. If the children who had experienced divorce fared worse on well-being, the researcher would conclude that divorce had adverse effects.

Cross-sectional analysis requires the researcher to examine covariates (related variables) to minimize alternative explanations. Children who experienced divorce probably lived in families that had conflict, and may fare worse because of this conflict rather than because their parents divorced. Researchers would ask for retrospective information about marital conflict before the divorce, income before and after the divorce, and so on. These covariates would be controlled to clarify the effects of divorce, as distinct from the effects of these other variables, because each covariate is an alternative explanation for the children's well-being.
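The logic of controlling for a covariate can be sketched with a small least-squares example. The data below are constructed, not empirical: well-being is defined to depend only on a hypothetical conflict score, while divorce is more common in high-conflict families. The helper names (solve, ols) are illustrative, and ordinary least squares via the normal equations is assumed as the statistical model.

```python
from statistics import mean


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Return least-squares coefficients; an intercept term is added as the first column."""
    x = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Constructed data: (divorced 0/1, prior marital conflict score).
children = [(1, 5), (1, 7), (1, 8), (1, 9), (0, 1), (0, 2), (0, 3), (0, 6)]
# Well-being depends on conflict only, so divorce has no effect of its own here.
well_being = [20 - 2 * conflict for _, conflict in children]

# Uncontrolled comparison: mean well-being of divorced minus non-divorced groups.
naive_gap = (mean(y for (d, _), y in zip(children, well_being) if d == 1)
             - mean(y for (d, _), y in zip(children, well_being) if d == 0))

# Controlled comparison: divorce coefficient with conflict held constant.
intercept, b_divorce, b_conflict = ols(children, well_being)

print("uncontrolled gap:", naive_gap)
print("divorce coefficient, controlling for conflict:", round(b_divorce, 6))
```

In this constructed case the uncontrolled comparison shows children of divorce faring worse, but the gap disappears once conflict is controlled, which is the alternative explanation the covariate is meant to rule out.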

Longitudinal design. When data are collected at different times, causal order is clear; the variables measured at time one can cause the variables at time two, but not the reverse. When variables are measured imperfectly, however, the errors in the first wave are often correlated with the errors in the second and third waves. Therefore, statistical analyses of longitudinal data are typically very complex.

The question concerning the influence of divorce on the well-being of children illustrates advantages and disadvantages of longitudinal strategies. The well-being of children is measured at one time. Five years later the researcher would contact the same children and measure their well-being. Some of the children's parents would have gotten divorced. Children who experienced divorce could have their well-being at time two compared to time one. The difference would be attributed to the effects of divorce. By knowing the well-being of these children five years earlier, some controls for the influence of conflict would be automatically in place.

Although longitudinal designs are very appealing, they present some basic problems. After five years, the researcher may locate only 60 or 70 percent of the children. Those who vanished in the interval might have altered the researcher's conclusions. Second, five years is a long time in the life of a child, and many influences could have entered his or her life. Statistically these problems can be minimized, but the analysis is quite complex.

Time-series design. Although some people use the labels time-series and longitudinal interchangeably, time-series analyses require measures made many times, usually thirty times or more. Tracking the participants over time allows changes to be described and attributed to life events.

Using the example of the effects of divorce on children, a researcher may be interested in how effects vary over time. Perhaps there is an initial negative effect that diminishes over time. Alternatively, initial adverse effects may decrease over time for girls but increase for boys.

Design for Collecting Data

Researchers have a variety of approaches and designs for collection of data. Three common designs are surveys, experiments and quasi-experiments, and observation and in-depth interviews.

Surveys. The most common data collection strategy is the survey. For example, the National Longitudinal Survey of Youth 1997 (NLSY97) is a sample of nearly 9,000 twelve- to sixteen-year-old adolescents and their families. This survey will be completed each year for this panel of youth, as they become adults. Such surveys allow researchers to generalize to a larger population, such as that of the United States, and to use longitudinal methods such as growth curves. Because these surveys are large, researchers can study special populations such as adolescents in single-parent families, teen mothers, and juvenile delinquents. These are "general-purpose" surveys, and independent scholars who had nothing to do with the data collection may have access to it to analyze the results.

A second type of survey focuses on special populations. Researchers with a particular interest—for example, middle-aged daughters caring for aged mothers—focus all of their resources on collecting data about a special group. In many cases, these surveys are not probability samples. Credibility for generalizing comes from comparing the profile of participants to demographic information. An advantage of these surveys is that they can ask questions the researcher wants to ask. There might be a twenty-item scale to measure the physical dependency of an aged mother. Such detailed measurements are not usually available in general-purpose surveys. Because the subject of specialized studies is focused, it is often possible to include more open-ended questions than would be practical in a general-purpose survey.

Experiments and quasi-experiments. Experimental designs are used when internal validity is critical (Brown and Melamed 1990). Experiments provide stronger evidence of causal relationship than surveys because an experiment involves random assignment of subjects to groups and the manipulation of the independent variable by the researcher. Nevertheless, experimental designs give up some external validity as they gain internal validity. Because of the difficulty or impossibility of locating subjects who will volunteer to be assigned randomly to groups, many experiments are based on "captive" populations such as college students. Captive populations are fairly homogeneous regarding age, education, race, and socioeconomic status, making it difficult to generalize to a broader population. Experiments that involve putting strangers together for a short experience provide groups that differ qualitatively from naturally occurring groups such as families (Copeland and White 1991).

Many research questions are difficult to address using experiments. Suppose a survey result shows a negative correlation between husband-wife conflict and child well-being. A true experiment requires both randomization of subjects and manipulation of the independent variable. The researcher cannot randomly assign children to families. Nor can the level of husband-wife conflict be manipulated.

Observation and in-depth interviews. Both qualitative and quantitative researchers use observation and in-depth interviews. This may be done in a deliberately unstructured way. For instance, a researcher may observe the interaction between an African-American mother and her child when the child is dropped off at a childcare facility, comparing this to the mother-child interaction for other ethnic and racial groups. The researcher may structure this observation by focusing on specific aspects such as counting tactile contact (i.e., touching or hugging). For many qualitative researchers, however, the aspects of interaction that are recorded emerge after a long period of unstructured observation.

A quantitative researcher may have an elaborate coding system for observing family interaction. This may involve videotaping either ordinary (real life) or contrived situations. A researcher interested in family decision making might give each family a task, such as deciding what they would do with $1,000. Alternatively, the researcher might record family interaction at the dinner table. The videotape would be analyzed using multiple observers and a prearranged system. Observers might record how often each family member spoke, how often each member suggested a solution, how often each member tried to relieve tension, and how often each member solicited opinions from others (Bales 1950).

In-depth interviews are widely used by qualitative researchers. When someone is trying to understand how families work, in-depth interviews are an important resource. In-depth interviews vary in their degree of structure. A white researcher, who is married, has a middle-class background and limited experience in interracial settings, may want to understand the relationship between nonresident African-American fathers and their children. Such a researcher would gain much from unstructured in-depth interviews with nonresident African-American fathers and their children, including knowledge to replace assumptions and stereotypes. It may take a series of extended, unstructured interviews before the researcher is competent to develop a structured interview, much less design a survey or an experiment.

Many scholars would limit in-depth interviews and observational studies to areas where knowledge is limited. A major advantage of such designs, however, is that they open up research to new perspectives precisely where survey or experimental researchers naively believe they have detailed knowledge. By grounding research in the behavior and interactions of ordinary people, researchers may be less prone to impose explanations developed by others.

Two major problems are evident with observation and in-depth interviews. First, these approaches are time-consuming and make it costly to have a large or representative sample. Second, there are dangers of the researcher losing objectivity. When a researcher spends months with a group either as a participant or an observer, there is a danger of identifying so much with the group that objectivity is lost.

Selected other strategies. Case studies are used on rare populations such as families in which a child has AIDS. Content analysis and narrative analysis are used to identify emergent themes. For example, a review of the role of fathers in popular novels of the 1930s, 1960s, and 1990s will tell much about the changing ideology of family roles. Historical analysis has experienced a remarkable growth in the past several decades (Lee 1999), as evidenced by a major journal, the Journal of Family History. Demographic analysis is sometimes done to provide background information (economic well-being of continuously single families; see Acock and Demo 1994), to document trends (demographic change of U.S. families; see Teachman, Tedrow, and Crowder 2000), and to support comparative studies (development of close relationships in Japan and the United States; see Rothbaum, Pott, Azuma, Miyake, and Weisz 2000). Increasingly, studies are using multiple approaches: quantitative, qualitative, and historical. Using multiple methods is called triangulation.

Measurement

All methodological orientations share a common need for measurement. Scientific advancement in many fields is built on progress in measurement (Draper and Marcos 1990). Good measurement is critical to family studies because of the complexity of the variables being measured. Most concepts have multiple dimensions and a subjective component. A happy marriage for the husband may be a miserable marriage for the wife. A daughter may have a positive relationship with her father centered on her performance in sports but a highly negative relationship with her father centered on her sexual activity. Ignoring multiple dimensions and the subjective components of measurement is a problem for both quantitative and qualitative researchers.

Scales. The most common, the Likert scale, gives the participant a series of statements about a concept, and the participant checks whether he or she strongly agrees, agrees, does not know, disagrees, or strongly disagrees with each of the statements. Often fewer than ten questions are asked, but they are chosen in a way that represents the full domain of the concept. Thus, to measure marital happiness, several items would be used to represent various aspects of the marriage.

The following is becoming a minimum standard for evaluating a scale. First, a factor analysis is done to see if the several questions converge on a single concept. Second, the reliability of the result (whether the scale gives a consistent result when administered again) is measured. This is done by using the scale twice on the same people and seeing if their answers are consistent or by using the alpha coefficient as a measure of reliability. The alpha coefficient indicates the internal consistency of the scale and should have a value of .70 or greater. This minimum standard has been emerging since the early 1980s. Few studies met these minimum standards before 1980. There has been progress, but this is still a problem today.
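As a minimal sketch of how the alpha coefficient might be computed from raw responses, the example below applies the usual Cronbach's alpha formula, which is assumed here rather than stated in this entry. The response matrix is a made-up set of answers from six participants to a four-item, five-point Likert scale.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; responses is a list of rows, one per participant, of item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in zip(*responses))
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses (1 = strongly disagree ... 5 = strongly agree).
responses = [
    [5, 4, 5, 4],
    [4, 4, 4, 3],
    [2, 2, 1, 2],
    [3, 3, 3, 3],
    [1, 2, 1, 1],
    [4, 5, 4, 4],
]

alpha = cronbach_alpha(responses)
print("alpha meets the .70 standard:", alpha >= 0.70)
```

A scale would be checked against the .70 threshold described above only after the factor analysis step, since alpha by itself does not show that the items converge on a single concept.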

Additional procedures are done to assess the validity of the scales—that is, whether a scale measures what it is intended to measure (Carmines and McIver 1979). This is most often evaluated by correlating a new scale with various criteria such as existing scales of the same concept or outcomes that are related to the concepts.

Questionnaires and interviews. Questionnaires are the most commonly used methods of measuring the variables in a study. A questionnaire may be designed so that it can be self-administered by the participant, asked in a face-to-face interview, or administered by telephone.

Computer-assisted interviews can be used for all three collection procedures. Self-administered questionnaires are now completed by putting the participant in front of a computer. After the participant answers a question, the computer automatically goes to the next appropriate question. This allows each participant to have an individually tailored questionnaire. The use of Web-based questionnaires is becoming more common.
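The branching described above can be sketched as a small routing table. The question identifiers, wording, and routing rules below are hypothetical and serve only to show how each answer determines the next appropriate question.

```python
# Hypothetical questionnaire: each question maps an answer to the next question id.
# The key "*" means "any answer"; None means the interview ends.
QUESTIONNAIRE = {
    "married": {
        "text": "Are you currently married?",
        "next": {"yes": "years_married", "no": "has_children"},
    },
    "years_married": {
        "text": "How many years have you been married?",
        "next": {"*": "has_children"},
    },
    "has_children": {
        "text": "Do you have children?",
        "next": {"yes": "num_children", "no": None},
    },
    "num_children": {
        "text": "How many children do you have?",
        "next": {"*": None},
    },
}


def questions_asked(answers, start="married"):
    """Return the ids of the questions one participant would see, in order."""
    path = []
    current = start
    while current is not None:
        path.append(current)
        routes = QUESTIONNAIRE[current]["next"]
        # Follow the route for this answer, falling back to the wildcard route.
        current = routes.get(answers[current], routes.get("*"))
    return path


print(questions_asked({"married": "no", "has_children": "no"}))
print(questions_asked({"married": "yes", "years_married": "12",
                       "has_children": "yes", "num_children": "2"}))
```

The two participants see different sets of questions, which is the sense in which the questionnaire is individually tailored.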

Difficulties of cross-cultural comparative analysis. Common sources of measurement error stem from insensitivity to gender, race, and culture (Van de Vijver and Leung 1996). Constructing culturally sensitive instruments is particularly salient when a researcher and subjects do not share the same language (Rubin and Babbie 2000; Hambleton and Kanjee 1995). Direct translation of a particular word may not hold the same connotation in another language. Validity of questions can also be an issue. A researcher trying to measure parenting skills in Japan and the United States may ask: "How do you rate your parenting skills? Would you say they are: (a) excellent, (b) good, (c) fair, or (d) not good?" Because of a cultural value on humbleness, Japanese parents may rate themselves lower than do American parents. The findings from this question might be reliable, but they are certainly not valid for comparing the two cultures. Social desirability and how participants react to particular questions should be carefully examined in an appropriate cultural context.

It is not possible to completely avoid cultural biases, but there are some steps to minimize the effect of them. A rule of thumb for researchers is to become immersed in the culture before selecting, constructing, or administering measures. A researcher may utilize knowledgeable informants in the study population, use translation and back-translation of instruments, and pretest measures for reliability and validity before conducting the study.

Missing data. Regardless of the approach to measurement or research design, missing data is a problem. In longitudinal strategies missing data often comes from subjects dropping out of the studies. In cross-sectional strategies missing data often comes from participants refusing to answer questions. Readers should pay special attention to the amount of missing data. It is not unusual for studies to have 20 percent or more of the cases missing from the analysis. If those who drop out of a study or refuse to answer questions differ on the dependent variable from those who remain, then the results will be biased.

There is no simple solution to missing data. Researchers often impute a value for missing cases. For example, if 10 percent of the participants did not report their income, the researchers might substitute the median income of those who did report their income. A slightly better solution is to substitute the median for homogeneous subgroups. Instead of using the overall median, the researcher might substitute a different median, depending on the participant's gender and education. There are many other imputation methods, involving more complex statistical analysis (see Rubin 1987; Acock 1997; Roth 1994; Ward and Clark 1991). In any case, it is important to report information about participants who have missing data.
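The two simple strategies just described, substituting the overall median or the median of a homogeneous subgroup, might look like the following minimal sketch. The function name `impute_median`, the field names, and the sample incomes are all made up for illustration; the more complex methods in the cited sources are not shown.

```python
from collections import defaultdict
from statistics import median


def impute_median(records, target, group_keys=()):
    """Return copies of `records` with missing `target` values (None) filled in.

    With no `group_keys`, the overall median of reported values is used.
    With `group_keys`, the median of the matching subgroup is used, falling
    back to the overall median if nobody in that subgroup reported a value.
    Each returned record is flagged so imputed cases can be reported.
    """
    reported = [r[target] for r in records if r[target] is not None]
    overall = median(reported)

    # Collect reported values separately for each subgroup.
    by_group = defaultdict(list)
    for r in records:
        if r[target] is not None:
            by_group[tuple(r[k] for k in group_keys)].append(r[target])

    filled = []
    for r in records:
        new = dict(r)
        missing = new[target] is None
        if missing:
            values = by_group.get(tuple(r[k] for k in group_keys))
            new[target] = median(values) if values else overall
        new[target + "_imputed"] = missing
        filled.append(new)
    return filled


# Hypothetical sample: two participants did not report their income.
sample = [
    {"gender": "F", "education": "college", "income": 52000},
    {"gender": "F", "education": "college", "income": 60000},
    {"gender": "F", "education": "college", "income": None},
    {"gender": "M", "education": "high school", "income": 30000},
    {"gender": "M", "education": "high school", "income": 34000},
    {"gender": "M", "education": "high school", "income": None},
]

overall_fill = impute_median(sample, "income")
subgroup_fill = impute_median(sample, "income", ("gender", "education"))
```

In this made-up sample the overall median assigns both nonrespondents the same income, while the subgroup medians assign each one a value typical of similar participants. The `income_imputed` flag is one way to keep track of which cases were filled in.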

Quantitative Analysis

The variety of statistical analysis techniques seems endless. The statistical procedures range from descriptive (e.g., means, standard deviations, percentages) to multivariate (e.g., ANCOVA, MANOVA, logistic regression, principal component and factor analysis, structural equation modeling, hierarchical linear modeling, event history analysis, and latent growth curves). Most analysis involves several independent variables. Ordinary least squares (OLS) regression is widely used as a basic statistical model. It allows researchers to include multiple independent variables (predictors) and systematically control for important covariates. Many of the procedures are either special cases of OLS regression (e.g., ANOVA, ANCOVA) or extensions (e.g., logistic regression, structural equation modeling). There is also clear evidence that factor analysis procedures and their extensions, such as confirmatory factor analysis, play a major role in evaluating how well variables are measured.
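For readers who want the basic model written out, an OLS regression with several predictors can be expressed in standard textbook form as below. The notation is an assumption introduced here for illustration and does not appear in the original entry: y_i is the dependent variable for case i, x_{i1} through x_{ik} are the independent variables and covariates, the betas are coefficients, and epsilon_i is the error term.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i,
\qquad
\hat{\boldsymbol{\beta}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}
```

Each coefficient is interpreted as the expected change in y for a one-unit change in that predictor with the other predictors held constant, which is the sense in which the model controls for covariates.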

Special Problems and Ethical Issues

Family researchers study the issues that concern people the most—factors that enhance or harm the well-being of people and families. This often involves asking sensitive questions. Most studies have a high compliance rate, with 80 percent to 90 percent of the people answering most questions. When studies begin by asking questions that participants are willing to answer, the participants buy into their role and later report intimate information. The reality is that participants will tell interviewers, who are strangers, personal information they would never share with members of their own family.

Although researchers can get people to cooperate with studies, a crucial question is how the researchers should limit themselves in what they ask people to do. All universities have committees that review research proposals where human subjects are involved. Researchers need to demonstrate that the results of their study are sufficiently promising to justify any risks to their subjects. Researchers must take precautions to minimize risks. Sometimes this involves anonymity for the participants (no name or identification associated with participants); sometimes it involves confidentiality (name or identification known only to the project's staff). It also involves informed consent, wherein people agree to participate after they are told about the project. Informed consent is a special problem with qualitative research. The design of qualitative research is emergent in that the researcher does not know exactly what is being tested before going into the field. Consequently, it is difficult to have meaningful informed consent. The participants simply do not know enough about the project when they are asked to participate.

Even with the best intentions, subjects can be put at risk. Asking adolescents about their relationship with a nonresident father may revive problems that had been put to rest. In some cases, the effect of this can be positive; in some cases, it can be negative. Observational studies and participant observation studies are especially prone to risks for subjects. A scholar interested in interaction between family members and physicians when a family member is on an extraordinary life-support system is dealing with very important questions. Who decides to turn the machine off? What is the role of the physician? What are the roles for different family members? All these are important questions. The presence of the researcher may be extremely intrusive and may even influence the decision-making process. This potential influence involves serious ethical considerations.

Another special risk for qualitative work is unanticipated self-exposure (Berg 2001). As the project develops, the participant may reveal information about self or associates that goes beyond the original informed consent agreement.

Feminist methodology is not a particular research design or data collection method (Nielsen 1990). It is distinguished by directly stating the researchers' values, explicitly recognizing the influence research has on the researcher, being sensitive to how family arrangements are sources of both support and oppression for women, and having the intention of doing research that benefits women rather than simply being about women (Allen and Walker 1993). Given this worldview, feminist methodology presents complex ethical issues to researchers, and it demands that all family scholars be sensitive to these concerns.

Conclusion

The diversity of strategies, designs, and methods of analysis used by marriage and family researchers reflects the equally diverse root disciplines and content areas that overlap the study of marriage and family. Despite this diversity, cross-sectional surveys remain the most widely used strategy, and quantitative analysis is dominant in the reporting of research results in the professional literature. However, experimental, longitudinal, time-series, and qualitative strategies also remain crucial tools for research.

Bibliography

Acock, A. C. (1997). "Working with Missing Data." Family Science Review 10(1):76–102.

Acock, A. C., and Demo, D. (1994). Family Diversity and Well-Being. Newbury Park, CA: Sage.

Allen, K. R., and Walker, A. J. (1993). "A Feminist Analysis of Interviews with Elderly Mothers and Their Daughters." In Qualitative Methods in Family Research, ed. J. F. Gilgun, K. Daly, and G. Handel. Newbury Park, CA: Sage.

Bales, R. F. (1950). Interaction Process Analysis: A Method for the Study of Small Groups. Cambridge, MA: Addison-Wesley.

Berg, B. L. (2001). Qualitative Research Methods for the Social Sciences, 4th edition. Boston: Allyn and Bacon.

Brown, S. R., and Melamed, L. (1990). Experimental Design and Analysis. Newbury Park, CA: Sage.

Carmines, E. G., and McIver, J. P. (1979). Reliability and Validity Assessment. Newbury Park, CA: Sage.

Copeland, A. P., and White, K. M. (1991). Studying Families. Newbury Park, CA: Sage.

Draper, T., and Marcos, A. C. (1990). Family Variables: Conceptualization, Measurement, and Use. Newbury Park, CA: Sage.

Hambleton, R. K., and Kanjee, A. (1995). "Increasing the Validity of Cross-Cultural Assessments: Use of Improved Methods for Test Adaptations." European Journal of Psychological Assessment 11(3):147–157.

Larzelere, R. E., and Klein, D. M. (1987). "Methodology." In Handbook of Marriage and the Family, ed. M. B. Sussman and S. K. Steinmetz. New York: Plenum.

Lee, G. R. (1999). Comparative Perspectives, ed. M. B. Sussman, S. K. Steinmetz, and G. W. Peterson. New York: Plenum Press.

Nielsen, J. M. (1990). Introduction to Feminist Research Methods, ed. J. M. Nielsen. Boulder, CO: Westview Press.

Roth, P. L. (1994). "Missing Data: A Conceptual Review for Applied Psychologists." Personnel Psychology 47:537–560.

Rothbaum, F.; Pott, M.; Azuma, H.; Miyake, K.; and Weisz, J. (2000). "The Development of Close Relationships in Japan and the United States: Paths of Symbiotic Harmony and Generative Tension." Child Development 71(5):1121–1142.

Rubin, A., and Babbie, E. (2000). Research Methods for Social Work, 4th edition. Belmont, CA: Wadsworth.

Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons.

Schumm, W. R., and Hemesath, K. K. (1999). Measurement in Family Studies, ed. M. B. Sussman, S. K. Steinmetz, and G. W. Peterson. New York: Plenum Press.

Teachman, J. D.; Tedrow, L. M.; and Crowder, K. D. (2000). "The Changing Demography of America's Families." Journal of Marriage and the Family 62(November):1234–1246.

Van de Vijver, F., and Leung, K. (1996). "Methods and Data Analysis of Comparative Research." In Handbook of Cross-Cultural Psychology, 2nd edition, Vol. 3, ed. J. W. Berry, Y. H. Poortinga, and J. Pandey. Needham, MA: Allyn & Bacon.

Ward, T. J., and Clark, H. T. (1991). "A Reexamination of Public-Versus Private-School Achievement: The Case for Missing Data." Journal of Educational Research 84:153–163.

ALAN ACOCK
YOSHIE SANO
