
Sociological Methodology
SSCI-ISI, SCOPUS (1987-1990, 1993, 1996-2023)
ISSN (online): 1467-9531
ISSN (print): 0081-1750
United Kingdom
Publisher: SAGE Publications Inc., SAGE Publications Ltd
Representative articles
Survey and longitudinal studies in the social and behavioral sciences generally contain missing data. Mean and covariance structure models play an important role in analyzing such data. Two promising methods for dealing with missing data are direct maximum likelihood and a two-stage approach based on the unstructured mean and covariance estimates obtained by the EM algorithm. Typical assumptions under these two methods are ignorable nonresponse and normality of the data. However, data sets in the social and behavioral sciences are seldom normal, and experience with these procedures indicates that normal-theory-based methods applied to nonnormal data very often lead to incorrect model evaluations. By dropping the normality assumption, we develop more accurate procedures for model inference. Based on the theory of generalized estimating equations, we give a way to obtain consistent standard errors of the two-stage estimates. The asymptotic efficiencies of different estimators are compared under various assumptions. We also propose a minimum chi-square approach and show that the estimator obtained by this approach is asymptotically at least as efficient as the two likelihood-based estimators for either normal or nonnormal data. The major contribution of this paper is that, for each estimator, we give a test statistic whose asymptotic distribution is chi-square as long as the underlying sampling distribution has finite fourth-order moments. We also give a characterization of each of the two likelihood ratio test statistics when the underlying distribution is nonnormal, and we give modifications to the likelihood ratio statistics. Our working assumption is that the missing data mechanism is missing completely at random. Examples and Monte Carlo studies indicate that, for commonly encountered nonnormal distributions, the procedures developed in this paper are quite reliable even for samples with missing data that are missing at random.
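As an illustration of the first stage of the two-stage approach, the sketch below runs the EM algorithm for the unstructured (saturated) mean and covariance of multivariate normal data with entries missing completely at random. This is a generic textbook EM iteration, not the paper's code; the function name and convergence settings are our assumptions.

```python
import numpy as np

def em_mean_cov(X, n_iter=200, tol=1e-8):
    """EM estimates of the unstructured mean and covariance of a
    multivariate normal from data with entries missing completely at
    random. X: (n, p) array with np.nan marking missing values."""
    n, p = X.shape
    mu = np.nanmean(X, axis=0)                  # initialize from observed data
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        mu_new = np.zeros(p)
        s_new = np.zeros((p, p))
        for i in range(n):
            obs = ~np.isnan(X[i])
            mis = ~obs
            x_hat = np.where(obs, X[i], 0.0)
            c_i = np.zeros((p, p))
            if not obs.any():                   # fully missing row
                x_hat = mu.copy()
                c_i = sigma.copy()
            elif mis.any():
                # E-step: conditional mean and covariance of the missing
                # block given the observed block
                s_oo = sigma[np.ix_(obs, obs)]
                s_mo = sigma[np.ix_(mis, obs)]
                w = s_mo @ np.linalg.solve(s_oo, np.eye(obs.sum()))
                x_hat[mis] = mu[mis] + w @ (X[i, obs] - mu[obs])
                c_i[np.ix_(mis, mis)] = sigma[np.ix_(mis, mis)] - w @ s_mo.T
            mu_new += x_hat
            s_new += np.outer(x_hat, x_hat) + c_i
        mu_new /= n
        sigma_new = s_new / n - np.outer(mu_new, mu_new)  # M-step
        if np.max(np.abs(sigma_new - sigma)) < tol:
            return mu_new, sigma_new
        mu, sigma = mu_new, sigma_new
    return mu, sigma
```

The second stage would then fit the structured model to these EM estimates; the paper's contribution concerns the standard errors and test statistics attached to that fit under nonnormality.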
The most promising class of statistical models for expressing structural properties of social networks observed at one moment in time is the class of exponential random graph models (ERGMs), also known as p* models. The strong point of these models is that they can represent a variety of structural tendencies, such as transitivity, that define complicated dependence patterns not easily captured by more basic probability models. Recently, Markov chain Monte Carlo (MCMC) algorithms have been developed that produce approximate maximum likelihood estimators. Applying these models in their traditional specification to observed network data has often led to problems, however. These problems can be traced back to the fact that important parts of the parameter space correspond to nearly degenerate distributions, which may cause convergence problems for estimation algorithms and poor fit to empirical data.
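To make the model class concrete, here is a minimal Metropolis "toggle" sampler for the classical edge-plus-triangle ERGM, the kind of traditional specification whose near-degeneracy is at issue. The function name, parameter values, and step count are illustrative assumptions; MCMC maximum likelihood estimation would wrap such a sampler inside an optimization loop.

```python
import itertools
import numpy as np

def ergm_edge_triangle_sample(n, theta_edge, theta_tri, n_steps=20000, seed=0):
    """Metropolis sampler for an ERGM with edge and triangle statistics:
    P(y) proportional to exp(theta_edge * edges(y) + theta_tri * triangles(y)).
    Toggles one dyad at a time; returns the adjacency matrix after n_steps."""
    rng = np.random.default_rng(seed)
    y = np.zeros((n, n), dtype=int)
    dyads = list(itertools.combinations(range(n), 2))
    for _ in range(n_steps):
        i, j = dyads[rng.integers(len(dyads))]
        # change statistics for toggling tie (i, j): one edge, plus one
        # triangle for every common neighbor of i and j
        d_tri = int(np.sum(y[i] * y[j]))
        sign = -1 if y[i, j] else 1          # removing vs. adding the tie
        log_ratio = sign * (theta_edge + theta_tri * d_tri)
        if np.log(rng.random()) < log_ratio:
            y[i, j] = y[j, i] = 1 - y[i, j]
    return y
```

With a positive triangle parameter, long runs of such a sampler tend to jump between near-empty and near-complete graphs, which is exactly the near-degeneracy the abstract describes.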
This paper proposes new specifications of exponential random graph models. These specifications represent structural properties such as transitivity and heterogeneity of degrees by more complicated graph statistics than the traditional star and triangle counts. Three kinds of statistics are proposed: geometrically weighted degree distributions, alternating k-triangles, and alternating independent two-paths. Examples are presented of modeling both graphs and digraphs, in which the new specifications lead to much better results than the earlier specifications of the ERGM. It is concluded that the new specifications increase the range and applicability of the ERGM as a tool for the statistical analysis of social networks.
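As a small numerical illustration of how such statistics compress the degree distribution, the sketch below evaluates an alternating k-star statistic directly from a degree sequence, using one common closed form in the degrees; the weight parameter lam and the function name are our conventions, and this statistic is (up to a linear transformation) equivalent to a geometrically weighted degree statistic.

```python
import numpy as np

def alternating_k_stars(degrees, lam=2.0):
    """Alternating k-star statistic computed from the degree sequence via
    the closed form u(y) = lam^2 * sum_i [(1 - 1/lam)^d_i - 1 + d_i/lam].
    lam >= 1 controls how strongly higher-order stars are down-weighted,
    so a single high-degree node cannot dominate the statistic."""
    d = np.asarray(degrees, dtype=float)
    return lam**2 * np.sum((1.0 - 1.0 / lam) ** d - 1.0 + d / lam)

# toy comparison: a hub-dominated vs. an even degree sequence
print(alternating_k_stars([9, 1, 1, 1, 1, 1, 1, 1, 1, 1]))
print(alternating_k_stars([2, 2, 2, 2, 2, 2, 2, 2, 1, 1]))
```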
Logit and probit models are widely used in empirical sociological research. However, the common practice of comparing the coefficients of a given variable across differently specified models fitted to the same sample does not warrant the same interpretation in logits and probits as in linear regression. Unlike in linear models, the change in the coefficient of the variable of interest cannot be straightforwardly attributed to the inclusion of confounding variables. The reason for this is that the variance of the underlying latent variable is not identified and will differ between models. We refer to this as the problem of rescaling. We propose a solution that allows researchers to assess the influence of confounding relative to the influence of rescaling, and we develop a test to assess the statistical significance of confounding. A further problem in making comparisons is that, in most cases, the error distribution, and not just its variance, will differ across models. Monte Carlo analyses indicate that other methods that have been proposed for dealing with the rescaling problem can lead to mistaken inferences if the error distributions are very different. In contrast, in all scenarios studied, our approach performs at least as well as, and in some cases better than, the others when faced with differences in the error distributions. We present an example of our method using data from the National Education Longitudinal Study.
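The rescaling problem is easy to demonstrate by simulation. The sketch below generates data from a latent-variable logit and contrasts the naive coefficient comparison with a KHB-style comparison in which (as we understand the method) the confounder is replaced by its residual from a regression on the variable of interest, so that the reduced and full models share the same latent scale. All variable names and effect sizes here are invented for the illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 5000
x = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)                 # z confounds x
y_star = 1.0 * x + 1.0 * z + rng.logistic(size=n)  # latent index, logit errors
y = (y_star > 0).astype(int)

X_red = sm.add_constant(x)
X_full = sm.add_constant(np.column_stack([x, z]))
b_reduced = sm.Logit(y, X_red).fit(disp=0).params[1]
b_full = sm.Logit(y, X_full).fit(disp=0).params[1]

# KHB-style: replace z by its residual from an OLS regression of z on x,
# putting the 'reduced' coefficient on the full model's latent scale
z_resid = z - sm.OLS(z, X_red).fit().fittedvalues
X_khb = sm.add_constant(np.column_stack([x, z_resid]))
b_khb = sm.Logit(y, X_khb).fit(disp=0).params[1]

print(f"naive reduced: {b_reduced:.3f}   full: {b_full:.3f}")
print(f"KHB reduced:   {b_khb:.3f}   confounding: {b_khb - b_full:.3f}")
```

The naive reduced-versus-full difference mixes confounding with rescaling; the KHB-style difference isolates the confounding component.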
The measurement of residential segregation patterns and trends has been limited by a reliance on segregation measures that do not appropriately take into account the spatial patterning of population distributions. In this paper we define a general approach to measuring spatial segregation among multiple population groups. This general approach allows researchers to specify any theoretically based definition of spatial proximity desired in computing segregation measures. Based on this general approach, we develop a general spatial exposure/isolation index (P̃*) and a set of general multigroup spatial evenness/clustering indices: a spatial information theory index (H̃), a spatial relative diversity index (R̃), and a spatial dissimilarity index (D̃). We review these and previously proposed spatial segregation indices against a set of eight desirable properties of spatial segregation indices. We conclude that the spatial exposure/isolation index P̃*, which can be interpreted as a measure of the average composition of individuals' local spatial environments, and the spatial information theory index H̃, which can be interpreted as a measure of the variation in the diversity of the local spatial environments of each individual, are the most conceptually and mathematically satisfactory of the proposed spatial indices.
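The general approach can be prototyped in a few lines once a proximity function is fixed. The sketch below uses a Gaussian kernel (one possible choice; the approach allows any theoretically motivated proximity definition) to compute local-environment compositions and the resulting group-by-group spatial exposure matrix, whose diagonal gives spatial isolation. The data layout, bandwidth, and function name are our assumptions.

```python
import numpy as np

def spatial_exposure(coords, counts, bandwidth=1.0):
    """Kernel-based spatial exposure/isolation matrix.
    coords: (n, 2) locations of n areal units; counts: (n, g) population
    counts for g groups. Each unit's 'local environment' composition is a
    kernel-weighted average of nearby populations; entry [m, q] of the
    result is the average share of group q in the local environments of
    group-m members (diagonal entries are spatial isolation)."""
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * bandwidth**2))             # Gaussian proximity weights
    local = w @ counts                               # smoothed group counts
    share = local / local.sum(axis=1, keepdims=True)  # local composition
    tot = counts.sum(axis=0)                          # group totals
    return (counts / tot).T @ share

# toy example: 3 units on a line, 2 groups
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
counts = np.array([[90, 10], [50, 50], [10, 90]])
print(spatial_exposure(coords, counts))
```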
We argue that social networks can be modeled as the outcome of processes that occur in overlapping local regions of the network, termed local social neighborhoods. Each neighborhood is conceived as a possible site of interaction and corresponds to a subset of possible network ties. In this paper, we discuss hypotheses about the form of these neighborhoods, and we present two new and theoretically plausible ways in which neighborhood-based models for networks can be constructed. In the first, we introduce the notion of a setting structure, a directly hypothesized (or observed) set of exogenous constraints on possible neighborhood forms. In the second, we propose higher-order neighborhoods that are generated, in part, by the outcome of interactive network processes themselves. Applications of both approaches to model construction are presented, and the developments are considered within a general conceptual framework of locale for social networks. We show how assumptions about neighborhoods can be cast within a hierarchy of increasingly complex models; these models represent a progressively greater capacity for network processes to “reach” across a network through long cycles or semipaths. We argue that this class of models holds new promise for the development of empirically plausible models for networks and network-based processes.
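As a toy illustration of how a setting structure constrains neighborhood forms, the sketch below enumerates only the two-tie Markov neighborhoods (pairs of tie variables sharing a node) that fall entirely within a hypothesized setting structure. Representing settings as a set of permitted dyads is our simplification for the example, not the authors' formalism, and full Markov neighborhoods would also include single ties and larger cliques of the dependence graph.

```python
from itertools import combinations

def markov_neighborhoods(nodes, settings):
    """Enumerate two-tie Markov neighborhoods restricted by a setting
    structure. 'settings' is the set of dyads (frozensets) where
    interaction is possible; neighborhoods are built only from tie
    variables that lie inside settings."""
    dyads = [frozenset(d) for d in combinations(nodes, 2)
             if frozenset(d) in settings]
    hoods = set()
    for a, b in combinations(dyads, 2):
        if a & b:                    # Markov condition: dyads share a node
            hoods.add(frozenset([a, b]))
    return hoods

# hypothetical setting structure: ties possible only within the overlapping
# groups {1, 2, 3} and {3, 4, 5}
settings = {frozenset(d) for d in
            list(combinations([1, 2, 3], 2)) + list(combinations([3, 4, 5], 2))}
print(len(markov_neighborhoods([1, 2, 3, 4, 5], settings)))
```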