    RESEARCH ESSAY
    DISCOVERING UNOBSERVED HETEROGENEITY IN
    STRUCTURAL EQUATION MODELS TO
    AVERT VALIDITY THREATS¹
    Jan-Michael Becker
    Department of Marketing and Brand Management, University of Cologne,
    Cologne 50923 GERMANY {jbecker@wiso.uni-koeln.de}
    Arun Rai
    Center for Process Innovation and Department of Computer Information Systems, Robinson College of Business,
    Georgia State University, Atlanta, GA 30303 USA {arunrai@gsu.edu}
    Christian M. Ringle
    Institute for Human Resource Management and Organizations, Hamburg University of Technology (TUHH),
    Hamburg 21073 GERMANY {ringle@tuhh.de} and
    Faculty of Business and Law, University of Newcastle, Callaghan, NSW 2308 AUSTRALIA {christianringle@newcastle.edu.au}
    Franziska Völckner
    Department of Marketing and Brand Management, University of Cologne,
    Cologne 50923 GERMANY {voelckner@wiso.uni-koeln.de}

    A large proportion of information systems research is concerned with developing and testing models
    pertaining to complex cognition, behaviors, and outcomes of individuals, teams, organizations, and other
    social systems that are involved in the development, implementation, and utilization of information
    technology. Given the complexity of these social and behavioral phenomena, heterogeneity is likely to
    exist in the samples used in IS studies. While researchers now routinely address observed heterogeneity
    by introducing moderators, a priori groupings, and contextual factors in their research models, they have
    not examined how unobserved heterogeneity may affect their findings. We describe why unobserved
    heterogeneity threatens different types of validity and use simulations to demonstrate that unobserved
    heterogeneity biases parameter estimates, thereby leading to Type I and Type II errors. We also review
    different methods that can be used to uncover unobserved heterogeneity in structural equation models.
    While methods to uncover unobserved heterogeneity in covariance-based structural equation models
    (CB-SEM) are relatively advanced, the methods for partial least squares (PLS) path models are limited
    and have relied on an extension of mixture regression (finite mixture partial least squares, FIMIX-PLS)
    and on distance measure-based methods, which have mismatches with some characteristics of PLS path
    modeling. We propose a new method, prediction-oriented segmentation (PLS-POS), to overcome the
    limitations of FIMIX-PLS and other distance measure-based methods and conduct extensive simulations to
    evaluate the ability of PLS-POS and FIMIX-PLS to discover unobserved heterogeneity in both structural
    and measurement models. Our results show that both PLS-POS and FIMIX-PLS perform well in discovering
    unobserved heterogeneity in structural paths when the measures are reflective and that PLS-POS also
    performs well in discovering unobserved heterogeneity in formative measures. We propose an unobserved
    heterogeneity discovery (UHD) process that researchers can apply to (1) avert validity threats by
    uncovering unobserved heterogeneity and (2) elaborate on theory by turning unobserved heterogeneity into
    observed heterogeneity, thereby expanding theory through the integration of new moderator or contextual
    variables.

    Keywords: Unobserved heterogeneity, validity, structural equation modeling, partial least squares,
    formative measures, prediction-oriented segmentation

    ¹Ron Thompson was the accepting senior editor for this paper. Ron Cenfetelli served as the associate
    editor. The appendices for this paper are located in the “Online Supplements” section of the MIS
    Quarterly’s website (http://www.misq.org).

    MIS Quarterly Vol. 37 No. 3, pp. 665-694/September 2013
    Becker et al./Discovering Unobserved Heterogeneity in SEM

    Introduction
    Assuming that data in empirical studies are homogeneous and represent a single population is often
    unrealistic in the social and behavioral sciences, such as information systems, management, and
    marketing (Rust and Verhoef 2005; Wedel and Kamakura 2000). There may be significant heterogeneity in
    the data across unobserved groups, and it can bias parameter estimates, lead to Type I and Type II
    errors, and result in invalid conclusions (Jedidi et al. 1997). Consider the following technology
    acceptance model (TAM) example. A researcher is interested in individuals’ intention to use an IT system
    or service (Davis et al. 1989; Venkatesh 2000; Venkatesh and Davis 2000; Venkatesh et al. 2003).
    Informed by existing theory, the researcher proposes a model in which perceived usefulness (PU) and
    perceived ease of use (PEOU) of the IT system explain intention to use the system (IU) (Figure 1). The
    empirical results reveal that PU and PEOU are equally important in explaining IU. However, the theory
    and model overlook the two underlying groups: experienced IT users (Figure 1a, segment 1) and
    inexperienced IT users (Figure 1a, segment 2). Experienced users show a strong positive relationship
    between PU and IU and a weak or nonsignificant relationship between PEOU and IU. In contrast,
    inexperienced users show a strong positive relationship between PEOU and IU and a weak or
    nonsignificant relationship between PU and IU (Figure 1a). In this scenario, drawing inferences based
    on results from the overall sample would lead to Type I errors, as we would be overgeneralizing the
    significant findings from the overall sample to the underlying user groups, one with a nonsignificant
    estimate for PEOU → IU and the other with a nonsignificant estimate for PU → IU. If the model is not
    refined to accommodate this unobserved heterogeneity, a system that is unsuitable for either user group
    (i.e., one with average usefulness and average ease of use) may be provided to all users.

    In addition, a study may not find PEOU to be a significant predictor of IU because of unobserved
    heterogeneity across two groups of users (i.e., experienced versus inexperienced). If experienced users
    (Figure 1b, segment 1) perceive an easy-to-use system (i.e., high PEOU) as being too simple to fulfill
    their needs, they may show a strong negative relationship between PEOU and IU. In contrast, if
    inexperienced users (Figure 1b, segment 2) show a strong positive relationship between PEOU and IU, as
    in the first example, a sign reversal occurs between the two groups with regard to the effect of PEOU
    on IU, thereby leading to an overall nonsignificant effect of PEOU on IU and a Type II error.

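The two TAM scenarios can be illustrated with a small simulation. The following is a minimal sketch, not the authors' simulation design: it uses ordinary least squares on synthetic data instead of PLS path modeling, and the segment path values (0.7, 0.0, 0.5, ±0.6), the noise level, the sample size, and the helper names `ols2`, `segment`, and `pooled_vs_segments` are all hypothetical choices made for illustration.

```python
import random

def ols2(x1, x2, y):
    """Slopes (b1, b2) of y ~ b0 + b1*x1 + b2*x2 via centered normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def segment(rng, n, b_pu, b_peou, noise_sd=0.5):
    """Draw n users whose IU follows the given (hypothetical) segment paths."""
    pu = [rng.gauss(0, 1) for _ in range(n)]
    peou = [rng.gauss(0, 1) for _ in range(n)]
    iu = [b_pu * p + b_peou * e + rng.gauss(0, noise_sd) for p, e in zip(pu, peou)]
    return pu, peou, iu

def pooled_vs_segments(paths_seg1, paths_seg2, n=1000, seed=7):
    """Estimate (PU->IU, PEOU->IU) per segment and for the pooled sample."""
    rng = random.Random(seed)
    s1 = segment(rng, n, *paths_seg1)
    s2 = segment(rng, n, *paths_seg2)
    pooled = tuple(a + b for a, b in zip(s1, s2))  # concatenate both segments
    return {"segment 1": ols2(*s1), "segment 2": ols2(*s2), "pooled": ols2(*pooled)}

# Example (a): each segment relies on a different predictor.
example_a = pooled_vs_segments((0.7, 0.0), (0.0, 0.7))
# Example (b): the PEOU effect reverses sign between segments.
example_b = pooled_vs_segments((0.5, -0.6), (0.5, 0.6))

for name, result in (("Example (a)", example_a), ("Example (b)", example_b)):
    print(name)
    for label, (b_pu, b_peou) in result.items():
        print(f"  {label:10s} PU->IU = {b_pu:+.2f}  PEOU->IU = {b_peou:+.2f}")
```

Under these assumptions, the pooled estimates in example (a) land near the midpoint of the two segment values for both predictors, and the pooled PEOU estimate in example (b) lands near zero although it is clearly nonzero in each segment.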
    Recent TAM models acknowledge existing heterogeneity by incorporating experience as a moderator of
    PEOU’s effect on IU. However, before its inclusion in the theory, experienced versus inexperienced
    users represented unobserved heterogeneity that could lead to biased findings on the effects of PU and
    PEOU on IU. This illustration shows how not accounting for unobserved heterogeneity can lead to
    misinterpretations and invalid conclusions in IS research, a point we emphasize later in the paper
    based on a review of 12 meta-analysis studies on key IS phenomena (see Table A1 in Appendix A).

    Figure 1. Examples for Unobserved Heterogeneity in TAM: (a) TAM Example 1, (b) TAM Example 2

    Despite the threats to validity from unobserved heterogeneity, there are important gaps in the IS
    literature about the specific threats to validity and how to safeguard against them:

    (1) While IS studies now routinely address observed heterogeneity by introducing moderators, a priori
        groupings, contextual factors, and control variables in their research models, they have not
        considered unobserved heterogeneity in their data. In fact, none of the papers appearing in the
        field’s two most widely recognized journals (MIS Quarterly and Information Systems Research) over
        the last 20 years that have developed and tested structural equation models have examined
        unobserved heterogeneity. Our first research objective is to introduce the concept of unobserved
        heterogeneity in the IS literature and to show how IS researchers can safeguard against biases and
        facilitate theory development.

    (2) While research in some fields notes that unobserved heterogeneity threatens empirical results and
        their interpretation, a systematic analysis of the threats to specific types of validity is
        missing in the literature. Our second research objective is to evaluate the implications of
        unobserved heterogeneity for four types of validity (i.e., instrument, internal, statistical
        conclusion, and external validity; Cook and Campbell 1976, 1979; Straub 1989), thereby broadening
        our understanding of the specific validity threats that arise from unobserved heterogeneity.

    (3) In structural equation modeling (SEM), unobserved heterogeneity is not only a validity threat for
        the structural model but also for the measurement model, regardless of whether the measures are
        reflective or formative. While heterogeneity in reflective measures has been discussed in terms of
        measurement equivalence or invariance (MEI) (e.g., Steenkamp and Baumgartner 1998; Vandenberg and
        Lance 2000), the implications of unobserved heterogeneity for formative measures have not been
        examined. Our third research objective is to evaluate the implications of unobserved heterogeneity
        for formative measures.

    (4) In contrast to covariance-based SEM (CB-SEM; e.g., Jöreskog 1978, 1982), research on partial least
        squares (PLS) path modeling (e.g., Chin 1998; Lohmöller 1989; Wold 1982) has paid limited attention
        to unobserved heterogeneity. Only recently has a method been proposed to detect unobserved
        heterogeneity in PLS path models: finite mixture partial least squares (FIMIX-PLS; Hahn et al.
        2002; Sarstedt and Ringle 2010). However, FIMIX-PLS does not account for heterogeneity in the
        measurement model and assumes multivariate normal distributions for latent variables. Furthermore,
        there is limited evidence of this method’s performance in discovering unobserved heterogeneity.
        Our fourth research objective is to propose and evaluate a new method, PLS prediction-oriented
        segmentation (PLS-POS), which does not follow distributional assumptions and uncovers unobserved
        heterogeneity not only in the structural model but also in the measurement model.

    (5) Researchers facing the problem of unobserved heterogeneity in their empirical work lack guidelines
        on how to apply methods systematically to uncover unobserved heterogeneity. Therefore, our fifth
        research objective is to develop an unobserved heterogeneity discovery (UHD) process to guide
        researchers in applying methods to ensure the validity of findings and to elaborate theory by
        turning unobserved heterogeneity into observed heterogeneity.

    By addressing the above research objectives, we make six contributions. First, we provide evidence and
    reasoning for why unobserved heterogeneity is an important issue in IS research. Second, we
    demonstrate that unobserved heterogeneity in SEM has implications not only for the structural model
    but also for measurement models. Third, we identify the implications of unobserved heterogeneity for
    different types of validity and surface the importance of uncovering unobserved heterogeneity to avoid
    validity threats. Fourth, we introduce the new PLS-POS method for detecting unobserved heterogeneity.
    This method is specifically developed to fit PLS path modeling, as it employs a prediction-oriented
    and nonparametric approach and uncovers heterogeneity in both the structural model and the (formative)
    measurement models, and thereby overcomes the limitations of FIMIX-PLS and other distance
    measure-based methods. Fifth, we evaluate FIMIX-PLS and PLS-POS using an extensive simulation study
    and generate important insights into the performance of the two methods in uncovering unobserved
    heterogeneity in PLS path models. Sixth, we provide a UHD process to guide researchers in discovering
    and addressing unobserved heterogeneity in structural equation models.

    Concept of Heterogeneity and its
    Treatment in IS Research
    Researchers can obtain different parameter estimates when they consider differences among observations
    relative to when they overlook them. However, heterogeneity among observations is not necessarily
    captured by variables that are preconceived by the researcher and specified by existing theory, as it
    can exist beyond these previously identified variables (Jedidi et al. 1997). As a consequence, it is
    necessary to differentiate between the following two types of heterogeneity: (1) observed
    heterogeneity, when subpopulations are defined a priori based on known variables, and (2) unobserved
    heterogeneity, when the subpopulations in the data are unknown (Lubke and Muthén 2005).

    Observed Heterogeneity
    Observed heterogeneity occurs when differences in parameter estimates between groups are expected a
    priori for the phenomenon, that is, when group differences are explained by existing theory that
    incorporates moderators or contextual factors. Examples of such moderators or contextual factors
    considered in IS research include individual cultural differences (e.g., individualism versus
    collectivism; Srite and Karahanna 2006), individual demographic differences (e.g., gender, income
    levels, and education; Hsieh et al. 2008; Venkatesh et al. 2003), and organizational demographic
    differences (e.g., large versus small firms; Rai et al. 2006). In our TAM example from earlier,
    existing theory expects gender-based heterogeneity in structural paths (i.e., men are expected to have
    a stronger relationship between PU and IU, and women are expected to have a stronger relationship
    between PEOU and IU) (e.g., Venkatesh and Morris 2000). Moreover, existing theory expects contextual
    variables, such as voluntariness or task type (e.g., Venkatesh and Davis 2000), or psychographic
    variables, such as personal innovativeness and computer attitude, to cause heterogeneity in the
    relationships among the TAM constructs (e.g., Venkatesh and Bala 2008).

    Unobserved Heterogeneity
    Unobserved heterogeneity occurs when theory does not assume heterogeneity even though it exists, or
    when theory indicates heterogeneity but the specified group variables do not sufficiently capture it
    in the population. In such situations, researchers need to uncover unobserved heterogeneity by
    segmenting data to form homogeneous groups. If the differences uncovered by segmentation can be
    explained post hoc using contextual or demographic variables (e.g., culture, gender, experience),
    making the groups accessible, theory can be expanded accordingly, and unobserved heterogeneity is
    turned into observed heterogeneity for future studies. If the differences cannot be explained by
    well-known contextual variables, the researcher has to consider complementary theoretical explanations
    for the phenomenon.

    Treatment of Heterogeneity in IS Research
    Given the complexity of the social and behavioral phenomena tackled in IS research, heterogeneity is
    likely to exist in samples that are used to develop, test, and refine models. If this heterogeneity is
    not uncovered and controlled, the (unobserved) heterogeneity can bias results and conclusions (e.g.,
    Ansari et al. 2000; Johns 2006). Consequently, unobserved heterogeneity is receiving increasing
    attention in related disciplines (e.g., marketing, where scholars study similar complex phenomena
    pertaining to consumer choices and preferences, the alignment of firm-level marketing strategies,
    interorganizational relationships, and the business value of tangible and intangible resources) to
    safeguard against biases and probe the underlying reasons for unobserved heterogeneity (e.g., Rigdon
    et al. 2010). This enhances the likelihood of obtaining valid results as well as of generating greater
    theoretical contributions. Methodologists in marketing, econometrics, and psychology have proposed
    advances to uncover unobserved heterogeneity in various approaches, for instance, regression analysis
    (DeSarbo and Cron 1988; Späth 1979; Wedel and DeSarbo 1994), CB-SEM (e.g., Ansari et al. 2000; Jedidi
    et al. 1997; Muthén 1989), panel data models (e.g., Allenby and Rossi 1998; Popkowski Leszczyc and
    Bass 1998), and conjoint analysis (e.g., DeSarbo et al. 1995; Gilbride et al. 2006; Lenk et al. 1996).

    While IS studies now routinely address observed heterogeneity by introducing moderators, a priori
    groupings, contextual factors, and control variables in their research models, they have not examined
    threats to validity due to unobserved heterogeneity. Our review of 12 meta-analysis studies that
    synthesize the findings of empirical research across various IS phenomena (e.g., technology
    acceptance, IT investment payoff, IT innovation adoption, IS implementation success, and group support
    systems) reveals that all of them identify inconsistent, conflicting, or mixed findings:
    “heterogeneity of effect sizes” (Wang and Keil 2007, p. 9), “wide variation in the predicted effects”
    (King and He 2006, p. 740), and “correlations that vary across studies more than would be produced by
    sampling error” (Wu and Lederer 2009, p. A6) (see Table A1 in Appendix A). Most of these 12
    meta-analysis studies note that these inconsistencies may be caused by the omission of key contextual
    variables or moderators. However, investigating the known moderators or contextual variables controls
    for observed heterogeneity (Haenlein and Kaplan 2011), but as long as these moderators and contextual
    variables are not specified in theory, population heterogeneity will remain unobserved and threaten
    model validity. (In the next section, we discuss how unobserved heterogeneity biases estimates and
    causes Type I and II errors.) Furthermore, uncovering unobserved heterogeneity at the study level
    accelerates the theory-development cycle by generating insights into relationships among constructs
    (Edmondson and McManus 2007). In a later section, we describe a UHD process where uncovering
    unobserved heterogeneity facilitates abduction (by raising the possibilities of rival explanations not
    previously considered; Van de Ven 2007), directing researchers to identify variables that account for
    unobserved heterogeneity and, through this process, make segments accessible and turn unobserved
    heterogeneity into observed heterogeneity (e.g., by discovering moderators and grouping variables).
    This introduction of constructs to capture formerly unobserved heterogeneity revises models and
    theoretical explanations, making it possible for the revised models to be tested in future research.

    Effects of Heterogeneity on Structural
    Equation Models
    Unobserved Heterogeneity in the
    Structural Model
    In the context of SEM, heterogeneity can affect the structural model, the measurement model (formative
    and reflective), or both (e.g., Ansari et al. 2000; Qureshi and Compeau 2009). Unobserved
    heterogeneity can influence path coefficients in the structural model because the parameter estimates
    are determined based on the overall sample, which pools observations across the underlying
    (unobserved) groups. As a result, researchers may encounter the following biases: (1) biased parameter
    estimates of structural paths; (2) nonsignificant estimates at the group level becoming significant at
    the overall sample level that combines (unobserved) groups; (3) sign differences in the parameter
    estimates across (unobserved) groups being masked as nonsignificant results at the overall sample
    level that combines (unobserved) groups; and (4) decreased predictive power of the model (R² of the
    endogenous variables). These biases can lead to Type I and Type II errors and invalid inferences.

    To substantiate that these biases occur due to unobserved heterogeneity, we conducted a simulation of
    a PLS path model with the following three situations with two unobserved groups: (1) the parameter
    estimates across the groups have the same sign but differ in absolute values; (2) the parameter
    estimates across the groups have opposite signs; and (3) the parameter estimates are nonsignificant
    for one group but significant for the other. Table 1 summarizes the findings (see Appendix D for
    details).

    The results show that unobserved heterogeneity biases the parameter estimates, decreases the R², and
    increases the risk of Type I and Type II errors. Specifically, in all three simulated situations,
    biases in the parameter estimates distort effect sizes and cause misinterpretation of the parameter
    values, which is especially problematic for comparative hypotheses (e.g., path coefficient 1 > path
    coefficient 2). When the group-specific parameters show inconsistent signs (i.e., situation 2, in
    which signs are reversed across the groups) and when one of the groups involves nonsignificant
    parameters while the other does not (i.e., situation 3), Type I and Type II errors are exacerbated by
    the following: (1) If a researcher overlooks unobserved heterogeneity and there is a significant
    nonzero relationship between the constructs as the overall sample estimate, this researcher is
    incorrectly overgeneralizing the significant relationship that exists in the first segment, thereby
    leading to a Type I error with respect to the second segment.² (2) If a researcher overlooks
    unobserved heterogeneity and obtains a nonsignificant relationship between the constructs as the
    overall sample estimate, this researcher may overgeneralize the nonsignificant finding, which exists
    only in the second segment, thereby leading to a Type II error with respect to the first segment. In
    contrast, when all parameters are significant and show the same sign (situation 1), it is unlikely
    that Type II errors will occur; in this situation, the occurrence of Type II errors depends on the
    effect size and the degree to which the increase in standard errors due to unobserved heterogeneity is
    compensated by the increased power of the larger sample size due to combining the groups. The R²
    decreases in all situations, implying an inferior model fit to the overall sample; the decrease in R²
    is greater when group-specific effect sizes are high; however, R² is almost unaffected when the
    group-specific effects are low.

    ²This does not mean that there will be a Type I error in general (i.e., for both segments), but only
    with respect to segment 2, where the true effect is zero. To be specific, the overall sample estimate
    cannot show a significant nonzero relationship because of unobserved heterogeneity when all segments
    have a true zero relationship.

    Table 1. Conclusions from the Simulation Study on Heterogeneity Effects

                                    True Group Parameters     Overall Parameter Estimates
                                    (heterogeneity is         (heterogeneity is not uncovered)
                                    uncovered)
    Situation                       Group 1    Group 2        Biased  Type I Error  Type II Error  Lower R²  Explanation for Type I and Type II Errors
    1. Significant in all groups    +          +              Yes     Not possible  Depends        Yes       Increase in standard errors vs.
       with consistent signs        –          –              Yes     Not possible  Depends        Yes       increased sample size
    2. Significant in all groups    –          +              Yes     Not possible  Likely         Yes       Effects cancel each other
       with inconsistent signs
    3. Significant in some groups   + / –      0              Yes     Likely        Likely         Yes       Depends on the effect size
       but not in others

    Notes: + significantly positive; – significantly negative; 0 nonsignificant.

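The drop in R² under pooling can be reproduced with a toy version of the three situations. This is a hedged sketch under simplifying assumptions, not the simulation reported in Appendix D: it uses a single-predictor regression in place of a PLS path model, standardized variables within each group, and hypothetical group slopes (0.9/0.3, 0.6/-0.6, 0.6/0.0); the helper names are illustrative.

```python
import random

def slope_and_r2(x, y):
    """OLS slope and R^2 of y on x for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)

def draw_group(rng, n, beta):
    """Draw a group with unit-variance x and y and true slope beta."""
    noise_sd = (1 - beta ** 2) ** 0.5
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * a + rng.gauss(0, noise_sd) for a in x]
    return x, y

# Hypothetical group slopes for the three situations described in the text.
situations = {
    "1: same sign": (0.9, 0.3),
    "2: opposite signs": (0.6, -0.6),
    "3: one zero": (0.6, 0.0),
}

rng = random.Random(2013)
results = {}
for label, (b1, b2) in situations.items():
    x1, y1 = draw_group(rng, 4000, b1)
    x2, y2 = draw_group(rng, 4000, b2)
    _, r2_g1 = slope_and_r2(x1, y1)
    _, r2_g2 = slope_and_r2(x2, y2)
    pooled_slope, pooled_r2 = slope_and_r2(x1 + x2, y1 + y2)
    results[label] = (pooled_slope, pooled_r2, (r2_g1 + r2_g2) / 2)

for label, (slope, r2_pooled, r2_groups) in results.items():
    print(f"{label:18s} pooled slope={slope:+.2f}  "
          f"pooled R2={r2_pooled:.2f}  mean group R2={r2_groups:.2f}")
```

With equally sized, standardized groups, the pooled slope is roughly the average of the group slopes, so the pooled R² (about the squared average) falls below the average of the group R² values whenever the slopes differ.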
    Unobserved Heterogeneity in the
    Measurement Model
    Measurement model specification requires the consideration of the nature of the relationship between
    constructs and measures. There are two types of measurement models: reflective and formative measures
    (Diamantopoulos and Winklhofer 2001; Jarvis et al. 2003). In reflective measures, changes in the
    construct are reflected in changes in all of its indicators, and the direction of causality is from
    the construct to the indicators. Reflective indicators are assessed in terms of their loadings, which
    represent the simple correlation between the indicator and the construct. In formative measures, the
    indicators do not reflect the underlying construct but are combined to form it without any assumptions
    about the intercorrelation patterns among them. The direction of causality is from the indicators to
    the construct, and the weights of formative indicators represent the importance of each indicator in
    explaining the variance of the construct (Edwards and Lambert 2007; Petter et al. 2007; Wetzels et al.
    2009).

    Unobserved heterogeneity can lead to differences between measurement model weights and loadings across
    groups. If the construct’s measures are reflective, unobserved heterogeneity may result in different
    loadings when respondents across groups interpret and respond to measures differently or when they
    provide information with different degrees of accuracy (Ansari et al. 2000). Thus, when reflective
    measures are not equivalent across groups, MEI is not established (e.g., Steenkamp and Baumgartner
    1998; Vandenberg and Lance 2000). In this case, the construct does not capture the same theoretical
    meaning across groups, implying that differences in the construct’s relationships with other
    constructs cannot be compared across groups. That is, the group-specific parameters are only
    interpretable at the group level, and the data should not be pooled across groups. For example, when
    considering reflective measures of PU, users’ understanding of usefulness can differ significantly
    across groups. If this is the case, one cannot combine the groups into an overall sample because the
    construct measured does not capture the same meaning across groups. The relationship between PU and
    other constructs would be biased as a result of the absence of invariant measurement. However, the
    lack of MEI arising from heterogeneity provides valuable information that structural parameters should
    not be compared between groups and that the data across the groups should not be combined. As such,
    ignoring the heterogeneity and interpreting results based on the overall sample would lead to invalid
    conclusions.

    In contrast, when a construct’s measures are formative, unobserved heterogeneity can lead to
    differences in the formative indicators’ weights across groups. While recent research has discussed
    MEI in formative measures (Diamantopoulos and Papadopoulos 2010), it is important to uncover formative
    indicator weight differences due to unobserved heterogeneity in order to avoid ambiguous
    interpretations. Formative indicators cause variance in the construct and can be interpreted as
    actionable attributes of a construct. The weights of formative indicators represent the relative
    importance of the construct’s different facets. Therefore, the problems associated with unobserved
    heterogeneity in formative measures are similar to those that occur in the structural model.
    Consequently, ignoring differences in formative indicator weights due to unobserved heterogeneity can
    bias parameter estimates and lead to Type I and Type II errors. Thus, when researchers find formative
    indicator weights to be unstable and nonsignificant, in addition to exploring multicollinearity
    (Cenfetelli and Bassellier 2009), they should also explore unobserved heterogeneity.

    As an example, assume that service quality (SERVQUAL) is measured using the following five formative
    indicators: (1) tangibles, (2) reliability, (3) assurance, (4) empathy, and (5) responsiveness (e.g.,
    Cenfetelli and Bassellier 2009; Collier and Bienstock 2009; Parasuraman et al. 1988). Some customers
    might favor the communication facets (e.g., empathy and responsiveness) when they evaluate service
    quality, while others might favor the trust facets (e.g., assurance and reliability) in their
    evaluation. These differences in customer perceptions result in different measurement weights across
    the groups, although the underlying theoretical construct of service quality remains the same. For
    example, two equally sized groups have measurement weights of w_g1 = [.6, .6, .6, 0, 0] for a certain
    formative construct in one group and w_g2 = [.2, .2, .2, .6, .6] in the other group. Combining these
    two groups in the overall sample results in nearly equal relative importance (weights) for all
    indicators, with measurement weights of w = [.4, .4, .4, .3, .3] for the overall sample. As a
    consequence, the interpretation of the weights estimated using the overall sample is misleading, and
    the formative measures based on the overall sample represent neither the first group nor the second.
    Given this bias in the formative measures for service quality, the relationship between service
    quality and other constructs (e.g., customer satisfaction) is also likely to be biased.

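The arithmetic behind this example can be checked directly. The sketch below assumes, as the numbers in the passage imply, that the overall-sample weights of two equally sized groups are simply the mean of the group weights; an actual PLS estimation on pooled data would not reduce exactly to an average. The variable names are illustrative.

```python
# Group-specific formative weights from the example in the text.
w_g1 = [0.6, 0.6, 0.6, 0.0, 0.0]
w_g2 = [0.2, 0.2, 0.2, 0.6, 0.6]

# Assumption: pooled weights are the mean of the two equally sized groups.
w_pooled = [round((a + b) / 2, 2) for a, b in zip(w_g1, w_g2)]

# Absolute gap between the pooled weights and each group's true weights.
gap_g1 = [round(abs(p - a), 2) for p, a in zip(w_pooled, w_g1)]
gap_g2 = [round(abs(p - b), 2) for p, b in zip(w_pooled, w_g2)]

print("pooled weights:", w_pooled)
print("gap to group 1:", gap_g1)
print("gap to group 2:", gap_g2)
```

Every pooled weight differs from both groups' weights by 0.2 or 0.3, so the pooled profile describes neither group: indicators 4 and 5 look moderately important overall even though they carry no weight at all in the first group.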
    Implications of Unobserved Heterogeneity
    for Model Validity
    If unobserved heterogeneity characterizes the data and results are based on the overall sample, the
    estimated model lacks validity because it will not uncover the true effects of the underlying groups.
    In a broad sense, validity is the extent to which a method (i.e., the design, the model, or the
    construct) measures what it claims to measure. We elaborate on why unobserved heterogeneity affects
    the major types of validity: (1) internal, (2) instrumental (including content, construct, and
    criterion validity, and reliability), (3) statistical conclusion, and (4) external (e.g., Cook and
    Campbell 1976, 1979; Heeler and Ray 1972; Straub 1989). See Table 2 for definitions of each type of
    validity and explanations of how unobserved heterogeneity threatens it.

    Unobserved heterogeneity is a threat to internal validity because contextual or group variables that
    affect results are overlooked, thereby resulting in an incomplete model. The observations across the
    12 meta-analyses that we discussed earlier show that inconsistent findings arise when contextual or
    group variables are omitted. Uncovering these variables and improving theory through the discovery of
    unobserved heterogeneity safeguards against internal validity threats.

    In addition, unobserved heterogeneity threatens statistical
    conclusion validity. Analyzing the overall sample without
    accounting for heterogeneity increases standard errors and
    reduces (averages) effect sizes, thereby biasing estimates and
    leading to Type I and Type II errors. (The simulations in the
    previous section show how statistical conclusion validity is
    threatened by unobserved heterogeneity.)
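    The averaging effect can be illustrated with a minimal
    sketch. It assumes two hypothetical, equally sized groups
    whose true slopes have opposite signs; the slope values, the
    sample size, and the use of ordinary least squares in place
    of a PLS path model are illustrative assumptions and do not
    reproduce the simulations reported earlier.

```python
import numpy as np

def ols_slope_and_se(x, y):
    """Return the OLS slope of y on x (with intercept) and its standard error."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], np.sqrt(cov[1, 1])

rng = np.random.default_rng(42)
n = 500  # observations per group (hypothetical)

# Hypothetical groups: same predictor, opposite true effects.
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y1 = 0.9 * x1 + rng.normal(scale=0.3, size=n)
y2 = -0.9 * x2 + rng.normal(scale=0.3, size=n)

slope_g1, se_g1 = ols_slope_and_se(x1, y1)
slope_g2, se_g2 = ols_slope_and_se(x2, y2)

# Overall-sample analysis that ignores the grouping.
slope_pooled, se_pooled = ols_slope_and_se(
    np.concatenate([x1, x2]), np.concatenate([y1, y2])
)

print(f"group 1: {slope_g1:.3f} (SE {se_g1:.3f})")
print(f"group 2: {slope_g2:.3f} (SE {se_g2:.3f})")
print(f"pooled : {slope_pooled:.3f} (SE {se_pooled:.3f})")
```

    In this setup the pooled slope lies near zero and carries a
    larger standard error than either group-specific slope, so
    an effect that is strong in both groups could be missed in
    the overall sample.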
    Our earlier discussion of unobserved heterogeneity shows that
    it can bias the measurement model estimates of constructs,
    thereby adversely affecting instrument validity. There is a
    particular threat to reliability (internal consistency) when
    measures show different correlation patterns or error
    variances between groups. For example, experienced users might
    have a different understanding of a system’s usefulness
    compared to inexperienced users, thereby leading to different
    correlation patterns for the PU construct’s indicators. The
    respondents’ experience can also affect PU’s error variance
    between groups, as inexperienced users might have higher
    variability in their responses than experienced users, who have
    a clearer understanding of the system’s usefulness.
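    A small sketch of the reliability argument follows. It
    assumes three hypothetical reflective PU indicators whose
    loadings are strong in one group and weak in the other; the
    loadings (0.9 and 0.3), the group labels, and the sample
    sizes are assumptions made only for illustration.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (observations x items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances / total_variance)

def simulate_indicators(loading, n_obs, rng, n_items=3):
    """Standardized reflective indicators driven by one common factor."""
    factor = rng.normal(size=(n_obs, 1))
    noise = rng.normal(size=(n_obs, n_items))
    return loading * factor + np.sqrt(1.0 - loading**2) * noise

rng = np.random.default_rng(0)
experienced = simulate_indicators(0.9, 1000, rng)    # hypothetical group 1
inexperienced = simulate_indicators(0.3, 1000, rng)  # hypothetical group 2
pooled = np.vstack([experienced, inexperienced])

alpha_experienced = cronbach_alpha(experienced)
alpha_inexperienced = cronbach_alpha(inexperienced)
alpha_pooled = cronbach_alpha(pooled)
print(alpha_experienced, alpha_inexperienced, alpha_pooled)
```

    The overall-sample alpha falls between the two group values,
    so it describes the internal consistency of neither group.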
    Unobserved heterogeneity can also threaten construct validity
    because differences in indicator loadings and weights across
    groups will not be detected. As such, an evaluation of
    construct validity based on the overall sample, while
    overlooking unobserved heterogeneity, will not reveal the true
    group-specific measures of the constructs, thereby risking not
    detecting if the construct captures a different phenomenon for
    each group. Moreover, if the measures derived based on the
    overall sample do not represent the true construct (e.g., PU),
    the biased construct can lead to invalid inferences on
    relationships with other constructs, thereby threatening
    criterion validity. Both threats are regularly addressed when
    testing for MEI in multigroup models (i.e., observed
    heterogeneity) (see Steenkamp and Baumgartner 1998; Vandenberg
    and Lance 2000) but are usually overlooked in the context of
    unobserved heterogeneity.
    In contrast, unobserved heterogeneity typically does not affect
    content validity because the constructs’ measures are normally
    the same across groups and are grounded in theory. However,
    an increase in the value of a formative measure’s error term
    due to unobserved heterogeneity can lead to
    misinterpretations, as a high error term is typically
    associated with the construct measure’s incompleteness
    (Diamantopoulos et al. 2008).
    MIS Quarterly Vol 37 No 3September 2013 671
    Becker et alDiscovering Unobserved Heterogeneity in SEM
    Table 2. Implications of Unobserved Heterogeneity for Model Validity

    (Columns: Type of Validity; What Is It?; Threats Due to
    Unobserved Heterogeneity; Why Is It a Threat?)

    Internal Validity
      What is it?
        • Is the effect due to unhypothesized variables?
        • Are there rival explanations for the findings or just one
          single explanation?
      Threats due to unobserved heterogeneity
        • There are other viable explanations for the findings,
          namely, group differences that are not accounted for.
      Why is it a threat?
        • The observed effects are a result of unhypothesized and/or
          unmeasured variables (i.e., the groups and corresponding
          explanatory variables).
        • Example: The underlying theory does not include
          differences in the technology acceptance between
          experienced and inexperienced users.

    Instrumental Validity

    Content Validity
      What is it?
        • Do the indicators accurately reflect the theoretical
          domain?
      Threats due to unobserved heterogeneity
        Formative & Reflective
        • In general, heterogeneity does not affect content validity,
          as content validity is grounded in theory.
        Formative
        • The error term of the formative construct likely increases
          due to unobserved heterogeneity, which can be mistakenly
          interpreted as lack of content validity (Type II Error).
      Why is it a threat?
        • The empirically relevant (i.e., significant) set of
          indicators may vary across groups.
        • Varying nonsignificant indicators across groups indicate
          problems with MEI, but this is a problem of construct
          validity in the sense of (not) capturing the right
          phenomenon.
        • Nonsignificant indicators should remain in the model if
          theoretically relevant.
        • Following Diamantopoulos et al. (2008), the error term in
          formative constructs represents those aspects of the
          construct domain not represented by the indicators.
          Understanding the error term in this way and assessing it
          without capturing unobserved heterogeneity may indicate
          insufficient content validity, although all important
          indicators are included in the formative construct.

    Construct Validity
      What is it?
        • Are the chosen measures representing the true construct of
          the phenomenon?
        • Are the operationalizations of the constructs correct?
      Threats due to unobserved heterogeneity
        Formative & Reflective
        • Indicator weights/loadings estimated with the assumption
          that no underlying groups exist are biased if groups
          actually exist.
      Why is it a threat?
        • For formative measures, differences in the importance of
          indicators across groups lead to different measurement
          weights, although the phenomenon is still the same.
        • For reflective measures, when MEI is established across
          groups (i.e., there are no differences in the
          weights/loadings), there is no threat of unobserved
          heterogeneity to construct validity. Otherwise, the
          construct captures a different phenomenon for each group.
          Combining the measures at the overall sample level is not
          allowed.

    Criterion Validity
      What is it?
        • Are inferences from the construct to a related behavioral
          criterion of interest accurate?
      Threats due to unobserved heterogeneity
        Formative & Reflective
        • Differences in construct perceptions across groups (i.e.,
          different weights/loadings) lead to biased construct
          scores, which in turn influence (bias) the estimated
          relationship with other constructs.
      Why is it a threat?
        • The measures based on the overall sample do not represent
          the true group-specific measures of the constructs. This
          causes problems when interpreting the construct scores or
          their relationships with other constructs in the model.
        • For reflective measures, when there is no MEI established
          across groups, the apparently different phenomena across
          groups have varying and incomparable relationships with
          other constructs.

    Reliability
      What is it?
        • Are the measures accurate?
        • Are the measures consistent?
      Threats due to unobserved heterogeneity
        Test-Retest Reliability (Formative & Reflective)
        • Not affected.
        Internal Consistency (Reflective)
        • Reliability (e.g., Cronbach’s alpha) at the overall sample
          level is negatively influenced by the lack of MEI across
          groups.
      Why is it a threat?
        • Repeating the measurement with the same observations under
          the same conditions should lead to the same results on the
          overall and group levels.
        • Different correlation patterns across groups for a
          reflective perceived usefulness construct can lead to an
          average correlation pattern on the overall sample level,
          which does not show appropriate internal consistency.

    Statistical Conclusion Validity
      What is it?
        • Have adequate sampling procedures, appropriate statistical
          tests, and reliable measurements been used?
      Threats due to unobserved heterogeneity
        • Heterogeneous samples may lead to higher standard errors or
          lower effect sizes, thereby influencing the power of tests.
        • Biased estimates, Type I and Type II errors.
      Why is it a threat?
        • Path coefficients for relationships between constructs
          (e.g., ease of use and intention to use) might have higher
          standard errors on the overall sample than in their
          underlying groups, indicating a variety of different
          coefficients across user groups.
        • This also applies to formative measurement weights.

    External Validity
      What is it?
        • Are findings generalizable to other populations and
          conditions?
      Threats due to unobserved heterogeneity
        • Interpretations of the overall sample may be ambiguous and
          misleading.
        • Results cannot be generalized easily, as they are valid for
          only a special condition of the model.
      Why is it a threat?
        • Analyzing population differences reveals more general
          conclusions about the model than those from the overall
          sample.
        • Example: Based on the overall sample level, usefulness has
          the same importance as ease of use. However, there are no
          users who value usefulness and ease of use equally;
          rather, there are two distinct groups of experienced and
          inexperienced users.
    Finally, if unobserved heterogeneity is not uncovered, there
    is a threat to external validity (i.e., the ability to generalize
    findings beyond the current population and context) because
    the overall sample results are not representative of the
    underlying groups. As findings are averaged across groups,
    results obtained using the overall sample cannot be generalized
    to different groups. The observation of inconsistent,
    conflicting, or mixed findings in the 12 meta-analyses in
    Table A1 (Appendix A) also shows that the results of one study
    often cannot be generalized to other studies (indicating low
    external validity), with unobserved heterogeneity being one of
    the plausible reasons.
    Because of these threats to the different types of validity, it
    is important to uncover heterogeneity in data that may otherwise
    lead to invalid conclusions. Next, we present an overview of
    methods to uncover unobserved heterogeneity in structural
    equation models that researchers can apply to overcome
    threats to validity due to unobserved heterogeneity.
    Uncovering Heterogeneity in Structural
    Equation Models
    In this section, we first synthesize and compare different
    methods in SEM (i.e., CB-SEM and PLS path modeling) to
    uncover observed and unobserved heterogeneity. Given the
    objectives of our paper, we focus primarily on methods in
    SEM to uncover unobserved heterogeneity.³ We also introduce
    a new method to address some of the limitations of
    existing methods to uncover unobserved heterogeneity in PLS
    path models.
    Existing Methods to Uncover Observed
    Heterogeneity in SEM
    SEM methods to address observed heterogeneity are now
    commonly applied in the social and behavioral sciences,
    including information systems. The first category of methods
    identifies homogeneous groups of observations (e.g.,
    individuals) a priori based on grouping variables (e.g.,
    psychographic or sociodemographic). A multigroup analysis
    reveals the heterogeneity between the groups by testing for
    differences across group-specific parameter estimates.
    Examples of these methods for PLS path modeling can be found in
    Chin and Dibbern (2010), Sarstedt et al. (2011b), and Qureshi
    and Compeau (2009), and for CB-SEM in Jöreskog (1971) and
    Sörbom (1974). The second category of methods aims at
    identifying moderating factors that explain heterogeneity in
    specific structural model relationships. Examples of these
    methods in PLS path modeling can be found in Chin et al.
    (2003), Goodhue et al. (2007), and Henseler and Chin (2010),
    and for CB-SEM in Jaccard and Wan (1995), Jöreskog and
    Yang (1996), and Klein and Moosbrugger (2000). Uncovering
    observed heterogeneity with both types of methods requires
    a priori knowledge about differences across groups.
    Consequently, these two types of methods do not account for
    unobserved heterogeneity, that is, differences across groups
    that are not informed by existing theory and are unknown a
    priori.
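    The logic of a multigroup comparison with an a priori
    grouping variable can be sketched as follows. This is a
    simplified stand-in that assumes a single regression slope
    rather than a full PLS or CB-SEM model and uses a generic
    permutation test; the function names, group sizes, and
    parameter values are hypothetical and are not taken from the
    cited procedures.

```python
import numpy as np

def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

def permutation_slope_test(x, y, group, n_perm=2000, seed=0):
    """Two-sided permutation test for a slope difference between two known groups."""
    rng = np.random.default_rng(seed)
    group = np.asarray(group, dtype=bool)
    observed = slope(x[group], y[group]) - slope(x[~group], y[~group])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(group)  # shuffle the group labels
        diff = slope(x[perm], y[perm]) - slope(x[~perm], y[~perm])
        if abs(diff) >= abs(observed):
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

rng = np.random.default_rng(1)
n = 150  # observations per group (hypothetical)
group = np.repeat([True, False], n)           # observed grouping variable
x = rng.normal(size=2 * n)
true_slope = np.where(group, 0.7, 0.2)        # hypothetical group-specific effects
y = true_slope * x + rng.normal(scale=0.5, size=2 * n)

observed_diff, p_value = permutation_slope_test(x, y, group)
print(f"slope difference = {observed_diff:.3f}, p = {p_value:.4f}")
```

    The test requires the `group` vector as an input, which is
    exactly the a priori knowledge that is missing when
    heterogeneity is unobserved.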
    Existing Methods to Uncover Unobserved
    Heterogeneity in SEM
    The next sections present methods in CB-SEM and PLS path
    modeling to uncover unobserved heterogeneity.
    CB-SEM Methods to Uncover
    Unobserved Heterogeneity
    In CB-SEM, the following two primary methods have been
    developed to uncover unobserved heterogeneity: (1) finite
    mixture models that extend multigroup CB-SEM (Arminger
    et al. 1999; Dolan and van der Maas 1998; Jedidi et al. 1997)
    and (2) hierarchical Bayesian models that extend multilevel
    CB-SEM (Ansari et al. 2000; Cai and Song 2010; Lee and
    Song 2003). Table 3 presents a summary of these CB-SEM
    methods.
    Finite mixture models for CB-SEM were developed by Jedidi
    et al. (1997), Arminger et al. (1999), and Dolan and van der
    Maas (1998). These models (1) assume that data originate
    from subpopulations (groups) in the overall population that is
    a mixture of them and (2) generalize multigroup CB-SEM
    (Jöreskog 1971; Sörbom 1974) to unobserved latent groups,
    assuming the structural parameters (covariance) and factor
    means to be mixtures of components. The method used for
    finite mixture models assigns the observations to a
    prespecified number of groups by means of fuzzy (probabilistic)
    clustering, thereby permitting the simultaneous estimation of
    group-specific parameters (Jedidi et al. 1997). Consequently,
    finite mixture models address unobserved heterogeneity in the
    data by grouping observations and estimating group-specific
    ³ There are several methods to uncover both observed and unobserved
    heterogeneity in other methodological contexts, for example, regression
    analysis (DeSarbo and Cron 1988; Späth 1979; Wedel and DeSarbo 1994),
    panel data models (Allenby and Rossi 1998; Popkowski Leszczyc and Bass
    1998), and conjoint analysis (DeSarbo et al. 1995; Gilbride et al. 2006;
    Lenk et al. 1996). Given the objectives of our paper and for reasons of
    scope, we do not review these methods.
    Table 3. Overview of CB-SEM Methods to Uncover Unobserved Heterogeneity in SEM

    (Columns: Method; Description; Parameter Estimates;
    Limitations; Illustrative Applications)

    Finite Mixture Models for CB-SEM (Jedidi et al. 1997)
      Description
        Generalizes the multigroup SEM for unobserved group-specific
        differences in the following:
        • Structural parameters (covariance)
        • Factor means
      Parameter estimates
        For a defined number of groups
      Limitations
        • Number of groups is unknown to the researcher
        • Does not account for heterogeneity in the covariance of
          the measures
        • Requires large number of observations (large sample sizes)
      Illustrative applications
        Bart et al. 2005; DeSarbo et al. 2006; Reinecke 2006;
        Tueller and Lubke 2010

    Hierarchical Bayesian CB-SEM (Ansari et al. 2000)
      Description
        Generalizes the multilevel SEM for unobserved
        individual-specific differences in the following:
        • The covariance structure (i.e., structural parameters,
          measurement error variance, and factor covariance)
        • Factor means
      Parameter estimates
        Specific estimates for individuals
      Limitations
        • Needs continuous data with multiple observations per
          individual
        • Only works for recursive structural equation models
        • Not available in standard software packages
      Illustrative applications
        Luo et al. 2008
    parameters simultaneously, thus avoiding well-known biases
    that occur when group-specific models are estimated
    separately (Fraley and Raftery 2002). Several applications and
    simulation studies (e.g., Arminger et al. 1999; Henson et al.
    2007; Jedidi et al. 1997; Tueller and Lubke 2010) illustrate
    the usefulness of finite mixture models by showing how
    structural relationships among factors differ across unobserved
    groups.
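    The probabilistic clustering idea behind finite mixture
    models can be sketched with a deliberately reduced example:
    a mixture of two simple linear regressions estimated by EM.
    This is not the CB-SEM mixture model of the cited authors; it
    assumes one observed predictor, normal errors, a prespecified
    number of groups, and hypothetical parameter values, and the
    function name `fit_mixture_regression` is invented for
    illustration.

```python
import numpy as np

def fit_mixture_regression(x, y, n_groups=2, n_iter=200, n_starts=10, seed=0):
    """EM for a finite mixture of simple linear regressions (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(x), x])
    best = None
    for _start in range(n_starts):               # several random starts
        beta = rng.normal(size=(n_groups, 2))    # intercept and slope per group
        sigma = np.full(n_groups, y.std())
        pi = np.full(n_groups, 1.0 / n_groups)   # mixing proportions
        for _it in range(n_iter):
            # E-step: posterior probability that each observation belongs to each group.
            resid = y[:, None] - X @ beta.T
            log_w = (np.log(pi) - 0.5 * np.log(2 * np.pi * sigma**2)
                     - 0.5 * (resid / sigma) ** 2)
            log_norm = np.logaddexp.reduce(log_w, axis=1)
            resp = np.exp(log_w - log_norm[:, None])
            # M-step: weighted least squares per group, using the memberships as weights.
            for k in range(n_groups):
                w = resp[:, k]
                xtwx = X.T @ (X * w[:, None]) + 1e-8 * np.eye(2)
                beta[k] = np.linalg.solve(xtwx, X.T @ (w * y))
                r = y - X @ beta[k]
                sigma[k] = max(np.sqrt((w * r**2).sum() / max(w.sum(), 1e-12)), 1e-3)
            pi = np.clip(resp.mean(axis=0), 1e-12, None)
        loglik = log_norm.sum()
        if best is None or loglik > best["loglik"]:
            best = {"beta": beta.copy(), "sigma": sigma.copy(),
                    "pi": pi.copy(), "resp": resp, "loglik": loglik}
    return best

rng = np.random.default_rng(1)
n = 300  # observations per latent group (hypothetical)
true_group = np.repeat([0, 1], n)
x = rng.normal(size=2 * n)
y = np.where(true_group == 0, 0.9, -0.9) * x + rng.normal(scale=0.2, size=2 * n)

fit = fit_mixture_regression(x, y)
estimated_slopes = np.sort(fit["beta"][:, 1])
print("estimated group slopes:", estimated_slopes)
print("mixing proportions:", fit["pi"])
```

    The group labels are never passed to the estimator; the
    memberships in `fit["resp"]` are probabilities, which is the
    fuzzy assignment described above.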
    In contrast to finite mixture models, hierarchical Bayesian
    models for CB-SEM, which were developed by Ansari et al.
    (2000), do not assume heterogeneity among a defined number
    of groups of individuals but estimate unobserved
    heterogeneity at the individual⁴ level using a random
    coefficients model. Specifically, they uncover unobserved
    heterogeneity in the factor means and covariance structure
    (i.e., structural parameters, measurement error variance, and
    factor covariance), thereby generalizing multilevel SEM models
    (Muthén 1994; Rabe-Hesketh et al. 2004) that only account
    for heterogeneity in the mean structure. Hierarchical
    Bayesian CB-SEM provides individual-specific estimates for the
    factor scores, structural coefficients, and other model
    parameters (Ansari et al. 2000). However, this method requires
    continuous data with multiple observations per individual to
    estimate individual-level heterogeneity, and the method is
    limited to recursive structural equation models. There has
    been some work (e.g., Cai and Song 2010; Lee and Song
    2003) to extend the method to dichotomous variables and
    missing data and evaluate the performance of these methods.
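    The random coefficients idea can be illustrated with a much
    simpler stand-in: an empirical Bayes shrinkage of
    per-individual regression slopes toward the population mean.
    This sketch is not the hierarchical Bayesian CB-SEM of
    Ansari et al. (2000); it involves no latent variables and no
    MCMC, and the number of individuals, observations per
    individual, and variance values are assumptions.

```python
import numpy as np

def shrink_individual_slopes(x, y):
    """x, y have shape (n_individuals, n_obs_per_individual)."""
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    sxx = (xc**2).sum(axis=1)
    raw = (xc * yc).sum(axis=1) / sxx              # separate OLS slope per individual
    resid = yc - raw[:, None] * xc
    dof = x.size - 2 * x.shape[0]                  # two parameters per individual
    sigma2 = (resid**2).sum() / dof                # pooled error variance
    v = sigma2 / sxx                               # sampling variance of each raw slope
    mu = raw.mean()                                # population mean slope
    tau2 = max(raw.var(ddof=1) - v.mean(), 0.0)    # between-individual variance (moments)
    weight = tau2 / (tau2 + v)                     # 0 = full pooling, 1 = no pooling
    shrunk = mu + weight * (raw - mu)
    return raw, shrunk

rng = np.random.default_rng(3)
n_individuals, n_obs = 200, 8                      # hypothetical panel dimensions
true_slopes = rng.normal(loc=0.5, scale=0.3, size=n_individuals)
x = rng.normal(size=(n_individuals, n_obs))
y = true_slopes[:, None] * x + rng.normal(size=(n_individuals, n_obs))

raw, shrunk = shrink_individual_slopes(x, y)
mse_raw = np.mean((raw - true_slopes) ** 2)
mse_shrunk = np.mean((shrunk - true_slopes) ** 2)
print(f"MSE raw = {mse_raw:.3f}, MSE shrunk = {mse_shrunk:.3f}")
```

    The sketch also makes the data requirement visible: without
    several observations per individual, neither the raw slopes
    nor their sampling variances can be computed.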
    While both the finite mixture and the hierarchical Bayesian
    CB-SEM models have been the subject of extensive
    methodological research, finite mixture models have been
    applied in empirical CB-SEM research to a greater extent. An
    increasing number of applications, especially in the marketing,
    econometrics, and sociology literatures, have utilized finite
    mixture models to uncover unobserved heterogeneity, thereby
    improving theoretical and practical implications (e.g., Bart et
    al. 2005; DeSarbo et al. 2006; Reinecke 2006; Tueller and
    Lubke 2010).
    PLS Path Modeling Methods to Uncover
    Unobserved Heterogeneity
    Although PLS path modeling research has paid limited
    attention to unobserved heterogeneity in comparison to
    CB-SEM research, multiple PLS segmentation methods have been
    proposed. We draw on Sarstedt’s (2008) review of these
    methods to identify the following key PLS segmentation
    methods:
    1. The PATHMOX (path modeling segmentation tree)
       algorithm (Sánchez 2009; Sánchez and Aluja 2006).⁵
       This algorithm requires the a priori specification of
       explanatory variables that are not used as indicators in
       the PLS path model to discover segments. While this
       feature can be advantageous for interpreting discovered
       segments, it limits the heterogeneity discovery process to
       the selected explanatory variables (and their specified
    ⁴ An individual can be a person, group, team, or company that is the
    object of investigation in a study and has provided several observations
    (e.g., over time or within a group).
    ⁵ PATHMOX is available in the pathmox package of the statistical software
    R (Sánchez and Aluja 2012).
       order) that are provided as inputs to the PATHMOX
       algorithm (Sarstedt 2008).
    2. Distance measure-based methods. These methods
       determine the distance of an observation to its current group
       and all other given groups in order to decide on this
       observation’s group membership. PLS typological path
       modeling (PLS-TPM; Squillacciotti 2005; Squillacciotti
       2010) and its enhancement, response-based detection of
       respondent segments in PLS (REBUS-PLS; Esposito
       Vinzi et al. 2010; Esposito Vinzi et al. 2008), are the
       key methods in this class.⁶ Both PLS-TPM and
       REBUS-PLS⁷ can only uncover unobserved heterogeneity in PLS
       path models with reflective measures (i.e., they cannot be
       applied to path models that include formative measures)
       (Esposito Vinzi et al. 2010; Esposito Vinzi et al. 2008).
    3. The finite mixture partial least squares method
       (FIMIX-PLS) (Hahn et al. 2002).⁸ This method assumes that each
       endogenous latent variable is distributed as a finite
       mixture of conditional multivariate normal densities. It
       captures heterogeneity by estimating the probabilities of
       segment memberships for each observation in order to
       optimize the likelihood function. Consequently, it
       implicitly maximizes the segment-specific explained variance
       (i.e., the R² value), which is part of the likelihood
       function. While FIMIX-PLS is generally applicable to PLS
       path models, regardless of whether the latent variables are
       measured reflectively or formatively, it does not account
       for the heterogeneity in the measurement models.
       Moreover, the assumption that the endogenous latent variables
       have a multivariate normal distribution is inconsistent with
       the nonparametric PLS path modeling, which does not
       impose distributional assumptions.
    We select FIMIX-PLS to benchmark the performance of the
    new PLS-POS method for two reasons. First, based on an
    assessment of the benefits and limitations of these methods,
    Sarstedt (2008, p. 152) concludes: “To sum up, FIMIX-PLS
    can presently be viewed as the most comprehensive and
    commonly used approach to capture heterogeneity in PLS
    path modeling.” Second, as our research objectives include
    developing/evaluating a method (i.e., PLS-POS) that detects
    unobserved heterogeneity in both the structural model and
    formative measures, we conduct simulations with both
    formative and reflective models. While PLS-TPM and
    REBUS-PLS are not applicable to PLS path models that include
    formative measures, FIMIX-PLS is applicable to PLS path
    models regardless of the use of reflective/formative
    measurement. We next elaborate briefly on FIMIX-PLS’
    assumptions, procedure, and limitations.
    FIMIX-PLS follows the assumption that heterogeneity is
    concentrated in the parameters of the estimated relationships
    among latent variables (i.e., the path coefficients in the
    structural model). Based on this concept, FIMIX-PLS assigns
    observations to a prespecified number of groups by means of
    probabilistic clustering to optimize the likelihood function
    (which implicitly maximizes the segment-specific explained
    variance as part of the likelihood function), thereby
    simultaneously estimating the model parameters for the groups
    and ascertaining the heterogeneity of the data for the PLS path
    model. It adapts a finite mixture regression model that, in
    contrast to conventional mixture regression models, can be
    comprised of a multitude of interrelated endogenous latent
    variables (Hahn et al. 2002).
    Compared to the finite mixture and hierarchical Bayesian
    CB-SEM, FIMIX-PLS does not account for group-specific mean
    differences of latent variables because it is based on the
    standardized results of an overall sample PLS path model. In
    addition, FIMIX-PLS builds on the latent variable scores of
    the PLS path model estimation using the full set of data and
    thus only focuses on the relationships among latent variables.
    Consequently, it is generally applicable to PLS path models
    (regardless of the latent variables being measured reflectively
    or formatively) but does not account for the heterogeneity in
    the measurement models (e.g., the factor covariance or the
    measurement error variance) (Hahn et al. 2002; Sarstedt and
    Ringle 2010).
    FIMIX-PLS has been applied recently to uncover unobserved
    heterogeneity in PLS path models for success factors in
    industrial goods (Sarstedt et al. 2009), intention to adopt new
    movie distribution services on the Internet (Papies and
    Clement 2008), the American customer satisfaction index
    model (Ringle et al. 2010a), and unanticipated reactions to
    organizational strategy among stakeholder segments (Money
    et al. 2012). The advantage of applying the parametric finite
    mixture regression concept to PLS path models is that it offers
    segment retention criteria (e.g., AIC, BIC, and CAIC; Hahn
    et al. 2002; Sarstedt et al. 2011a) for model selection (i.e., to
    ⁶ Other distance-based methods, which are in earlier stages of
    development and currently not available as software packages, include
    fuzzy PLS path modeling for latent class detection (FPLS-LCD; Palumbo et
    al. 2008) and partial least squares genetic algorithm segmentation
    (PLS-GAS) (Ringle et al. 2010b; Ringle et al. 2013).
    ⁷ The REBUS-PLS method is included in the XLSTAT software as well as in
    the plspm package (Sánchez and Trinchera 2013) of the statistical software
    R (R Core Team 2013).
    ⁸ The FIMIX-PLS method is included in the PLS path modeling software
    SmartPLS (Ringle et al. 2005).
    decide on an appropriate number of segments). However,
    FIMIX-PLS has some limitations in that it (1) assumes that
    the endogenous latent variables in the structural model have
    a multivariate normal distribution (which is inconsistent with
    PLS’ distribution-free assumption) and (2) uses latent variable
    scores in the structural model based on the measurement
    model for the overall sample and ignores plausible
    heterogeneity in the measurement model’s weights. Consequently,
    it not only ignores heterogeneity in the measurement model
    but may also fail to detect heterogeneity in the structural
    model that results from unobserved heterogeneity in the
    measurement model.
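    How segment retention criteria support the choice of the
    number of segments can be sketched with the standard
    definitions of AIC, BIC, and CAIC. The log-likelihoods and
    parameter counts below are made-up illustrative numbers, not
    results from any study, and the helper names are
    hypothetical.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """Standard AIC, BIC, and CAIC for a fitted model (smaller is better)."""
    aic = -2.0 * log_likelihood + 2.0 * n_params
    bic = -2.0 * log_likelihood + n_params * math.log(n_obs)
    caic = -2.0 * log_likelihood + n_params * (math.log(n_obs) + 1.0)
    return {"AIC": aic, "BIC": bic, "CAIC": caic}

def select_number_of_segments(fits, n_obs, criterion="BIC"):
    """fits maps a candidate number of segments to (log_likelihood, n_params)."""
    scores = {
        k: information_criteria(ll, p, n_obs)[criterion]
        for k, (ll, p) in fits.items()
    }
    return min(scores, key=scores.get), scores

# Hypothetical mixture fits for one, two, and three segments (invented values).
fits = {1: (-520.0, 6), 2: (-455.0, 13), 3: (-450.0, 20)}
n_obs = 300

best_k, bic_scores = select_number_of_segments(fits, n_obs, criterion="BIC")
print("segments retained by BIC:", best_k)
print({k: round(v, 2) for k, v in bic_scores.items()})
```

    In these invented numbers the third segment improves the
    likelihood only slightly, so the penalty terms favor the
    two-segment solution.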
    Partial Least Squares–Prediction-Oriented
    Segmentation (PLS-POS)
    To overcome the identified methodological limitations of
    FIMIX-PLS and of existing distance measure-based PLS
    segmentation methods for uncovering unobserved
    heterogeneity, we introduce the PLS prediction-oriented
    segmentation (PLS-POS) method that offers three novel and
    distinctive features: (1) it uses a PLS-specific objective
    criterion to form homogeneous groups that maximize the
    explained variance (R²) of all endogenous latent variables in
    the PLS path model and thereby takes the entire path
    model’s structure into account;⁹ (2) it includes a new distance
    measure that is appropriate for formative measures (and
    heterogeneity within them); and (3) it reassigns observations
    only if reassigning observations improves the objective
    criterion. The latter feature of PLS-POS ensures continuous
    improvement of the objective criterion throughout the
    iterations of the algorithm (hill-climbing approach) and provides
    the ability to uncover very small niche segments. However,
    like the expectation–maximization (EM) algorithm in
    FIMIX-PLS, PLS-POS can face the problem of ending in local optima
    due to its use of a hill-climbing approach. Thus, a repeated
    application of PLS-POS with different starting partitions is
    advisable.
    PLS-POS follows a clustering approach with a deterministic
    assignment of observations to groups and uses a distance
    measure for the reassignment of observations; as such, it has
    no distributional assumptions. The segmentation objective in
    a PLS path model is to form homogeneous groups of
    observations with increased predictive power (R² of the
    endogenous latent variables) of the group-specific path model
    estimates (compared to the overall sample model). In
    accordance with Anderberg’s (1973, p. 195) notion of clustering
    for maximum prediction, a fitting objective criterion for PLS
    segmentation is to maximize the sum of the endogenous latent
    variables’ explained variance (R²) across all groups.
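    The objective criterion can be sketched for the simplest
    possible case: one endogenous variable predicted by observed
    scores, with an ordinary least squares regression standing in
    for the group-specific PLS estimation. The data, the
    coefficient values, and the function names are assumptions;
    the exact criterion used by PLS-POS is specified in
    Appendix B.

```python
import numpy as np

def r_squared(X, y):
    """R² of an OLS regression of y on X (with intercept)."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

def sum_of_group_r2(X, y, labels):
    """Objective: sum of the group-specific R² values for a given partition."""
    return sum(r_squared(X[labels == g], y[labels == g]) for g in np.unique(labels))

rng = np.random.default_rng(7)
n = 100  # observations per group (hypothetical)
true_labels = np.repeat([0, 1], n)
X = rng.normal(size=(2 * n, 2))
# Hypothetical coefficients: each group has one strong and one weak predictor.
coefs = np.where(true_labels[:, None] == 0, [0.9, 0.1], [0.1, 0.9])
y = (X * coefs).sum(axis=1) + rng.normal(scale=0.3, size=2 * n)

random_labels = rng.permutation(true_labels)
objective_true = sum_of_group_r2(X, y, true_labels)
objective_random = sum_of_group_r2(X, y, random_labels)
print(f"true partition: {objective_true:.3f}, random partition: {objective_random:.3f}")
```

    A partition that matches the underlying groups yields a
    clearly higher value of the criterion than an arbitrary one,
    which is what makes the criterion usable for segmentation.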
    A key challenge of this approach is the indeterminacy of the
    data assignment task, as it is unknown how the group-specific
    PLS results will change when an observation is reassigned to
    a different group. For this purpose, the PLS-POS method
    uses a distance measure to identify appropriate observations
    for reassignment that serve as candidates to improve the
    PLS-POS objective criterion. Using a distance measure (i.e.,
    calculating each observation’s distance from its current group
    and from each of the other groups) for segmentation builds on
    an idea of earlier work on distance-measure-based
    segmentation in PLS path modeling (i.e., PLS-TPM and its later
    improvement, REBUS-PLS).
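    The interplay of distance-based candidate selection and
    accept-only-if-improved reassignment can be sketched as
    follows. This is a simplified stand-in, not the PLS-POS
    algorithm of Appendix B: it assumes regressions on observed
    scores instead of group-specific PLS estimations, uses the
    squared residual as a hypothetical distance measure, and adds
    an assumed minimum group size to avoid degenerate fits.

```python
import numpy as np

def fit_ols(X, y):
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return beta

def objective(X, y, labels, n_groups):
    """Sum of group-specific R² values for the current partition."""
    total = 0.0
    for g in range(n_groups):
        Xg, yg = X[labels == g], y[labels == g]
        resid = yg - np.column_stack([np.ones(len(yg)), Xg]) @ fit_ols(Xg, yg)
        total += 1.0 - (resid @ resid) / ((yg - yg.mean()) ** 2).sum()
    return total

def hill_climb_segmentation(X, y, n_groups=2, max_iter=200,
                            max_candidates=10, min_size=10, seed=0):
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_groups, size=len(y))   # random starting partition
    best = objective(X, y, labels, n_groups)
    history = [best]
    Xc = np.column_stack([np.ones(len(y)), X])
    for _ in range(max_iter):
        betas = np.array([fit_ols(X[labels == g], y[labels == g])
                          for g in range(n_groups)])
        dist = (y[:, None] - Xc @ betas.T) ** 2    # distance to every group model
        gain = dist[np.arange(len(y)), labels] - dist.min(axis=1)
        target = dist.argmin(axis=1)
        candidates = [i for i in np.argsort(-gain) if gain[i] > 0][:max_candidates]
        improved = False
        for i in candidates:
            if (labels == labels[i]).sum() <= min_size:
                continue                           # keep groups from becoming too small
            trial = labels.copy()
            trial[i] = target[i]
            value = objective(X, y, trial, n_groups)
            if value > best:                       # reassign only if the objective improves
                labels, best, improved = trial, value, True
                history.append(best)
                break
        if not improved:
            break                                  # local optimum for this start
    return labels, history

rng = np.random.default_rng(7)
n = 60  # observations per group (hypothetical)
true_labels = np.repeat([0, 1], n)
X = rng.normal(size=(2 * n, 2))
coefs = np.where(true_labels[:, None] == 0, [0.9, 0.1], [0.1, 0.9])
y = (X * coefs).sum(axis=1) + rng.normal(scale=0.3, size=2 * n)

labels, history = hill_climb_segmentation(X, y)
print(f"objective: start {history[0]:.3f}, end {history[-1]:.3f}")
```

    Because a move is accepted only when the objective increases,
    the recorded objective values never decrease; the result
    still depends on the starting partition, so calling the
    function with several `seed` values mirrors the advice to
    repeat the procedure with different starts.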
    Appendix B provides the details of PLS-POS’ algorithm,
    objective criterion, and distance measure. It also includes a
    detailed comparison of the technical differences between
    FIMIX-PLS, PLS-TPM, REBUS-PLS, and PLS-POS (Table
    B1). We implement the PLS-POS algorithm as an extension
    of the SmartPLS software (Ringle et al. 2005) to evaluate its
    performance in our simulation study. The extension will be
    made available with the next release of SmartPLS.
    In summary, the PLS-POS method complies with the most
    important objectives in PLS path modeling: It (1) improves
    the objective criterion by nonparametric means, (2) accounts
    for heterogeneity in the structural model as well as in the
    formative measurement model, and (3) is applicable to all path
    models regardless of the type of measurement model, the
    distribution of the data, or the complexity of the structural
    model. Table 4 compares the key properties of PLS-POS and
    FIMIX-PLS, which we use as the benchmark method in this
    study as depicted in the previous section, in terms of five
    desired criteria for a PLS segmentation method.
    In the next section, we detail the comprehensive simulation
    experiments we conducted to evaluate whether the differences
    in the capabilities of FIMIX-PLS and PLS-POS noted in
    Table 4 hold empirically. Specifically, we focused our
    simulations on the criteria in columns 2 through 5 because our
    goal
    ⁹ While PLS-TPM only focuses on a single target construct, REBUS-PLS
    accounts for this limitation by replacing PLS-TPM’s distance measure with
    the goodness-of-fit criterion-based (GoF; Tenenhaus et al. 2005) closeness
    measure. The aim of REBUS-PLS is to detect sources of heterogeneity in
    both the structural and the outer model for all exogenous and endogenous
    latent variables (Esposito Vinzi et al. 2008, p. 444). As in PLS-TPM,
    REBUS-PLS requires reflective measurement models (Esposito Vinzi et al.
    2008). In contrast, by focusing on the R² of all the endogenous latent
    variables as an explicit objective criterion, PLS-POS stresses the
    prediction-oriented character of PLS path modeling and allows the general
    application of this method to PLS path models with both reflective and
    formative measurement models.
    Table 4. Conceptual Capabilities of FIMIX-PLS and PLS-POS

    Desired criteria for a PLS segmentation method:
      (1) Ability to detect heterogeneity in reflective measures
      (2) Ability to detect heterogeneity in formative measures
      (3) Ability to detect heterogeneity in the structural model
      (4) Maximizes group-specific R² of endogenous latent
          variables (prediction orientation)
      (5) Ability to handle nonnormal data

    Segmentation method             (1)   (2)   (3)   (4)   (5)
    FIMIX-PLS (Hahn et al. 2002)     –     –     ✓     ✓     –
    PLS-POS                          ✓*    ✓     ✓     ✓     ✓

    *The method can detect heterogeneity in the reflective model if there is
    heterogeneity in the structural model (i.e., if heterogeneity in the
    reflective measurement model is the source of heterogeneity in the
    structural model).
    is to discover heterogeneity in the structural model and in
    formative measures while assuming measurement invariance
    in the reflective measures.
    Simulations of PLS-POS and
    FIMIX-PLS Performance
    We conducted experiments with simulated data that define the
    true group-specific PLS parameters a priori. We assessed the
    performance of PLS-POS and FIMIX-PLS based on the
    differences between the true parameters and those estimated
    by each method. Subsequently, we compared the
    performance of PLS-POS and FIMIX-PLS in recovering the true
    parameter estimates.
    Model Specification
    Consistent with most simulation studies on PLS path models
    (e.g., Chin et al. 2003), we specified a direct effects path
    model that includes four exogenous latent variables and one
    endogenous variable. We specified two versions of the path
    model: model 1 uses reflective measures for the exogenous
    and endogenous latent variables (Figure 2a), while model 2
    uses formative measures for the exogenous latent variables
    and reflective measures for the endogenous latent variables
    (Figure 2b). While we limit the results reported in this paper
    to those obtained from the simulations of a direct effects path
    model, we also evaluated more complex path models with
    multiple endogenous variables and mediation paths between
    the latent variables. Our results were generally stable for
    these more complex models as well.
    We generated the simulated data so each of the two groups
    has one particularly strong relationship in the structural
    model while all other path coefficients are at lower levels of
    magnitude For example for group 1 the structural path p1
    has a high true parameter value while the structural paths p2
    to p4 have lower true parameter values Conversely for group
    2 p4 has a high true parameter value while the path coeffi
    cients p1 to p3 have lower true values The mean differences
    in the coefficients for path p1 to p4 between group 1 and group
    2 reflect the heterogeneity in the model (ie the differences
    between the groups) The same principle applies to the mea
    surement weights in the formative measures We used four
    formative indicators per construct For group 1 the measure
    ment weights w1 and w3 have high true values while weights
    w2 and w4 have low true values Conversely for group 2 w2
    and w4 have high true values and w1 and w3 have low true
    values The mean differences between the weights for group
    1 and group 2 reflect the amount of heterogeneity in the
    measurement model

    Figure 2. The Models: (a) Reflective Model, (b) Formative Model

    Factor Design of the Simulations

    Our selection of experimental factors and their levels was informed by criteria that were shown
    to influence PLS path modeling or segmentation results in prior simulation studies.
    Specifically, we manipulated the following factors:

    (1) Explained variance (R²) of the endogenous latent variable per group (1.00, .95, .90, .85)¹⁰
        (e.g., Reinartz et al. 2009)
    (2) Structural model heterogeneity, that is, the group-specific differences in structural model
        path coefficients (.25, .50, .75, 1.00) (e.g., Andrews and Currim 2003b)
    (3) Sample size per group (100, 200, 400) (e.g., Chin et al. 2003)
    (4) Data distribution (normal, nonnormal¹¹) (e.g., Reinartz et al. 2009)
    (5) Relative segment sizes (equal, unequal¹²) (e.g., Andrews and Currim 2003b)

    In addition, we manipulated the following factors related to the measurement model:

    (6) Reliability of reflective measures (perfect versus normal; loadings of 1.00 and ~.85)
        (e.g., Chin et al. 2003)
    (7) Measurement model heterogeneity, that is, the group-specific differences in formative
        measurement weights (.25, .50, .75). (We note that, to the best of our knowledge, this
        particular factor has not been examined in prior simulation research on PLS path models.)
    (8) Multicollinearity between formative indicators (none, level 1, level 2)¹³ (Mason and
        Perreault 1991)

    ¹⁰ This manipulation results in R² values of .425 to .5 in the overall sample that combines
    groups. For example, when the R² value in both groups is .85, the overall sample that combines
    the two groups has an R² value of .425 because of unobserved heterogeneity.

    The number of factors and the number of factor levels systematically increase the complexity of
    the PLS segmentation task. The full factorial design for the study results in
    4² × 3 × 2³ = 384 different combinations for the reflective model (model 1) and
    4² × 3³ × 2² = 1,728 different combinations for the formative model (model 2). To ensure
    stability of the results, all factor combinations include 30 data-generation and segmentation
    runs for each segmentation method, so in total (384 + 1,728) × 2 × 30 = 126,720 segmentation
    runs were performed.
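
    The cell counts of the full factorial design can be checked by enumerating the factor levels
    listed above. The following is a minimal sketch, not the authors' simulation code; the
    dictionary keys are shorthand labels chosen for this illustration, and the split of factors
    between the two models is inferred from the stated products (4² × 3 × 2³ and 4² × 3³ × 2²).

```python
from itertools import product

# Factor levels as listed in the text; the key names are illustrative labels only.
shared_factors = {
    "r_squared": [1.00, 0.95, 0.90, 0.85],
    "structural_heterogeneity": [0.25, 0.50, 0.75, 1.00],
    "sample_size_per_group": [100, 200, 400],
    "data_distribution": ["normal", "nonnormal"],
    "relative_segment_size": ["equal", "unequal"],
}
# Assumption: reliability is varied only in the reflective model (model 1) ...
reflective_only = {"reliability": ["perfect", "normal"]}
# ... and these two factors only in the formative model (model 2).
formative_only = {
    "measurement_heterogeneity": [0.25, 0.50, 0.75],
    "multicollinearity": ["none", "level 1", "level 2"],
}


def count_cells(*factor_sets):
    """Count the cells of a full factorial design by enumerating every combination."""
    levels = [lv for factors in factor_sets for lv in factors.values()]
    return sum(1 for _ in product(*levels))


cells_model_1 = count_cells(shared_factors, reflective_only)
cells_model_2 = count_cells(shared_factors, formative_only)

METHODS = 2          # PLS-POS and FIMIX-PLS
RUNS_PER_CELL = 30   # data-generation and segmentation runs per cell and method
total_runs = (cells_model_1 + cells_model_2) * METHODS * RUNS_PER_CELL

print(cells_model_1, cells_model_2, total_runs)
```

    Enumerating the combinations, rather than multiplying the level counts by hand, makes it easy
    to see which factor belongs to which model when a design is changed.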

    ¹¹ For the nonnormal data, we use a log-transformation of the normal data to get a skewness of
    about 2 and a kurtosis of about 5 for the indicators.
    ¹² The unequal condition has one segment with 80% and one with 20% of the total sample size.
    ¹³ For a detailed explanation of this factor, see Appendix C.

    Data Generation

    Simulation studies in PLS path modeling require that data generated for the indicators
    (manifest variables) match the true values of the model. Previous studies on PLS path modeling
    (e.g., Chin et al. 2003; Henseler and Chin 2010; Reinartz et al. 2009) first generated data by
    extracting latent variable scores to match the true relationships in the structural model and
    then generated data for the indicators by adding measurement errors to match the indicators'
    true parameters in the measurement model. This procedure does not allow for generating data
    for formative indicators, as the direction of causality in formative measures is from the
    indicators to the construct (in contrast to reflective measures, where the construct causes
    the indicators). Data for the formative indicators must first be generated to compute the
    latent variable scores for formative constructs. We address this requirement by generating
    random variables for the formative indicators such that the generated formative indicators
    match a prespecified correlation matrix (for modeling multicollinearity in the simulation
    design), the true values of the formative measurement weights, as well as the true values for
    the structural model parameters.

    Performance Assessment

    The objectives of our simulation experiments were to (1) assess PLS-POS and FIMIX-PLS in terms
    of their respective abilities to recover true group-specific parameters, (2) compare PLS-POS
    and FIMIX-PLS based on the assessment of their parameter recovery, and (3) identify the
    relative effects of the design factors on the parameter recovery of PLS-POS and FIMIX-PLS.

    We knew the true parameters of each factorial combination (i.e., the R², path coefficients,
    outer weights, and loadings) a priori based on the parameter settings for the data generation.
    The smaller the differences between the true values and the segmentation method's parameter
    estimates, the better the parameter recovery. As FIMIX-PLS cannot provide segmentation results
    for the measurement model (because parameters are fixed to those resulting from the overall
    sample), we assessed each segmentation method by comparing the structural model's path
    coefficients from the two segmentation methods with the a priori known values. Consistent with
    prior studies (e.g., Henseler and Chin 2010; Reinartz et al. 2002), we evaluated parameter
    recovery using the mean absolute bias (MAB), which is the average of the simple absolute
    deviations between the true parameter and the parameter estimated by the segmentation method.
    MAB values close to zero indicate near perfect parameter recovery. To assess PLS-POS and
    FIMIX-PLS, we compared each method's MAB with the MAB when the overall sample was analyzed
    without uncovering unobserved heterogeneity (i.e., without using a segmentation method).
    Finally, to understand the relative importance of the design factors, we evaluated parameter
    recovery (i.e., the path coefficient's MAB) using a mixed-effects ANOVA model with the two
    segmentation methods (PLS-POS and FIMIX-PLS; within-subjects factor) and the eight design
    factors (between-subjects factors).
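
    The MAB described above is the mean of the absolute deviations between true and estimated path
    coefficients. The sketch below shows the computation; the function name and the example
    coefficients are hypothetical and are not taken from the study.

```python
def mean_absolute_bias(true_values, estimates):
    """Average absolute deviation between true parameters and their estimates."""
    if len(true_values) != len(estimates) or not true_values:
        raise ValueError("need two non-empty sequences of equal length")
    deviations = [abs(est - true) for true, est in zip(true_values, estimates)]
    return sum(deviations) / len(deviations)


# Hypothetical example: one strong path (p1) and three weaker paths (p2 to p4) in a group.
true_paths = [0.70, 0.20, 0.20, 0.20]
estimated_paths = [0.68, 0.23, 0.18, 0.21]

mab = mean_absolute_bias(true_paths, estimated_paths)
print(round(mab, 3))
```

    A value near zero means the estimates sit close to the known generating values; the same
    function can be applied to segment-level or overall-sample estimates to compare the two.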

    Results of the Simulation Experiments

    We discuss the findings for both model 1 (reflective measures) and model 2 (formative
    measures) below, starting with the results for model 1.

    Results for Model 1: Reflective Measures

    Table 5 presents the results for the ANOVA with MAB as the dependent variable. Our extensive
    simulations enabled us to detect even very small effects, indicating high power. For the sake
    of space and simplicity, Table 5 shows only the direct effects, all two-way interactions with
    the method factor, and all other interactions having a significant and substantial effect
    (i.e., explaining more than 2% of the total variance in MAB, implying a partial η² of more
    than .02 (Reinartz et al. 2009)). The partial η² represents the contribution of each factor or
    interaction as if it is the only variable, so its effect is not masked by other variables. See
    Appendix E for the complete results.

    The ANOVA results for model 1 show that parameter recovery is unaffected by the measurement
    model's reliability. The direct effect and all of the interaction effects of reliability are
    nonsignificant. As the reliability has neither a between-subjects nor a within-subjects
    effect, we find no evidence that the accuracy of either segmentation method is affected by the
    reliability of the measurement model.

    The between-subjects effects identify the factors that influenced MAB for both segmentation
    methods. All of the remaining direct effects are significant, with two notable findings:
    (1) sample size (partial η² = .013) and relative segment size (partial η² = .002) have a
    partial eta-square below .02, so their influence on MAB is not substantial, and (2) R² has the
    strongest impact on parameter recovery, both as a direct effect and as an interaction effect
    with structural model heterogeneity. This result is not surprising, as an increasing error in
    the model distorts group differences. As PLS-POS capitalizes on the model's predictive power
    (i.e., the explained variance), the method is better at uncovering heterogeneity when the
    predictive power is high.

    The within-subjects effects identify the differential influence of the design factors on MAB
    across the segmentation methods. In general, the method has a significant and substantial
    impact on the parameter recovery for the reflective model. Furthermore, the method's two
    interaction effects with structural model heterogeneity and R² are significant and
    substantial. All other interaction effects with the method are nonsignificant or are not
    substantial.

    Table 5. Model 1 (Reflective Measures): ANOVA Explaining MAB by Method (PLS-POS/FIMIX-PLS) and
    Design Factors

    Source of Variance in MAB                         df      F-value    p-value    Partial η²
    Between-Subjects Effects
      Intercept                                        1     14658.62      .000        .568
      Structural Model Heterogeneity                   3      1121.71      .000        .232
      R²                                               3      1948.85      .000        .344
      Sample Size                                      2        70.77      .000        .013
      Reliability                                      1         1.88      .170        .000
      Data Distribution                                1       497.52      .000        .043
      Relative Segment Size                            1        22.62      .000        .002
      Structural Model Heterogeneity × R²              9       178.96      .000        .126
      Error                                        11136
    Within-Subjects Effects
      Method                                           1       952.31      .000        .079
      Method × Structural Model Heterogeneity          3       217.47      .000        .055
      Method × R²                                      3       137.14      .000        .036
      Method × Sample Size                             2         4.66      .009        .001
      Method × Reliability                             1          .01      .974        .000
      Method × Data Distribution                       1        87.97      .000        .008
      Method × Relative Segment Size                   1       104.01      .000        .009
      Error (Method)                               11136
    Note: df = degrees of freedom.
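
    The partial η² values in Table 5 can be cross-checked from the reported F-values and degrees
    of freedom, using the standard identity partial η² = (F × df_effect) / (F × df_effect +
    df_error). The sketch below is an illustration added here, not part of the original analysis;
    it reads the Table 5 entries with their decimal points restored (e.g., F = 1948.85 for R²,
    with 3 and 11,136 degrees of freedom).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F statistic and its degrees of freedom.

    Since F = (SS_effect / df_effect) / (SS_error / df_error), the ratio
    SS_effect / (SS_effect + SS_error) equals F*df_effect / (F*df_effect + df_error).
    """
    scaled_effect = f_value * df_effect
    return scaled_effect / (scaled_effect + df_error)


DF_ERROR_MODEL_1 = 11136  # error degrees of freedom reported in Table 5

eta_r_squared = partial_eta_squared(1948.85, 3, DF_ERROR_MODEL_1)
eta_structural = partial_eta_squared(1121.71, 3, DF_ERROR_MODEL_1)
eta_sample_size = partial_eta_squared(70.77, 2, DF_ERROR_MODEL_1)

print(round(eta_r_squared, 3), round(eta_structural, 3), round(eta_sample_size, 3))
```

    Under this reading, the three values round to .344, .232, and .013, matching the table, and
    only the first two exceed the .02 threshold that the text uses for a substantial effect.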

    Table 6 shows the MAB for each factor level when PLS-POS or FIMIX-PLS is applied to uncover
    heterogeneity or when the overall sample is analyzed without the use of a segmentation method.
    A detailed examination of the significant interaction effects of the method with the
    structural model heterogeneity and the R² shows that the MAB for PLS-POS increases more than
    the MAB for FIMIX-PLS when the structural model heterogeneity or the R² is lower (Figures 3a
    and 3b). However, using PLS-POS results in a MAB that is still very low compared to the MAB
    when the overall sample was analyzed without the use of a segmentation method.

    Overall, the results reveal that, for model 1 (reflective measures), both methods perform
    equally well in almost all conditions. FIMIX-PLS is slightly better than PLS-POS when the R²
    or the structural model heterogeneity is low, and the bias from using either of the two
    methods (FIMIX-PLS or PLS-POS) is much lower than the bias from analyzing the overall sample
    without uncovering heterogeneity.
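
    In Table 6, the no-segmentation MAB is .125, .250, .375, and .500 at structural model
    heterogeneity levels of .25, .50, .75, and 1.00, that is, half the group difference in each
    case. One way to see why is sketched below under simplifying assumptions that are not the
    study's setup: a single standardized predictor, ordinary least squares instead of PLS, two
    equally sized groups, and hypothetical slopes (.75 and .25) and noise level. A pooled slope
    that falls between two group slopes misses them by half their difference on average.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_group(n, slope, rng, noise_sd=0.3):
    """Draw n observations with y = slope * x + noise (hypothetical data-generating process)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [slope * a + rng.gauss(0.0, noise_sd) for a in x]
    return x, y


rng = random.Random(42)
slope_1, slope_2 = 0.75, 0.25  # hypothetical group slopes; their difference is .50

x1, y1 = simulate_group(5000, slope_1, rng)
x2, y2 = simulate_group(5000, slope_2, rng)

# Ignoring the groups: one slope for the combined sample.
pooled = ols_slope(x1 + x2, y1 + y2)
mab_pooled = (abs(pooled - slope_1) + abs(pooled - slope_2)) / 2

# Knowing the groups: one slope per group.
mab_separate = (abs(ols_slope(x1, y1) - slope_1) + abs(ols_slope(x2, y2) - slope_2)) / 2

print(round(pooled, 2), round(mab_pooled, 3), round(mab_separate, 3))
```

    Whenever the pooled slope lies between the two group slopes, its mean absolute bias equals
    half their difference regardless of sample size, whereas the group-wise bias shrinks as the
    groups grow. This is only an analogy for the pattern in the table, not a reproduction of it.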

    Results for Model 2: Formative Measures

    Table 7 presents the results for the ANOVA in model 2 (formative measures) with MAB as the
    dependent variable. Again, for the sake of space and simplicity, Table 7 presents the direct
    effects, all two-way interactions with the method, and all other interactions that have
    significant and substantial effects (partial η² of more than .02). See Appendix F for the
    complete results.

    For the between-subjects effects, all of the direct effects on MAB are significant, but again
    the effect of relative segment size (partial η² = .012) on MAB is not substantial.
    Interestingly, the relative segment size and sample size have a substantial interaction in
    this model (partial η² = .054). The MAB decreases for increased sample sizes in groups of
    equal size but stays constant for increased sample sizes in unequal groups.

    The MAB for both segmentation methods is influenced by the heterogeneity in the structural
    model, the heterogeneity in the measurement model, the R² of the model, the sample size, the
    data distribution, and the multicollinearity. In contrast to the results for model 1
    (reflective measures), it is not the R² (partial η² = .204) but the structural model
    heterogeneity that has the highest impact (partial η² = .313) on parameter recovery for model
    2 (formative measures). Measurement model heterogeneity (a factor that is only relevant for
    formative measures) is the third most important factor and explains about 10 percent of the
    MAB variance (partial η² = .104). Moreover, the interaction effects between the structural
    model and measurement model heterogeneity, as well as between measurement model heterogeneity
    and multicollinearity, are significant and substantial but have very little impact compared to
    the factors discussed earlier.

    Figure 3. MAB of Both Segmentation Methods for Model 1 (Reflective Measures): (a) MAB of Both
    Segmentation Methods for Different Structural Model Heterogeneity, (b) MAB of Both
    Segmentation Methods for Different R² Values

    Table 6. MAB in Model 1 (Reflective Measures) for Each Method

    Design Factor                     Level        PLS-POS   FIMIX-PLS   No Segmentation Method
    Structural Model Heterogeneity    .25            .055       .030          .125
                                      .50            .033       .016          .250
                                      .75            .019       .013          .375
                                      1.00           .012       .013          .500
    R²                                .85            .054       .033          .312
                                      .90            .038       .023
                                      .95            .025       .013
                                      1.00           .002       .003
    Sample Size                       100            .032       .021          .312
                                      200            .031       .018
                                      400            .026       .015
    Reliability                       Perfect        .030       .018          .312
                                      Normal         .029       .018
    Data Distribution                 Normal         .024       .015          .312
                                      Nonnormal      .036       .021
    Relative Segment Size             Equal          .027       .019          .312
                                      Unequal        .033       .017
    Overall                                          .030       .018          .312
    Note: All entries are mean absolute bias. Where a single no-segmentation value is shown for a
    factor, it applies to all levels of that factor.

    Table 7. Model 2 (Formative Measures): ANOVA Explaining MAB by Method (PLS-POS/FIMIX-PLS) and
    Design Factors

    Source of Variance in MAB                                    df       F-value   p-value   Partial η²
    Between-Subjects Effects
      Intercept                                                   1     142696.80     .00       .740
      Structural Model Heterogeneity                              3       7605.33     .00       .313
      Measurement Model Heterogeneity                             2       2912.99     .00       .104
      R²                                                          3       4286.31     .00       .204
      Sample Size                                                 2        864.77     .00       .033
      Relative Segment Size                                       1        629.83     .00       .012
      Data Distribution                                           1       1465.75     .00       .028
      Multicollinearity                                           2        848.18     .00       .033
      Structural Model Het. × Measurement Model Het.              6        298.09     .00       .034
      Sample Size × Relative Segment Size                         2       1426.86     .00       .054
      Measurement Model Het. × Multicollinearity                  4        287.84     .00       .022
      Error                                                   50112
    Within-Subjects Effects
      Method                                                      1       3938.52     .00       .073
      Method × Structural Model Het.                              3       3987.98     .00       .193
      Method × Measurement Model Het.                             2       6771.05     .00       .213
      Method × R²                                                 3        826.32     .00       .047
      Method × Sample Size                                        2        227.55     .00       .009
      Method × Relative Segment Size                              1        171.66     .00       .003
      Method × Data Distribution                                  1          2.97     .08       .000
      Method × Multicollinearity                                  2       1739.12     .00       .065
      Method × Structural Model Het. × Measurement Model Het.     6        976.49     .00       .105
      Method × Structural Model Het. × Multicollinearity          6        372.96     .00       .043
      Method × Measurement Model Het. × Multicollinearity         4        257.24     .00       .020
      Error (Method)                                          50112
    Note: df = degrees of freedom; Het. = Heterogeneity.

    For the within-subjects effects, the method's effect on MAB is significant and substantial.
    The method also significantly and substantially interacts with heterogeneity in both the
    structural model and the measurement model. Looking at these interaction effects in more
    detail reveals that PLS-POS performs consistently well across all of the factor levels, while
    the performance of FIMIX-PLS deteriorates with decreasing structural model heterogeneity or
    increasing measurement model heterogeneity. Interestingly, the three-way interaction of method
    with structural and measurement model heterogeneity is also significant and substantial
    (partial η² = .105) (Figures 4a and 4b). While the MAB for PLS-POS is always below .05,
    thereby indicating good parameter recovery, the MAB for FIMIX-PLS increases when measurement
    model heterogeneity becomes higher and structural model heterogeneity becomes lower.

    Table 8 shows the MAB for each factor level in model 2 (formative measures) and reveals that
    the level of structural or measurement model heterogeneity only slightly affects parameter
    recovery for PLS-POS. In contrast, parameter recovery for FIMIX-PLS deteriorates with
    decreasing structural model heterogeneity or increasing measurement model heterogeneity. Thus,
    FIMIX-PLS is as good as PLS-POS in situations with very high structural model heterogeneity,
    regardless of the measurement model heterogeneity, and also in situations where the
    measurement model heterogeneity is low and the structural model heterogeneity is at moderate
    levels. Therefore, as the results in Figures 4a and 4b reveal, the parameter recovery ability
    of a segmentation method cannot be assessed independently for these two types of
    heterogeneity.

    Figure 4. MAB of Both Methods for Different Structural and Measurement Model Heterogeneity:
    (a) PLS-POS, (b) FIMIX-PLS

    Table 8. MAB in Model 2 (Formative Measures) for Each Method

    Design Factor                      Level        PLS-POS   FIMIX-PLS   No Segmentation Method
    Structural Model Heterogeneity     .25            .038       .089          .132
                                       .50            .039       .052          .250
                                       .75            .032       .031          .375
                                       1.00           .025       .016          .500
    Measurement Model Heterogeneity    .25            .039       .024          .312
                                       .50            .033       .042          .312
                                       .75            .029       .074          .318
    R²                                 .85            .057       .056          .314
                                       .90            .041       .050
                                       .95            .025       .043
                                       1.00           .011       .038
    Sample Size                        100            .043       .050          .314
                                       200            .030       .047
                                       400            .028       .043
    Data Distribution                  Normal         .030       .043          .314
                                       Nonnormal      .037       .051
    Relative Segment Size              Equal          .029       .046          .314
                                       Unequal        .038       .048
    Multicollinearity                  None           .031       .062          .314
                                       Level 1        .034       .041
                                       Level 2        .036       .037
    Overall                                           .034       .047          .314
    Note: All entries are mean absolute bias. Where a single no-segmentation value is shown for a
    factor, it applies to all levels of that factor.

    Table 9. Empirical Evaluation Summary of FIMIX-PLS and PLS-POS

    Desired Criteria for a PLS Segmentation Method                 FIMIX-PLS             PLS-POS
                                                                   (Hahn et al. 2002)
    Ability to detect heterogeneity in reflective measures         Not tested            Not tested
    Ability to detect heterogeneity in formative measures          –                     T
    Ability to detect heterogeneity in the structural model        T                     T
    Maximizes group-specific R² of endogenous latent variables     T                     T
      (prediction orientation)
    Ability to handle nonnormal data                               T                     T
    Note: T indicates support by the simulation experiments; – indicates that the criterion is not
    associated with the method.

    It is worth noting that the interaction effect between method and data distribution is not
    substantial for either model 1 (reflective measures) or model 2 (formative measures). In
    addition, data distribution only has a small impact on parameter recovery in both model 1 and
    model 2 (direct effects of partial η² = .043 and partial η² = .028). Accordingly, we conclude
    that both methods perform equally well with both normal and nonnormal distributions. This
    finding is especially interesting, as FIMIX-PLS assumes multivariate normal distributions of
    the endogenous latent variables, which should theoretically result in unfavorable performance
    with nonnormal data compared to PLS-POS. However, with several indicators for each construct,
    the composite latent variable scores might become essentially normal, even if the indicators
    are not. This might explain this initially surprising result.

    Summary of Results

    Overall, we can conclude that the use of either PLS-POS or FIMIX-PLS is better for reducing
    biases in parameter estimates and avoiding inferential errors than ignoring unobserved
    heterogeneity in PLS path models. A notable exception is when there is low structural model
    heterogeneity and high formative measurement model heterogeneity; in this condition, FIMIX-PLS
    produces results that are even more biased than those resulting from ignoring heterogeneity
    and estimating the model at the overall sample level. PLS-POS shows very good performance in
    uncovering heterogeneity for path models involving formative measures and is significantly
    better than FIMIX-PLS, which shows unfavorable performance when there is heterogeneity in
    formative measures. However, FIMIX-PLS becomes more effective when there is high
    multicollinearity in the formative measures, while PLS-POS consistently performs well. There
    are two interrelated reasons for this result: (1) multicollinearity masks heterogeneity in the
    measurement model, making the measures more similar (i.e., homogenous) across groups, and
    (2) FIMIX-PLS ignores heterogeneity in the measurement model and therefore the
    multicollinearity problems in formative indicators. The strongly correlated formative measures
    become closer to a homogenous reflective measurement of the construct. Therefore, the
    performance of PLS-POS and FIMIX-PLS converges in situations with high multicollinearity
    because FIMIX-PLS performs marginally better in purely reflective models (model 1), regardless
    of the distribution being normal or nonnormal. However, the performance differences between
    FIMIX-PLS and PLS-POS are much smaller in the case of a reflective model than in the case of a
    formative model. Therefore, PLS-POS is more generally applicable than FIMIX-PLS to discover
    heterogeneity in PLS path models.

    Thus, the simulation experiments provide an empirical assessment of the segmentation criteria
    associated with PLS-POS and FIMIX-PLS (Table 9). All criteria associated with each of these
    methods are supported by our findings, with the exception that FIMIX-PLS does not degrade in
    performance with nonnormal data.

    A Process for Unobserved Heterogeneity Discovery

    Given the availability of methods to uncover unobserved heterogeneity, as discussed in the two
    previous sections, researchers working with SEM face the following two major questions: when
    to investigate unobserved heterogeneity and how to apply methods for uncovering unobserved
    heterogeneity and defining segments. We address these questions by proposing a UHD process
    (Figure 5) and also by identifying how this process can be applied given the research
    objective (i.e., purely testing a model or testing and elaborating a model; Colquitt and
    Zapata-Phelan 2007).

    How to Apply the UHD Process

    When selecting an appropriate UHD method, researchers have to determine whether they are
    interested in evaluating unobserved heterogeneity associated with latent segments or
    individual-level estimates (e.g., hierarchical Bayesian approach, fixed effects, and random
    effects). As our focus is on the discovery of latent segments, we propose a UHD process for
    defining the segments in this context. In contrast, if the objective is to examine unobserved
    heterogeneity for individual-level estimates, the described UHD process does not apply because
    the methods have different assumptions and objectives and require different data (i.e.,
    several observations per individual). The UHD process for the discovery of latent segments
    consists of the following three stages:

    1. Selecting an appropriate UHD method
    2. Applying the segmentation method to define the segments
       a. Using heuristics to narrow the range of statistically well-fitting segments
       b. Separating relevant from irrelevant segments (Are the segments substantial?)
       c. Testing the significance of the differences between segments (Are the segments
          differentiable?)
       d. Characterizing segments using constructs in the model/theory (Are the segments
          plausible?)
       e. Turning unobserved heterogeneity into observed heterogeneity (Are the segments
          accessible?)
    3. Validating the segmentation results

    Selecting an Appropriate UHD Method (Stage 1 of the UHD Process)

    As discussed earlier, the methodological options for analyzing unobserved heterogeneity
    involving CB-SEM cover two conceptually different approaches (i.e., latent segment analysis
    and individual-level estimate correction). For latent segment analysis, the appropriate UHD
    choice is the finite mixture model, as no model-based clustering alternative is available.

    For analyses involving PLS path modeling, there are no methods available that address
    unobserved heterogeneity associated with individual-level estimates. Latent segments in PLS
    path modeling can be uncovered using one of the two methods we present in this paper (i.e.,
    FIMIX-PLS and PLS-POS). Our simulation results show that FIMIX-PLS is restricted to uncovering
    unobserved heterogeneity in the structural model, while PLS-POS can uncover unobserved
    heterogeneity in both the measurement and structural models. Therefore, researchers should
    choose FIMIX-PLS if their models include only reflective measures and heterogeneity is
    expected to affect only the structural model and not the measurement model. In contrast,
    PLS-POS should be applied for discovering unobserved heterogeneity when PLS path models
    include formative measures and heterogeneity can affect both the structural and measurement
    models.

    Applying the UHD Method to Define Segments (Stage 2 of the UHD Process)

    After choosing the appropriate method for uncovering unobserved heterogeneity, the researcher
    has to apply the method to evaluate whether significant unobserved heterogeneity is present in
    the model and to define the number of segments to retain from the data. Determining the
    correct number of segments is important, as under- or oversegmentation leads to biased results
    and misinterpretations. The second stage of the UHD process focuses on (1) defining, with
    heuristics, a range of statistically well-fitting segments and (2) evaluating the segments
    based on theoretical considerations. The steps in this stage emphasize that researchers
    (1) evaluate the plausibility of segments by connecting the segmentation solution to theory
    and (2) avoid capitalizing on data idiosyncrasies to improve the explained variance or
    significance of parameters.

    Stage 2, Step 1: Narrow the range of statistically well-fitting segments. To determine the
    best fitting number of segments, the researcher has to apply the selected segmentation method
    for a consecutive number of segments (e.g., 1 to 10) and assess the method-specific heuristics
    to generate information on the number of segments that result in good model fit. Researchers
    have to rely on heuristics to determine a well-fitting number of segments, as there is no
    exact statistical test to accomplish this task (McLachlan and Peel 2000). In mixture models,
    these heuristics include model-selection criteria that are well known from the model-selection
    literature (e.g., AIC, BIC, and CAIC) and can also be used to approximate the best fitting
    number of segments (Andrews and Currim 2003a; Sarstedt et al. 2011a).
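
    The model-selection criteria named above penalize the log-likelihood by the number of
    estimated parameters. The sketch below uses the standard definitions of AIC, BIC, and CAIC;
    the log-likelihoods, parameter counts, and sample size are hypothetical values invented for
    illustration and would in practice come from fitting the mixture model for each number of
    segments.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Standard AIC, BIC, and CAIC for a fitted model (lower values indicate better fit)."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "CAIC": deviance + n_params * (math.log(n_obs) + 1),
    }


# Hypothetical fits: number of segments -> (log-likelihood, number of free parameters).
candidate_fits = {
    1: (-1250.0, 5),
    2: (-1180.0, 11),
    3: (-1172.0, 17),
    4: (-1169.0, 23),
}
N_OBS = 400  # hypothetical sample size

criteria_by_k = {
    k: information_criteria(ll, p, N_OBS) for k, (ll, p) in candidate_fits.items()
}

# Number of segments preferred by each criterion.
preferred = {
    name: min(criteria_by_k, key=lambda k: criteria_by_k[k][name])
    for name in ("AIC", "BIC", "CAIC")
}
print(preferred)
```

    In this invented example, AIC prefers three segments while BIC and CAIC prefer two, so the
    criteria narrow the candidates to a small range rather than settle on one answer.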

    In contrast, model-based clustering methods such as PLS-POS are not based on the mixture model
    concept and do not provide model-selection criteria. These methods require other
    model-specific heuristics to compare the results across different numbers of groups, for
    example, in terms of their average explained variance (R²) or the increase in predictive
    relevance (Q²). However, researchers should not rely purely on heuristics (e.g.,
    model-selection criteria in finite-mixture modeling or the explained variance per segment in
    PLS-POS) to retain the best fitting number of segments because past studies have shown
    heuristics to have a low probability of finding the true number of segments. There is some
    empirical evidence that the best information criteria in mixture models only have about a 60
    percent chance of identifying the true number of segments (Andrews and Currim 2003a, 2003b;
    Sarstedt et al. 2011a). Consequently, relying on heuristics can lead to strongly data-driven
    outcomes if the researcher fits the number of segments to the data without considering the
    theoretical or practical meaning of the segments. Therefore, these heuristics should only be
    used to narrow the range of segments for further theoretical assessment.

    Figure 5. Unobserved Heterogeneity Discovery (UHD) Process

    Regardless of whether mixture models or model-based clustering is used, if multiple heuristics
    clearly point to a one-segment solution, the researcher might conclude that the threat to
    validity from unobserved heterogeneity is low and the overall sample represents a homogenous
    population. This will occur when (1) the average variance explained in PLS path models for the
    multisegment solution is substantially lower than for the overall sample and (2) the
    model-selection criteria in the mixture models collectively indicate a one-segment solution as
    showing the best fit and a large deterioration in fit for the best multisegment solution.
    Stage 2, Step 2: Are the segments substantial? The next step
    after defining a range of well-fitting segments is to separate
    relevant from irrelevant segments. Often, segmentation
    methods produce very small but well-fitting segments that are
    likely to represent data idiosyncrasies (e.g., outliers and bad
    respondents). However, the problem with these very small
    segments is that they may (1) be irrelevant for theory or
    practice (e.g., outliers), (2) represent statistical artifacts or
    data collection problems (e.g., bad respondents), (3) yield
    unreliable parameter estimates because of the small sample size,
    and (4) not be usable in the next step of the UHD process
    (i.e., multigroup difference testing). Therefore, each segment
    has to be large enough to represent a real segment; however,
    one also needs to be cautious when contrasting niche
    and irrelevant segments. Each segment should therefore be
    carefully assessed as to whether it represents a substantial
    segment. A guideline for this analysis might be to take the
    average expected segment size to evaluate a segment’s relevance
    (i.e., five segments would suggest an average expected segment
    size of 20 percent). If the segment size is considerably lower in
    proportion (e.g., a 2 percent segment size), it is a candidate for
    exclusion as an irrelevant segment. In addition, the total
    segment size should meet the minimum standards for reliable
    parameter estimates for the given SEM estimation method
    (i.e., CB-SEM and PLS path modeling). The researcher will
    need to determine if the segment may be a niche segment that
    is substantial and needs to be evaluated further in the next
    steps of the UHD process.
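
    The size guideline above can be sketched as a simple screening
    helper. The cutoff for "considerably lower" (here one quarter of
    the average expected share) and the minimum absolute segment size
    are assumptions chosen for illustration only; the text does not
    prescribe numeric thresholds, and flagged segments still require
    the researcher's judgment about niche segments.

```python
def flag_small_segments(segment_sizes, ratio=0.25, min_n=30):
    """Flag segments that are small relative to the average expected share.

    ratio and min_n are hypothetical thresholds, not values from the paper.
    """
    total = sum(segment_sizes)
    expected_share = 1.0 / len(segment_sizes)  # e.g., 5 segments -> 20 percent
    report = []
    for index, size in enumerate(segment_sizes, start=1):
        share = size / total
        report.append({
            "segment": index,
            "share": share,
            # Candidate for exclusion: share far below the expected share.
            "below_expected": share < ratio * expected_share,
            # Possibly too few cases for reliable parameter estimates.
            "below_min_n": size < min_n,
        })
    return report


# Hypothetical five-segment solution with one 2 percent segment.
report = flag_small_segments([150, 120, 110, 110, 10])
for row in report:
    print(row)
```
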
    Stage 2, Step 3: Are the segments differentiable? To
    determine whether heterogeneity significantly affects the results,
    the substantial segments from the previous step need to be
    tested to determine the significance of group differences,
    assessing if a given segment is differentiable from others.
    Therefore, researchers should perform multigroup structural
    equation modeling or multigroup PLS analysis and assess
    (1) the measurement invariance/equivalence and (2) the
    significance of differences in path coefficients between segments.
    If a segment is not significantly different from other segments,
    researchers should consider either combining the segment
    meaningfully with other segments that are not significantly
    different from it or reducing the number of segments in the
    segmentation method. A reason for nonsignificant segment
    differences might be that the prespecified number of segments
    for extraction in the segmentation method has caused
    overfitting of the data. If no significant differences are detected
    among any of the segments, researchers should conclude they
    have a homogenous population and low validity threats due to
    unobserved heterogeneity.
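
    To illustrate the logic of testing whether two segments differ in
    a path coefficient, the following sketch runs a generic
    permutation test on simulated data. It is a simplified stand-in:
    a simple regression slope takes the place of a path coefficient,
    the data and effect sizes are hypothetical, and the code is not
    the multigroup CB-SEM or PLS procedure itself, nor does it assess
    measurement invariance.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of y on x (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def permutation_test_slope_difference(group_a, group_b, n_permutations=2000, seed=1):
    """Two-sided permutation p-value for the absolute slope difference."""
    rng = random.Random(seed)
    observed = abs(slope(*zip(*group_a)) - slope(*zip(*group_b)))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign cases to two groups of the original sizes.
        rng.shuffle(pooled)
        diff = abs(slope(*zip(*pooled[:n_a])) - slope(*zip(*pooled[n_a:])))
        if diff >= observed:
            extreme += 1
    return observed, (extreme + 1) / (n_permutations + 1)


def simulate_segment(n, beta, rng):
    """Simulate (x, y) pairs with y = beta * x + noise (hypothetical data)."""
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        data.append((x, beta * x + rng.gauss(0.0, 0.5)))
    return data


rng = random.Random(42)
segment_1 = simulate_segment(150, 0.8, rng)  # strong effect
segment_2 = simulate_segment(150, 0.1, rng)  # weak effect
observed_diff, p_value = permutation_test_slope_difference(segment_1, segment_2)
print(round(observed_diff, 3), round(p_value, 4))
```

    A small p-value here would indicate that the two simulated
    segments are differentiable on this coefficient; a large one
    would suggest merging them or reducing the number of segments.
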
    Stage 2, Step 4: Are the segments plausible? Given a set of
    differentiable segments, the next step is to evaluate whether
    the segments are plausible. This plausibility assessment is to
    be conducted by characterizing the segments with the
    constructs in the model/theory. Each segment’s theoretical
    plausibility should be assessed by considering (1) the
    segment-specific characteristics based on constructs in the
    model/theory, (2) the conceptual differences between the
    segment and other segments, and (3) the segment’s theoretical
    or managerial relevance. If it is plausible within the specific
    research domain that segments can change the explanatory
    role of the constructs (e.g., certain types of IS users
    emphasize different IS characteristics, which changes the role of
    the constructs in predicting usage), researchers should include
    user type segments in their theoretical implications to avoid
    the premature invalidation or overgeneralization of theoretical
    claims based on results from the overall sample. If a segment
    is not theoretically plausible, it should also be considered a
    limitation of the theory. One possible reason for an
    implausible segment could be that it was mistaken as substantial
    when it actually represented outliers. Future research should
    use (1) complementary theoretical elaboration and/or
    (2) empirical reevaluation to resolve the anomaly of
    differentiable segments that cannot be explained. However, because
    unobserved heterogeneity can threaten the validity of conclusions
    based on the overall sample due to significant segment differences,
    differentiable segments that are not plausible should not be
    part of a combined sample used to test the model/hypotheses.
    Stage 2, Step 5: Are the segments accessible? The last step
    in applying the segmentation methods is to turn unobserved
    heterogeneity into observed heterogeneity by making the
    segments accessible. Researchers can further elaborate on the
    theoretical meaning of the plausible segments by identifying
    additional variables (e.g., demographic, psychographic,
    contextual, etc.) beyond the original model that (1) help
    distinguish the segments by explaining the differences between
    retained segments and (2) determine to which segment
    responses belong. Statistical techniques to support this step
    include (1) discriminant analysis, (2) exhaustive CHAID, and
    (3) contingency tables, where potential variables are tested for
    their ability to explain segment differences. However, instead
    of applying an ad hoc approach, complementary theoretical
    considerations should guide the process of identifying
    external variables. It should not be a process in which the best
    discriminating leftover variable in the dataset (that is not
    part of the model) is used to explain segment differences. If
    it is not possible to identify theoretically reasonable variables
    within the given dataset/study that have sufficient explanatory
    power to differentiate between segments, suggestions for
    additional variables based on complementary theoretical
    perspectives should guide future research.
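
    Of the techniques listed above, the contingency table is the
    simplest to sketch. The example below cross-tabulates segment
    membership with one theoretically motivated external variable and
    reports the Pearson chi-square statistic, its degrees of freedom,
    and Cramér's V as an effect size. The variable (user experience)
    and all counts are hypothetical.

```python
import math
from collections import Counter


def chi_square_and_cramers_v(segment_labels, external_values):
    """Pearson chi-square, degrees of freedom, and Cramér's V for two
    categorical variables given as equally long sequences."""
    n = len(segment_labels)
    cell_counts = Counter(zip(segment_labels, external_values))
    row_totals = Counter(segment_labels)
    col_totals = Counter(external_values)
    min_dim = min(len(row_totals), len(col_totals)) - 1
    if min_dim == 0:
        raise ValueError("Each variable needs at least two categories.")
    chi_square = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = cell_counts.get((row, col), 0)
            chi_square += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    cramers_v = math.sqrt(chi_square / (n * min_dim))
    return chi_square, dof, cramers_v


# Hypothetical assignment of 100 respondents to two segments,
# cross-classified by self-reported experience.
segments = [1] * 50 + [2] * 50
experience = ["high"] * 40 + ["low"] * 10 + ["high"] * 12 + ["low"] * 38
chi_square, dof, cramers_v = chi_square_and_cramers_v(segments, experience)
print(round(chi_square, 2), dof, round(cramers_v, 3))
```

    A strong association would support the external variable as a
    way to make the segments accessible, provided that theory, and
    not only the statistic, motivates the choice of variable.
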
    Validating the Segmentation Results
    (Stage 3 of the UHD Process)
    In the final stage of the UHD process, researchers should
    validate the segmentation results, including the number of
    segments, with external data not used in the estimation
    process. Researchers may (1) apply holdout sample
    validation techniques using data that are already available (Andrews
    et al. 2010; Bapna et al. 2011), (2) use cross-validation with
    random splits to compare the stability of segmentation results
    (Jedidi et al. 1997), or (3) collect additional data (e.g., in a
    follow-up study) to evaluate the results and find new
    explanatory variables that match segments better to explain
    heterogeneity (i.e., make them accessible). Furthermore, repeating
    the segmentation study on a different population (i.e., sample)
    and testing the proposed explanatory variables (i.e.,
    moderators or grouping variables) in follow-up studies increases the
    generalizability of the results.
    When to Apply Methods to Uncover
    Unobserved Heterogeneity
    Given a model that is grounded in substantive theory, the
    complexity of the social and behavioral phenomena examined
    in IS research makes it plausible that there will be heterogeneity
    in any sample that is used to test and refine the model.
    Accordingly, we recommend that all empirical IS research
    should consider the discovery of unobserved heterogeneity
    following the UHD process, just as we evaluate reliability and
    validity. However, researchers should (1) only use
    segmentation methods when substantive theory supports the model and
    (2) avoid using segmentation methods in models that are not
    well grounded in theory to merely improve the explained
    variance or the significance of parameters. As Jedidi et al.
    (1997, p. 57) observe, one practice that should be avoided is
    that of fitting a … model which is not well grounded in
    substantive theory and simply adding segments until a reasonable
    fit is found. This rule applies to both CB-SEM and PLS path
    modeling, regardless of the unobserved heterogeneity
    discovery method that is to be used.
    For models grounded in substantive theory, the objectives for
    discovering unobserved heterogeneity can differ depending on
    the study’s research objectives. If the research objective is
    theory testing (i.e., testers; Colquitt and Zapata-Phelan 2007),
    uncovering unobserved heterogeneity serves as a validity
    check to safeguard against biases and the false rejection or
    false confirmation of theoretical claims. When the theory
    tester uncovers unobserved heterogeneity in the sample (i.e.,
    significant segment differences are detected and the segments
    are determined to be theoretically plausible), he/she has
    evidence of a theoretical breakdown given the segments. As
    such, the discovery of unobserved heterogeneity safeguards
    against (1) premature invalidation of theoretical claims (i.e.,
    the results based on the overall sample suggest certain
    relationships are nonsignificant, but the significance of these
    relationships is actually masked by the heterogeneity) and
    (2) premature overgeneralization of theoretical claims (i.e.,
    the model/theory holds in some segments and not in others,
    thus requiring qualifiers for support found for the theory in
    different segments). Hence, theory testers apply the UHD
    process to evaluate validity threats due to unobserved
    heterogeneity. If significant differences across plausible segments
    are detected, researchers should revise the boundary
    conditions for the theory (i.e., specify within which plausible
    segments the theory was supported and in which it was not).
    If unobserved heterogeneity is not uncovered in the sample
    (i.e., no significant differences across segments are detected;
    segments are not differentiable), the researcher can continue
    with the standard analysis on the overall sample, (in)validate
    theoretical claims, and note that the validity of the findings is
    not threatened by unobserved heterogeneity.
    If the research objective is theory testing and elaboration (i.e.,
    expanders; Colquitt and Zapata-Phelan 2007), uncovering
    unobserved heterogeneity not only serves as a validity check
    but can also guide researchers to identify variables explaining
    the uncovered segments and to integrate these variables to
    expand the model/theory. Hence, researchers should turn
    unobserved heterogeneity into observed heterogeneity by
    (1) advancing theoretical reasons to explain the differences
    between segments, (2) identifying constructs beyond the
    original model that explain these differences, thereby making
    the segments accessible, and (3) expanding the model/theory
    by integrating the constructs that make the segments
    accessible. Accordingly, the accessibility stage in the UHD
    process will be facilitated when researchers anticipate this task
    during the research design, identify complementary
    theoretical perspectives and corresponding constructs, and collect
    additional data for these constructs that can be instrumental in
    making the segments accessible. Of course, these
    considerations require extra effort and data-collection costs and should
    be accommodated in a study when the researcher expects
    unobserved heterogeneity (e.g., based on inconsistent results
    in past studies, meta-analysis, the nature of phenomena, etc.).
    We note that the discovery of unobserved heterogeneity for
    theoretical tests and elaboration is relevant even when
    existing theory offers a priori knowledge about observed
    heterogeneity (e.g., age, gender, or income). There can be
    additional explainable and generalizable heterogeneity beyond the
    known heterogeneity (e.g., experienced versus inexperienced
    users) that threatens the theoretical validity of the test and,
    when discovered, can be used to elaborate theory/models.
    As an illustration, assume that the research objective is to test
    the baseline technology acceptance model presented in the
    introduction. Based on the analysis of the overall sample, the
    researcher risks the overgeneralization that the effects of PU
    and PEOU are always important for IU. To avert this risk, the
    researcher applies the UHD process and discovers two
    substantial and differentiable segments. One segment shows
    a strong positive relationship between PU and IU and a weak
    or nonsignificant relationship between PEOU and IU. In
    contrast, the other segment shows a strong positive
    relationship between PEOU and IU and a weak or nonsignificant
    relationship between PU and IU (Figure 1a). The researcher
    concludes that these two identified segments (i.e., users
    emphasizing PU or PEOU) are theoretically plausible (i.e.,
    within TAM, it is reasonable that there are different users who
    emphasize different system characteristics) and conceptually
    important for the theory. In contrast to the results derived
    from the overall sample, only one of the posited TAM
    constructs influences IU in each segment. As such, the researcher
    (1) does not overgeneralize the theory by assuming that it will
    always be applicable, (2) acknowledges there are user
    segments that determine which construct is influential for IU,
    and (3) specifies the need to make the segments accessible,
    thereby expanding the TAM model.
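
    The masking effect in this illustration can be reproduced with a
    small simulation. The sketch below assumes two equally sized
    segments in which either PU or PEOU alone drives IU, and uses
    ordinary least squares on observed scores in place of PLS path
    modeling. The coefficients (0.8 and 0.0), sample sizes, and noise
    level are hypothetical and are not the values behind Figure 1a.

```python
import random


def ols_two_predictors(rows):
    """Least-squares coefficients (b1, b2) for y on x1 and x2.

    rows holds (x1, x2, y) tuples; data are mean-centered, so the
    intercept is absorbed and the 2x2 normal equations are solved directly.
    """
    n = len(rows)
    m1 = sum(r[0] for r in rows) / n
    m2 = sum(r[1] for r in rows) / n
    my = sum(r[2] for r in rows) / n
    s11 = sum((r[0] - m1) ** 2 for r in rows)
    s22 = sum((r[1] - m2) ** 2 for r in rows)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows)
    s1y = sum((r[0] - m1) * (r[2] - my) for r in rows)
    s2y = sum((r[1] - m2) * (r[2] - my) for r in rows)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate_users(n, beta_pu, beta_peou, rng):
    """Simulate (PU, PEOU, IU) scores for one hypothetical user segment."""
    rows = []
    for _ in range(n):
        pu = rng.gauss(0.0, 1.0)
        peou = rng.gauss(0.0, 1.0)
        iu = beta_pu * pu + beta_peou * peou + rng.gauss(0.0, 0.5)
        rows.append((pu, peou, iu))
    return rows


rng = random.Random(7)
segment_pu = simulate_users(500, 0.8, 0.0, rng)    # users emphasizing PU
segment_peou = simulate_users(500, 0.0, 0.8, rng)  # users emphasizing PEOU

pooled = ols_two_predictors(segment_pu + segment_peou)
by_segment = [ols_two_predictors(segment_pu), ols_two_predictors(segment_peou)]
print("pooled:", [round(b, 2) for b in pooled])
print("by segment:", [[round(b, 2) for b in seg] for seg in by_segment])
```

    In the pooled sample both coefficients land near the midpoint of
    the segment-specific values, which would suggest that both
    constructs matter for every user, although within each simulated
    segment only one of them does.
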
    Given the study’s objective (i.e., theory testing) and the
    limited availability of additional data (e.g., a lack of
    demographic or psychographic variables such as experience),
    researchers might end the UHD process after concluding the
    segments are plausible (i.e., that it is plausible that the
    segments change the explanatory role of the constructs)
    without explaining which users belong to which segment (i.e.,
    without making the segments accessible).
    Instead, if the research objective is theory testing and
    elaboration, researchers should continue to find complementary
    theoretical explanations to make the segments accessible (i.e.,
    to give additional theoretical meaning to the segments). A
    complementary theory could explain that users’ experience
    influences their appreciation of system characteristics (e.g.,
    PEOU and PU). Experience, therefore, could be an external
    variable/construct that, if available in the dataset, could be
    tested for explaining the segment membership. Other
    plausible theoretical considerations could suggest other
    variables/constructs that might explain segment membership and should
    be evaluated (e.g., age, income, computer anxiety, task type,
    subjective norms, etc.). If researchers are able to identify a
    variable/construct that explains the segment membership (i.e.,
    makes segments accessible), the unobserved heterogeneity is
    turned into observed heterogeneity, thereby expanding the
    theory with new constructs accounting for the group
    differences (e.g., a moderator). If researchers are unable to assess
    the ability of variables/constructs to explain segment
    membership because of lack of data in the study, they can only
    theoretically identify reasonable variables/constructs for future
    testing.
    Limitations and Future Research
    In this study, we (1) discussed why unobserved heterogeneity
    is an important issue in IS research, (2) identified threats to
    validity due to unobserved heterogeneity, (3) synthesized
    current work on unobserved heterogeneity in CB-SEM and
    PLS path modeling, (4) introduced a new segmentation
    method (PLS-POS) for PLS path modeling, (5) assessed its
    performance and that of FIMIX-PLS, and (6) provided
    guidelines for researchers on when and how to uncover
    unobserved heterogeneity. While our study makes contributions,
    it has its limitations and opens up avenues for future research.
    First, the validity and generalizability of simulation studies
    are limited by the choice of design factors and factor levels.
    We focused on eight factors based on past studies on PLS
    path modeling or segmentation. The analysis of all
    factor-level combinations of the two PLS path models entailed
    126,720 simulated segmentation runs for assessing the
    performance of PLS-POS and FIMIX-PLS. The inclusion of
    additional design factors (namely, those that are theoretically
    less important for PLS segmentation) or additional factor
    levels would have increased the complexity of the simulations
    exponentially and is beyond the scope of a single study.
    Therefore, researchers should also apply PLS-POS and
    FIMIX-PLS in a broad range of empirical studies to find
    additional evidence of the methods’ abilities to detect
    unobserved heterogeneity.
    Second, heterogeneity is a special type of endogeneity
    problem (i.e., omitted group variables). Future studies may want
    to evaluate the impact of other types of endogeneity problems
    (e.g., reciprocal relationships) on PLS path modeling results.
    As PLS path modeling cannot handle nonrecursive models,
    these issues might also threaten the consistency of parameters.
    In addition, researchers may want to assess the effect of
    unobserved heterogeneity in models that do not comply with
    the recursive nature of models imposed by PLS path models.
    If heterogeneity affects nonrecursive (reciprocal)
    relationships, it might have a strong impact on the ability of both PLS
    segmentation methods (FIMIX-PLS and PLS-POS) to
    uncover unobserved heterogeneity.
    Third, this research does not focus on the parameter settings
    of the methods or the time needed to arrive at the final
    segmentation solution. Our simulations suggest that PLS-POS is
    more time consuming than FIMIX-PLS (footnote 14). Determining
    efficient parameter settings to reduce the computational effort
    of PLS-POS represents another avenue for future research.
    14 In absolute terms, PLS-POS works within acceptable timeframes. Applying
    both methods to the ECSI mobile phone dataset from Tenenhaus et al. (2005)
    with two segments, the FIMIX-PLS algorithm needs approximately 10
    seconds, while PLS-POS requires about 3 minutes to arrive at a solution.
    (We used a Windows 7 PC with an Intel Core 2 T7300, 2 GHz, and 2 GB
    RAM.) We believe this should be acceptable to researchers in an advanced
    stage of model investigation.
    Conclusion
    We differentiated between observed and unobserved
    heterogeneity and showed why unobserved heterogeneity biases
    structural equation model estimates, leads to Type I and
    Type II errors, and is a threat to different types of validity
    (i.e., internal, instrumental, statistical conclusion, and
    external). We demonstrated that heterogeneity is present in
    empirical IS research across various IS phenomena by
    presenting evidence from 12 meta-analyses showing that
    inconsistent findings are prevalent across IS studies, with
    unobserved heterogeneity being a plausible cause for these
    inconsistencies. We explained how researchers can avoid
    threats to validity due to unobserved heterogeneity in
    structural equation modeling by using different methods that have
    been proposed in the literature to uncover unobserved
    heterogeneity. The application of these methods not only
    safeguards against biases and validity threats but also
    facilitates theory development by promoting abduction (Van
    de Ven 2007). Specifically, uncovering unobserved
    heterogeneity and explaining segments with new constructs beyond
    those in the model allows researchers to develop additional
    theoretical descriptions that make segments accessible.
    Thereby, they can expand and further develop existing theory.
    We introduced a new segmentation method for PLS path
    modeling, PLS-POS, that overcomes some of the restrictive
    assumptions associated with FIMIX-PLS and other
    distance-measure-based methods, and we evaluated the ability of the
    FIMIX-PLS and PLS-POS methods to uncover unobserved
    heterogeneity in PLS path models. Our findings show that
    both FIMIX-PLS and PLS-POS alleviate threats to validity
    from unobserved heterogeneity by providing considerably less
    biased parameter estimates than those that are based on
    invalid assumptions of homogenous data. However,
    FIMIX-PLS is restricted to uncovering unobserved heterogeneity in
    the structural model, while PLS-POS can uncover unobserved
    heterogeneity in both the measurement and structural models.
    Our results show that the parameter recovery of PLS-POS and
    FIMIX-PLS is comparable for those PLS path models in
    which all measures are reflective (with measurement
    invariance across groups) and in which heterogeneity is limited to
    the structural model. PLS-POS performs very well in uncovering
    heterogeneity across all types of PLS path models with
    different locations of heterogeneity in the model (structural
    model, measurement model, or both) and different data
    conditions (sample size, relative segment sizes,
    multicollinearity, and data distribution).
    Our findings also reveal that unobserved heterogeneity in
    formative measures and in the structural model should be
    evaluated collectively. As FIMIX-PLS does not uncover
    heterogeneity in measurement models, PLS-POS should be
    applied for discovering unobserved heterogeneity if PLS path
    models include formative measures. This finding is
    particularly important because formative measurement models are
    often used in IS research. A comprehensive analysis of the
    application of PLS path models in MIS Quarterly over the last
    20 years indicates that about 42 percent of the models use
    only reflective measures, about 32 percent of the models use
    formative measures, and about a quarter of the studies/models
    do not explicitly state which measurement model was used
    (Ringle et al. 2012). In addition, the number of studies using
    formative measures in IS research has increased over time.
    While there is an ongoing discussion on the interpretation and
    use of formative measures (Aguirre-Urreta and Marakas 2012;
    Diamantopoulos 2011; Edwards 2010; Jarvis et al. 2012;
    Petter et al. 2012), there is general consensus that the
    theoretical meaning of a construct should correspond to its
    empirical meaning and that some theoretical constructs fit
    formative specifications better than reflective specifications (Bagozzi
    2011; Diamantopoulos and Winklhofer 2001; Jarvis et al.
    2012; Petter et al. 2007). As Bagozzi (2011) notes, there are
    different ontologies underlying formative and reflective
    measures, which have different accompanying approaches for
    interpreting and assessing the construct and its relationships
    with other constructs. If researchers have chosen a formative
    ontology, the discovery of unobserved heterogeneity in
    formative indicator weights can assist them in evaluating
    plausible differences in the construct’s theoretical or empirical
    meaning between groups, thereby safeguarding against
    interpretational confounds.
    It is important to note that we do not recommend using
    segmentation methods (including FIMIX-PLS and PLS-POS)
    for post hoc, data-driven improvement of results, where
    researchers engage in fishing expeditions with the objective
    of improving the significance of an association or the
    predictive power of the model, as described earlier in the section on
    the UHD process. Instead, consistent with Jedidi et al. (1997)
    and Van de Ven (2007), we take the position that theory
    development in the social and behavioral sciences does not
    need to be confined to deductive reasoning. Moreover, in
    situations in which the researcher discovers anomalies that
    must be resolved through theoretical elaboration, theory
    development is significantly enhanced by abduction.
    Segmentation provides a mechanism to facilitate abduction by
    surfacing anomalies, which must then be confronted and
    resolved theoretically. Using the presented methods in PLS
    path modeling and CB-SEM within the UHD process is a
    possible way to achieve this goal.
    Acknowledgments
    We thank the senior editor, Ron Thomson, the associate editor, and
    the reviewers for their constructive comments and valuable
    suggestions. We also appreciate the comments from Ed Rigdon and
    Detmar Straub from Georgia State University on our motivation and
    initial ideas for this study.
    References
    AguirreUrreta M I and Marakas G M 2012 Revisiting Bias
    Due to Construct Misspecification Different Results from
    Considering Coefficients in Standardized Form MIS Quarterly
    (361) pp 123138
    Allenby G M and Rossi P E 1998 Marketing Models of
    Consumer Heterogeneity Journal of Econometrics (8912) pp
    5778
    Anderberg M R 1973 Cluster Analysis for Applications New
    York Academic Press
    Andrews R L Brusco M J and Currim I S 2010 Amal
    gamation of Partitions from Multiple Segmentation Bases A
    Comparison of NonModelBased and ModelBased Methods
    European Journal of Operational Research (2012) pp 608618
    Andrews R L and Currim I S 2003a A Comparison of
    Segment Retention Criteria for Finite Mixture Logit Models
    Journal of Marketing Research (4020) pp 235243
    Andrews R L and Currim I S 2003b Retention of Latent
    Segments in RegressionBased Marketing Models International
    Journal of Research in Marketing (204) pp 315321
    Ansari A Jedidi K and Jagpal S 2000 A Hierarchical Bayes
    ian Methodology for Treating Heterogeneity in Structural
    Equation Models Marketing Science (194) pp 328347
    Arminger G Stein P and Wittenberg J 1999 Mixtures of
    Conditional Mean and CovarianceStructure Models Psycho
    metrika (644) pp 475494
    Bagozzi R P 2011 Measurement and Meaning in Information
    Systems and Organizational Research Methodological and
    Philosophical Foundations MIS Quarterly (352) pp 261292
    Bapna R Goes P Kwok Kee W and Zhongju Z 2011 A
    Finite Mixture Logit Model to Segment and Predict Electronic
    Payments System Adoption Information Systems Research
    (221) pp 118133
    Bart Y Shankar V Sultan F and Urban G L 2005 Are the
    Drivers and Role of Online Trust the Same for All Web Sites and
    Consumers A LargeScale Exploratory Empirical Study The
    Journal of Marketing (694) pp 133152
    Cai JH and Song XY 2010 Bayesian Analysis of Mixtures
    in Structural Equation Models with NonIgnorable Missing
    Data British Journal of Mathematical and Statistical
    Psychology (633) pp 491508
    Cenfetelli R T and Bassellier G 2009 Interpretation of Forma
    tive Measurement in Information Systems Research MIS
    Quarterly (334) pp 689707
    Chin W W 1998 The Partial Least Squares Approach to Struc
    tural Equation Modeling in Modern Methods for Business
    Research G A Marcoulides (ed) Mahwah NJ Erlbaum pp
    295358
    Chin W W and Dibbern J 2010 A Permutation Based Pro
    cedure for MultiGroup PLS Analysis Results of Tests of Dif
    ferences on Simulated Data and a Cross Cultural Analysis of the
    Sourcing of Information System Services between Germany and
    the USA in Handbook of Partial Least Squares Concepts
    Methods and Applications V Esposito Vinzi W W Chin J
    Henseler and H Wang (eds) Berlin Springer pp 171193
    Chin W W Marcolin B L and Newsted P R 2003 A Partial
    Least Squares Latent Variable Modeling Approach for Measuring
    Interaction Effects Results from a Monte Carlo Simulation
    Study and an ElectronicMail EmotionAdoption Study Infor
    mation Systems Research (142) pp 189217
    Collier J E and Bienstock C C 2009 Model Misspecification
    Contrasting Formative and Reflective Indicators for a Model of
    EService Quality Journal of Marketing Theory & Practice
    (173) pp 283293
    Colquitt J A and ZapataPhelan C P 2007 Trends in Theory
    Building and Theory Testing A FiveDecade Study of the
    Academy of Management Journal Academy of Management
    Journal (506) pp 12811303
    Cook T D and Campbell D T 1976 The Design and Conduct
    of QuasiExperiments and True Experiments in Field Settings
    in Handbook of Industrial and Organizational Psychology M D
    Dunnette (ed) Chicago Rand McNally pp 223326
    Cook T D and Campbell D T 1979 QuasiExperimentation
    Design and Analysis Issues for Field Settings Chicago Rand
    McNally
    Davis F D Bagozzi R P and Warshaw P R 1989 User
    Acceptance of Computer Technology A Comparison of Two
    Theoretical Models Management Science (358) pp 9821003
    DeSarbo W S and Cron W L 1988 A Maximum Likelihood
    Methodology for Clusterwise Linear Regression Journal of
    Classification (52) pp 249282
    DeSarbo W S Di Benedetto C A Jedidi K and Song M
    2006 Identifying Sources of Heterogeneity for Empirically De
    riving Strategic Types A Constrained FiniteMixture Structural
    Equation Methodology Management Science (526) pp
    909924
    Desarbo W S Ramaswamy V and Cohen S H 1995 Market
    Segmentation with ChoiceBased Conjoint Analysis Marketing
    Letters (62) pp 137147
    Diamantopoulos A 2011 Incorporating Formative Measures into
    CovarianceBased Structural Equation Models MIS Quarterly
    (352) pp 335358
    Diamantopoulos A and Papadopoulos N 2010 Assessing the
    CrossNational Invariance of Formative Measures Guidelines
    for International Business Researchers Journal of International
    Business Studies (412) pp 360370
    Diamantopoulos A Riefler P and Roth K P 2008 Advancing
    Formative Measurement Models Journal of Business Research
    (6112) pp 12031218
    Diamantopoulos A and Winklhofer H M 2001 Index Con
    struction with Formative Indicators An Alternative to Scale
    Development Journal of Marketing Research (382) pp
    269277
    Dolan C and van der Maas H 1998 Fitting Multivariage
    Normal Finite Mixtures Subject to Structural Equation
    Modeling Psychometrika (633) pp 227253
    Edmondson A C and McManus S E 2007 Methodological Fit
    in Management Field Research Academy of Management
    Review (324) pp 11551179
    Edwards J R 2010 The Fallacy of Formative Measurement
    Organizational Research Methods (142) pp 370388
    Edwards, J. R., and Lambert, L. S. 2007. “Methods for Integrating
    Moderation and Mediation: A General Analytical Framework
    Using Moderated Path Analysis,” Psychological Methods (12:1),
    pp. 1-22.
    Esposito Vinzi, V., Trinchera, L., and Amato, S. 2010. “PLS Path
    Modeling: From Foundations to Recent Developments and Open
    Issues for Model Assessment and Improvement,” in Handbook of
    Partial Least Squares: Concepts, Methods and Applications,
    V. Esposito Vinzi, W. W. Chin, J. Henseler, and H. Wang (eds.),
    Berlin: Springer, pp. 47-82.
    Esposito Vinzi, V., Trinchera, L., Squillacciotti, S., and Tenenhaus,
    M. 2008. “REBUS-PLS: A Response-Based Procedure for
    Detecting Unit Segments in PLS Path Modelling,” Applied
    Stochastic Models in Business & Industry (24:5), pp. 439-458.
    Fraley, C., and Raftery, A. 2002. “Model-Based Clustering,
    Discriminant Analysis, and Density Estimation,” Journal of the
    American Statistical Association (97:458), pp. 611-631.
    Gilbride, T. J., Allenby, G. M., and Brazell, J. D. 2006. “Models
    for Heterogeneous Variable Selection,” Journal of Marketing
    Research (43:3), pp. 420-430.
    Goodhue, D., Lewis, W., and Thompson, R. 2007. “Statistical
    Power in Analyzing Interaction Effects: Questioning the
    Advantage of PLS with Product Indicators,” Information Systems
    Research (18:2), pp. 211-227.
    Haenlein, M., and Kaplan, A. M. 2011. “The Influence of
    Observed Heterogeneity on Path Coefficient Significance:
    Technology Acceptance Within the Marketing Discipline,” The
    Journal of Marketing Theory and Practice (19:2), pp. 153-168.
    Hahn, C., Johnson, M. D., Herrmann, A., and Huber, F. 2002.
    “Capturing Customer Heterogeneity Using a Finite Mixture PLS
    Approach,” Schmalenbach Business Review (SBR) (54:3),
    pp. 243-269.
    Heeler, R. M., and Ray, M. L. 1972. “Measure Validation in
    Marketing,” Journal of Marketing Research (9:4), pp. 361-370.
    Henseler, J., and Chin, W. W. 2010. “A Comparison of
    Approaches for the Analysis of Interaction Effects Between Latent
    Variables Using Partial Least Squares Path Modeling,” Structural
    Equation Modeling: A Multidisciplinary Journal (17:1),
    pp. 82-109.
    Henson, J. M., Reise, S. P., and Kim, K. H. 2007. “Detecting
    Mixtures from Structural Model Differences Using Latent
    Variable Mixture Modeling: A Comparison of Relative Model
    Fit Statistics,” Structural Equation Modeling (14:2), pp. 202-226.
    Hsieh, J. J. P.-A., Rai, A., and Keil, M. 2008. “Understanding
    Digital Inequality: Comparing Continued Use Behavioral
    Models of the Socio-Economically Advantaged and
    Disadvantaged,” MIS Quarterly (32:1), pp. 97-126.
    Jaccard, J., and Wan, C. K. 1995. “Measurement Error in the
    Analysis of Interaction Effects Between Continuous Predictors
    Using Multiple Regression: Multiple Indicator and Structural
    Equation Approaches,” Psychological Bulletin (117:2),
    pp. 348-357.
    Jarvis, C. B., MacKenzie, S. B., and Podsakoff, P. M. 2003. “A
    Critical Review of Construct Indicators and Measurement Model
    Misspecification in Marketing and Consumer Research,” Journal
    of Consumer Research (30:2), pp. 199-218.
    Jarvis, C. B., MacKenzie, S. B., and Podsakoff, P. M. 2012. “The
    Negative Consequences of Measurement Model Misspecification:
    A Response to Aguirre-Urreta and Marakas,” MIS Quarterly
    (36:1), pp. 139-146.
    Jedidi, K., Jagpal, H. S., and DeSarbo, W. S. 1997. “Finite-Mixture
    Structural Equation Models for Response-Based Segmentation
    and Unobserved Heterogeneity,” Marketing Science (16:1),
    pp. 39-59.
    Johns, G. 2006. “The Essential Impact of Context on
    Organizational Behavior,” The Academy of Management Review (31:2),
    pp. 386-408.
    Jöreskog, K. G. 1971. “Simultaneous Factor Analysis in Several
    Populations,” Psychometrika (36:4), pp. 409-426.
    Jöreskog, K. G. 1978. “Structural Analysis of Covariance and
    Correlation Matrices,” Psychometrika (43:4), pp. 443-477.
    Jöreskog, K. G. 1982. “The LISREL Approach to Causal Model
    Building in the Social Sciences,” in Systems Under Indirect
    Observation, Part I, H. Wold and K. G. Jöreskog (eds.),
    Amsterdam: North-Holland, pp. 81-100.
    Jöreskog, K. G., and Yang, F. 1996. “Nonlinear Structural
    Equation Models: The Kenny-Judd Model with Interaction Effects,”
    in Advanced Structural Equation Modeling: Issues and
    Techniques, G. A. Marcoulides and R. E. Schumacker (eds.),
    Mahwah, NJ: Lawrence Erlbaum Associates, pp. 57-87.
    King, W. R., and He, J. 2006. “A Meta-Analysis of the Technology
    Acceptance Model,” Information & Management (43:6),
    pp. 740-755.
    Klein, A., and Moosbrugger, H. 2000. “Maximum Likelihood
    Estimation of Latent Interaction Effects with the LMS Method,”
    Psychometrika (65:4), pp. 457-474.
    Lee, S.-Y., and Song, X.-Y. 2003. “Bayesian Analysis of Structural
    Equation Models with Dichotomous Variables,” Statistics in
    Medicine (22:19), pp. 3073-3088.
    Lenk, P. J., DeSarbo, W. S., Green, P. E., and Young, M. R. 1996.
    “Hierarchical Bayes Conjoint Analysis: Recovery of Partworth
    Heterogeneity from Reduced Experimental Design,” Marketing
    Science (15:2), pp. 173-191.
    Lohmöller, J.-B. 1989. Latent Variable Path Modeling with Partial
    Least Squares, Heidelberg: Physica.
    Lubke, G. H., and Muthén, B. 2005. “Investigating Population
    Heterogeneity With Factor Mixture Models,” Psychological
    Methods (10:1), pp. 21-39.
    Luo, L., Kannan, P. K., and Ratchford, B. T. 2008. “Incorporating
    Subjective Characteristics in Product Design and Evaluations,”
    Journal of Marketing Research (45:2), pp. 182-194.
    Mason, C. H., and Perreault, W. D. 1991. “Collinearity, Power, and
    Interpretation of Multiple Regression Analysis,” Journal of
    Marketing Research (28:3), pp. 268-280.
    McLachlan, G. J., and Peel, D. 2000. Finite Mixture Models, New
    York: Wiley.
    Money, K. G., Hillenbrand, C., Henseler, J., and Da Camara, N.
    2012. “Exploring Unanticipated Consequences of Strategy
    Amongst Stakeholder Segments: The Case of a European
    Revenue Service,” Long Range Planning (45:5-6), pp. 395-423.
    Muthén, B. O. 1989. “Latent Variable Modeling in Heterogeneous
    Populations,” Psychometrika (54:4), pp. 557-585.
    Muthén, B. O. 1994. “Multilevel Covariance Structure Analysis,”
    Sociological Methods & Research (22:3), pp. 376-398.
    Palumbo, F., Romano, R., and Esposito Vinzi, V. 2008. “Fuzzy
    PLS Path Modeling: A New Tool For Handling Sensory Data,”
    in Data Analysis, Machine Learning and Applications:
    Proceedings of the 31st Annual Conference of the Gesellschaft für
    Klassifikation, C. Preisach, H. Burkhardt, L. Schmidt-Thieme,
    and R. Decker (eds.), Berlin: Springer, pp. 689-696.
    Papies, D., and Clement, M. 2008. “Adoption of New Movie
    Distribution Services on the Internet,” Journal of Media
    Economics (21:3), pp. 131-157.
    Parasuraman, A., Zeithaml, V. A., and Berry, L. L. 1988.
    “SERVQUAL: A Multiple-Item Scale for Measuring Consumer
    Perceptions of Service Quality,” Journal of Retailing (64:1),
    pp. 12-40.
    Petter, S., Rai, A., and Straub, D. 2012. “The Critical Importance
    of Construct Measurement Specification: A Response to
    Aguirre-Urreta and Marakas,” MIS Quarterly (36:1),
    pp. 147-156.
    Petter, S., Straub, D., and Rai, A. 2007. “Specifying Formative
    Constructs in Information Systems Research,” MIS Quarterly
    (31:4), pp. 623-656.
    Popkowski Leszczyc, P. T., and Bass, F. M. 1998. “Determining
    the Effects of Observed and Unobserved Heterogeneity on
    Consumer Brand Choice,” Applied Stochastic Models and Data
    Analysis (14:2), pp. 95-115.
    Qureshi, I., and Compeau, D. 2009. “Assessing Between-Group
    Differences in Information Systems Research: A Comparison of
    Covariance- and Component-Based SEM,” MIS Quarterly (33:1),
    pp. 197-214.
    Rabe-Hesketh, S., Skrondal, A., and Pickles, A. 2004.
    “Generalized Multilevel Structural Equation Modeling,” Psychometrika
    (69:2), pp. 167-190.
    Rai, A., Patnayakuni, R., and Seth, N. 2006. “Firm Performance
    Impacts of Digitally Enabled Supply Chain Integration
    Capabilities,” MIS Quarterly (30:2), pp. 225-246.
    R Core Team. 2013. R: A Language and Environment for
    Statistical Computing, R Foundation for Statistical Computing,
    Vienna.
    Reinartz, W. J., Echambadi, R., and Chin, W. W. 2002.
    “Generating Non-Normal Data for Simulation of Structural Equation
    Models Using Mattson’s Method,” Multivariate Behavioral
    Research (37:2), pp. 227-244.
    Reinartz, W. J., Haenlein, M., and Henseler, J. 2009. “An
    Empirical Comparison of the Efficacy of Covariance-Based and
    Variance-Based SEM,” International Journal of Research in
    Marketing (26:4), pp. 332-344.
    Reinecke, J. 2006. “Special Issue: Mixture Structural Equation
    Modeling,” Methodology: European Journal of Research
    Methods for the Behavioral and Social Sciences (2:3), pp. 83-85.
    Rigdon, E. E., Ringle, C. M., and Sarstedt, M. 2010. “Structural
    Modeling of Heterogeneous Data with Partial Least Squares,” in
    Review of Marketing Research, N. K. Malhotra (ed.), Armonk,
    NY: M. E. Sharpe, pp. 255-296.
    Ringle, C. M., Sarstedt, M., and Mooi, E. A. 2010a.
    “Response-Based Segmentation Using Finite Mixture Partial Least Squares:
    Theoretical Foundations and an Application to American
    Customer Satisfaction Index Data,” Annals of Information
    Systems (8), pp. 19-49.
    Ringle, C. M., Sarstedt, M., and Schlittgen, R. 2010b. “Finite
    Mixture and Genetic Algorithm Segmentation in Partial Least
    Squares Path Modeling: Identification of Multiple Segments in
    a Complex Path Model,” in Advances in Data Analysis, Data
    Handling and Business Intelligence, A. Fink, B. Lausen, W.
    Seidel, and A. Ultsch (eds.), Berlin: Springer, pp. 167-176.
    Ringle, C. M., Sarstedt, M., Schlittgen, R., and Taylor, C. R. 2013.
    “PLS Path Modeling and Evolutionary Segmentation,” Journal
    of Business Research, forthcoming.
    Ringle, C. M., Sarstedt, M., and Straub, D. 2012. “A Critical Look
    at the Use of PLS-SEM in MIS Quarterly,” MIS Quarterly (36:1),
    pp. iii-viii.
    Ringle, C. M., Wende, S., and Will, A. 2005. SmartPLS 2.0,
    www.smartpls.de.
    Rust, R. T., and Verhoef, P. C. 2005. “Optimizing the Marketing
    Interventions Mix in Intermediate-Term CRM,” Marketing
    Science (24:3), pp. 477-489.
    Sánchez, G. 2009. PATHMOX Approach: Segmentation Trees
    in Partial Least Squares Path Modeling, unpublished doctoral
    dissertation, Universitat Politècnica de Catalunya.
    Sánchez, G., and Aluja, T. 2006. “PATHMOX: A PLS-PM
    Segmentation Algorithm,” in Proceedings of the IASC
    Symposium on Knowledge Extraction by Modelling, International
    Association for Statistical Computing, Island of Capri, Italy.
    Sánchez, G., and Aluja, T. 2012. R Package pathmox:
    Segmentation Trees in Partial Least Squares Path Modeling (Version
    0.1-1), http://cran.r-project.org/web/packages/pathmox.
    Sánchez, G., and Trinchera, L. 2013. R Package PLSPM (version
    0.3.5), http://cran.r-project.org/web/packages/plspm.
    Sarstedt, M. 2008. “A Review of Recent Approaches for Capturing
    Heterogeneity in Partial Least Squares Path Modelling,” Journal
    of Modelling in Management (3:2), pp. 140-161.
    Sarstedt, M., Becker, J.-M., Ringle, C. M., and Schwaiger, M.
    2011a. “Uncovering and Treating Unobserved Heterogeneity
    with FIMIX-PLS: Which Model Selection Criterion Provides an
    Appropriate Number of Segments?,” Schmalenbach Business
    Review (63:1), pp. 34-62.
    Sarstedt, M., Henseler, J., and Ringle, C. M. 2011b. “Multi-Group
    Analysis in Partial Least Squares (PLS) Path Modeling:
    Alternative Methods and Empirical Results,” in Advances in
    International Marketing, Volume 22, M. Sarstedt, M. Schwaiger, and
    C. R. Taylor (eds.), Bingley, UK: Emerald Group Publishing
    Limited, pp. 195-218.
    Sarstedt, M., and Ringle, C. M. 2010. “Treating Unobserved
    Heterogeneity in PLS Path Modelling: A Comparison of
    FIMIX-PLS with Different Data Analysis Strategies,” Journal of
    Applied Statistics (37:8), pp. 1299-1318.
    Sarstedt, M., Schwaiger, M., and Ringle, C. M. 2009. “Do We
    Fully Understand the Critical Success Factors of Customer
    Satisfaction with Industrial Goods? Extending Festge and
    Schwaiger’s Model to Account for Unobserved Heterogeneity,”
    Journal of Business Market Management (3:3), pp. 185-206.
    Sörbom, D. 1974. “A General Method for Studying Differences in
    Factor Means and Factor Structure between Groups,” British
    Journal of Mathematical and Statistical Psychology (27:2),
    pp. 229-239.
    Späth, H. 1979. “Algorithm 39: Clusterwise Linear Regression,”
    Computing (22:4), pp. 367-373.
    Squillacciotti, S. 2005. “Prediction Oriented Classification in PLS
    Path Modeling,” in PLS & Marketing: Proceedings of the 4th
    International Symposium on PLS and Related Methods, T. Aluja,
    J. Casanovas, V. Esposito Vinzi, and M. Tenenhaus (eds.), Paris:
    DECISIA, pp. 499-506.
    Squillacciotti, S. 2010. “Prediction-Oriented Classification in PLS
    Path Modeling,” in Handbook of Partial Least Squares:
    Concepts, Methods and Applications, V. Esposito Vinzi, W. W.
    Chin, J. Henseler, and H. Wang (eds.), Berlin: Springer,
    pp. 219-233.
    Srite, M., and Karahanna, E. 2006. “The Role of Espoused
    National Cultural Values in Technology Acceptance,” MIS
    Quarterly (30:3), pp. 679-704.
    Steenkamp, J.-B. E. M., and Baumgartner, H. 1998. “Assessing
    Measurement Invariance in Cross-National Consumer Research,”
    Journal of Consumer Research (25:1), pp. 78-90.
    Straub, D. W. 1989. “Validating Instruments in MIS Research,”
    MIS Quarterly (13:2), pp. 147-169.
    Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.-M., and Lauro, C.
    2005. “PLS Path Modeling,” Computational Statistics & Data
    Analysis (48:1), pp. 159-205.
    Tueller, S., and Lubke, G. 2010. “Evaluation of Structural
    Equation Mixture Models: Parameter Estimates and Correct Class
    Assignment,” Structural Equation Modeling: A Multidisciplinary
    Journal (17:2), pp. 165-192.
    Van de Ven, A. H. 2007. Engaged Scholarship: A Guide for
    Organizational and Social Research, New York: Oxford
    University Press.
    Vandenberg, R. J., and Lance, C. E. 2000. “A Review and
    Synthesis of the Measurement Invariance Literature:
    Suggestions, Practices, and Recommendations for Organizational
    Research,” Organizational Research Methods (3:1), pp. 4-70.
    Venkatesh, V. 2000. “Determinants of Perceived Ease of Use:
    Integrating Control, Intrinsic Motivation, and Emotion into the
    Technology Acceptance Model,” Information Systems Research
    (11:4), pp. 342-365.
    Venkatesh, V., and Bala, H. 2008. “Technology Acceptance Model
    3 and a Research Agenda on Interventions,” Decision Sciences
    (39:2), pp. 273-315.
    Venkatesh, V., and Davis, F. D. 2000. “A Theoretical Extension of
    the Technology Acceptance Model: Four Longitudinal Field
    Studies,” Management Science (46:2), pp. 186-204.
    Venkatesh, V., and Morris, M. G. 2000. “Why Don’t Men Ever
    Stop to Ask for Directions? Gender, Social Influence, and Their
    Role in Technology Acceptance and Usage Behavior,” MIS
    Quarterly (24:1), pp. 115-139.
    Venkatesh, V., Morris, M. G., Davis, G. B., and Davis, F. D. 2003.
    “User Acceptance of Information Technology: Toward a Unified
    View,” MIS Quarterly (27:3), pp. 425-478.
    Wang, J., and Keil, M. 2007. “A Meta-Analysis Comparing the
    Sunk Cost Effect for IT and Non-IT Projects,” Information
    Resources Management Journal (20:3), pp. 1-18.
    Wedel, M., and DeSarbo, W. S. 1994. “A Review of Latent Class
    Regression Models and their Applications,” in Advanced
    Methods for Marketing Research, R. P. Bagozzi (ed.),
    Cambridge, UK: Blackwell Business, pp. 353-388.
    Wedel, M., and Kamakura, W. 2000. Market Segmentation:
    Conceptual and Methodological Foundations (2nd ed.), New York:
    Kluwer Academic Publishers.
    Wetzels, M., Odekerken-Schröder, G., and van Oppen, C. 2009.
    “Using PLS Path Modeling for Assessing Hierarchical Construct
    Models: Guidelines and Empirical Illustration,” MIS Quarterly
    (33:1), pp. 177-195.
    Wold, H. 1982. “Soft Modeling: The Basic Design and Some
    Extensions,” in Systems Under Indirect Observations, Part I,
    K. G. Jöreskog and H. Wold (eds.), Amsterdam: North-Holland,
    pp. 1-54.
    Wu, J., and Lederer, A. 2009. “A Meta-Analysis of the Role of
    Environment-Based Voluntariness in Information Technology
    Acceptance,” MIS Quarterly (33:2), pp. 419-432.
    About the Authors
    Jan-Michael Becker is a postdoctoral researcher at the University
    of Cologne. He received his doctoral degree in Marketing from the
    University of Cologne, Germany, and his diploma in Information
    Systems from the University of Hamburg, Germany. He has been
    a visiting scholar at Georgia State University several times. His
    research interests focus on structural equation modeling, PLS path
    modeling, unobserved heterogeneity, and mixture models, as well as
    brand management and bridging marketing and IS problems.

    Arun Rai is Regents’ Professor and the Harkins Chair in the Center
    for Process Innovation and the Department of Computer
    Information Systems at the Robinson College of Business, Georgia State
    University. His research has examined how firms can leverage
    information technologies in their strategies, interfirm relationships,
    and processes, and how systems can be successfully developed and
    implemented. His articles have appeared in Management Science,
    MIS Quarterly, Information Systems Research, Journal of
    Management Information Systems, Journal of Operations Management, and
    other journals. He serves, or has served, as a senior editor at
    Information Systems Research, MIS Quarterly, and Journal of
    Strategic Information Systems and as an associate editor at
    Information Systems Research, Management Science, Journal of MIS,
    and MIS Quarterly. He was named Fellow of the Association for
    Information Systems in 2010 in recognition for outstanding
    contributions to the Information Systems discipline.

    Christian M. Ringle is Professor of Management at the Hamburg
    University of Technology (TUHH), Germany, and Visiting
    Professor at the University of Newcastle, Australia. His research
    concerns improvements of quantitative methods for business
    research applied to study management and marketing issues. His
    work has been published in outlets that include MIS Quarterly,
    International Journal of Research in Marketing, Journal of the
    Academy of Marketing Science, Journal of Service Research,
    Journal of Business Research, and Long Range Planning.

    Franziska Völckner is Professor of Marketing at the University of
    Cologne, Germany. Her research interest focuses on building and
    managing market-based assets. This interest bridges the areas of
    branding, consumer behavior, marketing metrics, and marketing
    strategy. Her work has been published in several academic journals,
    including Journal of Marketing, Journal of Marketing Research,
    International Journal of Research in Marketing, Journal of the
    Academy of Marketing Science, Journal of Service Research,
    Journal of Business Research, and Marketing Letters, among others.
    Appendix A
    Meta-Analyses of Information Systems Studies

    Table A1. Meta-Analyses of IS Studies: Inconsistent Results Across a Range of Phenomena

    IS Phenomenon: Decision Support System (DSS) Implementation Success
    Reference: Alavi and Joachimsthaler 1992
    Journal: MISQ
    Scope: 144 findings from 33 studies
    Meta-Analysis Purpose: Investigating the relationship between user-related factors and DSS implementation success
    Moderators/Contingency Variables Examined: Authors suggest that moderators could explain the large variance in effect sizes across studies
    Nature of Inconsistent Findings (emphasis added): Reviews of information systems implementation research…have revealed that, collectively, implementation studies have yielded conflicting and somewhat confusing findings.

    IS Phenomenon: Group Support Systems (GSS)
    Reference: Dennis et al. 2001
    Journal: MISQ
    Scope: 61 articles
    Meta-Analysis Purpose: Developing a new model for interpreting GSS effects on firm performance
    Moderators/Contingency Variables Examined:
    • Fit between the Task and the GSS Structures
    • Appropriation Support Received
    Nature of Inconsistent Findings (emphasis added): Many previous papers have lamented the fact that the findings of past GSS research have been inconsistent. This paper develops a new model for interpreting GSS effects on performance…

    IS Phenomenon: IT Investment Payoff
    Reference: Kohli and Devaraj 2003
    Journal: ISR
    Scope: 66 studies
    Meta-Analysis Purpose: Examining structural variables that explain why some IT payoff studies observe a positive effect and some do not
    Moderators/Contingency Variables Examined:
    • Dependent Classification
    • Sample Size
    • Data Source
    • Type of IT Impact
    • Type of IT Assets
    • Industry
    Nature of Inconsistent Findings (emphasis added): …some studies have shown mixed results in establishing a relationship between IT investment and firm performance.

    IS Phenomenon: IT Innovation Adoption
    Reference: Lee and Xia 2006
    Journal: I&M
    Scope: 54 correlations from 21 studies
    Meta-Analysis Purpose: Investigating the effects of organizational size on IT innovation adoption
    Moderators/Contingency Variables Examined:
    • Type of Innovation
    • Type of Organization
    • Stage of Adoption
    • Scope of Size
    • Industry Sector
    Nature of Inconsistent Findings (emphasis added): …empirical results on the relationship between them have been disturbingly mixed and inconsistent…explain and resolve these mixed results by…examining the effects of six moderators on the relationship.

    IS Phenomenon: IT Project Escalation
    Reference: Wang and Keil 2007
    Journal: IRMJ
    Scope: 12 articles with 20 separate experiments
    Meta-Analysis Purpose: Investigating the effect size of sunk cost on project escalation and determining whether there is a difference in effect sizes between IT and non-IT projects
    Moderators/Contingency Variables Examined:
    • IT vs. Non-IT Projects
    Nature of Inconsistent Findings (emphasis added): …because of the strong magnitude and heterogeneity of effect sizes for the sunk cost effect, we need more primary studies that investigate potential moderators of sunk cost.

    IS Phenomenon: Turnover of IT Professionals
    Reference: Joseph et al. 2007
    Journal: MISQ
    Scope: 33 studies
    Meta-Analysis Purpose: Integrating the 43 antecedents of turnover intentions of IT professionals in a unified framework using meta-analytic structural equation modeling
    Moderators/Contingency Variables Examined:
    • Age
    • Gender Ratio of Sample
    • Operationalization of Turnover Intention
    • Operationalization of Antecedents
    Nature of Inconsistent Findings (emphasis added): …our narrative review finds several inconsistent (e.g., organization tenure and role conflict) and inconclusive (e.g., age and gender) findings.

    IS Phenomenon: IS Implementation Success
    Reference: Sharma and Yetton 2003
    Journal: MISQ
    Scope: 22 studies
    Meta-Analysis Purpose: Proposing a contingent model in which task interdependence moderates the effect of management support on implementation success
    Moderators/Contingency Variables Examined:
    • Task Interdependence
    Nature of Inconsistent Findings (emphasis added): A meta-analysis of the empirical literature provides strong support for the model and begins to explain the wide variance in empirical findings. The theory developed and findings reported above help to explain the inconsistent findings in the literature.

    IS Phenomenon: IS Implementation Success
    Reference: Sabherwal et al. 2006
    Journal: Mgmt. Science
    Scope: 612 findings from 121 studies
    Meta-Analysis Purpose: Explaining the interrelationships among four constructs representing the success of a specific information system and the relationships of these IS success constructs with four user-related constructs and two constructs representing the context
    Moderators/Contingency Variables Examined: Authors suggest that possible moderators include voluntariness of IS adoption and user characteristics such as age and gender
    Nature of Inconsistent Findings (emphasis added): Despite considerable empirical research, results on the relationships among constructs related to information system (IS) success, as well as the determinants of IS success, are often inconsistent.

    IS Phenomenon: IS Implementation Success
    Reference: Sharma and Yetton 2007
    Journal: MISQ
    Scope: 27 studies
    Meta-Analysis Purpose: Proposing a contingent model in which the effect of training on IS implementation success is a function of technical complexity and task interdependence
    Moderators/Contingency Variables Examined:
    • Technical Complexity
    • Task Interdependence
    Nature of Inconsistent Findings (emphasis added): Research has investigated the main effect of training on information systems implementation success. However, empirical support for this model is inconsistent.

    IS Phenomenon: Technology Acceptance
    Reference: King and He 2006
    Journal: I&M
    Scope: 88 studies
    Meta-Analysis Purpose: Summarizing TAM research and investigating conditions under which TAM may have different effects
    Moderators/Contingency Variables Examined:
    • Type of Users
    • Type of Usage
    Nature of Inconsistent Findings (emphasis added): all TAM relationships are not borne out in all studies; there is wide variation in the predicted effects in various studies… Since there are inconsistencies in TAM results, a meta-analysis is more likely to appropriately integrate the positive and the negative.

    IS Phenomenon: Technology Acceptance
    Reference: Schepers and Wetzels 2007
    Journal: I&M
    Scope: 51 articles containing 63 studies
    Meta-Analysis Purpose: Analyzing the role of subjective norms and three inter-study moderating factors
    Moderators/Contingency Variables Examined:
    • Type of Respondents
    • Type of Technology
    • Culture
    Nature of Inconsistent Findings (emphasis added): First, the subjective norm has had a mixed and inconclusive role…Some studies found considerable impacts of it on the dependent variables. However, others did not find significant effects.

    IS Phenomenon: Technology Acceptance
    Reference: Wu and Lederer 2009
    Journal: MISQ
    Scope: 71 studies
    Meta-Analysis Purpose: Investigating the impact of environment-based voluntariness on the relationships among the four primary TAM constructs (i.e., ease of use, perceived usefulness, behavioral intention, and usage)
    Moderators/Contingency Variables Examined:
    • Environment-Based Voluntariness
    Nature of Inconsistent Findings (emphasis added): The Q statistic for each of the five correlations exceeded its cutoff, and thus the analyses confirmed heterogeneity for each (p < .001). That is, all of the correlations vary across studies more than would be produced by sampling error.
    Appendix B
    Prediction-Oriented Segmentation for PLS Path Modeling (PLS-POS)
    Overview
    As a distance-based segmentation method, the PLS prediction-oriented segmentation (PLS-POS) method builds on earlier work on
    distance-measure-based segmentation, that is, the PLS typological path modeling (PLS-TPM) approach (Squillacciotti 2005) and its
    enhancement, the response-based detection of respondent segments in PLS (REBUS-PLS) (Esposito Vinzi et al. 2008). PLS-TPM and
    REBUS-PLS share a methodological limitation: they are applicable only to PLS path models with reflective measures (Esposito Vinzi
    et al. 2008; Sarstedt 2008). To extend the distance-measure-based PLS segmentation methods and overcome this limitation, the
    PLS-POS algorithm introduces three novel features: (1) it uses an explicit PLS-specific objective criterion to form homogeneous
    groups; (2) it includes a new distance measure that is appropriate for PLS path models with both reflective and formative measures
    and is able to uncover unobserved heterogeneity in formative measures; and (3) it ensures continuous improvement of the objective
    criterion throughout the iterations of the algorithm (hill-climbing approach). Table B1 shows the key technical differences of the
    new PLS-POS method in comparison with the main distance-based methods (i.e., PLS-TPM and REBUS-PLS) and the popular
    finite-mixture method for PLS (i.e., FIMIX-PLS).
    The following sections explain PLS-POS’ distinctive features in greater detail. We begin with a description of PLS-POS’ objective
    criterion. An explanation of the distance measure employed and its extension for use with formative measurement models follows.
    Finally, we provide details on the algorithm, with its specific steps and procedures, and on how it ensures the continuous
    improvement of the objective criterion.
    Table B1. Comparison of the Technical Differences of FIMIX-PLS, PLS-TPM, REBUS-PLS, and PLS-POS
    Methods compared: FIMIX-PLS (Hahn et al. 2002) is a finite-mixture segmentation approach. PLS-TPM (Squillacciotti 2005;
    Squillacciotti 2010), REBUS-PLS (Esposito Vinzi et al. 2010; Esposito Vinzi et al. 2008), and PLS-POS are distance-based
    clustering approaches.

    Algorithm Feature: Distributional assumptions
    • FIMIX-PLS: Yes
    • PLS-TPM: No
    • REBUS-PLS: No
    • PLS-POS: No

    Algorithm Feature: Preclustering
    • FIMIX-PLS: No preclustering; random split of observations
    • PLS-TPM: Hierarchical classification based on redundancy residuals of the overall model
    • REBUS-PLS: Hierarchical classification based on communality and structural residuals of the overall model
    • PLS-POS: No preclustering; random split of observations and assignment to closest segment according to the distance measure

    Algorithm Feature: Distance measure
    • FIMIX-PLS: Has no distance measure†
    • PLS-TPM: Based on redundancy residuals of a single reflective endogenous latent variable
    • REBUS-PLS: Based on communality residuals of all latent variables and structural residuals of all endogenous latent variables
    • PLS-POS: Based on structural residuals of all endogenous latent variables, with an extension that also accounts for
      heterogeneity in formative measures

    Algorithm Feature: Accounts for sources of heterogeneity in reflective measures
    • FIMIX-PLS: No
    • PLS-TPM: No
    • REBUS-PLS: Yes
    • PLS-POS: No

    Algorithm Feature: Accounts for sources of heterogeneity in formative measures
    • FIMIX-PLS: No
    • PLS-TPM: No‡
    • REBUS-PLS: No‡
    • PLS-POS: Yes

    Algorithm Feature: Accounts for sources of heterogeneity in the structural model
    • FIMIX-PLS: Yes
    • PLS-TPM: Yes
    • REBUS-PLS: Yes
    • PLS-POS: Yes

    Algorithm Feature: Assignment of observations to segments in each iteration
    • FIMIX-PLS: Proportional assignment of all observations to all segments based on the conditional multivariate normal
      densities to optimize the likelihood function
    • PLS-TPM: Assigns all observations to the closest segment
    • REBUS-PLS: Assigns all observations to the closest segment
    • PLS-POS: Assigns only one observation to the closest segment and assures improvement of an objective criterion (R² of all
      endogenous latent variables) before accepting the change

    Algorithm Feature: Stop criterion
    • FIMIX-PLS: Extremely small improvement in log-likelihood below critical value (or maximum number of iterations)
    • PLS-TPM: Stability of the classes’ composition (no reassignment of observations) or maximum number of iterations
    • REBUS-PLS: Stability of the classes’ composition (number of reassignments below a critical percentage value of observations)
      or maximum number of iterations
    • PLS-POS: Infinitesimal improvement in objective criterion (or maximum number of iterations)

    †FIMIX-PLS assumes that each endogenous latent variable is distributed as a finite mixture of conditional multivariate normal
    densities. It uses these densities to estimate probabilities of segment memberships for each observation (proportional assignment)
    to optimize the likelihood function (which implicitly maximizes the segment-specific explained variance as part of the likelihood
    function).
    ‡“As in PLS-TPM … [REBUS-PLS] ‘distance’ has so far only been implemented on models with reflective blocks. Although this is not
    to be considered a strict limitation for many applications, it must be pointed out that REBUS-PLS requires all blocks to be
    reflective” (Esposito Vinzi et al. 2008, p. 444). This requirement for models with only reflective measures also holds for the
    REBUS-PLS implementation in the PLSPM package (Sánchez and Trinchera 2013) for the statistical software R (R Core Team 2013).

    Objective Criterion of PLS-POS
    The main segmentation objective in PLS is to form homogeneous groups of observations that show increased endogenous variables’
    explained variance (R²) and thus provide an improved prediction (compared to the overall sample), which is in accordance with
    Anderberg’s (1973, p. 195) notion of clustering for maximum prediction. Consequently, possible PLS-specific, and thus
    prediction-oriented, objective criteria include the following: (1) the sum of the manifest variables’ redundancy residuals in the
    reflective measures, (2) the sum of endogenous latent variables’ R² values in the structural model, and (3) the goodness-of-fit
    criterion (GoF; Tenenhaus et al. 2005)¹ for assessing both the structural model and the reflective measures.
    Including the residual terms of the manifest variables would only be appropriate to assess the explained variance, and thus the
    predictive performance, in reflective measures. Because PLS path modeling allows for the use of reflective and formative measures,
    objective criteria that draw on the manifest variables’ residual terms do not support the general applicability of PLS-POS in both
    measurement models (i.e., reflective and formative). Consequently, the redundancy and communality residuals in the reflective
    measures, which are also included in the PLS-GoF measure, are not a useful criterion for the purpose of the PLS segmentation method.
    An appropriate PLS-specific objective criterion maximizes the sum of the endogenous latent variables’ R² values. In accordance with
    the PLS algorithm’s objective (Lohmöller 1989; Wold 1982), PLS-POS focuses on maximizing the predictivity of each group by
    minimizing the sum of the endogenous latent variables’ squared residuals in the PLS path model. Thus, the sum of each group’s sum
    of R² values represents the objective criterion, which is explicitly defined and calculated in the PLS-POS algorithm. Every
    reassignment of observations in PLS-POS ensures improvement of the objective criterion (hill-climbing approach; see the description
    of the algorithm below). This objective criterion is suitable for any PLS path model, regardless of whether the model includes
    reflective or formative measures.
    Distance Measure
    To reassign observations, PLS-POS builds on the idea of Squillacciotti (2005) and Esposito Vinzi et al. (2008) to use a distance measure. We
    propose a new distance measure that is applicable to both reflective and formative measures and accounts for heterogeneity in the structural
    and the formative measurement model. This observation-to-group distance measure identifies appropriate observations to form homogeneous
    groups and thereby depicts suitable candidates to improve the objective criterion. Within a group, each observation’s capability to predict the
    endogenous latent variables in the PLS path model determines its distance to that group: the shorter the distance of observation i to group g,
    the higher the predictivity of observation i in group g.
    It is important to understand the conceptual difference between observation i’s membership in its current group k ($k = g$; $k, g \in G$) and its
    distance to an alternative group g ($k \neq g$; $k, g \in G$). For every endogenous latent variable b ($b \in B$), the latent variable scores
    $Y^{exogenous}_{a_b ik}$ of its direct predecessors and the corresponding structural model path coefficients $p_{a_b g}$ allow for the group-specific
    prediction of the endogenous latent variable scores $\hat{Y}_{big}$ via linear combinations. To calculate $\hat{Y}_{big}$
    ($\hat{Y}_{big} = \sum_{a_b=1}^{A_b} Y^{exogenous}_{a_b ik} \times p_{a_b g}$), we use the latent variable scores of
    an observation’s current group k and draw on the alternative group g’s PLS path coefficients $p_{a_b g}$. The difference between the predicted value
    $\hat{Y}_{big}$ and the current group’s latent variable scores $Y_{bik}$ from the PLS path model estimation is the residual of observation i in group g
    for the endogenous latent variable b (Equation 1).

    $$e_{big}^2 = \left(\hat{Y}_{big} - Y_{bik}\right)^2 = \left(\sum_{a_b=1}^{A_b} Y^{exogenous}_{a_b ik} \times p_{a_b g} - Y^{endogenous}_{bik}\right)^2 \qquad (1)$$

    The result of $e_{big}^2$ is an observation’s predictivity in its current group when $k = g$ ($k, g \in G$). Furthermore, using the path coefficients $p_{a_b g}$
    of alternative group-specific PLS estimations for $k \neq g$ ($k, g \in G$) provides a heuristic outcome for observation i’s predictivity in each of the
    $G - 1$ other possible group assignments. This establishes the new prediction-oriented PLS-POS distance measure as presented by Equation (2).

    $$D_{kig} = \sum_{b=1}^{B} \sqrt{\frac{e_{big}^2}{\sum_{i=1}^{I_k} e_{big}^2}} \qquad (2)$$

    The residuals of each observation i are divided by the sum of the residuals of all observations in i’s current group k ($I_k$ = sample size in group
    k). This ratio’s square root is the distance of an observation i to group g for an endogenous latent variable b ($b \in B$). The sum over all
    endogenous variables B in the PLS path model provides the total distance measure $D_{kig}$. The smaller the sum of the endogenous latent variables’
    squared residual values, the higher the predictivity of observation i in group g of the underlying PLS path model.
    ¹ Despite its name, PLS-GoF does not represent a measure of fit for PLS path modeling; see Henseler and Sarstedt (2012) for a discussion.
    The distinction between formative and reflective measures requires particular attention in PLS path modeling (e.g.,
    Diamantopoulos et al. 2001; Gudergan et al. 2008; Jarvis et al. 2003). Formative measures require (1) taking into account the indicators’
    heterogeneity for each measurement model within each group and/or (2) uncovering the significant differences in weights between the groups.
    Therefore, calculating the group-specific residual term in models with formative measures requires an extension of the group-specific residual
    $e_{big}^2$ in the distance measure. The latent variable scores $Y^{exogenous}_{a_b ik}$ are replaced by linear combinations of the manifest variable
    scores $x_{a_b jik}$ and the corresponding measurement model’s formative weights $\pi_{a_b jg}$. Equation (3) shows the calculation of the residual
    term for formative measures in the PLS path model.

    $$e_{big}^2 = \left(\sum_{a_b=1}^{A_b} \left(\sum_{j=1}^{J} x_{a_b jik} \times \pi_{a_b jg}\right) \times p_{a_b g} - Y^{endogenous}_{bik}\right)^2 \qquad (3)$$

    The formative latent variable scores become a group-wise re-estimated prediction of the associated manifest variables j when the squared residual
    is determined.
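    As a minimal illustration of the distance measure for reflective models (Equations 1 and 2), the following Python sketch computes the squared
    residuals and the resulting observation-to-group distances. It assumes that the latent variable scores and the group-specific path coefficients
    have already been obtained from PLS estimations; the function names and the nested-list data layout are hypothetical choices made for this
    example and are not part of PLS-POS itself.

```python
from math import sqrt


def squared_residuals(pred_scores, endo_scores, path_coefs):
    """Squared residuals e2[b][i] of Equation (1).

    pred_scores[b][i]: predecessor latent scores of observation i (current group k)
    endo_scores[b][i]: endogenous latent score Y_bik of observation i
    path_coefs[b]:     path coefficients p_{a_b g} of the candidate group g
    """
    e2 = []
    for b, coefs in enumerate(path_coefs):
        row = []
        for scores, y in zip(pred_scores[b], endo_scores[b]):
            y_hat = sum(s * p for s, p in zip(scores, coefs))  # predicted score
            row.append((y_hat - y) ** 2)
        e2.append(row)
    return e2


def distances(e2):
    """Distance of Equation (2) for every observation of one current group k."""
    n_obs = len(e2[0])
    dist = [0.0] * n_obs
    for row in e2:                       # one row per endogenous latent variable b
        total = sum(row)                 # sum of residuals over the group's observations
        for i, value in enumerate(row):
            dist[i] += sqrt(value / total) if total > 0 else 0.0
    return dist


# Toy input: one endogenous variable, two predecessors, three observations.
pred = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]]
endo = [[0.5, 0.5, 1.0]]
coefs_g = [[0.5, 0.3]]                   # hypothetical path coefficients of group g
d = distances(squared_residuals(pred, endo, coefs_g))
print([round(v, 4) for v in d])
```

    In this toy input, the first observation is predicted exactly by group g’s coefficients and therefore has a distance of zero, whereas the other
    two observations share the remaining squared residual equally.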
    Algorithm
    The segmentation process starts by randomly partitioning the overall sample into the prespecified number of G equal groups (Figure B1, Step
    1). Calculating all group-specific PLS path model estimates reveals each observation’s distance to its own and all other $G - 1$ groups. A
    partitioning approach that assigns each observation to the group to which it has the shortest distance improves the initial segmentation.
    Subsequently, the PLS-POS algorithm computes the group-specific PLS path modeling results (Figure B1, Step 2), updates the objective
    function (Figure B1, Step 3), and computes the observations’ distances to all groups (Figure B1, Step 4.1). PLS-POS uses the distance measure
    to reassign observations based on the maximum value of the difference between an observation’s distance to its current group (i.e., the group
    to which the observation has been assigned) and its distance to an alternative group (Equation 4).

    $$\Delta_{kig} = D_{kik} - D_{kig} \qquad (4)$$

    Here, $D_{kik}$ is the distance to the current group k and $D_{kig}$ is the distance to the alternative group g. Positive differences indicate that
    an observation has a shorter distance to the alternative group and, thus, potentially fits better in that group
    in terms of predictivity. This computation is conducted for all observations (Figure B1, Step 4.1). Each observation’s maximum positive
    difference becomes part of the list of candidates (Figure B1, Step 4.2). Negative values are not considered because reassigning these
    observations possibly decreases the objective criterion. Subsequently, the candidates are sorted in descending order in terms of their positive
    distance differences (Figure B1, Step 4.3).
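    The construction of the candidate list can be sketched in a few lines of Python. The sketch assumes that a matrix of observation-to-group
    distances is already available; the function name, the tuple layout, and the toy distances are hypothetical and serve only to illustrate the
    difference in Equation (4) and the subsequent sorting.

```python
def candidate_list(dist, current):
    """Build the sorted list of reassignment candidates.

    dist[i][g]: distance of observation i to group g
    current[i]: index of the group to which observation i is currently assigned
    Returns tuples (difference, observation, alternative group), largest first.
    """
    candidates = []
    for i, row in enumerate(dist):
        k = current[i]
        # Difference between the distance to the current group and each alternative.
        diffs = [(row[k] - row[g], g) for g in range(len(row)) if g != k]
        if not diffs:
            continue
        best_diff, best_group = max(diffs)
        if best_diff > 0:                # negative differences are not considered
            candidates.append((best_diff, i, best_group))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates


# Toy distances for three observations and two groups; all start in group 0.
dist = [[0.9, 0.4], [0.2, 0.7], [0.8, 0.1]]
cands = candidate_list(dist, current=[0, 0, 0])
print([(i, g) for _, i, g in cands])
```

    In the toy example, the third observation heads the list because its distance to the alternative group is much shorter than to its current group,
    and the second observation is not a candidate at all because its difference is negative.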
    After the STOP statement, PLS-POS provides the group-specific PLS path model estimates for the final segmentation solution (Figure B1,
    Step 7). The maximum number of iterations should be sufficiently high (e.g., twice the number of observations in the overall sample) to obtain
    a solution that is close to the global optimum. The maximum search depth equals the number of observations in the sorted list of candidate
    observations for reassignment and, thus, may not exceed the number of observations in the overall sample. In early explorative research stages,
    one may use a reduced search depth for performance reasons. However, to determine the final segmentation result, the search depth should
    equal the maximum number of observations to ensure that the segmentation solution that maximizes the PLS-POS objective criterion (i.e., the
    sum of the endogenous latent variables’ R² values in the PLS path model) has been identified.
    Finally, three important issues are worth noting. First, PLS-POS only reassigns observations that improve the objective criterion. As such,
    the algorithm ensures the continuous improvement of the objective criterion and potentially provides a solution that is at least close to the global
    optimum. Second, in each iteration step, the algorithm changes the assignment of only one observation and calculates the group-specific PLS
    estimates of all observations and their new distance measures. Thus, in contrast to the alternative distance-based PLS segmentation approaches
    suggested in the literature to date (e.g., Esposito Vinzi et al. 2008; Squillacciotti 2005), PLS-POS avoids moving a sizeable set (more or less)
    of similar candidates from one group to another without improving the objective criterion. Third, owing to the implementation of a
    hill-climbing approach, PLS-POS could face the problem of ending in local optima. Wedel and Kamakura (2000) recommend running hill-climbing
    algorithms several times to attain alternative starting partitions and, finally, to select the best segmentation solution. The same procedure should
    be applied in the application of PLS-POS.
    Step 1: Create an initial segmentation to start the algorithm
        Step 1.1: Randomly split the overall sample into K equally sized groups
        Step 1.2: Compute the group-specific PLS estimates for the path model
        Step 1.3: Establish each observation’s distance to each group
        Step 1.4: Assign each observation to the closest group
    DO LOOP
        Step 2: Compute the group-specific PLS estimates for the path model
        Step 3: Determine the result of the objective criterion
        Step 4: Create a list of candidate observations for reassignment
            Step 4.1: Establish the K − 1 differences between each observation’s distance to its current group and an alternative
                      group
            Step 4.2: IF an observation has one or more positive differences of distances, then
                          Add the maximum difference and the observation’s corresponding alternative group assignment to a list of
                          candidates
                      ELSE Do nothing
            Step 4.3: IF the list is empty, then
                          GO TO STOP
                      ELSE Sort the list of candidate observations in descending order in terms of their positive distance differences
        Step 5: Improve the segmentation result
            Step 5.1: Select the first observation in the list of candidate observations for reassignment
            DO LOOP
                Step 5.2: Reassign the observation
                Step 5.3: Compute the group-specific PLS estimates for the path model
                Step 5.4: Determine the result of the objective criterion
                Step 5.5: IF the observation’s reassignment improves the objective criterion, then
                              Save the current assignment and GO TO Step 6
                          ELSE Undo changes and continue with Step 5.6
                Step 5.6: IF the list contains a subsequent observation following the currently selected observation on the list of
                          candidates AND the maximum search depth has not been reached, then
                              Select the next observation
                          ELSE GO TO Step 6
            UNTIL the objective criterion is improved
        Step 6: IF the maximum number of iterations OR the maximum search depth has been reached, then
                    GO TO STOP
                ELSE GO TO Step 2
    UNTIL STOP
    Step 7: Compute the group-specific PLS path model estimates and provide the final segmentation results
    Figure B1. The PLS-POS Algorithm
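    The control flow of Figure B1 can be illustrated with a deliberately simplified Python sketch. It is not an implementation of PLS-POS: a
    no-intercept regression slope of one outcome on one predictor stands in for the group-specific PLS estimation, the initial reassignment to the
    closest group (the last three substeps of Step 1) is omitted, the search depth always equals the full candidate list, and the minimum group
    size `min_size` is an added assumption that keeps groups from degenerating. All function and parameter names are hypothetical.

```python
import random
from math import sqrt


def fit_slope(xs, ys):
    """Stand-in for a group-specific PLS estimation: slope of y on x, no intercept."""
    sxx = sum(v * v for v in xs)
    return sum(a * b for a, b in zip(xs, ys)) / sxx if sxx > 0 else 0.0


def evaluate(x, y, groups, n_groups):
    """Return the group slopes and the objective (sum of the groups' R^2 values)."""
    slopes, objective = [], 0.0
    for g in range(n_groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        slope = fit_slope([x[i] for i in idx], [y[i] for i in idx])
        sse = sum((y[i] - slope * x[i]) ** 2 for i in idx)
        sst = sum(y[i] ** 2 for i in idx)
        objective += (1.0 - sse / sst) if sst > 0 else 0.0
        slopes.append(slope)
    return slopes, objective


def candidates(x, y, groups, slopes):
    """Step 4: positive distance differences, sorted in descending order."""
    n_groups = len(slopes)
    e2 = [[(p * xi - yi) ** 2 for p in slopes] for xi, yi in zip(x, y)]
    # denom[k][g]: summed residuals of group k's observations under group g's slope
    denom = [[0.0] * n_groups for _ in range(n_groups)]
    for i, k in enumerate(groups):
        for g in range(n_groups):
            denom[k][g] += e2[i][g]
    out = []
    for i, k in enumerate(groups):
        d = [sqrt(e2[i][g] / denom[k][g]) if denom[k][g] > 0 else 0.0
             for g in range(n_groups)]
        diff, alt = max((d[k] - d[j], j) for j in range(n_groups) if j != k)
        if diff > 0:
            out.append((diff, i, alt))
    return sorted(out, reverse=True)


def pls_pos_sketch(x, y, n_groups=2, max_iter=None, min_size=5, seed=0):
    """Hill climbing over group assignments; returns (groups, objective history)."""
    n = len(x)
    max_iter = max_iter if max_iter is not None else 2 * n
    groups = [i % n_groups for i in range(n)]
    random.Random(seed).shuffle(groups)               # Step 1: random, equally sized split
    slopes, best = evaluate(x, y, groups, n_groups)   # Steps 2 and 3
    history = [best]
    for _ in range(max_iter):
        improved = False
        for diff, i, alt in candidates(x, y, groups, slopes):  # Steps 4 and 5
            old = groups[i]
            if groups.count(old) <= min_size:         # assumption: keep groups non-trivial
                continue
            groups[i] = alt                           # reassign a single observation
            new_slopes, obj = evaluate(x, y, groups, n_groups)
            if obj > best:                            # keep only improving reassignments
                slopes, best, improved = new_slopes, obj, True
                history.append(best)
                break
            groups[i] = old                           # undo and try the next candidate
        if not improved:                              # empty list or candidates exhausted
            break
    return groups, history


# Toy data: two hidden segments with slopes +0.7 and -0.7.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(60)]
y = [(0.7 if i < 30 else -0.7) * xi + rng.gauss(0, 0.3) for i, xi in enumerate(x)]
groups, history = pls_pos_sketch(x, y)
print(round(history[0], 3), "->", round(history[-1], 3))
```

    Because only improving reassignments are kept, the recorded objective values can only increase from one accepted move to the next. As with any
    hill-climbing procedure, the sketch may stop in a local optimum, so several runs with different values of `seed` would be needed in practice.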
    Appendix C
    Design of the Multicollinearity Factor for the Simulation Study
    The design of the simulation study for the formative measurement model includes three levels of multicollinearity between the formative
    indicators in the model. To simulate different levels of multicollinearity, we draw on Mason and Perreault’s (1991) seminal study on
    multicollinearity (see also Grewal et al. 2004). We vary two levels of correlation patterns among the predictor variables, reflecting conditions
    typically encountered by researchers and practitioners. In addition, a situation in which the indicators are uncorrelated (orthogonal) serves as
    a baseline for comparison (i.e., a perfect formative measure) because this model is unaffected by multicollinearity.
    Table C1 shows the two multicollinearity levels based on Mason and Perreault (1991), including the trace of (X’X)⁻¹, det(X’X), and condition number,
    as well as each variable’s variance inflation factor (VIF) associated with a given level of multicollinearity.
    Table C1. Levels of Multicollinearity
                              Level 1                          Level 2
                       X1     X2     X3     X4          X1     X2     X3     X4
    X1               1.00                             1.00
    X2                .65   1.00                       .80   1.00
    X3                .40    .40   1.00                .60    .60   1.00
    X4                .00    .00    .00   1.00         .00    .00    .00   1.00
    VIF              1.80   1.80   1.24   1.00        2.96   2.96   1.67   1.00
    Trace (X’X)⁻¹           5.85                             8.59
    Det(X’X)                 .47                              .22
    Condition no.           2.38                             3.42
    Note: VIF = variance inflation factor.
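    The diagnostics in Table C1 can be recomputed from the two correlation patterns. The Python sketch below treats each correlation matrix as
    X’X for standardized predictors, so that the VIFs are the diagonal elements of its inverse. The condition number is taken as the square root of
    the ratio of the largest to the smallest eigenvalue, which is an assumption about the definition used; the helper names are hypothetical, and
    the small linear algebra routines are written out only to keep the example free of external libraries.

```python
from math import sqrt


def inverse_and_det(m):
    """Gauss-Jordan elimination: return (inverse, determinant) of a small square matrix."""
    n = len(m)
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(m)]
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if pivot != c:                    # row swap flips the determinant's sign
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        p = a[c][c]
        det *= p
        a[c] = [v / p for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a], det


def largest_eigenvalue(m, iterations=500):
    """Power iteration for a symmetric positive definite matrix."""
    n = len(m)
    v = [float(i + 1) for i in range(n)]  # start vector with unequal entries
    lam = 0.0
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sqrt(sum(c * c for c in w))
        v = [c / lam for c in w]
    return lam


def diagnostics(r):
    """VIFs, trace of the inverse, determinant, and condition number of r."""
    inv, det = inverse_and_det(r)
    vif = [inv[i][i] for i in range(len(r))]
    # The largest eigenvalue of the inverse is the reciprocal of the smallest of r.
    cond = sqrt(largest_eigenvalue(r) * largest_eigenvalue(inv))
    return {"vif": vif, "trace": sum(vif), "det": det, "condition": cond}


LEVEL_1 = [[1.00, 0.65, 0.40, 0.0],
           [0.65, 1.00, 0.40, 0.0],
           [0.40, 0.40, 1.00, 0.0],
           [0.00, 0.00, 0.00, 1.0]]
LEVEL_2 = [[1.00, 0.80, 0.60, 0.0],
           [0.80, 1.00, 0.60, 0.0],
           [0.60, 0.60, 1.00, 0.0],
           [0.00, 0.00, 0.00, 1.0]]

for name, r in (("Level 1", LEVEL_1), ("Level 2", LEVEL_2)):
    res = diagnostics(r)
    print(name, [round(v, 2) for v in res["vif"]],
          round(res["trace"], 2), round(res["condition"], 2))
```

    Under these assumptions, the computed VIFs, traces, and condition numbers agree with the values in Table C1 after rounding to two decimals.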
    Appendix D
    Simulation on the Effects of Unobserved Heterogeneity
    The objective of this simulation study is to evaluate the implications of unobserved heterogeneity for structural model parameter estimates in
    PLS path models. The results show that unobserved heterogeneity has a strong adverse effect on PLS estimation outcomes: (1) parameter
    estimates are biased, (2) nonsignificant path coefficients at the group level become significant at the overall sample level that combines groups,
    (3) sign differences in the parameter estimates between groups are manifested as nonsignificant results at the overall sample level, and
    (4) explained variance of the model (R² of the endogenous variables) decreases. These erroneous estimates can lead to both Type I and Type II
    errors and to invalid inferences.
    The simulation study uses a path model with two exogenous variables having a direct effect on one endogenous variable (all variables measured
    with five reflective indicators). We generate data for the true path coefficients of two groups by considering three situations of unobserved
    heterogeneity:
    • Situation 1, where the path coefficients between group 1 and group 2 differ but show the same sign. We consider scenarios where all
    parameter estimates are positive (situation 1a) and negative (situation 1b) and where the magnitude in parameter differences between groups
    is low (.1) and high (.5).
    • Situation 2, where unobserved heterogeneity causes sign reversal in parameter estimates across the two groups (i.e., group 1 has a positive
    path coefficient, while group 2 has a negative one).
    • Situation 3, where one group has a nonsignificant parameter estimate and the other group has a significant parameter estimate. We distinguish
    between two different levels of parameter differences represented by the effect size of the significant parameter, namely .2 and .7.
    We generated 100 sets of data for each condition and estimated the group-specific path coefficients, the overall sample path coefficients, and
    the t-values of these coefficients by employing the bootstrapping procedure on 1,000 subsamples (Henseler et al. 2009).
    Table D1 presents the results. The left side shows the group-specific mean estimates of the path coefficients and their average t-values.² The
    columns on the right side show the mean path coefficients of the overall sample and the interpretation of the results in terms of bias, Type I
    and II errors, and variance explained (R²).
    ² For a significance level of α = 0.05, the t-value has to exceed the threshold of 1.98 in these conditions.
    The results show that, in all situations, biases in the parameter estimates distort effect sizes and cause misinterpretation of the path coefficients,
    which is especially problematic for comparative hypotheses (e.g., path coefficient 1 > path coefficient 2). Type I and Type II errors are
    exacerbated in situations where the group-specific parameters show inconsistent signs (i.e., situation 2, where signs are reversed across groups)
    and when at least one of the groups involves nonsignificant parameters while the other group does not (i.e., situation 3). In contrast, when all
    parameters are significant and show the same sign (situation 1), our results suggest that it is not very likely that Type II errors occur. In this
    situation, the existence of Type II errors depends on the effect size and the degree to which the increased power of the combined sample size
    compensates for the increase in standard errors due to unobserved heterogeneity. For all parameter constellations in our simulation study, the
    increased sample size compensates for the increase in standard errors.
    The R² decreases in almost all situations, implying an inferior model fit at the overall sample level. We find particularly strong decreases in
    R² in situations in which the group-specific effect sizes are high; in contrast, R² is almost unaffected in situations showing low group-specific
    effect sizes.
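    The mechanism behind these results can be reproduced in a much simpler setting than the PLS path models used in the study. The following
    Python sketch is an illustration under stated assumptions, not a replication: it replaces the path model with an ordinary least squares slope of
    one standardized outcome on one standardized predictor, and the function names, the seed, and the default group size are hypothetical choices.

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(beta1, beta2, n=200, seed=0):
    """Return (group 1 slope, group 2 slope, pooled slope) for two hidden groups."""
    rng = random.Random(seed)

    def draw(beta):
        x = [rng.gauss(0, 1) for _ in range(n)]
        # The error variance is chosen so that y has unit variance in each group.
        y = [beta * xi + rng.gauss(0, (1 - beta ** 2) ** 0.5) for xi in x]
        return x, y

    x1, y1 = draw(beta1)
    x2, y2 = draw(beta2)
    return ols_slope(x1, y1), ols_slope(x2, y2), ols_slope(x1 + x2, y1 + y2)


# Sign reversal across groups: the pooled slope is close to zero.
print([round(v, 2) for v in simulate(0.7, -0.7)])
# Same sign, different magnitude: the pooled slope lies between the group slopes.
print([round(v, 2) for v in simulate(0.7, 0.2)])
```

    With equally sized groups, the pooled slope is approximately the average of the two group-specific slopes. This is why a sign reversal across
    groups can cancel out at the overall sample level, and why a zero effect in one group can appear as a moderate effect in the pooled sample.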
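    Table D1. Results of the Simulation Study
    Group 1 and Group 2 show the group-specific parameter estimates; Pooled shows the pooled parameter estimate.
    Situation | Estimate | Group 1 (n = 200) | Group 2 (n = 200) | Pooled (n = 400) | Biased | Type I Error | Type II Error | Lower R²
    1a | Path 1 | .7 (t = 18.57) | .2 (t = 3.84) | .45 (t = 11.36) | Yes | – | No | Yes
    1a | Path 2 | .2 (t = 3.94) | .7 (t = 19.64) | .45 (t = 11.54) | | | |
    1a | R² | .53 | .53 | .41 | | | |
    1a | Path 1 | .3 (t = 4.95) | .2 (t = 3.36) | .25 (t = 5.70) | Yes | – | No | (Yes)
    1a | Path 2 | .2 (t = 3.31) | .3 (t = 4.79) | .25 (t = 5.73) | | | |
    1a | R² | .13 | .13 | .12 | | | |
    1b | Path 1 | −.7 (t = 18.95) | −.2 (t = 4.01) | −.45 (t = 11.19) | Yes | – | No | Yes
    1b | Path 2 | −.2 (t = 3.70) | −.7 (t = 19.27) | −.45 (t = 11.44) | | | |
    1b | R² | .53 | .53 | .24 | | | |
    1b | Path 1 | −.3 (t = 5.03) | −.2 (t = 3.25) | −.25 (t = 5.61) | Yes | – | No | (Yes)
    1b | Path 2 | −.2 (t = 3.14) | −.3 (t = 5.09) | −.25 (t = 5.80) | | | |
    1b | R² | .13 | .13 | .12 | | | |
    2 | Path 1 | .7 (t = 19.43) | −.7 (t = 19.09) | .00 (t = .01) | Yes | – | 100% | Yes
    2 | Path 2 | .2 (t = 3.99) | −.2 (t = 3.78) | .00 (t = .00) | | | 100% |
    2 | R² | .53 | .53 | .00 | | | |
    3 | Path 1 | .7 (t = 19.94) | .0 (t = .01) | .35 (t = 7.61) | Yes | 100% | No | Yes
    3 | Path 2 | .0 (t = .01) | .7 (t = 19.89) | .35 (t = 7.38) | | 100% | |
    3 | R² | .49 | .49 | .24 | | | |
    3 | Path 1 | .2 (t = 3.38) | .0 (t = .01) | .10 (t = 1.88) | Yes | 20% | 80% | (Yes)
    3 | Path 2 | .0 (t = .00) | .2 (t = 3.17) | .10 (t = 1.90) | | 40% | 60% |
    3 | R² | .04 | .04 | .02 | | | |
    4 | Path 1 | .0 (t = .00) | .0 (t = .01) | .00 (t = .00) | – | No | – | –
    4 | Path 2 | .0 (t = .01) | .0 (t = .00) | .00 (t = .00) | | | |
    4 | R² | .00 | .00 | .00 | | | |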
    Appendix E
    ANOVA Results—Model 1 (Reflective Measures)
    Tables E1 to E4 present the ANOVA results for model 1 (reflective measures), explaining MAB by method (PLS-POS/FIMIX-PLS) and the
    six design factors. All significant and substantial effects (i.e., all effects that explain more than 2 percent of the total variance in MAB, implying
    a partial η² of more than .02) are highlighted in grey.
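    For readers who want to relate the F statistics to the effect sizes in the following tables, partial η² can be recovered from an F value and
    its degrees of freedom. The short Python sketch below uses the standard identity partial η² = (F × df effect) / (F × df effect + df error).
    The decimal placement of the example F values (14658.62 for the intercept and 1121.71 for SMH in Table E1, with 11136 error degrees of freedom)
    is an assumption about the printed table, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Example rows from Table E1 (decimal placement of F is an assumption).
intercept = partial_eta_squared(14658.62, 1, 11136)
smh = partial_eta_squared(1121.71, 3, 11136)
print(round(intercept, 3), round(smh, 3))
```

    Under this reading, the recomputed values match the tabled partial η² of .568 for the intercept and .232 for structural model heterogeneity.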
    We find that R², structural model heterogeneity, data distribution, and the interaction of structural model heterogeneity and R² have a
    substantial and significant effect on the MAB of both methods. Furthermore, there is a significant and substantial difference in the parameter
    recovery (MAB) of the two methods (PLS-POS and FIMIX-PLS) and for the interaction effects between the method and structural model
    heterogeneity and between the method and R².
    Table E1. Between-Subjects Effects (Part I)
    Source of Variance in MAB df F Sig Partial η²
    Intercept 1 1465862 000 568
    SMH 3 112171 000 232
    R² 3 194885 000 344
    Sample Size 2 7077 000 013
    Reliability 1 188 170 000
    Data Distribution 1 49752 000 043
    RSS 1 2262 000 002
    SMH × R² 9 17896 000 126
    SMH × Sample Size 6 964 000 005
    SMH × Reliability 3 133 262 000
    SMH × Data Distribution 3 2115 000 006
    SMH × RSS 3 2517 000 007
    R² × Sample Size 6 1144 000 006
    R² × Reliability 3 75 524 000
    R² × Data Distribution 3 1472 000 004
    R² × RSS 3 2976 000 008
    Sample Size × Reliability 2 48 620 000
    Sample Size × Data Distribution 2 1417 000 003
    Sample Size × RSS 2 6392 000 011
    Reliability × Data Distribution 1 404 044 000
    Reliability × RSS 1 11 735 000
    Data Distribution × RSS 1 26772 000 023
    SMH × R² × Sample Size 18 175 026 003
    SMH × R² × Reliability 9 127 249 001
    SMH × R² × Data Distribution 9 600 000 005
    SMH × R² × RSS 9 232 013 002
    SMH × Sample Size × Reliability 6 139 216 001
    Note: df = degrees of freedom; MAB = mean absolute bias; RSS = relative segment size; SMH = structural model heterogeneity.
    All significant and substantial effects (i.e., all effects that explain more than 2% of the total variance in MAB, implying a partial η²
    of more than .02) are highlighted in grey.
    Table E2. Between-Subjects Effects (Part II)
    Source of Variance in MAB df F Sig Partial η²
    SMH × Sample Size × Data Distribution 6 522 000 003
    SMH × Sample Size × RSS 6 923 000 005
    SMH × Reliability × Data Distribution 3 219 087 001
    SMH × Reliability × RSS 3 350 015 001
    SMH × Data Distribution × RSS 3 230 075 001
    R² × Sample Size × Reliability 6 188 080 001
    R² × Sample Size × Data Distribution 6 183 089 001
    R² × Sample Size × RSS 6 1300 000 007
    R² × Reliability × Data Distribution 3 185 135 000
    R² × Reliability × RSS 3 42 740 000
    R² × Data Distribution × RSS 3 783 000 002
    Sample Size × Reliability × Data Distribution 2 165 191 000
    Sample Size × Reliability × RSS 2 219 112 000
    Sample Size × Data Distribution × RSS 2 1714 000 003
    Reliability × Data Distribution × RSS 1 108 299 000
    SMH × R² × Sample Size × Reliability 18 53 948 001
    SMH × R² × Sample Size × Data Distribution 18 168 036 003
    SMH × R² × Sample Size × RSS 18 211 004 003
    SMH × R² × Reliability × Data Distribution 9 68 725 001
    SMH × R² × Reliability × RSS 9 80 614 001
    SMH × R² × Data Distribution × RSS 9 152 135 001
    SMH × Sample Size × Reliability × Data Distribution 6 60 730 000
    SMH × Sample Size × Reliability × RSS 6 79 577 000
    SMH × Sample Size × Data Distribution × RSS 6 241 025 001
    SMH × Reliability × Data Distribution × RSS 3 206 104 001
    R² × Sample Size × Reliability × Data Distribution 6 152 168 001
    R² × Sample Size × Reliability × RSS 6 104 399 001
    R² × Sample Size × Data Distribution × RSS 6 475 000 003
    R² × Reliability × Data Distribution × RSS 3 26 851 000
    Sample Size × Reliability × Data Distribution × RSS 2 53 588 000
    SMH × R² × Sample Size × Reliability × Data Distribution 18 70 817 001
    SMH × R² × Sample Size × Reliability × RSS 18 70 811 001
    SMH × R² × Sample Size × Data Distribution × RSS 18 99 473 002
    SMH × R² × Reliability × Data Distribution × RSS 9 50 874 000
    SMH × Sample Size × Reliability × Data Distribution × RSS 6 171 115 001
    R² × Sample Size × Reliability × Data Distribution × RSS 6 141 206 001
    SMH × R² × Sample Size × Reliability × Data Distribution × RSS 18 96 502 002
    Error 11136
    Note: df = degrees of freedom; MAB = mean absolute bias; RSS = relative segment size; SMH = structural model heterogeneity.
    Table E3. Within-Subjects Effects (Part I)
    Source of Variance in MAB df F Sig Partial η²
    Method 1 95231 000 079
    Method × SMH 3 21747 000 055
    Method × R² 3 13714 000 036
    Method × Sample Size 2 466 009 001
    Method × Reliability 1 00 974 000
    Method × Data Distribution 1 8797 000 008
    Method × RSS 1 10401 000 009
    Method × SMH × R² 9 1284 000 010
    Method × SMH × Sample Size 6 279 010 002
    Method × SMH × Reliability 3 26 854 000
    Method × SMH × Data Distribution 3 3726 000 010
    Method × SMH × RSS 3 88 450 000
    Method × R² × Sample Size 6 184 087 001
    Method × R² × Reliability 3 02 995 000
    Method × R² × Data Distribution 3 1948 000 005
    Method × R² × RSS 3 398 008 001
    Method × Sample Size × Reliability 2 27 765 000
    Method × Sample Size × Data Distribution 2 1760 000 003
    Method × Sample Size × RSS 2 1660 000 003
    Method × Reliability × Data Distribution 1 02 876 000
    Method × Reliability × RSS 1 149 700 000
    Method × Data Distribution × RSS 1 1437 000 001
    Method × SMH × R² × Sample Size 18 89 589 001
    Method × SMH × R² × Reliability 9 133 215 001
    Method × SMH × R² × Data Distribution 9 207 029 002
    Method × SMH × R² × RSS 9 456 000 004
    Method × SMH × Sample Size × Reliability 6 73 626 000
    Method × SMH × Sample Size × Data Distribution 6 394 001 002
    Method × SMH × Sample Size × RSS 6 172 112 001
    Method × SMH × Reliability × Data Distribution 3 74 527 000
    Method × SMH × Reliability × RSS 3 102 381 000
    Method × SMH × Data Distribution × RSS 3 1888 000 005
    Method × R² × Sample Size × Reliability 6 28 945 000
    Method × R² × Sample Size × Data Distribution 6 209 051 001
    Method × R² × Sample Size × RSS 6 357 002 002
    Method × R² × Reliability × Data Distribution 3 29 835 000
    Method × R² × Reliability × RSS 3 128 278 000
    Method × R² × Data Distribution × RSS 3 897 000 002
    Method × Sample Size × Reliability × Data Distribution 2 69 501 000
    Method × Sample Size × Reliability × RSS 2 13 876 000
    Method × Sample Size × Data Distribution × RSS 2 898 000 002
    Method × Reliability × Data Distribution × RSS 1 00 993 000
    Note: df = degrees of freedom; MAB = mean absolute bias; RSS = relative segment size; SMH = structural model heterogeneity. All significant
    and substantial effects (i.e., all effects that explain more than 2% of the total variance in MAB, implying a partial η² of more than .02) are highlighted
    in grey.
    Table E4. Within-Subjects Effects (Part II)
    Source of Variance in MAB df F Sig Partial η²
    Method × SMH × R² × Sample Size × Reliability 18 56 930 001
    Method × SMH × R² × Sample Size × Data Distribution 18 195 009 003
    Method × SMH × R² × Sample Size × RSS 18 147 092 002
    Method × SMH × R² × Reliability × Data Distribution 9 95 484 001
    Method × SMH × R² × Reliability × RSS 9 107 380 001
    Method × SMH × R² × Data Distribution × RSS 9 196 040 002
    Method × SMH × Sample Size × Reliability × Data Distribution 6 54 775 000
    Method × SMH × Sample Size × Reliability × RSS 6 123 286 001
    Method × SMH × Sample Size × Data Distribution × RSS 6 262 015 001
    Method × SMH × Reliability × Data Distribution × RSS 3 30 828 000
    Method × R² × Sample Size × Reliability × Data Distribution 6 120 305 001
    Method × R² × Sample Size × Reliability × RSS 6 56 766 000
    Method × R² × Sample Size × Data Distribution × RSS 6 259 016 001
    Method × R² × Reliability × Data Distribution × RSS 3 34 798 000
    Method × Sample Size × Reliability × Data Distribution × RSS 2 34 711 000
    Method × SMH × R² × Sample Size × Reliability × Data Distribution 18 49 965 001
    Method × SMH × R² × Sample Size × Reliability × RSS 18 44 980 001
    Method × SMH × R² × Sample Size × Data Distribution × RSS 18 176 024 003
    Method × SMH × R² × Reliability × Data Distribution × RSS 9 47 897 000
    Method × SMH × Sample Size × Reliability × Data Distribution × RSS 6 162 138 001
    Method × R² × Sample Size × Reliability × Data Distribution × RSS 6 32 928 000
    Method × SMH × R² × Sample Size × Reliability × Data Distribution × RSS 18 83 667 001
    Error(Method) 11136
    Note: df = degrees of freedom; MAB = mean absolute bias; RSS = relative segment size; SMH = structural model heterogeneity.
    Appendix F
    ANOVA Results—Model 2 (Formative Measures)
    Tables F1 to F7 present the ANOVA results for model 2 (formative measures), explaining MAB by method (PLS-POS/FIMIX-PLS) and the
    seven design factors. All significant and substantial effects (i.e., all effects that explain more than 2 percent of the total variance in MAB,
    implying a partial η² of more than .02) are highlighted in grey.
    We find that R², structural and measurement model heterogeneity, sample size, multicollinearity, and data distribution, the interaction of
    structural and measurement model heterogeneity, and the interaction of sample size and relative segment size have a substantial and significant
    effect on the MAB of both methods. Furthermore, there is a significant and substantial difference in the parameter recovery (MAB) of the two
    methods (PLS-POS and FIMIX-PLS) and for the two-way interaction effects between method and R², multicollinearity, and structural and
    measurement model heterogeneity. Method even has a significant and substantial interaction effect with both structural and measurement model
    heterogeneity (three-way interaction).
    Table F1. Between-Subjects Effects (Part I)
    Source of Variance in MAB df F Sig Partial η²
    Intercept 1 14269680 00 740
    SMH 3 760533 00 313
    MMH 2 291299 00 104
    R² 3 428631 00 204
    Sample Size 2 86477 00 033
    RSS 1 62983 00 012
    Data Distribution 1 146575 00 028
    Multicollinearity 2 84818 00 033
    SMH × MMH 6 29809 00 034
    SMH × R² 9 4428 00 008
    MMH × R² 6 582 00 006
    Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
    SMH = structural model heterogeneity. All significant and substantial effects (i.e., all effects that explain more than 2% of the total variance in MAB,
    implying a partial η² of more than .02) are highlighted in grey.
    Table F2. Between-Subjects Effects (Part II)
    Source of Variance in MAB df F Sig Partial η²
    SMH × Sample Size 6 3110 00 004
    MMH × Sample Size 4 1506 00 001
    R² × Sample Size 6 4643 00 006
    SMH × RSS 3 7868 00 005
    MMH × RSS 2 69 50 000
    R² × RSS 3 8786 00 005
    Sample Size × RSS 2 142686 00 054
    SMH × Data Distribution 3 1204 00 001
    MMH × Data Distribution 2 761 00 000
    R² × Data Distribution 3 321 02 000
    Sample Size × Data Distribution 2 2839 00 001
    RSS × Data Distribution 1 226 13 000
    SMH × Multicollinearity 6 10917 00 013
    MMH × Multicollinearity 4 28784 00 022
    R² × Multicollinearity 6 539 00 001
    Sample Size × Multicollinearity 4 2836 00 002
    RSS × Multicollinearity 2 1571 00 001
    Data Distribution × Multicollinearity 2 1650 00 001
    SMH × MMH × R² 18 2586 00 009
    SMH × MMH × Sample Size 12 518 00 001
    SMH × R² × Sample Size 18 78 73 000
    MMH × R² × Sample Size 12 48 93 000
    SMH × MMH × RSS 6 548 00 001
    SMH × R² × RSS 9 60 80 000
    MMH × R² × RSS 6 266 01 000
    SMH × Sample Size × RSS 6 4287 00 005
    MMH × Sample Size × RSS 4 623 00 000
    R² × Sample Size × RSS 6 5973 00 007
    SMH × MMH × Data Distribution 6 335 00 000
    SMH × R² × Data Distribution 9 1258 00 002
    MMH × R² × Data Distribution 6 179 10 000
    SMH × Sample Size × Data Distribution 6 902 00 001
    MMH × Sample Size × Data Distribution 4 233 05 000
    R² × Sample Size × Data Distribution 6 276 01 000
    SMH × RSS × Data Distribution 3 1381 00 001
    MMH × RSS × Data Distribution 2 150 22 000
    R² × RSS × Data Distribution 3 264 05 000
    Sample Size × RSS × Data Distribution 2 2148 00 001
    SMH × MMH × Multicollinearity 12 1831 00 004
    SMH × R² × Multicollinearity 18 730 00 003
    MMH × R² × Multicollinearity 12 116 31 000
    SMH × Sample Size × Multicollinearity 12 1115 00 003
    MMH × Sample Size × Multicollinearity 8 317 00 001
    R² × Sample Size × Multicollinearity 12 88 57 000
    SMH × RSS × Multicollinearity 6 1244 00 001
    MMH × RSS × Multicollinearity 4 808 00 001
    Note df degrees of freedom MAB mean absolute bias MMH measurement model heterogeneity RSS relative segment size
    SMH structural model heterogeneity all significant and substantial effects (ie all effects that explain more than 2 of the total variance in MAB
    implying a partial η² of more than 02) are highlighted in grey
    Table F3. Between-Subjects Effects (Part III)
    Source of Variance in MAB df F Sig. Partial η²
    R² × RSS × Multicollinearity 6 1.29 .26 .000
    Sample Size × RSS × Multicollinearity 4 18.22 .00 .001
    SMH × Data Distribution × Multicollinearity 6 .94 .46 .000
    MMH × Data Distribution × Multicollinearity 4 3.81 .00 .000
    R² × Data Distribution × Multicollinearity 6 .88 .51 .000
    Sample Size × Data Distribution × Multicollinearity 4 11.09 .00 .001
    RSS × Data Distribution × Multicollinearity 2 12.97 .00 .001
    SMH × MMH × R² × Sample Size 36 .75 .86 .001
    SMH × MMH × R² × RSS 18 .86 .63 .000
    SMH × MMH × Sample Size × RSS 12 5.31 .00 .001
    SMH × R² × Sample Size × RSS 18 1.92 .01 .001
    MMH × R² × Sample Size × RSS 12 .36 .98 .000
    SMH × MMH × R² × Data Distribution 18 1.65 .04 .001
    SMH × MMH × Sample Size × Data Distribution 12 3.87 .00 .001
    SMH × R² × Sample Size × Data Distribution 18 1.36 .14 .000
    MMH × R² × Sample Size × Data Distribution 12 .68 .78 .000
    SMH × MMH × RSS × Data Distribution 6 1.80 .09 .000
    SMH × R² × RSS × Data Distribution 9 1.57 .12 .000
    MMH × R² × RSS × Data Distribution 6 .54 .78 .000
    SMH × Sample Size × RSS × Data Distribution 6 8.98 .00 .001
    MMH × Sample Size × RSS × Data Distribution 4 3.19 .01 .000
    R² × Sample Size × RSS × Data Distribution 6 1.04 .40 .000
    SMH × MMH × R² × Multicollinearity 36 2.16 .00 .002
    SMH × MMH × Sample Size × Multicollinearity 24 .79 .75 .000
    SMH × R² × Sample Size × Multicollinearity 36 1.62 .01 .001
    MMH × R² × Sample Size × Multicollinearity 24 1.04 .41 .000
    SMH × MMH × RSS × Multicollinearity 12 2.41 .00 .001
    SMH × R² × RSS × Multicollinearity 18 1.19 .26 .000
    MMH × R² × RSS × Multicollinearity 12 1.38 .17 .000
    SMH × Sample Size × RSS × Multicollinearity 12 9.08 .00 .002
    MMH × Sample Size × RSS × Multicollinearity 8 1.95 .05 .000
    R² × Sample Size × RSS × Multicollinearity 12 1.38 .17 .000
    SMH × MMH × Data Distribution × Multicollinearity 12 6.34 .00 .002
    Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
    SMH = structural model heterogeneity.
    Table F4. Between-Subjects Effects (Part IV)
    Source of Variance in MAB df F Sig. Partial η²
    SMH × R² × Data Distribution × Multicollinearity 18 1.72 .03 .001
    MMH × R² × Data Distribution × Multicollinearity 12 1.12 .34 .000
    SMH × Sample Size × Data Distribution × Multicollinearity 12 10.19 .00 .002
    MMH × Sample Size × Data Distribution × Multicollinearity 8 .87 .54 .000
    R² × Sample Size × Data Distribution × Multicollinearity 12 2.23 .01 .001
    SMH × RSS × Data Distribution × Multicollinearity 6 9.02 .00 .001
    MMH × RSS × Data Distribution × Multicollinearity 4 .49 .74 .000
    R² × RSS × Data Distribution × Multicollinearity 6 1.10 .36 .000
    Sample Size × RSS × Data Distribution × Multicollinearity 4 24.61 .00 .002
    SMH × MMH × R² × Sample Size × RSS 36 .75 .86 .001
    SMH × MMH × R² × Sample Size × Data Distribution 36 .74 .88 .001
    SMH × MMH × R² × RSS × Data Distribution 18 1.20 .25 .000
    SMH × MMH × Sample Size × RSS × Data Distribution 12 1.62 .08 .000
    SMH × R² × Sample Size × RSS × Data Distribution 18 .69 .83 .000
    MMH × R² × Sample Size × RSS × Data Distribution 12 1.20 .27 .000
    SMH × MMH × R² × Sample Size × Multicollinearity 72 1.13 .21 .002
    SMH × MMH × R² × RSS × Multicollinearity 36 1.66 .01 .001
    SMH × MMH × Sample Size × RSS × Multicollinearity 24 1.66 .02 .001
    SMH × R² × Sample Size × RSS × Multicollinearity 36 .52 .99 .000
    MMH × R² × Sample Size × RSS × Multicollinearity 24 .75 .81 .000
    SMH × MMH × R² × Data Distribution × Multicollinearity 36 .95 .55 .001
    SMH × MMH × Sample Size × Data Distribution × Multicollinearity 24 1.52 .05 .001
    SMH × R² × Sample Size × Data Distribution × Multicollinearity 36 1.33 .09 .001
    MMH × R² × Sample Size × Data Distribution × Multicollinearity 24 .90 .60 .000
    SMH × MMH × RSS × Data Distribution × Multicollinearity 12 1.52 .11 .000
    SMH × R² × RSS × Data Distribution × Multicollinearity 18 1.90 .01 .001
    MMH × R² × RSS × Data Distribution × Multicollinearity 12 1.45 .14 .000
    SMH × Sample Size × RSS × Data Distribution × Multicollinearity 12 8.65 .00 .002
    MMH × Sample Size × RSS × Data Distribution × Multicollinearity 8 1.13 .34 .000
    R² × Sample Size × RSS × Data Distribution × Multicollinearity 12 .85 .60 .000
    SMH × MMH × R² × Sample Size × RSS × Data Distribution 36 .98 .51 .001
    SMH × MMH × R² × Sample Size × RSS × Multicollinearity 72 .84 .84 .001
    SMH × MMH × R² × Sample Size × Data Distribution × Multicollinearity 72 1.07 .33 .002
    SMH × MMH × R² × RSS × Data Distribution × Multicollinearity 36 1.24 .15 .001
    SMH × MMH × Sample Size × RSS × Data Distribution × Multicollinearity 24 1.12 .32 .001
    SMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 36 1.09 .32 .001
    MMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 24 .87 .65 .000
    SMH × MMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 72 1.05 .36 .002
    Error 50112
    Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
    SMH = structural model heterogeneity.
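    The partial η² values in the between-subjects tables can be cross-checked against the reported F statistics. The following is a minimal Python sketch, assuming the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error) and the error degrees of freedom of 50112 from the Error row. The function name and the choice of rows are illustrative and are not part of the original analysis; the F values are read as 1426.86, 287.84, and 31.10.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Uses SS_effect / (SS_effect + SS_error), rewritten in terms of F:
    F * df_effect is SS_effect / MS_error and df_error is SS_error / MS_error.
    """
    scaled_ss_effect = f_value * df_effect
    return scaled_ss_effect / (scaled_ss_effect + df_error)


DF_ERROR = 50112  # taken from the Error row of the between-subjects table

# (F, df) pairs for three between-subjects effects; the selection is illustrative
rows = {
    "Sample Size × RSS": (1426.86, 2),
    "MMH × Multicollinearity": (287.84, 4),
    "SMH × Sample Size": (31.10, 6),
}

for source, (f_value, df_effect) in rows.items():
    eta_sq = partial_eta_squared(f_value, df_effect, DF_ERROR)
    print(f"{source}: partial eta squared = {eta_sq:.3f}")
```

    Under these assumptions, the three rows give .054, .022, and .004 after rounding to three decimals, which matches the partial η² column.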
    Table F5. Within-Subjects Effects (Part I)
    Source of Variance in MAB df F Sig. Partial η²
    Method 1 3938.52 .00 .073
    Method × SMH 3 3987.98 .00 .193
    Method × MMH 2 6771.05 .00 .213
    Method × R² 3 826.32 .00 .047
    Method × Sample Size 2 227.55 .00 .009
    Method × RSS 1 171.66 .00 .003
    Method × Data Distribution 1 2.97 .08 .000
    Method × Multicollinearity 2 1739.12 .00 .065
    Method × SMH × MMH 6 976.49 .00 .105
    Method × SMH × R² 9 83.50 .00 .015
    Method × MMH × R² 6 6.13 .00 .001
    Method × SMH × Sample Size 6 22.80 .00 .003
    Method × MMH × Sample Size 4 3.13 .01 .000
    Method × R² × Sample Size 6 3.95 .00 .000
    Method × SMH × RSS 3 60.96 .00 .004
    Method × MMH × RSS 2 12.78 .00 .001
    Method × R² × RSS 3 15.69 .00 .001
    Method × Sample Size × RSS 2 163.40 .00 .006
    Method × SMH × Data Distribution 3 54.31 .00 .003
    Method × MMH × Data Distribution 2 3.39 .03 .000
    Method × R² × Data Distribution 3 5.19 .00 .000
    Method × Sample Size × Data Distribution 2 12.45 .00 .000
    Method × RSS × Data Distribution 1 56.16 .00 .001
    Method × SMH × Multicollinearity 6 372.96 .00 .043
    Method × MMH × Multicollinearity 4 257.24 .00 .020
    Method × R² × Multicollinearity 6 9.69 .00 .001
    Method × Sample Size × Multicollinearity 4 22.84 .00 .002
    Method × RSS × Multicollinearity 2 5.85 .00 .000
    Method × Data Distribution × Multicollinearity 2 11.81 .00 .000
    Method × SMH × MMH × R² 18 11.49 .00 .004
    Method × SMH × MMH × Sample Size 12 2.44 .00 .001
    Method × SMH × R² × Sample Size 18 3.68 .00 .001
    Method × MMH × R² × Sample Size 12 1.39 .16 .000
    Method × SMH × MMH × RSS 6 14.80 .00 .002
    Method × SMH × R² × RSS 9 12.50 .00 .002
    Method × MMH × R² × RSS 6 2.61 .02 .000
    Method × SMH × Sample Size × RSS 6 47.94 .00 .006
    Method × MMH × Sample Size × RSS 4 13.37 .00 .001
    Method × R² × Sample Size × RSS 6 19.62 .00 .002
    Method × SMH × MMH × Data Distribution 6 1.74 .11 .000
    Method × SMH × R² × Data Distribution 9 5.01 .00 .001
    Method × MMH × R² × Data Distribution 6 3.04 .01 .000
    Method × SMH × Sample Size × Data Distribution 6 7.68 .00 .001
    Method × MMH × Sample Size × Data Distribution 4 .30 .88 .000
    Method × R² × Sample Size × Data Distribution 6 3.34 .00 .000
    Method × SMH × RSS × Data Distribution 3 3.68 .01 .000
    Method × MMH × RSS × Data Distribution 2 .76 .47 .000
    Method × R² × RSS × Data Distribution 3 .43 .73 .000
    Method × Sample Size × RSS × Data Distribution 2 19.04 .00 .001
    Method × SMH × MMH × Multicollinearity 12 28.62 .00 .007
    Method × SMH × R² × Multicollinearity 18 5.04 .00 .002
    Method × MMH × R² × Multicollinearity 12 .46 .94 .000
    Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
    SMH = structural model heterogeneity. All significant and substantial effects (i.e., all effects that explain more than 2% of the total variance in MAB,
    implying a partial η² of more than .02) are highlighted in grey.
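    The screening rule in the note above (significant, and partial η² of more than .02) can be sketched in Python as follows. The significance level of .05 is an assumption, since the note does not state one, and the names Effect and substantial_effects are hypothetical. The four example rows are taken from the within-subjects effects, reading the Sig. and partial η² entries as .00/.073, .00/.193, .00/.009, and .08/.000.

```python
from typing import List, NamedTuple


class Effect(NamedTuple):
    source: str            # source of variance in MAB
    sig: float             # reported significance value
    partial_eta_sq: float  # reported partial eta squared


def substantial_effects(effects: List[Effect],
                        alpha: float = 0.05,         # assumed significance level
                        min_eta_sq: float = 0.02) -> List[str]:
    """Return the sources that are significant and explain more than 2% of variance."""
    return [e.source for e in effects
            if e.sig < alpha and e.partial_eta_sq > min_eta_sq]


# Illustrative subset of within-subjects rows
effects = [
    Effect("Method", 0.00, 0.073),
    Effect("Method × SMH", 0.00, 0.193),
    Effect("Method × Sample Size", 0.00, 0.009),
    Effect("Method × Data Distribution", 0.08, 0.000),
]

print(substantial_effects(effects))
```

    With this subset, only Method and Method × SMH pass both conditions; Method × Sample Size is significant but falls below the .02 threshold, and Method × Data Distribution fails both.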
    Table F6. Within-Subjects Effects (Part II)
    Source of Variance in MAB df F Sig. Partial η²
    Method × SMH × Sample Size × Multicollinearity 12 11.91 .00 .003
    Method × MMH × Sample Size × Multicollinearity 8 1.40 .19 .000
    Method × R² × Sample Size × Multicollinearity 12 .91 .53 .000
    Method × SMH × RSS × Multicollinearity 6 16.91 .00 .002
    Method × MMH × RSS × Multicollinearity 4 3.91 .00 .000
    Method × R² × RSS × Multicollinearity 6 1.19 .31 .000
    Method × Sample Size × RSS × Multicollinearity 4 20.68 .00 .002
    Method × SMH × Data Distribution × Multicollinearity 6 6.57 .00 .001
    Method × MMH × Data Distribution × Multicollinearity 4 3.63 .01 .000
    Method × R² × Data Distribution × Multicollinearity 6 .99 .43 .000
    Method × Sample Size × Data Distribution × Multicollinearity 4 24.39 .00 .002
    Method × RSS × Data Distribution × Multicollinearity 2 28.84 .00 .001
    Method × SMH × MMH × R² × Sample Size 36 1.35 .08 .001
    Method × SMH × MMH × R² × RSS 18 1.48 .08 .001
    Method × SMH × MMH × Sample Size × RSS 12 1.99 .02 .000
    Method × SMH × R² × Sample Size × RSS 18 2.48 .00 .001
    Method × MMH × R² × Sample Size × RSS 12 2.34 .01 .001
    Method × SMH × MMH × R² × Data Distribution 18 .86 .63 .000
    Method × SMH × MMH × Sample Size × Data Distribution 12 2.68 .00 .001
    Method × SMH × R² × Sample Size × Data Distribution 18 1.28 .19 .000
    Method × MMH × R² × Sample Size × Data Distribution 12 .37 .97 .000
    Method × SMH × MMH × RSS × Data Distribution 6 1.18 .32 .000
    Method × SMH × R² × RSS × Data Distribution 9 3.45 .00 .001
    Method × MMH × R² × RSS × Data Distribution 6 .51 .80 .000
    Method × SMH × Sample Size × RSS × Data Distribution 6 8.37 .00 .001
    Method × MMH × Sample Size × RSS × Data Distribution 4 1.21 .31 .000
    Method × R² × Sample Size × RSS × Data Distribution 6 1.13 .34 .000
    Method × SMH × MMH × R² × Multicollinearity 36 1.29 .11 .001
    Method × SMH × MMH × Sample Size × Multicollinearity 24 1.28 .16 .001
    Method × SMH × R² × Sample Size × Multicollinearity 36 1.36 .08 .001
    Method × MMH × R² × Sample Size × Multicollinearity 24 1.05 .40 .001
    Method × SMH × MMH × RSS × Multicollinearity 12 3.27 .00 .001
    Method × SMH × R² × RSS × Multicollinearity 18 1.02 .43 .000
    Method × MMH × R² × RSS × Multicollinearity 12 1.40 .16 .000
    Method × SMH × Sample Size × RSS × Multicollinearity 12 8.14 .00 .002
    Method × MMH × Sample Size × RSS × Multicollinearity 8 2.47 .01 .000
    Method × R² × Sample Size × RSS × Multicollinearity 12 1.36 .18 .000
    Method × SMH × MMH × Data Distribution × Multicollinearity 12 2.63 .00 .001
    Method × SMH × R² × Data Distribution × Multicollinearity 18 1.65 .04 .001
    Method × MMH × R² × Data Distribution × Multicollinearity 12 .82 .63 .000
    Method × SMH × Sample Size × Data Distribution × Multicollinearity 12 7.24 .00 .002
    Method × MMH × Sample Size × Data Distribution × Multicollinearity 8 1.01 .42 .000
    Method × R² × Sample Size × Data Distribution × Multicollinearity 12 1.42 .15 .000
    Method × SMH × RSS × Data Distribution × Multicollinearity 6 6.94 .00 .001
    Method × MMH × RSS × Data Distribution × Multicollinearity 4 1.40 .23 .000
    Method × R² × RSS × Data Distribution × Multicollinearity 6 1.59 .15 .000
    Method × Sample Size × RSS × Data Distribution × Multicollinearity 4 15.65 .00 .001
    Method × SMH × MMH × R² × Sample Size × RSS 36 1.88 .00 .001
    Method × SMH × MMH × R² × Sample Size × Data Distribution 36 .80 .80 .001
    Method × SMH × MMH × R² × RSS × Data Distribution 18 1.00 .45 .000
    Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
    SMH = structural model heterogeneity.
    Table F7. Within-Subjects Effects (Part III)
    Source of Variance in MAB df F Sig. Partial η²
    Method × SMH × MMH × Sample Size × RSS × Data Distribution 12 2.14 .01 .001
    Method × SMH × R² × Sample Size × RSS × Data Distribution 18 1.53 .07 .001
    Method × MMH × R² × Sample Size × RSS × Data Distribution 12 .77 .68 .000
    Method × SMH × MMH × R² × Sample Size × Multicollinearity 72 .91 .70 .001
    Method × SMH × MMH × R² × RSS × Multicollinearity 36 1.28 .12 .001
    Method × SMH × MMH × Sample Size × RSS × Multicollinearity 24 1.95 .00 .001
    Method × SMH × R² × Sample Size × RSS × Multicollinearity 36 1.37 .07 .001
    Method × MMH × R² × Sample Size × RSS × Multicollinearity 24 .90 .60 .000
    Method × SMH × MMH × R² × Data Distribution × Multicollinearity 36 .98 .50 .001
    Method × SMH × MMH × Sample Size × Data Distribution × Multicollinearity 24 2.46 .00 .001
    Method × SMH × R² × Sample Size × Data Distribution × Multicollinearity 36 1.49 .03 .001
    Method × MMH × R² × Sample Size × Data Distribution × Multicollinearity 24 .70 .85 .000
    Method × SMH × MMH × RSS × Data Distribution × Multicollinearity 12 1.75 .05 .000
    Method × SMH × R² × RSS × Data Distribution × Multicollinearity 18 1.71 .03 .001
    Method × MMH × R² × RSS × Data Distribution × Multicollinearity 12 1.37 .17 .000
    Method × SMH × Sample Size × RSS × Data Distribution × Multicollinearity 12 8.67 .00 .002
    Method × MMH × Sample Size × RSS × Data Distribution × Multicollinearity 8 1.29 .24 .000
    Method × R² × Sample Size × RSS × Data Distribution × Multicollinearity 12 .78 .68 .000
    Method × SMH × MMH × R² × Sample Size × RSS × Data Distribution 36 .85 .73 .001
    Method × SMH × MMH × R² × Sample Size × RSS × Multicollinearity 72 1.05 .36 .002
    Method × SMH × MMH × R² × Sample Size × Data Distribution × Multicollinearity 72 1.20 .11 .002
    Method × SMH × MMH × R² × RSS × Data Distribution × Multicollinearity 36 1.53 .02 .001
    Method × SMH × MMH × Sample Size × RSS × Data Distribution × Multicollinearity 24 2.53 .00 .001
    Method × SMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 36 1.33 .09 .001
    Method × MMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 24 1.25 .18 .001
    Method × SMH × MMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 72 .96 .58 .001
    Error(Method) 50112
    Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
    SMH = structural model heterogeneity.