Principal Factor Analysis in Scale Construction

Journal Name : SunText Review of Medical & Clinical Research

DOI : 10.51737/2766-4813.2024.099

Article Type : Research Article

Authors : Singh A



Introduction

Scale construction is a technique used in social science research. It is a systematic approach to understanding a particular component or problem prevailing in a community and quantifying it scientifically. Measurement is an integral part of science and helps to identify and quantify a particular problem, object, or process [1]. According to De Vellis [2], measurement scales are important for attributing scores or numbers, on some numerical dimension, to attributes that cannot be measured directly. In the process of developing a scale, principal factor analysis plays a vital role. Large data sets are statistically difficult to manage and interpret, and the literature offers different methods to reduce the dimensions of a large data set and make it more interpretable without losing the information it carries [3]. Principal factor analysis, often referred to interchangeably with principal component analysis in the scale-development literature, is one of the oldest methods for reducing data, finding the correlations between variables, and arriving at the most comprehensive combined variables (also called items) without compromising on variability [4]. It is generally used to reduce the number of items in a scale and to check the psychometric properties of a newly developed tool without compromising its statistical soundness. Mooi and Sarstedt [5] proposed a few points to be considered when conducting principal factor analysis.


Pre-requisite for conducting PFA

In the process of conducting PFA, there are certain prerequisites that need to be taken care of by the researcher. These prerequisites are as follows:

Suitability of the measurement scale: To carry out PFA, the data must be measured at the interval or ratio level. In an interval scale, items and responses are measured with numerical values, and the intervals between adjacent values are equal [6].

Large sample size: To carry out PFA, the sample size should be adequately large, with an item-to-response ratio of 1:4 to 1:10 [7], which means that the number of valid observations should be at least 4 to 10 times the number of items used in the analysis.

Independent observations: The responses collected from the sample should be independent, that is, uncorrelated. If dependent observations are used, they give rise to "artificial correlation", which can distort the results.

Sufficient correlation among the variables: As discussed earlier, PFA is based on the correlations between items, so to perform PFA the items must be sufficiently correlated. To check the adequacy of these correlations, researchers generally use the Kaiser-Meyer-Olkin (KMO) criterion proposed by Kaiser [8].

The Kaiser-Meyer-Olkin (KMO) measure by Kaiser (1974): To assess sampling adequacy, Kaiser (1974) proposed the criterion known as the "Kaiser criterion". According to this criterion, the KMO value should be at least 0.6 and as close to 1.0 as possible (the adequacy categories are summarized in Table 1 below).

The Bartlett's Test of Sphericity [9]: Bartlett's (1951) test of sphericity tests whether "a matrix (of correlations) is significantly different from an identity matrix." A significant result indicates that the variables in the data set are correlated enough for factor analysis to be meaningful.
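As an illustration, both adequacy checks can be run in Python with the factor_analyzer package; this is only a sketch, and the DataFrame items_df and the file name scale_items.csv are placeholders assumed here, not part of the original study.

```python
# Sketch: checking sampling adequacy (KMO) and Bartlett's test of sphericity
# before factoring. `items_df` is a hypothetical pandas DataFrame whose columns
# are the scale items measured at the interval/ratio level.
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

items_df = pd.read_csv("scale_items.csv")   # placeholder file name

# Bartlett's test: a significant p-value (< .05) suggests the correlation
# matrix differs from an identity matrix, so factoring is appropriate.
chi_square, p_value = calculate_bartlett_sphericity(items_df)

# KMO: an overall value of at least 0.6 (ideally close to 1.0) indicates
# adequate shared variance among the items.
kmo_per_item, kmo_overall = calculate_kmo(items_df)

print(f"Bartlett chi-square = {chi_square:.2f}, p = {p_value:.4f}")
print(f"Overall KMO = {kmo_overall:.3f}")
```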



Extraction of the Factor

Researchers generally use PFA when data reduction is the primary objective. In other words, PFA is widely used because it extracts "a minimum number of factors that account for a maximum proportion of the variables' total variance". In the process of factor extraction, principal factor analysis itself develops a new set of factors, which are linear composites of the original variables in the data set; these linear composites are also called eigenfactors. The process of generating factors continues until a significant share of the total variance is explained.

Eigenvalue- An eigenvalue is essentially the ratio of shared variance to unique variance (shared variance : unique variance) produced by the factor extraction in principal factor analysis. An eigenvalue greater than 1.0 is the standard, though arbitrary, criterion for deciding whether to retain a factor [10].

The mathematical equation for an eigenvalue is

 AX = λX

 where

 A is a square matrix,

 λ is an eigenvalue, and

 X is the eigenvector corresponding to that eigenvalue.
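For illustration, the eigenvalues of an item correlation matrix can be computed with NumPy; the toy responses below are invented purely to make the sketch runnable.

```python
# Sketch: eigen-decomposition of an item correlation matrix.
import numpy as np

rng = np.random.default_rng(0)
toy_responses = rng.normal(size=(200, 6))      # 200 respondents, 6 items (toy data)
R = np.corrcoef(toy_responses, rowvar=False)   # 6 x 6 correlation matrix

eigenvalues = np.linalg.eigvalsh(R)[::-1]      # eigenvalues, sorted descending
print(eigenvalues.round(3))

# Kaiser criterion: retain factors whose eigenvalue exceeds 1.0
print("Factors retained:", int((eigenvalues > 1.0).sum()))
```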


Communality- Communality may be defined as the "proportion of common variance found in a particular variable" and is denoted by h² [11]. The communality indicates how much of each variable's variance the extracted factors can reproduce. Generally, the extracted factors should account for at least 50% of a variable's variance; thus, the communality should be above 0.50. A variable whose variance is completely unexplained by the other variables has a communality of zero [12]. Since the objective is to reduce the number of variables through factor extraction, the researcher should extract only a few factors that account for a high proportion of the overall variance. The communality in PFA is given by [13]

 cj = Σi sij²

 where

 cj = the communality of the jth variable (h²), and

 sij = the loading (or correlation) between the ith component and the jth variable.
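A minimal sketch of this calculation, using invented loadings purely for illustration:

```python
# Sketch: communality h^2 of each variable = sum of its squared loadings
# across the retained factors (c_j = sum_i s_ij^2).
import numpy as np

loadings = np.array([[0.72, 0.10],     # invented (items x factors) loading matrix
                     [0.65, 0.22],
                     [0.15, 0.80],
                     [0.05, 0.68]])

communalities = (loadings ** 2).sum(axis=1)
print(communalities.round(3))          # values of 0.50 or above are desirable
```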


Determining the number of factors

After the factors are extracted, the major task is to determine how many of them to retain and whether they are adequate. In this process, the researcher generally uses two methods: (i) the Kaiser criterion and (ii) the scree plot. The purpose of using both methods is that if different methods suggest the same number of factors, the researcher can have greater confidence in the result.

The Kaiser criterion [8]: The most common way to determine the number of factors is to select all the factors with an eigenvalue greater than 1. The reason for this cut-off is that such a factor accounts for more variance than a single variable does. Extracting all the factors with an eigenvalue greater than 1 is frequently called the Kaiser criterion or latent root criterion.

The Scree Plot [14]: This is another method for determining the number of factors. In a scree plot, the eigenvalue of each factor (y-axis) is plotted against the factor number (x-axis). The resulting curve typically shows a distinct break, known as the "elbow", which indicates the appropriate number of factors. It is recommended to retain all the factors above the elbow, because they contribute most to the explanation of the variance in the data set. In the figure shown below, two to three factors could be retained.

Figure 1: Scree Plot (Eigenvalues of Full Correlation Matrix).
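A scree plot of the kind shown in Figure 1 can be drawn, for example, with matplotlib; the eigenvalues below are illustrative and not taken from any real data set.

```python
# Sketch: scree plot of eigenvalues; retain the factors above the "elbow".
import matplotlib.pyplot as plt
import numpy as np

eigenvalues = np.array([2.8, 1.6, 0.7, 0.4, 0.3, 0.2])   # illustrative values
factor_numbers = np.arange(1, len(eigenvalues) + 1)

plt.plot(factor_numbers, eigenvalues, marker="o")
plt.axhline(1.0, linestyle="--", label="Kaiser criterion (eigenvalue = 1)")
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.legend()
plt.show()
```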



Table 1: KMO values and the corresponding adequacy of the correlation [8].

KMO Value          Adequacy of the Correlation
Below 0.50         Unacceptable
0.50 - 0.59        Miserable
0.60 - 0.69        Mediocre
0.70 - 0.79        Middling
0.80 - 0.89        Meritorious
0.90 and higher    Marvelous


Interpretation of the factor solution

After the factors are identified and their number determined, interpretation of the factor solution takes place. It involves the following steps:

Rotation of the Factors- To interpret the solution, the researcher has to determine which variables relate to each of the extracted factors. This is done by examining the factor loadings, which represent the correlations between the factors and the variables and range from -1 to +1. A high factor loading indicates that a certain factor represents a variable well. The researcher looks at high absolute values, because the correlation between a variable and a factor can also be negative. Past studies have suggested that, among rotation methods, promax rotation is widely used [15,16].

Promax Rotation- Promax is an oblique rotation, which allows factors to be correlated. It can be calculated more quickly than a direct oblimin rotation, so it is useful for large datasets [17]. In oblique rotations the new axes are free to take any angle in the factor space; the degree of correlation between factors is generally small, because two highly correlated factors would be better understood as a single factor. Oblique rotations therefore relax the orthogonality constraint to gain simplicity of interpretation and hence are widely followed.
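As a sketch of how such a rotated solution might be obtained in practice, the example below uses the Python factor_analyzer package with principal-axis extraction and promax rotation; the DataFrame items_df, the file name, and the choice of two factors are assumptions made only for this illustration.

```python
# Sketch: principal factoring with promax (oblique) rotation.
import pandas as pd
from factor_analyzer import FactorAnalyzer

items_df = pd.read_csv("scale_items.csv")      # placeholder file of item responses

fa = FactorAnalyzer(n_factors=2, rotation="promax", method="principal")
fa.fit(items_df)

pattern_matrix = pd.DataFrame(fa.loadings_,
                              index=items_df.columns,
                              columns=["Factor1", "Factor2"])
print(pattern_matrix.round(2))
print("Communalities:", fa.get_communalities().round(2))
```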


Final Item Reduction

Items with a pattern coefficient ("loading") of 0.40 or higher (that is, a factor explaining at least 16% of an item's variance) are retained. For a factor to be considered, a minimum of three items should have a loading of more than 0.40. It should be noted that the interpretation of the factors is based entirely on the pattern matrix coefficients.

Factor Loading: A factor loading is a sort of "index" or "scale" that shows the "relative importance" or "magnitude" of some collection of items (characteristics, features) that collectively form a whole [18].

Pattern Matrix Coefficients: These are defined as the "unique loads or investments of the given factor into variables" [19]. The pattern matrix gives an overview of the factors that emerge with factor loadings above 0.40 and of the final items retained. For example, if 10 factors were developed and the factor loadings of, say, 20 items fell below 0.40, those items would be discarded; it is therefore advisable to develop at least twice as many items in the initial question pool as are ultimately needed. A sketch of these retention rules follows.
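The sketch below applies the retention rules to an invented pattern matrix; the item names and loading values are illustrative only.

```python
# Sketch: retain items whose largest absolute pattern coefficient is >= 0.40,
# and keep only factors on which at least three retained items load that strongly.
import pandas as pd

pattern_matrix = pd.DataFrame(
    {"Factor1": [0.72, 0.65, 0.05, 0.61, 0.30, 0.12, 0.20],
     "Factor2": [0.10, 0.18, 0.80, 0.08, 0.55, 0.70, 0.25]},
    index=[f"item{i}" for i in range(1, 8)],
)

retained_items = pattern_matrix[pattern_matrix.abs().max(axis=1) >= 0.40]

items_per_factor = (retained_items.abs() >= 0.40).sum()
retained_factors = items_per_factor[items_per_factor >= 3].index.tolist()

print("Items retained:", retained_items.index.tolist())     # item7 is discarded
print("Factors retained:", retained_factors)
```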

Internal Consistency Assessment of the new tool (PIC)

Internal consistency is a statistical measure of the reliability of a scale. It is defined as the extent to which the items within a scale or construct measure various aspects of the same characteristic [20]. A scale is considered to have good internal-consistency reliability if its items measure the same construct. Internal consistency can be assessed by two methods: (i) Cronbach’s alpha coefficient and (ii) composite reliability.

Cronbach’s Alpha Coefficient: Cronbach’s alpha, α (or coefficient alpha), developed by Cronbach [21], measures the reliability or internal consistency of a particular construct. It is commonly used to assess the reliability of Likert-type scales and indicates how closely related a set of items are as a group.

 The formula for Cronbach’s alpha is

 α = (N · c̄) / (v̄ + (N − 1) · c̄)

 where

  • N = the number of items,
  • c̄ = the average covariance between item pairs, and
  • v̄ = the average variance of the items.
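This formula translates directly into code. The helper below (cronbach_alpha is a name chosen here, not a library function) is demonstrated on invented toy responses.

```python
# Sketch: Cronbach's alpha from the average inter-item covariance (c-bar)
# and the average item variance (v-bar), exactly as in the formula above.
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    cov = items.cov().values
    n = items.shape[1]                                          # N = number of items
    avg_var = np.diag(cov).mean()                               # v-bar
    avg_cov = (cov.sum() - np.diag(cov).sum()) / (n * (n - 1))  # c-bar
    return (n * avg_cov) / (avg_var + (n - 1) * avg_cov)

# Invented, correlated toy responses standing in for real item data.
rng = np.random.default_rng(0)
common = rng.normal(size=200)
toy = pd.DataFrame({f"item{i}": common + rng.normal(size=200) for i in range(1, 6)})
print(round(cronbach_alpha(toy), 3))
```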


Composite Reliability (CR): Composite reliability is an alternative way to measure the reliability of a particular construct. It is obtained by "combining all of the true score variances and co-variances in the composite of indicator variables related to constructs, and by dividing this sum by the total variance in the composite". It measures the overall reliability of a latent construct based on factor loading output. The accepted threshold for composite reliability is considered to be 0.70 [22].

The mathematical expression for calculating composite reliability (CR) is

CR = (Σ λi)² / [(Σ λi)² + Σ V(δi)]

where

λi = the completely standardized loading for the ith indicator,

V(δi) = the variance of the error term for the ith indicator, and

p = the number of indicators (both sums run over i = 1, …, p).
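A sketch of this calculation from a set of completely standardized loadings; the loading values are invented, and the error variance of each indicator is taken as 1 − λi², which holds only in the completely standardized case.

```python
# Sketch: composite reliability (CR) from completely standardized loadings.
import numpy as np

loadings = np.array([0.72, 0.65, 0.80, 0.68])   # invented standardized loadings
error_var = 1.0 - loadings ** 2                 # V(delta_i) in the standardized case

cr = loadings.sum() ** 2 / (loadings.sum() ** 2 + error_var.sum())
print(round(cr, 3))                             # 0.70 or above is conventionally acceptable
```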


Construct Validity of the new tool (PIC)

Construct validity is one way to measure the validity of a construct or scale: it assesses how well the scale is constructed and how well it measures the component or variable it is supposed to measure [23]. The most common way to assess the construct validity of a scale is to compare it with pre-existing tools measuring the same construct. If the outcome is significant, construct validity can be said to be established. There are two types of construct validity: (i) convergent validity and (ii) discriminant (divergent) validity.

Convergent Validity: Convergent validity examines the extent to which the newly developed construct converges with pre-existing tools measuring the same construct. The scores of the new tool are correlated with the scores of the pre-existing tool, and the level of significance is examined (Strauss & Smith, 2009). For convergent validity, the average variance extracted (AVE) should be greater than 0.5; however, Fornell and Larcker [24] have suggested that if the AVE falls below the cut-off of 0.5 but the composite reliability (CR) is above 0.6, the convergent validity of the construct can still be considered adequate [25].

Average Variance Extracted (AVE): AVE is a measure used to assess convergent validity. It is the average amount of variance in the indicator variables that the construct explains [26].

AVE = Σ λi² / (Σ λi² + Σ Var(εi))

where

 K = the number of items (both sums run over i = 1, …, K),

 λi = the factor loading of the ith item, and

 Var(εi) = the variance of the error of the ith item.
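Using the same invented standardized loadings as in the composite-reliability sketch above, AVE can be computed as follows.

```python
# Sketch: average variance extracted (AVE) from standardized factor loadings,
# with the error variance of each item taken as 1 - loading^2.
import numpy as np

loadings = np.array([0.72, 0.65, 0.80, 0.68])   # invented standardized loadings
error_var = 1.0 - loadings ** 2

ave = (loadings ** 2).sum() / ((loadings ** 2).sum() + error_var.sum())
print(round(ave, 3))                            # 0.50 or above suggests convergent validity
```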

Discriminant Validity: Discriminant validity is the other type of construct validity and is, in a sense, the opposite of convergent validity. A newly formed measure should not be highly correlated with pre-existing measures of distinct constructs. Discriminant validity coefficients should be noticeably smaller in magnitude than convergent validity coefficients [27].


Conclusion

The construction of a psychosocial scale is undeniably effortful. A sound theoretical understanding of the psychosocial construct, together with the necessary statistical knowledge, can be a boon. The intention of this review paper is to offer a comprehensive overview of PFA so that its intimidating reputation can be dispelled. This simplified account of the steps of PFA may encourage young researchers to construct psychosocial scales in the Indian context.

Financial Support and sponsorship

Nil

Conflicts of Interest

There are no conflicts of interest.