QWES Data Appendix


A NOTE ABOUT THE DATA

Seasonality and trend adjustments
When appropriate, data in the QWES are seasonally adjusted. In some cases, instead of actual data, we present trend values, which are better for illuminating the underlying movement of the data series. (See the data appendix for a discussion of how these series are constructed.)

Statistical significance
The data in the QWES come from the Current Population Survey (CPS), a monthly survey of U.S. households used by the Bureau of Labor Statistics (BLS) to derive labor market statistics. As with any data series drawn from a sample, these data include some sampling error, and therefore it is important to address the issue of statistical significance. This issue is particularly relevant in the analysis of quarterly changes because, in some cases, changes will be statistically insignificant, i.e., indistinguishable from random or chance movements.

In most of the QWES series, small changes are insignificant. “Small” in this case should be taken to mean changes in unemployment or underemployment rates of less than half a percentage point for aggregates (such as the “all” group) and 1.5 percentage points for subgroups (such as racial minorities). For wages, changes of less than 1% are rarely significant; for racial, gender, and other subgroups, changes of less than 2% are generally not significant.
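These rules of thumb can be collected into a small helper. This is a hypothetical illustration of the thresholds stated above, not a BLS or EPI formula; the function name and interface are our own.

```python
# Hypothetical helper encoding the rough significance thresholds from the
# text. These are rules of thumb, not formal standard-error calculations.
def is_likely_significant(change, series, group="all"):
    """Return True if a quarterly change clears the rough thresholds.

    change -- absolute change (percentage points for rates, percent for wages)
    series -- "rate" (unemployment/underemployment) or "wage"
    group  -- "all" for aggregates, "subgroup" for demographic subgroups
    """
    thresholds = {
        ("rate", "all"): 0.5,       # half a percentage point
        ("rate", "subgroup"): 1.5,  # 1.5 percentage points
        ("wage", "all"): 1.0,       # 1 percent
        ("wage", "subgroup"): 2.0,  # 2 percent
    }
    return abs(change) >= thresholds[(series, group)]
```

For example, a 0.6 percentage-point change in the aggregate unemployment rate clears the 0.5-point threshold, while the same change for a subgroup does not clear 1.5 points.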

Note also that even significant quarterly changes are not necessarily indicative of a longer-term trend. Wage growth, for example, may be significantly above trend in a given quarter and then revert back to trend in following quarters. In this regard, many of the findings in the QWES are informative about the most recent trends (i.e., within the past year) in the labor market. Only Table 5, which has data back to 1989 (the last business cycle peak), sheds light on longer-term trends.

The data in the QWES are derived from the Current Population Survey (CPS), a monthly survey of U.S. households used by the Bureau of Labor Statistics (BLS) to derive labor market statistics. We combine three months of this survey for each quarter and derive the un- and underemployment rates from these data. Each month, one-quarter of the sample is asked questions about earnings and hours of work; we use these variables to derive hourly wage rates. EPI’s methodology for these calculations is described in Appendix B of The State of Working America 1998-99, which is available on our web site at www.epinet.org (click on DataZone). Overall sample sizes in 1999:1 were 19,300 respondents (age 18-25 with a high school education or less) for the full survey and 37,600 respondents (wage and salary workers age 18-64) for the quarter sample.

Underemployment: Underemployment expands the definition of unemployment to include discouraged workers (those who unsuccessfully sought work within the past 12 months and have given up the job search), persons working part time who would prefer full-time work, and those whose labor market participation is blocked by barriers such as lack of transportation or child care.
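The expanded definition can be expressed as a simple rate calculation. This is an illustrative sketch of how such a rate is typically constructed, not EPI's actual code, and the counts in the example are made up: workers counted in the expanded numerator but outside the official labor force (discouraged workers and those blocked by barriers) must be added to the denominator as well.

```python
def underemployment_rate(unemployed, involuntary_part_time,
                         marginally_attached, labor_force):
    """Underemployment rate in percent, per the expanded definition.

    marginally_attached covers discouraged workers and those whose
    participation is blocked by barriers (e.g., transportation, child
    care); they are outside the official labor force, so they enter
    both the numerator and the denominator. All counts in thousands.
    """
    numerator = unemployed + involuntary_part_time + marginally_attached
    denominator = labor_force + marginally_attached
    return 100.0 * numerator / denominator

# Hypothetical counts (thousands): the underemployment rate exceeds the
# official unemployment rate of 100 * 6000 / 140000, about 4.3%.
rate = underemployment_rate(6000, 3000, 1000, 140000)
```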

Race: Unlike some BLS tabulations, in which Hispanics can be classified as members of any racial group, our racial categories are mutually exclusive. Thus, whites and African Americans are non-Hispanic in these data.

Occupation: White-collar workers include executives, managers, professional, technical, sales, and clerical workers (about 60% of the workforce in 1999:1). Blue-collar workers include production workers, machine operators, transportation workers, and material and equipment handlers (about 26% of the workforce in 1999:1). Other includes all other occupations, notably services.

Entry level (1-10 years of potential experience): A variable for actual labor force experience is not available in the CPS, so we use the “potential experience” construct common to labor market analyses of these data, i.e., age minus years of education minus 6. Since education coding is categorical for the data in the QWES (i.e., highest degree attained), we use an imputation procedure described in the State of Working America appendix noted above.
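The construct above can be sketched in a few lines. The degree-to-years mapping below is an illustrative assumption for demonstration purposes; EPI's actual imputation procedure is the one described in the State of Working America appendix.

```python
def potential_experience(age, years_of_education):
    """Potential labor market experience: age - years of education - 6,
    the standard construct when actual experience is unobserved."""
    return age - years_of_education - 6

# Because the CPS codes education as highest degree attained, years of
# education must first be imputed from the degree category. This mapping
# is a hypothetical stand-in for the actual imputation procedure.
YEARS_BY_DEGREE = {
    "less than high school": 10,
    "high school": 12,
    "some college": 14,
    "bachelor's": 16,
    "advanced": 18,
}

def is_entry_level(age, degree):
    """True if the worker has 1-10 years of potential experience."""
    return 1 <= potential_experience(age, YEARS_BY_DEGREE[degree]) <= 10
```

A 25-year-old with a high school diploma has 25 - 12 - 6 = 7 years of potential experience and is classified as entry level.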

Seasonal adjustment: When using data on other than an annual basis, it is important to extract seasonal effects so that neighboring quarters can be reliably compared. For example, demand for sales workers would be expected to increase each December, and this demand is likely to be reflected in fourth-quarter unemployment rates. In order to determine reliably the underlying trend in the labor market variables, this seasonal effect must be extracted.

However, there are economic time series, some of which are in the QWES, that do not have a seasonal component (more technically, they do not have one that can be reliably extracted). In these cases, the analyst can either present the unadjusted data (i.e., unadjusted for seasonality; note that all wage data are adjusted for inflation) or, if possible, extract the underlying trend from the data. For many of the wage variables we have followed this latter course. In this case, the data move in a clear direction over time, but the “noise” in the series makes it difficult to observe the underlying trend. (“Noise” in this context refers to random movements unrelated to the seasonal or trend/cycle components of the time series.)

Adjustments for seasonality and trend extraction are made using the U.S. Census Bureau’s X-12-ARIMA software, an extension of the X-11 procedure used by the BLS. We generally follow the same procedure as the BLS, as described in any issue of the monthly BLS publication Employment and Earnings. One difference is that we take advantage of X-12’s automatic outlier identification process, which is not available in X-11. Also, as noted above, for some of our series we report the extracted trend from the data; BLS presents only unadjusted or seasonally adjusted data.
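X-12-ARIMA combines moving-average filters with ARIMA modeling and is far more sophisticated than anything sketched here, but the core smoothing idea can be illustrated with the classical centered moving average for quarterly data. This is a minimal pure-Python illustration, not the procedure actually used for the QWES.

```python
# Minimal illustration of trend extraction: a 2x4 centered moving average
# for quarterly data. Averaging two adjacent 4-quarter windows centers the
# result on an observation and washes out a stable seasonal pattern.
def centered_moving_average(series, window=4):
    """Return trend values for a quarterly series (None where the
    window does not fit at the ends)."""
    n = len(series)
    half = window // 2
    trend = [None] * n
    for i in range(half, n - half):
        first = sum(series[i - half:i + half]) / window
        second = sum(series[i - half + 1:i + half + 1]) / window
        trend[i] = (first + second) / 2
    return trend

# A made-up series: trend rising 0.1 per quarter plus a seasonal swing.
rates = [4.0, 5.0, 4.2, 5.2, 4.4, 5.4, 4.6, 5.6]
trend = centered_moving_average(rates)
```

In this toy example the raw series zigzags between quarters, but the extracted trend rises smoothly by 0.1 per quarter.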

Table A1 provides a variable guide for the QWES, showing which variables are seasonally adjusted (s), trend adjusted (t), and not adjusted (n).

Table A1: Variable guide

Standard errors: Whenever dealing with statistics, it is important to account for sampling error: the random fluctuations in sample data that lead to some uncertainty about the point estimates presented in the tables. Statisticians have derived methods to quantify this uncertainty, and these techniques enable us to determine the probability that a given change is statistically significant, i.e., whether the observed change is distinguishable from random or chance movements.

BLS has developed the methodology for deriving standard errors for various types of persons (e.g., teenagers, adults, minorities) using CPS data, but these are not always the groups we include in the QWES. Therefore, we use a methodology known as “bootstrapping” to derive standard errors for the series in the QWES. This method, described in Efron and Tibshirani (1993),[2] re-samples (with replacement) the CPS data used for the estimates in the QWES and creates a new vector of sampled estimates, each of which differs somewhat because of the re-sampling procedure. The standard deviation of this vector is the “bootstrapped” standard error that we use to determine significance for the series in this report.
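The bootstrap procedure just described can be sketched in a few lines. This is a generic textbook version of the Efron and Tibshirani method applied to a made-up sample, not EPI's production code, and it ignores the CPS's complex survey design and weights.

```python
import random
import statistics

def bootstrap_se(sample, statistic, reps=500, seed=0):
    """Bootstrap standard error of a statistic: re-sample the data with
    replacement, recompute the statistic each time, and take the standard
    deviation of the vector of re-sampled estimates."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = [
        statistic([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(estimates)

# Illustrative use: an unemployment rate is the mean of 0/1 indicators
# (1 = unemployed) over labor-force participants. Hypothetical data:
# 50 unemployed out of 1,000, i.e., a 5% rate.
indicators = [1] * 50 + [0] * 950
se = bootstrap_se(indicators, statistics.mean)
```

For a simple proportion like this, the bootstrapped standard error should land near the textbook value sqrt(p(1-p)/n), roughly 0.007 here; the method's value is that it works the same way for statistics with no such closed-form formula.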

In order to test the reliability of this method, we derived some series that were identical to those published by the BLS. We then calculated and compared the standard errors based on the BLS methods and the bootstrap method. These yielded very similar results, leading us to have confidence that the bootstrap approach is a good proxy for the BLS method. For example, using the parameters in the January Employment and Earnings (Tables 1-C and 1-H), BLS gives the standard error for the change from 1998:4 to 1999:1 in teenage unemployment rates as 0.79 percentage points. Our bootstrap estimate for this value is 0.60. For males 20 years and older, the comparable standard errors are 0.13 from the BLS method and 0.10 from the bootstrap. Thus, our bootstrap approach appears to generate standard errors that have a downward bias. While we plan to continue to improve our approach to this important question of statistical significance, for this issue we made sure to consider changes significant only when they surpassed traditional acceptance levels.

Endnotes
1. Since the standard error for this group is 0.6, the 1.1 percentage-point decline is marginally significant (see appendix for a discussion of this issue).

2. See Bradley Efron and Robert J. Tibshirani, An Introduction to the Bootstrap, New York: Chapman and Hall, 1993.
