In Stata, this command is subpop, while in SAS the command is DOMAIN. The CLUSTER statement specifies cluster identification variables in a clustered sample design. Each analysis application emphasizes procedure selection, syntax generation, and interpretation of results. The SUBPOPX statement is used to subset the data set to adults with arthritis, because these are the only.
The subpopulation is defined as white mracrpi21 and at least 25 years old. An introduction to the SAS survey analysis PROCs, Lex Jansen. How can I analyze a subpopulation of my survey data in Stata? However, the subpopulation analysis for jackknife repeated replication is not available in SAS 9. Calculating subset weighted analysis using PROC SURVEYFREQ and. Remarks and examples: subpopulation estimation involves computing point and variance estimates for part of the. How to prepare and analyze pair data in the national.
Variances will be computed using the Taylor linearization method, assuming a with-replacement (WR) design sample weight. The analysis of subpopulations is one place where survey data and. These sample survey procedures use SAS syntax that will be familiar to those. In SAS, the DOMAIN statement is used to identify records to include in analyses. Without the RLABEL statement, SAS variable labels would be produced if already defined. Appendices A and B are based on more advanced material from references 1 and 2 in Appendix E. It is recommended that you use SAS to do as many of the problems as possible. Oct 10, 2010: For the subpopulation with the highest Ki67 LI, the hazard ratio of a DFS event for patients in the letrozole group was less than half that for patients in the tamoxifen group. Longitudinal models are two-level models in conventional multilevel programs, whereas they are single-level models in Mplus. UCLA further adds that using if in the subpop option does not remove cases from the analysis. Most of the code shown in this seminar will work in earlier versions of SAS and SAS/STAT. The analysis of domains, sometimes called subpopulations, is discussed.
Subpopulation treatment effect pattern plot (STEPP) for. Subpopulation command, National Center for Education. How can I analyze a subpopulation of my survey data in. The paper analyzes a simulated data set and considers questions related to estimating the effect. This paper presents how to conduct a general subpopulation analysis for jackknife replicate weight designs using the health information national. In January 2009, CDC produced a report, Software for. The SURVEYMEANS procedure can also apply poststratification. See chapter 85, the SURVEYMEANS procedure, for more information. You cannot specify both of these options in the same PROC SURVEYMEANS statement. Poststratification is often used to improve the efficiency of the analysis and adjust for. Subdomain analyses with PROC GLIMMIX, Apr 2, 2019. Introduction to SAS for data analysis, UNCG quantitative methodology series: composing a program. SAS requires that a complete module of code be executed in order to create and manipulate data files and perform data analysis.
Analysis of survey data using the SAS survey procedures. A statistical population can be a group of existing objects e. Three-level analysis where time is the first level, individual is the second level, and cluster is the. In general, first a data file must be created using a DATA step. A closer examination of subpopulation analysis of complex. Unequal probability of selection is an inevitable feature of complex sampling surveys.
SUDAAN: SUDAAN version 10 and higher is designed to analyze data from complex surveys and experimental studies. Before the introduction of the survey procedures, SAS could not be used for variance. In this article, we aim to focus in more detail on the practical issues underlying subpopulation analysis of complex-sample survey data. The survival time of each member of a population is assumed to follow its own hazard. This can be the result of stratified sampling, cluster sampling, subpopulation oversampling, designed unequal probability sampling, and so on. SAS versions 8 and higher include special sample survey procedures that are appropriate for analyzing complex survey data. The following SAS statements perform the analysis of the survey data. Software for analysis of YRBS data, Centers for Disease. The formulas for the estimates of a total and its variance for a simple sampling design with no clusters or stratification are shown in the entry for subpopulation estimation in the Stata Survey Data Reference Manual ([SVY]). The formation of these subpopulations of interest might be unrelated to the sample design. We chose to take 200 patients in each subpopulation and 199 overlapping patients in tfd analysis. Estimating Lorenz and concentration curves in Stata. Multilevel modeling with complex survey data 263 analysis.
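The estimate of a subpopulation total and its variance under a simple design with no clusters or stratification, mentioned above, can be sketched in Python. This is a minimal illustration, not a transcription of the manual's formulas: it assumes the common with-replacement form, in which every sampled case stays in the calculation and cases outside the subpopulation contribute zero. The function name subpop_total and the toy data are hypothetical.

```python
def subpop_total(y, w, in_subpop):
    """Estimate a subpopulation total and its variance.

    Assumes a simple with-replacement design (no strata, no clusters).
    Cases outside the subpopulation are kept in the sample with a
    contribution of zero instead of being dropped.
    """
    n = len(y)
    # z_j = indicator_j * weight_j * y_j
    z = [wj * yj if dj else 0.0 for yj, wj, dj in zip(y, w, in_subpop)]
    total = sum(z)
    mean_z = total / n
    # Variance uses all n sampled cases, not only the subpopulation members
    variance = n / (n - 1) * sum((zj - mean_z) ** 2 for zj in z)
    return total, variance


# Hypothetical toy data: four sampled cases, two in the subpopulation
y = [2.0, 4.0, 6.0, 8.0]
w = [10.0, 10.0, 10.0, 10.0]
d = [True, False, True, False]
total, variance = subpop_total(y, w, d)
print(total, variance)
```

The design choice to note is that n counts every sampled case, which is what distinguishes subpopulation estimation from simply deleting the cases outside the subpopulation.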
To avoid confusion between the two uses, the term analysis class will replace the term analysis domain in. Description: subpopulation estimation focuses on part of the population. This paper presents practical guidance on three common survey data analysis techniques. Continuous: factor analysis (LISREL); discrete: FA, IRT (item response); discrete: latent profile, growth mixture, latent class analysis, regression; general software. Domain analysis takes this variability into account by using the entire sample to estimate the variance of domain estimates. Comparing results from Cox proportional hazards models using. The rate and total options apply only to Taylor series or. To find out what version of SAS and SAS/STAT you are running, open SAS and look at the information in the log file. The MIXTRAN, DISTRIB, and INDIVINT SAS macros are provided as separate programs. Applications focus on key SAS survey procedures including PROC SURVEYMEANS, PROC SURVEYFREQ, PROC SURVEYREG, PROC SURVEYLOGISTIC, and PROC SURVEYPHREG. Analysis of complex survey data with SAS, Lex Jansen. Nonzero values are included in the analysis, except for missing values, which are excluded from the analysis.
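The inclusion rule just stated (nonzero values are in, zero and missing values are out) can be written as a small Python predicate. This is only a sketch of the rule as worded here; the name in_subpop and the use of None or NaN to represent a missing value are assumptions made for the illustration, not the behavior of any particular package.

```python
import math


def in_subpop(value):
    """Return True when a case belongs to the subpopulation.

    Rule sketched from the text: nonzero values are included;
    zero and missing values are excluded. Missing is represented
    here (hypothetically) as None or NaN.
    """
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False
    return value != 0


# Hypothetical indicator column with a zero and two kinds of missing value
flags = [1, 0, None, float("nan"), 2.5]
membership = [in_subpop(v) for v in flags]
print(membership)
```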
The correct bibliographic citation for the complete manual is as follows. Jackknife and balanced repeated replication methods for variance estimation, subpopulation analysis with use of the domain option and the subsetting approach and the use of the. [Figure: interim analysis in a group sequential design; randomization to chemotherapy or new drug; interim 1, interim 2, final.] Subpopulation estimation from the technical point of view.
Subpopulation: definition of subpopulation by the free. NCES 2014-364 will help you use subsetting commands in SAS and Stata to. The remainder of the paper consists of examples that demonstrate these procedures. Part III contains appendices dealing with more advanced features of SAS, such as matrix algebra. This entry discusses subpopulation estimation and explains why you should use the subpop option instead of if and in for your survey data analysis. SAS/STAT software fact sheet: organizations in every field depend on data and analysis to provide new insights, gain competitive advantage and make informed decisions. This document is an individual chapter from SAS/STAT 9. For many organizations, the complexity and volume of their data has outgrown the capabilities of other statistical software. Half the class was asked to run in place between the two readings and the other. Number of observations for subpopulation estimation. For more information about domain analysis, see Lohr, Cochran, Fuller et al. How to prepare and analyze pair data in the National Survey on Drug Use and Health, Substance Abuse and Mental Health Services Administration, Center for Behavioral Health Statistics and Quality.
These programs can be saved in a SAS macro library, or called using the %include statement in a SAS program. Keeping n1 close to n2 produces limited patient turnover from window to window, which smooths the resulting curve sufficiently. The WEIGHT statement names the sampling weight variable. SUDAAN version 10 and higher offers analysis capabilities that include crosstabulation, frequency, ratio, and multiple regression modeling techniques. Commonly used techniques such as subpopulation analyses and hypothesis tests are included. Thus, when data on an analysis variable are missing for all respondents in one. The PHREG procedure performs regression analysis of survival data based on the Cox proportional hazards model. When the subpopulation option is used, only the cases defined by the subpopulation are used in the calculation of the estimate, but all cases are used in the calculation of the standard errors.
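The difference between a subpopulation (domain) analysis and simply dropping the other cases can be seen in a small Python sketch. It assumes a stratified, clustered, with-replacement design and a Taylor-linearized ratio mean; the function domain_mean_se, the dictionary keys, and the toy data are all hypothetical, and the sketch is not the algorithm of any specific package.

```python
from collections import defaultdict


def domain_mean_se(rows, drop_nondomain=False):
    """Weighted domain mean and Taylor-linearized standard error.

    rows: list of dicts with keys stratum, psu, w, y, d (d is True
    for domain members). With drop_nondomain=True the non-domain
    cases are deleted first, which is the subsetting approach.
    """
    if drop_nondomain:
        rows = [r for r in rows if r["d"]]
    den = sum(r["w"] for r in rows if r["d"])
    num = sum(r["w"] * r["y"] for r in rows if r["d"])
    mean = num / den
    # Linearized scores, summed to PSU totals within each stratum;
    # non-domain cases score zero but keep their PSU in the design
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        score = r["w"] * (r["y"] - mean) / den if r["d"] else 0.0
        psu_totals[r["stratum"]][r["psu"]] += score
    var = 0.0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            continue  # a single-PSU stratum contributes nothing in this sketch
        mean_h = sum(psus.values()) / n_h
        var += n_h / (n_h - 1) * sum((t - mean_h) ** 2 for t in psus.values())
    return mean, var ** 0.5


# Hypothetical data: one stratum, three PSUs, PSU 3 has no domain members
rows = [
    {"stratum": 1, "psu": 1, "w": 1.0, "y": 10.0, "d": True},
    {"stratum": 1, "psu": 1, "w": 1.0, "y": 12.0, "d": True},
    {"stratum": 1, "psu": 2, "w": 1.0, "y": 20.0, "d": True},
    {"stratum": 1, "psu": 2, "w": 1.0, "y": 5.0, "d": False},
    {"stratum": 1, "psu": 3, "w": 1.0, "y": 7.0, "d": False},
    {"stratum": 1, "psu": 3, "w": 1.0, "y": 9.0, "d": False},
]
mean_full, se_full = domain_mean_se(rows)
mean_sub, se_sub = domain_mean_se(rows, drop_nondomain=True)
print(mean_full, se_full, mean_sub, se_sub)
```

Both calls give the same point estimate, but the standard errors differ because subsetting removes PSU 3 from the design and changes the PSU count used in the variance calculation.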
Basic statistical and modeling procedures using SAS. As stated in the Stata survey manual, when the subpop option is used, the subpopulation is actually defined by the 0s (false), which indicate those cases to be excluded from the subpopulation. The correct bibliographic citation for this manual is as follows. The first, pulse, has information collected in a classroom setting, where students were asked to take their pulse two times.
The DOMAIN statement is available only for the Taylor series method. CDC produced a report, Software for Analysis of YRBS Data, describing the use of SAS, SUDAAN, Stata, SPSS, and Epi Info for analyzing data. The SAS manual suggests that using a WHERE statement or creating a dataset with. SAS manual for Introduction to the Practice of Statistics, third.
Estimating Lorenz and concentration curves in Stata, Ben Jann: this paper is forthcoming in the Stata Journal. The purpose of this workshop is to explore some issues in the analysis of survey data using SAS 9. If only one subpopulation analysis is desired, a SAS technique that assigns a minuscule weight to all nondomain cases in a DATA step prior to the analysis limits the analysis to a selected domain and eliminates unnecessary processing. This scenario provides relatively precise estimates of treatment effect and a large number of windows.
Using Stata for subpopulation analysis of complex sample. SAS-callable SUDAAN code for data manipulation and crosstab. Analysis for domains that uses the entire sample is called domain analysis (also called subgroup analysis, subpopulation analysis, or subdomain analysis). A part or subdivision of a population, especially one originating from some other population. Hypothesis testing for contrasts of subpopulation estimates. STEPP analysis results were suggestive of heterogeneous relative treatment effects, although this was not statistically significant p.
The RATIO statement requests ratio analysis for means or proportions of analysis variables. With complex survey data, the analyst explicitly acknowledges that the data were. This example was run in SAS-callable SUDAAN, and the SAS program and. We use the SUBPOPX statement to subset the analysis to the subpopulation of interest. In some survey data analysis programs, such as SUDAAN, this information will be. We are assuming the SAS-callable SUDAAN program is installed on your computer. SAS does not have a statement that allows a subpopulation e. PROC SURVEYMEANS also performs poststratification, which adjusts the sampling weights so that their distribution matches known auxiliary information. Introduction to survey sampling and analysis procedures, book excerpt. Several statistical packages, including Stata, SAS, SUDAAN.
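The poststratification adjustment described above, in which sampling weights are rescaled so that their distribution matches known auxiliary information, can be sketched in a few lines of Python. This is a generic illustration of the idea and not the PROC SURVEYMEANS implementation; the function poststratify, the group labels, and the known totals are hypothetical.

```python
def poststratify(weights, groups, known_totals):
    """Rescale weights so each group's weight sum matches a known total.

    weights: sampling weights, one per case
    groups: poststratum label for each case
    known_totals: dict mapping each label to its known population total
    """
    # Current weighted total within each poststratum
    sums = {}
    for w, g in zip(weights, groups):
        sums[g] = sums.get(g, 0.0) + w
    # Multiply each weight by (known total / current weighted total)
    return [w * known_totals[g] / sums[g] for w, g in zip(weights, groups)]


# Hypothetical example: the sample weights overrepresent group "f"
weights = [10.0, 10.0, 20.0, 20.0]
groups = ["m", "m", "f", "f"]
known = {"m": 30.0, "f": 30.0}
adjusted = poststratify(weights, groups, known)
print(adjusted)
```

After the adjustment, the weights in each group sum to the known total for that group, which is the sense in which the weight distribution matches the auxiliary information.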
The 2x2 table to be used for estimation of odds ratios and relative risks is defined by the last two variables. Cox's semiparametric model is widely used in the analysis of survival data to explain the effect of explanatory variables on survival times. In statistics, a population is a set of similar items or events which is of interest for some question or experiment. The DOMAIN statement lists the variables that define domains for subpopulation analysis. Integrating the pdf over a range of survival times gives the probability of observing a survival time within that interval. Basic statistical and modeling procedures using SAS, one-sample tests: the statistical procedures illustrated in this handout use two datasets. Analysis of survey data: although some survey data analysis procedures are available in SAS 9. The SAS-callable SUDAAN code for this example is displayed in Exhibit 1. July 16 Thurs lecture 89 West regression analysis of complex sample survey data. See chapter 96, the SURVEYMEANS procedure, for more information. Using Stata for subpopulation analysis of complex sample survey data: in this presentation, I provide an overview of important considerations that analysts of large public-use.