Some commonly used options can change what the tables produced by tab look like, as described in the sections below:

If you plan to carry out the examples in this article, make sure you've downloaded the GSS sample to your U:\SFS folder as described in Managing Stata Files. Then create a do file called descriptives.do in that folder as described in Doing Your Work Using Do Files and start with the following code:

If you plan on applying what you learn directly to your homework, create a similar do file but have it load the data set used for your assignment.

Frequencies for a Single Categorical Variable
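The setup instructions mention starting code for descriptives.do, but that code does not appear in this text. What follows is only a rough sketch of what such a do file might look like, not the article's original. The data set file name (gss_sample) and the variable name (sex) are assumptions, and the do file is assumed to run from the folder that holds the data.

```stata
* descriptives.do: hypothetical skeleton, not the article's original code
capture log close                      // close any log left open by an earlier run
log using descriptives.log, replace    // save the output to a log file
clear all                              // start with nothing in memory
set more off                           // do not pause after each screen of output

use gss_sample                         // assumed file name for the GSS sample

* do your work here, for example a one-way frequency table
tabulate sex                           // sex is an assumed variable name

log close
```

For homework, the same skeleton should work if the use command loads the assignment's data set instead.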
These are examples of multivariate statistics.

Commands Used

All of these tasks can be carried out using just two Stata commands: tabulate (or tab) and summarize (or sum). Getting them to do all these things is simply a matter of applying Stata syntax, so if you've read How Stata Commands Work this section will have no surprises for you.
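As a quick illustration of the two commands and their abbreviations, the lines below show each in its simplest form. The variable names sex and educ are assumptions made for the example; substitute variables from your own data set.

```stata
* assumed variables: sex (categorical) and educ (quantitative)
tabulate sex        // frequency table for one categorical variable
tab sex             // the same command, abbreviated

summarize educ      // observations, mean, standard deviation, minimum, maximum
sum educ            // the same command, abbreviated
```

The full and abbreviated forms produce identical output, so which one to use is a matter of preference.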
Quantitative variables are often called continuous variables. Means are often called averages, and variance is just the standard deviation squared.

Frequencies for Two Categorical Variables

For two categorical variables, frequencies tell you how many observations fall in each combination of the two categorical variables (like black women or Hispanic men) and can give you a sense of the relationship between the two variables. These are examples of bivariate statistics, or statistics that describe the joint distribution of the two variables. Tables of frequencies for two variables are often called two-way tables, contingency tables, or crosstabs.

Summary Statistics for One Quantitative Variable over One Categorical Variable

For a quantitative variable and a categorical variable, the mean value of the quantitative variable for those observations that fall in each category of the categorical variable can give you a sense of how the two variables are related. Often the question of interest is whether the distribution of the quantitative variable is different for different categories. These are also examples of bivariate statistics.

Frequencies for Three or More Categorical Variables

For three or more categorical variables, frequencies will tell you how many observations fall in each combination of the variables and give you a sense of their relationships, just as they did with two categorical variables.

Summary Statistics for One Quantitative Variable over Two or More Categorical Variables

For a quantitative variable and two or more categorical variables, the mean value of the quantitative variable for those observations in each combination of the categorical variables can give you a sense of how the variables are related, just as it did with a quantitative variable and one categorical variable.
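The bivariate and multivariate topics in this list can each be produced with variations on tab. The sketch below shows one possible command for each. The variable names (race, sex, region, educ) are assumptions for illustration, and these are not necessarily the forms the rest of the article uses.

```stata
* assumed variables: race, sex, region (categorical) and educ (quantitative)

* frequencies for two categorical variables (a two-way table)
tab race sex

* summary statistics for one quantitative variable over one categorical variable
tab sex, summarize(educ)              // mean, standard deviation, and frequency of educ by sex

* frequencies for three categorical variables: one two-way table per region
bysort region: tab race sex

* mean of a quantitative variable in each combination of two categorical variables
tab race sex, summarize(educ) means   // the means option shows only the means
```

The bysort prefix sorts the data by region and then repeats the table for each value, which is one simple way to bring in a third categorical variable.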
However, the median and percentiles often give you a better sense of how the variable is distributed, especially for variables whose distributions are not symmetric (like income, which often has a few very high values).
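To see the median and percentiles alongside the mean, summarize accepts a detail option. In this minimal sketch, income is an assumed variable name.

```stata
* income is an assumed variable name
summarize income            // mean and standard deviation only
summarize income, detail    // adds percentiles (the 50th is the median), variance, skewness, and kurtosis
```

If the mean is well above the 50th percentile, that is a sign of a few very high values pulling the mean up.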
This article is part of the Stata for Students series. If you are new to Stata we strongly recommend reading all the articles in the Stata Basics section.

Descriptive statistics give you a basic understanding of one or more variables and how they relate to each other.

Topics Covered in this Section

Frequencies for a Single Categorical Variable

For a variable that describes categories (like sex or race) rather than quantities (like income), frequencies tell you how many observations are in each category. These are examples of univariate statistics, or statistics that describe a single variable.

Categorical variables are also sometimes called factor variables. Indicator variables (also called binary or dummy variables) are just categorical variables with two categories. Frequency tables for a single variable are sometimes called one-way tables.

Summary Statistics for a Single Quantitative Variable

For a variable that describes quantities (like income), the mean tells you what the expected value of the variable is, and the standard deviation tells you how much it varies.