How to Read in SAS Format Catalogs

The Complete SAS Format Guide

Would you like to better understand how to use SAS formats to change the appearance of your data sets and output? Are you interested in learning how to use PROC FORMAT to create your own custom formats? Do you want to use custom formats to quickly and efficiently categorize your data and enhance your analysis? Are you looking for ways to permanently store and manage custom SAS formats?

This article will address all of the above and provide an extensive end-to-end guide on creating, using and managing SAS formats.

In particular, this article will cover:

  1. Using Built-in SAS Formats
    (a) Character Formats
    (b) Numeric Formats
    (c) Date Formats
  2. PROC FORMAT
    (a) Creating a Simple Numeric Format
    (b) Creating a Character Format
    (c) Creating a Numeric Format with Ranges
    (d) Saving and Retrieving a Permanent Format Catalog
    (e) Viewing Formats in a Catalog
    (f) Importing and Exporting Format Catalogs

Software

Before we continue, make sure you have access to SAS Studio. It's free!

Data Sets

The following datasets from the SASHELP library will be used in this article:

  1. CARS – Information about 2004 cars
  2. ORSALES – Fictitious Sports and Outdoors Store sales data
  3. PRICEDATA – Simulated monthly sales data

Using Built-in SAS Formats

SAS provides a vast array of built-in formats to modify the appearance of character, numeric and date values. With any SAS format, it is important to keep in mind that the format does not modify the actual values in the dataset, only how they are displayed.

Both built-in formats and custom formats follow a specific naming convention. For both built-in and custom formats, character formats always start with a dollar sign ($) while numeric formats do not. With both character and numeric built-in formats, the format name ends in either a "w." (width) or a "w.d" for the width and the number of digits that will be shown to the right of the decimal point.

Here are a few sample built-in SAS formats and their naming conventions:

  1. $UPCASEw. – Example: $UPCASE9. is a character format called "UPCASE" with width 9
  2. DATEw. – Example: DATE9. is a numeric format called DATE with width 9
  3. DOLLARw.d – Example: DOLLAR10.1 is a numeric format called DOLLAR with width 10 and 1 digit to the right of the decimal point
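
To see the naming convention in action, here is a minimal sketch that writes three values to the SAS log using the sample formats above. The variable names (NAME, AMOUNT, SALE_DATE) and the values are made up for illustration and do not come from any SASHELP dataset:

data _null_;
 /* illustrative values only, not taken from the article's datasets */
 name = 'sas formats';
 amount = 1234.56;
 sale_date = '15MAR2004'd;

 put name $upcase11.;     /* character format, width 11 */
 put amount dollar10.1;   /* width 10, 1 digit after the decimal point */
 put sale_date date9.;    /* date value displayed as ddMONyyyy */
run;

The log should show the text in upper case, the amount with a dollar sign and a single decimal digit, and the date as 15MAR2004, while the stored values themselves stay unchanged.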

Next, let's walk through a few examples of using these formats to understand how SAS formats work.

Character Formats

A character format is a format that can be used with a character variable in a SAS data set. As mentioned, one example of a built-in character format is the $UPCASEw. format. The $UPCASE format can be used to display all the letters in a variable in upper case.

For example, to convert the names of the car makes in the SASHELP.CARS dataset to upper case, we can use the FORMAT statement with the $UPCASE format as follows:

data cars_upcase;
 set sashelp.cars;

format make $upcase.;
run;

As you can see from the output dataset shown partially below, all the car makes are now in upper case and the properties of MAKE show that the $UPCASE format has been applied. Note that $UPCASE was automatically adjusted to $UPCASE13. in this case since the length of the original MAKE variable was 13:

Recall that applying a format does not actually change the values of the variable, so it is easy to reverse or un-apply any format. To remove the $UPCASE format and revert to an unformatted MAKE variable, simply use the FORMAT statement as before but remove the $UPCASE portion:

data cars_noformat;
 set sashelp.cars;

format make;
run;

Now in the output dataset, you can see the upper case values are gone and the original appearance of the values has returned:

Numeric Formats

As you might expect, numeric formats are formats which can be used with numeric variables. As mentioned before, a common numeric format is the DOLLARw.d format. The DOLLAR format can be used with numeric variables which contain dollar amounts to apply the dollar sign and adjust the number of decimal places shown.

The SASHELP.ORSALES dataset contains the numeric variable PROFIT as shown partially below:

To add the dollar sign to the profit values and only display a single digit after the decimal point, we can apply the DOLLAR format as follows:

data orsales_dollar;
 set sashelp.orsales;

 format profit dollar8.1;
run;

As you can see in the output dataset shown partially below, the PROFIT variable has now been formatted to include the dollar sign ($) and only display a single digit after the decimal point:

Date Formats


While date formats are still a form of numeric format, they only work with variables that SAS recognizes as dates to begin with. One example of a date format is MMDDYYw. Depending on the width used, the MMDDYYw. format can display dates as mm/dd/yy with a width of 8 or as mm/dd/yyyy with a width of 10.
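
As a quick check of how the width changes the result, the sketch below prints the same date to the SAS log with both widths. The variable name D and the date literal are arbitrary choices for illustration:

data _null_;
 d = '15MAR2004'd;   /* illustrative date value */
 put d mmddyy8.;     /* 03/15/04 */
 put d mmddyy10.;    /* 03/15/2004 */
run;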

The SASHELP.PRICEDATA dataset contains the variable DATE, which is formatted as MONYY5. by default:

To modify the DATE variable to appear as mm/dd/yyyy, we can apply the MMDDYY10. format using the following syntax:

data pricedata_date;
 set sashelp.pricedata;

format date mmddyy10.;
run;

As you can see in the PRICEDATA_DATE dataset shown partially below, we now have the DATE variable formatted as mm/dd/yyyy:

PROC FORMAT

The FORMAT procedure allows you to create your own custom formats. Using PROC FORMAT you can create both character and numeric formats and also create more complex grouping formats with ranges. Custom formats can be created for an individual SAS session or they can be stored permanently for future use.

While much of what you can accomplish with PROC FORMAT could ultimately be handled with DATA step programming, PROC FORMAT is often a more efficient solution, particularly with larger datasets, as it typically requires fewer computational resources.

Creating a Simple Numeric Format

The SASHELP.CARS dataset contains the numeric variable CYLINDERS, which denotes the number of cylinders found in each model of vehicle. In this example, we would like to report on the frequency of the number of cylinders, but instead of displaying the numeric values such as 4, 6 or 8, we would like to display the values in words such as "four", "six" or "eight".

Before we walk through how to create a numeric format with PROC FORMAT, let's start by illustrating how this can be achieved with traditional SAS DATA step programming.

In the syntax below, we define a series of IF statements to create a new variable, CYLINDERS_TEXT, which contains the desired description for the number of cylinders in words. We can then verify the results by running PROC FREQ on both the original CYLINDERS variable and the newly created CYLINDERS_TEXT variable:

data cars_coded;
 length cylinders_text $6;
 set sashelp.cars;

 if cylinders = 3 then cylinders_text = 'three';
 if cylinders = 4 then cylinders_text = 'four';
 if cylinders = 5 then cylinders_text = 'five';
 if cylinders = 6 then cylinders_text = 'six';
 if cylinders = 8 then cylinders_text = 'eight';
 if cylinders = 10 then cylinders_text = 'ten';
 if cylinders = 12 then cylinders_text = 'twelve';
run;

proc freq data = cars_coded order = freq;
 tables cylinders cylinders_text;
run;

As you can see in the two tables output by the PROC FREQ call shown below, we have successfully created the CYLINDERS_TEXT variable.

Next, let's look at how we can achieve a similar result using PROC FORMAT.

The PROC FORMAT call starts with a PROC FORMAT statement. By default, PROC FORMAT will store the custom formats in the WORK library and they will only be available during this SAS session. By using the LIBRARY option, you can specify the desired location for the PROC FORMAT catalog; however, in this example we will save the catalog to WORK for simplicity.

Next, the VALUE statement is used to name the format and also define the characteristics of the format. In this example, our format is named CYLINDER_FMT and text values for 3, 4, 5, 6, 8, 10 and 12 are defined by placing the desired words in quotation marks after the equal sign as shown in the syntax below:

proc format library = work;
 value cylinder_fmt
  3 = 'three'
  4 = 'four'
  5 = 'five'
  6 = 'six'
  8 = 'eight'
  10 = 'ten'
  12 = 'twelve'
  ;
run;

After running the code above, the format is created and will be available in the WORK library for the remainder of your SAS session. We can now use this format with PROC FREQ to achieve the desired results:

proc freq data = sashelp.cars;
 tables cylinders;
 format cylinders cylinder_fmt.;
run;

As you can see in the output table, the results are consistent with the DATA step example:

While the same results can often be achieved using DATA step programming or PROC FORMAT, PROC FORMAT can be better both in terms of efficiency and storage utilization:

  • Efficiency
    1. The DATA step method requires IF statements that are evaluated for every observation in the data set, whereas a format created with PROC FORMAT only changes how the values are displayed.
    2. The DATA step method also requires you to read in and write out a dataset to make the modification, whereas PROC FORMAT does not require writing out any new datasets.
    3. Both using IF statements to iterate through all observations and reading/writing new datasets are potentially time consuming and resource intensive tasks, particularly with large datasets.
  • Storage
    1. The DATA step method requires you to create a new variable, whereas PROC FORMAT does not require you to create any new variables.
    2. Creating additional variables could significantly increase the size of your data set, particularly if the formatted values are wide or if you have a large number of observations.

Creating a Character Format


The PROC FORMAT syntax to create a custom format for a character variable is very similar to the syntax used for creating a custom numeric format.

As before, we start with a PROC FORMAT statement and specify that we would like to save the format in WORK with the LIBRARY option (recall this is actually the PROC FORMAT default).

Next, we begin defining the format with a VALUE statement followed by the desired format name. There are two differences here when compared with creating a numeric format. First, the format name must start with a dollar sign ($) for a character format, and second, the values to be formatted must also be in quotation marks since they are character values.

In the following syntax, a character format $CAR_TYPE is created which can be applied to the SASHELP.CARS dataset variable TYPE. The $CAR_TYPE format expands the values of TYPE so that they are easier to understand:

proc format library = work;
 value $car_type
  "Hybrid" = "Hybrid Drivetrain"
  "SUV" = "Sports Utility Vehicle"
  "Sedan" = "4-door Sedan"
  "Truck" = "Pickup Truck"
  "Wagon" = "Station Wagon"
 ;
run;

After running the above code, the $CAR_TYPE format is now available for use in your SAS session. To test it, we can run a PROC FREQ on the TYPE variable in the SASHELP.CARS data set as shown here:

proc freq data = sashelp.cars;
 tables type;
 format type $car_type.;
run;

As you can see in the results shown below, the TYPE values have now been formatted and are easier to understand in the output frequency table:

Creating a Numeric Format with Ranges

PROC FORMAT is also a useful tool for grouping your data to help with certain analyses, categorization, and data interpretation.

In this example, you would like to better understand the distribution of invoice prices for all the vehicle models in the SASHELP.CARS dataset. For this analysis, you'd like to know how many vehicles fall into the following price categories:

  1. $20,000 or less
  2. $20,001 to $30,000
  3. $30,001 to $50,000
  4. $50,001 or more

Using PROC FORMAT, we can create a custom format called INVOICE_GROUPS to apply to the INVOICE variable in the SASHELP.CARS dataset. As before, we will use the default options of PROC FORMAT and create our new format in the WORK library.

Using the VALUE statement, we will define our new format as INVOICE_GROUPS. When defining the ranges, there are a few important points to consider:

  • Minimum and maximum values for each range are separated by a dash (-)
  • "low" and "high" can be used as minimum and maximum values when defining ranges to capture the true minimum and maximum values found within a dataset
  • Ranges within a format cannot overlap

Keeping the above in mind, here is the PROC FORMAT syntax to create the INVOICE_GROUPS format:

proc format library = work;
 value invoice_groups
  low-20000 = '$20,000 or less'
  20001-30000 = '$20,001-$30,000'
  30001-50000 = '$30,001-$50,000'
  50001-high = '$50,001 or more'
  ;
run;

To illustrate that the newly created format is working correctly, let's first apply it to a data set and examine the results. Using the PUT function, we can easily create a new variable, INVOICE_FORMAT, and compare the formatted values of INVOICE to the original values of INVOICE. At the end of the DATA step, a KEEP statement is used to keep only the variables MAKE, MODEL, INVOICE and INVOICE_FORMAT in the output dataset:

data cars_invoice_groups;
 set sashelp.cars;

 invoice_format = put(invoice,invoice_groups.);

 keep make model invoice invoice_format;
run;

In the output data set shown partially below, you can see that the groupings for the invoice prices are working correctly:

Now that we have verified the newly created format is working as expected, the format can be used to create a report with the grouped invoice prices to gain a better understanding of how many cars fall into each price category.

Using PROC FREQ with a TABLES statement and a FORMAT statement as shown below, we can easily generate this report:

proc freq data=sashelp.cars;
 tables invoice;
 format invoice invoice_groups.;
run;

As you can see in the results shown below, we now have the frequency of cars which fall under each of the newly created invoice groups:
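
One caveat with the INVOICE_GROUPS ranges is that they leave small gaps between whole numbers: a value such as 20000.50 would not match any range and would be displayed without a label. This does not matter when every value is a whole dollar amount, but for fractional values a variation like the sketch below may be safer. It uses the less-than sign (<) in the range to exclude the starting value, so the ranges meet without overlapping. The format name INVOICE_GROUPS_CONT and the label wording are hypothetical and not part of the original example:

proc format library = work;
 /* hypothetical variant of INVOICE_GROUPS with no gaps between ranges */
 value invoice_groups_cont
  low - 20000    = '$20,000 or less'
  20000 <- 30000 = 'Over $20,000 to $30,000'
  30000 <- 50000 = 'Over $30,000 to $50,000'
  50000 <- high  = 'Over $50,000'
  ;
run;

Here "20000 <- 30000" covers values greater than 20000 up to and including 30000, so a value of exactly 20000 still falls in the first group.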

Saving and Retrieving a Permanent Format Catalog

So far, all the formats we have created were saved to the WORK library, which is the default location where PROC FORMAT saves user-created formats. If you would like to store your formats for future use without having to re-run the PROC FORMAT code each time, PROC FORMAT also has the ability to save format catalogs into a permanent SAS library.

To demonstrate how to store a permanent format catalog, let's create a new simple format for classifying miles per gallon in the city (MPG_CITY) in the SASHELP.CARS dataset.

Before you can save to a permanent SAS library, you must first use the LIBNAME statement to define a new permanent library on your system. Note that if you are using SAS Studio you may be able to use the exact same LIBNAME statement shown below, but depending on the SAS version you are using and your system configuration, the path "/folders/myfolders" may need to be replaced with a different path that is available to you on your system.

The syntax for creating the actual format is the same as before, but this time we will use the LIBRARY option to point to another location on your system:

libname mylib "/folders/myfolders";

 proc format library = mylib;
 value mpg_groups
 0-15 = "Poor MPG"
 16-30 = "Average MPG"
 31-high = "Good MPG"
 ;
run;

If the format was successfully created and saved to the library MYLIB, you should now see a formats.sas7bcat file under "My Folders" (or in the location you specified if you didn't use SAS Studio or the "/folders/myfolders" path):

Although the format has been created and stored in MYLIB, by default SAS doesn't know where to find it, so it won't yet be available for you to use in this SAS session. To tell SAS where to find the format catalog, you'll need to set the FMTSEARCH system option. After adding this option and specifying the MYLIB library, you'll be able to use the newly created MPG_GROUPS format:

options fmtsearch = (mylib);

 proc freq data = sashelp.cars;
 tables mpg_city;
 format mpg_city mpg_groups.;
run;

After running the code above with the FMTSEARCH option and PROC FREQ, you should now see the new MPG groups reported in the frequency table as shown below:

Viewing Formats in a Catalog


Once you have saved formats in a permanent catalog, you can review them and see the values you have defined. Using PROC FORMAT with the FMTLIB option, you can easily print out a list of all the formats found in that library:

proc format library = mylib fmtlib ;
run;

After running the code above, you can see the details of the MPG_GROUPS format in the results shown below:
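
If a catalog holds many formats, printing all of them can get long. As a small sketch, assuming the MYLIB library and the MPG_GROUPS format created earlier, a SELECT statement can limit the FMTLIB listing to the formats you name:

proc format library = mylib fmtlib;
 /* list only the MPG_GROUPS format from the MYLIB catalog */
 select mpg_groups;
run;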

Importing and Exporting Format Catalogs


As you may have realized, manually typing out many custom format values can be quite time consuming. Fortunately, PROC FORMAT also has a utility to create formats based on an existing dataset. This allows you to import tables which may already contain your format definitions and use them to create new custom formats.

Before you can import a SAS data set into PROC FORMAT, it must contain at least the following 4 variables:

  1. FMTNAME – The name of the format you'd like to create
  2. START – The minimum value of the number/character you'd like to format (if you have a character format or if your format will not include a range then this is simply the value you'd like to format)
  3. LABEL – The formatted value you'd like to apply to your data points
  4. TYPE – The type of format you'd like to create (C=Character or N=Numeric are the most common values used here)
  5. (Optionally) END – The maximum value of the number range you'd like to apply a format to
    There are other variables that can be found in this dataset as well for more advanced custom formats, but these are the mandatory variables you must have.

Using the following DATA step code, we will create a data set which will be compatible with PROC FORMAT. The format we are creating will be called ENGINE_GROUPS and can be used to group the ENGINESIZE variable from SASHELP.CARS into 3 categories: Small (1-2), Medium (2.1-3.5) and Large (3.6-8.3).

data myformats;
length fmtname $15;
input fmtname $ start end label $ type $;
datalines;
engine_groups 1 2 Small N
engine_groups 2.1 3.5 Medium N
engine_groups 3.6 8.3 Large N
;
run;

After running the code above, you should now have the MYFORMATS data set in WORK as shown below:

Now that we have a dataset with the correct variables and values to define a new format, ENGINE_GROUPS, we can use the CNTLIN option with PROC FORMAT to read in this dataset:

proc format library = work cntlin = myformats;
run;

After successfully importing the MYFORMATS dataset with the code above, the ENGINE_GROUPS format is now available for use. For example, we can use the PROC FREQ code below to generate a frequency table on the engine sizes using our new ENGINE_GROUPS format:

proc freq data = sashelp.cars;
 tables enginesize;
 format enginesize engine_groups.;
run;

After creating the MYFORMATS data set and running the PROC FORMAT and PROC FREQ code above, you should see the following frequency table:

Since all formats are stored in a formats.sas7bcat file, there may be situations where you also want to export your formats into a regular SAS data set (.sas7bdat) file. This process is essentially the reverse of the previous example; this time we use the CNTLOUT option with PROC FORMAT to export a SAS data set instead of importing one.

If you went through this entire article, you should now have 4 formats in your WORK library: CYLINDER_FMT, $CAR_TYPE, INVOICE_GROUPS and ENGINE_GROUPS. To save all these formats into a single dataset, ALL_WORK_FORMATS, we can add the CNTLOUT option to the PROC FORMAT statement as shown below:

proc format library = work cntlout = work.all_work_formats;
run;

After running the code above, you should now see an output dataset, WORK.ALL_WORK_FORMATS, which is shown partially below:
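
To inspect the exported definitions without opening the dataset, a short PROC PRINT can be used. This is only a sketch, assuming WORK.ALL_WORK_FORMATS was created by the CNTLOUT step above; it prints the same control variables that were described for CNTLIN:

proc print data = work.all_work_formats;
 /* show only the core control variables of each format entry */
 var fmtname start end label type;
run;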

Source: https://sascrunch.com/sas-formats/
