Читать книгу JMP Essentials - Curt Hinrichs - Страница 9

Оглавление

Chapter 2: Data

2.1 Getting Data into JMP

2.2 The JMP Data Table

2.3 Data and Modeling Types

2.4 Cleaning and Formatting Data

2.5 Selecting, Highlighting, and Filtering Data: Row States

2.6 Adding Visual Dimension to Your Data

2.7 Shape Files and Background Maps

2.8 The Tables Menu

2.9 Summary

The first step in creating a graph or analysis is to get your data into JMP. With JMP, you can easily import data from many different sources such as Microsoft Excel or ODBC-compliant databases, or you can enter your data directly into a JMP data table. Because most readers already have data in one form or another, this section focuses on getting that data into JMP from another file format. Sometimes data is not in the best condition when you import it. Later in this chapter, we discuss what you can do to format data or deal with missing data. JMP also supports unstructured text and shape files (that can be used to create maps). We will describe the special requirements of using these file types.

As mentioned in the previous chapter, we use Windows as our default operating system to illustrate JMP and its native menus as shown in the below image. JMP instructions for Windows and Macintosh are basically the same, though some operating system differences are noted when they occur.


Example 2.1 Big Class Families

We will be using the Big Class Families.jmp data file to illustrate the steps in this section. This data set consists of 40 middle-school students and their image, name, height, weight, gender, age, and other miscellaneous information. You can access this data set in the Sample Data folder that is installed with JMP: File  Open  C:  Program Files  SAS  JMP  15  Samples  Data  Big Class Families.jmp. Alternatively, you can select Help  Sample Data Library  Big Class Families.jmp.

2.1 Getting Data into JMP

Getting your data into JMP is a familiar process. Like many other desktop applications, you can simply select File  Open to import your data into JMP. JMP can handle many different data formats. Table 2.1 shows the default formats JMP recognizes. Other previously installed applications could contain proprietary formats that might also appear as import options. You can import files with these formats as well.

In this section, we show you how to open JMP data tables and how to import Microsoft Excel spreadsheets and text files in JMP. Each of these file formats follows the same basic procedure, but each has special options that enable you to import exactly what you want. JMP interfaces with databases using Open DataBase Connectivity standard (ODBC). Through the Query Builder dialog box, you can easily set up queries of your data and automatically create SQL code that can be saved and repurposed. And you don’t need to learn SQL to do this. We illustrate only the essential connectivity here; more information about querying your data is available in the JMP documentation (Help  Search  SQL). At the end of this section, we show you how to create a new data table in JMP.

Table 2.1 Default File Formats Supported by JMP

File TypeFile Extension
JMP Files.jmp, .jsl, .jrn, .jrp, .jmpprj, .jmpmenu, .jmpaddin, .jmpapp, .jmpquery
Excel Files.xls, .xlsx, .xlsm
Text Files.txt, .csv, .dat, .tsv
SAS Data Sets.sas7bdat, .sas7bxat, .xpt, .stx
SAS Program files.sas
R Code.r, .R
MATLAB Code.m, .M
SPSS Data Files.sav
Minitab Worksheet Files.mtp
Shapefiles.shp
HTML.htm, .html
XML Data files.xml
JSON.json
SQLite 3.0 or Higher.sqlite, .db, .sqlite3, .db3
Teradata Database.trd
xBase Data Files.dbf
Triple-S Files.sss, .xml
Flow Cytometry v2 or 3.fcs
Hierarchical Data Format v5.h5
Note
JMP can be used to initialize or pull data from other third-party applications: Python, ODBC, R, MATLAB, and so on when such integrations are implemented in JMP. The third-party application must match in architecture. The term architecture is sometimes referred to as “bitness” (32-bit or 64-bit). One can verify the architecture of JMP via Help  About JMP (Windows).

Opening a JMP File

Let’s start by opening a JMP data table. At the top left of the JMP window is the File menu:

1. Select File  Open (see Figure 2.1). A dialog box opens. (On a Mac, select File  Open and locate your file in the appropriate folder.)

Figure 2.1 Opening a File


We will use the Big Class Families data table described earlier. Click the Big Class Families.jmp file and select Open. (See Figure 2.2.)

Figure 2.2 Open File Dialog Box


Note
To locate this file for the first time, select File  Open  C:  Program Files  SAS  JMP  15  Samples  Data  Big Class Families.jmp. Alternatively, you can also select Help  Sample Data  Open the Sample Data Directory  Big Class Families.jmp.

These steps open the JMP data table Big Class Families. (See Figure 2.3.) With these simple steps, you are now ready to analyze or visualize this data.

Figure 2.3 The JMP Data Table


This spreadsheet-like table is referred to as the JMP data table, which is JMP’s common data format regardless of where the data comes from. Section 2.2 discusses the components of the data table.

Note
In the Big Class Families example, each row of the table contains information about a single student in the class and is, therefore, one observation. Each column of the table is one piece of information (or variable) collected on each student. This structured format of the data table is required for most graphs and analyses in JMP. The importing examples that follow assume that your data already exists in this format. Section 2.4 introduces some tools to use if your data does not conform to this structure.

Importing Data into JMP

Importing data into JMP from another file format is similar to opening a JMP file. Within the File  Open pop-up window, the Files of Type drop-down menu indicates All JMP Files as the default.

If you are importing another file type, simply click on the down arrow and select the correct type. You can also select All Files from the drop-down menu. (See Figure 2.4.) Select the file that you want, and then click Open. On the Mac, select File  Open and available files will be highlighted. On the Mac, files that JMP cannot open will be dimmed.

Figure 2.4 Selecting All File Types


Note
If you know the format of your data, first select the correct format from the Files of Type drop-down menu. You will see the available files of that type within the folder. Once you have located the right file, select the file and click Open.

Importing an Excel File

Importing an Excel file is easy if your variables are in columns and your cases or observations are in rows. Ideally, any variable names should appear in the row directly above the first row of data, as shown in Figure 2.5. The import process automatically opens and converts the data into a JMP data table and uses your variable names as column headings:

Figure 2.5 Importing an Excel File


1. Select File  Open.

The Big Class.xls file, which is illustrated here, can be found by selecting C:  Program Files  SAS  JMP  15  Samples  Import Data  Big Class.xls.

2. From the Files of Type drop-down menu, select Excel Files.

3. Select the file that you want, then select Open to launch the Excel Import Wizard dialog box with a view of your data. If it looks correctly structured, select Import.

Shortcut
If you have an Excel worksheet or workbook on your desktop, you can simply drag the file over the JMP shortcut icon on your desktop to launch JMP and the Excel import wizard.

The Excel Import Wizard

While the previous example was simple and straightforward, a common characteristic of Excel worksheets is that data does not always conform to the essential column/row structure that is required by JMP. For example, you might have multiple nested headers where one row might represent year and the next row contains months within that year. The Excel Import Wizard has made importing this worksheet and maintaining the month-within-year structure much easier.

This wizard also provides options to specify which rows should be headers for columns, to specify hidden or merged columns, and to replicate these settings or merge data from multiple worksheets within a workbook.

To illustrate this feature, let’s try another example with one of these characteristics. We will use the Team Results.xlsx worksheet from the Import Data folder. This worksheet has headers/column names that appear to be in the second row. (See Figure 2.6.) The Excel Import Wizard will help JMP decide how to import this data.

Figure 2.6 The Excel Import Wizard


1. Select File  Open, then select an Excel workbook. In this example, we are using Team Results.xlsx from the C:  Program Files  SAS  JMP  15  Samples  Import Data folder.

2. Select Team Results.xls  Open to launch the Excel Import Wizard with an initial display of your data in the window (Figure 2.6).

3. As you can see, the column headers begin in row 3 (when you include the note that appears as a header in the preview) and the first set of observations in row 4. To get this into the right format, adjust the Individual Worksheet Settings as we have done in Figure 2.7.

Figure 2.7 Adjusting the Worksheet Settings


4. Once you have adjusted these settings, you should see your data take its proper shape in the Data Preview panel. Once you are satisfied with the adjustments you have made, select Import.

Importing a Text File

If you are importing a text file, another handy wizard is included in the Data with Preview file option. Like the Excel Import Wizard, this wizard enables you to view your data and specify how you want it to appear before importing it into a JMP data table. It also provides options to convert your text file if it is delimited by commas, tabs, or spaces:

1. Select File  Open.

The Big Class_L.txt file, illustrated here, can be found by selecting C:  Program Files  SAS  JMP  15  Samples  Import Data  Big Class_L.txt.

2. Select Text Files (*.txt, *.csv, *.dat, *.tsv) in the Files of Type drop-down menu. Select the file.

3. Select Data (Using Preview) (Figure 2.8).

Figure 2.8 Data Using Preview


4. Select Open.

5. Choose the settings that you may need to structure your data for import. (See Figure 2.9.) Click Next and then Import.

Figure 2.9 Text Data Preview


Importing a Database File

Options to import data extracted from a database are available through ODBC within JMP. To access this data, first connect to the database (the data source should already be defined) and then specify the table of interest. You can also create custom queries of your data using Query Builder (explained in the next section) or the Advanced button. If you need more help defining your data source, select Help  JMP Documentation Library Using JMP  Ch. 3 Import Data from a Database.

1. Select File  Database  Open Table (Figure 2.10).

Figure 2.10 Database Open Menu


2. The Database Open Table window appears. (See Figure 2.11.) It prompts you to connect to your database and either open a data table or specify a query. Clicking the Connect button launches the Select Data Source window to locate and connect to your database.

Figure 2.11 Database Open Dialog Box


3. Locate the table of interest, highlight it, and click Open Table to import the data.

Note
Note: You have the option of directly importing database files by selecting File  Open as previously discussed (assuming that these programs are installed). Using this more direct option enables you to import only a single table.

Creating a Database Query Using Query Builder

While importing a database file is a quick way to bring in a single table, there are times when you want to import from more than one file at a time. A database query enables you to combine data tables, aggregate data, sort data, and filter data. Once the query is created, it can be saved and reused for future queries.

In other software environments, you would need to know SQL (Structured Query Language) to create such queries. However, the Query Builder in JMP enables you to create a query in an easy, point-and-click environment that creates the SQL code for you. As with importing a single database file, you must first connect to the database. The data source should already be defined. Once the connection is established, you can choose the data tables of interest and specify the parts of the data tables that you want to extract into your resulting JMP table.

1. Select File  Database  Query Builder.

2. Choose New Connection and connect to your database (this example is for illustration purposes only since you will be unable to connect to this database).

For this example, you want to create a data table with information about historical SAT scores for states in the South and Southwestern regions. You would like to include the following information in your data table:

● From the data table SATByYear: the average SAT Math and SAT Verbal scores for each state.

● From the data table SATStateInfo: the Student/Faculty ratio for each state.

● From the data table StateToRegion: the Region each state is assigned to.

1. Select the SATByYear data table and assign it as the Primary table for the query.

2. Select SATStateInfo and StateToRegion data tables and assign them as the Secondary tables for the query. (See Figure 2.12.) Note that the column State appears in each data table and is designated as a “key” variable. Key variables are automatically used as the matching column to join data table in the query.

Figure 2.12 Database Query


3. Select Build Query.

4. In the Tables panel, select the SATByYear data table.

5. Choose the State, SAT Verbal, and SAT Math from the Available Columns panel and click the Add button. (See Figure 2.13.) (Because these are small data tables, you might want to turn on the Update preview automatically option. This option can slow performance down if the data tables are very large.)

Figure 2.13 Database Query Building


6. Since you are interested in the average SAT scores for each state, change the Aggregation for the SAT Math and SAT Verbal columns to Average. Note that Group By was automatically turned on for State. Examine the sample shown in the lower panel. You have one row for each state. (See Figure 2.14.)

Figure 2.14 Aggregating Data


7. Select the SATStateInfo data table in the Tables panel. Add the Student/Faculty Ratio column to the included columns.

8. Select the StateToRegion data table in the Tables panel. Add the Region column to the included columns. (See Figure 2.15.)

Figure 2.15 Defining Included Columns


9. To include only the states in the South and Southwestern regions, select Region in the Included Columns list and drag it to the Filter pane on the right side of the window.

10. Control-click to select both South and Southwest in the Filters list.

11. Click Save to save the query as a .jmpquery file for use later.

12. Click the Run Query button to create the new data table. (See Figure 2.16.)

Figure 2.16 Resulting Data Table


Creating a JMP Data Table from Scratch

1. Select File  New  Data Table to create a new data table (Figure 2.17):

Figure 2.17 Creating a New Data Table


2. Double-click on the first column’s heading and type the column name (the variable name).

3. Press Enter and type the data into the first cell directly below the heading. Press Enter again, type the data, and repeat as needed. Rows within JMP are consecutively numbered as observations or cases. (See Figure 2.18.)

Figure 2.18 Add a New Column


4. To create another column, double-click on the next column’s heading and enter the data as you did before.

If it is more practical for you to enter a series of data for each row as you build your data table, set up all of your column headings first and then use the Tab key to move from the left columns to the right. When each column has been filled, the Tab key moves down to the beginning of the next row.

Note
JMP will recognize the type of data you are entering and assign a data type to the column, either numeric or character. It also assigns an icon next to the columns (or variables) in the box on the left. These icons are the modeling type and are discussed in Section 2.3.

2.2 The JMP Data Table

The JMP Data Table looks very much like any spreadsheet. (See Figure 2.19.) In JMP, column headings indicate variables (what you have measured or counted), and rows indicate individual cases or observations. JMP requires your data to be structured in this way. If it is not, JMP can help you reformat your data. (See Section 2.4.)

Figure 2.19 The JMP Data Table


Data Table refers to the spreadsheet-like grid where your data resides.

The data grid can contain any number of columns (your variables) or rows (observations or cases). In this sense, we refer to data within the JMP data table as structured data.

In addition to the data grid, notice the three panels to the left of the data table. These panels contain information about your data (metadata). They provide vital information about your data as well as options to streamline and save your analyses.

The first and upper-most panel contains the name of the data table. (See Figure 2.20.) This panel stores references, notes, and/or scripts. Scripts enable you to save, automate, and customize analyses. If you perform a regular analysis or scheduled task, you will want to learn more about JMP scripts. (See the JMP Scripting Guide at Help  JMP Documentation Library  JMP Scripting Guide.)

Figure 2.20 The Table Panel


The Columns panel (see Figure 2.21) is where your column names (or variables) appear. Each column has an icon in front of it.

Figure 2.21 The Columns Panel


These icons correspond to the modeling type of the data in each column. As discussed in the next section, this is vitally important. JMP produces only the graphs or statistics that are appropriate for a column’s modeling type. In most cases, you can change the modeling type by simply clicking on the icon and selecting another appropriate type. The first number in parentheses represents the total number of column names or variables in the data table. The second number represents any of those columns that are currently selected or highlighted.

The bottom panel is the Rows panel. (See Figure 2.22.) The Rows panel indicates how many rows (observations) are in your data table. This panel also indicates the number of selected, hidden, or excluded rows, if any.

Figure 2.22 The Rows Panel


When rows are hidden, the observations are not included in graphs. When rows are excluded, they are not included in analyses. This row state is effective when you want to see or analyze a subset of your data. You can also both hide and exclude specific rows, which effectively removes the row(s) from your analyses and graphs, but not from your data table. Section 2.5 provides more information about row states including hiding and excluding rows.

Note
Multiple data tables can be open at any time, but only one active data table can be analyzed at a time. If you have multiple data tables open within JMP and you want to switch to another open data table, go to the Home Window (see Figure 2.23) and select the desired data table under Windows List.

Figure 2.23 The Home Window


A special type of data table is shape files, which are used to create thematic maps. These data tables consist of two tables including a “Name” and corresponding “Boundary” table. These are stored in the Maps folder: C: Program Files  SAS  JMP  15  Maps. Section 2.7 covers some of the basics about shape files.

Note
There is no practical limit on the size of the data table that you can analyze. However, because JMP runs in your computer’s local memory, the amount of RAM that you have determines the upper size limit of your data table. Your computer should be equipped with at least twice as much memory as the size of the data table. Thus, if you have just 8 GB of RAM, you can analyze a 4GB data table or about a 10-variable data set with 4 million rows! More details about JMP system requirements can be found at www.jmp.com.

2.3 Data and Modeling Types

One of JMP’s great features is the ability to produce graphs and statistics that make sense for the data that you are analyzing. This feature assumes that your data is correctly classified in the data table. So, what do we mean by data type and modeling type? Let’s define a few terms.

Data

refers to any values placed within a cell of a JMP data grid. Examples include numeric and/or text descriptions: 3.6, $2500, Female, Somewhat Likely, or 7/7/19.

Data type

refers to the nature of the data. The data type is usually either numeric (numbers) or character (often words and letters but sometimes also numbers). Other special purpose data types include expression (used for images and matrices) and row state.

Modeling type

refers to how the data within a column should be used in an analysis or a graph. JMP uses three distinct and primary modeling types: continuous, nominal, and ordinal. (JMP also includes three additional special purpose modeling types: Multiple response, Unstructured text, and None. Since these are less common, we will only summarize them in the note at the end of this section.)

● Continuous data (also referred to as quantitative, ratio, or interval scale data) takes a numeric form and is often thought of as some type of measurement. For example, home selling prices, income earned, costs per square foot, and dates are all examples of continuous data. As a rule of thumb, continuous data can be used in calculations. For example, calculating the average cost per square foot would be meaningful.

● Nominal data is categorical data (also referred to as qualitative, discrete, count, or attribute data) and can take on either a character or numeric form. Nominal data fits into categories or groups such as car type, gender, department, and sales territory and also includes indicator variables such as yes/no or 0/1. In nominal data, it is helpful to count the frequency of the occurrence of values, but otherwise, nominal data is not used in calculations. For example, calculating the average car type would not be meaningful.

● Ordinal data is categorical data that has an inherent order or hierarchy. For instance, Likert scales (such as levels of satisfaction) in a survey and grade levels in school (freshman, sophomore, junior, senior) are examples of ordinal data. That is, they represent categories that have some sequence or order that should be retained in any analysis. Ordinal data is less common than continuous and nominal data, but there are a few analyses designed specifically for it. In most JMP analyses, nominal and ordinal data are treated the same way.

Table 2.2 Data Types Appropriate for Modeling Types

Modeling TypeData Type
NumericCharacter
ContinuousYesNo
NominalYesYes
OrdinalYesYes
Note
Numeric data is right-justified in the data table, whereas character data is left-justified. This can be useful to check whether data contains errors.

In our example, Big Class contains five variables (or columns) representing each of these modeling types. (See Figure 2.24.) Let’s briefly explain why they are classified by their data and modeling types:

● Name is nominal because it is a character data type and the student’s name is arbitrary.

● Age is ordinal because the values are rounded down and we want to retain the six ordered age groups (12 to 17) in our analysis.

● Sex is nominal because its data type is character (M or F) and it has no order.

● Height and weight are continuous because they are both numeric and represent a measurement.

Note
Age could also be considered continuous because the values are numeric, but this would treat age differently and yield different results.

Figure 2.24 Understanding Modeling Types in the Data Table


Note
Let’s briefly review some of the more specialized data and modeling types and what they are used for.• Row State is a data type that enables you to store and manage information about a row of data. (See Section 2.5.)• Expression is a data type that enables you to store images or matrices in a column.• Multiple Response is a modeling type that is commonly used in surveys where one may be asked for more than one answer.• Unstructured Text is a modeling type that is used for documents such as customer reviews, wine tasting notes, or an entire book. This modeling type is used in text mining with JMP’s Text Explorer.• None is a modeling type that tells JMP not to use that column in an analysis. This might be used for a column that represents an identifier for each row of data, for example, a student ID number, or patient number.For more information, select Help  JMP Documentation Library  Using JMP  Ch. 5, About Modeling Types.

Changing the Modeling Type

When you import data, the JMP default selects and assigns one of two modeling types based on whether the data is numeric or character. Numeric data becomes continuous and character data becomes nominal. Sometimes you might want to change the default modeling type of your data to generate results that are more meaningful.

For example, if we imported the Big Class data from Excel, age as numeric data would be imported as a continuous column. We might want to change that to ordinal. Changing the modeling type is simple in JMP. Click the column’s corresponding icon in the Columns panel in the data table and select the correct type. (See Figure 2.25.)

Figure 2.25 Changing the Modeling Type


If the Continuous option is grayed out, your data type is classified as character. To change the data type, double-click on the column heading and change the data type to numeric. (See Figure 2.26.) In this window, you can also change the modeling type along with a host of other formatting options, which are described in the next section.

Figure 2.26 Changing the Data Type


For more information, select Help  Books  Using JMP  Chapter 5, Set Column Properties  About Data and Modeling Types.

2.4 Cleaning and Formatting Data

Sometimes data is not in the best shape or in the right form when it is imported. Fortunately, JMP has extensive column formatting abilities. This section focuses on the most common features, including:

● Cleaning up your data format, such as decimal places, dates, times, and currency. We will use the Column Info window to accomplish these tasks.

● Introducing the Formula Editor, which enables you to create new columns from old ones, add IF statements, and transform data using basic or more advanced functions. We will introduce a basic example in this section. For more information, select Help  JMP Documentation Library.

● Learning to use the RECODE command, which is a handy way to merge similar categorical responses into a single category. For example, if you have Woman, Female, and Girl as responses, you can merge these into a single response: Female.

Example 2.2 Movie Rentals

We will use the Movies.jmp data table to illustrate the concepts in this section. This data table consists of the 277 top-grossing movies released between 1937 and 2003. The columns are:

● Movie: name of movie

● Type: genre/category of movie (for example, comedy, family)

● Rating: US movie rating system (for example, general audience [G], adult [R])

● Year: year of movie release (for example, 1937)

● Domestic $: US domestic revenue in $ earned by the movie in that year

● Worldwide $: Worldwide revenue in $ earned by the movie in that year

● Director: director of movie

You can access this data table in the Sample Data folder that is installed with JMP by selecting File  Open  C:  Program Files  SAS  JMP  15  Samples  Data  Movies.

Getting your data into a standard format is done through the Column Info window, which is accessed from the Cols menu. Options to format your data are driven by the data and modeling types specified for that column of data. You can change these types, if necessary, to meet the requirements of your analysis. Recall that changing these types affects the graphs or statistics that you can generate from that column. (See the previous section.) Let’s begin by opening the Movies.jmp data table:

1. Open the Movies.jmp data table.

2. Select the Domestic $ column, and then select Cols  Column Info.

3. Because Domestic$ is a numeric value, you see the Format drop-down menu (see Figure 2.27), which leads to several options. It is also our starting point for the next items we will discuss. Note that if you select a Character column, the Format menu does not appear in the Column Info window.

Figure 2.27 Column Info Format Menu – Continuous Variables


Note
You can also either double-click on the column name as mentioned in the previous section, or right-click on the column header and select Column Info from the menu.

Formatting Decimal Places

To change the number of decimal places displayed in a column of data, do the following:

1. Click on the column of interest. In our example, it is Domestic$.

2. Select Cols  Column Info. JMP will make a best guess on the format of the data; in our example, Currency was correctly specified. (See Figure 2.27.) You can easily change this format by selecting another format from the menu.

3. To the right of the Format menu are two boxes, Width and Dec. Width refers to the number of characters that can be in the column, and Dec refers to the number of decimals right of the point. In our example, type “0” in the Dec box, then select Apply and OK. (See Figure 2.28.)

Figure 2.28 Formatting Decimal Places


Note
Formatting decimal places affects what is viewed in the data table and in output but does not affect the precision of the data when used in calculations.

Formatting Dates, Times, and Durations

Dates are numeric values in JMP, which allows them to be transformed into other date formats and calculated for duration or elapsed time. If you are importing data that contains dates, ensure that the data type is numeric.

The Column Info (Cols  Column Info) window provides several date format options. (See Figure 2.29) When a date is selected from the Format menu, a secondary drop-down menu for the display format appears, along with a similar drop-down menu for the input format of your imported data. The format of your imported data needs to match one of JMP’s input format options, which can then be transformed into any format among the display format options.

Figure 2.29 Column Info Window for a Date Column


Let’s walk through a new example, TechStock, to illustrate this concept.

Example 2.3 TechStock

We will use the TechStock data table to illustrate dates in this section. This data set contains the stock price of the NASDAQ 100 (QQQ) at the high, low, and close for each trading day during the period 11/27/2000 to 2/26/2001. You can access this data set in the Sample Data folder that is installed with JMP: File  Open  C:  Program Files  SAS  JMP  15  Samples Data  Techstock.jmp.

1. Open the TechStock.jmp data table.

2. Click the Date column name.

3. Select Cols  Column Info. Open the Format drop-down menu and select Date, which displays how the dates will appear in the data table.

4. It is currently displayed as d/m/y (see Figure 2.26), as indicated by the check mark. Change the format to Monddyyyy. Click Apply or OK. The date is now displayed as abbreviated month, day, and year in the Date column. (See lower part of Figure 2.30.)

Figure 2.30 Changing the Display Format of Dates


You can also format time and duration from this window provided you have these types of columns.

Column Properties Menu

Column Properties, another useful tool in the Column Info window (see Figure 2.31), enables you to add formulas, check ranges of values for auditing, and assign customized ordering to the data, among other tasks.

Figure 2.31 Column Properties


Some of the more commonly used column properties are:

● Missing Value Codes: used to identify values placed in columns that represent unknown (or missing) data. By default, JMP uses a blank cell for missing character data and a ∙ for missing cells in numeric columns. In cases where the original data used a different missing identifier (such as 999 or unknown), these values can be entered in the missing value codes property so that JMP recognizes the values as missing.

● Value Labels: used when values are entered into the data table rather than the actual label. For example, an analyst might code Low, Medium, High as 1, 2, 3 for ease of data entry. However, in analysis results, the labels are preferred. Using this property will enable you to enter the data as a code, but provide the labels in the appropriate display.

● Value Order: used to specify the order in which column values should be sorted. For example, using the order Poor, Good, Better, Best rather than the alphabetical ordering of these data values. An example is provided later in this section.

● Axis: allows the specification of axis properties (such as formatting, minimum value, maximum value, increment, tick marks, and grid marks) to be used for all graphs of the column.

These functions are described in detail in the Using JMP book (Help  JMP Documentation Library  Using JMP  Ch. 5  Set Column Properties).

Formula Editor

JMP’s formula editor is handy and flexible. Use it when you need to create a new column that contains values that are calculated or derived from existing columns in your data table. You can also transform your data, add conditional statements, and much more. Due to the advanced nature of these features, we will cover only the most basic features here. For more information, see Using JMP at Help  JMP Documentation Library  Using JMP  Ch. 5  Assign Column Properties.

One of the common operations performed with the Formula Editor is creating a new column of data that contains a calculation from existing columns. To illustrate this feature, let’s return to our Movies.jmp data table. For example, suppose we want to obtain the international revenues from these movies by subtracting the domestic revenues (Domestic $) from the worldwide revenues (Worldwide $).

1. First, we need to create a new column. Double-click in the column head to the right of our last populated column (Director). (See Figure 2.32.) Type “International $” in the heading, and press Enter. Click in the column head to highlight.

Figure 2.32 Creating a New Column


2. With the new column selected, select Cols  Column Info. The Formula window appears in the Columns Properties menu. (See Figure 2.33.) Select Formula and Edit Formula. You see a list of columns or variables on the left side of the window.

Figure 2.33 Opening the Formula Editor


3. Click the Worldwide $ column, then the “-” symbol in the upper palette, and then the Domestic $ column. You see your formula take shape in the preview window. (See Figure 2.34).

Figure 2.34 Creating a Formula


4. When you click OK in the formula editor window, the calculated values appear in a new column in your data table.

5. Click OK in the Column Info window and you will see your new column, International $, within the Columns panel with a “+” next to it indicating that the column was created with a formula.

Value Order

The value order column property enables you to specify an order to the values of a categorical column. JMP’s built-in defaults order common ordinal columns such as months of the year or days of the week, but there are other instances when you would like to arrange responses (values) in some logical order for graphs and analyses. For example, some surveys have a range of responses from “Not Satisfied” to “Very Satisfied,” with a few intermediate responses in between. For specific examples of the built-in ordering, see the discussion of Column Properties in Using JMP.

In the Movies.jmp data table, we want to reorder the rating of the movies to display in this order: R, PG-13, PG, and G.

1. Click the Rating column. Select Cols  Column Info, and then select Column Properties.

2. Select Value Ordering. (See Figure 2.35.) A new window appears with the available responses from the Rating column. Select a response and move it up or down or reverse the order, whichever is appropriate. Select Reverse for this example. (See Figure 2.36.)

Figure 2.35 Value Order-Column Properties Menu


Figure 2.36 Specifying the Order of Categorical Variables


3. When you are satisfied with the order, click Apply and OK.

Now let’s see the results of this exercise by using the distribution platform, which is discussed in Chapters 3 and 5. Here’s a preview:

JMP Essentials

Подняться наверх