SAS Viya - Kevin D. Smith

Using DataFrames


The DataFrames that CAS actions return are extensions of the DataFrames defined by the Pandas package, and for the most part the two work the same way. The only difference is that the DataFrames returned by CAS carry extra metadata of the kind typically found in SAS data sets, such as SAS data format names, the SAS data type, and column and table labels.
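To make the idea of a DataFrame that carries extra metadata concrete, here is a minimal sketch that uses ordinary Pandas subclassing. The class name LabeledFrame, its label and colinfo attributes, and the metadata values are hypothetical stand-ins chosen for illustration; this is not how the SWAT package implements its own DataFrame class.

```python
import pandas as pd


class LabeledFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass that carries table and column metadata."""

    # Names listed in _metadata are treated by Pandas as instance metadata
    # rather than as column names.
    _metadata = ["label", "colinfo"]

    # Class-level defaults so the attributes always exist.
    label = None
    colinfo = None

    @property
    def _constructor(self):
        # Ask Pandas to build results of operations as LabeledFrame objects.
        return LabeledFrame


frame = LabeledFrame(
    {
        "name": ["help", "ping"],
        "description": ["Shows the parameters for an action", "Sends a single request"],
    }
)

# Made-up metadata, for illustration only.
frame.label = "Builtins actions"
frame.colinfo = {"name": {"label": "Action name"}}

# The object still behaves like a regular Pandas DataFrame.
print(isinstance(frame, pd.DataFrame))
print(frame.label, frame.colinfo["name"]["label"])
```

The point of the sketch is only that the extra information rides along as attributes while every normal DataFrame operation stays available.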

One of the builtins actions that returns DataFrames is help. This action lists the names and descriptions of all the actions that are installed on the server, in one DataFrame per action set. Each action set gets its own key in the result. Let’s look at some output from help.

The following code runs the help action, lists the keys in the CASResults object that is returned, uses Python’s type function to verify that the value stored under the builtins key is a SASDataFrame object, and displays the contents of that DataFrame (some output is reformatted slightly for readability):

In [33]: out = conn.help()

In [34]: list(out.keys())
Out[34]:
['accessControl',
 'builtins',
 'loadStreams',
 'search',
 'session',
 'sessionProp',
 'table',
 'tutorial']

In [35]: type(out['builtins'])
Out[35]: swat.dataframe.SASDataFrame

In [36]: out['builtins']
Out[36]:
    name              description
 0  addNode           Adds a machine to the server
 1  removeNode        Remove one or more machines from the...
 2  help              Shows the parameters for an action o...
 3  listNodes         Shows the host names used by the server
 4  loadActionSet     Loads an action set for use in this ...
 5  installActionSet  Loads an action set in new sessions ...
 6  log               Shows and modifies logging levels
 7  queryActionSet    Shows whether an action set is loaded
 8  queryName         Checks whether a name is an action o...
 9  reflect           Shows detailed parameter information...
10  serverStatus      Shows the status of the server
11  about             Shows the status of the server
12  shutdown          Shuts down the server
13  userInfo          Shows the user information for your ...
14  actionSetInfo     Shows the build information from loa...
15  history           Shows the actions that were run in t...
16  casCommon         Provides parameters that are common ...
17  ping              Sends a single request to the server...
18  echo              Prints the supplied parameters to th...
19  modifyQueue       Modifies the action response queue s...
20  getLicenseInfo    Shows the license information for a ...
21  refreshLicense    Refresh SAS license information from...
22  httpAddress       Shows the HTTP address for the serve...

We can store this DataFrame in another variable to make it a bit easier to work with. Much like Pandas DataFrames, CASResults objects enable you to access keys as attributes (as long as the name of the key doesn’t collide with an existing attribute or method). This means that we can access the builtins key of the out variable in either of the following ways:

In [37]: blt = out['builtins']

In [38]: blt = out.builtins

Which syntax you use depends on personal preference. The dot syntax is a bit cleaner, but the bracketed syntax works for any key, including keys that contain white space or that collide with the name of an existing attribute or method. Typically, you might use the attribute-style syntax in interactive programming, but the bracketed syntax is the safer choice for production code.
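The trade-off between the two syntaxes can be seen in a small, self-contained sketch. The class AttrResults below is a hypothetical stand-in written for this illustration, assuming only that the result object behaves like a dictionary with attribute-style key access; it is not the CASResults implementation, and the stored strings are placeholders for DataFrames.

```python
from collections import OrderedDict


class AttrResults(OrderedDict):
    """Hypothetical ordered dictionary that also exposes keys as attributes."""

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails,
        # so existing attributes and methods always take precedence over keys.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


out = AttrResults()
out["builtins"] = "builtins frame"          # ordinary key
out["my key"] = "key with white space"      # not a valid attribute name
out["keys"] = "collides with dict.keys"     # same name as a dict method

# Both syntaxes reach an ordinary key.
print(out.builtins == out["builtins"])

# Dot syntax finds the dict method, not the stored value.
print(callable(out.keys))

# Bracket syntax always returns the stored value.
print(out["keys"])
print(out["my key"])
```

Because the bracketed form never depends on what attributes the object happens to have, it keeps working if a key name later matches a method name.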

Now that we have a handle on the DataFrame, we can do typical DataFrame operations on it such as sorting and filtering. For example, to sort the builtins actions by the name column, you might do the following:

In [39]: blt.sort_values('name')
Out[39]:
    name              description
11  about             Shows the status of the server
14  actionSetInfo     Shows the build information from loa...
 0  addNode           Adds a machine to the server
16  casCommon         Provides parameters that are common ...
18  echo              Prints the supplied parameters to th...
20  getLicenseInfo    Shows the license information for a ...
 2  help              Shows the parameters for an action o...
15  history           Shows the actions that were run in t...
22  httpAddress       Shows the HTTP address for the serve...
 5  installActionSet  Loads an action set in new sessions ...
 3  listNodes         Shows the host names used by the server
 4  loadActionSet     Loads an action set for use in this ...
 6  log               Shows and modifies logging levels
19  modifyQueue       Modifies the action response queue s...
17  ping              Sends a single request to the server...
 7  queryActionSet    Shows whether an action set is loaded
 8  queryName         Checks whether a name is an action o...
 9  reflect           Shows detailed parameter information...
21  refreshLicense    Refresh SAS license information from...
 1  removeNode        Remove one or more machines from the...
10  serverStatus      Shows the status of the server
12  shutdown          Shuts down the server
13  userInfo          Shows the user information for your ...
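Filtering works the same way as in any Pandas DataFrame. The following sketch does not need a CAS connection: it assumes a small stand-in DataFrame built by hand from a few of the rows shown above, and the variable names node_actions and node_sorted are chosen only for this illustration.

```python
import pandas as pd

# Stand-in for out['builtins'], using a few rows copied from the listing.
blt = pd.DataFrame(
    {
        "name": ["addNode", "removeNode", "help", "listNodes", "ping"],
        "description": [
            "Adds a machine to the server",
            "Remove one or more machines from the...",
            "Shows the parameters for an action o...",
            "Shows the host names used by the server",
            "Sends a single request to the server...",
        ],
    }
)

# Boolean mask: keep only the actions whose name contains "Node".
node_actions = blt[blt["name"].str.contains("Node")]

# Filtering and sorting compose; the original row labels are preserved.
node_sorted = node_actions.sort_values("name")
print(node_sorted)
```

As with sort_values, the filter returns a new DataFrame and leaves blt unchanged.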

If we want to combine all of the output DataFrames into one DataFrame, we can use the concat function in the Pandas package. We use the values method of the CASResults object to get all of the values in the dictionary (which are DataFrames, in this case). Then we concatenate them using concat with the ignore_index=True option, which discards the original row labels and assigns a new unique index to each row.

In [40]: import pandas as pd

In [41]: pd.concat(out.values(), ignore_index=True)
Out[41]:
     name              description
  0  assumeRole        Assumes a role
  1  dropRole          Relinquishes a role
  2  showRolesIn       Shows the currently active role
  3  showRolesAllowed  Shows the roles that a user is a mem...
  4  isInRole          Shows whether a role is assu...
 ..  ...               ...
137  queryCaslib       Checks whether a caslib exists
138  partition         Partitions a table
139  recordCount       Shows the number of rows in a Cloud ...
140  loadDataSource    Loads one or more data source interf...
141  update            Updates rows in a table

[142 rows x 2 columns]
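One drawback of the combined table is that it no longer shows which action set each row came from. A possible variation, sketched here under the assumption that the results behave like an ordinary ordered dictionary of DataFrames, passes the dictionary itself to concat so that the keys are kept. The results dictionary below is a hand-built stand-in with only two action sets, and the names actionset and row are chosen for this example.

```python
from collections import OrderedDict

import pandas as pd

# Stand-in for the action results: one small DataFrame per action set.
results = OrderedDict(
    [
        (
            "accessControl",
            pd.DataFrame(
                {
                    "name": ["assumeRole", "dropRole"],
                    "description": ["Assumes a role", "Relinquishes a role"],
                }
            ),
        ),
        (
            "table",
            pd.DataFrame(
                {
                    "name": ["partition", "update"],
                    "description": ["Partitions a table", "Updates rows in a table"],
                }
            ),
        ),
    ]
)

# Same call as in the text: stack the values and renumber the rows.
combined = pd.concat(results.values(), ignore_index=True)

# Variation: pass the mapping so the keys become an index level,
# then turn that level into an ordinary column and renumber the rows.
labeled = (
    pd.concat(results, names=["actionset", "row"])
    .reset_index(level="actionset")
    .reset_index(drop=True)
)
print(labeled)
```

The second form costs one extra column but makes it possible to filter or group the combined table by action set later.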

In addition to result values, the CASResults object also contains information about the return status of the action. We look at that in the next section.
