SAS Viya - Kevin D. Smith

Executing Actions on CAS Tables

The simple action set that comes with CAS contains some basic analytic actions. You can use either the help action or the IPython ? operator to view the available actions.

In [17]: conn.simple?
Type:        Simple
String form: <swat.cas.actions.Simple object at 0x4582b10>
File:        swat/cas/actions.py
Definition:  conn.simple(self, *args, **kwargs)
Docstring:
Analytics

Actions
-------
simple.correlation : Generates a matrix of Pearson product-moment
                     correlation coefficients
simple.crosstab    : Performs one-way or two-way tabulations
simple.distinct    : Computes the distinct number of values of the
                     variables in the variable list
simple.freq        : Generates a frequency distribution for one or
                     more variables
simple.groupby     : Builds BY groups in terms of the variable value
                     combinations given the variables in the variable
                     list
simple.mdsummary   : Calculates multidimensional summaries of numeric
                     variables
simple.numrows     : Shows the number of rows in a Cloud Analytic
                     Services table
simple.paracoord   : Generates a parallel coordinates plot of the
                     variables in the variable list
simple.regression  : Performs a linear regression up to 3rd-order
                     polynomials
simple.summary     : Generates descriptive statistics of numeric
                     variables such as the sample mean, sample
                     variance, sample size, sum of squares, and so on
simple.topk        : Returns the top-K and bottom-K distinct values of
                     each variable included in the variable list based
                     on a user-specified ranking order
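The help action gives the same listing without relying on IPython. The following is a minimal sketch, assuming conn is an open swat.CAS connection like the one used in this session; the function name list_simple_actions is hypothetical and exists only for illustration.

```python
def list_simple_actions(conn):
    """Ask the server to describe the actions in the 'simple' action set.

    `conn` is assumed to be an open swat.CAS connection. The help action is
    invoked as a method on the connection, with the action set name passed
    as the `actionset` parameter.
    """
    return conn.help(actionset='simple')
```

In a live session you would call list_simple_actions(conn) and inspect the returned result.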

Let’s run the summary action on our CAS table.

In [18]: summ = iris.summary()

In [19]: summ
Out[19]:
[Summary]

 Descriptive Statistics for IRIS

         Column  Min  Max      N  NMiss      Mean    Sum       Std  \
 0  SepalLength  4.3  7.9  150.0    0.0  5.843333  876.5  0.828066
 1   SepalWidth  2.0  4.4  150.0    0.0  3.054000  458.1  0.433594
 2  PetalLength  1.0  6.9  150.0    0.0  3.758667  563.8  1.764420
 3   PetalWidth  0.1  2.5  150.0    0.0  1.198667  179.8  0.763161

      StdErr       Var      USS         CSS         CV     TValue  \
 0  0.067611  0.685694  5223.85  102.168333  14.171126  86.425375
 1  0.035403  0.188004  1427.05   28.012600  14.197587  86.264297
 2  0.144064  3.113179  2583.00  463.863733  46.942721  26.090198
 3  0.062312  0.582414   302.30   86.779733  63.667470  19.236588

            ProbT
 0  3.331256e-129
 1  4.374977e-129
 2   1.994305e-57
 3   3.209704e-42

+ Elapsed: 0.0256s, user: 0.019s, sys: 0.009s, mem: 1.74mb
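Several of these columns are derived from the others. The sketch below recomputes them in plain Python for a small made-up sample, assuming the usual SAS definitions (USS is the uncorrected sum of squares, CSS the corrected sum of squares, CV the coefficient of variation in percent, and TValue the mean divided by its standard error). These definitions agree with the output above: for SepalLength, 0.828066 / sqrt(150) is about 0.067611. The function name summary_row is hypothetical, and ProbT is left out because it needs a t-distribution.

```python
import math
import statistics


def summary_row(values):
    """Compute summary-style statistics for one numeric column."""
    n = len(values)
    mean = statistics.fmean(values)
    var = statistics.variance(values)   # sample variance (n - 1 divisor)
    std = math.sqrt(var)
    stderr = std / math.sqrt(n)         # standard error of the mean
    return {
        "N": n,
        "Mean": mean,
        "Sum": math.fsum(values),
        "Std": std,
        "StdErr": stderr,
        "Var": var,
        "USS": math.fsum(v * v for v in values),            # uncorrected sum of squares
        "CSS": math.fsum((v - mean) ** 2 for v in values),  # corrected sum of squares
        "CV": 100.0 * std / mean,                           # coefficient of variation, %
        "TValue": mean / stderr,                            # t statistic for mean = 0
    }


# Illustrative data only; not taken from the iris table.
row = summary_row([1.0, 2.0, 3.0, 4.0])
for name, value in row.items():
    print(f"{name:<7} {value}")
```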

The summary action displays summary statistics in a form that is familiar to SAS users. If you want them in a form similar to what Pandas users are used to, you can use the describe method (just like on DataFrames).

In [20]: iris.describe()
Out[20]:
       SepalLength  SepalWidth  PetalLength  PetalWidth
count   150.000000  150.000000   150.000000  150.000000
mean      5.843333    3.054000     3.758667    1.198667
std       0.828066    0.433594     1.764420    0.763161
min       4.300000    2.000000     1.000000    0.100000
25%       5.100000    2.800000     1.600000    0.300000
50%       5.800000    3.000000     4.350000    1.300000
75%       6.400000    3.300000     5.100000    1.800000
max       7.900000    4.400000     6.900000    2.500000

Note that when you call the describe method on a CASTable object, it calls various CAS actions in the background to do the calculations, including the summary, percentile, and topk actions. The output of those actions is combined into a DataFrame in the same form that the real Pandas DataFrame describe method returns. This enables you to use CASTable objects and DataFrame objects interchangeably in your workflow for this method and many other methods.
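To make the combining step concrete, here is a rough sketch in plain Python. It is not how SWAT implements describe. It only shows how values like those the summary and percentile actions return can be arranged into the describe layout. The numbers are copied from the SepalLength column of the outputs above, the names summary_part, percentile_part, and describe_column are hypothetical, and the topk step is omitted.

```python
# Values copied from the SepalLength column shown earlier in this section.
summary_part = {"N": 150.0, "Mean": 5.843333, "Std": 0.828066,
                "Min": 4.3, "Max": 7.9}
percentile_part = {25: 5.1, 50: 5.8, 75: 6.4}


def describe_column(summary, percentiles):
    """Arrange summary and percentile values in describe() row order."""
    rows = [
        ("count", summary["N"]),
        ("mean", summary["Mean"]),
        ("std", summary["Std"]),
        ("min", summary["Min"]),
    ]
    # Percentile rows sit between min and max, in ascending order.
    rows += [(f"{p}%", percentiles[p]) for p in sorted(percentiles)]
    rows.append(("max", summary["Max"]))
    return rows


described = describe_column(summary_part, percentile_part)
for label, value in described:
    print(f"{label:<5} {value:>11.6f}")
```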
