SAS Viya - Kevin D. Smith

Loading Data


The easiest way to load data into a CAS server is to use the upload method on the CAS connection object. This method accepts a file path or URL that points to a file in one of several formats, including CSV, Excel, and SAS data sets. You can also pass a Pandas DataFrame object to the upload method to upload the data from that DataFrame to a CAS table. We use the classic Iris data set in the following data loading example.

In [12]: out = conn.upload('https://raw.githubusercontent.com/' +
   ....:                   'pydata/pandas/master/pandas/tests/' +
   ....:                   'data/iris.csv')

In [13]: out

Out[13]:

[caslib]

'CASUSER(username)'

[tableName]

'IRIS'

[casTable]

CASTable('IRIS', caslib='CASUSER(username)')

+ Elapsed: 0.0629s, user: 0.037s, sys: 0.021s, mem: 48.4mb
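As noted earlier, the upload method also accepts a Pandas DataFrame. The following is a minimal sketch of that variant, not a definitive recipe. It assumes an open connection like conn in the session above; the helper name upload_frame_as is hypothetical, and the casout options (name, replace) are passed as I understand upload to accept them, so check them against your SWAT version.

```python
def upload_frame_as(conn, frame, table_name):
    """Upload a local DataFrame to CAS and return the resulting CASTable.

    Assumptions: `conn` is an open CAS connection and `frame` is a Pandas
    DataFrame. `upload_frame_as` is a hypothetical helper name.
    """
    # casout names the output table; replace=True asks the server to
    # overwrite an existing table of the same name.
    out = conn.upload(frame, casout={'name': table_name, 'replace': True})
    # The result carries caslib, tableName, and casTable, as in Out[13].
    return out.casTable
```

With a live connection and a DataFrame df, a call such as upload_frame_as(conn, df, 'iris_small') would return a CASTable object for the uploaded data.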

The output from the upload method is, again, a CASResults object. It contains the name of the created table, the caslib in which the table was created, and a CASTable object that can be used to interact with the table on the server. CASTable objects have all of the same CAS action set and action methods as the connection that created them. They also include many of the methods that are defined by Pandas DataFrames, so you can operate on them as if they were local DataFrames. However, operations are simply combined on the client side (essentially creating a client-side view) until you explicitly fetch the data or call a method that returns data from the table, such as head or tail.
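The client-side view idea can be illustrated with a toy model. This is not how SWAT is implemented; ToyServer and ToyTable are hypothetical classes, and the two rows are illustrative. The sketch only shows the general pattern: selecting columns builds a new client-side handle, and the "server" is contacted only when head is called.

```python
class ToyServer:
    """Stands in for the server; counts how often data is requested."""

    def __init__(self, rows):
        self.rows = rows
        self.fetches = 0

    def fetch(self, columns, n):
        self.fetches += 1
        return [{c: row[c] for c in columns} for row in self.rows[:n]]


class ToyTable:
    """Client-side handle that records a column selection without fetching."""

    def __init__(self, server, columns):
        self.server = server
        self.columns = list(columns)

    def __getitem__(self, columns):
        # Selecting columns only builds a new client-side handle.
        return ToyTable(self.server, columns)

    def head(self, n=5):
        # Only here is the server actually asked for data.
        return self.server.fetch(self.columns, n)


server = ToyServer([
    {'SepalLength': 5.1, 'Name': 'Iris-setosa'},
    {'SepalLength': 4.9, 'Name': 'Iris-setosa'},
])
table = ToyTable(server, ['SepalLength', 'Name'])
view = table[['SepalLength']]       # no server round trip yet
fetches_before = server.fetches
rows = view.head(1)                 # data is retrieved now
```

In this model, fetches_before is still zero after the column selection, and the fetch counter moves only once head runs.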

We can use actions such as tableinfo and columninfo to access general information about the table itself and its columns.

# Store CASTable object in its own variable.

In [14]: iris = out.casTable

# Call the tableinfo action on the CASTable object.

In [15]: iris.tableinfo()

Out[15]:

[TableInfo]

   Name  Rows  Columns Encoding CreateTimeFormatted  \
0  IRIS   150        5    utf-8  01Nov2016:16:38:59

     ModTimeFormatted JavaCharSet    CreateTime       ModTime  \
0  01Nov2016:16:38:59        UTF8  1.793638e+09  1.793638e+09

   Global  Repeated  View SourceName SourceCaslib  Compressed  \
0       0         0     0                                   0

    Creator Modifier
0  username

+ Elapsed: 0.000856s, mem: 0.104mb

# Call the columninfo action on the CASTable.

In [16]: iris.columninfo()

Out[16]:

[ColumnInfo]

        Column  ID     Type  RawLength  FormattedLength  NFL  NFD
0  SepalLength   1   double          8               12    0    0
1   SepalWidth   2   double          8               12    0    0
2  PetalLength   3   double          8               12    0    0
3   PetalWidth   4   double          8               12    0    0
4         Name   5  varchar         15               15    0    0

+ Elapsed: 0.000727s, mem: 0.175mb
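The ColumnInfo result shown above can also be consumed programmatically. The following is a minimal sketch, assuming the result is keyed by 'ColumnInfo' (as the output header suggests) and that the value supports column lookup by name, as a DataFrame does. The helper name column_types is hypothetical.

```python
def column_types(table):
    """Map each column name to its CAS type using the columninfo action.

    `table` is assumed to be a CASTable such as `iris`; `column_types`
    is a hypothetical helper name.
    """
    # The action result is keyed by the name shown in brackets in the output.
    info = table.columninfo()['ColumnInfo']
    # Pair the Column and Type columns row by row.
    return dict(zip(info['Column'], info['Type']))
```

For the iris table above, column_types(iris) would be expected to map the four measurement columns to double and Name to varchar, matching the listing.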

Now that we have some data, let’s run some more interesting CAS actions on it.
