8. API: Data Frames
Python-oracledb can fetch directly to data frames that expose an Apache Arrow PyCapsule Interface. These can be used by many numerical and data analysis libraries.
See Fetching Data Frames for more information, including the type mapping from Oracle Database types to Arrow data types.
Note
The data frame support in python-oracledb 3.1 is a pre-release and may change in a future version.
8.1. OracleDataFrame Objects
OracleDataFrame objects are returned from the methods
Connection.fetch_df_all() and Connection.fetch_df_batches().
Each column in OracleDataFrame exposes an Apache Arrow PyCapsule interface, giving access to the underlying Arrow array.
This object is an extension to the DB API definition.
Added in version 3.0.0.
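The two fetch methods named above might be called as in the following minimal sketch. The helper names `fetch_frame` and `count_rows_in_batches` and the default sizes are illustrative and not part of python-oracledb. `connection` is assumed to be an already-open python-oracledb connection.

```python
def fetch_frame(connection, sql, arraysize=1000):
    """Fetch the whole result set of `sql` as a single OracleDataFrame."""
    # arraysize is a fetch tuning parameter; 1000 is an arbitrary choice here.
    return connection.fetch_df_all(statement=sql, arraysize=arraysize)


def count_rows_in_batches(connection, sql, size=1000):
    """Stream the result set in batches and return the total row count."""
    total = 0
    # Each iteration yields one OracleDataFrame for the next batch of rows.
    for odf in connection.fetch_df_batches(statement=sql, size=size):
        total += odf.num_rows()
    return total
```

Fetching in batches keeps only one batch in memory at a time, which may suit large result sets better than a single fetch_df_all() call.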
8.1.1. OracleDataFrame Methods
The object implements the Python DataFrame Interchange Protocol DataFrame API Interface.
- OracleDataFrame.column_arrays()
Returns a list of OracleArrowArray objects, each containing a select list column.
This is an extension to the DataFrame Interchange Protocol.
- OracleDataFrame.column_names()
Returns a list of the column names in the data frame.
- OracleDataFrame.get_chunks(n_chunks)
Returns itself, since python-oracledb only uses one chunk.
- OracleDataFrame.get_column(i)
Returns an OracleColumn object for the column at the given index i.
- OracleDataFrame.get_column_by_name(name)
Returns an OracleColumn object for the column with the given name.
- OracleDataFrame.get_columns()
Returns a list of OracleColumn objects, one object for each column in the data frame.
- OracleDataFrame.num_chunks()
Returns the number of chunks the data frame consists of.
This always returns 1.
- OracleDataFrame.num_columns()
Returns the number of columns in the data frame.
- OracleDataFrame.num_rows()
Returns the number of rows in the data frame.
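The methods listed above can be combined to inspect a fetched data frame. The following is a minimal sketch that uses only those methods plus OracleColumn.size(). The helper name `summarize_frame` is hypothetical, and `odf` is assumed to be an OracleDataFrame returned by one of the fetch methods.

```python
def summarize_frame(odf):
    """Return a small dictionary describing an OracleDataFrame-like object."""
    names = odf.column_names()
    return {
        "columns": list(names),
        "num_columns": odf.num_columns(),
        "num_rows": odf.num_rows(),
        # Per-column row counts, with each column looked up by its name.
        "sizes": {name: odf.get_column_by_name(name).size() for name in names},
    }
```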
8.1.2. OracleDataFrame Attributes
- OracleDataFrame.metadata
This read-only attribute returns the metadata for the data frame as a dictionary with keys num_columns, num_rows, and num_chunks, showing the number of columns, rows, and chunks, respectively. The number of chunks is always 1 in python-oracledb.
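As a small illustration, the metadata dictionary can be cross-checked against the accessor methods. The helper name `metadata_is_consistent` is hypothetical, and the sketch assumes the three keys are spelled exactly as listed above.

```python
def metadata_is_consistent(odf):
    """Check that the metadata dictionary agrees with the accessor methods."""
    meta = odf.metadata
    return (
        meta["num_columns"] == odf.num_columns()
        and meta["num_rows"] == odf.num_rows()
        and meta["num_chunks"] == odf.num_chunks()
    )
```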
8.2. OracleArrowArray Objects
OracleArrowArray objects are returned by
OracleDataFrame.column_arrays().
These are used for conversion to PyArrow Tables. See Fetching Data Frames.
Added in version 3.0.0.
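The conversion to a PyArrow Table might look like the following minimal sketch. It assumes the third-party PyArrow package is installed, and the helper name `to_pyarrow_table` is illustrative. The import is placed inside the function so that the module still loads where PyArrow is unavailable.

```python
def to_pyarrow_table(odf):
    """Build a pyarrow.Table from an OracleDataFrame's column arrays."""
    # Imported lazily: PyArrow is an optional third-party dependency.
    import pyarrow

    # column_arrays() supplies the data and column_names() the matching names.
    return pyarrow.Table.from_arrays(odf.column_arrays(), names=odf.column_names())
```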
8.3. OracleColumn Objects
OracleColumn objects are returned by OracleDataFrame.get_column(),
OracleDataFrame.get_column_by_name(), and
OracleDataFrame.get_columns().
Added in version 3.0.0.
8.3.1. OracleColumn Methods
- OracleColumn.get_buffers()
Returns a dictionary containing the underlying buffers.
The returned dictionary contains the data, validity, and offset keys.
The data attribute is a two-element tuple whose first element is a buffer containing the data and whose second element is the data buffer’s associated dtype.
The validity attribute is a two-element tuple whose first element is a buffer containing mask values indicating missing data and whose second element is the mask value buffer’s associated dtype. The value of this attribute is None if the null representation is not a bit or byte mask.
The offset attribute is a two-element tuple whose first element is a buffer containing the offset values for variable-size binary data (for example, variable-length strings) and whose second element is the offsets buffer’s associated dtype. The value of this attribute is None if the data buffer does not have an associated offsets buffer.
- OracleColumn.get_chunks(n_chunks)
Returns itself, since python-oracledb only uses one chunk.
- OracleColumn.num_chunks()
Returns the number of chunks the column consists of.
This always returns 1.
- OracleColumn.size()
Returns the number of rows in the column.
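The dictionary returned by get_buffers() can be walked as in the minimal sketch below. The helper name `buffer_sizes` is hypothetical. The sketch assumes that each non-None entry is a (buffer, dtype) tuple whose buffer exposes the bufsize attribute described for OracleColumnBuffer. It iterates over whatever keys are present instead of hard-coding their names.

```python
def buffer_sizes(column):
    """Map each buffer name to its size in bytes, or None when absent."""
    sizes = {}
    for key, entry in column.get_buffers().items():
        if entry is None:
            # For example, no validity mask or no offsets buffer.
            sizes[key] = None
        else:
            buffer, _dtype = entry
            sizes[key] = buffer.bufsize
    return sizes
```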
8.3.2. OracleColumn Attributes
- OracleColumn.describe_null
This read-only property returns the description of the null representation that the column uses.
- OracleColumn.dtype
This read-only attribute returns the Dtype description as a tuple containing the values for the attributes kind, bit-width, format string, and endianness.
The kind attribute specifies the type of the data.
The bit-width attribute specifies the number of bits as an integer.
The format string attribute specifies the data type description format string in Apache Arrow C Data Interface format.
The endianness attribute specifies the byte order of the data type. Currently, only native endianness is supported.
- OracleColumn.metadata
This read-only attribute returns the metadata for the column as a dictionary with string keys.
- OracleColumn.null_count
This read-only attribute returns the number of null row values, if known.
- OracleColumn.offset
This read-only attribute specifies the offset of the first row.
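Since dtype is a four-element tuple, it can be unpacked positionally. The following minimal sketch labels each element. The helper name `describe_dtype` and the dictionary keys are illustrative, not part of the API.

```python
def describe_dtype(column):
    """Unpack a column's dtype tuple into a labelled dictionary."""
    # Order follows the description above:
    # kind, bit-width, format string, endianness.
    kind, bit_width, format_string, endianness = column.dtype
    return {
        "kind": kind,
        "bit_width": bit_width,
        "format_string": format_string,
        "endianness": endianness,
    }
```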
8.4. OracleColumnBuffer Objects
A buffer object backed by an ArrowArray consisting of a single chunk.
This is an internal class used for conversion to third party data frames.
Added in version 3.0.0.
8.4.1. OracleColumnBuffer Attributes
- OracleColumnBuffer.bufsize
This read-only property returns the buffer size in bytes.
- OracleColumnBuffer.ptr
This read-only attribute specifies the pointer to the start of the buffer as an integer.
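Because ptr is a plain integer address and bufsize is a byte count, the raw bytes can in principle be copied out with the standard library's ctypes module. The following is only a sketch for debugging purposes. The helper name `copy_buffer_bytes` is hypothetical, and the sketch assumes the data frame that owns the buffer is still alive when the copy is made.

```python
import ctypes


def copy_buffer_bytes(buffer):
    """Copy `bufsize` bytes starting at address `ptr` into a bytes object."""
    # string_at() reads directly from memory; an invalid address or size
    # can crash the interpreter, so only pass a live buffer.
    return ctypes.string_at(buffer.ptr, buffer.bufsize)
```

As OracleColumnBuffer is an internal class, converting through PyArrow or another Arrow-aware library is usually the safer route.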