Pandas Series: to_hdf() function

Last update on September 15 2022 12:54:37 (UTC/GMT +8 hours)

Series-to_hdf() function

The to_hdf() function is used to write the contained data to an HDF5 file using HDFStore.

Hierarchical Data Format (HDF) is self-describing, allowing an application to interpret the structure and contents of a file with no outside information. One HDF file can hold a mix of related objects which can be accessed as a group or as individual objects.

To add another DataFrame or Series to an existing HDF file, use append mode and a different key.

Syntax:

Series.to_hdf(self, path_or_buf, key, **kwargs)

Parameters:

Name	Description	Type/Default Value	Required / Optional
path_or_buf	File path or HDFStore object.	str or pandas.HDFStore	Required
key	Identifier for the group in the store.	str	Required
mode	Mode to open file: 'w': write, a new file is created (an existing file with the same name would be deleted). 'a': append, an existing file is opened for reading and writing, and if the file does not exist it is created. 'r+': similar to 'a', but the file must already exist.	{'a', 'w', 'r+'}, default 'a'	Optional
format	Possible values: 'fixed': Fixed format. Fast writing/reading. Neither appendable nor searchable. 'table': Table format. Write as a PyTables Table structure, which may perform worse but allows more flexible operations like searching / selecting subsets of the data.	{'fixed', 'table'}, default 'fixed'	Optional
append	For Table formats, append the input data to the existing data.	bool, default False	Optional
data_columns	List of columns to create as indexed data columns for on-disk queries, or True to use all columns. By default only the axes of the object are indexed. See "Query via data columns" in the pandas documentation. Applicable only to format='table'.	list of columns or True	Optional
complevel	Specifies a compression level for data. A value of 0 disables compression.	{0-9}, default None	Optional
complib	Specifies the compression library to be used. As of v0.20.2 these additional compressors for Blosc are supported (default if no compressor specified: 'blosc:blosclz'): {'blosc:blosclz', 'blosc:lz4', 'blosc:lz4hc', 'blosc:snappy', 'blosc:zlib', 'blosc:zstd'}. Specifying a compression library which is not available raises a ValueError.	{'zlib', 'lzo', 'bzip2', 'blosc'}, default 'zlib'	Optional
fletcher32	If applying compression, use the fletcher32 checksum.	bool, default False	Optional
dropna	If True, rows in which all values are NaN are not written to the store.	bool, default False	Optional
errors	Specifies how encoding and decoding errors are to be handled. See the errors argument for open() for a full list of options.	str, default 'strict'	Optional
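
Example - Appending to a Series stored in table format (illustrative sketch):

The format, append, complevel and complib parameters described above can be combined as in the following minimal sketch. The function name append_in_table_format and the file path passed to it are hypothetical, chosen only for this illustration, and the code assumes the optional PyTables package (tables) is installed.

```python
import pandas as pd


def append_in_table_format(path):
    """Write a Series in 'table' format, append to it, then query a subset.

    `path` is a hypothetical file location supplied by the caller.
    """
    first = pd.Series([2, 3, 4], index=[0, 1, 2])
    second = pd.Series([5, 6], index=[3, 4])

    # format='table' is needed for appending and for on-disk queries;
    # the default 'fixed' format supports neither.
    first.to_hdf(path, key="s", mode="w", format="table",
                 complevel=5, complib="zlib")

    # append=True adds the new rows to the table already stored under 's'.
    second.to_hdf(path, key="s", mode="a", format="table", append=True)

    everything = pd.read_hdf(path, "s")
    # In table format the index can be filtered while reading.
    subset = pd.read_hdf(path, "s", where="index >= 3")
    return everything, subset


if __name__ == "__main__":
    full, tail = append_in_table_format("table_demo.h5")
    print(full)
    print(tail)
```

Writing the same data with the default 'fixed' format and append=True would fail, which is why the sketch sets format='table' on both calls.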

Example - We can add another object to the same file:

Reading from HDF file:

Python-Pandas Code:

import pandas as pd
df = pd.DataFrame({'X': [2, 3, 4], 'Y': [5, 6, 7]},
                  index=['p', 'q', 'r'])
df.to_hdf('data.h5', key='df', mode='w')
s = pd.Series([2, 3, 4, 5])
s.to_hdf('data.h5', key='s')
pd.read_hdf('data.h5', 'df')

Output:

  X	Y
p	2	5
q	3	6
r	4	7

Python-Pandas Code:

import pandas as pd
df = pd.DataFrame({'X': [2, 3, 4], 'Y': [5, 6, 7]},
                  index=['p', 'q', 'r'])
df.to_hdf('data.h5', key='df', mode='w')
s = pd.Series([2, 3, 4, 5])
s.to_hdf('data.h5', key='s')
pd.read_hdf('data.h5', 's')

Output:

0    2
1    3
2    4
3    5
dtype: int64
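
Example - Listing the keys stored in the file (illustrative sketch):

Because one HDF file can hold several objects, it can help to check which keys it contains before reading. The sketch below opens the file with pandas.HDFStore in read-only mode; the helper name list_keys is hypothetical, and PyTables is assumed to be installed.

```python
import pandas as pd


def list_keys(path):
    """Return the sorted keys of all objects stored in the HDF5 file at `path`."""
    # mode='r' opens the file read-only, so nothing in it can be changed.
    with pd.HDFStore(path, mode="r") as store:
        return sorted(store.keys())


if __name__ == "__main__":
    # Assumes 'data.h5' was created by the examples above.
    print(list_keys("data.h5"))
```

HDFStore reports keys with a leading slash, so the objects written above with key='df' and key='s' are listed as '/df' and '/s'.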

Example - Deleting file with data:

Python-Pandas Code:

import os
os.remove('data.h5')
