View the top and bottom rows of a DataFrame:

import numpy as np
import pandas as pd

dates = pd.date_range('20190101', periods=8)

df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=list('PQRS'))

df.head()

With no argument, head() returns the first five rows (2019-01-01 through 2019-01-05).

df.tail(3)

tail(3) returns the last three rows (2019-01-06 through 2019-01-08).
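As a minimal sketch of how the row count argument behaves, the snippet below rebuilds a frame of the same shape and calls head() and tail() with different values of n. The seeded generator `rng` and the variable names are assumptions added for repeatability; they are not part of the original example.

```python
import numpy as np
import pandas as pd

# Same shape and labels as df above; the seeded generator is an assumption
# used only so that repeated runs produce the same values.
rng = np.random.default_rng(0)
dates = pd.date_range('20190101', periods=8)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=list('PQRS'))

first_five = df.head()            # n defaults to 5
last_three = df.tail(3)           # explicit n
all_but_last_three = df.head(-3)  # a negative n leaves out the last 3 rows

print(first_five.index[0], first_five.index[-1])
print(last_three.index[0], last_three.index[-1])
print(len(all_but_last_three))
```

Both methods return a new DataFrame and leave df itself unchanged.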

Display the index and columns:

df.index

DatetimeIndex(['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04',
               '2019-01-05', '2019-01-06', '2019-01-07', '2019-01-08'],
              dtype='datetime64[ns]', freq='D')

df.columns

Index(['P', 'Q', 'R', 'S'], dtype='object')
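The index and column labels are ordinary pandas Index objects, so they can be inspected or converted like any other sequence. The following is a small sketch under the same assumptions as the frame above (the seeded `rng` is an addition for repeatability).

```python
import numpy as np
import pandas as pd

# Rebuild a frame with the same labels as df above (seeded for repeatability).
rng = np.random.default_rng(0)
dates = pd.date_range('20190101', periods=8)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=list('PQRS'))

row_labels = df.index            # DatetimeIndex with daily frequency
column_labels = df.columns       # Index of column names

print(row_labels.freqstr)        # frequency alias of the date range
print(column_labels.tolist())    # column names as a plain Python list
print(df.shape)                  # (number of rows, number of columns)
```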

DataFrame.to_numpy() gives a NumPy representation of the underlying data.

For df, whose columns are all float64, DataFrame.to_numpy() is fast and doesn’t require copying data.

df.to_numpy()

array([[ 0.01037757, -1.08489787, -0.44240246, -0.27828417],
       [ 1.86326425,  0.62636265,  0.63987705, -0.74176401],
       [-0.72747971, -0.47361484,  0.29916325, -2.37401899],
       [-0.11454301,  1.87307102, -0.7213719 , -0.75998405],
       [ 0.09742166, -0.42815908,  0.05608862, -0.03599818],
       [-0.03078364,  0.0072619 , -0.00713497,  1.15947185],
       [-0.28107373, -0.59509149,  1.02512422, -2.60274602],
       [ 0.17823484,  0.03812998,  0.30527348, -1.55129971]])
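The single-dtype case can be checked with a short sketch like the one below. Whether the returned array actually shares memory with the frame is an implementation detail that may vary with the pandas version and its copy settings, so the snippet prints the result of np.shares_memory rather than assuming it. The seeded `rng` is an assumption added for repeatability.

```python
import numpy as np
import pandas as pd

# All-float frame with the same shape as df above (seeded for repeatability).
rng = np.random.default_rng(0)
dates = pd.date_range('20190101', periods=8)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=list('PQRS'))

arr = df.to_numpy()
print(arr.dtype, arr.shape)      # one dtype for the whole array

# Two calls may return views of the same buffer; this depends on the
# pandas version, so the result is printed rather than asserted.
print(np.shares_memory(arr, df.to_numpy()))

# copy=True asks for an array that is independent of the frame.
independent = df.to_numpy(copy=True)
print(np.array_equal(arr, independent))
```

Passing copy=True is the safer choice when the array will be modified in place.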

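For df2, a DataFrame whose columns have different dtypes, DataFrame.to_numpy() is relatively expensive. A NumPy array has one dtype for the whole array, while a DataFrame has one dtype per column, so pandas must find a dtype that can hold every value (here, object) and convert each column to it.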

df2 = pd.DataFrame({'A': 1.,
                    'B': pd.Timestamp('20190102'),
                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                    'D': np.array([3] * 4, dtype='int32'),
                    'E': pd.Categorical(["test", "train", "test", "train"]),
                    'F': 'foo'})
df2

df2.to_numpy()

array([[1.0, Timestamp('2019-01-02 00:00:00'), 1.0, 3, 'test', 'foo'],
       [1.0, Timestamp('2019-01-02 00:00:00'), 1.0, 3, 'train', 'foo'],
       [1.0, Timestamp('2019-01-02 00:00:00'), 1.0, 3, 'test', 'foo'],
       [1.0, Timestamp('2019-01-02 00:00:00'), 1.0, 3, 'train', 'foo']],
      dtype=object)

Note: DataFrame.to_numpy() does not include the index or column labels in the output.
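To make the cost of mixed dtypes concrete, the sketch below compares converting all of df2 with converting only its numeric columns. Restricting the conversion to columns 'A', 'C' and 'D' is an illustrative choice made here, not something the original example does.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({'A': 1.,
                    'B': pd.Timestamp('20190102'),
                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                    'D': np.array([3] * 4, dtype='int32'),
                    'E': pd.Categorical(["test", "train", "test", "train"]),
                    'F': 'foo'})

print(df2.dtypes)                # one dtype per column

# Converting every column forces a dtype that can hold all values.
mixed = df2.to_numpy()
print(mixed.dtype, mixed.shape)

# Converting only the numeric columns lets NumPy use a numeric dtype.
numeric = df2[['A', 'C', 'D']].to_numpy()
print(numeric.dtype, numeric.shape)
```

Neither array carries the row or column labels; only the values are converted.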

describe() shows a quick statistical summary of your data:

df.describe()
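Each row of the summary corresponds to a per-column statistic that can also be computed directly. The sketch below cross-checks a few of them for column 'P'. The seeded `rng` is an assumption added for repeatability, so the numbers differ from the output shown in this article.

```python
import numpy as np
import pandas as pd

# Frame with the same shape as df above (seeded for repeatability).
rng = np.random.default_rng(0)
dates = pd.date_range('20190101', periods=8)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=list('PQRS'))

summary = df.describe()

# Rows of the summary are labelled by statistic, columns by the frame's columns.
print(summary.index.tolist())

# The same numbers computed directly on one column.
print(summary.loc['mean', 'P'], df['P'].mean())
print(summary.loc['std', 'P'], df['P'].std())      # sample standard deviation (ddof=1)
print(summary.loc['50%', 'P'], df['P'].median())   # the 50% row is the median
```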

Transposing data:

df.T
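Transposing swaps the row and column labels, which the following sketch verifies on a frame of the same shape. The seeded `rng` is an assumption added for repeatability.

```python
import numpy as np
import pandas as pd

# Frame with the same shape as df above (seeded for repeatability).
rng = np.random.default_rng(0)
dates = pd.date_range('20190101', periods=8)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=list('PQRS'))

transposed = df.T

print(df.shape, transposed.shape)     # rows and columns trade places
print(transposed.index.tolist())      # the old column names are now row labels
print(transposed.columns[0])          # the old first date is now a column label
```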

Sorting by an axis (here, the column labels in descending order):

df.sort_index(axis=1, ascending=False)

Sorting by values:

df.sort_values(by='Q')
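The two sorts differ in what they order: sort_index orders by labels along an axis, while sort_values orders rows by the data in a column. A minimal sketch, assuming a seeded frame of the same shape as df (the `rng` is an addition for repeatability):

```python
import numpy as np
import pandas as pd

# Frame with the same shape as df above (seeded for repeatability).
rng = np.random.default_rng(0)
dates = pd.date_range('20190101', periods=8)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=list('PQRS'))

# axis=1 sorts the column labels; ascending=False reverses the order.
by_labels = df.sort_index(axis=1, ascending=False)
print(by_labels.columns.tolist())

# Rows reordered so that column 'Q' increases from top to bottom.
by_q = df.sort_values(by='Q')
print(by_q['Q'].is_monotonic_increasing)

# Both calls return new frames; df keeps its original order.
print(df.columns.tolist(), df.index.is_monotonic_increasing)
```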

Output of df.head():

                   P         Q         R         S
2019-01-01  0.010378 -1.084898 -0.442402 -0.278284
2019-01-02  1.863264  0.626363  0.639877 -0.741764
2019-01-03 -0.727480 -0.473615  0.299163 -2.374019
2019-01-04 -0.114543  1.873071 -0.721372 -0.759984
2019-01-05  0.097422 -0.428159  0.056089 -0.035998

Output of df.tail(3):

                   P         Q         R         S
2019-01-06 -0.030784  0.007262 -0.007135  1.159472
2019-01-07 -0.281074 -0.595091  1.025124 -2.602746
2019-01-08  0.178235  0.038130  0.305273 -1.551300

Output of df2:

     A          B    C  D      E    F
0  1.0 2019-01-02  1.0  3   test  foo
1  1.0 2019-01-02  1.0  3  train  foo
2  1.0 2019-01-02  1.0  3   test  foo
3  1.0 2019-01-02  1.0  3  train  foo

Output of df.describe():

              P         Q         R         S
count  8.000000  8.000000  8.000000  8.000000
mean   0.124427 -0.004617  0.144327 -0.898078
std    0.757020  0.913457  0.560059  1.248733
min   -0.727480 -1.084898 -0.721372 -2.602746
25%   -0.156176 -0.503984 -0.115952 -1.756980
50%   -0.010203 -0.210449  0.177626 -0.750874
75%    0.117625  0.185188  0.388924 -0.217713
max    1.863264  1.873071  1.025124  1.159472