Creating a DataFrame from Lists [Pandas]

This article shows how to build a DataFrame by first loading the values into it and then assigning the column names and index names.

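Each column is a list, and the DataFrame is built from the resulting 2D list

The key point is to transpose the result of pd.DataFrame: each inner list is read as a row, so .T is what turns the lists into columns.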

import pandas as pd
l2d = [["January", "September", "September", "November", "February"],
    ["UK", "Spain", "Jamaica", "Italy", "France"],
    [27, 28, 28, 22, 19],
    ["f", "f", "f", "f", "f"]]
df = pd.DataFrame(l2d).T
df.columns = ["Birth Month", "Origin", "Age", "Gender"]
df.index = ["Carly", "Rachel", "Nicky", "Wendy", "Judith"]

Output

        Birth Month   Origin  Age  Gender
Carly       January       UK   27       f
Rachel    September    Spain   28       f
Nicky     September  Jamaica   28       f
Wendy      November    Italy   22       f
Judith     February   France   19       f
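
One side effect of the transpose is worth checking: because each inner list of the original 2D list mixes strings and integers, every column of the transposed frame ends up with the generic object dtype, including Age. The following is a minimal sketch of how that could be inspected and corrected. The variable name age_dtype_before and the choice of astype(int) are illustrative assumptions, not part of the original recipe.

```python
import pandas as pd

l2d = [["January", "September", "September", "November", "February"],
       ["UK", "Spain", "Jamaica", "Italy", "France"],
       [27, 28, 28, 22, 19],
       ["f", "f", "f", "f", "f"]]
df = pd.DataFrame(l2d).T
df.columns = ["Birth Month", "Origin", "Age", "Gender"]
df.index = ["Carly", "Rachel", "Nicky", "Wendy", "Judith"]

# After the transpose, Age is stored as generic Python objects, not integers
age_dtype_before = df["Age"].dtype
print(age_dtype_before)

# Convert Age to an integer column so numeric operations behave as expected
df["Age"] = df["Age"].astype(int)
print(df["Age"].dtype)
print(df["Age"].mean())
```

If several columns need fixing at once, converting each one explicitly as above keeps the intent visible.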

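Each column is a list, and the DataFrame is built from the resulting 2D list (another pattern)

This is almost the same as the first example, but it is included for reference.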

import pandas as pd

if __name__ == "__main__":
    # Each list holds the values of one column
    c1 = ["January", "September", "September", "November", "February"]
    c2 = ["UK", "Spain", "Jamaica", "Italy", "France"]
    c3 = [27, 28, 28, 22, 19]
    c4 = ["f", "f", "f", "f", "f"]
    # Stack the column lists as rows, then transpose so each list becomes a column
    df = pd.DataFrame([c1, c2, c3, c4]).T
    df.columns = ["Birth Month", "Origin", "Age", "Gender"]
    df.index = ["Carly", "Rachel", "Nicky", "Wendy", "Judith"]
    print(df)

Output

        Birth Month   Origin  Age  Gender
Carly       January       UK   27       f
Rachel    September    Spain   28       f
Nicky     September  Jamaica   28       f
Wendy      November    Italy   22       f
Judith     February   France   19       f
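
When the columns already exist as separate lists, a dictionary that maps column names to lists is a possible alternative that skips the transpose. This is a sketch of a different technique from the one shown above, not a restatement of it. It reuses the list names c1 to c4 and the same labels, and each column keeps its own dtype (Age stays an integer column).

```python
import pandas as pd

c1 = ["January", "September", "September", "November", "February"]
c2 = ["UK", "Spain", "Jamaica", "Italy", "France"]
c3 = [27, 28, 28, 22, 19]
c4 = ["f", "f", "f", "f", "f"]

# Each dictionary key becomes a column name; each list becomes that column's values
df = pd.DataFrame(
    {"Birth Month": c1, "Origin": c2, "Age": c3, "Gender": c4},
    index=["Carly", "Rachel", "Nicky", "Wendy", "Judith"],
)
print(df)
```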

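Each row is a list, and the DataFrame is built from the resulting 2D list

When starting from rows, the result of pd.DataFrame does not need to be transposed.

This is probably the standard pattern. However, I mostly do this kind of thing when consolidating data scattered across various formats, so my data is rarely accumulated row by row, and I seldom get to use this approach.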

import pandas as pd
l2d = [['January', 'UK', 27, 'f'],
    ['September', 'Spain', 28, 'f'],
    ['September', 'Jamaica', 28, 'f'],
    ['November', 'Italy', 22, 'f'],
    ['February', 'France', 19, 'f']]
df = pd.DataFrame(l2d)
df.columns = ["Birth Month", "Origin", "Age", "Gender"]
df.index = ["Carly", "Rachel", "Nicky", "Wendy", "Judith"]

Output

        Birth Month   Origin  Age  Gender
Carly       January       UK   27       f
Rachel    September    Spain   28       f
Nicky     September  Jamaica   28       f
Wendy      November    Italy   22       f
Judith     February   France   19       f
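
For row-oriented data, the labels can also be passed straight to the constructor through its columns and index parameters instead of being assigned afterwards. The sketch below uses the same data, with the illustrative variable names rows, columns, and index. It also builds the two-step version to show that both routes give an equal frame.

```python
import pandas as pd

rows = [["January", "UK", 27, "f"],
        ["September", "Spain", 28, "f"],
        ["September", "Jamaica", 28, "f"],
        ["November", "Italy", 22, "f"],
        ["February", "France", 19, "f"]]
columns = ["Birth Month", "Origin", "Age", "Gender"]
index = ["Carly", "Rachel", "Nicky", "Wendy", "Judith"]

# One step: values, column names, and index names all given to the constructor
df = pd.DataFrame(rows, columns=columns, index=index)

# Two steps, as in the article: build first, then assign the labels
df_two_step = pd.DataFrame(rows)
df_two_step.columns = columns
df_two_step.index = index

print(df.equals(df_two_step))
```

Which form to use is mostly a matter of taste. Assigning the labels afterwards matches the flow used throughout this article, while the constructor arguments keep everything in one call.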