Python pandas: checking for missing values within a specified range of columns
# Make DataFrame
In[1]: import numpy as np
       import pandas as pd
       df = pd.DataFrame({"A": ['foo', np.nan, 'bar', np.nan, 'foo', 'bar', 'foo', 'foo'],
                          "B": ['one', 'one', np.nan, np.nan, 'two', 'two', 'one', 'three'],
                          "C": ['hoge', 'fuga', np.nan, 'fuga', np.nan, np.nan, 'hoge', 'fuga'],
                          "D": np.random.randn(8)})
In[2]: df
Out[2]:
     A      B     C         D
0  foo    one  hoge -0.650722
1  NaN    one  fuga  1.343146
2  bar    NaN   NaN -0.560993
3  NaN    NaN  fuga  0.136937
4  foo    two   NaN  0.461315
5  bar    two   NaN -0.172828
6  foo    one  hoge -1.439034
7  foo  three  fuga -1.908443
# Select the first two columns by position and flag every row that contains a NaN.
# .iloc is used here because the older .ix indexer was removed in pandas 1.0.
In[3]: df.iloc[:, :2].isnull().any(axis=1)
Out[3]:
0    False
1     True
2     True
3     True
4    False
5    False
6    False
7    False
dtype: bool
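Drawback of this approach: if the selection returns a Series instead of a DataFrame (for example, when a single column is picked with an integer), an error is raised, because a Series has only axis 0 and any(axis=1) is not valid for it.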
In[4]: df.ix[:,1].isnull().any(axis=1)
Traceback (most recent call last):
  File "<ipython-input-40-66179f97810c>", line 1, in <module>
    df.ix[:,1].isnull().any(axis=1)
  File "/usr/lib/python3/dist-packages/pandas/core/generic.py", line 4913, in logical_func
    name=name)
  File "/usr/lib/python3/dist-packages/pandas/core/series.py", line 2189, in _reduce
    self._get_axis_number(axis)
  File "/usr/lib/python3/dist-packages/pandas/core/generic.py", line 315, in _get_axis_number
    .format(axis, type(self)))
ValueError: No axis named 1 for object type <class 'pandas.core.series.Series'>
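One possible way around this error is to make sure the selection stays two-dimensional. The sketch below is not from the original post. It uses a smaller frame with only columns A and B, and `rows_with_nan` is a hypothetical helper name introduced purely for illustration.

```python
import numpy as np
import pandas as pd

# Smaller frame (an assumption for this sketch) with missing values in A and B.
df = pd.DataFrame({
    "A": ["foo", np.nan, "bar", np.nan],
    "B": ["one", "one", np.nan, np.nan],
})

# Passing a list of positions (or a length-one slice such as 1:2) returns a
# one-column DataFrame instead of a Series, so axis=1 is still valid.
single_col = df.iloc[:, [1]]
row_has_nan = single_col.isnull().any(axis=1)


def rows_with_nan(selection):
    """Hypothetical helper: accept a Series or a DataFrame and return a
    boolean Series marking the rows that contain at least one NaN."""
    if isinstance(selection, pd.Series):
        # Promote the Series to a one-column DataFrame before reducing.
        selection = selection.to_frame()
    return selection.isnull().any(axis=1)


# Works for a single column selected as a Series ...
series_result = rows_with_nan(df.iloc[:, 1])
# ... and for a range of columns selected as a DataFrame.
frame_result = rows_with_nan(df.iloc[:, :2])
```

For a single column, calling `isnull()` on the Series already gives the per-row result, so the helper is only useful when the code cannot know in advance whether the selection will be a Series or a DataFrame.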