Python pandas: checking for missing values in a specified range of columns

July 27, 2020

# Make DataFrame
In[1]:
import numpy as np
import pandas as pd

df = pd.DataFrame({"A" : ['foo', np.nan, 'bar', np.nan,'foo', 'bar', 'foo', 'foo'],
                   "B" : ['one', 'one', np.nan, np.nan,'two', 'two', 'one', 'three'],
                   "C" : ['hoge', 'fuga', np.nan, 'fuga', np.nan, np.nan, 'hoge', 'fuga'],
                   "D" : np.random.randn(8)})  # random values, so column D differs on every run
df
Out[1]:
     A      B     C         D
0  foo    one  hoge -0.650722
1  NaN    one  fuga  1.343146
2  bar    NaN   NaN -0.560993
3  NaN    NaN  fuga  0.136937
4  foo    two   NaN  0.461315
5  bar    two   NaN -0.172828
6  foo    one  hoge -1.439034
7  foo  three  fuga -1.908443
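In[2]:
# Take the first two columns (A and B) by position and flag rows that contain any NaN.
# .ix was removed in pandas 1.0; .iloc is the positional equivalent here.
df.iloc[:, :2].isnull().any(axis=1)
Out[2]: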
0    False
1     True
2     True
3     True
4    False
5    False
6    False
7    False
dtype: bool
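
The same check can also be written with column labels instead of positions. The following is only a minimal sketch: `sample` is a small hypothetical frame (not the data above), and it relies on the label-based `.loc` slice, which includes both endpoints.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, used only for illustration.
sample = pd.DataFrame({
    "A": ["foo", np.nan, "bar", np.nan],
    "B": ["one", "one", np.nan, np.nan],
    "C": ["hoge", "fuga", np.nan, "fuga"],
})

# Label-based slice: columns "A" through "B", both endpoints included.
row_has_nan = sample.loc[:, "A":"B"].isnull().any(axis=1)

# The boolean Series can be used as a mask to pull out the affected rows.
rows_with_nan = sample[row_has_nan]

print(row_has_nan.tolist())          # [False, True, True, True]
print(rows_with_nan.index.tolist())  # [1, 2, 3]
```

Column C is outside the slice, so its NaN in row 2 plays no part in the result.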

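Drawback of this approach: if the slice yields a Series rather than a DataFrame (for example, when a single column is selected by an integer position), an error is raised.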

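In[3]:
# An integer position selects a single column as a Series, which has no axis 1.
# (Session recorded with the old .ix indexer; df.iloc[:, 1] also raises a ValueError.)
df.ix[:,1].isnull().any(axis=1)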
Traceback (most recent call last):
  File "<ipython-input-40-66179f97810c>", line 1, in <module>
    df.ix[:,1].isnull().any(axis=1)
  File "/usr/lib/python3/dist-packages/pandas/core/generic.py", line 4913, in logical_func
    name=name)
  File "/usr/lib/python3/dist-packages/pandas/core/series.py", line 2189, in _reduce
    self._get_axis_number(axis)
  File "/usr/lib/python3/dist-packages/pandas/core/generic.py", line 315, in _get_axis_number
    .format(axis, type(self)))
ValueError: No axis named 1 for object type <class 'pandas.core.series.Series'>
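
One possible way around this error is to make sure the selection stays two-dimensional, or to skip `any(axis=1)` when the selection is a Series. This is a sketch under assumptions, not the only fix: `sample` is a hypothetical frame, and the variable names are for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, used only for illustration.
sample = pd.DataFrame({
    "A": ["foo", np.nan, "bar"],
    "B": ["one", "one", np.nan],
})

# A list of positions (or a slice such as 1:2) keeps the result a DataFrame,
# so any(axis=1) works even for a single column.
as_frame = sample.iloc[:, [1]]
per_row_frame = as_frame.isnull().any(axis=1)

# A scalar position returns a Series; isnull() is already one value per row,
# so no reduction along axis 1 is needed.
as_series = sample.iloc[:, 1]
per_row_series = as_series.isnull()

print(type(as_frame).__name__, per_row_frame.tolist())    # DataFrame [False, False, True]
print(type(as_series).__name__, per_row_series.tolist())  # Series [False, False, True]
```

Both forms give the same per-row answer. The list form is convenient when the number of selected columns is not known in advance.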

Pandas,Python

Posted by vastee