How to Create Dummy Variables in Python
In this article, we will discuss creating dummy variables in Python with Pandas.
A dummy variable is a variable that indicates whether a separate categorical variable takes on a specific value.
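To make the definition concrete, here is a minimal sketch that builds a single dummy variable by hand. The `heights` data and the name `is_medium` are made up for illustration; the comparison simply marks the rows where the category equals one chosen value.

```python
import pandas as pd

# Hypothetical example data (not from a real dataset)
heights = pd.Series(['Small', 'Medium', 'Tall', 'Medium'], name='Height')

# Dummy variable: 1 where the category is 'Medium', 0 everywhere else
is_medium = (heights == 'Medium').astype(int)

print(is_medium.tolist())  # [0, 1, 0, 1]
```

Doing this for every distinct category gives one indicator column per value.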
We can create dummy variables in Python using the pandas get_dummies() function.
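Syntax: pandas.get_dummies(data, prefix=None, prefix_sep='_')
- data – the input data: a pandas DataFrame, a Series, or an array-like object such as a list or NumPy array.
- prefix – the string prepended to each dummy column name. For DataFrame input it defaults to the name of the source column.
- prefix_sep – the separator placed between the prefix and the category value. The default is '_'.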
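As a rough illustration of how prefix and prefix_sep shape the resulting column names, the sketch below overrides both defaults. The values 'H' and '-' are arbitrary choices for this example.

```python
import pandas as pd

df = pd.DataFrame({'Height': ['Small', 'Medium', 'Tall']})

# Use 'H' as the prefix and '-' as the separator instead of the
# defaults (the column name 'Height' and '_')
dummies = pd.get_dummies(df, prefix='H', prefix_sep='-')

print(list(dummies.columns))  # ['H-Medium', 'H-Small', 'H-Tall']
```

The category columns come out in sorted order, which is why 'H-Medium' appears first.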
# required modules
import pandas as pd

# create dataset
df = pd.DataFrame({'Height': ['Small', 'Medium', 'Tall', 'Very Tall']})

# display dataset
print(df)

# create dummy variables
print(pd.get_dummies(df))
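Output (as printed by pandas versions before 2.0; from 2.0 onward the dummy columns are boolean by default and print as True/False):
      Height
0      Small
1     Medium
2       Tall
3  Very Tall
   Height_Medium  Height_Small  Height_Tall  Height_Very Tall
0              0             1            0                 0
1              1             0            0                 0
2              0             0            1                 0
3              0             0            0                 1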
Now let's create dummy variables from a pandas Series built from a list.
# required modules
import pandas as pd

# create dataset
l = pd.Series(list('abc'))

# display dataset
print(l)

# create dummy variables
print(pd.get_dummies(l))
Output:
0    a
1    b
2    c
dtype: object
   a  b  c
0  1  0  0
1  0  1  0
2  0  0  1
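Recent pandas releases (2.0 and later) return boolean dummy columns by default, so the same call may print True/False rather than 0/1. One way to get integer columns regardless of version, shown here as a small sketch, is to pass the dtype argument of get_dummies():

```python
import pandas as pd

s = pd.Series(list('abc'))

# Request integer 0/1 columns explicitly instead of relying on the default dtype
dummies = pd.get_dummies(s, dtype=int)

print(dummies)
```

Each row then contains a single 1 in the column matching its category and 0 elsewhere.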