Reshape with melt
Reshape your data with melt.
The pandas.DataFrame.melt() function reshapes a DataFrame into a long format in which one or more columns act as identifier variables (id_vars), while all other columns are treated as measured variables (value_vars). The measured columns are "unpivoted" to the row axis, leaving just two non-identifier columns, 'variable' and 'value'.
To demonstrate this, we will work with the EEG brain dataset. The measurements are in the X<number> columns (X1 to X178), and the variable of interest is the 'y' column.
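As a minimal sketch of the id_vars / value_vars split described above, the example below melts a tiny made-up frame. The column names `subject`, `height` and `weight` are hypothetical and chosen only for illustration; they are not part of the EEG dataset.

```python
import pandas as pd

# Hypothetical toy data: one identifier column and two measured columns.
wide = pd.DataFrame(
    {
        "subject": ["a", "b"],
        "height": [170, 182],
        "weight": [65, 80],
    }
)

# 'subject' is kept as an identifier; 'height' and 'weight' are unpivoted
# into the default 'variable' and 'value' columns.
long = wide.melt(id_vars=["subject"], value_vars=["height", "weight"])
print(long)
```

Each measured column contributes one row per original row, so the two-row frame with two measured columns becomes four rows.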
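| | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 | ... | X170 | X171 | X172 | X173 | X174 | X175 | X176 | X177 | X178 | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 135 | 190 | 229 | 223 | 192 | 125 | 55 | -9 | -33 | -38 | ... | -17 | -15 | -31 | -77 | -103 | -127 | -116 | -83 | -51 | 4 |
| 1 | 386 | 382 | 356 | 331 | 320 | 315 | 307 | 272 | 244 | 232 | ... | 164 | 150 | 146 | 152 | 157 | 156 | 154 | 143 | 129 | 1 |
| 2 | -32 | -39 | -47 | -37 | -32 | -36 | -57 | -73 | -85 | -94 | ... | 57 | 64 | 48 | 19 | -12 | -30 | -35 | -35 | -36 | 5 |
| 3 | -105 | -101 | -96 | -92 | -89 | -95 | -102 | -100 | -87 | -79 | ... | -82 | -81 | -80 | -77 | -85 | -77 | -72 | -69 | -65 | 5 |
| 4 | -9 | -65 | -98 | -102 | -78 | -48 | -16 | 0 | -21 | -59 | ... | 4 | 2 | -12 | -32 | -41 | -65 | -83 | -89 | -73 | 5 |

5 rows × 179 columns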
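We can reshape this with the melt function. It keeps the 'y' column as the identifier and stacks all the other columns into rows: each column name goes into the 'variable' column and its measurement into the 'value' column.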
| | y | variable | value |
|---|---|---|---|
| 0 | 4 | X1 | 135 |
| 1 | 1 | X1 | 386 |
| 2 | 5 | X1 | -32 |
| 3 | 5 | X1 | -105 |
| 4 | 5 | X1 | -9 |
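The melt step for the EEG data can be sketched as follows. To keep the example self-contained, it rebuilds only the first three X columns and the 'y' column of the five rows shown earlier; the name `eeg_sample` is an assumption, and the real frame has 178 X columns.

```python
import pandas as pd

# Small excerpt of the five rows shown earlier (X1 to X3 and y only).
eeg_sample = pd.DataFrame(
    {
        "X1": [135, 386, -32, -105, -9],
        "X2": [190, 382, -39, -101, -65],
        "X3": [229, 356, -47, -96, -98],
        "y": [4, 1, 5, 5, 5],
    }
)

# With value_vars omitted, every column that is not in id_vars is melted.
melted = eeg_sample.melt(id_vars=["y"])

print(melted.head())
print(melted.shape)  # (15, 3): 5 rows x 3 melted columns
```

The first five rows of `melted` all carry the variable X1, because melt stacks the measured columns one after another.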
This might be handy if I want, for instance, to group by the y value to discover differences in counts, means, or standard deviations. I can also make a graphical overview.
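One possible way to compute such a summary is sketched below. It again uses a small excerpt of the rows shown earlier (the names `eeg_sample`, `melted` and `summary` are assumptions), so the numbers it prints describe only that excerpt, not the full dataset.

```python
import pandas as pd

# Small excerpt of the EEG rows shown earlier (X1 to X3 and y only).
eeg_sample = pd.DataFrame(
    {
        "X1": [135, 386, -32, -105, -9],
        "X2": [190, 382, -39, -101, -65],
        "X3": [229, 356, -47, -96, -98],
        "y": [4, 1, 5, 5, 5],
    }
)
melted = eeg_sample.melt(id_vars=["y"])

# Count, mean and standard deviation of all measurements per y value.
summary = melted.groupby("y")["value"].agg(["count", "mean", "std"])
print(summary)
```

For a graphical overview, something like `melted.boxplot(column="value", by="y")` could be used, assuming matplotlib is installed.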