Overview

Dataset statistics

Number of variables: 15
Number of observations: 36408
Missing cells: 94148
Missing cells (%): 17.2%
Duplicate rows: 3451
Duplicate rows (%): 9.5%
Total size in memory: 5.5 MiB
Average record size in memory: 157.0 B
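
The headline percentages can be rechecked from the counts printed in this report. The sketch below is only arithmetic on those figures: it assumes the per-variable "Missing" counts listed further down are complete, and it does not read the underlying dataset.

```python
# All figures are copied from the report; the dataset itself is not read here.
n_rows = 36408
n_cols = 15
duplicate_rows = 3451

# Per-variable "Missing" counts, in the order the variables appear in the report.
missing_per_variable = [
    0, 3, 3017, 3009, 3017, 3017, 3017, 3017,
    22084, 15296, 3017, 3017, 26603, 3017, 3017,
]

# Missing cells are the sum over variables; the share is taken over all cells.
missing_cells = sum(missing_per_variable)
missing_pct = 100 * missing_cells / (n_rows * n_cols)
duplicate_pct = 100 * duplicate_rows / n_rows

print(missing_cells)             # 94148
print(f"{missing_pct:.1f}%")     # 17.2%
print(f"{duplicate_pct:.1f}%")   # 9.5%
```

The three printed values match the "Missing cells", "Missing cells (%)" and "Duplicate rows (%)" entries above.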

Variable types

Categorical: 15

Alerts

Dataset has 3451 (9.5%) duplicate rows Duplicates
have_other_ventelation_holes is highly overall correlated with main_material_used_for_the_floor_of_the_room and 2 other fields High correlation
main_material_used_for_the_floor_of_the_room is highly overall correlated with have_other_ventelation_holes High correlation
main_material_used_for_the_roof_of_the_room is highly overall correlated with have_other_ventelation_holes High correlation
type_of_ceiling_of_the_room is highly overall correlated with have_other_ventelation_holes High correlation
storey_which_the_room_located is highly imbalanced (75.8%) Imbalance
main_material_used_for_the_floor_of_the_room is highly imbalanced (51.5%) Imbalance
no_of_doors_opened_to_external_environment is highly imbalanced (69.6%) Imbalance
main_material_used_for_window_panes is highly imbalanced (61.2%) Imbalance
no_of_bulbs_in_the_room is highly imbalanced (56.0%) Imbalance
no_of_bulbs_used_during_last_week is highly imbalanced (67.0%) Imbalance
no_of_fans_in_the_room is highly imbalanced (66.7%) Imbalance
no_of_ACs_in_the_room is highly imbalanced (90.6%) Imbalance
storey_which_the_room_located has 3017 (8.3%) missing values Missing
main_material_used_for_the_roof_of_the_room has 3009 (8.3%) missing values Missing
type_of_ceiling_of_the_room has 3017 (8.3%) missing values Missing
main_material_used_for_the_floor_of_the_room has 3017 (8.3%) missing values Missing
no_of_doors_opened_to_external_environment has 3017 (8.3%) missing values Missing
no_of_windows has 3017 (8.3%) missing values Missing
main_material_used_for_window_panes has 22084 (60.7%) missing values Missing
have_curtains_or_blinds_for_windows has 15296 (42.0%) missing values Missing
have_other_ventelation_holes has 3017 (8.3%) missing values Missing
no_of_bulbs_in_the_room has 3017 (8.3%) missing values Missing
no_of_bulbs_used_during_last_week has 26603 (73.1%) missing values Missing
no_of_fans_in_the_room has 3017 (8.3%) missing values Missing
no_of_ACs_in_the_room has 3017 (8.3%) missing values Missing
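
The report does not say how the "highly imbalanced" percentages are derived. A minimal sketch, assuming the score is one minus the Shannon entropy of the category counts normalised by the log of the number of categories: applied to the five counts listed for no_of_ACs_in_the_room, this assumption reproduces the 90.6% shown in the alert. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 for a uniform split, near 1 when one class dominates.

    Assumed formula; the report itself does not state how the score is computed.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)


# Non-missing value counts for no_of_ACs_in_the_room, copied from the report.
ac_counts = [32305, 1029, 30, 26, 1]
score = imbalance_score(ac_counts)
print(f"{score:.1%}")  # 90.6%
```

A perfectly balanced variable scores 0 under this definition, so the values in the alerts can be read as "distance from uniform".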

Reproduction

Analysis started: 2024-11-18 08:39:54.036415
Analysis finished: 2024-11-18 08:40:00.045072
Duration: 6.01 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json
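
A report of this kind can be regenerated with ydata-profiling's `ProfileReport` class. The sketch below is not the configuration that produced this report: the CSV path, the output path and the title are placeholders, and the imports sit inside the function so that the definition loads even where pandas or ydata-profiling is not installed.

```python
def build_report(csv_path, html_path, title="Room characteristics profile"):
    """Profile a CSV file and write the HTML report to html_path.

    csv_path, html_path and title are hypothetical; the source of the
    original dataset is not given in the report.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title=title)
    profile.to_file(html_path)
    return profile


# Example call (placeholder file names):
# build_report("rooms.csv", "rooms_profile.html")
```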

Variables

room_ID
Categorical

Distinct: 32
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB

Value               Count
I_1                 4063
I_2                 4060
I_3                 4044
I_4                 3967
I_5                 3711
Other values (27)   16563

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 327672
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
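
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category and name of a code point (it does not expose script or block). The characters below are taken from the room_ID sample values; the space is included on the assumption that the padding in these fixed-length values is an ordinary space. The helper name `describe` is hypothetical.

```python
import unicodedata


def describe(ch):
    """Return (general category, Unicode name) for a single character."""
    return unicodedata.category(ch), unicodedata.name(ch, "<unnamed>")


for ch in "I_1 ":
    category, name = describe(ch)
    print(repr(ch), category, name)
```

The categories come out as `Lu` (uppercase letter), `Pc` (connector punctuation), `Nd` (decimal digit) and `Zs` (space separator).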

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: I_1
2nd row: I_2
3rd row: I_3
4th row: I_4
5th row: I_5

Common Values

Value               Count   Frequency (%)
I_1                 4063    11.2%
I_2                 4060    11.2%
I_3                 4044    11.1%
I_4                 3967    10.9%
I_5                 3711    10.2%
I_6                 3243    8.9%
I_7                 2724    7.5%
I_8                 2488    6.8%
I_9                 2382    6.5%
I_10                2343    6.4%
Other values (22)   3383    9.3%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
i_1                 4063    11.2%
i_2                 4060    11.2%
i_3                 4044    11.1%
i_4                 3967    10.9%
i_5                 3711    10.2%
i_6                 3243    8.9%
i_7                 2724    7.5%
i_8                 2488    6.8%
i_9                 2382    6.5%
i_10                2343    6.4%
Other values (22)   3383    9.3%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        211201  64.5%
_                   36914   11.3%
I                   36408   11.1%
1                   10389   3.2%
2                   4735    1.4%
3                   4492    1.4%
4                   4307    1.3%
5                   3970    1.2%
6                   3432    1.0%
7                   2868    0.9%
Other values (6)    8956    2.7%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           327672  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           327672  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           327672  100.0%


main_purpose_of_the_room
Categorical

Distinct: 18
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 1.6 MiB

Value               Count
2                   10380
4                   5767
3                   5159
1                   4431
6                   3776
Other values (13)   6892

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 327645
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value               Count   Frequency (%)
2                   10380   28.5%
4                   5767    15.8%
3                   5159    14.2%
1                   4431    12.2%
6                   3776    10.4%
10                  3405    9.4%
9                   1825    5.0%
7                   736     2.0%
5                   609     1.7%
18                  81      0.2%
Other values (8)    236     0.6%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
2                   10380   28.5%
4                   5767    15.8%
3                   5159    14.2%
1                   4431    12.2%
6                   3776    10.4%
10                  3405    9.4%
9                   1825    5.0%
7                   736     2.0%
5                   609     1.7%
18                  81      0.2%
Other values (8)    236     0.6%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        287562  87.8%
2                   10432   3.2%
1                   8137    2.5%
4                   5769    1.8%
3                   5174    1.6%
6                   3800    1.2%
0                   3405    1.0%
9                   1825    0.6%
7                   745     0.2%
5                   671     0.2%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           327645  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           327645  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           327645  100.0%

storey_which_the_room_located
Categorical

Imbalance  Missing 

Distinct: 14
Distinct (%): < 0.1%
Missing: 3017
Missing (%): 8.3%
Memory size: 1.6 MiB

Value               Count
0                   26381
1                   5985
2                   702
3                   152
8                   33
Other values (9)    138

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 300519
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value               Count   Frequency (%)
0                   26381   72.5%
1                   5985    16.4%
2                   702     1.9%
3                   152     0.4%
8                   33      0.1%
4                   30      0.1%
7                   29      0.1%
5                   24      0.1%
6                   20      0.1%
10                  12      < 0.1%
Other values (4)    23      0.1%
(Missing)           3017    8.3%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
0                   26381   79.0%
1                   5985    17.9%
2                   702     2.1%
3                   152     0.5%
8                   33      0.1%
4                   30      0.1%
7                   29      0.1%
5                   24      0.1%
6                   20      0.1%
10                  12      < 0.1%
Other values (4)    23      0.1%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        267104  88.9%
0                   26393   8.8%
1                   6017    2.0%
2                   705     0.2%
3                   152     0.1%
8                   34      < 0.1%
4                   30      < 0.1%
7                   29      < 0.1%
5                   24      < 0.1%
6                   20      < 0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           300519  100.0%

main_material_used_for_the_roof_of_the_room
Categorical

High correlation  Missing 

Distinct: 10
Distinct (%): < 0.1%
Missing: 3009
Missing (%): 8.3%
Memory size: 1.6 MiB

Value               Count
2                   14980
3                   11566
1                   4329
10                  1739
5                   287
Other values (5)    498

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 367389
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value               Count   Frequency (%)
2                   14980   41.1%
3                   11566   31.8%
1                   4329    11.9%
10                  1739    4.8%
5                   287     0.8%
4                   226     0.6%
9                   211     0.6%
8                   59      0.2%
7                   1       < 0.1%
6                   1       < 0.1%
(Missing)           3009    8.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value               Count   Frequency (%)
2                   14980   44.9%
3                   11566   34.6%
1                   4329    13.0%
10                  1739    5.2%
5                   287     0.9%
4                   226     0.7%
9                   211     0.6%
8                   59      0.2%
7                   1       < 0.1%
6                   1       < 0.1%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        332251  90.4%
2                   14980   4.1%
3                   11566   3.1%
1                   6068    1.7%
0                   1739    0.5%
5                   287     0.1%
4                   226     0.1%
9                   211     0.1%
8                   59      < 0.1%
7                   1       < 0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           367389  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           367389  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           367389  100.0%

type_of_ceiling_of_the_room
Categorical

High correlation  Missing 

Distinct: 9
Distinct (%): < 0.1%
Missing: 3017
Missing (%): 8.3%
Memory size: 1.6 MiB

Value               Count
6                   10521
1                   9847
2                   5960
7                   1847
9                   1843
Other values (4)    3373

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 367301
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value               Count   Frequency (%)
6                   10521   28.9%
1                   9847    27.0%
2                   5960    16.4%
7                   1847    5.1%
9                   1843    5.1%
4                   1320    3.6%
3                   1281    3.5%
8                   682     1.9%
5                   90      0.2%
(Missing)           3017    8.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value               Count   Frequency (%)
6                   10521   31.5%
1                   9847    29.5%
2                   5960    17.8%
7                   1847    5.5%
9                   1843    5.5%
4                   1320    4.0%
3                   1281    3.8%
8                   682     2.0%
5                   90      0.3%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        333910  90.9%
6                   10521   2.9%
1                   9847    2.7%
2                   5960    1.6%
7                   1847    0.5%
9                   1843    0.5%
4                   1320    0.4%
3                   1281    0.3%
8                   682     0.2%
5                   90      < 0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           367301  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           367301  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           367301  100.0%

main_material_used_for_the_floor_of_the_room
Categorical

High correlation  Imbalance  Missing 

Distinct: 11
Distinct (%): < 0.1%
Missing: 3017
Missing (%): 8.3%
Memory size: 1.6 MiB

Value               Count
3                   16272
1                   12961
11                  1806
9                   1263
2                   398
Other values (6)    691

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 300519
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 1
5th row: 1

Common Values

Value               Count   Frequency (%)
3                   16272   44.7%
1                   12961   35.6%
11                  1806    5.0%
9                   1263    3.5%
2                   398     1.1%
10                  306     0.8%
7                   137     0.4%
4                   73      0.2%
8                   72      0.2%
6                   69      0.2%
(Missing)           3017    8.3%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
3                   16272   48.7%
1                   12961   38.8%
11                  1806    5.4%
9                   1263    3.8%
2                   398     1.2%
10                  306     0.9%
7                   137     0.4%
4                   73      0.2%
8                   72      0.2%
6                   69      0.2%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        265016  88.2%
1                   16879   5.6%
3                   16272   5.4%
9                   1263    0.4%
2                   398     0.1%
0                   306     0.1%
7                   137     < 0.1%
4                   73      < 0.1%
8                   72      < 0.1%
6                   69      < 0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           300519  100.0%

no_of_doors_opened_to_external_environment
Categorical

Imbalance  Missing 

Distinct: 18
Distinct (%): 0.1%
Missing: 3017
Missing (%): 8.3%
Memory size: 1.6 MiB

Value               Count
0                   19548
1                   12298
2                   972
3                   280
4                   174
Other values (13)   119

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 300519
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value               Count   Frequency (%)
0                   19548   53.7%
1                   12298   33.8%
2                   972     2.7%
3                   280     0.8%
4                   174     0.5%
5                   33      0.1%
6                   33      0.1%
8                   20      0.1%
7                   16      < 0.1%
12                  4       < 0.1%
Other values (8)    13      < 0.1%
(Missing)           3017    8.3%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
0                   19548   58.5%
1                   12298   36.8%
2                   972     2.9%
3                   280     0.8%
4                   174     0.5%
5                   33      0.1%
6                   33      0.1%
8                   20      0.1%
7                   16      < 0.1%
12                  4       < 0.1%
Other values (8)    13      < 0.1%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        267114  88.9%
0                   19551   6.5%
1                   12314   4.1%
2                   977     0.3%
3                   281     0.1%
4                   175     0.1%
5                   34      < 0.1%
6                   33      < 0.1%
8                   20      < 0.1%
7                   16      < 0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           300519  100.0%

no_of_windows
Categorical

Missing 

Distinct: 24
Distinct (%): 0.1%
Missing: 3017
Missing (%): 8.3%
Memory size: 1.6 MiB

Value               Count
0                   12279
1                   6797
2                   6039
3                   4770
4                   1498
Other values (19)   2008

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 300519
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value               Count   Frequency (%)
0                   12279   33.7%
1                   6797    18.7%
2                   6039    16.6%
3                   4770    13.1%
4                   1498    4.1%
6                   643     1.8%
5                   466     1.3%
8                   282     0.8%
7                   279     0.8%
9                   137     0.4%
Other values (14)   201     0.6%
(Missing)           3017    8.3%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
0                   12279   36.8%
1                   6797    20.4%
2                   6039    18.1%
3                   4770    14.3%
4                   1498    4.5%
6                   643     1.9%
5                   466     1.4%
8                   282     0.8%
7                   279     0.8%
9                   137     0.4%
Other values (14)   201     0.6%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        266927  88.8%
0                   12345   4.1%
1                   7028    2.3%
2                   6104    2.0%
3                   4779    1.6%
4                   1507    0.5%
6                   651     0.2%
5                   473     0.2%
8                   285     0.1%
7                   283     0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           300519  100.0%

main_material_used_for_window_panes
Categorical

Imbalance  Missing 

Distinct: 6
Distinct (%): < 0.1%
Missing: 22084
Missing (%): 60.7%
Memory size: 1.6 MiB

Value               Count
3                   10942
2                   2839
5                   203
1                   185
4                   142

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 128916
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value               Count   Frequency (%)
3                   10942   30.1%
2                   2839    7.8%
5                   203     0.6%
1                   185     0.5%
4                   142     0.4%
6                   13      < 0.1%
(Missing)           22084   60.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value               Count   Frequency (%)
3                   10942   76.4%
2                   2839    19.8%
5                   203     1.4%
1                   185     1.3%
4                   142     1.0%
6                   13      0.1%
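
The two value tables for this variable show the same counts with different percentages. A worked check, using only figures from this report, suggests the first table divides by all observations while the second divides by the non-missing ones; this is an inference from the arithmetic, not something the report states.

```python
# Figures copied from the report for main_material_used_for_window_panes.
n_rows = 36408
n_missing = 22084
count_value_3 = 10942

# Share of value "3" among all rows versus among rows with a recorded value.
share_of_all_rows = count_value_3 / n_rows
share_of_non_missing = count_value_3 / (n_rows - n_missing)

print(f"{share_of_all_rows:.1%}")     # 30.1%
print(f"{share_of_non_missing:.1%}")  # 76.4%
```

With 60.7% of the values missing, the choice of denominator more than doubles the apparent share of the top category, so it is worth stating which one is meant when quoting these figures.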

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        114592  88.9%
3                   10942   8.5%
2                   2839    2.2%
5                   203     0.2%
1                   185     0.1%
4                   142     0.1%
6                   13      < 0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           128916  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           128916  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           128916  100.0%


have_curtains_or_blinds_for_windows
Categorical

Missing

Distinct: 3
Distinct (%): < 0.1%
Missing: 15296
Missing (%): 42.0%
Memory size: 1.6 MiB

Value               Count
1                   13235
0                   7839
3                   38

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 190008
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value               Count   Frequency (%)
1                   13235   36.4%
0                   7839    21.5%
3                   38      0.1%
(Missing)           15296   42.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value               Count   Frequency (%)
1                   13235   62.7%
0                   7839    37.1%
3                   38      0.2%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        168896  88.9%
1                   13235   7.0%
0                   7839    4.1%
3                   38      < 0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           190008  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           190008  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           190008  100.0%

have_other_ventelation_holes
Categorical

High correlation  Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 3017
Missing (%): 8.3%
Memory size: 1.6 MiB

Value               Count
1                   17714
0                   13370
3                   2307

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 300519
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value               Count   Frequency (%)
1                   17714   48.7%
0                   13370   36.7%
3                   2307    6.3%
(Missing)           3017    8.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value               Count   Frequency (%)
1                   17714   53.1%
0                   13370   40.0%
3                   2307    6.9%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        267128  88.9%
1                   17714   5.9%
0                   13370   4.4%
3                   2307    0.8%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           300519  100.0%

no_of_bulbs_in_the_room
Categorical

Imbalance  Missing 

Distinct: 21
Distinct (%): 0.1%
Missing: 3017
Missing (%): 8.3%
Memory size: 1.6 MiB

Value               Count
1                   20781
2                   5581
0                   2257
3                   1558
4                   1157
Other values (16)   2057

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 300519
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value               Count   Frequency (%)
1                   20781   57.1%
2                   5581    15.3%
0                   2257    6.2%
3                   1558    4.3%
4                   1157    3.2%
5                   557     1.5%
6                   452     1.2%
7                   220     0.6%
8                   197     0.5%
10                  166     0.5%
Other values (11)   465     1.3%
(Missing)           3017    8.3%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
1                   20781   62.2%
2                   5581    16.7%
0                   2257    6.8%
3                   1558    4.7%
4                   1157    3.5%
5                   557     1.7%
6                   452     1.4%
7                   220     0.7%
8                   197     0.6%
10                  166     0.5%
Other values (11)   465     1.4%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        266595  88.7%
1                   21301   7.1%
2                   5699    1.9%
0                   2477    0.8%
3                   1581    0.5%
4                   1189    0.4%
5                   659     0.2%
6                   468     0.2%
7                   223     0.1%
8                   220     0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           300519  100.0%

no_of_bulbs_used_during_last_week
Categorical

Imbalance  Missing 

Distinct: 12
Distinct (%): 0.1%
Missing: 26603
Missing (%): 73.1%
Memory size: 1.6 MiB

Value               Count
1                   7458
0                   1291
2                   726
3                   200
4                   83
Other values (7)    47

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 107855
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value               Count   Frequency (%)
1                   7458    20.5%
0                   1291    3.5%
2                   726     2.0%
3                   200     0.5%
4                   83      0.2%
6                   19      0.1%
5                   16      < 0.1%
7                   6       < 0.1%
8                   3       < 0.1%
11                  1       < 0.1%
Other values (2)    2       < 0.1%
(Missing)           26603   73.1%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
1                   7458    76.1%
0                   1291    13.2%
2                   726     7.4%
3                   200     2.0%
4                   83      0.8%
6                   19      0.2%
5                   16      0.2%
7                   6       0.1%
8                   3       < 0.1%
11                  1       < 0.1%
Other values (2)    2       < 0.1%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        98048   90.9%
1                   7461    6.9%
0                   1292    1.2%
2                   726     0.7%
3                   200     0.2%
4                   83      0.1%
6                   19      < 0.1%
5                   16      < 0.1%
7                   6       < 0.1%
8                   3       < 0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           107855  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           107855  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           107855  100.0%

no_of_fans_in_the_room
Categorical

Imbalance  Missing 

Distinct: 11
Distinct (%): < 0.1%
Missing: 3017
Missing (%): 8.3%
Memory size: 1.6 MiB

Value               Count
0                   21657
1                   10512
2                   873
3                   200
4                   101
Other values (6)    48

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 300519
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value               Count   Frequency (%)
0                   21657   59.5%
1                   10512   28.9%
2                   873     2.4%
3                   200     0.5%
4                   101     0.3%
5                   41      0.1%
9                   3       < 0.1%
8                   1       < 0.1%
6                   1       < 0.1%
10                  1       < 0.1%
(Missing)           3017    8.3%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
0                   21657   64.9%
1                   10512   31.5%
2                   873     2.6%
3                   200     0.6%
4                   101     0.3%
5                   41      0.1%
9                   3       < 0.1%
8                   1       < 0.1%
6                   1       < 0.1%
10                  1       < 0.1%

Most occurring characters

Value               Count   Frequency (%)
(whitespace)        267127  88.9%
0                   21658   7.2%
1                   10513   3.5%
2                   873     0.3%
3                   200     0.1%
4                   101     < 0.1%
5                   41      < 0.1%
9                   3       < 0.1%
8                   1       < 0.1%
6                   1       < 0.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           300519  100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           300519  100.0%

no_of_ACs_in_the_room
Categorical

Imbalance  Missing 

Distinct: 5
Distinct (%): < 0.1%
Missing: 3017
Missing (%): 8.3%
Memory size: 1.6 MiB

Value  Count
0      32305
1       1029
2         30
3         26
4          1
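
The Imbalance alert for this variable (90.6% in the report's alert list) can be reproduced from the five counts above. The sketch below assumes the score is one minus the Shannon entropy of the value distribution, divided by the log of the number of distinct values. The function name `imbalance_score` is illustrative and is not taken from the profiling library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform distribution, close to 1 when one value dominates.

    Assumption: this normalized-entropy definition is what the report's
    Imbalance percentage is based on.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    # Shannon entropy of the observed (non-missing) value distribution.
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Non-missing counts for values 0, 1, 2, 3, 4 of no_of_ACs_in_the_room.
ac_counts = [32305, 1029, 30, 26, 1]
score = imbalance_score(ac_counts)
print(f"Imbalance: {score:.1%}")
```

Under that assumption the result agrees with the alert figure to one decimal place, which suggests missing values are excluded before the score is computed.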

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 267128
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%
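
Distinct and Unique measure different things: Distinct is the number of different values, while Unique appears to count the values that occur exactly once. A small sketch using this variable's counts, with the variable names chosen here for illustration:

```python
# Counts of each non-missing value of no_of_ACs_in_the_room, as listed in the report.
value_counts = {"0": 32305, "1": 1029, "2": 30, "3": 26, "4": 1}

# Distinct: how many different values appear at all.
n_distinct = len(value_counts)

# Unique: how many values appear exactly once (assumed definition).
n_unique = sum(1 for count in value_counts.values() if count == 1)

print(n_distinct, n_unique)
```

Only the value 4 occurs once, which matches the reported Unique figure of 1 alongside 5 distinct values.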

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Common Values

Value      Count  Frequency (%)
0          32305  88.7%
1           1029  2.8%
2             30  0.1%
3             26  0.1%
4              1  < 0.1%
(Missing)   3017  8.3%
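
The percentages in this table use all 36408 observations as the denominator, missing rows included. The same counts are listed again under Common Values (Plot) with 96.7% for the value 0, which corresponds to dividing by the 33391 non-missing rows only. A minimal sketch of both calculations, where `frequency_table` is a hypothetical helper rather than part of the profiling library:

```python
def frequency_table(counts, n_missing=0, include_missing=True):
    """Return {value: percentage}, optionally counting missing rows in the denominator."""
    total = sum(counts.values()) + (n_missing if include_missing else 0)
    table = {value: 100.0 * count / total for value, count in counts.items()}
    if include_missing and n_missing:
        table["(Missing)"] = 100.0 * n_missing / total
    return table

ac_counts = {"0": 32305, "1": 1029, "2": 30, "3": 26, "4": 1}

# Denominator = all observations (matches the Common Values table).
with_missing = frequency_table(ac_counts, n_missing=3017)

# Denominator = non-missing observations only (matches the plot table).
without_missing = frequency_table(ac_counts, n_missing=3017, include_missing=False)

for value in ac_counts:
    print(value, f"{with_missing[value]:.1f}%", f"{without_missing[value]:.1f}%")
```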

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value  Count  Frequency (%)
0      32305  96.7%
1       1029  3.1%
2         30  0.1%
3         26  0.1%
4          1  < 0.1%

Most occurring characters

Value     Count  Frequency (%)
(blank)  233737  87.5%
0         32305  12.1%
1          1029  0.4%
2            30  < 0.1%
3            26  < 0.1%
4             1  < 0.1%
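
The character totals are consistent with every non-missing value being stored as one digit followed by seven blank padding characters, which would also explain the constant length of 8. This is an inference from the arithmetic, not something the report states. The check below uses only figures from the report:

```python
# Figures copied from the report for no_of_ACs_in_the_room.
digit_counts = {"0": 32305, "1": 1029, "2": 30, "3": 26, "4": 1}
blank_count = 233737
total_characters = 267128
value_length = 8

# One digit per non-missing value, so the digit total equals the number of values.
n_values = sum(digit_counts.values())

# If each value is padded to 8 characters, these two products should
# reproduce the reported totals.
expected_total = n_values * value_length
expected_blanks = n_values * (value_length - 1)

print(n_values, expected_total == total_characters, expected_blanks == blank_count)
```

If the padding reading is correct, stripping whitespace from the column before profiling would reduce the character statistics to the digits alone.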

Most occurring categories

Value       Count  Frequency (%)
(unknown)  267128  100.0%

Most frequent character per category

(unknown)
Value     Count  Frequency (%)
(blank)  233737  87.5%
0         32305  12.1%
1          1029  0.4%
2            30  < 0.1%
3            26  < 0.1%
4             1  < 0.1%

Most occurring scripts

Value       Count  Frequency (%)
(unknown)  267128  100.0%

Most frequent character per script

(unknown)
Value     Count  Frequency (%)
(blank)  233737  87.5%
0         32305  12.1%
1          1029  0.4%
2            30  < 0.1%
3            26  < 0.1%
4             1  < 0.1%

Most occurring blocks

Value       Count  Frequency (%)
(unknown)  267128  100.0%

Most frequent character per block

(unknown)
Value     Count  Frequency (%)
(blank)  233737  87.5%
0         32305  12.1%
1          1029  0.4%
2            30  < 0.1%
3            26  < 0.1%
4             1  < 0.1%

Correlations

Variables, numbered in the order used for the rows and columns of the matrix:
 1  have_curtains_or_blinds_for_windows
 2  have_other_ventelation_holes
 3  main_material_used_for_the_floor_of_the_room
 4  main_material_used_for_the_roof_of_the_room
 5  main_material_used_for_window_panes
 6  main_purpose_of_the_room
 7  no_of_ACs_in_the_room
 8  no_of_bulbs_in_the_room
 9  no_of_bulbs_used_during_last_week
10  no_of_doors_opened_to_external_environment
11  no_of_fans_in_the_room
12  no_of_windows
13  room_ID
14  storey_which_the_room_located
15  type_of_ceiling_of_the_room

        1      2      3      4      5      6      7      8      9     10     11     12     13     14     15
 1  1.000  0.363  0.179  0.210  0.295  0.327  0.101  0.151  0.100  0.087  0.310  0.252  0.259  0.011  0.151
 2  0.363  1.000  0.558  0.543  0.296  0.462  0.039  0.239  0.115  0.162  0.203  0.352  0.277  0.074  0.549
 3  0.179  0.558  1.000  0.357  0.178  0.223  0.082  0.149  0.060  0.067  0.073  0.117  0.126  0.065  0.372
 4  0.210  0.543  0.357  1.000  0.179  0.235  0.027  0.132  0.051  0.063  0.070  0.118  0.118  0.065  0.455
 5  0.295  0.296  0.178  0.179  1.000  0.136  0.015  0.058  0.000  0.018  0.051  0.062  0.030  0.022  0.127
 6  0.327  0.462  0.223  0.235  0.136  1.000  0.093  0.127  0.123  0.089  0.202  0.187  0.443  0.056  0.258
 7  0.101  0.039  0.082  0.027  0.015  0.093  1.000  0.474  0.194  0.242  0.226  0.107  0.070  0.067  0.069
 8  0.151  0.239  0.149  0.132  0.058  0.127  0.474  1.000  0.330  0.240  0.260  0.135  0.102  0.041  0.163
 9  0.100  0.115  0.060  0.051  0.000  0.123  0.194  0.330  1.000  0.124  0.218  0.119  0.121  0.035  0.082
10  0.087  0.162  0.067  0.063  0.018  0.089  0.242  0.240  0.124  1.000  0.230  0.108  0.077  0.000  0.083
11  0.310  0.203  0.073  0.070  0.051  0.202  0.226  0.260  0.218  0.230  1.000  0.189  0.180  0.027  0.093
12  0.252  0.352  0.117  0.118  0.062  0.187  0.107  0.135  0.119  0.108  0.189  1.000  0.134  0.026  0.133
13  0.259  0.277  0.126  0.118  0.030  0.443  0.070  0.102  0.121  0.077  0.180  0.134  1.000  0.048  0.134
14  0.011  0.074  0.065  0.065  0.022  0.056  0.067  0.041  0.035  0.000  0.027  0.026  0.048  1.000  0.080
15  0.151  0.549  0.372  0.455  0.127  0.258  0.069  0.163  0.082  0.083  0.093  0.133  0.134  0.080  1.000
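
The High correlation alerts at the top of the report can be traced back to this matrix. The sketch below takes the coefficients for four of the variables and keeps the pairs at or above a cut-off of 0.5. The cut-off is an assumption made for illustration; it happens to select exactly the three pairs named in the alerts.

```python
# Pairwise coefficients copied from the correlation matrix (upper triangle only).
corr = {
    ("have_other_ventelation_holes", "main_material_used_for_the_floor_of_the_room"): 0.558,
    ("have_other_ventelation_holes", "main_material_used_for_the_roof_of_the_room"): 0.543,
    ("have_other_ventelation_holes", "type_of_ceiling_of_the_room"): 0.549,
    ("main_material_used_for_the_floor_of_the_room", "main_material_used_for_the_roof_of_the_room"): 0.357,
    ("main_material_used_for_the_floor_of_the_room", "type_of_ceiling_of_the_room"): 0.372,
    ("main_material_used_for_the_roof_of_the_room", "type_of_ceiling_of_the_room"): 0.455,
}

THRESHOLD = 0.5  # assumed cut-off for a "high" correlation

# Keep the pairs at or above the cut-off, strongest first.
high_pairs = sorted(
    ((a, b, value) for (a, b), value in corr.items() if value >= THRESHOLD),
    key=lambda item: item[2],
    reverse=True,
)

for a, b, value in high_pairs:
    print(f"{a} <-> {b}: {value:.3f}")
```

All three selected pairs involve have_other_ventelation_holes; the next strongest coefficient among these four variables (0.455, roof material against ceiling type) falls below the assumed cut-off.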

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]
[Figure: Correlation heatmap of nullity, showing how strongly the presence or absence of one variable affects the presence of another.]
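
Nullity correlation can be sketched as the Pearson correlation between missing-value indicators of two columns. The rows below are hypothetical and only illustrate the calculation; the helper names `pearson` and `nullity_correlation` are also introduced here and do not come from the report.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else float("nan")

def nullity_correlation(rows, col_a, col_b):
    """Correlate the 'is missing' indicators (1.0 = missing) of two columns."""
    missing_a = [1.0 if row[col_a] is None else 0.0 for row in rows]
    missing_b = [1.0 if row[col_b] is None else 0.0 for row in rows]
    return pearson(missing_a, missing_b)

# Hypothetical rows: panes and curtains are always missing together,
# while bulbs-used is missing independently of them.
rows = [
    {"main_material_used_for_window_panes": "3", "have_curtains_or_blinds_for_windows": "1", "no_of_bulbs_used_during_last_week": None},
    {"main_material_used_for_window_panes": None, "have_curtains_or_blinds_for_windows": None, "no_of_bulbs_used_during_last_week": "1"},
    {"main_material_used_for_window_panes": None, "have_curtains_or_blinds_for_windows": None, "no_of_bulbs_used_during_last_week": None},
    {"main_material_used_for_window_panes": "3", "have_curtains_or_blinds_for_windows": "0", "no_of_bulbs_used_during_last_week": "1"},
]

together = nullity_correlation(rows, "main_material_used_for_window_panes", "have_curtains_or_blinds_for_windows")
independent = nullity_correlation(rows, "main_material_used_for_window_panes", "no_of_bulbs_used_during_last_week")
print(together, independent)
```

With a pandas DataFrame `df`, the equivalent matrix for all columns would be `df.isna().corr()`.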

Sample

Columns, in order: household_ID, room_ID, main_purpose_of_the_room, storey_which_the_room_located, main_material_used_for_the_roof_of_the_room, type_of_ceiling_of_the_room, main_material_used_for_the_floor_of_the_room, no_of_doors_opened_to_external_environment, no_of_windows, main_material_used_for_window_panes, have_curtains_or_blinds_for_windows, have_other_ventelation_holes, no_of_bulbs_in_the_room, no_of_bulbs_used_during_last_week, no_of_fans_in_the_room, no_of_ACs_in_the_room

ID0001, I_1, 1, 0, 2, 3, 3, 1, 2, 3, 1, 0, 4, None, 1, 0
ID0001, I_2, 2, 0, 2, 3, 3, 1, 1, 3, 1, 0, 1, None, 1, 0
ID0001, I_3, 2, 0, 2, 3, 3, 1, 1, 3, 1, 0, 1, None, 1, 0
ID0001, I_4, 2, 1, 2, 3, 1, 1, 1, 3, 1, 0, 1, None, 1, 0
ID0001, I_5, 2, 1, 2, 3, 1, 1, 1, 3, 1, 0, 1, None, 1, 0
ID0001, I_6, 3, 1, 2, 3, 3, 1, 1, 3, 1, 1, 1, None, 0, 0
ID0001, I_7, 4, 0, 2, 3, 3, 1, 0, None, None, 1, 1, None, 0, 0
ID0001, I_8, 6, 0, 2, 6, 9, 0, 0, None, None, 0, 0, None, 0, 0
ID0001, I_9, 7, 0, 10, 9, 11, 0, 0, None, None, 3, 0, None, 0, 0
ID0001, I_10, 9, 0, 10, 9, 11, 0, 0, None, None, 3, 0, None, 0, 0

Columns, in order: household_ID, room_ID, main_purpose_of_the_room, storey_which_the_room_located, main_material_used_for_the_roof_of_the_room, type_of_ceiling_of_the_room, main_material_used_for_the_floor_of_the_room, no_of_doors_opened_to_external_environment, no_of_windows, main_material_used_for_window_panes, have_curtains_or_blinds_for_windows, have_other_ventelation_holes, no_of_bulbs_in_the_room, no_of_bulbs_used_during_last_week, no_of_fans_in_the_room, no_of_ACs_in_the_room

ID4063, I_1, 1, 0, 3, 6, 9, 1, 6, None, 1, 0, 2, 1, 1, 0
ID4063, I_2, 1, 0, 3, 6, 9, 1, 3, None, 1, 0, 1, 1, 1, 0
ID4063, I_3, 2, 0, 3, 6, 9, 1, 3, None, 1, 0, 1, 1, 1, 0
ID4063, I_4, 2, 1, 2, 1, 9, 1, 2, None, 0, 0, 0, None, 0, 0
ID4063, I_5, 2, 1, 2, 1, 9, 1, 2, None, 0, 0, 0, None, 0, 0
ID4063, I_6, 2, 1, 2, 1, 9, 2, 4, None, 0, 0, 1, 0, 0, 0
ID4063, I_7, 3, 0, 3, 6, 9, 1, 3, None, 1, 0, 1, 1, 0, 0
ID4063, I_8, 9, 0, 3, 6, 9, 1, 3, None, 0, 0, 1, 1, 0, 0
ID4063, I_9, 10, None, None, None, None, None, None, None, None, None, None, None, None, None
ID4063, I_10, 10, None, None, None, None, None, None, None, None, None, None, None, None, None

Duplicate rows

Most frequently occurring

Columns, in order: (index), room_ID, main_purpose_of_the_room, storey_which_the_room_located, main_material_used_for_the_roof_of_the_room, type_of_ceiling_of_the_room, main_material_used_for_the_floor_of_the_room, no_of_doors_opened_to_external_environment, no_of_windows, main_material_used_for_window_panes, have_curtains_or_blinds_for_windows, have_other_ventelation_holes, no_of_bulbs_in_the_room, no_of_bulbs_used_during_last_week, no_of_fans_in_the_room, no_of_ACs_in_the_room, # duplicates

403, I_10, 10, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 1321
3414, I_9, 10, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 985
3290, I_8, 10, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 537
3100, I_7, 10, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 162
3270, I_8, 9, 0, 10, 9, 11, 0, 0, NaN, NaN, 3, 0, NaN, 0, 0, 73
391, I_10, 9, 0, 10, 9, 11, 0, 0, NaN, NaN, 3, 0, NaN, 0, 0, 60
3085, I_7, 9, 0, 10, 9, 11, 0, 0, NaN, NaN, 3, 0, NaN, 0, 0, 51
2846, I_6, 9, 0, 10, 9, 11, 0, 0, NaN, NaN, 3, 0, NaN, 0, 0, 50
3397, I_9, 9, 0, 10, 9, 11, 0, 0, NaN, NaN, 3, 0, NaN, 0, 0, 48
1893, I_4, 3, 0, 2, 1, 1, 1, 0, NaN, NaN, 0, 1, NaN, 0, 0, 42
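
A duplicate count of this kind can be sketched by treating each row as a tuple and counting identical tuples, with missing values compared as equal to each other. The rows below are a hypothetical three-column excerpt, and `duplicate_groups` is an illustrative helper rather than the profiler's own routine.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return (row, count) pairs for rows that occur more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda item: item[1], reverse=True)

# Hypothetical excerpt: (room_ID, main_purpose_of_the_room, storey_which_the_room_located).
# None stands in for a missing value so that missing cells compare equal.
rows = [
    ("I_10", "10", None),
    ("I_9", "10", None),
    ("I_10", "10", None),
    ("I_1", "1", "0"),
    ("I_10", "10", None),
    ("I_9", "10", None),
]

for row, n in duplicate_groups(rows):
    print(row, n)
```

In the table above, the most frequent duplicates are rows in which every field except room_ID and main_purpose_of_the_room is missing, so dropping fully empty room records before deduplicating would likely remove most of them. With pandas, a comparable grouping is `df.groupby(list(df.columns), dropna=False).size()`.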