Overview
Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 36,408 |
| Missing cells | 94,148 |
| Missing cells (%) | 16.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 4.4 MiB |
| Average record size in memory | 128.0 B |
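The derived figures in this table follow from the raw counts. As a quick sanity check, the sketch below recomputes them from the values above. The variable names are illustrative, and the reading of 128.0 B as 16 columns of 8 bytes each is an assumption inferred from the numbers, not something the report states.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 36_408
missing_cells = 94_148
avg_record_bytes = 128.0  # assumption: 16 columns x 8 bytes per cell

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size (1 MiB = 2**20 bytes).
total_mib = n_observations * avg_record_bytes / 2**20

print(f"total cells:   {total_cells:,}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size:    {total_mib:.1f} MiB")
```

The recomputed percentage and size agree with the 16.2% and 4.4 MiB shown in the table.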
Variable types
| Text | 1 |
|---|---|
| Categorical | 9 |
| Numeric | 6 |
Alerts
| Alert | Type |
|---|---|
| have_extra_ventilation_openings is highly overall correlated with main_material_used_for_the_floor_of_the_room and 2 other fields | High correlation |
| main_material_used_for_the_floor_of_the_room is highly overall correlated with have_extra_ventilation_openings | High correlation |
| main_material_used_for_the_roof_of_the_room is highly overall correlated with have_extra_ventilation_openings | High correlation |
| no_of_fans_in_the_room is highly overall correlated with no_of_windows | High correlation |
| no_of_windows is highly overall correlated with no_of_fans_in_the_room | High correlation |
| type_of_ceiling_of_the_room is highly overall correlated with have_extra_ventilation_openings | High correlation |
| main_material_used_for_the_floor_of_the_room is highly imbalanced (51.5%) | Imbalance |
| main_material_used_for_window_panes is highly imbalanced (61.2%) | Imbalance |
| no_of_ACs_in_the_room is highly imbalanced (90.6%) | Imbalance |
| storey_which_the_room_located has 3017 (8.3%) missing values | Missing |
| main_material_used_for_the_roof_of_the_room has 3009 (8.3%) missing values | Missing |
| type_of_ceiling_of_the_room has 3017 (8.3%) missing values | Missing |
| main_material_used_for_the_floor_of_the_room has 3017 (8.3%) missing values | Missing |
| no_of_doors_opened_to_external_environment has 3017 (8.3%) missing values | Missing |
| no_of_windows has 3017 (8.3%) missing values | Missing |
| main_material_used_for_window_panes has 22084 (60.7%) missing values | Missing |
| have_curtains_or_blinds_for_windows has 15296 (42.0%) missing values | Missing |
| have_extra_ventilation_openings has 3017 (8.3%) missing values | Missing |
| no_of_bulbs_in_the_room has 3017 (8.3%) missing values | Missing |
| no_of_bulbs_used_during_last_week has 26603 (73.1%) missing values | Missing |
| no_of_fans_in_the_room has 3017 (8.3%) missing values | Missing |
| no_of_ACs_in_the_room has 3017 (8.3%) missing values | Missing |
| no_of_doors_opened_to_external_environment is highly skewed (γ1 = 30.98014855) | Skewed |
| storey_which_the_room_located has 26381 (72.5%) zeros | Zeros |
| no_of_doors_opened_to_external_environment has 19548 (53.7%) zeros | Zeros |
| no_of_windows has 12279 (33.7%) zeros | Zeros |
| no_of_bulbs_in_the_room has 2257 (6.2%) zeros | Zeros |
| no_of_bulbs_used_during_last_week has 1291 (3.5%) zeros | Zeros |
| no_of_fans_in_the_room has 21657 (59.5%) zeros | Zeros |
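The per-variable missing counts can be reconciled with the 94,148 missing cells reported under Dataset statistics. The sketch below sums the counts from the Missing alerts and adds the 3 missing values of main_purpose_of_the_room, which appear in that variable's own summary but not among the alerts. The dictionary is transcribed by hand from this report and its name is illustrative.

```python
# Missing-value counts transcribed from the "Missing" alerts.
missing_by_variable = {
    "storey_which_the_room_located": 3017,
    "main_material_used_for_the_roof_of_the_room": 3009,
    "type_of_ceiling_of_the_room": 3017,
    "main_material_used_for_the_floor_of_the_room": 3017,
    "no_of_doors_opened_to_external_environment": 3017,
    "no_of_windows": 3017,
    "main_material_used_for_window_panes": 22084,
    "have_curtains_or_blinds_for_windows": 15296,
    "have_extra_ventilation_openings": 3017,
    "no_of_bulbs_in_the_room": 3017,
    "no_of_bulbs_used_during_last_week": 26603,
    "no_of_fans_in_the_room": 3017,
    "no_of_ACs_in_the_room": 3017,
}
# Not raised as an alert, but listed in the variable's own summary.
missing_by_variable["main_purpose_of_the_room"] = 3

n_observations = 36_408
total_missing = sum(missing_by_variable.values())
print(f"total missing cells: {total_missing:,}")

# Per-variable share of observations that are missing.
for name, count in missing_by_variable.items():
    print(f"{name}: {100 * count / n_observations:.1f}%")
```

The total comes to 94,148, so the alerts plus the one unflagged variable account for every missing cell in the dataset.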
Reproduction
| Analysis started | 2024-12-06 05:54:36.382353 |
|---|---|
| Analysis finished | 2024-12-06 05:54:41.691799 |
| Duration | 5.31 seconds |
| Software version | ydata-profiling v4.11.0 |
| Download configuration | config.json |
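The Duration row is the difference between the two timestamps. A minimal sketch using only the standard library, assuming both timestamps share the same (unstated) time zone:

```python
from datetime import datetime

# Timestamp layout used in the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-12-06 05:54:36.382353", FMT)
finished = datetime.strptime("2024-12-06 05:54:41.691799", FMT)

# timedelta.total_seconds() keeps the microsecond precision.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```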
Variables
household_ID
Text
| Distinct | 4063 |
|---|---|
| Distinct (%) | 11.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 284.6 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | ID0001 |
|---|---|
| 2nd row | ID0001 |
| 3rd row | ID0001 |
| 4th row | ID0001 |
| 5th row | ID0001 |
Words
| Value | Count | Frequency (%) |
| id0698 | 32 | 0.1% |
| id1165 | 29 | 0.1% |
| id1142 | 24 | 0.1% |
| id2632 | 24 | 0.1% |
| id0255 | 23 | 0.1% |
| id1621 | 23 | 0.1% |
| id1132 | 23 | 0.1% |
| id1676 | 22 | 0.1% |
| id3864 | 22 | 0.1% |
| id1792 | 22 | 0.1% |
| Other values (4053) | 36164 |
Most occurring characters
| Value | Count | Frequency (%) |
| I | 36408 | |
| D | 36408 | |
| 0 | 20584 | |
| 3 | 19966 | |
| 1 | 19852 | |
| 2 | 19503 | |
| 4 | 11487 | 5.3% |
| 6 | 11138 | 5.1% |
| 7 | 10945 | 5.0% |
| 8 | 10826 | 5.0% |
| Other values (2) | 21331 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 145632 | |
| Uppercase Letter | 72816 |
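Because every household_ID is exactly 6 characters long, the category totals above can be derived from the row count alone. The sketch below assumes each ID has the form `ID` followed by four digits, which is inferred from the sample rows and the counts rather than stated by the report.

```python
n_observations = 36_408
id_length = 6   # min, median, mean and max length are all 6
n_letters = 2   # assumption: the "ID" prefix seen in the sample rows
n_digits = id_length - n_letters

total_chars = n_observations * id_length
uppercase_chars = n_observations * n_letters
digit_chars = n_observations * n_digits

# Share of a single character, e.g. the digit "4" with 11,487 occurrences.
share_of_4 = 100 * 11_487 / total_chars

print(total_chars, uppercase_chars, digit_chars)
print(f"{share_of_4:.1f}%")
```

The three totals match the ASCII block count (218448) and the Uppercase Letter and Decimal Number counts in the tables.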
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 20584 | |
| 3 | 19966 | |
| 1 | 19852 | |
| 2 | 19503 | |
| 4 | 11487 | |
| 6 | 11138 | |
| 7 | 10945 | |
| 8 | 10826 | |
| 5 | 10712 | |
| 9 | 10619 |
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 36408 | |
| D | 36408 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 145632 | |
| Latin | 72816 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 20584 | |
| 3 | 19966 | |
| 1 | 19852 | |
| 2 | 19503 | |
| 4 | 11487 | |
| 6 | 11138 | |
| 7 | 10945 | |
| 8 | 10826 | |
| 5 | 10712 | |
| 9 | 10619 |
Latin
| Value | Count | Frequency (%) |
| I | 36408 | |
| D | 36408 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 218448 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| I | 36408 | |
| D | 36408 | |
| 0 | 20584 | |
| 3 | 19966 | |
| 1 | 19852 | |
| 2 | 19503 | |
| 4 | 11487 | 5.3% |
| 6 | 11138 | 5.1% |
| 7 | 10945 | 5.0% |
| 8 | 10826 | 5.0% |
| Other values (2) | 21331 |
room_ID
Categorical
| Distinct | 32 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 284.6 KiB |
| I1 | |
|---|---|
| I2 | |
| I3 | |
| I4 | |
| I5 | |
| Other values (27) |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 7.986102 |
| Min length | 7 |
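A mean length just under 8 with a minimum of 7 means a small number of values are one character shorter than the rest. Since lengths are whole numbers and only 7 and 8 are possible, the number of short values can be recovered from the mean. The displayed values such as `I1` are far shorter than 7 characters, which suggests, though the report does not say so, that the raw values are padded with spaces. The sketch below covers only the arithmetic.

```python
n_observations = 36_408
mean_length = 7.986102
min_length, max_length = 7, 8

# With only two possible lengths, the mean fixes how many values are short:
#   mean = (max * n - n_short) / n   =>   n_short = n * (max - mean)
n_short = round(n_observations * (max_length - mean_length))

# Total characters across the column.
total_chars = n_observations * max_length - n_short

print(n_short, total_chars)
```

The result of 506 short values matches the count of the characters O, t and h reported further down, and the total matches the ASCII block count of 290758.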
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | I1 |
|---|---|
| 2nd row | I2 |
| 3rd row | I3 |
| 4th row | I4 |
| 5th row | I5 |
Common Values
| Value | Count | Frequency (%) |
| I1 | 4063 | |
| I2 | 4060 | |
| I3 | 4044 | |
| I4 | 3967 | |
| I5 | 3711 | |
| I6 | 3243 | |
| I7 | 2724 | |
| I8 | 2488 | |
| I9 | 2382 | |
| I10 | 2343 | |
| Other values (22) | 3383 |
Words
| Value | Count | Frequency (%) |
| i1 | 4063 | |
| i2 | 4060 | |
| i3 | 4044 | |
| i4 | 3967 | |
| i5 | 3711 | |
| i6 | 3243 | |
| i7 | 2724 | |
| i8 | 2488 | |
| i9 | 2382 | |
| i10 | 2343 | |
| Other values (22) | 3383 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 211201 | |
| I | 36408 | 12.5% |
| 1 | 10389 | 3.6% |
| 2 | 4735 | 1.6% |
| 3 | 4492 | 1.5% |
| 4 | 4307 | 1.5% |
| 5 | 3970 | 1.4% |
| 6 | 3432 | 1.2% |
| 7 | 2868 | 1.0% |
| 8 | 2593 | 0.9% |
| Other values (5) | 6363 | 2.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Space Separator | 211201 | |
| Decimal Number | 41631 | 14.3% |
| Uppercase Letter | 36914 | 12.7% |
| Lowercase Letter | 1012 | 0.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 10389 | |
| 2 | 4735 | |
| 3 | 4492 | |
| 4 | 4307 | |
| 5 | 3970 | 9.5% |
| 6 | 3432 | 8.2% |
| 7 | 2868 | 6.9% |
| 8 | 2593 | 6.2% |
| 9 | 2455 | 5.9% |
| 0 | 2390 | 5.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 36408 | |
| O | 506 | 1.4% |
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 506 | |
| h | 506 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 211201 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 252832 | |
| Latin | 37926 | 13.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| (space) | 211201 | |
| 1 | 10389 | 4.1% |
| 2 | 4735 | 1.9% |
| 3 | 4492 | 1.8% |
| 4 | 4307 | 1.7% |
| 5 | 3970 | 1.6% |
| 6 | 3432 | 1.4% |
| 7 | 2868 | 1.1% |
| 8 | 2593 | 1.0% |
| 9 | 2455 | 1.0% |
Latin
| Value | Count | Frequency (%) |
| I | 36408 | |
| O | 506 | 1.3% |
| t | 506 | 1.3% |
| h | 506 | 1.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 290758 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 211201 | |
| I | 36408 | 12.5% |
| 1 | 10389 | 3.6% |
| 2 | 4735 | 1.6% |
| 3 | 4492 | 1.5% |
| 4 | 4307 | 1.5% |
| 5 | 3970 | 1.4% |
| 6 | 3432 | 1.2% |
| 7 | 2868 | 1.0% |
| 8 | 2593 | 0.9% |
| Other values (5) | 6363 | 2.2% |
main_purpose_of_the_room
Categorical
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3 |
| Missing (%) | < 0.1% |
| Memory size | 284.6 KiB |
| Bedrooms | |
|---|---|
| Bathroom and / or toilets | |
| Kitchen and/ or pantry | |
| Living room | |
| Gaming room | |
| Other values (13) |
Length
| Max length | 29 |
|---|---|
| Median length | 25 |
| Mean length | 13.412663 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Living room |
|---|---|
| 2nd row | Bedrooms |
| 3rd row | Bedrooms |
| 4th row | Bedrooms |
| 5th row | Bedrooms |
Common Values
| Value | Count | Frequency (%) |
| Bedrooms | 10380 | |
| Bathroom and / or toilets | 5767 | |
| Kitchen and/ or pantry | 5159 | |
| Living room | 4431 | |
| Gaming room | 3776 | 10.4% |
| Veranda | 3405 | 9.4% |
| Passage | 1825 | 5.0% |
| Servant's Room | 736 | 2.0% |
| Storage room | 609 | 1.7% |
| Other | 81 | 0.2% |
| Other values (8) | 236 | 0.6% |
Words
| Value | Count | Frequency (%) |
| and | 10926 | |
| or | 10926 | |
| bedrooms | 10380 | |
| room | 9616 | |
| (empty) | 5767 | |
| bathroom | 5767 | |
| toilets | 5767 | |
| kitchen | 5159 | |
| pantry | 5159 | |
| living | 4431 | |
| Other values (18) | 10773 |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 68996 | |
| (space) | 48266 | |
| r | 46164 | 9.5% |
| a | 37657 | 7.7% |
| n | 33682 | 6.9% |
| m | 29563 | 6.1% |
| t | 29224 | 6.0% |
| e | 28198 | 5.8% |
| d | 24806 | 5.1% |
| i | 23739 | 4.9% |
| Other values (21) | 117993 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 391155 | |
| Space Separator | 48266 | 9.9% |
| Uppercase Letter | 37152 | 7.6% |
| Other Punctuation | 11715 | 2.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 68996 | |
| r | 46164 | |
| a | 37657 | |
| n | 33682 | |
| m | 29563 | |
| t | 29224 | |
| e | 28198 | |
| d | 24806 | 6.3% |
| i | 23739 | 6.1% |
| s | 20745 | 5.3% |
| Other values (9) | 48381 |
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 16175 | |
| K | 5159 | 13.9% |
| L | 4431 | 11.9% |
| G | 3853 | 10.4% |
| V | 3449 | 9.3% |
| P | 1825 | 4.9% |
| S | 1408 | 3.8% |
| R | 738 | 2.0% |
| O | 114 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 10935 | |
| ' | 780 | 6.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 48266 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 428307 | |
| Common | 59981 | 12.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 68996 | |
| r | 46164 | |
| a | 37657 | |
| n | 33682 | 7.9% |
| m | 29563 | 6.9% |
| t | 29224 | 6.8% |
| e | 28198 | 6.6% |
| d | 24806 | 5.8% |
| i | 23739 | 5.5% |
| s | 20745 | 4.8% |
| Other values (18) | 85533 |
Common
| Value | Count | Frequency (%) |
| (space) | 48266 | |
| / | 10935 | 18.2% |
| ' | 780 | 1.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 488288 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 68996 | |
| (space) | 48266 | |
| r | 46164 | 9.5% |
| a | 37657 | 7.7% |
| n | 33682 | 6.9% |
| m | 29563 | 6.1% |
| t | 29224 | 6.0% |
| e | 28198 | 5.8% |
| d | 24806 | 5.1% |
| i | 23739 | 4.9% |
| Other values (21) | 117993 |
storey_which_the_room_located
Real number (ℝ)
Missing  Zeros 
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3017 |
| Missing (%) | 8.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.270522 |
| Minimum | 0 |
| Maximum | 18 |
| Zeros | 26381 |
| Zeros (%) | 72.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 18 |
| Range | 18 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.71070568 |
|---|---|
| Coefficient of variation (CV) | 2.6271641 |
| Kurtosis | 75.218635 |
| Mean | 0.270522 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.6014703 |
| Sum | 9033 |
| Variance | 0.50510257 |
| Monotonicity | Not monotonic |
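Several of the statistics above are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean, coefficient of variation, variance and zero share from the reported sum, standard deviation and counts. It assumes the mean is taken over non-missing observations only, an assumption that the numbers bear out.

```python
n_observations = 36_408
n_missing = 3_017
n_zeros = 26_381
total = 9_033        # "Sum"
std = 0.71070568     # "Standard deviation"

# Assumption: missing values are excluded from the denominator.
n_valid = n_observations - n_missing

mean = total / n_valid
cv = std / mean                 # coefficient of variation
variance = std ** 2
# The zero share is reported relative to all observations, missing included.
zeros_pct = 100 * n_zeros / n_observations

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.8f} zeros={zeros_pct:.1f}%")
```

Note the two different denominators: the mean uses the 33,391 non-missing values, while the Zeros (%) figure of 72.5% uses all 36,408 observations.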
Common Values
| Value | Count | Frequency (%) |
| 0 | 26381 | |
| 1 | 5985 | 16.4% |
| 2 | 702 | 1.9% |
| 3 | 152 | 0.4% |
| 8 | 33 | 0.1% |
| 4 | 30 | 0.1% |
| 7 | 29 | 0.1% |
| 5 | 24 | 0.1% |
| 6 | 20 | 0.1% |
| 10 | 12 | < 0.1% |
| Other values (4) | 23 | 0.1% |
| (Missing) | 3017 | 8.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 26381 | |
| 1 | 5985 | 16.4% |
| 2 | 702 | 1.9% |
| 3 | 152 | 0.4% |
| 4 | 30 | 0.1% |
| 5 | 24 | 0.1% |
| 6 | 20 | 0.1% |
| 7 | 29 | 0.1% |
| 8 | 33 | 0.1% |
| 9 | 11 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 18 | 1 | < 0.1% |
| 12 | 3 | < 0.1% |
| 11 | 8 | < 0.1% |
| 10 | 12 | < 0.1% |
| 9 | 11 | < 0.1% |
| 8 | 33 | |
| 7 | 29 | |
| 6 | 20 | |
| 5 | 24 | |
| 4 | 30 |
main_material_used_for_the_roof_of_the_room
Categorical
High correlation  Missing 
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3009 |
| Missing (%) | 8.3% |
| Memory size | 284.6 KiB |
| Asbestos | |
|---|---|
| Concrete | |
| Tile | |
| Garden - Not relevant | |
| Takaran | 287 |
| Other values (5) | 498 |
Length
| Max length | 21 |
|---|---|
| Median length | 8 |
| Mean length | 8.1620108 |
| Min length | 4 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Asbestos |
|---|---|
| 2nd row | Asbestos |
| 3rd row | Asbestos |
| 4th row | Asbestos |
| 5th row | Asbestos |
Common Values
| Value | Count | Frequency (%) |
| Asbestos | 14980 | |
| Concrete | 11566 | |
| Tile | 4329 | 11.9% |
| Garden - Not relevant | 1739 | 4.8% |
| Takaran | 287 | 0.8% |
| Metal Sheet | 226 | 0.6% |
| Other | 211 | 0.6% |
| Plastic sheets | 59 | 0.2% |
| Tent | 1 | < 0.1% |
| Cadjun/Palmyra/Straw | 1 | < 0.1% |
| (Missing) | 3009 | 8.3% |
Words
| Value | Count | Frequency (%) |
| asbestos | 14980 | |
| concrete | 11566 | |
| tile | 4329 | 11.1% |
| garden | 1739 | 4.5% |
| (empty) | 1739 | 4.5% |
| not | 1739 | 4.5% |
| relevant | 1739 | 4.5% |
| takaran | 287 | 0.7% |
| metal | 226 | 0.6% |
| sheet | 226 | 0.6% |
| Other values (5) | 331 | 0.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 48666 | |
| s | 45117 | |
| t | 30807 | |
| o | 28285 | |
| r | 15544 | 5.7% |
| n | 15333 | 5.6% |
| b | 14980 | 5.5% |
| A | 14980 | 5.5% |
| c | 11625 | 4.3% |
| C | 11567 | 4.2% |
| Other values (22) | 35699 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 229994 | |
| Uppercase Letter | 35366 | 13.0% |
| Space Separator | 5502 | 2.0% |
| Dash Punctuation | 1739 | 0.6% |
| Other Punctuation | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 48666 | |
| s | 45117 | |
| t | 30807 | |
| o | 28285 | |
| r | 15544 | 6.8% |
| n | 15333 | 6.7% |
| b | 14980 | 6.5% |
| c | 11625 | 5.1% |
| l | 6354 | 2.8% |
| a | 4628 | 2.0% |
| Other values (10) | 8655 | 3.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 14980 | |
| C | 11567 | |
| T | 4617 | 13.1% |
| G | 1739 | 4.9% |
| N | 1739 | 4.9% |
| S | 227 | 0.6% |
| M | 226 | 0.6% |
| O | 211 | 0.6% |
| P | 60 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5502 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1739 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 265360 | |
| Common | 7243 | 2.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 48666 | |
| s | 45117 | |
| t | 30807 | |
| o | 28285 | |
| r | 15544 | 5.9% |
| n | 15333 | 5.8% |
| b | 14980 | 5.6% |
| A | 14980 | 5.6% |
| c | 11625 | 4.4% |
| C | 11567 | 4.4% |
| Other values (19) | 28456 |
Common
| Value | Count | Frequency (%) |
| (space) | 5502 | |
| - | 1739 | 24.0% |
| / | 2 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 272603 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 48666 | |
| s | 45117 | |
| t | 30807 | |
| o | 28285 | |
| r | 15544 | 5.7% |
| n | 15333 | 5.6% |
| b | 14980 | 5.5% |
| A | 14980 | 5.5% |
| c | 11625 | 4.3% |
| C | 11567 | 4.2% |
| Other values (22) | 35699 |
type_of_ceiling_of_the_room
Categorical
High correlation  Missing 
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3017 |
| Missing (%) | 8.3% |
| Memory size | 284.6 KiB |
| No ceiling, the concrete slab | |
|---|---|
| No ceiling, just the roof above | |
| A conventional ceiling | |
| Wooden ceiling | |
| Garden - Not relevant | |
| Other values (4) |
Length
| Max length | 31 |
|---|---|
| Median length | 30 |
| Mean length | 25.608487 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | A beamed ceiling |
|---|---|
| 2nd row | A beamed ceiling |
| 3rd row | A beamed ceiling |
| 4th row | A beamed ceiling |
| 5th row | A beamed ceiling |
Common Values
| Value | Count | Frequency (%) |
| No ceiling, the concrete slab | 10521 | |
| No ceiling, just the roof above | 9847 | |
| A conventional ceiling | 5960 | |
| Wooden ceiling | 1847 | 5.1% |
| Garden - Not relevant | 1843 | 5.1% |
| A hanging ceiling | 1320 | 3.6% |
| A beamed ceiling | 1281 | 3.5% |
| Other | 682 | 1.9% |
| A polythene cover as a ceiling | 90 | 0.2% |
| (Missing) | 3017 | 8.3% |
Words
| Value | Count | Frequency (%) |
| ceiling | 30866 | |
| no | 20368 | |
| the | 20368 | |
| concrete | 10521 | 7.0% |
| slab | 10521 | 7.0% |
| just | 9847 | 6.6% |
| roof | 9847 | 6.6% |
| above | 9847 | 6.6% |
| a | 8741 | 5.8% |
| conventional | 5960 | 4.0% |
| Other values (11) | 12772 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 116267 | |
| e | 98973 | |
| o | 78067 | 9.1% |
| i | 69012 | 8.1% |
| n | 67530 | 7.9% |
| c | 57958 | 6.8% |
| t | 51154 | 6.0% |
| l | 49280 | 5.8% |
| g | 33506 | 3.9% |
| a | 32795 | 3.8% |
| Other values (19) | 200551 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 681381 | |
| Space Separator | 116267 | 13.6% |
| Uppercase Letter | 35234 | 4.1% |
| Other Punctuation | 20368 | 2.4% |
| Dash Punctuation | 1843 | 0.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 98973 | |
| o | 78067 | |
| i | 69012 | |
| n | 67530 | |
| c | 57958 | |
| t | 51154 | |
| l | 49280 | |
| g | 33506 | 4.9% |
| a | 32795 | 4.8% |
| r | 24826 | 3.6% |
| Other values (11) | 118280 |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 22211 | |
| A | 8651 | 24.6% |
| W | 1847 | 5.2% |
| G | 1843 | 5.2% |
| O | 682 | 1.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 116267 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 20368 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1843 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 716615 | |
| Common | 138478 | 16.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 98973 | |
| o | 78067 | |
| i | 69012 | |
| n | 67530 | |
| c | 57958 | 8.1% |
| t | 51154 | 7.1% |
| l | 49280 | 6.9% |
| g | 33506 | 4.7% |
| a | 32795 | 4.6% |
| r | 24826 | 3.5% |
| Other values (16) | 153514 |
Common
| Value | Count | Frequency (%) |
| (space) | 116267 | |
| , | 20368 | 14.7% |
| - | 1843 | 1.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 855093 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 116267 | |
| e | 98973 | |
| o | 78067 | 9.1% |
| i | 69012 | 8.1% |
| n | 67530 | 7.9% |
| c | 57958 | 6.8% |
| t | 51154 | 6.0% |
| l | 49280 | 5.8% |
| g | 33506 | 3.9% |
| a | 32795 | 3.8% |
| Other values (19) | 200551 |
main_material_used_for_the_floor_of_the_room
Categorical
High correlation  Imbalance  Missing 
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3017 |
| Missing (%) | 8.3% |
| Memory size | 284.6 KiB |
| Tile | |
|---|---|
| Cement | |
| Garden - Not relevant | |
| Concrete | 1263 |
| Teraso | 398 |
| Other values (6) | 691 |
Length
| Max length | 21 |
|---|---|
| Median length | 15 |
| Mean length | 5.8957803 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Tile |
|---|---|
| 2nd row | Tile |
| 3rd row | Tile |
| 4th row | Cement |
| 5th row | Cement |
Common Values
| Value | Count | Frequency (%) |
| Tile | 16272 | |
| Cement | 12961 | |
| Garden - Not relevant | 1806 | 5.0% |
| Concrete | 1263 | 3.5% |
| Teraso | 398 | 1.1% |
| Other | 306 | 0.8% |
| Wood | 137 | 0.4% |
| Granite | 73 | 0.2% |
| Sand | 72 | 0.2% |
| Mud | 69 | 0.2% |
| (Missing) | 3017 | 8.3% |
Words
| Value | Count | Frequency (%) |
| tile | 16272 | |
| cement | 12961 | |
| garden | 1806 | 4.6% |
| (empty) | 1806 | 4.6% |
| not | 1806 | 4.6% |
| relevant | 1806 | 4.6% |
| concrete | 1263 | 3.3% |
| teraso | 398 | 1.0% |
| other | 306 | 0.8% |
| wood | 171 | 0.4% |
| Other values (4) | 248 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 50949 | |
| t | 18215 | 9.3% |
| l | 18078 | 9.2% |
| n | 18015 | 9.2% |
| T | 16670 | 8.5% |
| i | 16413 | 8.3% |
| C | 14224 | 7.2% |
| m | 12961 | 6.6% |
| r | 5652 | 2.9% |
| (space) | 5452 | 2.8% |
| Other values (18) | 20237 | 10.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 154343 | |
| Uppercase Letter | 35197 | 17.9% |
| Space Separator | 5452 | 2.8% |
| Dash Punctuation | 1806 | 0.9% |
| Open Punctuation | 34 | < 0.1% |
| Close Punctuation | 34 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 50949 | |
| t | 18215 | 11.8% |
| l | 18078 | 11.7% |
| n | 18015 | 11.7% |
| i | 16413 | 10.6% |
| m | 12961 | 8.4% |
| r | 5652 | 3.7% |
| a | 4155 | 2.7% |
| o | 3809 | 2.5% |
| d | 2152 | 1.4% |
| Other values (6) | 3944 | 2.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 16670 | |
| C | 14224 | |
| G | 1879 | 5.3% |
| N | 1806 | 5.1% |
| O | 306 | 0.9% |
| W | 171 | 0.5% |
| S | 72 | 0.2% |
| M | 69 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5452 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1806 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 34 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 34 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 189540 | |
| Common | 7326 | 3.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 50949 | |
| t | 18215 | 9.6% |
| l | 18078 | 9.5% |
| n | 18015 | 9.5% |
| T | 16670 | 8.8% |
| i | 16413 | 8.7% |
| C | 14224 | 7.5% |
| m | 12961 | 6.8% |
| r | 5652 | 3.0% |
| a | 4155 | 2.2% |
| Other values (14) | 14208 | 7.5% |
Common
| Value | Count | Frequency (%) |
| (space) | 5452 | |
| - | 1806 | 24.7% |
| ( | 34 | 0.5% |
| ) | 34 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 196866 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 50949 | |
| t | 18215 | 9.3% |
| l | 18078 | 9.2% |
| n | 18015 | 9.2% |
| T | 16670 | 8.5% |
| i | 16413 | 8.3% |
| C | 14224 | 7.2% |
| m | 12961 | 6.6% |
| r | 5652 | 2.9% |
| (space) | 5452 | 2.8% |
| Other values (18) | 20237 | 10.3% |
no_of_doors_opened_to_external_environment
Real number (ℝ)
Missing  Skewed  Zeros 
| Distinct | 18 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 3017 |
| Missing (%) | 8.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.50076368 |
| Minimum | 0 |
| Maximum | 90 |
| Zeros | 19548 |
| Zeros (%) | 53.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 90 |
| Range | 90 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.93502175 |
|---|---|
| Coefficient of variation (CV) | 1.8671916 |
| Kurtosis | 2638.9937 |
| Mean | 0.50076368 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 30.980149 |
| Sum | 16721 |
| Variance | 0.87426566 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 19548 | |
| 1 | 12298 | |
| 2 | 972 | 2.7% |
| 3 | 280 | 0.8% |
| 4 | 174 | 0.5% |
| 5 | 33 | 0.1% |
| 6 | 33 | 0.1% |
| 8 | 20 | 0.1% |
| 7 | 16 | < 0.1% |
| 12 | 4 | < 0.1% |
| Other values (8) | 13 | < 0.1% |
| (Missing) | 3017 | 8.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 19548 | |
| 1 | 12298 | |
| 2 | 972 | 2.7% |
| 3 | 280 | 0.8% |
| 4 | 174 | 0.5% |
| 5 | 33 | 0.1% |
| 6 | 33 | 0.1% |
| 7 | 16 | < 0.1% |
| 8 | 20 | 0.1% |
| 9 | 3 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 90 | 1 | < 0.1% |
| 41 | 1 | < 0.1% |
| 21 | 1 | < 0.1% |
| 15 | 1 | < 0.1% |
| 13 | 1 | < 0.1% |
| 12 | 4 | < 0.1% |
| 11 | 3 | < 0.1% |
| 10 | 2 | < 0.1% |
| 9 | 3 | < 0.1% |
| 8 | 20 |
no_of_windows
Real number (ℝ)
High correlation  Missing  Zeros 
| Distinct | 24 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 3017 |
| Missing (%) | 8.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.5957294 |
| Minimum | 0 |
| Maximum | 52 |
| Zeros | 12279 |
| Zeros (%) | 33.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 5 |
| Maximum | 52 |
| Range | 52 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.9270753 |
|---|---|
| Coefficient of variation (CV) | 1.2076455 |
| Kurtosis | 23.017829 |
| Mean | 1.5957294 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.7094613 |
| Sum | 53283 |
| Variance | 3.7136194 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 12279 | |
| 1 | 6797 | |
| 2 | 6039 | |
| 3 | 4770 | 13.1% |
| 4 | 1498 | 4.1% |
| 6 | 643 | 1.8% |
| 5 | 466 | 1.3% |
| 8 | 282 | 0.8% |
| 7 | 279 | 0.8% |
| 9 | 137 | 0.4% |
| Other values (14) | 201 | 0.6% |
| (Missing) | 3017 | 8.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 12279 | |
| 1 | 6797 | |
| 2 | 6039 | |
| 3 | 4770 | 13.1% |
| 4 | 1498 | 4.1% |
| 5 | 466 | 1.3% |
| 6 | 643 | 1.8% |
| 7 | 279 | 0.8% |
| 8 | 282 | 0.8% |
| 9 | 137 | 0.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 52 | 1 | < 0.1% |
| 28 | 1 | < 0.1% |
| 22 | 2 | < 0.1% |
| 21 | 1 | < 0.1% |
| 20 | 4 | |
| 18 | 2 | < 0.1% |
| 17 | 4 | |
| 16 | 8 | |
| 15 | 6 | |
| 14 | 9 |
main_material_used_for_window_panes
Categorical
Imbalance  Missing 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 22084 |
| Missing (%) | 60.7% |
| Memory size | 284.6 KiB |
| Glass | |
|---|---|
| Wood | |
| Other | 203 |
| None, it's open | 185 |
| Net | 142 |
Length
| Max length | 21 |
|---|---|
| Median length | 5 |
| Mean length | 4.9256493 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Glass |
|---|---|
| 2nd row | Glass |
| 3rd row | Glass |
| 4th row | Glass |
| 5th row | Glass |
Common Values
| Value | Count | Frequency (%) |
| Glass | 10942 | |
| Wood | 2839 | 7.8% |
| Other | 203 | 0.6% |
| None, it's open | 185 | 0.5% |
| Net | 142 | 0.4% |
| Garden - Not relevant | 13 | < 0.1% |
| (Missing) | 22084 |
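The Imbalance alert for this variable reports 61.2%. A score of this kind can be reproduced as one minus the normalized Shannon entropy of the non-missing category counts. The report does not document the formula, so the function below (`imbalance_score`, a name chosen for illustration) is an assumption about how the figure is computed. It does reproduce the reported value from the counts in the table above.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum(p * math.log(p) for p in probs)
    max_entropy = math.log(len(probs))  # entropy of a uniform distribution
    return 1 - entropy / max_entropy


# Non-missing counts from the "Common Values" table:
# Glass, Wood, Other, None it's open, Net, Garden - Not relevant.
window_pane_counts = [10_942, 2_839, 203, 185, 142, 13]
score = imbalance_score(window_pane_counts)
print(f"{100 * score:.1f}%")
```

The ratio does not depend on the logarithm base, so natural logarithms are used throughout.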
Words
| Value | Count | Frequency (%) |
| glass | 10942 | |
| wood | 2839 | 19.3% |
| other | 203 | 1.4% |
| none | 185 | 1.3% |
| it's | 185 | 1.3% |
| open | 185 | 1.3% |
| net | 142 | 1.0% |
| garden | 13 | 0.1% |
| (empty) | 13 | 0.1% |
| not | 13 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 22069 | |
| a | 10968 | |
| G | 10955 | |
| l | 10955 | |
| o | 6061 | 8.6% |
| d | 2852 | 4.0% |
| W | 2839 | 4.0% |
| e | 754 | 1.1% |
| t | 556 | 0.8% |
| (space) | 409 | 0.6% |
| Other values (11) | 2137 | 3.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 55426 | |
| Uppercase Letter | 14337 | 20.3% |
| Space Separator | 409 | 0.6% |
| Other Punctuation | 370 | 0.5% |
| Dash Punctuation | 13 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 22069 | |
| a | 10968 | |
| l | 10955 | |
| o | 6061 | 10.9% |
| d | 2852 | 5.1% |
| e | 754 | 1.4% |
| t | 556 | 1.0% |
| n | 396 | 0.7% |
| r | 229 | 0.4% |
| h | 203 | 0.4% |
| Other values (3) | 383 | 0.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 10955 | |
| W | 2839 | 19.8% |
| N | 340 | 2.4% |
| O | 203 | 1.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 185 | |
| ' | 185 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 409 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 13 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 69763 | |
| Common | 792 | 1.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 22069 | |
| a | 10968 | |
| G | 10955 | |
| l | 10955 | |
| o | 6061 | 8.7% |
| d | 2852 | 4.1% |
| W | 2839 | 4.1% |
| e | 754 | 1.1% |
| t | 556 | 0.8% |
| n | 396 | 0.6% |
| Other values (7) | 1358 | 1.9% |
Common
| Value | Count | Frequency (%) |
| (space) | 409 | |
| , | 185 | |
| ' | 185 | |
| - | 13 | 1.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 70555 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 22069 | |
| a | 10968 | |
| G | 10955 | |
| l | 10955 | |
| o | 6061 | 8.6% |
| d | 2852 | 4.0% |
| W | 2839 | 4.0% |
| e | 754 | 1.1% |
| t | 556 | 0.8% |
| (space) | 409 | 0.6% |
| Other values (11) | 2137 | 3.0% |
have_curtains_or_blinds_for_windows
Categorical
Missing 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 15296 |
| Missing (%) | 42.0% |
| Memory size | 284.6 KiB |
| Yes | |
|---|---|
| No | |
| Garden - Not relevant | 38 |
Length
| Max length | 21 |
|---|---|
| Median length | 3 |
| Mean length | 2.6610932 |
| Min length | 2 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Yes |
|---|---|
| 2nd row | Yes |
| 3rd row | Yes |
| 4th row | Yes |
| 5th row | Yes |
Common Values
| Value | Count | Frequency (%) |
| Yes | 13235 | |
| No | 7839 | |
| Garden - Not relevant | 38 | 0.1% |
| (Missing) | 15296 |
Words
| Value | Count | Frequency (%) |
| yes | 13235 | |
| no | 7839 | |
| garden | 38 | 0.2% |
| (empty) | 38 | 0.2% |
| not | 38 | 0.2% |
| relevant | 38 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 13349 | |
| Y | 13235 | |
| s | 13235 | |
| N | 7877 | |
| o | 7877 | |
| (space) | 114 | 0.2% |
| a | 76 | 0.1% |
| n | 76 | 0.1% |
| r | 76 | 0.1% |
| t | 76 | 0.1% |
| Other values (5) | 190 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 34879 | |
| Uppercase Letter | 21150 | |
| Space Separator | 114 | 0.2% |
| Dash Punctuation | 38 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 13349 | |
| s | 13235 | |
| o | 7877 | |
| a | 76 | 0.2% |
| n | 76 | 0.2% |
| r | 76 | 0.2% |
| t | 76 | 0.2% |
| d | 38 | 0.1% |
| l | 38 | 0.1% |
| v | 38 | 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| Y | 13235 | |
| N | 7877 | |
| G | 38 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 114 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 38 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 56029 | |
| Common | 152 | 0.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 13349 | |
| Y | 13235 | |
| s | 13235 | |
| N | 7877 | |
| o | 7877 | |
| a | 76 | 0.1% |
| n | 76 | 0.1% |
| r | 76 | 0.1% |
| t | 76 | 0.1% |
| G | 38 | 0.1% |
| Other values (3) | 114 | 0.2% |
Common
| Value | Count | Frequency (%) |
| (space) | 114 | |
| - | 38 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 56181 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 13349 | |
| Y | 13235 | |
| s | 13235 | |
| N | 7877 | |
| o | 7877 | |
| (space) | 114 | 0.2% |
| a | 76 | 0.1% |
| n | 76 | 0.1% |
| r | 76 | 0.1% |
| t | 76 | 0.1% |
| Other values (5) | 190 | 0.3% |
have_extra_ventilation_openings
Categorical
High correlation  Missing 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3017 |
| Missing (%) | 8.3% |
| Memory size | 284.6 KiB |
| Yes | 17714 |
|---|---|
| No | 13370 |
| Garden - Not relevant | 2307 |
Length
| Max length | 21 |
|---|---|
| Median length | 3 |
| Mean length | 3.8432212 |
| Min length | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | No |
|---|---|
| 2nd row | No |
| 3rd row | No |
| 4th row | No |
| 5th row | No |
Common Values
| Value | Count | Frequency (%) |
| Yes | 17714 | 48.7% |
| No | 13370 | 36.7% |
| Garden - Not relevant | 2307 | 6.3% |
| (Missing) | 3017 | 8.3% |
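The Frequency (%) column in this table appears to use all 36,408 observations as the denominator, so missing cells count toward the total. A minimal sketch, assuming that convention, rebuilds the column from the counts reported above. The `frequency_table` helper is hypothetical and is not part of the profiling tool.

```python
from collections import Counter


def frequency_table(values):
    """Return {value: (count, percent)}; None is reported as '(Missing)'."""
    counts = Counter("(Missing)" if v is None else v for v in values)
    total = len(values)  # missing cells stay in the denominator
    return {k: (c, round(100 * c / total, 1)) for k, c in counts.most_common()}


# Rebuild the column from the counts reported for have_extra_ventilation_openings.
column = (
    ["Yes"] * 17714
    + ["No"] * 13370
    + ["Garden - Not relevant"] * 2307
    + [None] * 3017
)

table = frequency_table(column)
for value, (count, pct) in table.items():
    print(f"{value:<24}{count:>7}{pct:>7.1f}%")
```

Under this assumption the sketch reproduces the 6.3% and 8.3% shown for the last two rows. The word and character tables further down use a different denominator (total words or total characters), so their percentages are not directly comparable.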
Common Values (Plot)
| Value | Count | Frequency (%) |
| yes | 17714 | 43.9% |
| no | 13370 | 33.2% |
| garden | 2307 | 5.7% |
| - | 2307 | 5.7% |
| not | 2307 | 5.7% |
| relevant | 2307 | 5.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 24635 | 19.2% |
| Y | 17714 | 13.8% |
| s | 17714 | 13.8% |
| N | 15677 | 12.2% |
| o | 15677 | 12.2% |
| (space) | 6921 | 5.4% |
| a | 4614 | 3.6% |
| n | 4614 | 3.6% |
| r | 4614 | 3.6% |
| t | 4614 | 3.6% |
| Other values (5) | 11535 | 9.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 83403 | 65.0% |
| Uppercase Letter | 35698 | 27.8% |
| Space Separator | 6921 | 5.4% |
| Dash Punctuation | 2307 | 1.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 24635 | 29.5% |
| s | 17714 | 21.2% |
| o | 15677 | 18.8% |
| a | 4614 | 5.5% |
| n | 4614 | 5.5% |
| r | 4614 | 5.5% |
| t | 4614 | 5.5% |
| d | 2307 | 2.8% |
| l | 2307 | 2.8% |
| v | 2307 | 2.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| Y | 17714 | 49.6% |
| N | 15677 | 43.9% |
| G | 2307 | 6.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 6921 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2307 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 119101 | 92.8% |
| Common | 9228 | 7.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 24635 | 20.7% |
| Y | 17714 | 14.9% |
| s | 17714 | 14.9% |
| N | 15677 | 13.2% |
| o | 15677 | 13.2% |
| a | 4614 | 3.9% |
| n | 4614 | 3.9% |
| r | 4614 | 3.9% |
| t | 4614 | 3.9% |
| G | 2307 | 1.9% |
| Other values (3) | 6921 | 5.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 6921 | 75.0% |
| - | 2307 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 128329 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 24635 | 19.2% |
| Y | 17714 | 13.8% |
| s | 17714 | 13.8% |
| N | 15677 | 12.2% |
| o | 15677 | 12.2% |
| (space) | 6921 | 5.4% |
| a | 4614 | 3.6% |
| n | 4614 | 3.6% |
| r | 4614 | 3.6% |
| t | 4614 | 3.6% |
| Other values (5) | 11535 | 9.0% |
no_of_bulbs_in_the_room
Real number (ℝ)
Missing  Zeros 
| Distinct | 21 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 3017 |
| Missing (%) | 8.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.7330418 |
| Minimum | 0 |
|---|---|
| Maximum | 20 |
| Zeros | 2257 |
| Zeros (%) | 6.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 5 |
| Maximum | 20 |
| Range | 20 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 2.0331509 |
|---|---|
| Coefficient of variation (CV) | 1.173169 |
| Kurtosis | 25.597735 |
| Mean | 1.7330418 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.3728391 |
| Sum | 57868 |
| Variance | 4.1337028 |
| Monotonicity | Not monotonic |
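The derived statistics in these tables follow from a handful of base figures, which makes them easy to cross-check. The sketch below is a consistency check on the reported numbers, not the profiler's own computation; it assumes that Sum and Mean are taken over the non-missing rows only.

```python
import math

# Values copied from the no_of_bulbs_in_the_room tables.
n_total, n_missing = 36408, 3017
total_sum = 57868
mean, std, variance = 1.7330418, 2.0331509, 4.1337028
q1, q3 = 1, 2
minimum, maximum = 0, 20

n_valid = n_total - n_missing  # non-missing rooms (assumed denominator)

derived = {
    "mean": total_sum / n_valid,  # Sum / non-missing count
    "cv": std / mean,             # coefficient of variation
    "variance": std ** 2,
    "range": maximum - minimum,
    "iqr": q3 - q1,
}
reported = {"mean": mean, "cv": 1.173169, "variance": variance, "range": 20, "iqr": 1}

for name, value in derived.items():
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name:<9} derived={value:.7f} reported={reported[name]} match={ok}")
```

All five derived values agree with the reported ones to within rounding, which supports the assumption that missing rows are excluded from the numeric summaries even though they are included in the Frequency (%) denominators.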
| Value | Count | Frequency (%) |
| 1 | 20781 | 57.1% |
| 2 | 5581 | 15.3% |
| 0 | 2257 | 6.2% |
| 3 | 1558 | 4.3% |
| 4 | 1157 | 3.2% |
| 5 | 557 | 1.5% |
| 6 | 452 | 1.2% |
| 7 | 220 | 0.6% |
| 8 | 197 | 0.5% |
| 10 | 166 | 0.5% |
| Other values (11) | 465 | 1.3% |
| (Missing) | 3017 | 8.3% |
| Value | Count | Frequency (%) |
| 0 | 2257 | 6.2% |
| 1 | 20781 | 57.1% |
| 2 | 5581 | 15.3% |
| 3 | 1558 | 4.3% |
| 4 | 1157 | 3.2% |
| 5 | 557 | 1.5% |
| 6 | 452 | 1.2% |
| 7 | 220 | 0.6% |
| 8 | 197 | 0.5% |
| 9 | 98 | 0.3% |
| Value | Count | Frequency (%) |
| 20 | 54 | 0.1% |
| 19 | 9 | < 0.1% |
| 18 | 23 | 0.1% |
| 17 | 3 | < 0.1% |
| 16 | 16 | < 0.1% |
| 15 | 102 | 0.3% |
| 14 | 32 | 0.1% |
| 13 | 23 | 0.1% |
| 12 | 64 | 0.2% |
| 11 | 41 | 0.1% |
no_of_bulbs_used_during_last_week
Real number (ℝ)
Missing  Zeros 
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 26603 |
| Missing (%) | 73.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0333503 |
| Minimum | 0 |
|---|---|
| Maximum | 11 |
| Zeros | 1291 |
| Zeros (%) | 3.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 11 |
| Range | 11 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.70639199 |
|---|---|
| Coefficient of variation (CV) | 0.6835939 |
| Kurtosis | 22.445811 |
| Mean | 1.0333503 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.1117452 |
| Sum | 10132 |
| Variance | 0.49898964 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 7458 | 20.5% |
| 0 | 1291 | 3.5% |
| 2 | 726 | 2.0% |
| 3 | 200 | 0.5% |
| 4 | 83 | 0.2% |
| 6 | 19 | 0.1% |
| 5 | 16 | < 0.1% |
| 7 | 6 | < 0.1% |
| 8 | 3 | < 0.1% |
| 11 | 1 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
| (Missing) | 26603 | 73.1% |
| Value | Count | Frequency (%) |
| 0 | 1291 | 3.5% |
| 1 | 7458 | 20.5% |
| 2 | 726 | 2.0% |
| 3 | 200 | 0.5% |
| 4 | 83 | 0.2% |
| 5 | 16 | < 0.1% |
| 6 | 19 | 0.1% |
| 7 | 6 | < 0.1% |
| 8 | 3 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 11 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 8 | 3 | < 0.1% |
| 7 | 6 | < 0.1% |
| 6 | 19 | 0.1% |
| 5 | 16 | < 0.1% |
| 4 | 83 | 0.2% |
| 3 | 200 | 0.5% |
| 2 | 726 | 2.0% |
no_of_fans_in_the_room
Real number (ℝ)
High correlation  Missing  Zeros 
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3017 |
| Missing (%) | 8.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.40504926 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 21657 |
| Zeros (%) | 59.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.63488269 |
|---|---|
| Coefficient of variation (CV) | 1.5674209 |
| Kurtosis | 12.267449 |
| Mean | 0.40504926 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.3606338 |
| Sum | 13525 |
| Variance | 0.40307603 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 21657 | 59.5% |
| 1 | 10512 | 28.9% |
| 2 | 873 | 2.4% |
| 3 | 200 | 0.5% |
| 4 | 101 | 0.3% |
| 5 | 41 | 0.1% |
| 9 | 3 | < 0.1% |
| 8 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| (Missing) | 3017 | 8.3% |
| Value | Count | Frequency (%) |
| 0 | 21657 | 59.5% |
| 1 | 10512 | 28.9% |
| 2 | 873 | 2.4% |
| 3 | 200 | 0.5% |
| 4 | 101 | 0.3% |
| 5 | 41 | 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 10 | 1 | < 0.1% |
| 9 | 3 | < 0.1% |
| 8 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 5 | 41 | 0.1% |
| 4 | 101 | 0.3% |
| 3 | 200 | 0.5% |
| 2 | 873 | 2.4% |
| 1 | 10512 | 28.9% |
no_of_ACs_in_the_room
Categorical
Imbalance  Missing 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3017 |
| Missing (%) | 8.3% |
| Memory size | 284.6 KiB |
| 0.0 | 32305 |
|---|---|
| 1.0 | 1029 |
| 2.0 | 30 |
| 3.0 | 26 |
| 4.0 | 1 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 32305 | 88.7% |
| 1.0 | 1029 | 2.8% |
| 2.0 | 30 | 0.1% |
| 3.0 | 26 | 0.1% |
| 4.0 | 1 | < 0.1% |
| (Missing) | 3017 | 8.3% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0.0 | 32305 | 96.7% |
| 1.0 | 1029 | 3.1% |
| 2.0 | 30 | 0.1% |
| 3.0 | 26 | 0.1% |
| 4.0 | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 65696 | 65.6% |
| . | 33391 | 33.3% |
| 1 | 1029 | 1.0% |
| 2 | 30 | < 0.1% |
| 3 | 26 | < 0.1% |
| 4 | 1 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 66782 | 66.7% |
| Other Punctuation | 33391 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 65696 | 98.4% |
| 1 | 1029 | 1.5% |
| 2 | 30 | < 0.1% |
| 3 | 26 | < 0.1% |
| 4 | 1 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 33391 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 100173 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 65696 | 65.6% |
| . | 33391 | 33.3% |
| 1 | 1029 | 1.0% |
| 2 | 30 | < 0.1% |
| 3 | 26 | < 0.1% |
| 4 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100173 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 65696 | 65.6% |
| . | 33391 | 33.3% |
| 1 | 1029 | 1.0% |
| 2 | 30 | < 0.1% |
| 3 | 26 | < 0.1% |
| 4 | 1 | < 0.1% |
Interactions
Correlations
| | have_curtains_or_blinds_for_windows | have_extra_ventilation_openings | main_material_used_for_the_floor_of_the_room | main_material_used_for_the_roof_of_the_room | main_material_used_for_window_panes | main_purpose_of_the_room | no_of_ACs_in_the_room | no_of_bulbs_in_the_room | no_of_bulbs_used_during_last_week | no_of_doors_opened_to_external_environment | no_of_fans_in_the_room | no_of_windows | room_ID | storey_which_the_room_located | type_of_ceiling_of_the_room |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| have_curtains_or_blinds_for_windows | 1.000 | 0.363 | 0.179 | 0.210 | 0.295 | 0.327 | 0.101 | 0.117 | 0.098 | 0.045 | 0.077 | 0.105 | 0.259 | 0.017 | 0.151 |
| have_extra_ventilation_openings | 0.363 | 1.000 | 0.558 | 0.543 | 0.296 | 0.462 | 0.039 | 0.074 | 0.070 | 0.000 | 0.063 | 0.082 | 0.277 | 0.029 | 0.549 |
| main_material_used_for_the_floor_of_the_room | 0.179 | 0.558 | 1.000 | 0.357 | 0.178 | 0.223 | 0.082 | 0.073 | 0.051 | 0.000 | 0.043 | 0.044 | 0.126 | 0.025 | 0.372 |
| main_material_used_for_the_roof_of_the_room | 0.210 | 0.543 | 0.357 | 1.000 | 0.179 | 0.235 | 0.027 | 0.031 | 0.034 | 0.000 | 0.033 | 0.027 | 0.118 | 0.037 | 0.455 |
| main_material_used_for_window_panes | 0.295 | 0.296 | 0.178 | 0.179 | 1.000 | 0.136 | 0.015 | 0.030 | 0.000 | 0.000 | 0.029 | 0.020 | 0.030 | 0.011 | 0.127 |
| main_purpose_of_the_room | 0.327 | 0.462 | 0.223 | 0.235 | 0.136 | 1.000 | 0.093 | 0.124 | 0.089 | 0.000 | 0.080 | 0.162 | 0.443 | 0.038 | 0.258 |
| no_of_ACs_in_the_room | 0.101 | 0.039 | 0.082 | 0.027 | 0.015 | 0.093 | 1.000 | 0.373 | 0.195 | 0.011 | 0.205 | 0.033 | 0.070 | 0.028 | 0.069 |
| no_of_bulbs_in_the_room | 0.117 | 0.074 | 0.073 | 0.031 | 0.030 | 0.124 | 0.373 | 1.000 | 0.342 | 0.186 | 0.305 | 0.275 | 0.121 | 0.065 | 0.079 |
| no_of_bulbs_used_during_last_week | 0.098 | 0.070 | 0.051 | 0.034 | 0.000 | 0.089 | 0.195 | 0.342 | 1.000 | 0.187 | 0.215 | 0.181 | 0.083 | -0.117 | 0.069 |
| no_of_doors_opened_to_external_environment | 0.045 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.011 | 0.186 | 0.187 | 1.000 | 0.165 | 0.269 | 0.000 | -0.004 | 0.003 |
| no_of_fans_in_the_room | 0.077 | 0.063 | 0.043 | 0.033 | 0.029 | 0.080 | 0.205 | 0.305 | 0.215 | 0.165 | 1.000 | 0.520 | 0.080 | 0.044 | 0.051 |
| no_of_windows | 0.105 | 0.082 | 0.044 | 0.027 | 0.020 | 0.162 | 0.033 | 0.275 | 0.181 | 0.269 | 0.520 | 1.000 | 0.159 | 0.022 | 0.046 |
| room_ID | 0.259 | 0.277 | 0.126 | 0.118 | 0.030 | 0.443 | 0.070 | 0.121 | 0.083 | 0.000 | 0.080 | 0.159 | 1.000 | 0.017 | 0.134 |
| storey_which_the_room_located | 0.017 | 0.029 | 0.025 | 0.037 | 0.011 | 0.038 | 0.028 | 0.065 | -0.117 | -0.004 | 0.044 | 0.022 | 0.017 | 1.000 | 0.036 |
| type_of_ceiling_of_the_room | 0.151 | 0.549 | 0.372 | 0.455 | 0.127 | 0.258 | 0.069 | 0.079 | 0.069 | 0.003 | 0.051 | 0.046 | 0.134 | 0.036 | 1.000 |
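The "High correlation" alerts in the report overview can be read off this matrix. The sketch below copies a few coefficients and filters them against a cutoff of 0.5. That cutoff is an assumption: the report does not state it, but it is consistent with the alerts, since every flagged pair is at least 0.520 and the strongest unflagged pair is 0.462. The `high_pairs` helper is hypothetical.

```python
# A few coefficients copied from the matrix above, keyed by (row, column).
corr = {
    ("have_extra_ventilation_openings", "main_material_used_for_the_floor_of_the_room"): 0.558,
    ("have_extra_ventilation_openings", "main_material_used_for_the_roof_of_the_room"): 0.543,
    ("have_extra_ventilation_openings", "type_of_ceiling_of_the_room"): 0.549,
    ("no_of_fans_in_the_room", "no_of_windows"): 0.520,
    ("main_purpose_of_the_room", "have_extra_ventilation_openings"): 0.462,
    ("main_material_used_for_the_roof_of_the_room", "type_of_ceiling_of_the_room"): 0.455,
    ("no_of_ACs_in_the_room", "no_of_bulbs_in_the_room"): 0.373,
}

THRESHOLD = 0.5  # assumed cutoff; not stated in this report


def high_pairs(coefficients, threshold=THRESHOLD):
    """Return (pair, value) tuples with |value| >= threshold, strongest first."""
    flagged = [(pair, r) for pair, r in coefficients.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


flagged = high_pairs(corr)
for (left, right), r in flagged:
    print(f"{r:.3f}  {left} <-> {right}")
```

With these inputs the filter keeps the three `have_extra_ventilation_openings` pairs and the fans/windows pair, matching the alerts listed in the overview.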
Missing values
Sample
| | household_ID | room_ID | main_purpose_of_the_room | storey_which_the_room_located | main_material_used_for_the_roof_of_the_room | type_of_ceiling_of_the_room | main_material_used_for_the_floor_of_the_room | no_of_doors_opened_to_external_environment | no_of_windows | main_material_used_for_window_panes | have_curtains_or_blinds_for_windows | have_extra_ventilation_openings | no_of_bulbs_in_the_room | no_of_bulbs_used_during_last_week | no_of_fans_in_the_room | no_of_ACs_in_the_room |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ID0001 | I1 | Living room | 0.0 | Asbestos | A beamed ceiling | Tile | 1.0 | 2.0 | Glass | Yes | No | 4.0 | NaN | 1.0 | 0.0 |
| 1 | ID0001 | I2 | Bedrooms | 0.0 | Asbestos | A beamed ceiling | Tile | 1.0 | 1.0 | Glass | Yes | No | 1.0 | NaN | 1.0 | 0.0 |
| 2 | ID0001 | I3 | Bedrooms | 0.0 | Asbestos | A beamed ceiling | Tile | 1.0 | 1.0 | Glass | Yes | No | 1.0 | NaN | 1.0 | 0.0 |
| 3 | ID0001 | I4 | Bedrooms | 1.0 | Asbestos | A beamed ceiling | Cement | 1.0 | 1.0 | Glass | Yes | No | 1.0 | NaN | 1.0 | 0.0 |
| 4 | ID0001 | I5 | Bedrooms | 1.0 | Asbestos | A beamed ceiling | Cement | 1.0 | 1.0 | Glass | Yes | No | 1.0 | NaN | 1.0 | 0.0 |
| 5 | ID0001 | I6 | Kitchen and/ or pantry | 1.0 | Asbestos | A beamed ceiling | Tile | 1.0 | 1.0 | Glass | Yes | Yes | 1.0 | NaN | 0.0 | 0.0 |
| 6 | ID0001 | I7 | Bathroom and / or toilets | 0.0 | Asbestos | A beamed ceiling | Tile | 1.0 | 0.0 | NaN | NaN | Yes | 1.0 | NaN | 0.0 | 0.0 |
| 7 | ID0001 | I8 | Gaming room | 0.0 | Asbestos | No ceiling, the concrete slab | Concrete | 0.0 | 0.0 | NaN | NaN | No | 0.0 | NaN | 0.0 | 0.0 |
| 8 | ID0001 | I9 | Servant's Room | 0.0 | Garden - Not relevant | Garden - Not relevant | Garden - Not relevant | 0.0 | 0.0 | NaN | NaN | Garden - Not relevant | 0.0 | NaN | 0.0 | 0.0 |
| 9 | ID0001 | I10 | Passage | 0.0 | Garden - Not relevant | Garden - Not relevant | Garden - Not relevant | 0.0 | 0.0 | NaN | NaN | Garden - Not relevant | 0.0 | NaN | 0.0 | 0.0 |
| | household_ID | room_ID | main_purpose_of_the_room | storey_which_the_room_located | main_material_used_for_the_roof_of_the_room | type_of_ceiling_of_the_room | main_material_used_for_the_floor_of_the_room | no_of_doors_opened_to_external_environment | no_of_windows | main_material_used_for_window_panes | have_curtains_or_blinds_for_windows | have_extra_ventilation_openings | no_of_bulbs_in_the_room | no_of_bulbs_used_during_last_week | no_of_fans_in_the_room | no_of_ACs_in_the_room |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 36398 | ID4063 | I1 | Living room | 0.0 | Concrete | No ceiling, the concrete slab | Concrete | 1.0 | 6.0 | NaN | Yes | No | 2.0 | 1.0 | 1.0 | 0.0 |
| 36399 | ID4063 | I2 | Living room | 0.0 | Concrete | No ceiling, the concrete slab | Concrete | 1.0 | 3.0 | NaN | Yes | No | 1.0 | 1.0 | 1.0 | 0.0 |
| 36400 | ID4063 | I3 | Bedrooms | 0.0 | Concrete | No ceiling, the concrete slab | Concrete | 1.0 | 3.0 | NaN | Yes | No | 1.0 | 1.0 | 1.0 | 0.0 |
| 36401 | ID4063 | I4 | Bedrooms | 1.0 | Asbestos | No ceiling, just the roof above | Concrete | 1.0 | 2.0 | NaN | No | No | 0.0 | NaN | 0.0 | 0.0 |
| 36402 | ID4063 | I5 | Bedrooms | 1.0 | Asbestos | No ceiling, just the roof above | Concrete | 1.0 | 2.0 | NaN | No | No | 0.0 | NaN | 0.0 | 0.0 |
| 36403 | ID4063 | I6 | Bedrooms | 1.0 | Asbestos | No ceiling, just the roof above | Concrete | 2.0 | 4.0 | NaN | No | No | 1.0 | 0.0 | 0.0 | 0.0 |
| 36404 | ID4063 | I7 | Kitchen and/ or pantry | 0.0 | Concrete | No ceiling, the concrete slab | Concrete | 1.0 | 3.0 | NaN | Yes | No | 1.0 | 1.0 | 0.0 | 0.0 |
| 36405 | ID4063 | I8 | Passage | 0.0 | Concrete | No ceiling, the concrete slab | Concrete | 1.0 | 3.0 | NaN | No | No | 1.0 | 1.0 | 0.0 | 0.0 |
| 36406 | ID4063 | I9 | Veranda | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 36407 | ID4063 | I10 | Veranda | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |