Overview

Dataset statistics

Number of variables: 36
Number of observations: 4,063
Missing cells: 52,098
Missing cells (%): 35.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 MiB
Average record size in memory: 288.0 B
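
The missing-cell percentage follows from the other figures in this table. A minimal check, assuming the percentage is taken over all cells (variables times observations); the variable names are illustrative only:

```python
# Values copied from the "Dataset statistics" table.
n_variables = 36
n_observations = 4_063
missing_cells = 52_098

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells:,}")       # total cells: 146,268
print(f"missing cells: {missing_pct:.1f}%")  # missing cells: 35.6%
```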

Variable types

Text: 1
Boolean: 29
Numeric: 1
Categorical: 5

Alerts

biogas_used_for_cooking has constant value "False" Constant
aware_of_no_of_units_generated_by_solar_system is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 3 other fields High correlation
boil_water_before_drinking is highly overall correlated with source_of_energy_for_boiling_drinking_water High correlation
coconut_shells_or_charcoal_used_for_cooking is highly overall correlated with aware_of_no_of_units_generated_by_solar_system and 14 other fields High correlation
does_water_heating_equipment_serve_other_housing_units is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 1 other fields High correlation
firewood_used_for_cooking is highly overall correlated with source_of_energy_for_boiling_drinking_water High correlation
gas_used_for_cooking is highly overall correlated with source_of_energy_for_boiling_drinking_water High correlation
generate_electicity_using_mini_hydropower is highly overall correlated with no_of_units_generated_by_solar_system High correlation
generate_electicity_using_solar_energy is highly overall correlated with aware_of_no_of_units_generated_by_solar_system and 12 other fields High correlation
generate_electicity_using_wind_power is highly overall correlated with no_of_units_generated_by_solar_system High correlation
household_members_used_hot_water_last_week is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking High correlation
no_of_units_generated_by_solar_system is highly overall correlated with aware_of_no_of_units_generated_by_solar_system and 6 other fields High correlation
sawdust_or_paddy_husk_used_for_cooking is highly overall correlated with aware_of_no_of_units_generated_by_solar_system and 13 other fields High correlation
solar_energy_used_for_agricultural_systems is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 4 other fields High correlation
solar_energy_used_for_all_above is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 2 other fields High correlation
solar_energy_used_for_car_charging is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 2 other fields High correlation
solar_energy_used_for_cooking is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 2 other fields High correlation
solar_energy_used_for_other_purposes is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 2 other fields High correlation
solar_energy_used_for_outdoor_lighting is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 3 other fields High correlation
solar_energy_used_for_water_heating is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 3 other fields High correlation
solar_system_invertor_or_noninvertor is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 2 other fields High correlation
solar_system_ongrid_or_offgird is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 4 other fields High correlation
source_of_energy_for_boiling_drinking_water is highly overall correlated with boil_water_before_drinking and 5 other fields High correlation
water_heating_method_for_bathing is highly overall correlated with generate_electicity_using_solar_energy High correlation
when_was_solar_system_installed is highly overall correlated with coconut_shells_or_charcoal_used_for_cooking and 2 other fields High correlation
have_backup_generator is highly imbalanced (81.7%) Imbalance
generate_electicity_using_solar_energy is highly imbalanced (53.2%) Imbalance
generate_electicity_using_bio_energy is highly imbalanced (95.1%) Imbalance
generate_electicity_using_mini_hydropower is highly imbalanced (97.9%) Imbalance
generate_electicity_using_wind_power is highly imbalanced (97.5%) Imbalance
generate_electicity_using_other_methods is highly imbalanced (96.5%) Imbalance
solar_energy_used_for_water_heating is highly imbalanced (60.1%) Imbalance
solar_energy_used_for_cooking is highly imbalanced (82.0%) Imbalance
solar_energy_used_for_outdoor_lighting is highly imbalanced (59.3%) Imbalance
solar_energy_used_for_car_charging is highly imbalanced (93.7%) Imbalance
solar_energy_used_for_agricultural_systems is highly imbalanced (97.5%) Imbalance
solar_energy_used_for_all_above is highly imbalanced (86.0%) Imbalance
solar_energy_used_for_other_purposes is highly imbalanced (78.3%) Imbalance
have_system_to_store_backup_energy is highly imbalanced (61.6%) Imbalance
method_of_receiving_water is highly imbalanced (72.2%) Imbalance
does_water_heating_equipment_serve_other_housing_units is highly imbalanced (57.0%) Imbalance
electricity_generated_using_solar_energy_used_for_cooking is highly imbalanced (90.9%) Imbalance
kerosene_used_for_cooking is highly imbalanced (89.1%) Imbalance
sawdust_or_paddy_husk_used_for_cooking is highly imbalanced (97.9%) Imbalance
coconut_shells_or_charcoal_used_for_cooking is highly imbalanced (99.7%) Imbalance
other_methods_used_for_cooking is highly imbalanced (90.0%) Imbalance
solar_system_ongrid_or_offgird has 3658 (90.0%) missing values Missing
solar_system_invertor_or_noninvertor has 3658 (90.0%) missing values Missing
solar_energy_used_for_water_heating has 3658 (90.0%) missing values Missing
solar_energy_used_for_cooking has 3658 (90.0%) missing values Missing
solar_energy_used_for_outdoor_lighting has 3658 (90.0%) missing values Missing
solar_energy_used_for_car_charging has 3658 (90.0%) missing values Missing
solar_energy_used_for_agricultural_systems has 3658 (90.0%) missing values Missing
solar_energy_used_for_all_above has 3658 (90.0%) missing values Missing
solar_energy_used_for_other_purposes has 3658 (90.0%) missing values Missing
aware_of_no_of_units_generated_by_solar_system has 3658 (90.0%) missing values Missing
no_of_units_generated_by_solar_system has 3868 (95.2%) missing values Missing
when_was_solar_system_installed has 3658 (90.0%) missing values Missing
does_water_heating_equipment_serve_other_housing_units has 3302 (81.3%) missing values Missing
household_members_used_hot_water_last_week has 2457 (60.5%) missing values Missing
source_of_energy_for_boiling_drinking_water has 2233 (55.0%) missing values Missing
household_ID has unique values Unique
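
The Imbalance percentages in the alerts above are numerically consistent with one minus the normalized Shannon entropy of each column's value counts. The sketch below assumes that definition (it is inferred from the reported numbers, not stated in the report), and `imbalance_score` is a hypothetical helper name. The counts are taken from the have_backup_generator and generate_electicity_using_solar_energy frequency tables.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        # Assumption: single-valued columns are flagged "Constant" instead of scored.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# (False count, True count) copied from the per-variable tables.
backup_generator = imbalance_score([3950, 113])
solar_energy = imbalance_score([3658, 405])

print(f"have_backup_generator: {backup_generator:.1%}")               # 81.7%
print(f"generate_electicity_using_solar_energy: {solar_energy:.1%}")  # 53.2%
```

A perfectly balanced column scores 0 and a column dominated by one value approaches 1, which is why a 90/10 split already scores above 50%.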

Reproduction

Analysis started: 2024-12-06 05:55:00.385407
Analysis finished: 2024-12-06 05:55:04.072929
Duration: 3.69 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json
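
The reported duration can be recomputed from the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-12-06 05:55:00.385407")
finished = datetime.fromisoformat("2024-12-06 05:55:04.072929")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 3.69 seconds
```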

Variables

household_ID
Text

Unique 

Distinct: 4063
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 31.9 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 24,378
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4,063
Unique (%): 100.0%

Sample

1st row: ID0001
2nd row: ID0002
3rd row: ID0003
4th row: ID0004
5th row: ID0005
Value                Count  Frequency (%)
id0039               1      < 0.1%
id4063               1      < 0.1%
id0001               1      < 0.1%
id0002               1      < 0.1%
id0003               1      < 0.1%
id0004               1      < 0.1%
id0005               1      < 0.1%
id0006               1      < 0.1%
id0007               1      < 0.1%
id0008               1      < 0.1%
Other values (4053)  4053   99.8%

Most occurring characters

Value             Count  Frequency (%)
I                 4063   16.7%
D                 4063   16.7%
0                 2277   9.3%
3                 2217   9.1%
2                 2217   9.1%
1                 2217   9.1%
4                 1280   5.3%
5                 1216   5.0%
6                 1210   5.0%
7                 1206   4.9%
Other values (2)  2412   9.9%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    16252  66.7%
Uppercase Letter  8126   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      2277   14.0%
3      2217   13.6%
2      2217   13.6%
1      2217   13.6%
4      1280   7.9%
5      1216   7.5%
6      1210   7.4%
7      1206   7.4%
8      1206   7.4%
9      1206   7.4%

Uppercase Letter
Value  Count  Frequency (%)
I      4063   50.0%
D      4063   50.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  16252  66.7%
Latin   8126   33.3%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      2277   14.0%
3      2217   13.6%
2      2217   13.6%
1      2217   13.6%
4      1280   7.9%
5      1216   7.5%
6      1210   7.4%
7      1206   7.4%
8      1206   7.4%
9      1206   7.4%

Latin
Value  Count  Frequency (%)
I      4063   50.0%
D      4063   50.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  24378  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
I                 4063   16.7%
D                 4063   16.7%
0                 2277   9.3%
3                 2217   9.1%
2                 2217   9.1%
1                 2217   9.1%
4                 1280   5.3%
5                 1216   5.0%
6                 1210   5.0%
7                 1206   4.9%
Other values (2)  2412   9.9%
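
The character statistics for household_ID can be reproduced without the original file, under one assumption that the report does not state outright: the IDs are "ID" followed by a zero-padded counter running from 0001 to 4063. The sample rows and the 4,063 unique six-character values are consistent with that, and the counts below match the tables above.

```python
from collections import Counter

# Assumption: IDs run contiguously from ID0001 to ID4063.
ids = [f"ID{i:04d}" for i in range(1, 4064)]

char_counts = Counter("".join(ids))
total_chars = sum(char_counts.values())

print("total characters:", total_chars)          # 24378
print("distinct characters:", len(char_counts))  # 12
for ch in "ID014":
    share = 100 * char_counts[ch] / total_chars
    print(ch, char_counts[ch], f"{share:.1f}%")
```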

have_backup_generator
Boolean

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value  Count  Frequency (%)
False  3950   97.2%
True   113    2.8%

generate_electicity_using_solar_energy
Boolean

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value  Count  Frequency (%)
False  3658   90.0%
True   405    10.0%

generate_electicity_using_bio_energy
Boolean

Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value  Count  Frequency (%)
False  4041   99.5%
True   22     0.5%

generate_electicity_using_mini_hydropower
Boolean

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value  Count  Frequency (%)
False  4055   99.8%
True   8      0.2%

generate_electicity_using_wind_power
Boolean

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value  Count  Frequency (%)
False  4053   99.8%
True   10     0.2%

generate_electicity_using_other_methods
Boolean

Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value  Count  Frequency (%)
False  4048   99.6%
True   15     0.4%

solar_system_ongrid_or_offgird
Boolean

High correlation  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
True       330    8.1%
False      75     1.8%
(Missing)  3658   90.0%

solar_system_invertor_or_noninvertor
Boolean

High correlation  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
True       308    7.6%
False      97     2.4%
(Missing)  3658   90.0%

solar_energy_used_for_water_heating
Boolean

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
False      373    9.2%
True       32     0.8%
(Missing)  3658   90.0%

solar_energy_used_for_cooking
Boolean

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
False      394    9.7%
True       11     0.3%
(Missing)  3658   90.0%

solar_energy_used_for_outdoor_lighting
Boolean

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
False      372    9.2%
True       33     0.8%
(Missing)  3658   90.0%

solar_energy_used_for_car_charging
Boolean

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
False      402    9.9%
True       3      0.1%
(Missing)  3658   90.0%

solar_energy_used_for_agricultural_systems
Boolean

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
False      404    9.9%
True       1      < 0.1%
(Missing)  3658   90.0%

solar_energy_used_for_all_above
Boolean

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
False      397    9.8%
True       8      0.2%
(Missing)  3658   90.0%

solar_energy_used_for_other_purposes
Boolean

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
False      391    9.6%
True       14     0.3%
(Missing)  3658   90.0%

aware_of_no_of_units_generated_by_solar_system
Boolean

High correlation  Missing 

Distinct: 2
Distinct (%): 0.5%
Missing: 3658
Missing (%): 90.0%
Memory size: 8.1 KiB

Value      Count  Frequency (%)
False      210    5.2%
True       195    4.8%
(Missing)  3658   90.0%

no_of_units_generated_by_solar_system
Real number (ℝ)

High correlation  Missing 

Distinct: 102
Distinct (%): 52.3%
Missing: 3868
Missing (%): 95.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 445.34718
Minimum: 0
Maximum: 2500
Zeros: 22
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 31.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 240
median: 400
Q3: 600
95-th percentile: 1045
Maximum: 2500
Range: 2500
Interquartile range (IQR): 360

Descriptive statistics

Standard deviation: 381.12807
Coefficient of variation (CV): 0.8557999
Kurtosis: 8.4346225
Mean: 445.34718
Median Absolute Deviation (MAD): 180
Skewness: 2.3187772
Sum: 86842.7
Variance: 145258.61
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
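
Several of the statistics reported for no_of_units_generated_by_solar_system are derived from one another. The sketch below recomputes them from the printed values as a consistency check; the variable names are illustrative, and small differences in the last digit come from the rounding of the inputs.

```python
# Values copied from the report for no_of_units_generated_by_solar_system.
n_observations, n_missing = 4_063, 3_868
n_distinct, n_zeros = 102, 22
total, mean, std = 86_842.7, 445.34718, 381.12807
q1, q3 = 240, 600

n_valid = n_observations - n_missing  # rows with a recorded value

print("valid rows:", n_valid)                                    # 195
print("mean = sum / n:", round(total / n_valid, 3))              # 445.347
print("CV = std / mean:", round(std / mean, 4))                  # 0.8558
print("variance = std^2:", round(std ** 2, 1))                   # 145258.6
print("IQR = Q3 - Q1:", q3 - q1)                                 # 360
print(f"distinct share: {n_distinct / n_valid:.1%}")             # 52.3%
print(f"zeros share of all rows: {n_zeros / n_observations:.1%}")  # 0.5%
```

Note that Distinct (%) is taken over the 195 non-missing rows, while Zeros (%) is taken over all 4,063 rows.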