Overview

Brought to you by YData

Dataset statistics

Number of variables: 4
Number of observations: 53,599
Missing cells: 21,780
Missing cells (%): 10.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 32.0 B

Variable types

Text: 3
Numeric: 1

Alerts

Missing: no_of_hours_used_during_last_week has 21780 (40.6%) missing values
Zeros: no_of_hours_used_during_last_week has 7837 (14.6%) zeros
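
The percentages in the dataset statistics and alerts can be re-derived from the raw counts. The following is a minimal sketch of that arithmetic; the variable names are illustrative and are not part of the report.

```python
# Counts copied from the dataset statistics and alerts in this report.
n_observations = 53_599
n_variables = 4
missing_cells = 21_780   # every missing cell is in no_of_hours_used_during_last_week
zero_values = 7_837


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Dataset level: missing cells over all cells (rows x columns).
missing_cells_pct = pct(missing_cells, n_observations * n_variables)
# Column level: missing values and zeros over the number of rows.
column_missing_pct = pct(missing_cells, n_observations)
column_zero_pct = pct(zero_values, n_observations)

print(missing_cells_pct, column_missing_pct, column_zero_pct)
```

The dataset-level figure is lower than the column-level one because the same 21,780 missing cells are divided by four columns' worth of cells.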

Reproduction

Analysis started: 2024-12-06 05:54:32.951821
Analysis finished: 2024-12-06 05:54:33.577925
Duration: 0.63 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json
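
A report of this kind can be regenerated with ydata-profiling roughly as sketched below. The CSV path, output file name, and report title are hypothetical; the report does not state how the original data was loaded or which configuration options were set.

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # csv_path is a hypothetical input file
    report = ProfileReport(df, title="Appliance usage profile")  # assumed title
    report.to_file(output_path)
    return output_path
```

A call such as `build_report("appliance_usage.csv")` would write `report.html` next to the script, assuming pandas and ydata-profiling are installed.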

Variables

household_ID
Text

Distinct: 4055
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Memory size: 418.9 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 321,594
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
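
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. This is only a sketch of the idea, not the profiler's own implementation; the standard library does not expose script or block names, so only categories are shown. The helper name `category_counts` is made up for this example.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Sample values that appear in this report.
# Lu = uppercase letter, Nd = decimal number, Pc = connector punctuation.
print(category_counts(["ID0001"]))
print(category_counts(["O1_1"]))
```

For an identifier such as `ID0001`, two of six characters are uppercase letters and four are decimal digits, which matches the 33.3% / 66.7% split reported for this variable.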

Unique

Unique: 53
Unique (%): 0.1%

Sample

1st row: ID0001
2nd row: ID0001
3rd row: ID0001
4th row: ID0001
5th row: ID0001

Value                 Count    Frequency (%)
id0255                66       0.1%
id0772                59       0.1%
id3663                55       0.1%
id2774                54       0.1%
id2262                53       0.1%
id3068                52       0.1%
id0841                52       0.1%
id3787                52       0.1%
id3910                51       0.1%
id2420                51       0.1%
Other values (4045)   53054    99.0%

Most occurring characters

Value                 Count    Frequency (%)
I                     53599    16.7%
D                     53599    16.7%
3                     30301    9.4%
0                     29498    9.2%
2                     29272    9.1%
1                     29216    9.1%
7                     16459    5.1%
4                     16396    5.1%
6                     16312    5.1%
8                     15797    4.9%
Other values (2)      31145    9.7%

Most occurring categories

Value                 Count    Frequency (%)
Decimal Number        214396   66.7%
Uppercase Letter      107198   33.3%

Most frequent character per category

Decimal Number
Value                 Count    Frequency (%)
3                     30301    14.1%
0                     29498    13.8%
2                     29272    13.7%
1                     29216    13.6%
7                     16459    7.7%
4                     16396    7.6%
6                     16312    7.6%
8                     15797    7.4%
9                     15598    7.3%
5                     15547    7.3%

Uppercase Letter
Value                 Count    Frequency (%)
I                     53599    50.0%
D                     53599    50.0%

Most occurring scripts

Value                 Count    Frequency (%)
Common                214396   66.7%
Latin                 107198   33.3%

Most frequent character per script

Common
Value                 Count    Frequency (%)
3                     30301    14.1%
0                     29498    13.8%
2                     29272    13.7%
1                     29216    13.6%
7                     16459    7.7%
4                     16396    7.6%
6                     16312    7.6%
8                     15797    7.4%
9                     15598    7.3%
5                     15547    7.3%

Latin
Value                 Count    Frequency (%)
I                     53599    50.0%
D                     53599    50.0%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 321594   100.0%

Most frequent character per block

ASCII
Value                 Count    Frequency (%)
I                     53599    16.7%
D                     53599    16.7%
3                     30301    9.4%
0                     29498    9.2%
2                     29272    9.1%
1                     29216    9.1%
7                     16459    5.1%
4                     16396    5.1%
6                     16312    5.1%
8                     15797    4.9%
Other values (2)      31145    9.7%

appliance_ID
Text

Distinct: 210
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 418.9 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.8323663
Min length: 4

Characters and Unicode

Total characters: 259,010
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 0.1%

Sample

1st row: O1_1
2nd row: O12_1
3rd row: O26_1
4th row: O31_1
5th row: O45_1

Value                 Count    Frequency (%)
o45_1                 3575     6.7%
o26_1                 3482     6.5%
o1_1                  3433     6.4%
o31_1                 3230     6.0%
o12_1                 3059     5.7%
o8_1                  2965     5.5%
o45_2                 2523     4.7%
o24_1                 2351     4.4%
o33_1                 1677     3.1%
o13_1                 1586     3.0%
Other values (200)    25718    48.0%

Most occurring characters

Value                 Count    Frequency (%)
1                     61383    23.7%
O                     53599    20.7%
_                     53599    20.7%
4                     22678    8.8%
2                     17531    6.8%
3                     15588    6.0%
5                     13717    5.3%
6                     8615     3.3%
8                     4936     1.9%
7                     3911     1.5%
Other values (2)      3453     1.3%

Most occurring categories

Value                   Count    Frequency (%)
Decimal Number          151812   58.6%
Uppercase Letter        53599    20.7%
Connector Punctuation   53599    20.7%

Most frequent character per category

Decimal Number
Value                 Count    Frequency (%)
1                     61383    40.4%
4                     22678    14.9%
2                     17531    11.5%
3                     15588    10.3%
5                     13717    9.0%
6                     8615     5.7%
8                     4936     3.3%
7                     3911     2.6%
9                     2521     1.7%
0                     932      0.6%

Uppercase Letter
Value                 Count    Frequency (%)
O                     53599    100.0%

Connector Punctuation
Value                 Count    Frequency (%)
_                     53599    100.0%

Most occurring scripts

Value                 Count    Frequency (%)
Common                205411   79.3%
Latin                 53599    20.7%

Most frequent character per script

Common
Value                 Count    Frequency (%)
1                     61383    29.9%
_                     53599    26.1%
4                     22678    11.0%
2                     17531    8.5%
3                     15588    7.6%
5                     13717    6.7%
6                     8615     4.2%
8                     4936     2.4%
7                     3911     1.9%
9                     2521     1.2%

Latin
Value                 Count    Frequency (%)
O                     53599    100.0%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 259010   100.0%

Most frequent character per block

ASCII
Value                 Count    Frequency (%)
1                     61383    23.7%
O                     53599    20.7%
_                     53599    20.7%
4                     22678    8.8%
2                     17531    6.8%
3                     15588    6.0%
5                     13717    5.3%
6                     8615     3.3%
8                     4936     1.9%
7                     3911     1.5%
Other values (2)      3453     1.3%

appliance_type
Text

Distinct: 77
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 418.9 KiB

Length

Max length: 72
Median length: 57
Mean length: 20.162316
Min length: 2

Characters and Unicode

Total characters: 1,080,680
Distinct characters: 54
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Refrigerator
2nd row: Rice cooker
3rd row: Electric Iron including electric steam iron
4th row: TV
5th row: Mobile phone - Smart phones

Value                 Count    Frequency (%)
electric              17801    10.1%
(blank)               17416    9.9%
phones                11425    6.5%
mobile                10399    5.9%
phone                 10399    5.9%
tv                    9731     5.5%
smart                 7976     4.5%
iron                  7313     4.1%
refrigerator          3545     2.0%
including             3535     2.0%
Other values (159)    76839    43.6%

Most occurring characters

Value                 Count    Frequency (%)
e                     124713   11.5%
(space)               122805   11.4%
r                     78347    7.2%
o                     75831    7.0%
i                     68925    6.4%
t                     65338    6.0%
c                     59481    5.5%
n                     57009    5.3%
a                     50318    4.7%
l                     47206    4.4%
Other values (44)     330707   30.6%

Most occurring categories

Value                 Count    Frequency (%)
Lowercase Letter      831145   76.9%
Space Separator       122805   11.4%
Uppercase Letter      104979   9.7%
Dash Punctuation      11153    1.0%
Other Punctuation     8505     0.8%
Close Punctuation     876      0.1%
Open Punctuation      851      0.1%
Decimal Number        366      < 0.1%

Most frequent character per category

Lowercase Letter
Value                 Count    Frequency (%)
e                     124713   15.0%
r                     78347    9.4%
o                     75831    9.1%
i                     68925    8.3%
t                     65338    7.9%
c                     59481    7.2%
n                     57009    6.9%
a                     50318    6.1%
l                     47206    5.7%
s                     37871    4.6%
Other values (14)     166106   20.0%

Uppercase Letter
Value                 Count    Frequency (%)
E                     14614    13.9%
M                     13398    12.8%
S                     12735    12.1%
T                     12058    11.5%
V                     11852    11.3%
R                     9525     9.1%
B                     5736     5.5%
I                     3948     3.8%
C                     3179     3.0%
W                     3100     3.0%
Other values (10)     14834    14.1%

Other Punctuation
Value                 Count    Frequency (%)
/                     7170     84.3%
,                     748      8.8%
.                     447      5.3%
:                     140      1.6%

Decimal Number
Value                 Count    Frequency (%)
1                     313      85.5%
2                     53       14.5%

Space Separator
Value                 Count    Frequency (%)
(space)               122805   100.0%

Dash Punctuation
Value                 Count    Frequency (%)
-                     11153    100.0%

Close Punctuation
Value                 Count    Frequency (%)
)                     876      100.0%

Open Punctuation
Value                 Count    Frequency (%)
(                     851      100.0%

Most occurring scripts

Value                 Count    Frequency (%)
Latin                 936124   86.6%
Common                144556   13.4%

Most frequent character per script

Latin
Value                 Count    Frequency (%)
e                     124713   13.3%
r                     78347    8.4%
o                     75831    8.1%
i                     68925    7.4%
t                     65338    7.0%
c                     59481    6.4%
n                     57009    6.1%
a                     50318    5.4%
l                     47206    5.0%
s                     37871    4.0%
Other values (34)     271085   29.0%

Common
Value                 Count    Frequency (%)
(space)               122805   85.0%
-                     11153    7.7%
/                     7170     5.0%
)                     876      0.6%
(                     851      0.6%
,                     748      0.5%
.                     447      0.3%
1                     313      0.2%
:                     140      0.1%
2                     53       < 0.1%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 1080680  100.0%

Most frequent character per block

ASCII
Value                 Count    Frequency (%)
e                     124713   11.5%
(space)               122805   11.4%
r                     78347    7.2%
o                     75831    7.0%
i                     68925    6.4%
t                     65338    6.0%
c                     59481    5.5%
n                     57009    5.3%
a                     50318    4.7%
l                     47206    4.4%
Other values (44)     330707   30.6%

no_of_hours_used_during_last_week
Real number (ℝ)

Missing  Zeros 

Distinct: 374
Distinct (%): 1.2%
Missing: 21780
Missing (%): 40.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.21993
Minimum: 0
Maximum: 168
Zeros: 7837
Zeros (%): 14.6%
Negative: 0
Negative (%): 0.0%
Memory size: 418.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.12
median: 1.25
Q3: 7
95-th percentile: 168
Maximum: 168
Range: 168
Interquartile range (IQR): 6.88
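
The interquartile range follows directly from the two quartiles. The sketch below recomputes it and, as an added illustration that is not part of the report, applies Tukey's 1.5 x IQR rule of thumb to show how far the maximum sits above the bulk of the values.

```python
# Quartiles and maximum reported for no_of_hours_used_during_last_week.
q1, q3 = 0.12, 7.0
maximum = 168.0

iqr = q3 - q1                  # interquartile range
upper_fence = q3 + 1.5 * iqr   # Tukey's rule of thumb (an assumption here)

print(round(iqr, 2))           # matches the reported IQR
print(round(upper_fence, 2))
print(maximum > upper_fence)   # True: 168 lies far above the fence
```

Since the 95th percentile is already 168, at least 5% of the recorded values fall above this fence, so the upper tail is a sizeable group rather than a handful of outliers.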

Descriptive statistics

Standard deviation: 49.93416
Coefficient of variation (CV): 2.247269
Kurtosis: 4.0407563
Mean: 22.21993
Median Absolute Deviation (MAD): 1.25
Skewness: 2.3841297
Sum: 707015.96
Variance: 2493.4203
Monotonicity: Not monotonic
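
Several of the descriptive statistics are tied together by simple identities, which gives a quick consistency check. The sketch below uses only figures printed in this report; the variable names are illustrative.

```python
# Figures copied from this report.
n_observations = 53_599
n_missing = 21_780
mean = 22.21993
std = 49.93416
total = 707_015.96

n_valid = n_observations - n_missing   # rows with a recorded value
cv = std / mean                        # coefficient of variation
variance = std ** 2                    # variance is the squared std
implied_count = total / mean           # sum / mean recovers the valid count

print(n_valid, round(cv, 4), round(variance, 2), round(implied_count))
```

The sum divided by the mean recovers the number of non-missing observations, which indicates that the mean and the other moments are computed over recorded values only.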
Histogram with fixed size bins (bins=50)

Most frequent values
Value                 Count    Frequency (%)
0                     7837     14.6%
1                     2997     5.6%
168                   2944     5.5%
0.25                  1647     3.1%
2                     1432     2.7%
0.5                   1421     2.7%
7                     1366     2.5%
3.5                   1098     2.0%
14                    999      1.9%
3                     896      1.7%
Other values (364)    9182     17.1%
(Missing)             21780    40.6%

Smallest values
Value                 Count    Frequency (%)
0                     7837     14.6%
0.02                  1        < 0.1%
0.025                 2        < 0.1%
0.05                  10       < 0.1%
0.083                 1        < 0.1%
0.1                   78       0.1%
0.11                  1        < 0.1%
0.117                 3        < 0.1%
0.12                  30       0.1%
0.125                 9        < 0.1%

Largest values
Value                 Count    Frequency (%)
168                   2944     5.5%
167                   3        < 0.1%
166.075               1        < 0.1%
166                   3        < 0.1%
165                   7        < 0.1%
164                   5        < 0.1%
163                   1        < 0.1%
161                   1        < 0.1%
160                   24       < 0.1%
159                   1        < 0.1%
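
The maximum of 168 equals the number of hours in a week (24 x 7). The frequencies in the tables above are computed over all rows, including missing ones; the sketch below re-expresses the two dominant values as shares of the recorded values only. The interpretation of 168 as continuous use is an assumption, not something the report states.

```python
HOURS_PER_WEEK = 24 * 7        # 168, the reported maximum

n_valid = 53_599 - 21_780      # observations with a recorded value
at_zero = 7_837                # rows reporting 0 hours
at_ceiling = 2_944             # rows reporting exactly 168 hours

share_zero = round(100 * at_zero / n_valid, 1)
share_ceiling = round(100 * at_ceiling / n_valid, 1)

print(HOURS_PER_WEEK, share_zero, share_ceiling)
```

Among recorded values, roughly a quarter are zeros and about one in eleven sits at the weekly ceiling.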

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
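
The idea behind the per-column nullity chart can be sketched by counting missing entries column by column. The example below uses the first ten rows listed in the Sample section, with `None` standing in for NaN; the helper name `null_counts` is made up for this illustration and is not the profiler's code.

```python
COLUMNS = [
    "household_ID",
    "appliance_ID",
    "appliance_type",
    "no_of_hours_used_during_last_week",
]

# First ten rows from the Sample section; None stands for NaN.
ROWS = [
    ("ID0001", "O1_1", "Refrigerator", 84.0),
    ("ID0001", "O12_1", "Rice cooker", 3.0),
    ("ID0001", "O26_1", "Electric Iron including electric steam iron", 1.0),
    ("ID0001", "O31_1", "TV", 0.0),
    ("ID0001", "O45_1", "Mobile phone - Smart phones", 5.0),
    ("ID0001", "O45_2", "Mobile phone - Smart phones", 5.0),
    ("ID0001", "O47_1", "Mobile phone - Basic phones", None),
    ("ID0001", "O47_2", "Mobile phone - Basic phones", None),
    ("ID0002", "O1_1", "Refrigerator", 168.0),
    ("ID0002", "O12_1", "Rice cooker", 0.0),
]


def null_counts(rows, columns):
    """Return the number of missing (None) entries per column."""
    return {
        name: sum(row[i] is None for row in rows)
        for i, name in enumerate(columns)
    }


counts = null_counts(ROWS, COLUMNS)
for name, missing in counts.items():
    print(f"{name}: {missing} of {len(ROWS)} missing")
```

In these ten rows only the hours column has gaps, consistent with the report-wide finding that all missing cells belong to that variable.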

Sample

       household_ID  appliance_ID  appliance_type                               no_of_hours_used_during_last_week
0      ID0001        O1_1          Refrigerator                                 84.0
1      ID0001        O12_1         Rice cooker                                  3.0
2      ID0001        O26_1         Electric Iron including electric steam iron  1.0
3      ID0001        O31_1         TV                                           0.0
4      ID0001        O45_1         Mobile phone - Smart phones                  5.0
5      ID0001        O45_2         Mobile phone - Smart phones                  5.0
6      ID0001        O47_1         Mobile phone - Basic phones                  NaN
7      ID0001        O47_2         Mobile phone - Basic phones                  NaN
8      ID0002        O1_1          Refrigerator                                 168.0
9      ID0002        O12_1         Rice cooker                                  0.0

       household_ID  appliance_ID  appliance_type                               no_of_hours_used_during_last_week
53589  ID4063        O33_1         Dialog TV / Peo TV / Satellite TV box        NaN
53590  ID4063        O43_1         Laptops                                      2.0
53591  ID4063        O44_1         Routers                                      NaN
53592  ID4063        O45_1         Mobile phone - Smart phones                  3.0
53593  ID4063        O45_2         Mobile phone - Smart phones                  2.0
53594  ID4063        O45_3         Mobile phone - Smart phones                  NaN
53595  ID4063        O45_4         Mobile phone - Smart phones                  NaN
53596  ID4063        O45_5         Mobile phone - Smart phones                  NaN
53597  ID4063        O47_1         Mobile phone - Basic phones                  NaN
53598  ID4063        O47_2         Mobile phone - Basic phones                  NaN