Overview

Dataset statistics

Number of variables: 6
Number of observations: 13,525
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 634.1 KiB
Average record size in memory: 48.0 B
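
The average record size can be checked from the two figures above it. The following is a minimal arithmetic sketch; it assumes 1 KiB = 1024 bytes and uses only the rounded values printed in this report, so the result is approximate.

```python
# Values copied from the "Dataset statistics" block of this report.
total_size_kib = 634.1
n_observations = 13_525

# Convert KiB to bytes, then divide by the number of rows.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```

With six variables, this corresponds to roughly 8 bytes per cell.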

Variable types

Text: 2
Categorical: 2
Numeric: 2

Alerts

Zeros: no_of_hours_fan_was_on_during_daytime_last_week has 8917 (65.9%) zeros
Zeros: no_of_hours_fan_was_on_during_night_last_week has 4647 (34.4%) zeros
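
The percentages in these alerts are the zero counts divided by the number of observations (13,525). A small sketch of that calculation, using only numbers from this report:

```python
n_observations = 13_525

# Zero counts as reported in the alerts.
zero_counts = {
    "no_of_hours_fan_was_on_during_daytime_last_week": 8_917,
    "no_of_hours_fan_was_on_during_night_last_week": 4_647,
}

# Share of rows whose value is exactly zero, in percent.
zero_share = {
    name: 100 * count / n_observations for name, count in zero_counts.items()
}

for name, pct in zero_share.items():
    print(f"{name}: {pct:.1f}% zeros")
```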

Reproduction

Analysis started: 2024-12-06 05:54:51.202115
Analysis finished: 2024-12-06 05:54:51.981337
Duration: 0.78 seconds
Software version: ydata-profiling v4.11.0
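
A report of this kind is typically produced with a few lines of Python. The sketch below is illustrative only: the function name `build_report`, the file paths and the report title are hypothetical, and the report does not record which options were actually used. It relies on `pandas.read_csv`, `ProfileReport(df, title=...)` and `ProfileReport.to_file`.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the result as an HTML report.

    csv_path and html_path are hypothetical; the original input file
    is not named in this report.
    """
    # Imported inside the function so this module can be loaded even
    # when the third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Fan usage profiling report")
    report.to_file(html_path)


# Example call (not executed here):
# build_report("fan_usage.csv", "fan_usage_report.html")
```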

Variables

household_ID
Text

Distinct: 3817
Distinct (%): 28.2%
Missing: 0
Missing (%): 0.0%
Memory size: 105.8 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 81,150
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
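
Python's standard `unicodedata` module exposes the general category of each code point, which is enough to illustrate the idea on the first sample value of this variable. This is a sketch of the concept, not a description of how the report computes its figures. `Lu` is the Unicode code for "Uppercase Letter" and `Nd` for "Decimal Number".

```python
import unicodedata
from collections import Counter

# First sample value of this variable.
sample_id = "ID0001"

# Count the Unicode general category of every character.
category_counts = Counter(unicodedata.category(ch) for ch in sample_id)

print(dict(category_counts))  # {'Lu': 2, 'Nd': 4}
```

Every value of this variable is 6 characters long, so 13,525 values give 81,150 characters. If all values follow the same two-letter, four-digit pattern, that is 27,050 uppercase letters and 54,100 decimal digits.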

Unique

Unique: 676
Unique (%): 5.0%
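
"Distinct" and "Unique" measure different things. Assuming the usual meaning in this kind of report, "Distinct" counts how many different values occur, while "Unique" counts values that occur exactly once. The toy column below is invented for illustration and is not taken from the dataset.

```python
from collections import Counter

# Toy column, invented for illustration.
values = ["ID0001", "ID0001", "ID0002", "ID0003", "ID0003", "ID0004"]

counts = Counter(values)

# Distinct: number of different values.
n_distinct = len(counts)

# Unique: number of values that appear exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

print(n_distinct, n_unique)  # 4 2
```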

Sample

1st row: ID0001
2nd row: ID0001
3rd row: ID0001
4th row: ID0001
5th row: ID0001

Value                  Count  Frequency (%)
id2033                    75  0.6%
id0282                    62  0.5%
id1614                    52  0.4%
id0901                    50  0.4%
id0278                    44  0.3%
id2969                    40  0.3%
id1214                    40  0.3%
id0034                    40  0.3%
id0469                    36  0.3%
id1227                    36  0.3%
Other values (3807)    13050  96.5%

Most occurring characters

Value               Count  Frequency (%)
I                   13525  16.7%
D                   13525  16.7%
0                    7847  9.7%
3                    7395  9.1%
2                    7379  9.1%
1                    7357  9.1%
4                    4177  5.1%
9                    4117  5.1%
6                    4050  5.0%
8                    3978  4.9%
Other values (2)     7800  9.6%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      54100  66.7%
Uppercase Letter    27050  33.3%

Most frequent character per category

Decimal Number
Value   Count  Frequency (%)
0        7847  14.5%
3        7395  13.7%
2        7379  13.6%
1        7357  13.6%
4        4177  7.7%
9        4117  7.6%
6        4050  7.5%
8        3978  7.4%
7        3952  7.3%
5        3848  7.1%

Uppercase Letter
Value   Count  Frequency (%)
I       13525  50.0%
D       13525  50.0%

Most occurring scripts

Value    Count  Frequency (%)
Common   54100  66.7%
Latin    27050  33.3%

Most frequent character per script

Common
Value   Count  Frequency (%)
0        7847  14.5%
3        7395  13.7%
2        7379  13.6%
1        7357  13.6%
4        4177  7.7%
9        4117  7.6%
6        4050  7.5%
8        3978  7.4%
7        3952  7.3%
5        3848  7.1%

Latin
Value   Count  Frequency (%)
I       13525  50.0%
D       13525  50.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   81150  100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
I                   13525  16.7%
D                   13525  16.7%
0                    7847  9.7%
3                    7395  9.1%
2                    7379  9.1%
1                    7357  9.1%
4                    4177  5.1%
9                    4117  5.1%
6                    4050  5.0%
8                    3978  4.9%
Other values (2)     7800  9.6%

room_ID
Categorical

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 105.8 KiB

Value               Count
I1                   3674
I2                   3135
I3                   2524
I4                   1528
I5                    893
Other values (18)    1771

Length

Max length: 5
Median length: 2
Mean length: 2.0473198
Min length: 2

Characters and Unicode

Total characters: 27,690
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: I1
2nd row: I2
3rd row: I3
4th row: I4
5th row: I5

Common Values

Value               Count  Frequency (%)
I1                   3674  27.2%
I2                   3135  23.2%
I3                   2524  18.7%
I4                   1528  11.3%
I5                    893  6.6%
I6                    534  3.9%
I7                    332  2.5%
I8                    217  1.6%
I9                    178  1.3%
I10                   127  0.9%
Other values (13)     383  2.8%

Length

[Figure: Histogram of lengths of the category]

Value               Count  Frequency (%)
i1                   3674  27.2%
i2                   3135  23.2%
i3                   2524  18.7%
i4                   1528  11.3%
i5                    893  6.6%
i6                    534  3.9%
i7                    332  2.5%
i8                    217  1.6%
i9                    178  1.3%
i10                   127  0.9%
Other values (13)     383  2.8%

Most occurring characters

Value               Count  Frequency (%)
I                   13525  48.8%
1                    4254  15.4%
2                    3213  11.6%
3                    2595  9.4%
4                    1557  5.6%
5                     927  3.3%
6                     553  2.0%
7                     340  1.2%
8                     224  0.8%
9                     180  0.7%
Other values (4)      322  1.2%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      13970  50.5%
Uppercase Letter    13590  49.1%
Lowercase Letter      130  0.5%

Most frequent character per category

Decimal Number
Value   Count  Frequency (%)
1        4254  30.5%
2        3213  23.0%
3        2595  18.6%
4        1557  11.1%
5         927  6.6%
6         553  4.0%
7         340  2.4%
8         224  1.6%
9         180  1.3%
0         127  0.9%

Uppercase Letter
Value   Count  Frequency (%)
I       13525  99.5%
O          65  0.5%

Lowercase Letter
Value   Count  Frequency (%)
t          65  50.0%
h          65  50.0%

Most occurring scripts

Value    Count  Frequency (%)
Common   13970  50.5%
Latin    13720  49.5%

Most frequent character per script

Common
Value   Count  Frequency (%)
1        4254  30.5%
2        3213  23.0%
3        2595  18.6%
4        1557  11.1%
5         927  6.6%
6         553  4.0%
7         340  2.4%
8         224  1.6%
9         180  1.3%
0         127  0.9%

Latin
Value   Count  Frequency (%)
I       13525  98.6%
O          65  0.5%
t          65  0.5%
h          65  0.5%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   27690  100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
I                   13525  48.8%
1                    4254  15.4%
2                    3213  11.6%
3                    2595  9.4%
4                    1557  5.6%
5                     927  3.3%
6                     553  2.0%
7                     340  1.2%
8                     224  0.8%
9                     180  0.7%
Other values (4)      322  1.2%

fan_ID
Text

Distinct: 108
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 105.8 KiB

Length

Max length: 8
Median length: 5
Mean length: 5.0473937
Min length: 5

Characters and Unicode

Total characters: 68,266
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 0.2%

Sample

1st row: I1_F1
2nd row: I2_F1
3rd row: I3_F1
4th row: I4_F1
5th row: I5_F1

Value               Count  Frequency (%)
i1_f1                2913  21.5%
i2_f1                2896  21.4%
i3_f1                2365  17.5%
i4_f1                1397  10.3%
i5_f1                 786  5.8%
i1_f2                 609  4.5%
i6_f1                 437  3.2%
i7_f1                 251  1.9%
i2_f2                 192  1.4%
i8_f1                 167  1.2%
Other values (98)    1512  11.2%

Most occurring characters

Value               Count  Frequency (%)
1                   15989  23.4%
I                   13525  19.8%
_                   13525  19.8%
F                   13525  19.8%
2                    4435  6.5%
3                    2944  4.3%
4                    1706  2.5%
5                     975  1.4%
6                     560  0.8%
7                     346  0.5%
Other values (6)      736  1.1%

Most occurring categories

Value                    Count  Frequency (%)
Decimal Number           27496  40.3%
Uppercase Letter         27115  39.7%
Connector Punctuation    13525  19.8%
Lowercase Letter           130  0.2%

Most frequent character per category

Decimal Number
Value   Count  Frequency (%)
1       15989  58.2%
2        4435  16.1%
3        2944  10.7%
4        1706  6.2%
5         975  3.5%
6         560  2.0%
7         346  1.3%
8         229  0.8%
9         184  0.7%
0         128  0.5%

Uppercase Letter
Value   Count  Frequency (%)
I       13525  49.9%
F       13525  49.9%
O          65  0.2%

Lowercase Letter
Value   Count  Frequency (%)
t          65  50.0%
h          65  50.0%

Connector Punctuation
Value   Count  Frequency (%)
_       13525  100.0%

Most occurring scripts

Value    Count  Frequency (%)
Common   41021  60.1%
Latin    27245  39.9%

Most frequent character per script

Common
Value   Count  Frequency (%)
1       15989  39.0%
_       13525  33.0%
2        4435  10.8%
3        2944  7.2%
4        1706  4.2%
5         975  2.4%
6         560  1.4%
7         346  0.8%
8         229  0.6%
9         184  0.4%

Latin
Value   Count  Frequency (%)
I       13525  49.6%
F       13525  49.6%
O          65  0.2%
t          65  0.2%
h          65  0.2%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   68266  100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
1                   15989  23.4%
I                   13525  19.8%
_                   13525  19.8%
F                   13525  19.8%
2                    4435  6.5%
3                    2944  4.3%
4                    1706  2.5%
5                     975  1.4%
6                     560  0.8%
7                     346  0.5%
Other values (6)      736  1.1%

type_of_the_fan
Categorical

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 105.8 KiB

Value                      Count
Ceiling fan                 7715
Pedestal fan/ Stand fan     4416
Wall fan                     643
Table fan                    466
Other                        236
Other values (2)              49

Length

Max length: 23
Median length: 11
Mean length: 14.600813
Min length: 5

Characters and Unicode

Total characters: 197,476
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Ceiling fan
2nd row: Ceiling fan
3rd row: Ceiling fan
4th row: Ceiling fan
5th row: Ceiling fan

Common Values

Value                      Count  Frequency (%)
Ceiling fan                 7715  57.0%
Pedestal fan/ Stand fan     4416  32.7%
Wall fan                     643  4.8%
Table fan                    466  3.4%
Other                        236  1.7%
Exhaust fan                   42  0.3%
Tower fan                      7  0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value      Count  Frequency (%)
fan        17705  49.7%
ceiling     7715  21.6%
pedestal    4416  12.4%
stand       4416  12.4%
wall         643  1.8%
table        466  1.3%
other        236  0.7%
exhaust       42  0.1%
tower          7  < 0.1%
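
The word counts above are consistent with the category counts listed under "Common Values". The sketch below reproduces them under one assumption about tokenization that the report does not document: labels are lowercased, "/" is treated as a separator, and the result is split on whitespace.

```python
from collections import Counter

# Category counts copied from the "Common Values" table.
category_counts = {
    "Ceiling fan": 7715,
    "Pedestal fan/ Stand fan": 4416,
    "Wall fan": 643,
    "Table fan": 466,
    "Other": 236,
    "Exhaust fan": 42,
    "Tower fan": 7,
}

# Assumed tokenization: lowercase, "/" as a separator, split on whitespace.
word_counts = Counter()
for label, count in category_counts.items():
    for word in label.lower().replace("/", " ").split():
        word_counts[word] += count

total_words = sum(word_counts.values())
print(word_counts["fan"], f"{100 * word_counts['fan'] / total_words:.1f}%")
```

"fan" appears twice in "Pedestal fan/ Stand fan", which is why its count (17,705) exceeds the number of rows.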

Most occurring characters

Value                Count  Frequency (%)
n                    29836  15.1%
a                    27688  14.0%
(space)              22121  11.2%
f                    17705  9.0%
e                    17256  8.7%
i                    15430  7.8%
l                    13883  7.0%
t                     9110  4.6%
d                     8832  4.5%
C                     7715  3.9%
Other values (16)    27900  14.1%

Most occurring categories

Value                 Count  Frequency (%)
Lowercase Letter     152998  77.5%
Space Separator       22121  11.2%
Uppercase Letter      17941  9.1%
Other Punctuation      4416  2.2%

Most frequent character per category

Lowercase Letter
Value               Count  Frequency (%)
n                   29836  19.5%
a                   27688  18.1%
f                   17705  11.6%
e                   17256  11.3%
i                   15430  10.1%
l                   13883  9.1%
t                    9110  6.0%
d                    8832  5.8%
g                    7715  5.0%
s                    4458  2.9%
Other values (7)     1085  0.7%

Uppercase Letter
Value   Count  Frequency (%)
C        7715  43.0%
P        4416  24.6%
S        4416  24.6%
W         643  3.6%
T         473  2.6%
O         236  1.3%
E          42  0.2%

Space Separator
Value     Count  Frequency (%)
(space)   22121  100.0%

Other Punctuation
Value   Count  Frequency (%)
/        4416  100.0%

Most occurring scripts

Value     Count  Frequency (%)
Latin    170939  86.6%
Common    26537  13.4%

Most frequent character per script

Latin
Value                Count  Frequency (%)
n                    29836  17.5%
a                    27688  16.2%
f                    17705  10.4%
e                    17256  10.1%
i                    15430  9.0%
l                    13883  8.1%
t                     9110  5.3%
d                     8832  5.2%
C                     7715  4.5%
g                     7715  4.5%
Other values (14)    15769  9.2%

Common
Value     Count  Frequency (%)
(space)   22121  83.4%
/          4416  16.6%

Most occurring blocks

Value    Count  Frequency (%)
ASCII   197476  100.0%

Most frequent character per block

ASCII
Value                Count  Frequency (%)
n                    29836  15.1%
a                    27688  14.0%
(space)              22121  11.2%
f                    17705  9.0%
e                    17256  8.7%
i                    15430  7.8%
l                    13883  7.0%
t                     9110  4.6%
d                     8832  4.5%
C                     7715  3.9%
Other values (16)    27900  14.1%

no_of_hours_fan_was_on_during_daytime_last_week
Numeric

Distinct: 101
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0688487
Minimum: 0
Maximum: 70
Zeros: 8917
Zeros (%): 65.9%
Negative: 0
Negative (%): 0.0%
Memory size: 105.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 7
95-th percentile: 42
Maximum: 70
Range: 70
Interquartile range (IQR): 7
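
The zero-valued quantiles follow from the share of zeros in this variable. With no negative values and 8917 of 13,525 values equal to zero, every quantile level up to about 0.659 lands on a zero. A small sketch using only numbers from this report:

```python
# Figures copied from this variable's summary.
n_observations = 13_525
n_zeros = 8_917
minimum, q1, q3, maximum = 0, 0, 7, 70

# Range and interquartile range are simple differences.
value_range = maximum - minimum
iqr = q3 - q1

# With no negative values, any quantile level at or below the share of
# zeros falls on a zero when the data is sorted.
zero_share = n_zeros / n_observations
median_is_zero = 0.50 <= zero_share
q3_is_zero = 0.75 <= zero_share

print(value_range, iqr, median_is_zero, q3_is_zero)  # 70 7 True False
```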

Descriptive statistics

Standard deviation: 14.860882
Coefficient of variation (CV): 2.1023058
Kurtosis: 7.062147
Mean: 7.0688487
Median Absolute Deviation (MAD): 0
Skewness: 2.6631086
Sum: 95606.178
Variance: 220.84581
Monotonicity: Not monotonic
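
Several of these statistics are linked by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks them against the printed values. The inputs are already rounded, so the results agree only up to rounding.

```python
# Figures copied from this variable's summary.
n_observations = 13_525
mean = 7.0688487
std = 14.860882

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# The sum of all values is the mean times the number of observations.
total = mean * n_observations

print(f"CV={cv:.4f} variance={variance:.3f} sum={total:.2f}")
```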

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value               Count  Frequency (%)
0                    8917  65.9%
14                    778  5.8%
7                     562  4.2%
21                    463  3.4%
28                    358  2.6%
70                    311  2.3%
35                    291  2.2%
1                     221  1.6%
2                     172  1.3%
42                    142  1.0%
Other values (91)    1310  9.7%

Smallest values

Value   Count  Frequency (%)
0        8917  65.9%
0.03        1  < 0.1%
0.083       1  < 0.1%
0.1         1  < 0.1%
0.14        1  < 0.1%
0.21        1  < 0.1%
0.25       24  0.2%
0.3         2  < 0.1%
0.33       11  0.1%
0.45        1  < 0.1%

Largest values

Value   Count  Frequency (%)
70        311  2.3%
68          1  < 0.1%
66.5        2  < 0.1%
66          2  < 0.1%
65          3  < 0.1%
64          4  < 0.1%
63         14  0.1%
60         27  0.2%
56         59  0.4%
54          7  0.1%

no_of_hours_fan_was_on_during_night_last_week
Numeric

Distinct: 169
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.29176
Minimum: 0
Maximum: 98
Zeros: 4647
Zeros (%): 34.4%
Negative: 0
Negative (%): 0.0%
Memory size: 105.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 14
Q3: 56
95-th percentile: 84
Maximum: 98
Range: 98
Interquartile range (IQR): 56

Descriptive statistics

Standard deviation: 29.365074
Coefficient of variation (CV): 1.0759685
Kurtosis: -0.74286419
Mean: 27.29176
Median Absolute Deviation (MAD): 14
Skewness: 0.71209119
Sum: 369121.05
Variance: 862.30755
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value                Count  Frequency (%)
0                     4647  34.4%
56                    1177  8.7%
70                     707  5.2%
14                     618  4.6%
28                     523  3.9%
49                     521  3.9%
35                     507  3.7%
7                      474  3.5%
21                     469  3.5%
42                     461  3.4%
Other values (159)    3421  25.3%

Smallest values

Value   Count  Frequency (%)
0        4647  34.4%
0.1         3  < 0.1%
0.12        2  < 0.1%
0.17        1  < 0.1%
0.2         1  < 0.1%
0.25       19  0.1%
0.3         2  < 0.1%
0.33        5  < 0.1%
0.5        38  0.3%
0.6         1  < 0.1%

Largest values

Value   Count  Frequency (%)
98        385  2.8%
96          4  < 0.1%
95          1  < 0.1%
94.5        1  < 0.1%
94          4  < 0.1%
92          2  < 0.1%
91         35  0.3%
90         25  0.2%
88          4  < 0.1%
87.5        1  < 0.1%

Interactions


Correlations

[Figure: correlation heatmap]

Columns (1) to (4) follow the same order as the rows.

                                                       (1)     (2)     (3)     (4)
(1) no_of_hours_fan_was_on_during_daytime_last_week  1.000   0.282   0.037   0.035
(2) no_of_hours_fan_was_on_during_night_last_week    0.282   1.000   0.117   0.119
(3) room_ID                                          0.037   0.117   1.000   0.113
(4) type_of_the_fan                                  0.035   0.119   0.113   1.000
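
The report does not state which coefficient fills this matrix. If the ydata-profiling default ("auto") was used, which is an assumption here, pairs involving a categorical variable such as room_ID or type_of_the_fan would be measured with Cramér's V, and the numeric pair with Spearman's rank correlation. The following is a minimal sketch of Cramér's V without bias correction; the function name `cramers_v` and the toy inputs are illustrative, not taken from the report or the library.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed cell counts
    row = Counter(xs)             # row totals
    col = Counter(ys)             # column totals

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Normalize by n times (smaller table dimension - 1).
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data: perfectly associated labels, then independent labels.
v_perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(v_perfect, v_independent)  # 1.0 0.0
```

Cramér's V ranges from 0 (no association) to 1 (perfect association), so on that reading the values around 0.04 to 0.12 in the matrix would indicate weak associations.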

Missing values

[Figure: A simple visualization of nullity by column.]

[Figure: Nullity matrix, a data-dense display that makes patterns in data completion easy to spot.]

Sample

First rows

       household_ID  room_ID  fan_ID  type_of_the_fan          no_of_hours_fan_was_on_during_daytime_last_week  no_of_hours_fan_was_on_during_night_last_week
0      ID0001        I1       I1_F1   Ceiling fan              0.0                                              0.0
1      ID0001        I2       I2_F1   Ceiling fan              0.0                                              0.0
2      ID0001        I3       I3_F1   Ceiling fan              0.0                                              0.0
3      ID0001        I4       I4_F1   Ceiling fan              0.0                                              0.0
4      ID0001        I5       I5_F1   Ceiling fan              0.0                                              0.0
5      ID0002        I1       I1_F1   Pedestal fan/ Stand fan  0.0                                              14.0
6      ID0002        I1       I1_F2   Pedestal fan/ Stand fan  14.0                                             14.0
7      ID0002        I2       I2_F1   Pedestal fan/ Stand fan  14.0                                             2.0
8      ID0002        I2       I2_F2   Pedestal fan/ Stand fan  2.0                                              2.0
9      ID0002        I3       I3_F1   Pedestal fan/ Stand fan  0.0                                              14.0

Last rows

       household_ID  room_ID  fan_ID  type_of_the_fan          no_of_hours_fan_was_on_during_daytime_last_week  no_of_hours_fan_was_on_during_night_last_week
13515  ID4058        I6       I6_F3   Other                    0.0                                              0.0
13516  ID4059        I1       I1_F1   Pedestal fan/ Stand fan  8.0                                              21.0
13517  ID4059        I3       I3_F1   Pedestal fan/ Stand fan  14.0                                             63.0
13518  ID4062        I1       I1_F1   Ceiling fan              42.0                                             0.0
13519  ID4062        I2       I2_F1   Pedestal fan/ Stand fan  0.0                                              16.0
13520  ID4062        I3       I3_F1   Pedestal fan/ Stand fan  0.0                                              0.0
13521  ID4062        I5       I5_F1   Ceiling fan              7.0                                              0.0
13522  ID4063        I1       I1_F1   Ceiling fan              18.0                                             18.0
13523  ID4063        I2       I2_F1   Ceiling fan              6.0                                              10.0
13524  ID4063        I3       I3_F1   Pedestal fan/ Stand fan  4.0                                              16.0