Overview


Dataset statistics

Number of variables: 5
Number of observations: 13525
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1262
Duplicate rows (%): 9.3%
Total size in memory: 1.1 MiB
Average record size in memory: 87.1 B

Variable types

Categorical: 1
Text: 1
Numeric: 3

Alerts

Dataset has 1262 (9.3%) duplicate rows [Duplicates]
no_of_hours_fan_was_on_during_daytime_last_week has 8917 (65.9%) zeros [Zeros]
no_of_hours_fan_was_on_during_night_last_week has 4647 (34.4%) zeros [Zeros]
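
The percentages in these alerts follow from the counts and the 13525 observations reported under Dataset statistics. A minimal Python sketch of that arithmetic is given here; the helper name `share` is illustrative and is not part of ydata-profiling.

```python
# Recompute the alert percentages from the counts shown in the report.
N_OBSERVATIONS = 13525  # "Number of observations" in the overview


def share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


duplicate_pct = share(1262)      # duplicate rows alert
daytime_zero_pct = share(8917)   # zeros in the daytime column
night_zero_pct = share(4647)     # zeros in the night column

print(duplicate_pct, daytime_zero_pct, night_zero_pct)
```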

Reproduction

Analysis started: 2024-11-18 08:40:15.265559
Analysis finished: 2024-11-18 08:40:16.405529
Duration: 1.14 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json
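
The reported duration can be checked against the two timestamps. The sketch below uses only Python's standard `datetime` module; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-11-18 08:40:15.265559", FMT)
finished = datetime.strptime("2024-11-18 08:40:16.405529", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```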

Variables

room_ID
Categorical

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 727.4 KiB

Value               Count
I_1                 3674
I_2                 3135
I_3                 2524
I_4                 1528
I_5                 893
Other values (18)   1771

Length

Max length: 7
Median length: 3
Mean length: 3.0521257
Min length: 3
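
The mean length is consistent with the total character count the report gives for each string column, divided by the number of observations. A small sketch of that check follows, assuming the totals reported for room_ID (41280) and fan_ID (95381).

```python
# Figures copied from the report for the two string columns.
N_OBSERVATIONS = 13525
total_characters = {"room_ID": 41280, "fan_ID": 95381}

# Mean length = total characters / number of observations.
mean_length = {
    name: chars / N_OBSERVATIONS for name, chars in total_characters.items()
}

for name, value in mean_length.items():
    print(f"{name}: {value:.7f}")
```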

Characters and Unicode

Total characters: 41280
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
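
As an illustration of such character properties, the sketch below looks up the general category and official name of each character in a few sample room_ID values. It uses Python's standard `unicodedata` module, which exposes categories and names but not scripts or blocks. This is not necessarily how the report itself derives its figures, and the report lists the category here only as "(unknown)".

```python
import unicodedata
from collections import Counter

# Sample room_ID values taken from the report's sample rows.
values = ["I_1", "I_2", "I_3", "I_4", "I_5"]

# Count characters by Unicode general category (for example Lu, Pc, Nd).
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

# Look up the official Unicode name of each distinct character.
names = {ch: unicodedata.name(ch) for ch in sorted(set("".join(values)))}

print(category_counts)
print(names["_"])
```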

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: I_1
2nd row: I_2
3rd row: I_3
4th row: I_4
5th row: I_5

Common Values

Value               Count  Frequency (%)
I_1                 3674   27.2%
I_2                 3135   23.2%
I_3                 2524   18.7%
I_4                 1528   11.3%
I_5                 893    6.6%
I_6                 534    3.9%
I_7                 332    2.5%
I_8                 217    1.6%
I_9                 178    1.3%
I_10                127    0.9%
Other values (13)   383    2.8%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
i_1                 3674   27.2%
i_2                 3135   23.2%
i_3                 2524   18.7%
i_4                 1528   11.3%
i_5                 893    6.6%
i_6                 534    3.9%
i_7                 332    2.5%
i_8                 217    1.6%
i_9                 178    1.3%
i_10                127    0.9%
Other values (13)   383    2.8%

Most occurring characters

Value               Count  Frequency (%)
_                   13590  32.9%
I                   13525  32.8%
1                   4254   10.3%
2                   3213   7.8%
3                   2595   6.3%
4                   1557   3.8%
5                   927    2.2%
6                   553    1.3%
7                   340    0.8%
8                   224    0.5%
Other values (5)    502    1.2%

Most occurring categories

Value               Count  Frequency (%)
(unknown)           41280  100.0%

Most frequent character per category

(unknown)
Value               Count  Frequency (%)
_                   13590  32.9%
I                   13525  32.8%
1                   4254   10.3%
2                   3213   7.8%
3                   2595   6.3%
4                   1557   3.8%
5                   927    2.2%
6                   553    1.3%
7                   340    0.8%
8                   224    0.5%
Other values (5)    502    1.2%

Most occurring scripts

Value               Count  Frequency (%)
(unknown)           41280  100.0%

Most frequent character per script

(unknown)
Value               Count  Frequency (%)
_                   13590  32.9%
I                   13525  32.8%
1                   4254   10.3%
2                   3213   7.8%
3                   2595   6.3%
4                   1557   3.8%
5                   927    2.2%
6                   553    1.3%
7                   340    0.8%
8                   224    0.5%
Other values (5)    502    1.2%

Most occurring blocks

Value               Count  Frequency (%)
(unknown)           41280  100.0%

Most frequent character per block

(unknown)
Value               Count  Frequency (%)
_                   13590  32.9%
I                   13525  32.8%
1                   4254   10.3%
2                   3213   7.8%
3                   2595   6.3%
4                   1557   3.8%
5                   927    2.2%
6                   553    1.3%
7                   340    0.8%
8                   224    0.5%
Other values (5)    502    1.2%

fan_ID
Text

Distinct: 108
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 727.4 KiB

Length

Max length: 11
Median length: 7
Mean length: 7.0521996
Min length: 7

Characters and Unicode

Total characters: 95381
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 0.2%

Sample

1st row: I_1_F_1
2nd row: I_2_F_1
3rd row: I_3_F_1
4th row: I_4_F_1
5th row: I_5_F_1

Value               Count  Frequency (%)
i_1_f_1             2913   21.5%
i_2_f_1             2896   21.4%
i_3_f_1             2365   17.5%
i_4_f_1             1397   10.3%
i_5_f_1             786    5.8%
i_1_f_2             609    4.5%
i_6_f_1             437    3.2%
i_7_f_1             251    1.9%
i_2_f_2             192    1.4%
i_8_f_1             167    1.2%
Other values (98)   1512   11.2%

Most occurring characters

Value               Count  Frequency (%)
_                   40640  42.6%
1                   15989  16.8%
I                   13525  14.2%
F                   13525  14.2%
2                   4435   4.6%
3                   2944   3.1%
4                   1706   1.8%
5                   975    1.0%
6                   560    0.6%
7                   346    0.4%
Other values (6)    736    0.8%

Most occurring categories

Value               Count  Frequency (%)
(unknown)           95381  100.0%

Most frequent character per category

(unknown)
Value               Count  Frequency (%)
_                   40640  42.6%
1                   15989  16.8%
I                   13525  14.2%
F                   13525  14.2%
2                   4435   4.6%
3                   2944   3.1%
4                   1706   1.8%
5                   975    1.0%
6                   560    0.6%
7                   346    0.4%
Other values (6)    736    0.8%

Most occurring scripts

Value               Count  Frequency (%)
(unknown)           95381  100.0%

Most frequent character per script

(unknown)
Value               Count  Frequency (%)
_                   40640  42.6%
1                   15989  16.8%
I                   13525  14.2%
F                   13525  14.2%
2                   4435   4.6%
3                   2944   3.1%
4                   1706   1.8%
5                   975    1.0%
6                   560    0.6%
7                   346    0.4%
Other values (6)    736    0.8%

Most occurring blocks

Value               Count  Frequency (%)
(unknown)           95381  100.0%

Most frequent character per block

(unknown)
Value               Count  Frequency (%)
_                   40640  42.6%
1                   15989  16.8%
I                   13525  14.2%
F                   13525  14.2%
2                   4435   4.6%
3                   2944   3.1%
4                   1706   1.8%
5                   975    1.0%
6                   560    0.6%
7                   346    0.4%
Other values (6)    736    0.8%

type_of_the_fan
Real number (ℝ)

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9236229
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 727.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 3
95th percentile: 4
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.2253516
Coefficient of variation (CV): 0.63700195
Kurtosis: 2.9916267
Mean: 1.9236229
Median Absolute Deviation (MAD): 0
Skewness: 1.4538611
Sum: 26017
Variance: 1.5014864
Monotonicity: Not monotonic
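
Several of these statistics can be recomputed from the value counts the report lists for type_of_the_fan. The sketch below uses Python's standard `statistics` module and assumes the sample variance (dividing by n - 1), which is the convention that matches the reported figure. The coefficient of variation is taken as the standard deviation divided by the mean.

```python
import statistics

# Value counts for type_of_the_fan, copied from the report.
value_counts = {1: 7715, 2: 643, 3: 4416, 4: 466, 5: 42, 6: 7, 7: 236}

# Expand the counts back into one value per observation.
values = [v for v, count in value_counts.items() for _ in range(count)]

n = len(values)
total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation

print(n, total, round(mean, 7), round(variance, 7), round(cv, 7))
```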
Histogram with fixed size bins (bins=7)

Common values

Value               Count  Frequency (%)
1                   7715   57.0%
3                   4416   32.7%
2                   643    4.8%
4                   466    3.4%
7                   236    1.7%
5                   42     0.3%
6                   7      0.1%

Minimum values

Value               Count  Frequency (%)
1                   7715   57.0%
2                   643    4.8%
3                   4416   32.7%
4                   466    3.4%
5                   42     0.3%
6                   7      0.1%
7                   236    1.7%

Maximum values

Value               Count  Frequency (%)
7                   236    1.7%
6                   7      0.1%
5                   42     0.3%
4                   466    3.4%
3                   4416   32.7%
2                   643    4.8%
1                   7715   57.0%

no_of_hours_fan_was_on_during_daytime_last_week
Real number (ℝ)

Distinct: 101
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0688487
Minimum: 0
Maximum: 70
Zeros: 8917
Zeros (%): 65.9%
Negative: 0
Negative (%): 0.0%
Memory size: 727.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 7
95th percentile: 42
Maximum: 70
Range: 70
Interquartile range (IQR): 7
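
Because 8917 of the 13525 observations (65.9%) are zero, every quantile up to roughly the 65th percentile must be zero, which is why Q1 and the median are both 0. The sketch below illustrates this with stand-in data: the non-zero readings are replaced by a hypothetical placeholder value of 1.0, so only Q1 and the median are meaningful.

```python
import statistics

N_OBSERVATIONS = 13525
N_ZEROS = 8917  # zeros reported for the daytime column

# Hypothetical stand-in data: the real non-zero values are not needed to
# locate Q1 and the median, so every non-zero reading is replaced by 1.0.
values = [0.0] * N_ZEROS + [1.0] * (N_OBSERVATIONS - N_ZEROS)

# Quartile cut points; the third one only reflects the placeholder value.
q1, median, _q3_placeholder = statistics.quantiles(values, n=4)
zero_share = N_ZEROS / N_OBSERVATIONS

print(q1, median, round(zero_share, 3))
```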

Descriptive statistics

Standard deviation: 14.860882
Coefficient of variation (CV): 2.1023058
Kurtosis: 7.062147
Mean: 7.0688487
Median Absolute Deviation (MAD): 0
Skewness: 2.6631086
Sum: 95606.178
Variance: 220.84581
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   8917   65.9%
14                  778    5.8%
7                   562    4.2%
21                  463    3.4%
28                  358    2.6%
70                  311    2.3%
35                  291    2.2%
1                   221    1.6%
2                   172    1.3%
42                  142    1.0%
Other values (91)   1310   9.7%

Minimum values

Value               Count  Frequency (%)
0                   8917   65.9%
0.03                1      < 0.1%
0.083               1      < 0.1%
0.1                 1      < 0.1%
0.14                1      < 0.1%
0.21                1      < 0.1%
0.25                24     0.2%
0.3                 2      < 0.1%
0.33                11     0.1%
0.45                1      < 0.1%

Maximum values

Value               Count  Frequency (%)
70                  311    2.3%
68                  1      < 0.1%
66.5                2      < 0.1%
66                  2      < 0.1%
65                  3      < 0.1%
64                  4      < 0.1%
63                  14     0.1%
60                  27     0.2%
56                  59     0.4%
54                  7      0.1%

no_of_hours_fan_was_on_during_night_last_week
Real number (ℝ)

Distinct: 169
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.29176
Minimum: 0
Maximum: 98
Zeros: 4647
Zeros (%): 34.4%
Negative: 0
Negative (%): 0.0%
Memory size: 727.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 14
Q3: 56
95th percentile: 84
Maximum: 98
Range: 98
Interquartile range (IQR): 56

Descriptive statistics

Standard deviation: 29.365074
Coefficient of variation (CV): 1.0759685
Kurtosis: -0.74286419
Mean: 27.29176
Median Absolute Deviation (MAD): 14
Skewness: 0.71209119
Sum: 369121.05
Variance: 862.30755
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   4647   34.4%
56                  1177   8.7%
70                  707    5.2%
14                  618    4.6%
28                  523    3.9%
49                  521    3.9%
35                  507    3.7%
7                   474    3.5%
21                  469    3.5%
42                  461    3.4%
Other values (159)  3421   25.3%

Minimum values

Value               Count  Frequency (%)
0                   4647   34.4%
0.1                 3      < 0.1%
0.12                2      < 0.1%
0.17                1      < 0.1%
0.2                 1      < 0.1%
0.25                19     0.1%
0.3                 2      < 0.1%
0.33                5      < 0.1%
0.5                 38     0.3%
0.6                 1      < 0.1%

Maximum values

Value               Count  Frequency (%)
98                  385    2.8%
96                  4      < 0.1%
95                  1      < 0.1%
94.5                1      < 0.1%
94                  4      < 0.1%
92                  2      < 0.1%
91                  35     0.3%
90                  25     0.2%
88                  4      < 0.1%
87.5                1      < 0.1%

Interactions

[Figures: nine interaction plots; images not reproduced]

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
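
Nullity by column amounts to the share of non-missing values in each column. A minimal sketch of that computation follows, using three rows taken from the report's sample table and treating `None` as the missing marker. That marker is an assumption; the report does not say how missing values are encoded in the source data.

```python
# A few rows from the report's sample table, as plain dictionaries.
rows = [
    {"room_ID": "I_1", "fan_ID": "I_1_F_1", "type_of_the_fan": 1},
    {"room_ID": "I_2", "fan_ID": "I_2_F_1", "type_of_the_fan": 1},
    {"room_ID": "I_1", "fan_ID": "I_1_F_2", "type_of_the_fan": 3},
]

columns = list(rows[0])

# Completeness per column: the fraction of rows holding a non-None value.
completeness = {
    col: sum(row.get(col) is not None for row in rows) / len(rows)
    for col in columns
}

# Total number of missing cells across all rows and columns.
missing_cells = sum(row.get(col) is None for row in rows for col in columns)

print(completeness, missing_cells)
```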

Sample

First rows

household_ID  room_ID  fan_ID   type_of_the_fan  no_of_hours_fan_was_on_during_daytime_last_week  no_of_hours_fan_was_on_during_night_last_week
ID0001        I_1      I_1_F_1  1                0.0                                              0.0
ID0001        I_2      I_2_F_1  1                0.0                                              0.0
ID0001        I_3      I_3_F_1  1                0.0                                              0.0
ID0001        I_4      I_4_F_1  1                0.0                                              0.0
ID0001        I_5      I_5_F_1  1                0.0                                              0.0
ID0002        I_1      I_1_F_1  3                0.0                                              14.0
ID0002        I_1      I_1_F_2  3                14.0                                             14.0
ID0002        I_2      I_2_F_1  3                14.0                                             2.0
ID0002        I_2      I_2_F_2  3                2.0                                              2.0
ID0002        I_3      I_3_F_1  3                0.0                                              14.0

Last rows

household_ID  room_ID  fan_ID   type_of_the_fan  no_of_hours_fan_was_on_during_daytime_last_week  no_of_hours_fan_was_on_during_night_last_week
ID4058        I_6      I_6_F_3  7                0.0                                              0.0
ID4059        I_1      I_1_F_1  3                8.0                                              21.0
ID4059        I_3      I_3_F_1  3                14.0                                             63.0
ID4062        I_1      I_1_F_1  1                42.0                                             0.0
ID4062        I_2      I_2_F_1  3                0.0                                              16.0
ID4062        I_3      I_3_F_1  3                0.0                                              0.0
ID4062        I_5      I_5_F_1  1                7.0                                              0.0
ID4063        I_1      I_1_F_1  1                18.0                                             18.0
ID4063        I_2      I_2_F_1  1                6.0                                              10.0
ID4063        I_3      I_3_F_1  3                4.0                                              16.0

Duplicate rows

Most frequently occurring

      room_ID  fan_ID   type_of_the_fan  no_of_hours_fan_was_on_during_daytime_last_week  no_of_hours_fan_was_on_during_night_last_week  # duplicates
0     I_1      I_1_F_1  1                0.0                                              0.0                                            673
643   I_3      I_3_F_1  1                0.0                                              0.0                                            340
344   I_2      I_2_F_1  1                0.0                                              0.0                                            329
878   I_4      I_4_F_1  1                0.0                                              0.0                                            285
245   I_1      I_1_F_2  1                0.0                                              0.0                                            267
1020  I_5      I_5_F_1  1                0.0                                              0.0                                            204
745   I_3      I_3_F_1  3                0.0                                              0.0                                            133
508   I_2      I_2_F_1  3                0.0                                              56.0                                           126
1109  I_6      I_6_F_1  1                0.0                                              0.0                                            123
477   I_2      I_2_F_1  3                0.0                                              0.0                                            115
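
The ten groups listed here already cover 2595 rows, more than the 1262 reported as duplicate rows in the overview. This suggests that the headline figure counts distinct duplicated row patterns rather than individual rows; that reading is inferred from the numbers and is not stated in the report. The sketch below shows both ways of counting on a handful of hypothetical rows, using `collections.Counter`.

```python
from collections import Counter

# Hypothetical rows in the same column order as the report:
# (room_ID, fan_ID, type_of_the_fan, daytime_hours, night_hours)
rows = [
    ("I_1", "I_1_F_1", 1, 0.0, 0.0),
    ("I_1", "I_1_F_1", 1, 0.0, 0.0),
    ("I_1", "I_1_F_1", 1, 0.0, 0.0),
    ("I_2", "I_2_F_1", 3, 0.0, 56.0),
    ("I_2", "I_2_F_1", 3, 0.0, 56.0),
    ("I_3", "I_3_F_1", 3, 14.0, 63.0),
]

occurrences = Counter(rows)

# Row patterns that occur more than once, most frequent first.
duplicated = [(row, n) for row, n in occurrences.most_common() if n > 1]

distinct_duplicated_patterns = len(duplicated)
rows_in_duplicated_groups = sum(n for _, n in duplicated)

# Sum of the "# duplicates" column for the ten groups shown in the report.
top10_total = sum([673, 340, 329, 285, 267, 204, 133, 126, 123, 115])

print(distinct_duplicated_patterns, rows_in_duplicated_groups, top10_total)
```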