Overview

Dataset statistics

Number of variables: 6
Number of observations: 48489
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 5119
Duplicate rows (%): 10.6%
Total size in memory: 3.6 MiB
Average record size in memory: 77.8 B

Variable types

Categorical: 1
Text: 1
Numeric: 4

Alerts

Dataset has 5119 (10.6%) duplicate rows (Duplicates)
wattage_of_the_bulb has 8576 (17.7%) zeros (Zeros)
no_of_hours_bulbs_was_on_during_daytime_last_week has 42744 (88.2%) zeros (Zeros)
no_of_hours_bulbs_was_on_during_night_last_week has 13926 (28.7%) zeros (Zeros)
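
The percentages in these alerts follow from the counts and the 48489 observations listed under Dataset statistics. The sketch below redoes that arithmetic, assuming the report rounds to one decimal place as displayed. The helper name share is hypothetical and is not part of ydata-profiling.

```python
# Counts copied from the alerts; N_ROWS is the number of observations.
N_ROWS = 48489
ALERT_COUNTS = {
    "duplicate rows": 5119,
    "wattage_of_the_bulb zeros": 8576,
    "daytime zeros": 42744,
    "night zeros": 13926,
}


def share(count, total=N_ROWS):
    """Format count/total as a percentage with one decimal (hypothetical helper)."""
    return f"{count / total:.1%}"


shares = {name: share(count) for name, count in ALERT_COUNTS.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}")
```

Each printed share can be compared with the figure in the corresponding alert line.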

Reproduction

Analysis started: 2024-11-18 08:39:21.367361
Analysis finished: 2024-11-18 08:39:23.814095
Duration: 2.45 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json
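
The reported duration is the difference between the two timestamps. A minimal sketch of that calculation using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-11-18 08:39:21.367361")
finished = datetime.fromisoformat("2024-11-18 08:39:23.814095")

# Elapsed wall-clock time in seconds, as a float.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```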

Variables

room_ID
Categorical

Distinct: 32
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value              Count
I_1                10095
I_2                5696
I_3                5097
I_4                4918
I_5                4649
Other values (27)  18034

Length

Max length: 8
Median length: 3
Mean length: 3.1670688
Min length: 3

Characters and Unicode

Total characters: 153568
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
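
As an illustration of such character properties, the sketch below looks up the general category and name of each distinct character in a room_ID-style string with Python's standard unicodedata module. This is not necessarily the lookup the profiler performs: this report lists a single (unknown) category, script, and block, and the standard library exposes neither script nor block. The function name describe is hypothetical.

```python
import unicodedata


def describe(text):
    """Return (character, general category, name) for each distinct character."""
    return [
        (ch, unicodedata.category(ch), unicodedata.name(ch))
        for ch in sorted(set(text))
    ]


# "I_10" is one of the room_ID values counted in this section.
properties = describe("I_10")
for row in properties:
    print(row)
```

The categories returned are two-letter codes such as Lu (uppercase letter), Nd (decimal digit), and Pc (connector punctuation).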

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: I_1
2nd row: I_1
3rd row: I_1
4th row: I_1
5th row: I_2

Common Values

Value              Count  Frequency (%)
I_1                10095  20.8%
I_2                5696   11.7%
I_3                5097   10.5%
I_4                4918   10.1%
I_5                4649   9.6%
I_6                4066   8.4%
I_7                3443   7.1%
I_8                2533   5.2%
I_9                2033   4.2%
I_10               1347   2.8%
Other values (22)  4612   9.5%

Length

[Figure: Histogram of lengths of the category]

Value              Count  Frequency (%)
i_1                10095  20.8%
i_2                5696   11.7%
i_3                5097   10.5%
i_4                4918   10.1%
i_5                4649   9.6%
i_6                4066   8.4%
i_7                3443   7.1%
i_8                2533   5.2%
i_9                2033   4.2%
i_10               1347   2.8%
Other values (22)  4612   9.5%

Most occurring characters

Value             Count  Frequency (%)
_                 49199  32.0%
I                 48489  31.6%
1                 16872  11.0%
2                 6582   4.3%
3                 5767   3.8%
4                 5374   3.5%
5                 5009   3.3%
6                 4310   2.8%
7                 3627   2.4%
8                 2666   1.7%
Other values (5)  5673   3.7%
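
A character table of this kind can be produced by counting every character across all values of the column. The sketch below does so for the five sample rows listed earlier in this section, so its counts are far smaller than the full-column figures. It is an illustration with collections.Counter, not the profiler's own code.

```python
from collections import Counter

# The five sample room_ID values shown under Sample.
sample_room_ids = ["I_1", "I_1", "I_1", "I_1", "I_2"]

# Count each character over the concatenation of all values.
char_counts = Counter("".join(sample_room_ids))
total = sum(char_counts.values())

for ch, n in char_counts.most_common():
    print(ch, n, f"{n / total:.1%}")
```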

Most occurring categories

Value      Count   Frequency (%)
(unknown)  153568  100.0%

Most frequent character per category

(unknown): same characters, counts, and frequencies as the Most occurring characters table, because every character falls into this single category.

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  153568  100.0%

Most frequent character per script

(unknown): same characters, counts, and frequencies as the Most occurring characters table.

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  153568  100.0%

Most frequent character per block

(unknown): same characters, counts, and frequencies as the Most occurring characters table.

light_ID
Text

Distinct: 412
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Length

Max length: 12
Median length: 7
Mean length: 7.1979006
Min length: 7

Characters and Unicode

Total characters: 349019
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 82
Unique (%): 0.2%

Sample

1st row: I_1_L_1
2nd row: I_1_L_2
3rd row: I_1_L_3
4th row: I_1_L_4
5th row: I_2_L_1

Value               Count  Frequency (%)
i_1_l_1             4017   8.3%
i_2_l_1             3928   8.1%
i_3_l_1             3797   7.8%
i_4_l_1             3683   7.6%
i_5_l_1             3395   7.0%
i_6_l_1             2877   5.9%
i_7_l_1             2200   4.5%
i_1_l_2             2160   4.5%
i_8_l_1             1567   3.2%
i_1_l_3             1237   2.6%
Other values (402)  19628  40.5%

Most occurring characters

Value             Count   Frequency (%)
_                 146177  41.9%
I                 48489   13.9%
L                 48489   13.9%
1                 48408   13.9%
2                 14577   4.2%
3                 9281    2.7%
4                 7662    2.2%
5                 6479    1.9%
6                 5337    1.5%
7                 4379    1.3%
Other values (6)  9741    2.8%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  349019  100.0%

Most frequent character per category

(unknown): same characters, counts, and frequencies as the Most occurring characters table, because every character falls into this single category.

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  349019  100.0%

Most frequent character per script

(unknown): same characters, counts, and frequencies as the Most occurring characters table.

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  349019  100.0%

Most frequent character per block

(unknown): same characters, counts, and frequencies as the Most occurring characters table.

type_of_the_bulb
Real number (ℝ)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.862938
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 3
Q3: 3
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.75761391
Coefficient of variation (CV): 0.26462812
Kurtosis: 26.731934
Mean: 2.862938
Median Absolute Deviation (MAD): 0
Skewness: 3.373147
Sum: 138821
Variance: 0.57397884
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=9)]

Common Values

Value  Count  Frequency (%)
3      37862  78.1%
2      8139   16.8%
1      1193   2.5%
5      398    0.8%
4      296    0.6%
9      228    0.5%
6      212    0.4%
8      139    0.3%
7      22     < 0.1%

Smallest values (ascending)

Value  Count  Frequency (%)
1      1193   2.5%
2      8139   16.8%
3      37862  78.1%
4      296    0.6%
5      398    0.8%
6      212    0.4%
7      22     < 0.1%
8      139    0.3%
9      228    0.5%

Largest values (descending)

Value  Count  Frequency (%)
9      228    0.5%
8      139    0.3%
7      22     < 0.1%
6      212    0.4%
5      398    0.8%
4      296    0.6%
3      37862  78.1%
2      8139   16.8%
1      1193   2.5%
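
Because type_of_the_bulb takes only nine distinct values, its descriptive statistics can be recomputed from the frequency table alone. The sketch below does this in plain Python. It assumes the sample variance (dividing by n - 1), which is an assumption that happens to reproduce the reported figure, and it uses a hypothetical helper weighted_median that returns the lower median.

```python
import math

# Value -> count, copied from the type_of_the_bulb frequency table.
FREQ = {1: 1193, 2: 8139, 3: 37862, 4: 296, 5: 398, 6: 212, 7: 22, 8: 139, 9: 228}


def weighted_median(freq):
    """Lower median of the data described by a value -> count mapping."""
    n = sum(freq.values())
    target = (n - 1) // 2  # zero-based position of the lower median
    seen = 0
    for value in sorted(freq):
        seen += freq[value]
        if seen > target:
            return value
    raise ValueError("empty frequency table")


n = sum(FREQ.values())
total = sum(v * c for v, c in FREQ.items())
mean = total / n
# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in FREQ.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation
median = weighted_median(FREQ)

# Median absolute deviation: median of |value - median|, merging equal deviations.
abs_dev = {}
for v, c in FREQ.items():
    key = abs(v - median)
    abs_dev[key] = abs_dev.get(key, 0) + c
mad = weighted_median(abs_dev)

print(n, total, round(mean, 6), round(variance, 8), round(cv, 8), median, mad)
```

The median and MAD come out as 3 and 0 because more than half of all rows have the value 3.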

wattage_of_the_bulb
Real number (ℝ)

Zeros 

Distinct: 71
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 204.3204
Minimum: 0
Maximum: 999
Zeros: 8576
Zeros (%): 17.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 10
Q3: 40
95-th percentile: 999
Maximum: 999
Range: 999
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 390.34709
Coefficient of variation (CV): 1.9104656
Kurtosis: 0.36126863
Mean: 204.3204
Median Absolute Deviation (MAD): 7
Skewness: 1.5316966
Sum: 9907292
Variance: 152370.85
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
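
With fixed-size bins, the range from minimum to maximum is split into equal-width intervals and each value is counted in the interval that contains it. The sketch below illustrates this for wattage_of_the_bulb using the ten most common values from this variable's frequency table. The function names bin_index and histogram are hypothetical, and the report's plotting backend may treat bin edges differently.

```python
def bin_index(value, lo, hi, bins):
    """Index of the equal-width bin containing value; the top edge goes in the last bin."""
    if not lo <= value <= hi:
        raise ValueError("value outside [lo, hi]")
    width = (hi - lo) / bins
    return min(int((value - lo) / width), bins - 1)


def histogram(freq, lo, hi, bins):
    """Sum the counts of a value -> count mapping into equal-width bins."""
    counts = [0] * bins
    for value, count in freq.items():
        counts[bin_index(value, lo, hi, bins)] += count
    return counts


# Ten most common wattage values and their counts; "Other values (61)" is omitted.
COMMON = {999: 9234, 0: 8576, 5: 5248, 12: 5205, 7: 3623,
          9: 2483, 15: 2136, 10: 1536, 8: 1193, 6: 1059}

counts = histogram(COMMON, lo=0, hi=999, bins=50)
print(counts[0], counts[-1])
```

With 50 bins over 0 to 999, each bin is 19.98 wide, so every listed value except 999 lands in the first bin.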

Common Values

Value              Count  Frequency (%)
999                9234   19.0%
0                  8576   17.7%
5                  5248   10.8%
12                 5205   10.7%
7                  3623   7.5%
9                  2483   5.1%
15                 2136   4.4%
10                 1536   3.2%
8                  1193   2.5%
6                  1059   2.2%
Other values (61)  8196   16.9%

Smallest values (ascending)

Value  Count  Frequency (%)
0      8576   17.7%
1      80     0.2%
2      378    0.8%
3      479    1.0%
3.5    301    0.6%
4      86     0.2%
5      5248   10.8%
5.5    237    0.5%
6      1059   2.2%
7      3623   7.5%

Largest values (descending)

Value  Count  Frequency (%)
999    9234   19.0%
908    1      < 0.1%
900    179    0.4%
675    64     0.1%
250    1      < 0.1%
168    1      < 0.1%
165    255    0.5%
150    4      < 0.1%
125    1      < 0.1%
123    144    0.3%

no_of_hours_bulbs_was_on_during_daytime_last_week
Real number (ℝ)

Distinct: 123
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6227784
Minimum: 0
Maximum: 70
Zeros: 42744
Zeros (%): 88.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 10
Maximum: 70
Range: 70
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 7.0569084
Coefficient of variation (CV): 4.3486581
Kurtosis: 51.180155
Mean: 1.6227784
Median Absolute Deviation (MAD): 0
Skewness: 6.5893157
Sum: 78686.902
Variance: 49.799956
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
0                   42744  88.2%
7                   1087   2.2%
14                  701    1.4%
21                  539    1.1%
1                   492    1.0%
2                   321    0.7%
10                  264    0.5%
70                  218    0.4%
35                  196    0.4%
3                   173    0.4%
Other values (113)  1754   3.6%

Smallest values (ascending)

Value  Count  Frequency (%)
0      42744  88.2%
0.033  1      < 0.1%
0.05   3      < 0.1%
0.1    5      < 0.1%
0.12   1      < 0.1%
0.125  1      < 0.1%
0.175  1      < 0.1%
0.2    2      < 0.1%
0.21   2      < 0.1%
0.25   112    0.2%

Largest values (descending)

Value  Count  Frequency (%)
70     218    0.4%
69     1      < 0.1%
66.5   4      < 0.1%
66     1      < 0.1%
65     1      < 0.1%
63     8      < 0.1%
60     18     < 0.1%
57     1      < 0.1%
56     19     < 0.1%
54     2      < 0.1%

no_of_hours_bulbs_was_on_during_night_last_week
Real number (ℝ)

Distinct: 294
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.515142
Minimum: 0
Maximum: 98
Zeros: 13926
Zeros (%): 28.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 7
Q3: 28
95-th percentile: 59.5
Maximum: 98
Range: 98
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 20.584469
Coefficient of variation (CV): 1.2463997
Kurtosis: 2.9408673
Mean: 16.515142
Median Absolute Deviation (MAD): 7
Skewness: 1.678896
Sum: 800802.71
Variance: 423.72035
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
0                   13926  28.7%
28                  6070   12.5%
7                   3642   7.5%
14                  3567   7.4%
21                  3407   7.0%
35                  2355   4.9%
1                   1531   3.2%
2                   1006   2.1%
3.5                 923    1.9%
42                  911    1.9%
Other values (284)  11151  23.0%

Smallest values (ascending)

Value    Count  Frequency (%)
0        13926  28.7%
0.00083  2      < 0.1%
0.0023   1      < 0.1%
0.025    2      < 0.1%
0.03     2      < 0.1%
0.033    1      < 0.1%
0.05     8      < 0.1%
0.066    1      < 0.1%
0.075    1      < 0.1%
0.083    1      < 0.1%

Largest values (descending)

Value  Count  Frequency (%)
98     396    0.8%
97     1      < 0.1%
96     6      < 0.1%
95     7      < 0.1%
94.5   1      < 0.1%
92     1      < 0.1%
91     143    0.3%
90     31     0.1%
88     1      < 0.1%
87.5   2      < 0.1%

Interactions

[Figures: 16 interaction plots; images not preserved]

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion]

Sample

household_ID  room_ID  light_ID  type_of_the_bulb  wattage_of_the_bulb  no_of_hours_bulbs_was_on_during_daytime_last_week  no_of_hours_bulbs_was_on_during_night_last_week
ID0001        I_1      I_1_L_1   3                 5.0                  0.00                                               2.00
ID0001        I_1      I_1_L_2   3                 5.0                  0.00                                               2.00
ID0001        I_1      I_1_L_3   3                 5.0                  0.00                                               2.00
ID0001        I_1      I_1_L_4   3                 5.0                  0.00                                               2.00
ID0001        I_2      I_2_L_1   2                 5.0                  0.00                                               0.25
ID0001        I_3      I_3_L_1   2                 5.0                  0.00                                               0.25
ID0001        I_4      I_4_L_1   2                 5.0                  0.00                                               0.25
ID0001        I_5      I_5_L_1   2                 5.0                  0.25                                               0.00
ID0001        I_6      I_6_L_1   2                 5.0                  2.00                                               1.00
ID0001        I_7      I_7_L_1   2                 5.0                  0.00                                               2.00

household_ID  room_ID  light_ID  type_of_the_bulb  wattage_of_the_bulb  no_of_hours_bulbs_was_on_during_daytime_last_week  no_of_hours_bulbs_was_on_during_night_last_week
ID4062        I_3      I_3_L_1   3                 7.0                  0.0                                                0.0
ID4062        I_4      I_4_L_1   3                 7.0                  28.0                                               28.0
ID4062        I_6      I_6_L_1   3                 7.0                  0.0                                                7.0
ID4062        I_7      I_7_L_1   3                 7.0                  0.0                                                0.0
ID4062        I_7      I_7_L_2   3                 7.0                  0.0                                                0.0
ID4063        I_1      I_1_L_1   3                 7.0                  0.0                                                35.0
ID4063        I_2      I_2_L_1   3                 7.0                  0.0                                                2.0
ID4063        I_3      I_3_L_1   3                 7.0                  0.0                                                4.0
ID4063        I_7      I_7_L_1   3                 7.0                  0.0                                                2.0
ID4063        I_8      I_8_L_1   3                 7.0                  4.0                                                20.0

Duplicate rows

Most frequently occurring

      room_ID  light_ID  type_of_the_bulb  wattage_of_the_bulb  no_of_hours_bulbs_was_on_during_daytime_last_week  no_of_hours_bulbs_was_on_during_night_last_week  # duplicates
434   I_1      I_1_L_2   3                 0.0                  0.0                                                0.0                                              287
600   I_1      I_1_L_3   3                 0.0                  0.0                                                0.0                                              230
305   I_1      I_1_L_1   3                 999.0                0.0                                                28.0                                             210
694   I_1      I_1_L_4   3                 0.0                  0.0                                                0.0                                              193
218   I_1      I_1_L_1   3                 12.0                 0.0                                                28.0                                             142
750   I_1      I_1_L_5   3                 0.0                  0.0                                                0.0                                              130
1955  I_2      I_2_L_2   3                 0.0                  0.0                                                0.0                                              124
528   I_1      I_1_L_2   3                 12.0                 0.0                                                0.0                                              115
3066  I_4      I_4_L_2   3                 0.0                  0.0                                                0.0                                              110
4409  I_7      I_7_L_2   3                 0.0                  0.0                                                0.0                                              109
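
A table like this can be built by grouping identical rows and counting how often each one occurs. The sketch below shows the idea on a hypothetical six-row mini-dataset (these are not rows taken from the report). Whether the report's "# duplicates" column is computed in exactly this way is an assumption.

```python
from collections import Counter

# Hypothetical records, one tuple per row:
# (room_ID, light_ID, type_of_the_bulb, wattage_of_the_bulb, daytime hours, night hours)
rows = [
    ("I_1", "I_1_L_2", 3, 0.0, 0.0, 0.0),
    ("I_1", "I_1_L_2", 3, 0.0, 0.0, 0.0),
    ("I_1", "I_1_L_1", 3, 12.0, 0.0, 28.0),
    ("I_1", "I_1_L_2", 3, 0.0, 0.0, 0.0),
    ("I_2", "I_2_L_1", 2, 5.0, 0.0, 0.25),
    ("I_1", "I_1_L_1", 3, 12.0, 0.0, 28.0),
]

# Count occurrences of each distinct row (tuples are hashable).
occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicate_groups = [(row, n) for row, n in occurrences.most_common() if n > 1]
for row, n in duplicate_groups:
    print(*row, n)
```

Rows that occur only once, such as the I_2_L_1 record here, do not appear in the result.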