Overview

Dataset statistics

Number of variables: 7
Number of observations: 48,489
Missing cells: 9,234
Missing cells (%): 2.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 MiB
Average record size in memory: 56.0 B
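
The missing-cell percentage follows from the raw counts in the statistics above. A minimal sketch of the arithmetic, using only figures from this overview (the variable names are illustrative, not taken from the profiler):

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 48_489
missing_cells = 9_234

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of all cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

The printed percentage rounds to the 2.7% shown in the report.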

Variable types

Text: 2
Categorical: 2
Numeric: 3

Alerts

[Imbalance] type_of_the_bulb is highly imbalanced (67.1%)
[Missing] wattage_of_the_bulb has 9,234 (19.0%) missing values
[Zeros] wattage_of_the_bulb has 8,576 (17.7%) zeros
[Zeros] no_of_hours_bulb_was_on_during_daytime_last_week has 42,744 (88.2%) zeros
[Zeros] no_of_hours_bulb_was_on_during_night_last_week has 13,926 (28.7%) zeros
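
The Missing and Zeros alerts above can be checked by hand against the raw data. The sketch below assumes the dataset is available as a pandas DataFrame. The helper name `zero_and_missing_summary` and the demo frame are hypothetical, and only the column name comes from the alerts.

```python
import pandas as pd


def zero_and_missing_summary(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Count missing values and exact zeros per column, as counts and as % of all rows."""
    n_rows = len(df)
    rows = []
    for col in columns:
        n_missing = int(df[col].isna().sum())
        # NaN == 0 evaluates to False, so missing values are not counted as zeros.
        n_zeros = int((df[col] == 0).sum())
        rows.append({
            "column": col,
            "missing": n_missing,
            "missing_pct": round(100 * n_missing / n_rows, 1),
            "zeros": n_zeros,
            "zeros_pct": round(100 * n_zeros / n_rows, 1),
        })
    return pd.DataFrame(rows).set_index("column")


# Tiny made-up frame for illustration only; these are not rows from the profiled dataset.
demo = pd.DataFrame({"wattage_of_the_bulb": [0.0, 60.0, None, 0.0, 15.0]})
summary = zero_and_missing_summary(demo, ["wattage_of_the_bulb"])
print(summary)
```

Run against the real data, the same helper should give counts comparable to the alert lines, provided the profiler counts zeros and missing values the same way, which is an assumption here.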

Reproduction

Analysis started: 2024-12-06 05:54:46.469423
Analysis finished: 2024-12-06 05:54:48.195246
Duration: 1.73 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json

Variables

Distinct: 4054
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Memory size: 378.9 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 290,934
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
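
As a rough illustration of how the category figures can be obtained, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name and the short-code-to-label mapping are assumptions made for this example. The standard library module does not expose script or block properties, so only categories are covered.

```python
import unicodedata
from collections import Counter

# Long names for the two general categories that appear in this variable.
CATEGORY_NAMES = {"Lu": "Uppercase Letter", "Nd": "Decimal Number"}


def category_counts(values):
    """Tally the Unicode general category of every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# "ID0001" is the value shown in the sample rows of this report.
counts = category_counts(["ID0001"])
print(counts)
```

For a six-character identifier of this shape, two characters are uppercase letters and four are decimal digits, which matches the 33.3% / 66.7% split reported for the whole column.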

Unique

Unique: 10
Unique (%): < 0.1%

Sample

1st row: ID0001
2nd row: ID0001
3rd row: ID0001
4th row: ID0001
5th row: ID0001

Value                  Count    Frequency (%)
id0469                 228      0.5%
id2033                 225      0.5%
id0278                 209      0.4%
id0282                 182      0.4%
id1589                 171      0.4%
id1841                 165      0.3%
id0399                 162      0.3%
id0069                 152      0.3%
id0901                 150      0.3%
id2072                 144      0.3%
Other values (4044)    46701    96.3%
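
The frequency table lists values in lowercase (id0469) although the sample rows are uppercase (ID0001), which suggests the counts are taken over lowercased tokens. That reading is an assumption, not something the report states. A minimal sketch of such a count, with a hypothetical helper name and made-up identifiers:

```python
from collections import Counter


def token_frequencies(values, top_n=10):
    """Return (token, count, percent) rows for lowercased values, most frequent first."""
    counts = Counter(v.lower() for v in values)
    total = sum(counts.values())
    return [(tok, n, round(100 * n / total, 1)) for tok, n in counts.most_common(top_n)]


# Made-up identifiers for illustration; these are not rows from the profiled dataset.
demo_ids = ["ID0001", "ID0001", "ID0002", "id0001"]
rows = token_frequencies(demo_ids)
for tok, n, pct in rows:
    print(f"{tok} {n} {pct}%")
```

Lowercasing before counting merges values that differ only in case, so "ID0001" and "id0001" land in the same row.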

Most occurring characters

Value               Count    Frequency (%)
I                   48489    16.7%
D                   48489    16.7%
0                   30565    10.5%
1                   27712    9.5%
2                   25976    8.9%
3                   22658    7.8%
4                   15122    5.2%
6                   14938    5.1%
7                   14483    5.0%
8                   14336    4.9%
Other values (2)    28166    9.7%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      193956    66.7%
Uppercase Letter    96978     33.3%

Most frequent character per category

Decimal Number

Value    Count    Frequency (%)
0        30565    15.8%
1        27712    14.3%
2        25976    13.4%
3        22658    11.7%
4        15122    7.8%
6        14938    7.7%
7        14483    7.5%
8        14336    7.4%
9        14213    7.3%
5        13953    7.2%

Uppercase Letter

Value    Count    Frequency (%)
I        48489    50.0%
D        48489    50.0%

Most occurring scripts

Value     Count     Frequency (%)
Common    193956    66.7%
Latin     96978     33.3%

Most frequent character per script

Common

Value    Count    Frequency (%)
0        30565    15.8%
1        27712    14.3%
2        25976    13.4%
3        22658    11.7%
4        15122    7.8%
6        14938    7.7%
7        14483    7.5%
8        14336    7.4%
9        14213    7.3%
5        13953    7.2%

Latin

Value    Count    Frequency (%)
I        48489    50.0%
D        48489    50.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    290934    100.0%

Most frequent character per block

ASCII

Value               Count    Frequency (%)
I                   48489    16.7%
D                   48489    16.7%
0                   30565    10.5%
1                   27712    9.5%
2                   25976    8.9%
3                   22658    7.8%
4                   15122    5.2%
6                   14938    5.1%
7                   14483    5.0%
8                   14336    4.9%
Other values (2)    28166    9.7%

room_ID
Categorical

Distinct: 32
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 378.9 KiB

Value                Count
I1                   10095
I2                   5696
I3                   5097
I4                   4918
I5                   4649
Other values (27)    18034

Length

Max length: 6
Median length: 2
Mean length: 2.1524263
Min length: 2

Characters and Unicode

Total characters: 104,369
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
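
The mean length reported for room_ID can be cross-checked against the total character count: with no missing values, the mean length is the total number of characters divided by the number of observations. A small sketch of that check, using figures from this report (the variable names are illustrative):

```python
# Figures copied from the room_ID section and the dataset overview of this report.
total_characters = 104_369
n_observations = 48_489

# With no missing values, mean length = total characters / number of values.
mean_length = total_characters / n_observations
print(f"{mean_length:.7f}")
```

The result agrees with the reported mean length of 2.1524263 to the precision shown.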

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: I1
2nd row: I1
3rd row: I1
4th row: I1
5th row: I2

Common Values

Value                Count    Frequency (%)
I1                   10095    20.8%
I2                   5696     11.7%
I3                   5097     10.5%
I4                   4918     10.1%
I5                   4649     9.6%
I6                   4066     8.4%
I7                   3443     7.1%
I8                   2533     5.2%
I9                   2033     4.2%
I10                  1347     2.8%
Other values (22)    4612     9.5%

Length

[Figure: Histogram of lengths of the category]

Value                Count    Frequency (%)
i1                   10095    20.8%
i2                   5696     11.7%
i3                   5097     10.5%
i4                   4918     10.1%
i5                   4649     9.6%
i6                   4066     8.4%
i7                   3443     7.1%
i8                   2533     5.2%
i9                   2033     4.2%
i10                  1347     2.8%
Other values (22)    4612     9.5%

Most occurring characters

Value               Count    Frequency (%)
I                   48489    46.5%
1                   16872    16.2%
2                   6582     6.3%
3                   5767     5.5%
4                   5374     5.1%
5                   5009     4.8%
6                   4310     4.1%
7                   3627     3.5%
8                   2666     2.6%
9                   2142     2.1%
Other values (4)    3531     3.4%

Most occurring categories

Value               Count    Frequency (%)
Decimal Number      53750    51.5%
Uppercase Letter    49199    47.1%
Lowercase Letter    1420     1.4%

Most frequent character per category

Decimal Number

Value    Count    Frequency (%)
1        16872    31.4%
2        6582     12.2%
3        5767     10.7%
4        5374     10.0%
5        5009     9.3%
6        4310     8.0%
7        3627     6.7%
8        2666     5.0%
9        2142     4.0%
0        1401     2.6%

Uppercase Letter

Value    Count    Frequency (%)
I        48489    98.6%
O        710      1.4%

Lowercase Letter

Value    Count    Frequency (%)
t        710      50.0%
h        710      50.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    53750    51.5%
Latin     50619    48.5%

Most frequent character per script

Common

Value    Count    Frequency (%)
1        16872    31.4%
2        6582     12.2%
3        5767     10.7%
4        5374     10.0%
5        5009     9.3%
6        4310     8.0%
7        3627     6.7%
8        2666     5.0%
9        2142     4.0%
0        1401     2.6%

Latin

Value    Count    Frequency (%)
I        48489    95.8%
O        710      1.4%
t        710      1.4%
h        710      1.4%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    104369    100.0%

Most frequent character per block

ASCII

Value               Count    Frequency (%)
I                   48489    46.5%
1                   16872    16.2%
2                   6582     6.3%
3                   5767     5.5%
4                   5374     5.1%
5                   5009     4.8%
6                   4310     4.1%
7                   3627     3.5%
8                   2666     2.6%
9                   2142     2.1%
Other values (4)    3531     3.4%

Distinct: 412
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 378.9 KiB

Length

Max length: 9
Median length: 5
Mean length: 5.1832581
Min length: 5

Characters and Unicode

Total characters: 251,331
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 82
Unique (%): 0.2%

Sample

1st row: I1_L1
2nd row: I1_L2
3rd row: I1_L3
4th row: I1_L4
5th row: I2_L1

Value                 Count    Frequency (%)
i1_l1                 4017     8.3%
i2_l1                 3928     8.1%
i3_l1                 3797     7.8%
i4_l1                 3683     7.6%
i5_l1                 3395     7.0%
i6_l1                 2877     5.9%
i7_l1                 2200     4.5%
i1_l2                 2160     4.5%
i8_l1                 1567     3.2%
i1_l3                 1237     2.6%
Other values (402)    19628    40.5%